I am a senior research engineer at BIGAI in Beijing, advised by Zilong Zheng (郑子隆). My current research focuses on diffusion language models, long-context modeling, and long-sequence generation.

I received my master’s degree in Computer Technology from Tsinghua University (清华大学) and my bachelor’s degree in Computer Science and Technology from Beijing Institute of Technology (北京理工大学).

I have interned at X-Tech, XiZi, SenseTime, IDEA, MSRA, DeepSeek, and BIGAI. At X-Tech, I was advised by Yifei Jin (金逸飞) and Jian Li (李建). At IDEA, I was advised by Hao Wang (王昊) and Jiaxing Zhang (张家兴). At MSRA, I was advised by Zhihao Fan (范智昊) and Yeyun Gong (宫叶云). I have published papers at top international AI conferences such as NeurIPS and ICML.

🔥 News

  • 2024.09: 🎉🎉 Our paper was accepted to NeurIPS 2024.

📝 Publications

NeurIPS 2023

AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation

Tong Wu^, Zhihao Fan^, Xiao Liu, Hai-Tao Zheng, Yeyun Gong, Yelong Shen, Jian Jiao, Juntao Li, Zhongyu Wei, Jian Guo, Nan Duan, Weizhu Chen

Github Page

ECAI 2023 Oral

Enhancing Text Generation with Cooperative Training

Tong Wu^, Hao Wang^, Zhongshen Zeng, Wei Wang, Hai-Tao Zheng, Jiaxing Zhang

Github Page

📖 Education

  • 2021.09 - 2024.06, Master, Tsinghua University, Beijing.
  • 2017.08 - 2021.06, Bachelor, Beijing Institute of Technology, Beijing.

💻 Internships

  • 2024.02 - 2024.06, BIGAI, NLCo, Beijing.
  • 2023.07 - 2023.11, DeepSeek, Beijing.
  • 2022.11 - 2023.07, MSRA, NLC, Beijing.
  • 2022.04 - 2022.10, IDEA, CCNL, Shenzhen.
  • 2021.10 - 2022.02, SenseTime, Shenzhen.
  • 2021.01 - 2021.06, Xizi, Beijing.
  • 2020.01 - 2020.05, X-Tech, Beijing.