I am Kanzhi Cheng (程瞰之), a PhD student (2021.9 - ) in the NLP Group at Nanjing University, advised by Dr. Jiajun Chen & Dr. Jianbing Zhang. Previously, I worked as a research intern at Shanghai AI Lab, Tsinghua AIR, and Microsoft Research. I am deeply grateful for the opportunity to work with and learn from Dr. Zhiyong Wu, Dr. Hao Zhou, and Dr. Qianhui Wu.

Currently, I am broadly interested in multimodal intelligence.

I expect to graduate in 2026. Please feel free to reach out!

🔥 News

  • 2025.07:  🏖️🏖️ See you in Vienna 🇦🇹!
  • 2025.06:  🤖🤖 We release GUI-Actor to advance visual grounding for GUI Agents.
  • 2025.05:  🎉🎉 Four papers are accepted to ACL 2025.

📝 Selected Publications

ACL 2024 (Main)

SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
Kanzhi Cheng, Qiushi Sun, Yougang Chu, Fangzhi Xu, Yantao Li, Jianbing Zhang, Zhiyong Wu

Code   Models&Data    

NeurIPS 2025 (Poster)

GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
Qianhui Wu*, Kanzhi Cheng*, Rui Yang*, Chaoyun Zhang, Jianwei Yang, Huiqiang Jiang, Jian Mu, Baolin Peng, Bo Qiao, Reuben Tan, Si Qin, Lars Liden, Qingwei Lin, Huan Zhang, Tong Zhang, Jianbing Zhang, Dongmei Zhang, Jianfeng Gao

Code   Project Page   Models&Data    

ACL 2025 (Main)

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
Qiushi Sun*, Kanzhi Cheng*, Zichen Ding*, Chuanyang Jin*, Yian Wang, Fangzhi Xu, Zhenyu Wu, Chengyou Jia, Liheng Chen, Zhoumianze Liu, Ben Kao, Guohao Li, Junxian He, Yu Qiao, Zhiyong Wu

Code   Project Page   Models&Data    

NAACL 2025 (Main)

Vision-Language Models Can Self-Improve Reasoning via Reflection
Kanzhi Cheng*, Yantao Li*, Fangzhi Xu, Jianbing Zhang, Hao Zhou, Yang Liu

Code    

ACL 2025 (Findings)

CapArena: Benchmarking and Analyzing Detailed Image Captioning in the LLM Era
Kanzhi Cheng*, Wenpo Song*, Jiaxin Fan*, Zheng Ma, Qiushi Sun, Fangzhi Xu, Chenyang Yan, Nuo Chen, Jianbing Zhang, Jiajun Chen

Code   Project Page    

🎖 Honors and Awards

  • 2025.05 Excellent Graduate Student, Nanjing University
  • 2024.10 First-Class Graduate Talent Scholarship, Nanjing University
  • 2023.10 Outstanding Graduate Award, Nanjing University
  • 2021.06 Outstanding Graduation Project, Nanjing University
  • 2019.10 Guanghua Scholarship (1%), Nanjing University

📖 Education

  • 2021.09 - present (expected 2026), Ph.D. Student at the Department of Computer Science and Technology, Nanjing University.
  • 2017.09 - 2021.06, B.E. at the School of Management and Engineering, Nanjing University.

💻 Internships

  • 2025.04 - 2025.06, Microsoft Research.
  • 2024.08 - 2025.01, Tsinghua AIR, China.
  • 2023.08 - 2024.04, Shanghai Artificial Intelligence Laboratory, China.

⚽️ Personal Interests

I am an amateur football player, primarily playing as an attacking midfielder. As a player, I’ve been fortunate to win the following honors:

  • Member of the Nanjing University official football team
  • 🏆 Champion of the 2022–2023 Nanjing University Caigen Cup, awarded Final MVP
  • 🏆 Champion of the 2018–2019 Nanjing University FA Cup

I am also a fan of Arsenal Football Club.
