Hi, I am a second-year Ph.D. student at Fudan University & SII (Joint Program), currently supervised by Prof. Pengfei Liu. I received my B.S. in Computer Science from Fudan University. Previously, I worked with Qi Zhang, Fei Liu, and Jiaqi Wang.
My research centers on Multimodal Interactive Intelligence, with a particular focus on Multimodal LLMs and LLM agents. Please feel free to contact me for discussion and collaboration!
[Mar. 2026] Three papers were accepted to ACL 2026: GeometryZero, ASVR, and VideoPro.
[Nov. 2025] We released GeoVista, a web-augmented agentic visual reasoning framework for geolocalization.
[Jul. 2025] I joined Tencent Hunyuan as a research intern, working on frontier visual reasoning models.
[Apr. 2025] VisuoThink was accepted to ACL 2025 Main.
[Aug. 2024] Two papers were presented at ACL 2024.
Topic: Frontier visual reasoning models; supervised fine-tuning and reinforcement learning