Zehui Chen   陈泽徽

PhD Candidate

Brain-Inspired Vision Laboratory
School of Automation,
USTC
Hefei, China

Email: lovesnow@mail.ustc.edu.cn; lovesnowbest@gmail.com
Google Scholar: Google Scholar Link
Github: https://github.com/zehuichen123/

Biography

I am a final-year PhD candidate at the University of Science and Technology of China (USTC), advised by Prof. Feng Zhao. I received my B.E. degree from Tongji University in 2020. I currently lead the high-level vision language model group at USTC-BIVLab.

My research interests include language models, parameter-efficient fine-tuning (PEFT), and visual perception. I am currently most interested in language agents for scalable oversight and complex web search (mainly for the Doubao App). Discussions and collaborations are welcome! (WeChat: lovesnowbest)

NOTE: Our lab [Link] is looking for excellent students and researchers to join us. Positions for Master’s students, Ph.D. students, and postdocs are open. If you are interested in our research and want to join us, just email me!

News

Experience

Awards

Selected Publications

* denotes equal contribution.

Preprint Papers

MindSearch: Mimicking Human Minds Elicits Deep AI Searcher
Zehui Chen*, Kuikun Liu*, Qiuchen Wang, Jiangning Liu, Wenwei Zhang, Kai Chen, Feng Zhao
arXiv, 2024
[PDF] [Project] [Code]
MMSearch: Benchmarking the potential of large models as multi-modal search engines
Dongzhi Jiang*, Renrui Zhang*, Ziyu Guo, Yanmin Wu, Jiayi Lei, Pengshuo Qiu, Pan Lu, Zehui Chen, Guanglu Song, Peng Gao, Yu Liu, Chunyuan Li, Hongsheng Li
arXiv, 2024
[PDF] [Project] [Code]

Published Papers

♠ (Co-)First Author Papers
PlainMamba: Improving Non-hierarchical Mamba in Visual Recognition
Chenhongyi Yang*, Zehui Chen*, Miguel Espinosa*, Linus Ericsson, Zhenyu Wang, Jiaming Liu, Elliot J Crowley
The British Machine Vision Conference (BMVC), 2024
[PDF] [Code]
Graph-DETR4D: Spatio-Temporal Graph Modeling for Multi-View 3D Object Detection
Zehui Chen, Zheng Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Wu, Feng Zhao
IEEE Transactions on Image Processing (TIP), 2024
[PDF] [Code]
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models
Zehui Chen, Kuikun Liu, Qiuchen Wang, Wenwei Zhang, Jiangning Liu, Dahua Lin, Kai Chen, Feng Zhao
Findings of the Association for Computational Linguistics (ACL Findings), 2024
[PDF] [Code] [Project]
T-Eval: Evaluating the Tool Utilization Capability Step by Step
Zehui Chen*, Weihua Du*, Wenwei Zhang*, Kuikun Liu, Jiangning Liu, Miao Zheng, Jingming Gao, Songyang Zhang, Dahua Lin, Kai Chen, Feng Zhao
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
[PDF] [Code] [Project]
Learning with Noisy Data for Semi-Supervised 3D Object Detection
Zehui Chen, Zhenyu Li, Shuo Wang, Dengpan Fu, Feng Zhao
International Conference on Computer Vision (ICCV), 2023
[PDF] [Code]
DDOD: Dive Deeper into the Disentanglement of Object Detector
Zehui Chen, Chenhongyi Yang, Jiahao Chang, Feng Zhao, Zheng-Jun Zha, Feng Wu
IEEE Transactions on Multimedia (TMM)
[PDF] [Code]
BEVDistill: Cross-Modal BEV Distillation for Multi-View 3D Object Detection
Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao
International Conference on Learning Representations (ICLR), 2023
[PDF] [Code]
Graph-DETR3D: Rethinking Overlapping Regions for Multi-View 3D Object Detection
Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao
ACM International Conference on Multimedia (ACM MM), 2022
[PDF] [Code]
AutoAlignV2: Deformable Feature Aggregation for Dynamic Multi-Modal 3D Object Detection
Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao
European Conference on Computer Vision (ECCV), 2022
[PDF] [Code]
AutoAlign: Pixel-Instance Feature Aggregation for Multi-Modal 3D Object Detection
Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao, Bolei Zhou, Hang Zhao
International Joint Conference on Artificial Intelligence (IJCAI), 2022
[PDF]
Disentangle Your Dense Object Detector
Zehui Chen*, Chenhongyi Yang*, Qiaofei Li, Feng Zhao, Zheng-Jun Zha, Feng Wu
ACM International Conference on Multimedia (ACM MM), 2021
[PDF] [Code]
♠ Co-author Papers
VFM-Adapter: Adapting Visual Foundation Models for Dense Prediction with Dynamic Hybrid Operation Mapping
Zheng Chen, Yu Zeng, Zehui Chen, Hongzhi Gao, Lin Chen, Jiaming Liu, Feng Zhao
AAAI Conference on Artificial Intelligence (AAAI), 2025
[PDF]
LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding
Senqiao Yang, Jiaming Liu, Ray Zhang, Mingjie Pan, Zoey Guo, Xiaoqi Li, Zehui Chen, Peng Gao, Yandong Guo, Shanghang Zhang
AAAI Conference on Artificial Intelligence (AAAI), 2025
[PDF]
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
Lin Chen*, Xilin Wei*, Jinsong Li*, Xiaoyi Dong, Pan Zhang, Yuhang Zang, Zehui Chen, Haodong Duan, Bin Lin, Zhenyu Tang, Li Yuan, Yu Qiao, Dahua Lin, Feng Zhao, Jiaqi Wang
Neural Information Processing Systems (NeurIPS), 2024, Dataset Track
[PDF] [Project]
Are We on the Right Way for Evaluating Large Vision-Language Models?
Lin Chen*, Jinsong Li*, Xiaoyi Dong, Pan Zhang, Yuhang Zang, Zehui Chen, Haodong Duan, Jiaqi Wang, Yu Qiao, Dahua Lin, Feng Zhao
Neural Information Processing Systems (NeurIPS), 2024
[PDF] [Project]
Stream Query Denoising for Vectorized HD Map Construction
Shuo Wang, Fan Jia, Yingfei Liu, Yucheng Zhao, Zehui Chen, Tiancai Wang, Chi Zhang, Xiangyu Zhang, Feng Zhao
European Conference on Computer Vision (ECCV), 2024
[PDF]
Continual-MAE: Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation
Jiaming Liu, Ran Xu, Senqiao Yang, Renrui Zhang, Qizhe Zhang, Zehui Chen, Yandong Guo, Shanghang Zhang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[PDF]
Leveraging Imagery Data with Spatial Point Prior for Weakly Semi-Supervised 3D Object Detection
Hongzhi Gao, Zheng Chen, Zehui Chen, Lin Chen, Jiaming Liu, Shanghang Zhang, Feng Zhao
AAAI Conference on Artificial Intelligence (AAAI), 2024
[PDF]
Exploring Sparse Visual Prompt for Domain Adaptive Dense Prediction
Senqiao Yang, Jiarui Wu, Jiaming Liu, Xiaoqi Li, Qizhe Zhang, Mingjie Pan, Yulu Gan, Zehui Chen, Shanghang Zhang
AAAI Conference on Artificial Intelligence (AAAI), 2024
[PDF]
DETRDistill: A Universal Knowledge Distillation Framework for DETR-families
Jiahao Chang*, Shuo Wang*, Haiming Xu*, Zehui Chen, Chenhongyi Yang, Feng Zhao
International Conference on Computer Vision (ICCV), 2023
[PDF]
Towards Domain Generalization for Multi-view 3D Object Detection in Bird-Eye-View
Shuo Wang*, Xinhai Zhao*, Haiming Xu, Zehui Chen, Dameng Yu, Jiahao Chang, Zhen Yang, Feng Zhao
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[PDF]
Unsupervised Domain Adaptation for Monocular 3D Object Detection via Self-Training
Zhenyu Li, Zehui Chen, Ang Li, Liangji Fang, Qinhong Jiang, Xianming Liu, Junjun Jiang
European Conference on Computer Vision (ECCV), 2022
[PDF] [Code]
SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-Training for Spatial-Aware Visual Representations
Zhenyu Li, Zehui Chen, Ang Li, Liangji Fang, Qinhong Jiang, Xianming Liu, Junjun Jiang, Bolei Zhou, Hang Zhao
AAAI Conference on Artificial Intelligence (AAAI), 2022
[PDF] [Code]