I am currently a researcher at Shanghai AI Labotory. My research interests include Embodied AI, Computer Vision, Robotic Manipulation and Autonomous Driving.
I received my PhD degree from Robotics Institute, Shanghai Jiao Tong University, supervised by Prof. Honghai Liu, and obtained my bachelorβs degree from Central South University. I am currently working with Jiangmiao Pang on Embodied AI. Over the preceding period, I have worked with Prof. Hongyang Li and Prof. Yu Qiao at Shanghai AI Labotory.
I am an interdisciplinary lifelong learner, with an academic journey that has evolved from bio-mechatronics, computer vision, and autonomous driving to embodied AI. I am currently dedicated to advancing Embodied AGI, with an emphasis on generalizable robotic manipulation.
π Selected Publications
π€ Embodied AI * indicates equal contribution

Learning Manipulation by Predicting Interaction
Jia Zengβ, Qingwen Buβ, Bangjun Wangβ, Wenke Xiaβ, Li Chen, Hao Dong, H.Song, D.Wang, D.Hu, P.Luo, H.Cui, B.Zhao, X.Li, Y.Qiao, Hongyang Li
- We propose a representation learning framework towards robotic manipulation that learns Manipulation by Predicting Interaction (MPI).
- RSS 2024 | Project Page | Code

Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation
Qingwen Buβ, Jia Zengβ, Chen Liβ, Yanchao Yang, Guyue Zhou, Junchi Yan, Ping Luo, Heming Cui, D.Hu, Yi Ma, Hongyang Li
- We propose βCLOVER, which employs a text-conditioned video diffusion model for generating visual plans as reference inputs, then leverages these sub-goals to guide the feedback-driven policy to generate actions with an error measurement strategy.
- NeurIPS 2024 | Code
π Autonomous Driving

Distilling Focal Knowledge From Imperfect Expert for 3D Object Detection
Jia Zeng, Li Chen, Hanming Deng, Lewei Lu, Junchi Yan, Yu Qiao, Hongyang Li
- We apply knowledge distillation to camera-only 3D object detection, investigate how to distill focal knowledge when confronted with an imperfect 3D object detector teacher.
- CVPR 2023

Geometric-aware Pretraining for Vision-centric 3D Object Detection
Linyan Huang, Huijie Wang, Jia Zeng, et al.
- We propose a geometric-aware pretraining method called GAPretrain, which distills geometric-rich information from LiDAR modality into camera-based 3D object detectors.
- arXiv

Delving Into the Devils of Birdβs-Eye-View Perception: A Review, Evaluation and Recipe
Hongyang Liβ, Chonghao Simaβ, Jifeng Daiβ, Wenhai Wangβ, Lewei Luβ, Huijie Wangβ, Jia Zengβ, Zhiqi Liβ, et al.
- we conduct a thorough review on Birdβs-Eye-View (BEV) perception in recent years and provide a practical recipe according to our analysis in BEV design pipeline.
- IEEE T-PAMI | Github

Open-sourced data ecosystem in autonomous driving: the present and future
Hongyang Liβ, Yang Liβ, Huijie Wangβ, Jia Zengβ, Huilin Xu, et al.
- We undertakes an exhaustive analysis and discourse regarding the characteristics and data scales that future third-generation autonomous driving datasets should possess.
- SCIENTIA SINICA Informationis | arXiv
πͺπ» Bio-Mechatronics & Human-Machine Interaction
- Jia Zeng, Yu Zhou, et al. Fatigue-sensitivity comparison of sEMG and A-Mode ultrasound based hand gesture recognition, IEEE Journal of Biomedical and Health Informatics (JBHI), 2021.
- Jia Zeng, Yu Zhou, et al. Cross modality knowledge distillation between A-mode ultrasound and surface electromyography, IEEE Transactions on Instrumentation and Measurement (TIM), 2022.
- Jia Zeng, Yixuan Sheng, et al. Adaptive learning against muscle fatigue for A-mode ultrasound based gesture recognition, IEEE Transactions on Instrumentation and Measurement (TIM), 2023.
- Yu Zhou, Jia Zeng, et al. Voluntary and FES-induced finger movement estimation using muscle deformation features, IEEE Transactions on Industrial Electronics (TIE), 2019.
π§βπ» Career Experience
Shanghai AI Labotory, Researcher
2023.09 - (present).
- Embodied foundation model and generalizable robotic manipulation.
Shanghai AI Labotory, Research intern
2022.04 - 2023.06, Supervisor: Prof. Hongyang Li.
- Birds-Eye-View perception and Knowledge distillation for 3D object detection.
π Education
Robotics Institute, Shanghai Jiao Tong University
2017.09 - 2023.8, Supervisor: Prof. Honghai Liu.
- Bio-signal based human motion recognition and human-machine interaction.
πΌ Service
- Reviewer for
CVPR 2024
,ECCV 2024
,NeurIPS 2024
,ICLR 2025
, etc. - Member of CAAI-Embodied AI Committee.