Yong-Lu Li

Tenure-Track Assistant Professor

Email: yonglu_li[at]sjtu[dot]edu[dot]cn

Shanghai Jiao Tong University [上海交通大学]

Shanghai Innovation Institute [上海创智学院]

Team Website: RHOS

[School of Artificial Intelligence (SAI)] [人工智能学院]

[Zhiyuan Honors Program] [致远学院]

[Google Scholar] [Github] [LinkedIn] [ORCID]

[ResearchGate] [dblp] [Semantic Scholar]

About





2025.04: Our paper on human-robot joint learning has been selected as an ICRA 2025 Best Paper Award Finalist.

2025.03: Received AI100 Youth Pioneers (AI100青年先锋) from MIT Technology Review China.

2025.02: Our works on 3D HOI reconstruction, motion dynamics, garment generation/reconstruction, and dynamic object segmentation will appear at CVPR 2025!

2025.02: Pleasure to be an area chair of NeurIPS 2025!

2025.01: Our work on efficient robot teleoperation will appear at ICRA 2025.

2025.01: Two works, on the association ability of LLMs and on human motion, will appear at ICLR 2025.

2024.09: Two works on articulated object image manipulation and humanoid-object interaction will appear at NeurIPS 2024.

2024.07: Five works on 4D human motion, dataset distillation, embodied AI, and visual reasoning will appear at ECCV 2024.

2024.06: Our work Visual-Text Dataset Distillation will appear at ICML 2024.

2024.03: Pleasure to be an area chair of NeurIPS 2024!

2024.02: Our work Pangea and Video Distillation will appear at CVPR 2024.

2023.09: The advanced HAKE reasoning engine based on LLM (Symbol-LLM) will appear at NeurIPS 2023!

2023.07: Our works on ego-centric video understanding and object concept learning will appear at ICCV 2023!

2023.07: The upgraded version of DCR will appear at IJCV!

2023.07: Received Yunfan Award: Shining Star (10 Chinese AI experts under the age of 35) from WAIC 2023.

2023.03: Received Wu Wenjun Artificial Intelligence Science and Technology Award 2022, Excellent Doctoral Dissertation from Chinese Society for Artificial Intelligence.

2023.01: HAKE is accepted by TPAMI!

2022.11: We release the human body part states and interactive object bounding box annotations upon AVA (2.1 & 2.2): [HAKE-AVA], and a CLIP-based human part state & verb recognizer: [CLIP-Activity2Vec].

2022.11: AlphaPose will appear at TPAMI!

2022.10: Honored to be a top reviewer in NeurIPS'22!

2022.09: Joined SJTU as a tenure-track assistant professor.

2022.07: Two papers on long-tailed learning and HOI detection are accepted by ECCV'22; arXiv versions and code are coming soon.

2022.03: Five papers on HOI detection/prediction, trajectory prediction, and 3D detection/keypoints are accepted by CVPR'22; papers and code are coming soon.

2022.02: We release the human body part state labels based on AVA: HAKE-AVA and HAKE 2.0.

2021.10: Received Outstanding Reviewer Award from NeurIPS'21.

2021.10: Learning Single/Multi-Attribute of Object with Symmetry and Group is accepted by TPAMI!

2021.09: Our work Localization with Sampling-Argmax will appear at NeurIPS'21!

2021.05: Selected as the Chinese AI New Star Top-100 (Machine Learning).

2021.02: Upgraded HAKE-Activity2Vec is released! Images/Videos --> human box + ID + skeleton + part states + action + representation. [Demo] [Description]

2021.01: TIN (Transferable Interactiveness Network) is accepted by TPAMI!

2021.01: Received Baidu Scholarship (10 recipients globally).

2020.09: Our work HOI Analysis will appear at NeurIPS 2020.

2020.07: Fortunate to receive WAIC YunFan Award and be among the 2nd A-Class Project.

2020.06: The larger HAKE-Large (>120K images with activity and part state labels) is released!

2020.02: Three papers (Image-based HAKE: PaSta-Net, 2D-3D Joint HOI Learning, and Symmetry-based Attribute-Object Learning) are accepted in CVPR'20! Papers and corresponding resources (code, data) will be released soon.

2019.07: Our paper InstaBoost is accepted in ICCV'19.

2019.06: Part I of our HAKE: HAKE-HICO, which contains the image-level part-state annotations, is released!

2019.04: Our project HAKE: Human Activity Knowledge Engine begins trial operation!

2019.02: Our paper on Interactiveness is accepted in CVPR'19.

2018.07: Our paper on GAN & Annotation Generation is accepted in ECCV'18.

2018.05: Presentation (Kaibot Team) in TIDY UP MY ROOM CHALLENGE | ICRA'18.

2018.02: Our paper on Object Part States is accepted in CVPR'18.








Motion Before Action: Diffusing Object Motion as Manipulation Condition

Yue Su, Xinyu Zhan, Hongjie Fang, Yong-Lu Li, Cewu Lu, Lixin Yang.

RA-L 2025  [arXiv] [PDF] [Project] [Code]

Dense Policy: Bidirectional Autoregressive Learning of Actions

Yue Su, Xinyu Zhan, Hongjie Fang, Han Xue, Hao-Shu Fang, Yong-Lu Li, Cewu Lu, Lixin Yang.

arXiv 2025  [arXiv] [PDF] [Project] [Code]

Reconstructing In-the-Wild Open-Vocabulary Human-Object Interactions

Boran Wen, Dingbang Huang, Zichen Zhang, Jiahong Zhou, Jianbin Deng, Jingyu Gong, Yulong Chen*, Lizhuang Ma*, Yong-Lu Li*.

CVPR 2025  [arXiv] [PDF] [Project] [Code]

GaPT-DAR: Category-level Garments Pose Tracking via Integrated 2D Deformation and 3D Reconstruction

Li Zhang, Mingliang Xu, Jianan Wang, Qiaojun Yu, Lixin Yang, Yong-Lu Li, Cewu Lu, Rujing Wang, Liu Liu.

CVPR 2025  [arXiv] [PDF] [Project] [Code]

Design2GarmentCode: Turning Design Concepts to Tangible Garments Through Program Synthesis

Feng Zhou, Ruiyang Liu, Chen Liu, Gaofeng He, Yong-Lu Li, Xiaogang Jin, Huamin Wang.

CVPR 2025  [arXiv] [PDF] [Project] [Code]

M3-VOS: Multi-Phase, Multi-Transition, and Multi-Scenery Video Object Segmentation

Zixuan Chen*, Jiaxin Li*, Liming Tan, Yejie Guo, Junxuan Liang, Cewu Lu, Yong-Lu Li*.

CVPR 2025  [arXiv] [PDF] [Project] [Code] [Annotation Tool]

Homogeneous Dynamics Space for Heterogeneous Humans

Xinpeng Liu, Junxuan Liang, Chenshuo Zhang, Zixuan Cai, Cewu Lu*, Yong-Lu Li*.

CVPR 2025  [arXiv] [PDF] [Project] [Code]

Human-Agent Joint Learning for Efficient Robot Manipulation Skill Acquisition

Shengcheng Luo*, Quanquan Peng*, Jun Lv, Kaiwen Hong, Katherine Rose Driggs-Campbell, Cewu Lu, Yong-Lu Li*.

ICRA 2025  [arXiv] [PDF] [Project] [Code]

The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs

Hong Li, Nanxi Li, Yuanjie Chen, Jianbin Zhu, Qinlu Guo, Cewu Lu, Yong-Lu Li*.

ICLR 2025  [arXiv] [PDF] [Project] [Code]

ImDy: Human Inverse Dynamics from Imitated Observations

Xinpeng Liu, Junxuan Liang, Zili Lin, Haowen Hou, Yong-Lu Li*, Cewu Lu*.

ICLR 2025  [arXiv] [PDF] [Project] [Code]

exUMI: Extensible System for Robot Teaching with Precise Proprioception and Multi-Modalities

Yue Xu, Litao Wei, Pengyu An, Yong-Lu Li*.

arXiv 2025  [arXiv] [PDF] [Project] [Code]

Interacted Object Grounding in Spatio-Temporal Human-Object Interactions

Xiaoyang Liu*, Boran Wen*, Xinpeng Liu*, Zizheng Zhou, Hongwei Fan, Cewu Lu, Lizhuang Ma, Yulong Chen*, Yong-Lu Li*.

AAAI 2025  [arXiv] [PDF] [Project] [Code]

Verb Mirage: Unveiling and Assessing Verb Concept Hallucinations in Multimodal Large Language Models

Zehao Wang, Xinpeng Liu, Xiaoqian Wu, Yudonglin Zhang, Zhou Fang, Yifan Fang, Junfu Pu, Cewu Lu*, Yong-Lu Li*.

arXiv 2024  [arXiv] [PDF] [Project] [Code]

General Articulated Objects Manipulation in Real Images via Part-Aware Diffusion Process

Zhou Fang, Yong-Lu Li*, Lixin Yang, Cewu Lu*.

NeurIPS 2024  [arXiv] [PDF] [Project]

HumanVLA: Towards Vision-Language Directed Object Rearrangement by Physical Humanoid

Xinyu Xu, Yizheng Zhang, Yong-Lu Li, Lei Han, Cewu Lu.

NeurIPS 2024  [arXiv] [PDF] [Code]

Take A Step Back: Rethinking the Two Stages in Visual Reasoning

Mingyu Zhang, Jiting Cai, Mingyu Liu, Yue Xu, Cewu Lu, Yong-Lu Li*.

ECCV 2024  [arXiv] [PDF] [Project] [Code]

Distill Gold from Massive Ores: Efficient Dataset Distillation via Critical Samples Selection

Yue Xu, Yong-Lu Li*, Kaitong Cui, Ziyu Wang, Cewu Lu, Yu-Wing Tai, Chi Keung Tang.

ECCV 2024  [arXiv] [PDF] [Project] [Code]

Bridging the Gap between Human Motion and Action Semantics via Kinematic Phrases

Xinpeng Liu, Yong-Lu Li*, Ailing Zeng, Zizheng Zhou, Yang You, Cewu Lu*.

ECCV 2024  [arXiv] [PDF] [Project] [Code]

Revisit Human-Scene Interaction via Space Occupancy

Xinpeng Liu*, Haowen Hou*, Yanchao Yang, Yong-Lu Li*, Cewu Lu.

ECCV 2024  [arXiv] [PDF] [Project] [Code]

DISCO: Embodied Navigation and Interaction via Differentiable Scene Semantics and Dual-level Control

Xinyu Xu, Shengcheng Luo, Yanchao Yang, Yong-Lu Li*, Cewu Lu*.

ECCV 2024  [arXiv] [PDF] [Code]

Low-Rank Similarity Mining for Multimodal Dataset Distillation

Yue Xu, Zhilin Lin, Yusong Qiu, Cewu Lu, Yong-Lu Li*.

ICML 2024  [arXiv] [PDF] [Project] [Code]

From Isolated Islands to Pangea: Unifying Semantic Space for Human Action Understanding

Yong-Lu Li*, Xiaoqian Wu*, Xinpeng Liu, Yiming Dou, Yikun Ji, Junyi Zhang, Yixing Li, Xudong Lu, Jingru Tan, Cewu Lu.

CVPR 2024 Highlight  [arXiv] [PDF] [Project] [Code]

Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement

Ziyu Wang*, Yue Xu*, Cewu Lu, Yong-Lu Li*.

CVPR 2024  [arXiv] [PDF] [Project] [Code]

Primitive-based 3D Human-Object Interaction Modelling and Programming

Siqi Liu, Yong-Lu Li*, Zhou Fang, Xinpeng Liu, Yang You, Cewu Lu*.

AAAI 2024  [arXiv] [PDF] [Project] [Code]

Symbol-LLM: Leverage Language Models for Symbolic System in Visual Human Activity Reasoning

Xiaoqian Wu, Yong-Lu Li*, Jianhua Sun, Cewu Lu*.

NeurIPS 2023  [arXiv] [PDF] [Project] [Code]

Beyond Object Recognition: A New Benchmark towards Object Concept Learning

Yong-Lu Li, Yue Xu, Xinyu Xu, Xiaohan Mao, Yuan Yao, Siqi Liu, Cewu Lu.

ICCV 2023  [arXiv] [PDF] [Project] [Data] [Code]

EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding

Yue Xu, Yong-Lu Li*, Zhemin Huang, Michael Xu LIU, Cewu Lu, Yu-Wing Tai, Chi Keung Tang.

ICCV 2023  [arXiv] [PDF] [Project] [Code]

Dynamic Context Removal: A General Training Strategy for Robust Models on Video Action Predictive Tasks

Xinyu Xu, Yong-Lu Li*, Cewu Lu*.

IJCV 2023  [arXiv] [PDF] [Code]

Discovering A Variety of Objects in Spatio-Temporal Human-Object Interactions

Yong-Lu Li*, Hongwei Fan*, Zuoyu Qiu, Yiming Dou, Liang Xu, Hao-Shu Fang, Peiyang Guo, Haisheng Su, Dongliang Wang, Wei Wu, Cewu Lu.

Tech Report  A part of the HAKE Project [arXiv] [PDF] [Code & Data]

AlphaTracker: A Multi-Animal Tracking and Behavioral Analysis Tool

Ruihan Zhang, Hao-Shu Fang, Zexin Chen, Yu E Zhang, Aneesh Bal, Haowen Zhou, Rachel R Rock, Nancy Padilla-Coreano, Laurel R Keyes, Haoyi Zhu, Yong-Lu Li, Takaki Komiyama, Kay M Tye, Cewu Lu.

Frontiers in Behavioral Neuroscience - Individual and Social Behaviors 2023  [arXiv] [PDF] [Project]

HAKE: A Knowledge Engine Foundation for Human Activity Understanding

Yong-Lu Li, Xinpeng Liu, Xiaoqian Wu, Yizhuo Li, Zuoyu Qiu, Liang Xu, Yue Xu, Hao-Shu Fang, Cewu Lu.

TPAMI 2023  HAKE 2.0 [arXiv] [PDF] [Project] [Code] [Press]

AlphaPose: Whole-Body Regional Multi-Person Pose Estimation and Tracking in Real-Time

Hao-Shu Fang*, Jiefeng Li*, Hongyang Tang, Chao Xu, Haoyi Zhu, Yuliang Xiu, Yong-Lu Li, Cewu Lu.

TPAMI 2023  [arXiv] [PDF] [Code]

Constructing Balance from Imbalance for Long-tailed Image Recognition

Yue Xu*, Yong-Lu Li*, Jiefeng Li, Cewu Lu.

ECCV 2022  [arXiv] [PDF] [Code]

Mining Cross-Person Cues for Body-Part Interactiveness Learning in HOI Detection

Xiaoqian Wu*, Yong-Lu Li*, Xinpeng Liu, Junyi Zhang, Yuzhe Wu, Cewu Lu.

ECCV 2022  [arXiv] [PDF] [Code]

Interactiveness Field of Human-Object Interactions

Xinpeng Liu*, Yong-Lu Li*, Xiaoqian Wu, Yu-Wing Tai, Cewu Lu, Chi Keung Tang.

CVPR 2022  [arXiv] [PDF] [Code]

Human Trajectory Prediction with Momentary Observation

Jianhua Sun, Yuxuan Li, Liang Chai, Hao-Shu Fang, Yong-Lu Li, Cewu Lu.

CVPR 2022  [PDF]

Learn to Anticipate Future with Dynamic Context Removal

Xinyu Xu, Yong-Lu Li, Cewu Lu.

CVPR 2022  [arXiv] [PDF] [Code]

Canonical Voting: Towards Robust Oriented Bounding Box Detection in 3D Scenes

Yang You, Zelin Ye, Yujing Lou, Chengkun Li, Yong-Lu Li, Lizhuang Ma, Weiming Wang, Cewu Lu.

CVPR 2022  [arXiv] [PDF] [Code]

UKPGAN: Unsupervised KeyPoint GANeration

Yang You, Wenhai Liu, Yong-Lu Li, Weiming Wang, Cewu Lu.

CVPR 2022  [arXiv] [PDF] [Code]

Highlighting Object Category Immunity for the Generalization of Human-Object Interaction Detection

Xinpeng Liu*, Yong-Lu Li*, Cewu Lu.

AAAI 2022  [arXiv] [PDF] [Code]

Learning Single/Multi-Attribute of Object with Symmetry and Group

Yong-Lu Li, Yue Xu, Xinyu Xu, Xiaohan Mao, Cewu Lu.

TPAMI 2022  [arXiv] [PDF] [Code]

An extension of our CVPR 2020 work (Symmetry and Group in Attribute-Object Compositions, SymNet).

Transferable Interactiveness Knowledge for Human-Object Interaction Detection

Yong-Lu Li, Xinpeng Liu, Xiaoqian Wu, Xijie Huang, Liang Xu, Cewu Lu.

TPAMI 2022  [arXiv] [PDF] [Code]

An extension of our CVPR 2019 work (Transferable Interactiveness Network, TIN).

Localization with Sampling-Argmax

Jiefeng Li, Tong Chen, Ruiqi Shi, Yujing Lou, Yong-Lu Li, Cewu Lu.

NeurIPS 2021  [arXiv] [PDF] [Code]

DecAug: Augmenting HOI Detection via Decomposition

Yichen Xie, Hao-Shu Fang, Dian Shao, Yong-Lu Li, Cewu Lu.

AAAI 2021  [arXiv] [PDF]

HOI Analysis: Integrating and Decomposing Human-Object Interaction

Yong-Lu Li*, Xinpeng Liu*, Xiaoqian Wu, Yizhuo Li, Cewu Lu.

NeurIPS 2020  [arXiv] [PDF] [Code] [Project: HAKE-Action-Torch]

PaStaNet: Toward Human Activity Knowledge Engine

Yong-Lu Li, Liang Xu, Xinpeng Liu, Xijie Huang, Yue Xu, Shiyi Wang, Hao-Shu Fang, Ze Ma, Mingyang Chen, Cewu Lu.

CVPR 2020  [arXiv] [PDF] [Video] [Slides] [Data] [Code]

Oral Talk, Compositionality in Computer Vision in CVPR 2020.

Detailed 2D-3D Joint Representation for Human-Object Interaction

Yong-Lu Li, Xinpeng Liu, Han Lu, Shiyi Wang, Junqi Liu, Jiefeng Li, Cewu Lu.

CVPR 2020  [arXiv] [PDF] [Video] [Slides] [Benchmark: Ambiguous-HOI] [Code]

Symmetry and Group in Attribute-Object Compositions

Yong-Lu Li, Yue Xu, Xiaohan Mao, Cewu Lu.

CVPR 2020  [arXiv] [PDF] [Video] [Slides] [Code]

HAKE: Human Activity Knowledge Engine

Yong-Lu Li, Liang Xu, Xinpeng Liu, Xijie Huang, Yue Xu, Mingyang Chen, Ze Ma, Shiyi Wang, Hao-Shu Fang, Cewu Lu.

Tech Report  HAKE 1.0 [arXiv] [PDF] [Project] [Code]

Main Repo:

Sub-repos: Torch TF HAKE-AVA Halpe List

InstaBoost: Boosting Instance Segmentation Via Probability Map Guided Copy-Pasting

Hao-Shu Fang*, Jianhua Sun*, Runzhong Wang*, Minghao Gou, Yong-Lu Li, Cewu Lu.

ICCV 2019  [arXiv] [PDF] [Code]

Transferable Interactiveness Knowledge for Human-Object Interaction Detection

Yong-Lu Li, Siyuan Zhou, Xijie Huang, Liang Xu, Ze Ma, Hao-Shu Fang, Yan-Feng Wang, Cewu Lu.

CVPR 2019  [arXiv] [PDF] [Code]

SRDA: Generating Instance Segmentation Annotation via Scanning, Reasoning and Domain Adaptation

Wenqiang Xu*, Yong-Lu Li*, Cewu Lu.

ECCV 2018  [arXiv] [PDF] [Dataset](Instance-60k & 3D Object Models) [Code]

Beyond Holistic Object Recognition: Enriching Image Understanding with Part States

Cewu Lu, Hao Su, Yong-Lu Li, Yongyi Lu, Li Yi, Chi-Keung Tang, Leonidas J. Guibas.

CVPR 2018  [PDF]

Optimization of Radial Distortion Self-Calibration for Structure from Motion from Uncalibrated UAV Images

Yong-Lu Li, Yinghao Cai, Dayong Wen, Yiping Yang.

ICPR 2016  [PDF]







Contents:

1) HAKE-Image (CVPR'18/20): Human body part state (PaSta) labels in images. HAKE-HICO, HAKE-HICO-DET, HAKE-Large, Extra-40-verbs.

2) HAKE-AVA: Human body part state (PaSta) labels in videos from AVA dataset. HAKE-AVA.

3) HAKE-Action-TF, HAKE-Action-Torch (CVPR'18/19/22, NeurIPS'20, TPAMI'22/23, ECCV'22, AAAI'22): SOTA action understanding methods and the corresponding HAKE-enhanced versions (TIN, IDN, IF, ParMap).

4) HAKE-3D (CVPR'20): 3D human-object representation for action understanding (DJ-RN).

5) HAKE-Object (CVPR'20, TPAMI'21): object knowledge learner to advance action understanding (SymNet).

6) HAKE-A2V (CVPR'20): Activity2Vec, a general activity feature extractor based on HAKE data, converts a human (box) to a fixed-size vector, PaSta and action scores.

7) Halpe: a joint project under AlphaPose and HAKE, full-body human keypoints (body, face, hand, 136 points) of 50,000 HOI images.

8) HOI Learning List: a list of recent HOI (Human-Object Interaction) papers, code, datasets and leaderboard on widely-used benchmarks.

Transformer-in-Vision

Survey: recent Transformer-based CV and related works.

LLM-in-Vision

Survey: recent LLM-based CV and related works.




  • Area Chair, NeurIPS'24/25.
  • Executive Area Chair, Vision And Learning SEminar, VALSE, 2024-Now.
  • Executive Member, Committee on CAD&CG, China Computer Federation (CCF), 2024-Now.
  • Executive Member and member of secretariat, Committee on Embodied Intelligence, China Association for Artificial Intelligence (CAAI), 2024-Now.
  • Organizer and host, Embodied Intelligence Forum, The third Conference on Machine Learning Algorithms and Natural Language Processing (MLNLP 2024).
  • Organizer, CAAI Embodied Intelligence Young Scholars Seminar, 5th.
  • Program Committee Member, Compositionality in Computer Vision, CVPR 2020.

  • Reviewer

  • Conference: CVPR'20/21/22/23/24/25, NeurIPS'20/21/22/23/24, ICCV'21/23/25, ICLR'22/23/24/25, ECCV'22/24, ICML'21/22/23/24/25, CoRL'25, AAAI'21/22/23/24/25.
  • Journal: TPAMI, IJCV, ACM Computing Surveys, TCSVT, Neurocomputing, Pattern Recognition, JVCI, IMAVIS, IoT, Science China Information Sciences, Automation in Construction, IET Image Processing, Neural Networks, Computer Science Review.
  • International Program Committee (IPC): CAD/Graphics 2025

  • Competition Judge

  • The Yangtze River Delta Youth Artificial Intelligence Olympic Challenge.

  • The 4th International Artificial Intelligence Fair, IAIF.
  • The 2nd Yangtze River Delta Youth Artificial Intelligence Olympic Challenge.






  • Teacher

  • AI3604: Computer Vision, Shanghai Jiao Tong University, 2023-2024 (Fall)
  • ACM Class

  • CS7352: Advanced Neural Network Theory and Applications, 23-24 (Spring)
  • Computer Science and Technology

  • AI3618: Virtual Reality, Shanghai Jiao Tong University, 22-23, 23-24 (Spring) [Website]
  • 3rd, 4th Artificial Intelligence Class


    Lecturer

  • CS348: Computer Vision, Shanghai Jiao Tong University, 2022-2023 (Fall)
  • ACM Class

  • Guided Projects --> Papers:
  • [1] CymNet: CLIP boosted SymNet for Compositional Zero-shot Learning.

    Jiaming Shan, Zhaozi Wang, Mingshu Zhai. ICIPMC 2023, [Project]

    [2] DiffAnnot: Improved Neural Annotator with Denoising Diffusion Model.

    Chanfan Lin, Tianyuan Qiu, Hanchong Yan, Muzi Tao. ICIPMC 2023, [Project]

    [3] CPaStaNet: A CLIP-based Human Activity Knowledge Engine.

    Yijia Hong, Haotian Luo, Xialin He, Yaoqi Ye. ICIPMC 2023, [Project]


    Assistant

  • AI1602: Problem Solving and Practice of Artificial Intelligence, Shanghai Jiao Tong University, 2020-2021 (Spring)
  • 2nd Artificial Intelligence Class

    Mentor

  • AI1602: Problem Solving and Practice of Artificial Intelligence, Shanghai Jiao Tong University, 2019-2020 (Spring)
  • 1st Artificial Intelligence Class

  • Guided Project --> Paper:
  • Rb-PaStaNet: A Few-Shot Human-Object Interaction Detection Based on Rules and Part States

    Shenyu Zhang, Zichen Zhu, Qingquan Bao (freshmen)

    IMVIP 2020, Press Coverage: SEIEE of SJTU







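  • Embodied Intelligence: Humanoid Robots and Large Models Co-evolve, Providing a "Body" for the External Brain [具身智能:人型机器人与大模型共同进化,为外脑提供“躯体”], Tencent Research Institute Report on Ten Trends in Large Models [腾讯研究院大模型十大趋势报告], 2024.
  • ChatGPT and Multidimensional Paths to Strengthening International Discourse Governance [ChatGPT与加强国际话语治理的多维路径], Social Sciences in China (Internal Manuscripts) [中国社会科学(内部文稿)], 2023, 8(4).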






  • 2024.4.2: Visual Reasoning and Embodied Intelligence [视觉推理与具身智能]. 智猩猩
  • 2023.12.20: RHOS: Robot, Human, Object, and Scene. CS3317 Artificial Intelligence (B), SJTU. Thank Prof. Panpan Cai for hosting.
  • 2023.10.12: RHOS: Robot, Human, Object, and Scene. HKUST AI Seminar series, HKUST (Guangzhou). Thank Prof. Junwei Liang for hosting.
  • 2023.2.24: RHOS: Robot, Human, Object, and Scene. Forum on Trends in cutting-edge Technology for YunFan Award AI Scholars, GAIDC, Global AI Developer Conference
  • 2022.12.7: Knowledge-driven Action Reasoning. HarmonyOS Technology Innovation Salon
  • 2022.8.17: Three Stages in Human-Object Interaction Detection. VALSE Webinar
  • 2022.6.1: Recent progress in Human Activity Knowledge Engine. IDEA. Thank Dr. Ailing Zeng for hosting.
  • 2021.12.9: Knowledge-driven Activity Understanding. CUMT "Image Analysis and Understanding" Frontier Forum. Thank Prof. Zhiwen Shao for hosting.
  • 2021.8.21: HAKE and Human-Object Interaction (HOI) Detection. CoLab. Thank Prof. Si Liu for hosting.
  • 2021.3.05: Human Activity Knowledge Engine (updated). SJTU Computer Science Global Lunch Series. [Video]
  • 2020.8.23: Knowledge Driven Human Activity Understanding. The 3rd International Conference on Image, Video Processing and Artificial Intelligence, IVPAI 2020.
  • 2020.7.12: Human Activity Knowledge Engine. Student Forum on Frontiers of AI, SFFAI. [Slides]
  • 2020.6.16: PaStaNet: Toward Human Activity Knowledge Engine. Compositionality in Computer Vision in CVPR 2020 Virtual. [Video] [Slides]







  • ICRA 2025 Best Paper Award Finalist
  • AI100 Youth Pioneers (AI100青年先锋), Deeptech and MIT Technology Review China.
  • Leading Technology Award at the World Internet Conference 2024, Visual Understanding Common Technologies and Applications for Intelligent Social Governance.
  • WAIC Yunfan Award: Shining Star (10 Chinese AI experts under the age of 35), Jul. 2023. Press Coverage: 机器之心
  • Wu Wenjun Artificial Intelligence Science and Technology Award 2022, Excellent Doctoral Dissertation, Mar. 2023.
  • Top Reviewer, NeurIPS'22, Oct. 2022.
  • Outstanding Reviewer, NeurIPS'21, Oct. 2021.
  • Shanghai Outstanding Doctoral Graduate, Aug. 2021.
  • PhD Fellowship, The 85th Computer Department Education Development Fund and Yang Yuanqing Education Fund, Jun. 2021.
  • Chinese AI New Star Top-100 (Machine Learning), May 2021.
  • Baidu Scholarship (Top-10, worldwide), Jan. 2021. Press Coverage.
  • WAIC Outstanding Developer, Dec. 2020. Press Coverage: 机器之心, 上海临港
  • China National Scholarship, Sep. 2020.
  • WAIC YunFan Award, Rising Star, Jul. 2020 (World Artificial Intelligence Conference, Shanghai). Press Coverage: 机器之心
  • The 2nd A-Class Project, Jul. 2020.
  • Huawei Scholarship, Nov. 2019.






  • Chinese History Reading
  • Twenty-Four Histories: 6/24

  • Travel around China
  • 34 Provinces: 21/34