
Yong-Lu Li

Tenure-Track Assistant Professor

Email: yonglu_li[at]sjtu[dot]edu[dot]cn

Shanghai Jiao Tong University

Team Website: RHOS

[Google Scholar] [Github] [LinkedIn] [ORCID]

[ResearchGate] [dblp] [Semantic Scholar]


I'm a tenure-track assistant professor at Shanghai Jiao Tong University (SJTU), affiliated with the Qing Yuan Research Institute. I am a member of the Machine Vision and Intelligence Group (MVIG), working closely with Prof. Cewu Lu. I study Human Activity Understanding, Visual Reasoning, and Embodied AI. We are building HAKE, a knowledge-driven system that enables intelligent agents to perceive human activities, reason about human behavior logic, learn skills from human activities, and interact with the environment. Check out the HAKE site for more information.

Before joining SJTU, I worked closely with IEEE Fellow Prof. Chi Keung Tang and Yu-Wing Tai at the Hong Kong University of Science and Technology (HKUST) (2021-2022). I received a Ph.D. degree (2017-2021) in Computer Science from Shanghai Jiao Tong University (SJTU), under the supervision of Prof. Cewu Lu. Prior to that, I worked and studied at the Institute of Automation, Chinese Academy of Sciences (CASIA) (2014-2017) under the supervision of Prof. Yiping Yang and A/Prof. Yinghao Cai.

Research interests: Human-Robot-Scene

(S) Embodied AI: how to make agents learn skills from humans and interact with humans.

(S-1) Human Activity Understanding: how to learn and ground complex/ambiguous human activity concepts (body motion, human-object/human/scene interaction) and object concepts from multi-modal information (2D-3D-4D).

(S-2) Visual Reasoning: how to mine, capture, and embed the logics and causal relations from human activities.

(S-3) General Multi-Modal Foundation Models: especially for human-centric perception tasks.

(S-4) Activity Understanding from A Cognitive Perspective: work with multidisciplinary researchers to study how the brain perceives activities.

(E) Human-Robot Interaction for Smart Hospital: work with the healthcare team (doctors and engineers) in SJTU and Ruijin Hospital to develop intelligent robots to help people.

Recruitment: Actively looking for self-motivated students (master/PhD, 2024 spring & fall) and interns/engineers/visitors (CV/ML/ROB/NLP background, always welcome) to join us in the Machine Vision and Intelligence Group (MVIG). If you share the same or similar interests, feel free to drop me an email with your resume. Click here for more details.

News and Olds

2023.09: The advanced HAKE reasoning engine based on LLM (OpenPaSta) will appear at NeurIPS 2023!

2023.07: Our works on ego-centric video understanding and object concept learning will appear at ICCV 2023!

2023.07: The upgraded version of DCR will appear at IJCV!

2023.07: Received Yunfan Award: Shining Star (10 Chinese AI experts under the age of 35) from WAIC 2023.

2023.03: Received Wu Wenjun Artificial Intelligence Science and Technology Award 2022, Excellent Doctoral Dissertation from Chinese Society for Artificial Intelligence.

2023.01: HAKE is accepted by TPAMI!

2022.11: We release the human body part states and interactive object bounding box annotations upon AVA (2.1 & 2.2): [HAKE-AVA], and a CLIP-based human part state & verb recognizer: [CLIP-Activity2Vec].

2022.11: AlphaPose will appear at TPAMI!

2022.10: Honored to be a top reviewer in NeurIPS'22!

2022.09: Joined SJTU as a tenure-track assistant professor.

2022.07: Two papers on long-tailed learning and HOI detection are accepted by ECCV'22; arXiv versions and code are coming soon.

2022.03: Five papers on HOI detection/prediction, trajectory prediction, and 3D detection/keypoints are accepted by CVPR'22; papers and code are coming soon.

2022.02: We release the human body part state labels based on AVA: HAKE-AVA and HAKE 2.0.

2021.12: Our work on HOI generalization will appear at AAAI'22.

2021.10: Received Outstanding Reviewer Award from NeurIPS'21.

2021.10: Learning Single/Multi-Attribute of Object with Symmetry and Group is accepted by TPAMI!

2021.09: Our work Localization with Sampling-Argmax will appear at NeurIPS'21!

2021.05: Selected as the Chinese AI New Star Top-100 (Machine Learning).

2021.02: Upgraded HAKE-Activity2Vec is released! Images/Videos --> human box + ID + skeleton + part states + action + representation. [Demo] [Description]

2021.01: TIN (Transferable Interactiveness Network) is accepted by TPAMI!

2021.01: Received Baidu Scholarship (10 recipients globally).

2020.12: DecAug is accepted by AAAI'21.

2020.09: Our work HOI Analysis will appear at NeurIPS 2020.

2020.07: Fortunate to receive the WAIC YunFan Award and be among the 2nd A-Class Project.

2020.06: The larger HAKE-Large (>120K images with activity and part state labels) is released!

2020.02: Three papers (Image-based HAKE: PaSta-Net, 2D-3D Joint HOI Learning, and Symmetry-based Attribute-Object Learning) are accepted to CVPR'20! Papers and corresponding resources (code, data) will be released soon.

2019.07: Our paper InstaBoost is accepted in ICCV'19.

2019.06: Part I of our HAKE: HAKE-HICO, which contains the image-level part-state annotations, is released!

2019.04: Our project HAKE: Human Activity Knowledge Engine begins trial operation!

2019.02: Our paper on Interactiveness is accepted in CVPR'19.

2018.07: Our paper on GAN & Annotation Generation is accepted in ECCV'18.

2018.05: Presentation (Kaibot Team) in TIDY UP MY ROOM CHALLENGE | ICRA'18.

2018.02: Our paper on Object Part States is accepted in CVPR'18.


equal contribution: *

corresponding author: *

Rethinking the Symbolic System in Visual Human Activity Reasoning

Xiaoqian Wu, Yong-Lu Li*, Jianhua Sun, Cewu Lu*.

NeurIPS 2023  [arXiv] [PDF] [Project] [Code]

Beyond Object Recognition: A New Benchmark towards Object Concept Learning

Yong-Lu Li, Yue Xu, Xinyu Xu, Xiaohan Mao, Yuan Yao, Siqi Liu, Cewu Lu.

ICCV 2023  [arXiv] [PDF] [Project] [Code]

EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding

Yue Xu, Yong-Lu Li*, Zhemin Huang, Michael Xu LIU, Cewu Lu, Yu-Wing Tai, Chi Keung Tang.

ICCV 2023  [arXiv] [PDF] [Project] [Code]

Dynamic Context Removal: A General Training Strategy for Robust Models on Video Action Predictive Tasks

Xinyu Xu, Yong-Lu Li*, Cewu Lu*.

IJCV 2023  [arXiv] [PDF] [Code]

Distill Gold from Massive Ores: Efficient Dataset Distillation via Critical Samples Selection

Yue Xu, Yong-Lu Li*, Kaitong Cui, Ziyu Wang, Cewu Lu, Yu-Wing Tai, Chi Keung Tang.

Preprint  [arXiv] [PDF]

From Isolated Islands to Pangea: Unifying Semantic Space for Human Action Understanding

Yong-Lu Li*, Xiaoqian Wu*, Xinpeng Liu, Yiming Dou, Yikun Ji, Junyi Zhang, Yixing Li, Jingru Tan, Xudong Lu, Cewu Lu.

Preprint  [arXiv] [PDF] [Project]

Discovering A Variety of Objects in Spatio-Temporal Human-Object Interactions

Yong-Lu Li*, Hongwei Fan*, Zuoyu Qiu, Yiming Dou, Liang Xu, Hao-Shu Fang, Peiyang Guo, Haisheng Su, Dongliang Wang, Wei Wu, Cewu Lu.

Tech Report  A part of the HAKE Project [arXiv] [PDF] [Code & Data]

HAKE: A Knowledge Engine Foundation for Human Activity Understanding

Yong-Lu Li, Xinpeng Liu, Xiaoqian Wu, Yizhuo Li, Zuoyu Qiu, Liang Xu, Yue Xu, Hao-Shu Fang, Cewu Lu.

TPAMI 2023  HAKE 2.0 [arXiv] [PDF] [Project] [Press]

AlphaPose: Whole-Body Regional Multi-Person Pose Estimation and Tracking in Real-Time

Hao-Shu Fang*, Jiefeng Li*, Hongyang Tang, Chao Xu, Haoyi Zhu, Yuliang Xiu, Yong-Lu Li, Cewu Lu.

TPAMI 2023  [arXiv] [PDF] [Code]

Constructing Balance from Imbalance for Long-tailed Image Recognition

Yue Xu*, Yong-Lu Li*, Jiefeng Li, Cewu Lu.

ECCV 2022  [arXiv] [PDF] [Code]

Mining Cross-Person Cues for Body-Part Interactiveness Learning in HOI Detection

Xiaoqian Wu*, Yong-Lu Li*, Xinpeng Liu, Junyi Zhang, Yuzhe Wu, Cewu Lu.

ECCV 2022  [arXiv] [PDF] [Code]

Interactiveness Field of Human-Object Interactions

Xinpeng Liu*, Yong-Lu Li*, Xiaoqian Wu, Yu-Wing Tai, Cewu Lu, Chi Keung Tang.

CVPR 2022  [arXiv] [PDF] [Code]

Human Trajectory Prediction with Momentary Observation

Jianhua Sun, Yuxuan Li, Liang Chai, Hao-Shu Fang, Yong-Lu Li, Cewu Lu.

CVPR 2022  [PDF]

Learn to Anticipate Future with Dynamic Context Removal

Xinyu Xu, Yong-Lu Li, Cewu Lu.

CVPR 2022  [arXiv] [PDF] [Code]

Canonical Voting: Towards Robust Oriented Bounding Box Detection in 3D Scenes

Yang You, Zelin Ye, Yujing Lou, Chengkun Li, Yong-Lu Li, Lizhuang Ma, Weiming Wang, Cewu Lu.

CVPR 2022  [arXiv] [PDF] [Code]

UKPGAN: Unsupervised KeyPoint GANeration

Yang You, Wenhai Liu, Yong-Lu Li, Weiming Wang, Cewu Lu.

CVPR 2022  [arXiv] [PDF] [Code]

Highlighting Object Category Immunity for the Generalization of Human-Object Interaction Detection

Xinpeng Liu*, Yong-Lu Li*, Cewu Lu.

AAAI 2022  [arXiv] [PDF] [Code]

Learning Single/Multi-Attribute of Object with Symmetry and Group

Yong-Lu Li, Yue Xu, Xinyu Xu, Xiaohan Mao, Cewu Lu.

TPAMI 2022  [arXiv] [PDF] [Code]

An extension of our CVPR 2020 work (Symmetry and Group in Attribute-Object Compositions, SymNet).

Transferable Interactiveness Knowledge for Human-Object Interaction Detection

Yong-Lu Li, Xinpeng Liu, Xiaoqian Wu, Xijie Huang, Liang Xu, Cewu Lu.

TPAMI 2022  [arXiv] [PDF] [Code]

An extension of our CVPR 2019 work (Transferable Interactiveness Network, TIN).

Localization with Sampling-Argmax

Jiefeng Li, Tong Chen, Ruiqi Shi, Yujing Lou, Yong-Lu Li, Cewu Lu.

NeurIPS 2021  [arXiv] [PDF] [Code]

DecAug: Augmenting HOI Detection via Decomposition

Yichen Xie, Hao-Shu Fang, Dian Shao, Yong-Lu Li, Cewu Lu.

AAAI 2021  [arXiv] [PDF]

HOI Analysis: Integrating and Decomposing Human-Object Interaction

Yong-Lu Li*, Xinpeng Liu*, Xiaoqian Wu, Yizhuo Li, Cewu Lu.

NeurIPS 2020  [arXiv] [PDF] [Code] [Project: HAKE-Action-Torch]

PaStaNet: Toward Human Activity Knowledge Engine

Yong-Lu Li, Liang Xu, Xinpeng Liu, Xijie Huang, Yue Xu, Shiyi Wang, Hao-Shu Fang, Ze Ma, Mingyang Chen, Cewu Lu.

CVPR 2020  [arXiv] [PDF] [Video] [Slides] [Data] [Code]

Oral Talk, Compositionality in Computer Vision in CVPR 2020.

Detailed 2D-3D Joint Representation for Human-Object Interaction

Yong-Lu Li, Xinpeng Liu, Han Lu, Shiyi Wang, Junqi Liu, Jiefeng Li, Cewu Lu.

CVPR 2020  [arXiv] [PDF] [Video] [Slides] [Benchmark: Ambiguous-HOI] [Code]

Symmetry and Group in Attribute-Object Compositions

Yong-Lu Li, Yue Xu, Xiaohan Mao, Cewu Lu.

CVPR 2020  [arXiv] [PDF] [Video] [Slides] [Code]

HAKE: Human Activity Knowledge Engine

Yong-Lu Li, Liang Xu, Xinpeng Liu, Xijie Huang, Yue Xu, Mingyang Chen, Ze Ma, Shiyi Wang, Hao-Shu Fang, Cewu Lu.

Tech Report  HAKE 1.0 [arXiv] [PDF] [Project] [Code]

Main Repo:

Sub-repos: Torch TF HAKE-AVA Halpe List

InstaBoost: Boosting Instance Segmentation Via Probability Map Guided Copy-Pasting

Hao-Shu Fang*, Jianhua Sun*, Runzhong Wang*, Minghao Gou, Yong-Lu Li, Cewu Lu.

ICCV 2019  [arXiv] [PDF] [Code]

Transferable Interactiveness Knowledge for Human-Object Interaction Detection

Yong-Lu Li, Siyuan Zhou, Xijie Huang, Liang Xu, Ze Ma, Hao-Shu Fang, Yan-Feng Wang, Cewu Lu.

CVPR 2019  [arXiv] [PDF] [Code]

SRDA: Generating Instance Segmentation Annotation via Scanning, Reasoning and Domain Adaptation

Wenqiang Xu*, Yong-Lu Li*, Cewu Lu.

ECCV 2018  [arXiv] [PDF] [Dataset](Instance-60k & 3D Object Models) [Code]

Beyond Holistic Object Recognition: Enriching Image Understanding with Part States

Cewu Lu, Hao Su, Yong-Lu Li, Yongyi Lu, Li Yi, Chi-Keung Tang, Leonidas J. Guibas.

CVPR 2018  [PDF]

Optimization of Radial Distortion Self-Calibration for Structure from Motion from Uncalibrated UAV Images

Yong-Lu Li, Yinghao Cai, Dayong Wen, Yiping Yang.

ICPR 2016  [PDF]



1) HAKE-Image (CVPR'18/20): Human body part state (PaSta) labels in images. HAKE-HICO, HAKE-HICO-DET, HAKE-Large, Extra-40-verbs.

2) HAKE-AVA: Human body part state (PaSta) labels in videos from AVA dataset. HAKE-AVA.

3) HAKE-Action-TF, HAKE-Action-Torch (CVPR'18/19/22, NeurIPS'20, TPAMI'22/23, ECCV'22, AAAI'22): SOTA action understanding methods and the corresponding HAKE-enhanced versions (TIN, IDN, IF, ParMap).

4) HAKE-3D (CVPR'20): 3D human-object representation for action understanding (DJ-RN).

5) HAKE-Object (CVPR'20, TPAMI'21): object knowledge learner to advance action understanding (SymNet).

6) HAKE-A2V (CVPR'20): Activity2Vec, a general activity feature extractor based on HAKE data, converts a human (box) to a fixed-size vector, PaSta and action scores.

7) Halpe: a joint project under AlphaPose and HAKE, full-body human keypoints (body, face, hand, 136 points) of 50,000 HOI images.

8) HOI Learning List: a list of recent HOI (Human-Object Interaction) papers, code, datasets and leaderboard on widely-used benchmarks.


Survey: recent Transformer-based CV and related works.


Survey: recent LLM-based CV and related works.

Public Services


  • Conference: CVPR'20/21/22/23, NeurIPS'20/21/22/23, ICCV'21/23, ICLR'22/23/24, ECCV'22, ICML'21/22/23, AAAI'21/22/23/24, ACCV'20, WACV'21/22.
  • Journal: TPAMI, ACM Computing Surveys, TCSVT, Neurocomputing, Pattern Recognition, JVCI, Science China Information Sciences.

  • Program Committee Member

  • Compositionality in Computer Vision, CVPR 2020.

  • Competition Judge

  • The 4th International Artificial Intelligence Fair, IAIF
  • The 2nd Yangtze River Delta Youth Artificial Intelligence Olympic Challenge

  • Teaching


  • AI3604: Computer Vision, Shanghai Jiao Tong University, 2023-2024 (Fall)
  • ACM Class

  • AI3618: Virtual Reality, Shanghai Jiao Tong University, 2022-2023 (Spring)
  • 3rd Artificial Intelligence Class


  • CS348: Computer Vision, Shanghai Jiao Tong University, 2022-2023 (Fall)
  • ACM Class

  • Guided Projects --> Papers:
  • [1] CymNet: CLIP boosted SymNet for Compositional Zero-shot Learning.

    Jiaming Shan, Zhaozi Wang, Mingshu Zhai. ICIPMC 2023, [Project]

    [2] DiffAnnot: Improved Neural Annotator with Denoising Diffusion Model.

    Chanfan Lin, Tianyuan Qiu, Hanchong Yan, Muzi Tao. ICIPMC 2023, [Project]

    [3] CPaStaNet: A CLIP-based Human Activity Knowledge Engine.

    Yijia Hong, Haotian Luo, Xialin He, Yaoqi Ye. ICIPMC 2023, [Project]


  • AI1602: Problem Solving and Practice of Artificial Intelligence, Shanghai Jiao Tong University, 2020-2021 (Spring)
  • 2nd Artificial Intelligence Class


  • AI1602: Problem Solving and Practice of Artificial Intelligence, Shanghai Jiao Tong University, 2019-2020 (Spring)
  • 1st Artificial Intelligence Class

  • Guided Project --> Paper:
  • Rb-PaStaNet: A Few-Shot Human-Object Interaction Detection Based on Rules and Part States

    Shenyu Zhang, Zichen Zhu, Qingquan Bao (freshmen)

    IMVIP 2020, Press Coverage: SEIEE of SJTU


  • 2023.2.24: RHOS: Robot, Human, Object, and Scene. Forum on Trends in cutting-edge Technology for YunFan Award AI Scholars, GAIDC, Global AI Developer Conference
  • 2022.12.7: Knowledge-driven Action Reasoning. HarmonyOS Technology Innovation Salon
  • 2022.8.17: Three Stages in Human-Object Interaction Detection. VALSE Webinar
  • 2022.6.1: Recent progress in Human Activity Knowledge Engine
  • IDEA. Thank Dr. Ailing Zeng for hosting.

  • 2021.12.9: Knowledge-driven Activity Understanding
  • CUMT "Image Analysis and Understanding" Frontier Forum. Thank Prof. Zhiwen Shao for hosting.

  • 2021.8.21: HAKE and Human-Object Interaction (HOI) Detection
  • CoLab. Thank Prof. Si Liu for hosting.

  • 2021.3.05: Human Activity Knowledge Engine (updated)
  • SJTU Computer Science Global Lunch Series. [Video]

  • 2020.8.23: Knowledge Driven Human Activity Understanding
  • The 3rd International Conference on Image, Video Processing and Artificial Intelligence, IVPAI 2020.

  • 2020.7.12: Human Activity Knowledge Engine
  • Student Forum on Frontiers of AI, SFFAI. [Slides]

  • 2020.6.16: PaStaNet: Toward Human Activity Knowledge Engine
  • Compositionality in Computer Vision in CVPR 2020 Virtual. [Video] [Slides]


  • WAIC Yunfan Award: Shining Star (10 Chinese AI experts under the age of 35), Jul. 2023. Press Coverage: 机器之心
  • Wu Wenjun Artificial Intelligence Science and Technology Award 2022, Excellent Doctoral Dissertation, Mar. 2023.
  • Top Reviewer, NeurIPS'22, Oct. 2022.
  • Outstanding Reviewer, NeurIPS'21, Oct. 2021.
  • Shanghai Outstanding Doctoral Graduate, Aug. 2021.
  • PhD Fellowship, The 85th Computer Department Education Development Fund and Yang Yuanqing Education Fund, Jun. 2021.
  • Chinese AI New Star Top-100 (Machine Learning), May 2021.
  • Baidu Scholarship (Top-10, worldwide), Jan. 2021. Press Coverage.
  • WAIC Outstanding Developer, Dec. 2020. Press Coverage: 机器之心, 上海临港
  • China National Scholarship, Sep. 2020.
  • WAIC YunFan Award, Rising Star, Jul. 2020 (World Artificial Intelligence Conference, Shanghai). Press Coverage: 机器之心
  • The 2nd A-Class, Jul. 2020.
  • Huawei Scholarship, Nov. 2019.

  • Personal Interests

  • Chinese History Reading
  • Twenty-Four Histories: 5/24

  • Travel around China
  • 34 Provinces: 21/34