Research
Our group's research covers three main areas: 1) object recognition, including zero-shot learning, incremental/life-long learning, image retrieval, and image classification; 2) scene understanding, including object detection/segmentation, scene classification, relationship detection, and scene graph generation; and 3) language/knowledge-based cognition, including image/video captioning (description), visual question answering, visual concept learning, and knowledge graphs.
2025

Cross-Domain Few-Shot 3D Point Cloud Semantic Segmentation
Pattern Recognition Letters
·
01 Nov 2025
·
10.1016/j.patrec.2025.07.001

Generic Scene Graph Generation Model with Hierarchical Prompt Learning
International Journal of Computer Vision
·
27 Jun 2025
·
10.1007/s11263-025-02499-z

Dynamic Behavior Cloning With Temporal Feature Prediction: Enhancing Robotic Arm Manipulation in Moving Object Tasks
IEEE Robotics and Automation Letters
·
01 Jun 2025
·
10.1109/LRA.2025.3557746

UniFa: A unified feature hallucination framework for any-shot object detection
Pattern Recognition Letters
·
01 Mar 2025
·
10.1016/j.patrec.2025.01.015
GS-LTS: 3D Gaussian Splatting-Based Adaptive Modeling for Long-Term Service Robots
arXiv
·
01 Jan 2025
·
10.48550/arXiv.2503.17733

Robotic Programmer: Video Instructed Policy Code Generation for Robotic Manipulation
arXiv
·
01 Jan 2025
·
10.48550/arXiv.2501.04268
2024
Think Before Placement: Common Sense Enhanced Transformer for Object Placement
Lecture Notes in Computer Science
·
04 Dec 2024
·
10.1007/978-3-031-73464-9_3

Local context attention learning for fine-grained scene graph generation
Pattern Recognition
·
01 Dec 2024
·
10.1016/j.patcog.2024.110708
HiFi-Score: Fine-Grained Image Description Evaluation with Hierarchical Parsing Graphs
Lecture Notes in Computer Science
·
31 Oct 2024
·
10.1007/978-3-031-73033-7_25
Calibration for Long-tailed Scene Graph Generation
Proceedings of the 32nd ACM International Conference on Multimedia
·
28 Oct 2024
·
10.1145/3664647.3680818

Introspective GAN: Learning to grow a GAN for incremental generation and classification
Pattern Recognition
·
01 Jul 2024
·
10.1016/j.patcog.2024.110383
Point2Real: Bridging the Gap between Point Cloud and Realistic Image for Open-World 3D Recognition
Proceedings of the AAAI Conference on Artificial Intelligence
·
24 Mar 2024
·
10.1609/aaai.v38i4.28088
Blocks as Probes: Dissecting Categorization Ability of Large Multimodal Models
arXiv
·
01 Jan 2024
·
10.48550/arXiv.2409.01560
M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models
arXiv
·
01 Jan 2024
·
10.48550/arXiv.2405.15638
EMMA: Empowering Multi-modal Mamba with Structural and Hierarchical Alignment
arXiv
·
01 Jan 2024
·
10.48550/arXiv.2410.05938

Event Graph Guided Compositional Spatial–Temporal Reasoning for Video Question Answering
IEEE Transactions on Image Processing
·
01 Jan 2024
·
10.1109/TIP.2024.3358726
RoboGSim: A Real2Sim2Real Robotic Gaussian Splatting Simulator
arXiv
·
01 Jan 2024
·
10.48550/arXiv.2411.11839
2023

Importance First: Generating Scene Graph of Human Interest
International Journal of Computer Vision
·
09 Jun 2023
·
10.1007/s11263-023-01817-7