VIPL-VSU

Research

Our group's research covers three main areas: 1) object recognition, e.g. zero-shot learning, incremental/lifelong learning, image retrieval, and image classification; 2) scene understanding, e.g. object detection/segmentation, scene classification, relationship detection, and scene graph generation; and 3) language/knowledge-based cognition, e.g. image/video captioning (description), visual question answering, visual concept learning, and knowledge graphs.

2025

Cross-Domain Few-Shot 3D Point Cloud Semantic Segmentation
Jiwei Xiao, Ruiping Wang, Chen He, Xilin Chen
Pattern Recognition Letters  ·  01 Nov 2025  ·  10.1016/j.patrec.2025.07.001
Generic Scene Graph Generation Model with Hierarchical Prompt Learning
Xuhan Zhu, Yifei Xing, Ruiping Wang, Yaowei Wang, Xiangyuan Lan
International Journal of Computer Vision  ·  27 Jun 2025  ·  10.1007/s11263-025-02499-z
Dynamic Behavior Cloning With Temporal Feature Prediction: Enhancing Robotic Arm Manipulation in Moving Object Tasks
Yifan Zhang, Ruiping Wang, Xilin Chen
IEEE Robotics and Automation Letters  ·  01 Jun 2025  ·  10.1109/LRA.2025.3557746
UniFa: A unified feature hallucination framework for any-shot object detection
Hui Nie, Ruiping Wang, Xilin Chen
Pattern Recognition Letters  ·  01 Mar 2025  ·  10.1016/j.patrec.2025.01.015
GS-LTS: 3D Gaussian Splatting-Based Adaptive Modeling for Long-Term Service Robots
Bin Fu, Jialin Li, Bin Zhang, Ruiping Wang, Xilin Chen
arXiv  ·  01 Jan 2025  ·  10.48550/arXiv.2503.17733
Robotic Programmer: Video Instructed Policy Code Generation for Robotic Manipulation
Senwei Xie, Hongyu Wang, Zhanqi Xiao, Ruiping Wang, Xilin Chen
arXiv  ·  01 Jan 2025  ·  10.48550/arXiv.2501.04268

2024

Think Before Placement: Common Sense Enhanced Transformer for Object Placement
Yaxuan Qin, Jiayu Xu, Ruiping Wang, Xilin Chen
Lecture Notes in Computer Science  ·  04 Dec 2024  ·  10.1007/978-3-031-73464-9_3
Local context attention learning for fine-grained scene graph generation
Xuhan Zhu, Ruiping Wang, Xiangyuan Lan, Yaowei Wang
Pattern Recognition  ·  01 Dec 2024  ·  10.1016/j.patcog.2024.110708
HiFi-Score: Fine-Grained Image Description Evaluation with Hierarchical Parsing Graphs
Ziwei Yao, Ruiping Wang, Xilin Chen
Lecture Notes in Computer Science  ·  31 Oct 2024  ·  10.1007/978-3-031-73033-7_25
Calibration for Long-tailed Scene Graph Generation
Xuhan Zhu, Yifei Xing, Ruiping Wang, Yaowei Wang, Xiangyuan Lan
Proceedings of the 32nd ACM International Conference on Multimedia  ·  28 Oct 2024  ·  10.1145/3664647.3680818
Introspective GAN: Learning to grow a GAN for incremental generation and classification
Chen He, Ruiping Wang, Shiguang Shan, Xilin Chen
Pattern Recognition  ·  01 Jul 2024  ·  10.1016/j.patcog.2024.110383
Point2Real: Bridging the Gap between Point Cloud and Realistic Image for Open-World 3D Recognition
Hanxuan Li, Bin Fu, Ruiping Wang, Xilin Chen
Proceedings of the AAAI Conference on Artificial Intelligence  ·  24 Mar 2024  ·  10.1609/aaai.v38i4.28088
Blocks as Probes: Dissecting Categorization Ability of Large Multimodal Models
Bin Fu, Qiyang Wan, Jialin Li, Ruiping Wang, Xilin Chen
arXiv  ·  01 Jan 2024  ·  10.48550/arXiv.2409.01560
M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models
Hongyu Wang, Jiayu Xu, Senwei Xie, Ruiping Wang, Jialin Li, Zhaojie Xie, Bin Zhang, Chuyan Xiong, Xilin Chen
arXiv  ·  01 Jan 2024  ·  10.48550/arXiv.2405.15638
EMMA: Empowering Multi-modal Mamba with Structural and Hierarchical Alignment
Yifei Xing, Xiangyuan Lan, Ruiping Wang, Dongmei Jiang, Wenjun Huang, Qingfang Zheng, Yaowei Wang
arXiv  ·  01 Jan 2024  ·  10.48550/arXiv.2410.05938
Event Graph Guided Compositional Spatial–Temporal Reasoning for Video Question Answering
Ziyi Bai, Ruiping Wang, Difei Gao, Xilin Chen
IEEE Transactions on Image Processing  ·  01 Jan 2024  ·  10.1109/TIP.2024.3358726
RoboGSim: A Real2Sim2Real Robotic Gaussian Splatting Simulator
Xinhai Li, Jialin Li, Ziheng Zhang, Rui Zhang, Fan Jia, Tiancai Wang, Haoqiang Fan, Kuo-Kun Tseng, Ruiping Wang
arXiv  ·  01 Jan 2024  ·  10.48550/arXiv.2411.11839

2023

Importance First: Generating Scene Graph of Human Interest
Wenbin Wang, Ruiping Wang, Shiguang Shan, Xilin Chen
International Journal of Computer Vision  ·  09 Jun 2023  ·  10.1007/s11263-023-01817-7