VIPL-VSU

Projects

This page showcases the datasets, systems, and other resources developed by our group.

Datasets

CRIC: A VQA Dataset for Compositional Reasoning on Vision and Commonsense

CRIC contains compositional questions that evaluate a model's ability to reason over vision and commonsense in alternation.

Env-QA: A Video QA Dataset for Dynamic Environments Understanding

Visual understanding goes well beyond the study of images or videos on the web. Env-QA is a new video QA dataset that evaluates a model's ability to understand the composition, layout, and state changes of an environment as revealed by the events in videos.

ComBo: Dissecting Categorization Ability of Large Multimodal Models

We build a large-scale repository of Composite Blocks by disentangling object attributes such as shape, material, color, and contact points. This results in 9,504 objects, each rendered from random viewpoints to create photorealistic images.

ParaEval: A Paragraph-Level Evaluation Dataset for Long Texts

We construct ParaEval, a novel paragraph-level dataset, and use it to demonstrate the accuracy of our proposed HiFi-Score in evaluating long texts.

Large-scale synthetic point cloud datasets

We simulate how humans scan and collect point cloud data in real-world scenes, and use synthetic scenes to construct three large-scale synthetic point cloud datasets. Each of these datasets is more than ten times the scale of currently available real-world data.
