Projects
This page showcases the datasets, systems, and other resources developed by our group.
Datasets

CRIC contains compositional questions that evaluate a model's ability to alternate between visual and commonsense inference.

Visual understanding goes well beyond the study of images or videos on the web. Env-QA is a new video QA dataset that evaluates a model's ability to understand the composition, layout, and state changes of the environment as presented by the events in videos.

We build a large-scale repository of Composite Blocks by disentangling object attributes such as shape, material, color, and contact points. This results in 9,504 objects, each rendered from random viewpoints to create photorealistic images.

At the paragraph level, we construct a novel dataset, ParaEval, and demonstrate the accuracy of our proposed HiFi-Score in evaluating long texts.

We simulate how humans scan and collect point cloud data in real-world scenes, and construct three large-scale synthetic point cloud datasets from synthetic scenes. These three datasets are more than ten times the scale of currently available real-world data.