- NYU Dataset v1 ☆
- NYU Dataset v2 ☆
- SUN 3D ☆
- SUN RGB-D ☆
- ViDRILO: The Visual and Depth Robot Indoor Localization with Objects information dataset ☆
- SceneNN: A Scene Meshes Dataset with aNNotations ☆
- Stanford 2D-3D-Semantics Dataset ☆
- ScanNet ☆
- SceneNet RGB-D ☆
- SUNCG ☆
- ‘Object Detection and Classification from Large-Scale Cluttered Indoor Scans’
- Active Vision Dataset (AVD)
- RGB-D Semantic Segmentation Dataset
- RGBD Scenes dataset v2
- Object Disappearance for Object Discovery
Ground Truth: data used to check accuracy and consistency; the true category of each region.
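To make the role of ground truth concrete, here is a minimal sketch of how a predicted label map is scored against a ground-truth label map, using the two standard semantic-segmentation metrics (pixel accuracy and per-class IoU). The label maps and class ids below are toy values for illustration only.

```python
import numpy as np

def pixel_accuracy(pred, gt):
    """Fraction of pixels whose predicted class matches the ground truth."""
    pred, gt = np.asarray(pred), np.asarray(gt)
    return (pred == gt).mean()

def class_iou(pred, gt, cls):
    """Intersection-over-union for one semantic class."""
    pred_mask = np.asarray(pred) == cls
    gt_mask = np.asarray(gt) == cls
    inter = np.logical_and(pred_mask, gt_mask).sum()
    union = np.logical_or(pred_mask, gt_mask).sum()
    return inter / union if union else float("nan")

# Toy 2x3 label maps (class ids), just to illustrate the idea.
gt   = [[0, 0, 1], [1, 2, 2]]
pred = [[0, 1, 1], [1, 2, 0]]
print(pixel_accuracy(pred, gt))  # 4 of 6 pixels match -> 0.666...
print(class_iou(pred, gt, 1))    # intersection 2, union 3 -> 0.666...
```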
【 Impressions 】
- NYU Dataset
- SUN family
- ScanNet family
- Stanford 2D-3D-Semantics Dataset
These look like good datasets for Semantic Segmentation x Indoor.
NYU Dataset v1 ☆
Around 51,000 RGBD frames from indoor scenes such as bedrooms and living rooms.
NYU Dataset v2 ☆
~408,000 RGBD images from 464 indoor scenes, of a somewhat larger diversity than NYU v1. Per-frame accelerometer data.
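The labeled NYU v2 release is distributed as a MATLAB v7.3 (HDF5) file containing datasets such as `depths` and `labels`, so it can be read with `h5py`; note that MATLAB stores arrays column-major, so frames usually need a transpose. The file name, dataset names, and shapes below are assumptions from the public release; a tiny synthetic stand-in file is created here so the sketch is runnable without the real download.

```python
import h5py
import numpy as np

# Synthetic stand-in for nyu_depth_v2_labeled.mat (name and layout assumed):
# the real file holds ~1449 labeled frames in 'images', 'depths', 'labels'.
with h5py.File("toy_nyu.mat", "w") as f:
    f.create_dataset("depths", data=np.zeros((2, 640, 480), dtype=np.float32))
    f.create_dataset("labels", data=np.zeros((2, 640, 480), dtype=np.uint16))

with h5py.File("toy_nyu.mat", "r") as f:
    depth0 = np.array(f["depths"][0]).T  # transpose: MATLAB is column-major
    label0 = np.array(f["labels"][0]).T

print(depth0.shape)  # (480, 640)
```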
SUN 3D ☆
Labelling: Polygons of semantic class and instance labels on frames propagated through video.
SUN RGB-D ☆
Introduced: CVPR 2015
Description: New images, plus images taken from NYUv2, B3DO and SUN3D. All of indoor scenes.
Labelling: 10,335 images with polygon annotation, and 3D bounding boxes around objects
The dataset incorporates RGB-D images from NYU Depth v2, Berkeley B3DO, and SUN3D; if you use it, the authors ask that you also cite those source papers in addition to theirs.
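Since SUN RGB-D provides 3D bounding boxes, detection on it is typically evaluated with 3D IoU. As a hedged sketch (SUN RGB-D boxes are oriented; the axis-aligned version below is a simplification for illustration), each box is given as `(xmin, ymin, zmin, xmax, ymax, zmax)`:

```python
import numpy as np

def aabb_iou_3d(a, b):
    """IoU of two axis-aligned 3D boxes (xmin, ymin, zmin, xmax, ymax, zmax).
    Simplified: real SUN RGB-D boxes carry an orientation as well."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    lo = np.maximum(a[:3], b[:3])            # lower corner of intersection
    hi = np.minimum(a[3:], b[3:])            # upper corner of intersection
    inter = np.prod(np.clip(hi - lo, 0, None))
    vol_a = np.prod(a[3:] - a[:3])
    vol_b = np.prod(b[3:] - b[:3])
    union = vol_a + vol_b - inter
    return inter / union if union else 0.0

# Unit cube vs. the same cube shifted by 0.5 along x:
print(aabb_iou_3d((0, 0, 0, 1, 1, 1), (0.5, 0, 0, 1.5, 1, 1)))  # 0.5/1.5 = 0.333...
```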
ViDRILO: The Visual and Depth Robot Indoor Localization with Objects information dataset ☆
Introduced: IJRR 2015
Device: Kinect v1
Description: Five sequences (22,454 frames in total) captured from a robot moving through an office environment
Labelling: Scene type of each frame, plus presence/absence of each of a set of 15 objects.
SceneNN: A Scene Meshes Dataset with aNNotations ☆
An RGB-D scene dataset consisting of more than 100 indoor scenes, captured at various places (offices, dormitories, classrooms, a pantry, etc.) at the University of Massachusetts Boston and the Singapore University of Technology and Design.
Stanford 2D-3D-Semantics Dataset ☆
Device: Matterport Camera (360 degree rotation RGBD sensor)
Description: 360 degree RGBD images captured from 6 large areas in municipal buildings, together with mesh and point cloud reconstructions.
Labelling: Semantic labelling on the mesh (13 classes, plus instance labels), and 3D volumetric reconstruction labels
ScanNet ☆
Description: 2.5 million frames from 1513 scenes
Labelling: Automatically computed (and human verified) camera poses and surface reconstructions. Instance and semantic segmentations provided on reconstructed mesh. 3D CAD models + alignment also provided for each scene.
SceneNet RGB-D ☆
Description: 5 million images rendered of 16,895 indoor scenes. Room configuration randomly generated with physics simulator.
Labelling: Camera pose, plus per-pixel instance, class labelling and optical flow.
SUNCG ☆
Description: 45,622 scenes with manually created room and furniture layouts. Images can be rendered from the geometry, but are not provided by default.
Labelling: Object semantic class and instance labelling.
‘Object Detection and Classification from Large-Scale Cluttered Indoor Scans’
Scene Understanding for Personal Robots
Active Vision Dataset (AVD)
Description: Dense sampling of images in home and office scenes, captured from a robot. Dataset designed for simulation of motion and instance detection.
Labelling: Per-frame camera pose, object instance bounding boxes, movement pointers between images.
RGB-D Semantic Segmentation Dataset
.ply: the 3D mesh; can be viewed with, e.g., MeshLab.
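Beyond viewing in MeshLab, a `.ply` file can also be read programmatically. As a minimal sketch (handling only the simple ASCII variant with float `x`/`y`/`z` vertex properties, not the binary or face-bearing cases), the header declares the element counts and the body lists the vertices:

```python
def read_ascii_ply(text):
    """Parse a minimal ASCII .ply: returns (vertex_count, list of (x, y, z)).
    Handles only float x/y/z vertex properties; no binary or face support."""
    lines = text.strip().splitlines()
    assert lines[0].strip() == "ply", "not a PLY file"
    n_verts = 0
    i = 1
    while lines[i].strip() != "end_header":
        parts = lines[i].split()
        if parts[:2] == ["element", "vertex"]:
            n_verts = int(parts[2])
        i += 1
    body = lines[i + 1 : i + 1 + n_verts]
    return n_verts, [tuple(float(v) for v in ln.split()[:3]) for ln in body]

# Tiny hand-written two-vertex mesh, just to exercise the parser.
sample = """ply
format ascii 1.0
element vertex 2
property float x
property float y
property float z
end_header
0.0 0.0 0.0
1.0 2.0 3.0
"""
n, verts = read_ascii_ply(sample)
print(n, verts[1])  # 2 (1.0, 2.0, 3.0)
```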
RGBD Scenes dataset v2
Description: A second set of real indoor scenes featuring objects from the RGBD object dataset.
Object Disappearance for Object Discovery
- 5,000 annotated images with fine annotations
- 20,000 annotated images with coarse annotations