The Scenes With Occluded Regions Dataset
Figure 1: Results of StereoLayers for SWORD Scenes.
About dataset
The Scenes With Occluded Regions Dataset (SWORD) contains around 1500 videos with 50 frames per video on average. The dataset was obtained by processing manually captured video sequences of static real-life urban scenes. Its main property is the abundance of close objects and, consequently, a higher prevalence of occlusions. For each video, the camera poses form a trajectory, where each pose specifies the camera position and orientation.
Figure 2: Samples from the dataset with estimated occlusion masks for the given stereo pairs. See the supplementary material of the paper.
We share the full dataset through two cloud sources (Y- and G- drives). In addition, we publish a tiny set of additional test scenes, collected on different devices, that look good and are small in size; we suggest using them to test your method. The camera format and processing follow those of RealEstate10k; we recommend checking the original paper for more details. The camera intrinsics are expressed in normalized image coordinates, where the top left corner is (0,0) and the bottom right is (1,1).
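Since the intrinsics are normalized, they need to be scaled by the image size before they can be used with pixel coordinates. The snippet below is a minimal sketch of that conversion; the function name and the example values are illustrative assumptions and are not taken from the dataset.

```python
import numpy as np


def normalized_to_pixel_intrinsics(fx, fy, cx, cy, width, height):
    """Build a 3x3 pixel-space intrinsic matrix from normalized intrinsics.

    Assumes (0,0) is the top left corner and (1,1) the bottom right one,
    so horizontal quantities scale with width and vertical ones with height.
    """
    return np.array([
        [fx * width, 0.0, cx * width],
        [0.0, fy * height, cy * height],
        [0.0, 0.0, 1.0],
    ])


# Hypothetical normalized values, scaled to the Full HD resolution.
K = normalized_to_pixel_intrinsics(0.5, 0.9, 0.5, 0.5, width=1920, height=1080)
```

The same function can be reused for resized frames by passing the new width and height, which is the practical benefit of storing intrinsics in normalized form.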
You may download a .zip file with the data using the following links:
The data splits .zip file contains the following data:
  • dataset.csv -- the full list of the dataset with the following information: file paths, scene parameters (from COLMAP), and intrinsics (relative). The original resolution of the images is Full HD (1920x1080).
  • views -- folder with txt files that contain the extrinsics and intrinsics of each image in a scene,
  • videos -- folder with the frames of each scene; all frames are named according to the corresponding timestamp.
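As a rough illustration of reading the files in the views folder, the sketch below parses a single camera line. It assumes the RealEstate10k layout of 19 whitespace-separated values per line: a timestamp, four normalized intrinsics (fx, fy, cx, cy), two unused values, and a 3x4 pose matrix in row-major order. Whether the SWORD files match this layout exactly (including any header line) should be checked against the downloaded data; the function name and the sample line are hypothetical.

```python
import numpy as np


def parse_view_line(line):
    """Parse one camera line, assuming the RealEstate10k column layout."""
    values = line.split()
    if len(values) != 19:
        raise ValueError(f"expected 19 values, got {len(values)}")
    timestamp = int(values[0])
    fx, fy, cx, cy = (float(v) for v in values[1:5])
    # values[5:7] are unused in the assumed layout and are skipped here.
    pose = np.array([float(v) for v in values[7:19]]).reshape(3, 4)
    return timestamp, (fx, fy, cx, cy), pose


# Hypothetical line: identity rotation and zero translation.
sample = "0 0.5 0.9 0.5 0.5 0 0 1 0 0 0 0 1 0 0 0 0 1 0"
timestamp, intrinsics, pose = parse_view_line(sample)
```

If the layout assumption holds, the returned timestamp can be matched against the frame names in the videos folder.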