Data Specification

This document describes the data format recorded by the app.

Each collected dataset is stored in its own folder, named after a random hash, for example 71de12f9. A dataset folder has the following directory structure:

camera_matrix.csv
odometry.csv
imu.csv
depth/
  - 000000.png
  - 000001.png
  - ...
confidence/
  - 000000.png
  - 000001.png
  - ...
distortion/          (optional, present when lens distortion data is available)
  - 000000.bin
  - 000001.bin
  - ...
rgb.mp4
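
As a quick sanity check of the layout above, the following is a minimal sketch that reports which required entries are missing from a dataset folder and lists the frame ids found in depth/. The function names check_dataset and frame_ids are hypothetical helpers for illustration and are not part of the app.

```python
from pathlib import Path

# Entries every dataset folder is expected to contain; distortion/ is optional.
REQUIRED_FILES = ("camera_matrix.csv", "odometry.csv", "imu.csv", "rgb.mp4")
REQUIRED_DIRS = ("depth", "confidence")


def check_dataset(folder):
    """Return a list of required entries missing from a dataset folder."""
    root = Path(folder)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in REQUIRED_DIRS if not (root / name).is_dir()]
    return missing


def frame_ids(folder):
    """Return the sorted frame ids (e.g. '000005') found in depth/."""
    return sorted(p.stem for p in (Path(folder) / "depth").glob("*.png"))
```

An empty list from check_dataset("71de12f9") means all required files and directories are present.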

rgb.mp4 is an HEVC-encoded video containing the frames recorded by the iPhone's camera.

The depth/ directory contains the depth maps, one .png file per RGB frame. Each file is a 16-bit grayscale PNG image with a height of 192 pixels and a width of 256 pixels. Each pixel value is the measured depth in millimeters at that pixel position. In OpenCV, these can be read with cv2.imread(depth_frame_path, -1); the -1 flag (cv2.IMREAD_UNCHANGED) preserves the 16-bit values.
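
The following sketch shows one way to load a depth map and convert it from millimeters to meters. The helper names depth_mm_to_meters and load_depth_meters are hypothetical, and the sketch assumes NumPy and OpenCV are installed.

```python
import numpy as np


def depth_mm_to_meters(depth_mm):
    """Convert a uint16 depth map in millimeters to float32 meters."""
    return depth_mm.astype(np.float32) / 1000.0


def load_depth_meters(depth_frame_path):
    """Read one depth PNG unchanged (16 bit, single channel) and convert it."""
    import cv2  # imported lazily so the conversion helper works without OpenCV

    depth_mm = cv2.imread(depth_frame_path, cv2.IMREAD_UNCHANGED)  # same as flag -1
    if depth_mm is None:
        # cv2.imread returns None instead of raising when the file cannot be read.
        raise FileNotFoundError(depth_frame_path)
    return depth_mm_to_meters(depth_mm)
```

For a well-formed depth frame, load_depth_meters("depth/000000.png") should return an array of shape (192, 256).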

The confidence/ directory contains one confidence map for each depth map. Each is a grayscale PNG file encoding a matrix of 192 rows by 256 columns. The values are 0, 1, or 2; a higher value means higher confidence.
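
Because each confidence map has the same shape as its depth map, the two can be combined element-wise. The sketch below discards depth values below a chosen confidence level. The function name mask_low_confidence and the default threshold of 2 are assumptions made for illustration, not something the app prescribes.

```python
import numpy as np


def mask_low_confidence(depth_m, confidence, min_confidence=2):
    """Return a float32 copy of depth_m with low-confidence pixels set to NaN."""
    if depth_m.shape != confidence.shape:
        raise ValueError("depth and confidence maps must have the same shape")
    masked = depth_m.astype(np.float32, copy=True)
    # Confidence values are 0, 1 or 2; keep only pixels at or above the threshold.
    masked[confidence < min_confidence] = np.nan
    return masked
```

Passing min_confidence=1 keeps medium-confidence pixels as well, which trades accuracy for coverage.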

The camera_matrix.csv file contains a 3 x 3 matrix of camera intrinsic parameters taken from the final recorded frame. This file is kept for backwards compatibility; for per-frame intrinsics, use the fx, fy, cx and cy columns in odometry.csv.
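
A minimal sketch for reading this file with the standard library follows. It assumes the file holds three comma-separated rows with no header line, and that the matrix uses the conventional pinhole layout [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]; neither detail is stated in this document. The function names are hypothetical.

```python
import csv


def load_camera_matrix(path):
    """Read camera_matrix.csv into a 3 x 3 list of floats."""
    with open(path, newline="") as f:
        # Assumption: comma-separated values, no header line; blank lines are skipped.
        rows = [[float(v) for v in row] for row in csv.reader(f) if row]
    if len(rows) != 3 or any(len(r) != 3 for r in rows):
        raise ValueError("expected a 3 x 3 matrix")
    return rows


def intrinsics_from_matrix(K):
    """Return (fx, fy, cx, cy), assuming the conventional pinhole layout."""
    return K[0][0], K[1][1], K[0][2], K[1][2]
```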

The odometry.csv file contains the camera pose and intrinsics for each frame. The first line is a header. The fields are:

| Field | Meaning |
| --- | --- |
| timestamp | Timestamp in seconds |
| frame | Frame number to which this pose corresponds, e.g. 000005 |
| x | x coordinate in meters, relative to where the session was started |
| y | y coordinate in meters, relative to where the session was started |
| z | z coordinate in meters, relative to where the session was started |
| qx | x component of the quaternion representing the camera pose rotation |
| qy | y component of the quaternion representing the camera pose rotation |
| qz | z component of the quaternion representing the camera pose rotation |
| qw | w component of the quaternion representing the camera pose rotation |
| fx | Horizontal focal length in pixels |
| fy | Vertical focal length in pixels |
| cx | Principal point x coordinate in pixels |
| cy | Principal point y coordinate in pixels |
| distortion_center_x | x coordinate of the lens distortion center in pixels (empty if unavailable) |
| distortion_center_y | y coordinate of the lens distortion center in pixels (empty if unavailable) |
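
To illustrate how the odometry fields fit together, the sketch below parses the rows and builds a 4 x 4 homogeneous matrix from each position and quaternion. It assumes the header names match the field names listed above (optionally with spaces after the commas) and that the fx, fy, cx and cy columns are present. This document does not state whether the resulting matrix maps camera coordinates to world coordinates or the reverse, so treat that direction as an open assumption. The names read_odometry and quaternion_to_rotation are hypothetical.

```python
import csv


def quaternion_to_rotation(qx, qy, qz, qw):
    """Return the 3 x 3 rotation matrix of a quaternion (normalized first)."""
    n = (qx * qx + qy * qy + qz * qz + qw * qw) ** 0.5
    x, y, z, w = qx / n, qy / n, qz / n, qw / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def _optional_float(text):
    """Convert a possibly empty CSV cell to a float or None."""
    return float(text) if text else None


def read_odometry(lines):
    """Parse odometry.csv lines into one dict per frame."""
    reader = csv.DictReader(lines, skipinitialspace=True)
    frames = []
    for raw in reader:
        row = {k.strip(): (v or "").strip() for k, v in raw.items() if k}
        R = quaternion_to_rotation(*(float(row[k]) for k in ("qx", "qy", "qz", "qw")))
        t = [float(row[k]) for k in ("x", "y", "z")]
        # Rotation in the upper-left 3 x 3 block, translation in the last column.
        pose = [R[i] + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]
        frames.append({
            "timestamp": float(row["timestamp"]),
            "frame": row["frame"],  # kept as text so it matches file names
            "pose": pose,
            "intrinsics": tuple(float(row[k]) for k in ("fx", "fy", "cx", "cy")),
            "distortion_center": (
                _optional_float(row.get("distortion_center_x", "")),
                _optional_float(row.get("distortion_center_y", "")),
            ),
        })
    return frames
```

A typical call is read_odometry(open("71de12f9/odometry.csv", newline="")). Keeping frame as a string means it can be used directly to build paths such as depth/000005.png.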

The distortion/ directory contains per-frame lens distortion lookup tables, present only when the device exposes calibration data. Each .bin file is a raw array of little-endian float32 values mapping radial distance from the distortion center to a correction factor. The number of entries is file_size_in_bytes / 4. The filenames correspond to the frame field in odometry.csv.
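
Since each file is a raw array of little-endian float32 values, it can be decoded with the standard struct module. The following is a minimal sketch; the function name read_distortion_table is hypothetical, and how the table entries are spaced along the radial axis is not specified here, so the sketch only decodes the values.

```python
import struct


def read_distortion_table(path):
    """Read a distortion .bin file as a tuple of little-endian float32 values."""
    with open(path, "rb") as f:
        data = f.read()
    if len(data) % 4 != 0:
        raise ValueError("file size is not a multiple of 4 bytes")
    count = len(data) // 4  # number of entries = file_size_in_bytes / 4
    return struct.unpack("<%df" % count, data)
```

With NumPy available, np.fromfile(path, dtype="<f4") yields the same values as an array.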

The imu.csv file contains timestamps, linear acceleration readings and rotation rate readings. The first line is a header. The fields are:

| Field | Meaning |
| --- | --- |
| timestamp | Timestamp in seconds |
| a_x | Acceleration in m/s^2 in the x direction |
| a_y | Acceleration in m/s^2 in the y direction |
| a_z | Acceleration in m/s^2 in the z direction |
| alpha_x | Rotation rate in rad/s around the x-axis |
| alpha_y | Rotation rate in rad/s around the y-axis |
| alpha_z | Rotation rate in rad/s around the z-axis |
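
The IMU file can be parsed in the same way as the odometry file. The sketch below reads the samples and looks up the one closest to a given timestamp. It assumes the header names match the fields listed above, and the lookup is only meaningful if imu.csv and odometry.csv share the same clock, which this document does not confirm. The names read_imu and nearest_sample are hypothetical.

```python
import csv


def read_imu(lines):
    """Parse imu.csv lines into (timestamp, acceleration, rotation_rate) tuples."""
    reader = csv.DictReader(lines, skipinitialspace=True)
    samples = []
    for raw in reader:
        row = {k.strip(): v.strip() for k, v in raw.items() if k and v is not None}
        accel = tuple(float(row[k]) for k in ("a_x", "a_y", "a_z"))  # m/s^2
        gyro = tuple(float(row[k]) for k in ("alpha_x", "alpha_y", "alpha_z"))  # rad/s
        samples.append((float(row["timestamp"]), accel, gyro))
    return samples


def nearest_sample(samples, timestamp):
    """Return the IMU sample whose timestamp is closest to the given one."""
    return min(samples, key=lambda s: abs(s[0] - timestamp))
```

The linear scan in nearest_sample is fine for occasional lookups; for matching every frame, a sorted list with the bisect module would be a better fit.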