This document describes the data format recorded by the app.
The collected datasets are each contained in a folder, named after a random hash, for example 71de12f9. A dataset folder has the following directory structure:
camera_matrix.csv
odometry.csv
imu.csv
depth/
- 000000.png
- 000001.png
- ...
confidence/
- 000000.png
- 000001.png
- ...
distortion/ (optional, present when lens distortion data is available)
- 000000.bin
- 000001.bin
- ...
rgb.mp4
rgb.mp4 is an HEVC encoded video, which contains the recorded data from the iPhone's camera.
The depth/ directory contains the depth maps. One .png file per rgb frame. Each of these is a 16 bit grayscale png image. They have a height of 192 elements and width of 256 elements. The values are the measured depth in millimeters, for that pixel position. In OpenCV, these can be read with cv2.imread(depth_frame_path, -1).
The confidence/ directory contains confidence maps corresponding to each depth map. They are grayscale png files encoding 192 x 256 element matrices. The values are either 0, 1 or 2. A higher value means a higher confidence.
The camera_matrix.csv is a 3 x 3 matrix containing the camera intrinsic parameters from the final recorded frame. This file is kept for backwards compatibility; for per-frame intrinsics use the fx, fy, cx, cy columns in odometry.csv.
The odometry.csv file contains the camera pose and intrinsics for each frame. The first line is a header. The meaning of the fields are:
| Field | Meaning |
|---|---|
| timestamp | Timestamp in seconds |
| frame | Frame number to which this pose corresponds to e.g. 000005 |
| x | x coordinate in meters from when the session was started |
| y | y coordinate in meters from when the session was started |
| z | z coordinate in meters from when the session was started |
| qx | x component of quaternion representing camera pose rotation |
| qy | y component of quaternion representing camera pose rotation |
| qz | z component of quaternion representing camera pose rotation |
| qw | w component of quaternion representing camera pose rotation |
| fx | Horizontal focal length in pixels |
| fy | Vertical focal length in pixels |
| cx | Principal point x coordinate in pixels |
| cy | Principal point y coordinate in pixels |
| distortion_center_x | x coordinate of the lens distortion center in pixels (empty if unavailable) |
| distortion_center_y | y coordinate of the lens distortion center in pixels (empty if unavailable) |
The distortion/ directory contains per-frame lens distortion lookup tables, present only when the device exposes calibration data. Each .bin file is a raw array of little-endian float32 values mapping radial distance from the distortion center to a correction factor. The number of entries is file_size_in_bytes / 4. The filenames correspond to the frame field in odometry.csv.
The imu.csv file contains timestamps, linear acceleration readings and angular rotation readings. The first line is a header. The meaning of the fields are:
| Field | Meaning |
|---|---|
| timestamp | Timestamp in seconds |
| a_x | Acceleration in m/s^2 in x direction |
| a_y | Acceleration in m/s^2 in y direction |
| a_z | Acceleration in m/s^2 in z direction |
| alpha_x | Rotation in rad/s around the x-axis |
| alpha_y | Rotation in rad/s around the y-axis |
| alpha_z | Rotation in rad/s around the z-axis |