Panoptic-Depth Forecasting

This repository is the official implementation of the paper:

Panoptic-Depth Forecasting

Juana Valeria Hurtado*, Riya Mohan*, and Abhinav Valada.

IEEE International Conference on Robotics and Automation (ICRA), 2025

If you find our work useful, please consider citing our paper:

@inproceedings{hurtado2025panoptic,
  title={Panoptic-depth forecasting},
  author={Hurtado, Juana Valeria and Mohan, Riya and Valada, Abhinav},
  booktitle={2025 IEEE International Conference on Robotics and Automation (ICRA)},
  pages={01--07},
  year={2025},
  organization={IEEE}
}

📔 Abstract

Forecasting the semantics and 3D structure of scenes is essential for robots to navigate and plan actions safely. Recent methods have explored semantic and panoptic scene forecasting; however, they do not consider the geometry of the scene. In this work, we propose the panoptic-depth forecasting task for jointly predicting the panoptic segmentation and depth maps of unobserved future frames, from monocular camera images. To facilitate this work, we extend the popular KITTI-360 and Cityscapes benchmarks by computing depth maps from LiDAR point clouds and leveraging sequential labeled data. We also introduce a suitable evaluation metric that quantifies both the panoptic quality and depth estimation accuracy of forecasts in a coherent manner. Furthermore, we present two baselines and propose the novel PDcast architecture that learns rich spatio-temporal representations by incorporating a transformer- based encoder, a forecasting module, and task-specific decoders to predict future panoptic-depth outputs. Extensive evaluations demonstrate the effectiveness of PDcast across two datasets and three forecasting tasks, consistently addressing the primary challenges.

🏗 Setup

💻 System Requirements

Linux
Python 3.8
PyTorch 1.12.1
CUDA 11.3
GCC 7 or higher

IMPORTANT NOTE: These requirements are not necessarily mandatory. However, we have only tested the code under the above settings and cannot provide support for other setups.

⚙️ Installation

Create conda environment from file:
```
conda env create -f environment.yml
```
Activate environment:
```
conda activate pdcast
```

💾 Data preparation

KITTI-360

Download the following files from the KITTI-360 website:

Perspective Images for Train & Val (128G): You can remove "01" in line 12 in download_2d_perspective.sh to only download the relevant images.
Test Semantic (1.5G)
Semantics (1.8G)
Calibrations (3K)

Additionally, download our preprocessed train/val split files:

train - Training split file
val - Validation split file

Place the downloaded files in the kitti_360/ folder.

After extraction and copying of the perspective images, one should obtain the following file structure:

── kitti_360
   ├── 2013_05_28_drive_train_frames_full.txt
   ├── 2013_05_28_drive_val_frames_full.txt
   ├── calibration
   │    ├── calib_cam_to_pose.txt
   │    └── ...
   ├── data_2d_raw
   │   ├── 2013_05_28_drive_0000_sync
   │   └── ...
   ├── data_2d_semantics
   │    └── train
   │        ├── 2013_05_28_drive_0000_sync
   │        └── ...
   └── data_2d_test
        ├── 2013_05_28_drive_0008_sync
        └── 2013_05_28_drive_0018_sync

🏃 Running the Code

Configuration

Before running training or evaluation, update the dataset path in the configuration file:

Open cfg/pdcast_kitti360.yaml
Replace <PATH_TO_KITTI360_DATASET> with the path to your KITTI-360 dataset root directory

Training

To train PDcast on KITTI-360:

Open scripts/train.sh and fill in the placeholder values:
- <CONDA_ENV>: Your conda environment name (e.g., pdcast)
- <GPU_IDS>: GPU IDs to use (e.g., 0,1,2,3 for 4 GPUs)
- <NUM_GPUS>: Number of GPUs (e.g., 4)
- <MASTER_PORT>: Port for distributed training (e.g., 22019)
- <RUN_NAME>: Name for your training run (e.g., pdcast_kitti360_exp1)
- <PROJECT_ROOT>: Absolute path to this repository
- <CONFIG_FILE>: Config filename (e.g., pdcast_kitti360.yaml)
Run the training script:
```
bash scripts/train.sh
```

Evaluation

To evaluate a trained model:

Open scripts/eval.sh and fill in the placeholder values:
- <CONDA_ENV>: Your conda environment name (e.g., pdcast)
- <GPU_ID>: Single GPU ID for evaluation (e.g., 0)
- <MASTER_PORT>: Port number (e.g., 22019)
- <RUN_NAME>: Name for evaluation run (e.g., pdcast_eval)
- <PROJECT_ROOT>: Absolute path to this repository
- <CONFIG_FILE>: Config filename (e.g., pdcast_kitti360.yaml)
- <CHECKPOINT_PATH>: Absolute path to your model checkpoint (.pth file)
Run the evaluation script:
```
bash scripts/eval.sh
```

The evaluation will output PDC-Q (Panoptic Depth Consistency Quality) metrics at forecasting horizons of 1, 3, and 5 frames.

🎯 Pre-trained Models

We provide pre-trained model weights for PDcast on KITTI-360. The model achieves the following performance on the validation set:

Dataset	PDC-Q Δ1	PDC-Q Δ3	PDC-Q Δ5	Download
KITTI-360	0.429	0.377	0.338	checkpoint

PDC-Q (Panoptic Depth Consistency Quality) measures the joint forecasting quality of panoptic segmentation and depth estimation across different temporal horizons (1, 3, and 5 frames ahead).

To use the pre-trained model:

Download the checkpoint from the link above
Update the <CHECKPOINT_PATH> in scripts/eval.sh with the path to the downloaded file
Run evaluation as described in the Evaluation section

👩‍⚖️ License

For academic usage, the code is released under the GPLv3 license. For any commercial purpose, please contact the authors.

🙏 Acknowledgment

This codebase is built upon and adapted from the open-source
CoDEPS repository:
https://github.com/robot-learning-freiburg/CoDEPS

We sincerely thank the authors for making their implementation publicly available.

This work was funded by the German Research Foundation (DFG) Emmy Noether Program,
grant number 468878300.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
algos		algos
cfg		cfg
datasets		datasets
eval		eval
io_utils		io_utils
misc		misc
models		models
pdcast		pdcast
scripts		scripts
LICENSE		LICENSE
README.md		README.md
environment.yml		environment.yml
train.py		train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Panoptic-Depth Forecasting

📔 Abstract

🏗 Setup

💻 System Requirements

⚙️ Installation

💾 Data preparation

KITTI-360

🏃 Running the Code

Configuration

Training

Evaluation

🎯 Pre-trained Models

👩‍⚖️ License

🙏 Acknowledgment

📬 Contacts

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Panoptic-Depth Forecasting

📔 Abstract

🏗 Setup

💻 System Requirements

⚙️ Installation

💾 Data preparation

KITTI-360

🏃 Running the Code

Configuration

Training

Evaluation

🎯 Pre-trained Models

👩‍⚖️ License

🙏 Acknowledgment

📬 Contacts

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages