R2-Dreamer: Redundancy-Reduced World Models without Decoders or Augmentation

This repository provides a PyTorch implementation of R2-Dreamer (ICLR 2026), a computationally efficient world model that achieves high performance on continuous control benchmarks. It also includes an efficient PyTorch DreamerV3 reproduction that trains ~5x faster than a widely used codebase, along with other baselines. Selecting R2-Dreamer via the config provides an additional ~1.6x speedup over this baseline.

Instructions

Install dependencies. This repository is tested with Ubuntu 24.04 and Python 3.11.

If you prefer Docker, follow docs/docker.md.

# Installing via a virtual env like uv is recommended.
pip install -r requirements.txt

Run training on default settings:

python3 train.py logdir=./logdir/test

Monitoring results:

tensorboard --logdir ./logdir

Switching algorithms:

# Choose an algorithm via model.rep_loss:
# r2dreamer|dreamer|infonce|dreamerpro|nedreamer
python3 train.py model.rep_loss=r2dreamer

nedreamer (NE-Dreamer, Bredis et al., 2026) replaces the pixel decoder with a causal temporal transformer that predicts the next-step encoder embedding and aligns it to a stop-gradient target via Barlow Twins. Hyperparameters live under model.nedreamer.*; the three Sec. 4.3 ablations are exposed as model.nedreamer.use_transformer, model.nedreamer.use_shift, and model.nedreamer.use_projector. The implementation plan is in docs/nedreamer_plan.md.
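For readers unfamiliar with the Barlow Twins objective mentioned above, here is a minimal pure-Python sketch of the generic loss. It is not the code in this repository: the function names, the list-of-lists batch layout, and the off_diag_weight value are assumptions chosen for illustration. In a PyTorch implementation the target batch would typically be detached (stop-gradient) before being passed in.

```python
import math


def standardize_columns(batch):
    """Normalize each feature to zero mean and unit variance over the batch."""
    n, dim = len(batch), len(batch[0])
    out = [[0.0] * dim for _ in range(n)]
    for j in range(dim):
        col = [row[j] for row in batch]
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / n
        std = math.sqrt(var) + 1e-8  # small constant avoids division by zero
        for i in range(n):
            out[i][j] = (col[i] - mean) / std
    return out


def barlow_twins_loss(pred, target, off_diag_weight=5e-3):
    """Generic Barlow Twins loss between predicted and target embeddings.

    pred, target: lists of equal-length feature vectors, one per batch item.
    off_diag_weight: illustrative value, not taken from this repository.
    """
    z1 = standardize_columns(pred)
    z2 = standardize_columns(target)
    n, dim = len(z1), len(z1[0])
    loss = 0.0
    for a in range(dim):
        for b in range(dim):
            # Entry (a, b) of the cross-correlation matrix.
            c_ab = sum(z1[i][a] * z2[i][b] for i in range(n)) / n
            if a == b:
                loss += (1.0 - c_ab) ** 2  # pull the diagonal toward 1
            else:
                loss += off_diag_weight * c_ab ** 2  # push off-diagonals toward 0
    return loss
```

The diagonal term rewards agreement between prediction and target per feature, while the off-diagonal term penalizes redundancy across features.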

Curious Replay

Curious Replay (Kauvar & Doyle et al., ICML 2023) is available as a prioritized sampling option, orthogonal to the choice of rep_loss. Enable with:

python3 train.py model.rep_loss=nedreamer model.curious_replay.enabled=True env=crafter

The buffer's per-transition priority follows Eq. 1 of the paper:

p_i = c * beta^(v_i) + (|L_i| + eps)^alpha

where v_i is the visit count and L_i = |dyn + rew + cont| is the per-step world-model loss (computed in dreamer.py:_cal_grad and passed back through buffer.update_priority). All five knobs (c, beta, alpha, eps, p_max) are exposed under model.curious_replay.* with the paper's defaults.
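The priority formula can be sketched in a few lines of Python. This is an illustration of the equation only, not the buffer code in buffer.py: the function names are hypothetical, the parameter values are illustrative and may differ from the defaults under model.curious_replay.*, and p_max handling is left out. The sampling helper assumes transitions are drawn in proportion to their priority, as in standard prioritized replay.

```python
def curious_priority(visit_count, loss, c, beta, alpha, eps):
    """Evaluate p_i = c * beta**v_i + (|L_i| + eps)**alpha for one transition."""
    count_term = c * beta ** visit_count    # decays as the transition is revisited
    loss_term = (abs(loss) + eps) ** alpha  # grows with the world-model loss
    return count_term + loss_term


def sampling_probabilities(priorities):
    """Normalize priorities into sampling probabilities (assumed proportional)."""
    total = sum(priorities)
    return [p / total for p in priorities]


# Illustrative values only; check model.curious_replay.* for the actual defaults.
params = dict(c=1e4, beta=0.7, alpha=0.7, eps=0.01)
fresh = curious_priority(visit_count=0, loss=0.5, **params)
stale = curious_priority(visit_count=20, loss=0.5, **params)
print(fresh > stale)  # a never-sampled transition outranks a frequently sampled one
```

With beta below 1, the count term dominates for new transitions and fades after repeated visits, at which point the loss term decides the ranking.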

Validation: Atari-100k (size12M, 1 seed instead of 5, 410k env steps each)

Each run was trained on a single A100 80GB at ~380 env-steps/sec with model.compile=True, about 25 minutes of wall-clock time per game. Each checkpoint is evaluated on 3 episodes (env.eval_episode_num=3, trainer.eval_every=2e4).

Game	Init eval	Best eval	Final eval (400k)	Notes
Pong	-20.7	-11.3 @ 400k	-11.3	Monotonic improvement; `loss/ne` 1025 → 40
Breakout	0.0	7.3 @ 380k	3.7	Eval oscillates between scoring and the no-FIRE time-limit stall (a known Atari-100k Breakout pathology when `autostart: False`)
Boxing	-11.0	65.7 @ 380k	61.0	Strong learning; agent dominates the bot late in training

For easier code reading, inline tensor shape annotations are provided. See docs/tensor_shapes.md.

Available Benchmarks

At the moment, the following benchmarks are available in this repository.

Environment	Observation	Action	Budget	Description
Meta-World	Image	Continuous	1M	Robotic manipulation with complex contact interactions.
DMC Proprio	State	Continuous	500K	DeepMind Control Suite with low-dimensional inputs.
DMC Vision	Image	Continuous	1M	DeepMind Control Suite with high-dimensional image inputs.
DMC Subtle	Image	Continuous	1M	DeepMind Control Suite with tiny task-relevant objects.
Atari 100k	Image	Discrete	400K	26 Atari games.
Crafter	Image	Discrete	1M	Survival environment that evaluates diverse agent abilities.
Memory Maze	Image	Discrete	100M	3D mazes to evaluate RL agents' long-term memory.

Use Hydra to select a benchmark and a specific task using env and env.task, respectively.

python3 train.py ... env=dmc_vision env.task=dmc_walker_walk
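When launching several runs from a script, the key=value overrides can be assembled programmatically. The sketch below only builds the argument list; build_train_command is a hypothetical helper, not part of this repository, and the specific combination of overrides (taken from the commands shown in this README) is an assumption.

```python
import shlex


def build_train_command(overrides, script="train.py"):
    """Turn a dict of Hydra-style overrides into an argument list for train.py."""
    args = ["python3", script]
    args += [f"{key}={value}" for key, value in overrides.items()]
    return args


cmd = build_train_command({
    "env": "dmc_vision",
    "env.task": "dmc_walker_walk",
    "model.rep_loss": "r2dreamer",
    "logdir": "./logdir/walker_walk_r2dreamer",  # hypothetical log directory
})
print(shlex.join(cmd))  # shell-ready string; pass cmd to subprocess.run to launch
```

Keeping the overrides in a dict makes it easy to loop over tasks or rep_loss values while giving each run its own logdir.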

Headless rendering

If you run MuJoCo-based environments (DMC / Meta-World) on headless machines, you may need to set MUJOCO_GL for offscreen rendering. Using EGL is recommended as it accelerates rendering, leading to faster simulation throughput.

# For example, when using EGL (GPU)
export MUJOCO_GL=egl
# (optional) Choose which GPU EGL uses
export MUJOCO_EGL_DEVICE_ID=0
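The same variables can also be set from Python, for example in a notebook or a launcher script. This is a minimal sketch: configure_headless_rendering is a hypothetical helper, not part of this repository, and it assumes the variables are set before any MuJoCo-based package is imported, since the rendering backend is typically chosen at import time.

```python
import os


def configure_headless_rendering(backend="egl", device_id=None):
    """Set MuJoCo rendering variables unless the user has already exported them."""
    os.environ.setdefault("MUJOCO_GL", backend)
    if device_id is not None:
        os.environ.setdefault("MUJOCO_EGL_DEVICE_ID", str(device_id))
    keys = ("MUJOCO_GL", "MUJOCO_EGL_DEVICE_ID")
    return {key: os.environ[key] for key in keys if key in os.environ}


# Call this first, then import the MuJoCo-based environment packages.
settings = configure_headless_rendering(backend="egl", device_id=0)
print(settings)
```

Using setdefault keeps any value exported in the shell, so the helper never overrides an explicit user choice.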

More details: Working with MuJoCo-based environments

Code formatting

If you want automatic formatting/basic checks before commits, you can enable pre-commit:

pip install pre-commit
# This sets up a pre-commit hook so that checks are run every time you commit
pre-commit install
# Manual pre-commit run on all files
pre-commit run --all-files

Citation

If you find this code useful, please consider citing:

@inproceedings{
morihira2026rdreamer,
title={R2-Dreamer: Redundancy-Reduced World Models without Decoders or Augmentation},
author={Naoki Morihira and Amal Nahar and Kartik Bharadwaj and Yasuhiro Kato and Akinobu Hayashi and Tatsuya Harada},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=Je2QqXrcQq}
}
