53 commits
6198fa1
Fix README examples and align default model URI in example scripts
MalarzDawid Feb 22, 2026
991a861
Add examples extra dependencies for OpenCV and benchmark tooling
MalarzDawid Feb 22, 2026
3ab6475
implement state loading / saving
lapp0 Feb 23, 2026
f1c93e1
moe + fbgemm optimization
lapp0 Feb 23, 2026
c6f95be
wp-1.5 staging
lapp0 Mar 4, 2026
7cf8c25
clean up and fix ae
lapp0 Mar 4, 2026
586a3c0
fix temporal compression rope bugs
lapp0 Mar 4, 2026
5125dc1
vae reset in world_engine.reset
lapp0 Mar 4, 2026
9ba9b4d
reduce peak memory
lapp0 Mar 5, 2026
4c5ecb5
Implements the orthorope angles computation instead of precomputing (…
Clydingus Mar 9, 2026
bf90520
test: revert direct device init (#28)
Clydingus Mar 9, 2026
177101f
feat: use built triton-windows fork to fix long-path issue
philpax Mar 6, 2026
fe5873d
update gen_sample
lapp0 Mar 9, 2026
1935b64
better quant
lapp0 Mar 9, 2026
facd12a
avoid warning when creating mouse / scroll tensors
lapp0 Mar 9, 2026
b2b3fb6
disable unimportant compile options
lapp0 Mar 9, 2026
48d5f68
Merge remote-tracking branch 'origin/wp-1.5' into wp1.5
lapp0 Mar 9, 2026
8f84795
clean up model loading
lapp0 Mar 9, 2026
3df610b
remove unnecessary push_to_hub
lapp0 Mar 9, 2026
a437614
remove unnecessary save_pretrained
lapp0 Mar 9, 2026
c076f88
Merge pull request #29 from Overworldai/use-patched-triton-windows
philpax Mar 9, 2026
39630e4
reduce cpu memory
lapp0 Mar 9, 2026
5ba3689
Merge remote-tracking branch 'origin/wp-1.5' into wp1.5
lapp0 Mar 9, 2026
235276f
pass device
lapp0 Mar 10, 2026
d4fd76a
fix #27 - use triton-windows longpath fix
philpax Mar 10, 2026
4469f3e
cleanup dead code
lapp0 Mar 18, 2026
aa02df5
Merge remote-tracking branch 'refs/remotes/origin/wp-1.5' into wp-1.5
lapp0 Mar 18, 2026
4511a7b
auto 720p
lapp0 Mar 18, 2026
844c44c
ensure correct device
lapp0 Mar 19, 2026
3612b8b
no internal model URIs, document requirements in docstring at top
lapp0 Mar 19, 2026
f5cc301
update readme to document WP1.5
lapp0 Mar 19, 2026
5c910b7
update readme to document WP1.5
lapp0 Mar 19, 2026
274d685
no fbgemm dep
lapp0 Mar 20, 2026
ae9ac61
benchmark dont force AE
lapp0 Mar 20, 2026
25503c1
move kv cache to appropriate device
lapp0 Mar 20, 2026
da66a6d
improve example w/ four_frames var
lapp0 Mar 20, 2026
f482fc2
improve example w/ four_frames var
lapp0 Mar 20, 2026
e8cd112
improve example w/ four_frames var
lapp0 Mar 20, 2026
9308e9e
dev dependency group for examples, uv docs
lapp0 Mar 20, 2026
c2261dd
dev dependency group for examples, uv docs
lapp0 Mar 20, 2026
e2060f0
fix a missing word
lapp0 Mar 20, 2026
c487603
Credit PR #20
lapp0 Mar 20, 2026
3d8b327
remove rotary embedding pytorch dependency
lapp0 Mar 21, 2026
ebe31bb
improve throughput by 4%
lapp0 Mar 21, 2026
3ff84dd
add non-blocking benchmark
lapp0 Mar 21, 2026
60c8864
no compile prep inputs
lapp0 Mar 23, 2026
f5bf64e
compile prep inputs after converting to tensor, avoid blocking
lapp0 Mar 23, 2026
2926cfb
fix, don't pass device to prompt encoder
lapp0 Mar 23, 2026
946be2c
fix button incorrect
lapp0 Mar 23, 2026
f0be311
fix button incorrect
lapp0 Mar 23, 2026
a236313
benchmark w/ ctrls
lapp0 Mar 25, 2026
9607bcf
benchmark w/ ctrls
lapp0 Mar 25, 2026
1d286e1
update config defaults
lapp0 Mar 25, 2026
23 changes: 18 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
@@ -60,7 +60,7 @@ export HF_TOKEN=<your access token>
from world_engine import WorldEngine, CtrlInput

# Create inference engine
engine = WorldEngine("Overworld/Waypoint-1-Small", device="cuda")
engine = WorldEngine("Overworld/Waypoint-1.5-1B", device="cuda")

# Specify a prompt
engine.set_prompt("A fun game")
@@ -77,14 +77,25 @@ for controller_input in [
img = engine.gen_frame(ctrl=controller_input)

this probably needs to be updated for 4-frame use, or this snippet should be deleted entirely and pointed at one of the examples

Collaborator Author

IMO, the Waypoint-1.5 clarification below on the nature of img is sufficient

I'd add a comment pointing to the clarification below, so they have an idea of what to expect for the shape of img

```

## Waypoint-1.5 Behavior
All interfaces and handling for Waypoint-1 (or 1.1) and Waypoint-1.5 remain the same **except** the following:

In Waypoint-1.5, the `img` passed to `append_frame(...)` and returned by `gen_frame(...)` is now a sequence of 4 frames. Waypoint-1.5 applies temporal compression and generates 4 frames for every controller input.
@philpax, Mar 19, 2026:

describe the implications this has for frame pacing; what's the correct way to feed inputs and display the rendered frames to the user?


Previously, `img` was a uint8 RGB array of shape `[Height, Width, 3]`; **in Waypoint-1.5 it has shape `[4, Height, Width, 3]`**.

Additionally, Waypoint-1.5 expects 720p inputs and outputs, so `img` has shape `[4, 720, 1280, 3]`.

See [examples/gen_sample.py](./examples/gen_sample.py) for reference.
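
A minimal sketch of how client code might handle both output shapes described above. The helper names `as_frame_batch` and `display_times` are hypothetical and not part of `world_engine`. The 60 fps default is an assumption borrowed from `examples/gen_sample.py`, and even spacing is only one possible pacing policy; it does not settle the frame-pacing question raised in review.

```python
import numpy as np


def as_frame_batch(img: np.ndarray) -> np.ndarray:
    """Return frames as [N, H, W, 3], whether img is one frame or a batch.

    Hypothetical helper: Waypoint-1 style output is [H, W, 3],
    Waypoint-1.5 style output is [4, H, W, 3].
    """
    if img.ndim == 3:
        return img[None]  # add a leading frame axis
    if img.ndim == 4:
        return img
    raise ValueError(f"expected 3 or 4 dimensions, got {img.ndim}")


def display_times(start: float, n_frames: int, fps: float = 60.0) -> list:
    """Evenly spaced presentation timestamps (seconds) for one batch of frames."""
    return [start + i / fps for i in range(n_frames)]


# Stand-in arrays instead of real engine output.
single = np.zeros((720, 1280, 3), dtype=np.uint8)
batch = np.zeros((4, 720, 1280, 3), dtype=np.uint8)

print(as_frame_batch(single).shape)
print(as_frame_batch(batch).shape)
print(display_times(0.0, 4))
```

With real output, the tensor returned by `gen_frame(...)` would first be converted with `.cpu().numpy()`, as `examples/gen_sample.py` does.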

## Usage
```
from world_engine import WorldEngine, CtrlInput
```

Load model to GPU
```
engine = WorldEngine("Overworld/Waypoint-1-Small", device="cuda")
engine = WorldEngine("Overworld/Waypoint-1.5-1B", device="cuda")
```

Specify a prompt which will be used until this function is called again
@@ -118,11 +129,13 @@ Note: returned `img` is always on the same device as `engine.device`
@dataclass
class CtrlInput:
    button: Set[int] = field(default_factory=set) # pressed button IDs
    mouse: Tuple[float, float] = (0.0, 0.0) # (x, y) position
    mouse: Tuple[float, float] = (0.0, 0.0) # (dx, dy) position change
    scroll_wheel: int = 0 # down, stationary, or up -> (-1, 0, 1)
```

- `button` keycodes are defined by [Owl-Control](https://github.com/Overworldai/owl-control/blob/main/src/system/keycode.rs)
- `mouse` is the raw mouse velocity vector
- `mouse` is the change in mouse position since the last frame
- `scroll_wheel` is the ternary scroll wheel movement identifier
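
To illustrate the `(dx, dy)` semantics, here is a small sketch that derives a `CtrlInput` from two successive absolute cursor positions. The `CtrlInput` class below is a stand-in that mirrors the README definition so the snippet runs on its own, and `ctrl_from_positions` is a hypothetical helper. The source does not state the units or scaling expected for `mouse`, so no normalization is applied.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class CtrlInput:
    # Stand-in mirroring the README definition; real code would import
    # CtrlInput from world_engine instead.
    button: Set[int] = field(default_factory=set)
    mouse: Tuple[float, float] = (0.0, 0.0)
    scroll_wheel: int = 0


def ctrl_from_positions(prev_xy, cur_xy, buttons=(), scroll=0):
    """Build a CtrlInput whose mouse field is the change since prev_xy."""
    dx = cur_xy[0] - prev_xy[0]
    dy = cur_xy[1] - prev_xy[1]
    # Clamp raw scroll input to the ternary range -1, 0, 1.
    scroll = max(-1, min(1, scroll))
    return CtrlInput(button=set(buttons), mouse=(dx, dy), scroll_wheel=scroll)


ctrl = ctrl_from_positions((100.0, 50.0), (104.0, 47.5), buttons=[32], scroll=3)
print(ctrl)
```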


## Showcase and Examples
@@ -138,5 +151,5 @@ class CtrlInput:

### Examples and Reference Code

- ["Hello (Over)World" client](./examples/simple_client.py)
- [Generate MP4 Sample Given Controller Inputs](./examples/gen_sample.py)
- [Run Performance Benchmarks (`pytest examples/benchmark.py`)](./examples/benchmark.py)
41 changes: 30 additions & 11 deletions examples/benchmark.py
@@ -1,12 +1,14 @@
# MODEL_URI="Overworld/Waypoint-1.5-1B" uv run --dev pytest examples/benchmark.py

import os
import pytest
import torch
import random

from world_engine import WorldEngine
from world_engine import WorldEngine, CtrlInput


# TODO
# - benchmark encode img
# - benchmark encode prompt
MODEL_URI = os.environ.get("MODEL_URI", "Overworld/Waypoint-1-Small")


def version_with_commit(pkg):
@@ -49,7 +51,7 @@ def print_env_info():


def get_warm_engine(model_uri, model_overrides=None):
    model_config_overrides = {"ae_uri": "OpenWorldLabs/owl_vae_f16_c16_distill_v0_nogan"}
    model_config_overrides = {}
    model_config_overrides.update(model_overrides or {})
    engine = WorldEngine(
        model_uri,
@@ -65,8 +67,8 @@ def get_warm_engine(model_uri, model_overrides=None):


@pytest.fixture(scope="session")
def engine(model_uri="Overworld/Waypoint-1-Small"):
    return get_warm_engine(model_uri)
def engine():
    return get_warm_engine(MODEL_URI)


@pytest.fixture(scope="session")
@@ -86,23 +88,40 @@ def run():
MODEL_OVERRIDES = [None]


@pytest.mark.parametrize("blocking", [True, False])
@pytest.mark.parametrize("dit_only", [True])
@pytest.mark.parametrize("n_frames", [256])
@pytest.mark.parametrize(
"model_overrides", MODEL_OVERRIDES,
ids=lambda d: (",".join(f"{k}={v}" for k, v in d.items()) or "") if d else ""
)
def test_ar_rollout(benchmark, dit_only, n_frames, model_overrides):
    engine = get_warm_engine("Overworld/Waypoint-1-Small", model_overrides=model_overrides)
def test_ar_rollout(benchmark, dit_only, n_frames, model_overrides, blocking):
    engine = get_warm_engine(MODEL_URI, model_overrides=model_overrides)

    try:
        total_params = sum(p.numel() for p in engine.model.parameters())
        active_params = int(engine.model.get_active_parameters())
        benchmark.name = f"{benchmark.name} | params={total_params:,} | active={active_params:,}"
    except Exception:
        pass

    def setup():
        engine.reset()
        engine.gen_frame(return_img=not dit_only)
        torch.cuda.synchronize()

def target():
for _ in range(n_frames):
ctrls = [
CtrlInput(
button=set(random.sample(range(1, 65), random.randint(0, 10))),
mouse=(random.random(), random.random()),
scroll_wheel=random.choice((-1, 0, 1))
)
for _ in range(n_frames)
]
for ctrl in ctrls:
engine.gen_frame(return_img=not dit_only)
torch.cuda.synchronize()
if blocking:
torch.cuda.synchronize()

benchmark.pedantic(target, setup=setup, rounds=20)
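
As shown in the diff, the benchmark builds its random controller inputs inside `target()` from the global `random` state. One possible variation, sketched here as an assumption rather than as the PR's approach, is to build them ahead of time from a seeded generator so that runs are reproducible and input construction can stay out of the timed region. `CtrlInput` below is a stand-in mirroring the README definition, and `make_random_ctrls` is a hypothetical helper.

```python
import random
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class CtrlInput:
    # Stand-in mirroring the README definition; real code would import
    # CtrlInput from world_engine instead.
    button: Set[int] = field(default_factory=set)
    mouse: Tuple[float, float] = (0.0, 0.0)
    scroll_wheel: int = 0


def make_random_ctrls(n: int, seed: int = 0) -> List[CtrlInput]:
    """Return n random controller inputs drawn from a private, seeded RNG."""
    rng = random.Random(seed)  # leaves the global random state untouched
    return [
        CtrlInput(
            # Same ranges as the benchmark: up to 10 button IDs from 1..64.
            button=set(rng.sample(range(1, 65), rng.randint(0, 10))),
            mouse=(rng.random(), rng.random()),
            scroll_wheel=rng.choice((-1, 0, 1)),
        )
        for _ in range(n)
    ]


ctrls = make_random_ctrls(256, seed=0)
print(len(ctrls), ctrls[0])
```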
63 changes: 47 additions & 16 deletions examples/gen_sample.py
@@ -1,22 +1,53 @@
# uv run --dev examples/gen_sample.py Overworld/Waypoint-1.5-1B

import cv2
from world_engine import WorldEngine
import imageio.v3 as iio
import random
import sys
import urllib.request
import numpy as np
import torch

from world_engine import WorldEngine, CtrlInput


# Create inference engine
engine = WorldEngine(sys.argv[1], device="cuda")


# Define sequence of controller inputs applied
controller_sequence = [
    # move mouse, jump, do nothing, trigger, do nothing, trigger+jump, do nothing
    CtrlInput(mouse=[0.2, 0.2]), CtrlInput(button={32}), CtrlInput(), CtrlInput(), CtrlInput(),
    CtrlInput(button={1}), CtrlInput(), CtrlInput(), CtrlInput(button={1, 32}),
    CtrlInput(), CtrlInput(), CtrlInput(), CtrlInput(), CtrlInput(), CtrlInput(),
] * 4
controller_sequence += [CtrlInput()] * 8
controller_sequence += (
    [CtrlInput(button={32})] * 10 + # forward
    [CtrlInput(button={65})] * 10 + # left
    [CtrlInput(button={68})] * 10 + # right
    [CtrlInput(button={83})] * 10 # backwards
)
controller_sequence += [CtrlInput()] * 10

def gen_vid():
    engine = WorldEngine("OpenWorldLabs/CoDCtl-Causal-Flux-SelfForcing", device="cuda")
    writer = None
    for _ in range(240):
        frame = engine.gen_frame().cpu().numpy()[:, :, ::-1] # RGB -> BGR for OpenCV
        writer = writer or cv2.VideoWriter(
            "out.mp4",
            cv2.VideoWriter_fourcc(*"mp4v"),
            60,
            (frame.shape[1], frame.shape[0])
        )
        writer.write(frame)

    writer.release()
# Set seed frame
url = random.choice([
    "https://gist.github.com/user-attachments/assets/d81c6d26-a838-4afe-9d13-fd67677043c3",
    "https://gist.github.com/user-attachments/assets/b6d18c38-098e-43b0-8e61-66a16e5d8946",
    "https://gist.github.com/user-attachments/assets/0734a8c1-3eb4-4ffe-8c37-5665c45ab559",
    "https://gist.github.com/user-attachments/assets/f9c20d4d-7565-452d-8b02-42a85ea175ed",
    "https://gist.github.com/user-attachments/assets/68c943a4-008a-4c25-948c-c81ab4c47d21",
])
seed_frame = cv2.imdecode(np.frombuffer(urllib.request.urlopen(url).read(), np.uint8), cv2.IMREAD_COLOR)
seed_frame_x4 = torch.from_numpy(np.repeat(seed_frame[None], 4, axis=0))


if __name__ == "__main__":
    gen_vid()
# Generate frames conditioned on controller inputs
with iio.imopen("out.mp4", "w", plugin="pyav") as out:
    engine.append_frame(seed_frame_x4)
    out.write(seed_frame_x4, fps=60, codec="libx264")
    for ctrl in controller_sequence:
        four_frames = engine.gen_frame(ctrl=ctrl).cpu().numpy()
        out.write(four_frames)
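
One detail worth checking in the seed-frame step: `cv2.imdecode` with `cv2.IMREAD_COLOR` returns pixels in BGR order, while the README describes `img` as an RGB array. If the engine and the video writer do expect RGB (an assumption; the diff does not show a conversion either way), the seed frame would need its channels reversed before being repeated 4 times. A minimal NumPy-only sketch, with `make_seed_batch` as a hypothetical helper name:

```python
import numpy as np


def make_seed_batch(frame_bgr: np.ndarray, n_frames: int = 4) -> np.ndarray:
    """Convert one BGR frame to RGB and repeat it into [n_frames, H, W, 3]."""
    if frame_bgr.ndim != 3 or frame_bgr.shape[2] != 3:
        raise ValueError("expected an [H, W, 3] image")
    frame_rgb = frame_bgr[:, :, ::-1]  # reverse the channel axis: BGR -> RGB
    # np.repeat returns a new contiguous array, so the reversed view's
    # negative strides do not carry over to the result.
    return np.repeat(frame_rgb[None], n_frames, axis=0)


# Tiny stand-in image: pure blue when interpreted as BGR.
bgr = np.zeros((2, 3, 3), dtype=np.uint8)
bgr[..., 0] = 255

seed = make_seed_batch(bgr)
print(seed.shape)
print(seed[0, 0, 0].tolist())
```

The result could then be passed to `torch.from_numpy(...)` in place of the array built directly from the decoded image.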
8 changes: 7 additions & 1 deletion examples/prof.py
@@ -1,11 +1,17 @@
"""
Additional Dependencies: N/A
Run: `python3 examples/prof.py Overworld/Waypoint-1.5-1B`
"""
import sys

import torch
from torch.profiler import profile, ProfilerActivity

from world_engine import WorldEngine


def do_profile(n_frames=64, row_limit=20):
    engine = WorldEngine("OpenWorldLabs/CoDCtl-Causal-Flux-SelfForcing", device="cuda")
    engine = WorldEngine(sys.argv[1], device="cuda")
    # warmup
    for _ in range(4):
        engine.gen_frame()
64 changes: 0 additions & 64 deletions examples/simple_client.py

This file was deleted.

19 changes: 14 additions & 5 deletions pyproject.toml
@@ -4,14 +4,14 @@ build-backend = "setuptools.build_meta"

[project]
name = "world_engine"
version = "1.0.0"
requires-python = ">=3.9"
version = "1.5.0"
requires-python = ">=3.10"
dependencies = [
"taehv @ git+https://github.com/madebyollin/taehv.git@7dc60ec6601af2e668e31bc70acc4cb3665e4c22",
"torch==2.10.0",
"torchvision==0.25.0",
"torchaudio==2.10.0",
"einops",
"rotary-embedding-torch>=0.8.8",
"tensordict==0.10.0",
"transformers==4.57.3",
"ftfy",
@@ -21,8 +21,8 @@ dependencies = [
"accelerate==1.12.0",

# Triton (platform-specific)
"triton; sys_platform == 'linux'",
"triton-windows; sys_platform == 'win32'",
"triton==3.6.0; sys_platform == 'linux'",
"triton-windows==3.6.0.post26; sys_platform == 'win32'",
]

[tool.setuptools]
@@ -33,3 +33,12 @@ packages = [

[tool.setuptools.package-dir]
world_engine = "src"

[dependency-groups]
dev = [
"pytest",
"pytest-benchmark",
"opencv-python",
"imageio[pyav]",
"numpy",
]