ci(gpu): point LIBERO at bundled package assets in nightly test #322

Merged
shuheng-liu merged 1 commit into main from claude/magical-banach-eb58a8 on May 22, 2026

@shuheng-liu
Member

@shuheng-liu shuheng-liu commented May 22, 2026

What this does

Fixes #319.

The nightly GPU test test_control_freq_reaches_real_robosuite_sim (added in #312) builds a real OffScreenRenderEnv to verify the configured control_freq actually reaches robosuite — the mocked sibling tests can't catch robosuite silently dropping the kwarg. Building a real env needs LIBERO's bddl_files/init_files/assets, but gpu_test.yml set LIBERO_CONFIG_PATH to an empty /tmp tree, so the test failed with:

FileNotFoundError: .../init_files/libero_10/LIVING_ROOM_SCENE2_..._in_the_basket.pruned_init

The empty-tree config is correct for the mocked CPU tests (they patch get_libero_path), but the real-sim GPU test needs actual assets. Note that just setting init_states=False in the test is not sufficient on its own: the env build in src/opentau/envs/libero.py reads the bddl file via get_libero_path("bddl_files"), so it would then fail on the missing bddl file instead. The workflow fix is the necessary part.
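To illustrate why the mocked sibling tests cannot stand in for the real-sim test, here is a minimal sketch. `FakeSim` and `make_env` are hypothetical stand-ins, not robosuite or OpenTau code, and the default of 20 is arbitrary. A mocked constructor only proves the kwarg was passed, while a read-back from the constructed object reveals that it was dropped.

```python
from unittest import mock


class FakeSim:
    """Hypothetical simulator that accepts any kwargs but ignores control_freq."""

    def __init__(self, **kwargs):
        # Bug being modelled: control_freq is accepted but never stored,
        # so the simulator keeps its built-in (arbitrary) default.
        self.control_freq = 20


def make_env(sim_cls, control_freq):
    """Hypothetical wrapper that forwards control_freq to the simulator."""
    return sim_cls(control_freq=control_freq)


# Mocked check: only proves that the wrapper passed the kwarg along.
mocked_cls = mock.Mock()
make_env(mocked_cls, control_freq=10)
mocked_cls.assert_called_once_with(control_freq=10)  # passes despite the bug

# Real-object check: reads the value back from the constructed simulator.
env = make_env(FakeSim, control_freq=10)
readback_matches = env.control_freq == 10  # False: the kwarg was dropped
```

The mocked assertion succeeds either way, so only a test that builds the real object and reads the value back can detect this class of failure.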

Changes:

  • .github/workflows/gpu_test.yml: point LIBERO_CONFIG_PATH at the assets bundled inside the installed LIBERO package, via LIBERO's own set_libero_default_path(), instead of the empty /tmp tree. The fork's MANIFEST.in contains a graft libero directive, which ships bddl_files/init_files/assets with the package. As a bonus, this also drops a pre-existing SC2155 shellcheck warning on the old export VAR="$(pwd)/..." line.
  • tests/envs/test_libero_control_freq.py: set init_states=False to focus the test on its control_freq assertion (matching the two mocked sibling tests); the env build still exercises the real bddl/asset path.
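The workflow change can be modelled in a few lines of standard-library Python. This is only a sketch of the mechanism, not LIBERO's implementation: `default_path_dict`, `bootstrap_config`, the key names, and the example package root are all assumptions made for illustration.

```python
import os
import tempfile


def default_path_dict(package_root):
    """Hypothetical stand-in for the package-relative paths written to the config."""
    return {
        "benchmark_root": package_root,
        "bddl_files": os.path.join(package_root, "bddl_files"),
        "init_states": os.path.join(package_root, "init_files"),
        "assets": os.path.join(package_root, "assets"),
    }


def bootstrap_config(config_dir, package_root):
    """Pre-create config.yaml, then overwrite it with package-relative paths."""
    os.makedirs(config_dir, exist_ok=True)
    config_file = os.path.join(config_dir, "config.yaml")
    # Step 1: an empty file is enough to satisfy a "does the config exist?" check.
    open(config_file, "a").close()
    # Step 2: overwrite it with real paths (flat "key: value" lines are valid YAML).
    with open(config_file, "w") as f:
        for key, value in default_path_dict(package_root).items():
            f.write(f"{key}: {value}\n")
    return config_file


# Example package root is made up; in CI it would be the installed package dir.
EXAMPLE_ROOT = "/opt/site-packages/libero/libero"

with tempfile.TemporaryDirectory() as tmp:
    cfg = bootstrap_config(os.path.join(tmp, "libero_config"), EXAMPLE_ROOT)
    with open(cfg) as f:
        config_text = f.read()
```

Any later process that inherits the same config directory then resolves lookups such as bddl_files into the package root instead of an empty tree.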

How it was tested

  • actionlint on the edited workflow: the Run Tests step is shellcheck-clean, and the change removes a pre-existing SC2155 warning.
  • Verified the LIBERO fork bundles the exact failing init_files/libero_10/*.pruned_init (~23 KB real file, not an LFS pointer) and the bddl_files/libero_10/*.bddl for the task under test, and that MANIFEST.in grafts them into the installed package.
  • Could not run the GPU test locally (no NVIDIA GPU; LIBERO not installed). Validation is the nightly job — gpu_test.yml was manually dispatched on this branch via workflow_dispatch; see the linked run in the checks.
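The "real file, not an LFS pointer" check can be automated with a small helper. The sketch below relies only on the fact that Git LFS pointer files begin with a fixed version line; `is_lfs_pointer` and the sample file names are hypothetical and are not part of this PR.

```python
import os
import tempfile

# Git LFS pointer files are small text files that start with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file looks like a Git LFS pointer rather than real content."""
    with open(path, "rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


with tempfile.TemporaryDirectory() as tmp:
    pointer = os.path.join(tmp, "pointer.pruned_init")
    real = os.path.join(tmp, "real.pruned_init")
    with open(pointer, "wb") as f:
        f.write(LFS_POINTER_PREFIX + b"\noid sha256:" + b"0" * 64 + b"\nsize 23000\n")
    with open(real, "wb") as f:
        f.write(b"\x00" * 23_000)  # stand-in for ~23 KB of binary init states
    pointer_result = is_lfs_pointer(pointer)
    real_result = is_lfs_pointer(real)
```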

How to check out & try? (for the reviewer)

# Dispatch the nightly GPU workflow on this branch:
gh workflow run gpu_test.yml --ref claude/magical-banach-eb58a8
# On a CUDA machine with the libero extra installed, run just the fixed test:
pytest -sx tests/envs/test_libero_control_freq.py::test_control_freq_reaches_real_robosuite_sim

Checklist

  • I have added Google-style docstrings to important functions and ensured function parameters are typed.
  • My PR includes policy-related changes.
    • If the above is checked: I have run the GPU pytests (pytest -m "gpu") and regression tests.

test_control_freq_reaches_real_robosuite_sim builds a real
OffScreenRenderEnv, which needs LIBERO's bddl/init/asset files. The
workflow set LIBERO_CONFIG_PATH to an empty /tmp tree (correct for the
mocked CPU tests, which patch get_libero_path), so the test failed with
FileNotFoundError on a .pruned_init file.

Generate the config from the assets bundled in the installed package (the
fork's MANIFEST grafts bddl_files/init_files/assets) via
set_libero_default_path(). Also set init_states=False in the test to focus
it on the control_freq assertion; the env build still exercises the real
bddl/asset path, so init_states alone would not have fixed it.
@shuheng-liu shuheng-liu added the bug Something isn't working label May 22, 2026
@shuheng-liu shuheng-liu self-assigned this May 22, 2026
@shuheng-liu shuheng-liu marked this pull request as ready for review May 22, 2026 19:27
Member Author

@shuheng-liu shuheng-liu left a comment

Review: the LIBERO-assets CI fix is correct and empirically validated.

I traced the workflow incantation end-to-end against the pinned LIBERO fork (45c5a01) and the mechanics hold:

  • touch "$LIBERO_CONFIG_PATH/config.yaml": libero/libero/__init__.py only prompts on if not os.path.exists(config_file). Pre-creating the file skips the interactive first-run input(), which would otherwise hit EOF and crash the import under CI. The empty file is harmless because nothing reads the config at import time (get_libero_path is lazy).
  • set_libero_default_path(): reads LIBERO_CONFIG_PATH from the env and writes package-relative paths (benchmark_root = the installed libero/libero/ dir) into that same config.yaml, overwriting the empty file. The subsequent pytest inherits the exported LIBERO_CONFIG_PATH, so get_libero_path("bddl_files") resolves into the installed package.
  • The assets actually land in the install: MANIFEST.in does graft libero, and — the load-bearing part — setup.py sets include_package_data=True, so the grafted bddl_files/init_files/assets end up in site-packages, not just the sdist.

Blast radius: the env-var change applies to the whole pytest -m "gpu" run, but test_control_freq_reaches_real_robosuite_sim is the only GPU-marked test that touches LIBERO config paths — every other LIBERO test is CPU/mocked and patches get_libero_path. No collateral on the other GPU policy tests.

init_states=False: doesn't weaken the test — the control_freq read-back is independent of init states, it matches the two mocked siblings, and the real bddl path is still exercised by _make_envs_task. Belt-and-suspenders with the workflow fix, which is fine.

SC2155: accurate — switching export VAR="$(pwd)/..." to a literal removes the command-substitution-masks-return-value warning, and the new quoted mkdir/touch lines don't introduce any.

Validation: the manually-dispatched GPU run (26306227758) shows Run Pytest on GPU = success; CPU / pre-commit / check-checklist are all green.

Two optional, non-blocking notes:

  1. The workflows now use two different LIBERO config strategies — gpu points at the installed package via set_libero_default_path(), while cpu/regression still point at the empty .github/assets/libero tree. Intentional and explained in the inline comment, but a future reader editing one might not notice the other differs; a one-line cross-pointer would help.
  2. set_libero_default_path() trusts that the install contains the assets. If a future LIBERO bump dropped a file (or moved assets to LFS without materializing them), this would resurface as a FileNotFoundError at env-build time rather than anything clearer. Not worth guarding now — the nightly is the right place to catch it.
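If a clearer failure than a late FileNotFoundError were ever wanted, a preflight check along these lines could run before the tests. This is a hedged sketch only: `missing_asset_dirs` is a hypothetical helper, and the required directory names are taken from the bddl_files/init_files/assets layout described in this PR.

```python
import os
import tempfile

# Directory names follow the layout described in this PR.
REQUIRED_SUBDIRS = ("bddl_files", "init_files", "assets")


def missing_asset_dirs(package_root, required=REQUIRED_SUBDIRS):
    """Return the required subdirectories that are absent or empty."""
    missing = []
    for name in required:
        path = os.path.join(package_root, name)
        if not os.path.isdir(path) or not os.listdir(path):
            missing.append(name)
    return missing


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "bddl_files", "libero_10"))  # populated
    os.makedirs(os.path.join(root, "init_files"))  # present but empty
    # "assets" is deliberately not created.
    result = missing_asset_dirs(root)
```

A directory-level check like this would not catch a single missing file or an unmaterialized LFS pointer, so the nightly run remains the authoritative signal.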

LGTM.


Generated by Claude Code

@shuheng-liu shuheng-liu merged commit b6316dd into main May 22, 2026
11 checks passed
@shuheng-liu shuheng-liu deleted the claude/magical-banach-eb58a8 branch May 22, 2026 19:34
Successfully merging this pull request may close these issues.

Nightly GPU test failure: test_control_freq_reaches_real_robosuite_sim — missing LIBERO init_files in CI