Skip to content

Exp: warm start CI before tests#565

Draft
xyao-nv wants to merge 2 commits intomainfrom
xyao/exp/ci_warm_start
Draft

Exp: warm start CI before tests#565
xyao-nv wants to merge 2 commits intomainfrom
xyao/exp/ci_warm_start

Conversation

@xyao-nv
Copy link
Copy Markdown
Collaborator

@xyao-nv xyao-nv commented Apr 9, 2026

Summary

Short description of the change (max 50 chars)

Detailed description

  • What was the reason for the change?
  • What has been changed?
  • What is the impact of this change?
image

Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice idea — pre-warming the shader cache so every subsequent SimulationApp in the CI run starts fast. A few observations:

Overall

  • The approach is sound: one throwaway SimulationApp launch compiles all RTX shaders into the on-disk cache, so the 4-5 test steps that follow skip the compilation tax. Clean and self-documenting.

scripts/warm_shader_cache.py

  1. argparse.Namespace construction — Manually assembling a Namespace(headless=True, enable_cameras=True, visualizer=[]) is fragile; if AppLauncher ever adds a required arg or changes its defaults, this will silently break. Consider using AppLauncher.add_app_launcher_args() to build the parser, then parse known args, or at minimum document why the hand-rolled namespace is sufficient.
  2. enable_cameras=True — Good call: this forces the rendering/RTX path so the shader compilation actually happens. Worth a short inline comment explaining why cameras are enabled (it's not obvious since this script doesn't render anything).
  3. Error handling — If AppLauncher or app.close() throws (e.g., no GPU, missing Kit runtime), the CI step will still fail with a non-zero exit, which is fine. But a try/except with a descriptive message could save debugging time on new runner configurations.
  4. Minor: The visualizer kwarg is passed as [] — is this needed? If AppLauncher defaults visualizer to None or [] already, this can be dropped for clarity.

.github/workflows/ci.yml

  • The warm step is correctly placed before the first test step in both test and test_policy jobs. Since the two jobs run in separate containers (different images), each needs its own warm-up — good.
  • The warm step doesn't set a timeout-minutes at the step level. Given shader compilation can take 60-180s (per the docstring), consider adding a step-level timeout (~5-10 min) as a safety net against hangs.

No blocking issues — this is a clean, well-scoped CI optimization. 👍

def main():
t0 = time.monotonic()
args = argparse.Namespace(headless=True, enable_cameras=True, visualizer=[])
launcher = AppLauncher(args)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Hand-rolling argparse.Namespace couples this to AppLauncher's internal expectations. If the launcher ever adds validation or required fields, this will break without a clear error.

Consider:

parser = argparse.ArgumentParser()
AppLauncher.add_app_launcher_args(parser)
args = parser.parse_args(["--headless", "--enable_cameras"])

This also self-documents the available flags.

def main():
t0 = time.monotonic()
args = argparse.Namespace(headless=True, enable_cameras=True, visualizer=[])
launcher = AppLauncher(args)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth a brief comment explaining why enable_cameras=True — the point is to trigger the RTX/rendering shader compilation path, not to actually use cameras. Future readers might otherwise wonder.

#
# To restore sanity here is a goal for Isaac Lab Arena v0.3.

- name: Warm Kit shader cache
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a step-level timeout-minutes: 10 here. The docstring says shader compilation can take 60-180s, but if Kit hangs (no GPU, license issues, etc.) you'd want the step to fail fast rather than burn the full job timeout.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant