revert test by xyao-nv · Pull Request #559 · isaac-sim/IsaacLab-Arena

xyao-nv · 2026-04-08T20:30:20Z

Summary

Short description of the change (max 50 chars)

Detailed description

What was the reason for the change?
What has been changed?
What is the impact of this change?

kellyguo11

Review: PR #559 — revert test

Summary: This PR re-enables 5 eval runner tests that were previously skipped with @pytest.mark.skip(reason="Skipping because of CI stalling") by replacing those skip markers with @pytest.mark.with_subprocess.

Context understood: The base branch xyao/exp/ci_stall added infrastructure to separate subprocess-spawning tests from persistent SimulationApp tests (via the with_subprocess marker and CI workflow changes). This PR is the natural follow-up that actually reverts the skips and uses the new marker.

✅ What looks good

The change is straightforward: 5 mechanical replacements of skip → with_subprocess.
The with_subprocess marker is properly registered in pytest.ini and the CI workflow already has a dedicated step that runs -m with_subprocess tests with ISAACLAB_ARENA_SUBPROCESS_TIMEOUT=900.
The marker semantics are correct — these tests all use run_eval_runner_and_check_no_failures(), which spawns eval_runner.py via subprocess.run().

⚠️ Potential concern: missing timeout / process group isolation in `run_eval_runner_and_check_no_failures`

The utility run_subprocess() in isaaclab_arena/tests/utils/subprocess.py was carefully written with:

start_new_session=True to isolate GPU child processes
timeout=_SUBPROCESS_TIMEOUT_SEC to prevent hangs

However, run_eval_runner_and_check_no_failures() in this test file uses raw subprocess.run(args, capture_output=True, text=True, check=True) — no timeout, no start_new_session=True. If the original CI stalling was caused by subprocess hangs or orphaned GPU processes, these tests could still stall even with the marker separation, since the subprocess itself has no timeout.

Suggestion: Consider either:

Refactoring run_eval_runner_and_check_no_failures() to use run_subprocess() from the utils module, or
At minimum, adding timeout=_SUBPROCESS_TIMEOUT_SEC and start_new_session=True to the subprocess.run() call.

This isn't a blocker if the CI workflow-level timeout-minutes: 60 is sufficient, but it would be more robust to have per-subprocess timeouts too.
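The second suggestion above (a per-subprocess timeout plus session isolation) can be sketched roughly as follows. This is a minimal illustration, not the repository's run_subprocess() implementation: the helper name run_isolated and its timeout_sec parameter are hypothetical, and the sketch is POSIX-only because it relies on os.killpg. It uses Popen rather than subprocess.run() because subprocess.run(timeout=...) kills only the direct child on timeout; killing the whole process group also reaps any grandchildren the child may have spawned.

```python
import os
import signal
import subprocess
import sys


def run_isolated(args, timeout_sec):
    """Hypothetical helper: run `args` in its own session with a hard timeout.

    On timeout, the whole process group is killed so no orphaned children remain.
    Raises subprocess.TimeoutExpired or subprocess.CalledProcessError on failure.
    """
    proc = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        # The child becomes the leader of a new session and process group,
        # so its process group id equals its pid.
        start_new_session=True,
    )
    try:
        stdout, stderr = proc.communicate(timeout=timeout_sec)
    except subprocess.TimeoutExpired:
        # Kill the child and everything it spawned, then reap it.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.communicate()
        raise
    if proc.returncode != 0:
        # Mirror the check=True behaviour of subprocess.run().
        raise subprocess.CalledProcessError(proc.returncode, args, stdout, stderr)
    return subprocess.CompletedProcess(args, proc.returncode, stdout, stderr)


if __name__ == "__main__":
    result = run_isolated([sys.executable, "-c", "print('ok')"], timeout_sec=30)
    print(result.stdout.strip())
```

With a helper like this, a hung eval runner would fail its own test after timeout_sec seconds instead of holding the CI job until the workflow-level limit is reached.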

PR description

The PR description still has the default template text. A brief note about reverting the skips now that the CI stalling fix is in place would help future readers.

Overall: The change itself is correct and minimal. Approving since the marker infrastructure is in place and CI will validate.

xyao-nv added 4 commits April 8, 2026 10:17

timing stats

8d8e198

subprocess group

f5e4d20

lint

4442545

revert test

998b509

xyao-nv force-pushed the xyao/exp/ci_stall branch 2 times, most recently from bc9bb9d to e80dca2 on April 8, 2026 23:12

xyao-nv added 3 commits April 9, 2026 16:19

Merge branch 'xyao/exp/ci_stall' into xyao/exp/revert_eval_runner_tests

f10c77f

Update subprocess.py

12c4d2b

Update test_eval_runner.py

b59c562

kellyguo11 approved these changes Apr 9, 2026

xyao-nv wants to merge 7 commits into xyao/exp/ci_stall from xyao/exp/revert_eval_runner_tests
