Seun/nist gear insertion task example#566

Closed
seun-doherty wants to merge 184 commits into release/0.1.1 from
seun/nist_gear_insertion_task_example

Conversation

@seun-doherty
Collaborator

Summary

  • Add a complete NIST gear insertion RL workflow: a Franka Panda robot inserts a medium gear onto a peg on the NIST assembly board using operational-space control and RL Games PPO
  • Include step-by-step documentation (environment setup, policy training, evaluation) mirroring the existing Franka lift task pages
  • Add RL Games policy wrapper, training script, and YAML config for LSTM-based asymmetric actor-critic

What's Included

Core task (isaaclab_arena/tasks/): task definition, keypoint-squashing rewards, 24-D policy observations with wrist-force feedback, insertion success / gear-drop terminations, domain randomization events

Environment (isaaclab_arena_environments/): OSC action term (asset-relative, EMA smoothing, dead-zones), Franka mimic OSC robot config, environment definition wiring scene + embodiment + task + RL Games

RL infrastructure (isaaclab_arena/policy/, scripts/): RL Games action policy wrapper, training script, PPO hyperparameter config

Documentation (docs/pages/example_workflows/nist_gear_insertion/): 4-page workflow (overview, environment setup, training, evaluation) with GIFs

Asset registry (isaaclab_arena/assets/object_library.py): NIST board, gear, peg, and connector asset definitions

NOTE: Object library paths for added assets will be updated when assets are uploaded to nucleus

xyao-nv and others added 30 commits December 3, 2025 13:33
## Summary
Closed-loop GN1.5 showed a low success rate (SR) in multi-episode rollouts,
especially in parallel-env runs, where more contacts are introduced and more
episodes are observed.

## Detailed description
### Static manip
- Issue: At the beginning of each episode, the hands perform close-open
motions in the recorded trajectories. Because the microwave joint is not
stiff enough, small deviations during the first few inferences close the door
by mistake. The closed door is then hard for the static GR1 to pull open,
causing the task to fail.
[Screencast from 12-02-2025 03:16:36
PM.webm](https://github.com/user-attachments/assets/da06de60-8f01-47e7-ae26-a48e08cb523f)

- Fix:
a. Shorten `task_episode_length_s` to trigger more frequent resets once
the door is closed. The trade-off is introducing more episodes.
b. Also tried a shorter `action_horizon`, but it gave a worse SR. My
hypothesis is that it's hard for the VLA to tell from visuals/states
whether the door is closed to 0.2 (success) vs 0.21 (fail).

> 16 -- Metrics: {'success_rate': 0.605, 'door_moved_rate': 0.955, 'num_episodes': 200}
> 8 -- Metrics: {'success_rate': 0.225, 'door_moved_rate': 0.615, 'num_episodes': 200}
> 1 -- Metrics: {'success_rate': 0.0, 'door_moved_rate': 0.985, 'num_episodes': 200}

c. Switching to CPU PhysX does not solve the above issues, so keep it on GPU
for faster parallelization (in theory).
 
### Loco manip
- Issue: After each reset, the left arm tends to make fast motions and
the box ends up tilted. Significant interpenetration among the fingers was
also observed. Compare 00:15 vs. 00:30 in the 5-parallel-env closed-loop
video below.
[Screencast from 11-25-2025 03:42:36
PM.webm](https://github.com/user-attachments/assets/c4934817-65fa-412f-a88c-af143d25d7c2)

- Fix: switch to CPU PhysX and keep the policy on GPU.
The arms open first, G1 starts moving, and the box is placed with the
expected pose.
[Screencast from 12-02-2025 10:15:59
PM.webm](https://github.com/user-attachments/assets/4a02e6cd-7baf-441b-8c0f-7146051e5c9a)

### Minor fixes
Update the docs on commands & metrics.
## Summary
expose env spacing parameter
## Summary
Modify docs to show that this is manual annotation
## Summary
Add ground plane and light objects
## Summary
Users might want to modify env cfg components such as sim config. This
lets them do it.
## Summary
Move our CI infra to public runners

## Detailed description
- As part of our open-source release, we can no longer run on internal
infra.
- This MR moves our runners to public runners.
- I also took the chance to refactor and modularize the workflow file.
## Summary
Update link to the docs in README.md to the new public location.

## Detailed description
- Docs url has changed now that the repo is public.
## Summary
Fixes an issue that our tags requesting for public CI were incorrect.

## Detailed description
- Corrects the tag `[gpu]` -> `[self-hosted, gpu]`
## Summary
Revert to mapping the whole repo in the dev docker.

## Detailed description
- Previously we changed to mapping only specific folders in the repo.
- This was done for docker build speed (I believe?)
- The issue is that we want (even if occasionally) to work on all
folders in the repo, within the dev docker.
- This reverts to mapping the whole repo.
## Summary
Re-enables pre-commit in CI.

## Detailed description
- During the refactoring and switch to public CI, `pre-commit` was
broken.
- This MR fixes it.
## Summary
Language prompts are fetched from Task's data member, populated from
ArenaEnv creation.

## Detailed description
- `task_description` is automatically populated into the atomic task, and
users have the freedom to overwrite it when instantiating the task class
- `Policy` sets its `task_description` attribute through a setter function
- `policy_runner.py` connects the `task_description` from Task to the
`task_description` setter in `Policy`
- GR00T consumes the description either through the `task_description` data
member or `policy_config`
## Summary
All example `environments.py` files are repackaged into
`isaaclab_arena_environments`

## Reason
In prep for multi-task evaluation, as we may introduce more example
envs.
## Summary
Move to multi-versioned docs now that we have multiple versions of Isaac
Lab Arena.

## Detailed description
- This means users can read the version of the docs that shipped with the
release they're using.

<img width="1130" height="625" alt="version_sidebar"
src="https://github.com/user-attachments/assets/b06372cd-9bed-4a1d-99b8-9480c279ebb4"
/>
## Summary
The tear-down-simulation-app function could be useful in both core & tests.
This prepares it to be consumed.

## Detailed description
- Add USD `get_new_stage()` to the jupyter notebook example, resolving an
issue where USDs from a previous run are not removed
- Add an optional `suppress_exception` flag to let the exception be raised
by default, or ignored in tests
- Move this function to `isaaclab_utils`
## Summary
Fix git ownership issues in deployment pipeline.

## Detailed description
- Multi-version docs now require git during documentation build.
- This revealed git ownership issues in the page deployment pipeline
(previously seen in our pre-merge pipeline)
- This MR applies the same fix that's used in pre-merge.
## Summary
Update object library to use ISAAC_NUCLEUS_DIR prefix for YCB object usd
paths
## Summary
Refactor `mimic_env_cfg` building logic in `arena_env_builder`.


## Detailed description
- What was the reason for the change?
    - Originally we needed to maintain a list of embodiment names in each
task's MimicEnvCfg, depending on whether it is single_arm or dual_arm. That
is neither efficient nor scalable.
- What has been changed?
    - Creates a new enum class `MimicArmMode` to represent the arm mode for the mimic environment; one of
["single_arm", "dual_arm", "left", "right"]
    - Assigns a `mimic_arm_mode` property to the embodiment base
    - Changes the `task.get_mimic_env_cfg()` method's input from `embodiment_name`
to `mimic_arm_mode`
    - Refactors the SubTaskConfigs configuration logic in each MimicEnvCfg
based on `embodiment.mimic_arm_mode`
- What is the impact of this change?
    - All existing embodiments and tasks with a MimicEnvCfg.
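A minimal sketch of what the `MimicArmMode` enum described above could look like; the four values mirror the list in the description, while the class body itself is illustrative:

```python
from enum import Enum

class MimicArmMode(Enum):
    # Arm mode for the mimic environment, per the four options listed above.
    SINGLE_ARM = "single_arm"
    DUAL_ARM = "dual_arm"
    LEFT = "left"
    RIGHT = "right"
```

A task's `get_mimic_env_cfg(mimic_arm_mode)` could then branch on the enum instead of string-matching embodiment names.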
## Summary
`RigidObjectSet` inherits from `Object` to let users provide a list of
assets; the sim app spawns each `env_id` with one object from this set.

## Detailed description
- Introduced a `RigidObjectSet(Object)` class to handle rigid-body object
set construction
- The order in which objects from the set are loaded into each `env_id` can
follow the function-argument order or be random.
- Introduced `--object_set` in the `kitchen_pick_and_place.py` CLI to allow
spawning per `env_id`
- Added tests for empty/single/multi object sets & a check that each
`env_id`'s USD is referenced in the expected sequence

## TODO
- Pipe-clean & verify other task-centered object metrics/attribute access
(done in tests)
- Introduce this concept in other sample envs & multi-task eval

## Note
- The name `set` (instead of `collection`) differentiates this from what
[`RigidBodyCollection`](https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab/isaaclab/assets/rigid_object_collection/rigid_object_collection_data.py)
from IsaacLab provides. In our use case, we need to spawn 1 object out of N,
whereas the `RigidBodyCollection` API spawns all N objects for each id.
- `MultiAssetSpawnerCfg` for articulated objects will be tricky (/buggy), as
PhysX APIs require the same joint prim path. That puts too many constraints
on what can be added to the set
-
[MultiAssetSpawnerCfg](https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab/isaaclab/sim/spawners/wrappers/wrappers_cfg.py#L16)
for rigid objects requires the same type of collision meshes, as written in
Lab's docs.

<img width="720" height="295" alt="image"
src="https://github.com/user-attachments/assets/71983e83-d586-427b-a1dd-3eb047be817f"
/>
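To illustrate the 1-of-N spawning semantics above, here is a hypothetical sketch of the env_id-to-asset mapping (`assign_assets` is an invented helper, not the PR's actual API): each `env_id` receives exactly one USD from the set, either cycling in the configured order or at random.

```python
import random

def assign_assets(usd_paths: list[str], num_envs: int,
                  randomize: bool = False, seed: int = 0) -> list[str]:
    """Pick one USD from the set for each env_id (1-of-N, not all-N)."""
    if not usd_paths:
        raise ValueError("object set must not be empty")
    if randomize:
        rng = random.Random(seed)  # local RNG: don't pollute global state
        return [rng.choice(usd_paths) for _ in range(num_envs)]
    # Deterministic: cycle through the set in the given order.
    return [usd_paths[i % len(usd_paths)] for i in range(num_envs)]
```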
## Summary
This PR adds initial support for composite sequential tasks via the
SequentialTaskBase class. The SequentialTaskBase class takes a list of
atomic tasks (TaskBase) and automatically assembles them into a
composite task with unified termination/event configs.

Adds:

1. SequentialTaskBase class
2. Test case to validate class methods
3. Test case with example task (sequential open door task) to validate
unified success check and events
4. Two new functions in `isaac_arena/utils/configclass.py` to perform
config transformation and duplicate checking
## Summary
Fixes a typo which caused a misnaming of EEFs in the Mimic Env Configs
of tasks. The name of the eefs was being set as an Enum instead of the
value of the Enum. This caused data generation to fail using our
existing datasets.
## Summary
Implement `OpenDoorTask` and `CloseDoorTask` inherited from
`RotateRevoluteJointTask`

## Detailed description
- Generalize the task to common articulated objects with revolute joint.
E.g. Cabinet door; Window panel (rotate outward or inward relative to
the fixed frame using a hinge); Scissor blade at the pivot pin.
- The Open and Close Door tasks differ in terminations & reset events, but
share the same underlying logic for handling the revolute joint.
- Add an `is_closed` member function to the `Openable` affordance, using a
threshold to decide between open and closed. Basically, use it as a bi-state
object.

## TODO
- Test with articulated objects other than the overly-used microwave
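The thresholded bi-state check described above might look like the following sketch (the threshold default and function shape are assumptions, not the repository's code):

```python
def is_closed(joint_pos: float, closed_threshold: float = 0.1) -> bool:
    # Treat the revolute joint as bi-state: below the threshold (radians),
    # the door counts as closed; at or above it, open.
    return joint_pos < closed_threshold
```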
## Summary
Simplify device registry and add a retargeter registry
## Summary
Add warning to docs
## Summary
Adds a new embodiment of Agibot
Renames MimicArmMode to ArmMode, since arm_mode property is not for mimic_env_cfg building purpose only

## Detailed description
- What was the reason for the change?
     - Agibot is a widely applied mobile-base bimanual humanoid in China.
     - Renames MimicArmMode to ArmMode, since in AgibotCfg, 'arm_mode' property is not for mimic_env_cfg building purpose only, but for SceneCfg/ActionsCfg definition as well

- What has been changed?
    - Adds a new agibot.py
    - Renames MimicArmMode to ArmMode
## Summary
Add instruction to build multi docs
## Summary
Basic RL training workflow

---------

Co-authored-by: Alex Millane <amillane@nvidia.com>
Co-authored-by: peterd-NV <peterd@nvidia.com>
Co-authored-by: Xinjie Yao <xyao@nvidia.com>
## Summary
We allow adding `spawn_cfg_addon` and `asset_cfg_addon`. This exposes the possibility for a user to create any supported asset with various settings.
…#293)

## Summary
Adds a new affordance: **placeable**, and a new atomic task: **place
upright task**


## Detailed description
- What was the reason for the change?
    - The place upright task is a common in-house task for humanoids. We believe
      '**placeable**' should be a basic affordance to support, so we add the
      related atomic task and an example to showcase how to use this affordance.
- What has been changed?
    - adds a new atomic task: place upright task (and a test script for this task)
    - adds a new affordance: placeable (a placeable object can be placed upright, like a mug, bottle, etc.)
    - adds a new example: agibot left_arm place upright mug
    - adds a unit test for 'place_upright_task'
- What is the impact of this change?
    - fixes a bug in agibot.py, since SceneCfg and ActionsCfg cannot accept a new type of property like ArmMode in the manager-based workflow


## Test Pipeline
1. Zero Action Policy:
`python isaaclab_arena/examples/policy_runner.py --policy_type
zero_action tabletop_place_upright --object mug`

2. Record demos:
`python isaaclab_arena/scripts/imitation_learning/record_demos.py
--dataset_file datasets/dataset_agibot_left_arm_rel.hdf5 --num_demos 1
tabletop_place_upright --teleop_device keyboard`

3. Replay demos:
`python isaaclab_arena/scripts/imitation_learning/replay_demos.py
--dataset_file datasets/dataset_agibot_left_arm_rel.hdf5
tabletop_place_upright`

4. Test place_upright_mug task:
`pytest isaaclab_arena/tests/test_place_upright_mug.py`
alexmillane and others added 21 commits April 2, 2026 18:41
## Summary
Move dependencies from Dockerfile to setup.py

## Detailed description
- This simplifies the installation process for external users because
now `pip install ./isaaclab_arena` will install Arena's dependencies.
- Prior to this change, Arena made assumptions about the environment it
was being installed in. These assumptions were met through our
Dockerfile.
- For an external user, this makes the installation process much
easier.
## Summary
Upgrade IsaacLab interop for Isaac Lab 3.0

## Detailed description
- In IsaacLab 3.0, kit is optional, whereas in Isaac Lab Arena it's
compulsory.
- We therefore add a step to our interop callback to detect whether kit has
been started and, if not, start it.
- Update docs with new Lab 3.0 visualizer arguments
- Reactivate the RSL-RL test.
## Summary
The [MR](#531) introduced a torch conflict in the docker; this MR resolves it.

## Detailed description
- Deepspeed pulls in torch 2.11+cu13 as a dependency.
- After the fix, deepspeed's transitive deps (hjson, msgpack, psutil,
py-cpuinfo, ninja, etc.) are all resolved automatically by pip, and the
duplicate torch is cleaned up right after.
…#530)

## Summary
Prevent
[RslRlVecEnvWrapper](https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_rl/isaaclab_rl/rsl_rl/vecenv_wrapper.py#L66)
from calling a duplicate env.reset() during inference, which inflated
`num_episodes` by one and miscomputed success_rate.

## Detailed description
- What was wrong: `RslRlVecEnvWrapper.__init__` unconditionally calls
`env.reset()` (intended for training, where the runner never resets).
During eval, the rollout loop in `policy_runner.py` already calls
`env.reset()` before the first step. When the RSL-RL policy is lazily
loaded on the first get_action call, the wrapper's second reset records
a phantom failed episode via the `RecorderManager`, producing
`num_episodes = (N+1) * num_envs` and diluting `success_rate`.

- What was changed: Introduced `_RslRlInferenceEnvWrapper`, a subclass
of `RslRlVecEnvWrapper` that temporarily replaces `env.reset` with a no-op
during `__init__`, then restores it. `RslRlActionPolicy._load_policy`
now uses this wrapper instead of `RslRlVecEnvWrapper` directly. This avoids
modifying Lab core code.

- Add a TODO to add the test case once the test_rsl_rl is enabled by
@alexmillane.
## Summary
Update the lift object RL example to have a high success rate model

## Detailed description
- What was the reason for the change?
Existing lift object RL training model success rate is low (~30%), and
the arm motion is unnatural.
- What has been changed?
Add a franka joint control embodiment for RL training to avoid the weird
arm motion from IK version
Update the observation to include joint and target poses only
Fix a bug in the base RSL-RL policy so the target pose (task_obs, in
addition to policy obs) is passed to the actor/critic model
Fix a bug causing a ~0 success rate in parallel eval due to an incorrect
object/target frame in the success term
Update RL docs with latest models and commands
- What is the impact of this change?
The RL model now reaches a 70-80% success rate within 1.5 h

---------

Co-authored-by: Xinjie Yao <xyao@nvidia.com>
## Summary

- Add Fabio Ramos and Xuning Yang ([Robolab
authors](https://gitlab-master.nvidia.com/xuningy/robolab)) to the
Contributors section
- Fix alphabetical ordering of Hui Kang

Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Co-authored-by: Xinjie Yao <xyao@nvidia.com>
## Summary
Fixes some issues that surfaced after upgrading to IsaacLab3:

- Add `--visualizer kit` to all `policy_runner.py` and `eval_runner.py`
commands in the docs

- Fix DROID robot spawning upside-down due to stale WXYZ quaternion
convention

- Fix the viewer camera position not being applied when using the `kit`
visualizer backend: the camera was stuck far away from the robot, any
`ViewerCfg` set in the environment had no effect, and the view could not
be overridden

---------

Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Co-authored-by: Xinjie Yao <xyao@nvidia.com>
## Summary
Port the IsaacLab dexsuite_kuka_allegro_env_cfg.py example with Newton
to arena for evaluation

```bash
python isaaclab_arena/evaluation/policy_runner.py --policy_type rsl_rl \
  --checkpoint_path /models/isaaclab_arena/dexsuite_kuka_allegro/model_14999.pt \
  --visualizer newton --num_steps 1000 dexsuite_lift
```

![dexsuite_newton](https://github.com/user-attachments/assets/83abf2dd-1d2f-4de7-bf4f-9575ebadd09d)

---------

Co-authored-by: Xinjie Yao <xyao@nvidia.com>
## Summary

Previously, non-anchor objects were initialized anywhere in a large
fixed box. Objects with `On(table)` constraints would start outside the
table, get pushed to the edge by the solver, and cluster at the
boundary. With this change they start distributed across the whole
surface.

- Replace fixed initialization box with per-object initialization guided
by `On` relations
- Objects with `On(parent)` now start uniformly within the parent's X/Y
footprint at the correct Z height, giving the solver a valid warm start
and producing more even surface coverage
- Remove `init_bounds` and `init_bounds_size` from `ObjectPlacerParams`
- Use explicit `torch.Generator` for seeding instead of global
`torch.manual_seed`, so placement seed does not pollute Isaac Sim's
global RNG state

---------

Signed-off-by: Clemens Volk <cvolk@nvidia.com>
## Summary
Add a "Publishing Your Own Benchmark" section to the README Ecosystem
page with a three-step workflow (own repo, cite in papers, PR a link
here) using RoboTwin as the reference example.

Following @sangeetas-nv's suggestion in slack
[thread](https://nvidia.slack.com/archives/C097CP8FG67/p1775070050814859?thread_ts=1775068933.390519&cid=C097CP8FG67).
## Summary

Fixes documentation on how to run the zero_action policy.

- Argument order: `--num_steps` & `--distributed --headless`
- Missing required arguments: `--object`
- Updated the path to be relative to the root of the repo, as we do in other
examples (e.g. `isaaclab_arena/evaluation/policy_runner.py`)

---------

Co-authored-by: Xinjie Yao <xyao@nvidia.com>
…518)

## Summary

- Adds a new **Getting Started** page (`arena_in_your_repo.rst`)
documenting the recommended pattern for consuming Arena as an unmodified
git submodule from an external project, inspired by and based on the
integration pattern used in
[nvblox_next](https://github.com/nvidia-isaac/nvblox_next/tree/e3d4fec646004956ac24ed3446dbb41c531d5908/datagen)

## Notes
- **Not yet tested** end-to-end — the examples reflect the observed
nvblox_next patterns but have not been validated in a fresh external
repo

---------

Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: alex <amillane@nvidia.com>
…550)

## Summary
Adds documentation of more advanced external usage of arena - custom
tasks and embodiments.

## Detailed description
- Adds a new example external environment that introduces a custom task
and a custom environment.
- Documents this
- Adds a test that covers this new env.

Copy of #545

---------

Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Co-authored-by: Clemens Volk <cvolk@nvidia.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
#555)

## Summary
Remove outdated `--enable_pinnoccio` from our generation commands.

## Detailed description
- Addresses [6059248](https://nvbugspro.nvidia.com/bug/6059248)

Co-authored-by: Xinjie Yao <xyao@nvidia.com>
## Summary
The `--visualizer` arg is deprecated in Lab 3.0 in favour of `--viz`; this
PR replaces all occurrences of it.

Also fixes docs that instruct users to remove the `--headless` flag to see
the output but didn't specify that `--viz kit` needs to be added.
## Summary
Fix https://nvbugspro.nvidia.com/bug/6058876,
https://nvbugspro.nvidia.com/bug/6060494 on non-optimal camera viewing
angles

## Detailed description
- Root cause: `_reapply_viewer_cfg(env)` only runs inside
`make_registered()` / `make_registered_and_return_cfg()`. Any script
that calls `build_registered()` + `gym.make()` directly bypasses it, so
the viewer camera stays at Kit's default position instead of the
configured eye/lookat.
- Fix: In `record/replay/generate/annotate_dataset.py`, call
`reapply_viewer_cfg(env)` on the wrapped env (before `.unwrapped`) right
after `gym.make()`


<img width="864" height="529" alt="image"
src="https://github.com/user-attachments/assets/667a24df-e68f-4c1d-9668-adcd7746c443"
/>
from replay

Co-authored-by: peterd-NV <peterd@nvidia.com>
…r Task (#560)

## Summary
Fixes HF dataset version (v2.3 -> v3.0) in download command for step 1
of GR1 Open Microwave Door Task
## Summary
Switches `.. tabs::` for `.. tab-set::` in the docs to fix readability
issues in dark mode.

## Detailed description
- For some reason `tab-set` handles dark mode better.
- Addresses: [5727965](https://nvbugspro.nvidia.com/bug/5727965)

Before:
<img width="1350" height="1630" alt="image"
src="https://github.com/user-attachments/assets/1d25a745-2ef2-4105-8405-96e01e8b60c8"
/>

After:
<img width="830" height="799" alt="image"
src="https://github.com/user-attachments/assets/4dc6652d-b2cf-4b50-9fea-490b75b4498b"
/>
## Summary

The G1 whole-body controller currently only supports the Homie V2
lower-body
policy. This MR adds the WBC-AGILE end-to-end velocity policy as an
alternative
lower-body controller for the G1 robot, and wires it into the
environment so it
can be used via a new `g1_wbc_agile_joint` embodiment.

## Changes

### AGILE policy implementation
- Add `G1AgilePolicy` ONNX-based end-to-end velocity policy
(`g1_agile_policy.py`)
- Add `AgileConfig` dataclass and `g1_agile.yaml` joint ordering / model
I/O config
- Register `"agile"` variant in `wbc_policy_factory.py`

### Model download
- Add `docker/setup/download_wbc_models.sh` to download and verify the
AGILE ONNX
model at Docker build time (SHA256 checked), removing the need for
runtime download

### Environment integration
- Add `AgileConfig` branch in `G1DecoupledWBCJointAction` so
`wbc_version="agile"` is
  accepted by the action term
- Register `G1WBCAgileJointEmbodiment` (`g1_wbc_agile_joint`) — mirrors
  `g1_wbc_joint` but uses the AGILE lower-body policy

### Tests
- Add unit tests and stability tests for `G1AgilePolicy` in
  `test_g1_agile_policy.py`

## Results

Run the AGILE policy with the G1 robot:

```bash
/isaac-sim/python.sh isaaclab_arena/evaluation/policy_runner.py \
  --policy_type zero_action \
  --num_envs 1 \
  --num_steps 1000 \
  --enable_cameras \
  --visualizer kit \
  galileo_g1_locomanip_pick_and_place \
  --embodiment g1_wbc_agile_joint
```

## Test plan
- [x] `pytest
isaaclab_arena_g1/g1_whole_body_controller/wbc_policy/tests/test_g1_agile_policy.py`
— unit and stability tests pass
- [x] Run `policy_runner.py` with `--embodiment g1_wbc_agile_joint` —
environment loads and steps without errors
- [x] Run `policy_runner.py` with `--embodiment g1_wbc_joint` — existing
Homie V2 path is unaffected

---------

Signed-off-by: Lionel Gulich <lgulich@nvidia.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Summary
- Deletes `isaaclab_arena/reinforcement_learning/`: A module that
existed solely for an `RLFramework` enum whose only method produced
strings like `"rsl_rl_cfg_entry_point"`
- Replaces `rl_framework: RLFramework` with `rl_framework_entry_point:
str` on `IsaacLabArenaEnvironment` and lets the user pass in the correct
string directly

Signed-off-by: Clemens Volk <cvolk@nvidia.com>
@seun-doherty seun-doherty requested a review from alexmillane April 9, 2026 23:24
NIST peg-insert gear assembly environment: custom OSC action term,
keypoint squashing rewards, IK gripper reset, and domain randomization.
Includes RL-Games PPO config, train/play scripts, and policy runner wrapper.
Isaac Lab 3.0 changed from wxyz to xyzw quaternion ordering. This
caused the robot to spawn upside down, leading to IK solver failure,
NaN observations, and training crashes.

Key fixes:
- Robot init rotation: (1,0,0,0) -> (0,0,0,1) in robot_configs.py
- Grasp rotation offset: wxyz -> xyzw in environment config
- Quaternion canonicalization: check w at index 3 (not 0) everywhere
- Replace torch_utils (wxyz) with math_utils (xyzw) in OSC action
- Wrap all warp arrays with wp.to_torch() for PyTorch compatibility
- Update deprecated IsaacLab API calls to _index variants
- Increase gpu_collision_stack_size to 4 GB for contact-heavy scenes
- Consolidate 3 observation files into single gear_insertion_observations.py
- Replace 4 custom obs functions with Isaac Lab built-ins (root_pos_w, root_quat_w)
- Slim NistGearInsertionTask constructor (40 → 17 params) via GraspConfig dataclass
- Hardcode reward weights in configclass instead of passing through constructor
- Delete bespoke play_rl_games.py; use generic policy_runner.py for evaluation
- Genericise RlGamesActionPolicy (remove NIST-specific defaults)
- Move RL-Games YAML config to isaaclab_arena_examples/policy/
- Clean up mdp/__init__.py re-exports
- Add merge readiness report and NIST vs Lift comparison doc
- Keep success term registered (returns all-False during training) so
  SuccessRateMetric can query it, matching Lift task pattern
- Loosen success_z_fraction from 0.05 to 0.15 (3mm depth threshold)
- Add new NIST asset definitions to object_library
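The wxyz -> xyzw changes above can be illustrated with a small sketch (the function names here are mine; Isaac Lab ships its own math utilities for this): pre-3.0 the scalar part came first, so identity was (1,0,0,0); in 3.0 it comes last, so identity is (0,0,0,1) and canonicalization must check index 3.

```python
import torch

def wxyz_to_xyzw(q: torch.Tensor) -> torch.Tensor:
    # Move the scalar component from index 0 to index 3.
    return q[..., [1, 2, 3, 0]]

def canonicalize_xyzw(q: torch.Tensor) -> torch.Tensor:
    # q and -q encode the same rotation; keep w (index 3, not 0) non-negative.
    sign = torch.where(q[..., 3:4] < 0.0, -1.0, 1.0)
    return q * sign
```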
Add step-by-step documentation for the NIST gear insertion RL workflow
(environment setup, policy training, evaluation) mirroring the existing
Franka lift task pages. Include task overview GIFs, register the new
workflow in the RL workflows index, and clamp NaN/inf values in
force-torque observations to prevent training instabilities.
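The force-torque clamping mentioned above could be a one-liner with `torch.nan_to_num`; the limit of 50 N is an illustrative assumption, not the value used in the PR:

```python
import torch

def clean_force_torque(ft: torch.Tensor, limit: float = 50.0) -> torch.Tensor:
    # NaN -> 0, +/-inf -> +/-limit, then bound contact spikes to [-limit, limit].
    return torch.nan_to_num(ft, nan=0.0, posinf=limit, neginf=-limit).clamp(-limit, limit)
```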
@seun-doherty force-pushed the seun/nist_gear_insertion_task_example branch from 0bd15bd to 93e071e on April 9, 2026 23:27
@isaaclab-review-bot

🤖 IsaacLab Review Bot — PR #566

Note: This PR was closed and reopened as #567. Posting review here for the record; the same findings apply to #567.

Overview

Large PR adding a complete NIST gear insertion RL workflow: task definition, keypoint-squashing rewards, 24-D policy observations with wrist-force feedback, OSC action term (asset-relative, EMA smoothing, dead-zones), Franka mimic OSC robot config, RL Games policy wrapper, training script, YAML config, documentation, and asset registry entries. 541 files changed — this includes many changes unrelated to the gear insertion feature (CI, docs restructuring, new embodiments, etc.), which makes isolated review of just the gear insertion feature difficult.


Findings

1. 🔴 Critical: _pred_scale is global state — breaks multi-env semantics

File: `isaaclab_arena/tasks/rewards/gear_insertion_rewards.py`, `success_prediction_error.__call__`

```python
if true_success.float().mean() >= self._delay_until_ratio:
    self._pred_scale = 1.0
```

_pred_scale is a scalar that applies to all environments but is flipped to 1.0 based on mean() across all envs. Once flipped, it never resets to 0.0 — not on episode reset, not on environment reset. This means:

  • Early episodes where delay_until_ratio hasn't been reached will correctly suppress the penalty
  • But once any batch of environments triggers it, all future episodes (even newly reset ones) will be penalized from step 0

Suggestion: Track per-env episode progress or make _pred_scale per-env and reset it in a reset() method.
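A sketch of the per-env suggestion (the class shape and signatures are illustrative, not the repository's actual term interface): keep one scale per environment and clear it on reset.

```python
import torch

class SuccessPredictionError:
    """Per-env variant: the penalty scale is a tensor, reset with episodes."""

    def __init__(self, num_envs: int):
        # One scale per environment, suppressed (0.0) at the start.
        self._pred_scale = torch.zeros(num_envs)

    def reset(self, env_ids: torch.Tensor) -> None:
        # Newly reset environments go back to the suppressed state.
        self._pred_scale[env_ids] = 0.0

    def __call__(self, true_success: torch.Tensor, pred_error: torch.Tensor) -> torch.Tensor:
        # Latch the scale to 1.0 per env once that env itself succeeds,
        # instead of flipping a scalar based on the batch-wide mean().
        self._pred_scale = torch.maximum(self._pred_scale, true_success.float())
        return self._pred_scale * pred_error
```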


2. 🟡 Warning: Repeated tensor allocation in hot paths

File: `isaaclab_arena/tasks/rewards/gear_insertion_rewards.py`, `_check_gear_position`

```python
held_off = (
    torch.tensor(held_gear_base_offset, device=env.device, ...).unsqueeze(0).expand(...)
)
offset = torch.tensor(peg_offset, device=env.device, ...).unsqueeze(0).expand(...)
```

_check_gear_position is called by gear_insertion_engagement_bonus, gear_insertion_success_bonus, and indirectly by success_prediction_error — all every step. Each call creates new tensors from Python lists. For thousands of envs at 60+ Hz this creates GC pressure.

Suggestion: Pre-compute these tensors once (e.g., in __init__ of the calling classes) and reuse them, similar to how gear_peg_keypoint_squashing already caches self.peg_offset and self.held_gear_base_offset.
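The caching suggestion could be sketched like this (the names follow the review text; the class shape and tolerance are illustrative): allocate the constant offsets once and reuse views of them every step.

```python
import torch

class GearPositionChecker:
    def __init__(self, held_gear_base_offset, peg_offset, num_envs, device="cpu"):
        # Allocated once; `.expand()` creates views, not copies, so the
        # per-step cost is negligible compared to rebuilding from Python lists.
        self._held_off = torch.tensor(held_gear_base_offset, device=device).unsqueeze(0).expand(num_envs, -1)
        self._peg_off = torch.tensor(peg_offset, device=device).unsqueeze(0).expand(num_envs, -1)

    def check(self, gear_pos, peg_pos, tol=5e-3):
        # Hot path: pure tensor arithmetic, no fresh allocations from lists.
        return torch.linalg.norm((gear_pos + self._held_off) - (peg_pos + self._peg_off), dim=-1) < tol
```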


3. 🟡 Warning: Observation class directly accesses env.action_manager._terms["arm_action"] private API

File: `isaaclab_arena/tasks/observations/gear_insertion_observations.py`, `NistGearInsertionPolicyObservations.__call__`

```python
arm_osc_action = env.action_manager._terms["arm_action"]
```

This accesses the private _terms dict of the action manager. If the action term name changes or the API evolves, this will silently break. The same pattern appears in multiple reward classes in gear_insertion_rewards.py.

Suggestion: Consider adding a public accessor on the action manager or passing the action term via config params rather than reaching into internals. At minimum, add a clear error message if the key isn't found.
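One way to harden the private `_terms` lookup the review flags is a small helper that fails loudly with the available term names; `get_action_term` is a hypothetical helper, not a real Isaac Lab API:

```python
def get_action_term(action_manager, name: str):
    # Reach into the (private) _terms dict, but fail with a clear message
    # listing what is available instead of a bare KeyError.
    terms = getattr(action_manager, "_terms", {})
    if name not in terms:
        raise KeyError(
            f"Action term '{name}' not found; available terms: {sorted(terms)}. "
            "Was the action term renamed in the environment config?"
        )
    return terms[name]
```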


4. 🟡 Warning: Quaternion canonicalization zeros out z/w components

File: `isaaclab_arena/tasks/observations/gear_insertion_observations.py`

```python
noisy_quat[:, [2, 3]] = 0.0
noisy_quat = noisy_quat * self._flip_quats.unsqueeze(-1)
```

Zeroing the z and w components of a quaternion after rotation noise application, then multiplying by a random sign, produces a non-unit quaternion that represents a projection rather than a valid rotation. If this is intentional (e.g., only tracking x,y for a 2-DOF orientation), it should be documented. Otherwise, this silently corrupts the rotation information fed to the policy.


5. 🟡 Warning: No tests for the new NIST gear insertion task

Despite this being a complete new task workflow, there are no test files specific to the gear insertion task (no test_nist_gear_insertion*.py). The PR adds many other tests (test_assembly_task, test_sorting_task, etc.) but the core new feature — the gear insertion task, its rewards, observations, and OSC action — has no test coverage.

Suggestion: Add at least:

  • Unit tests for reward functions (especially the squashing-fn keypoint rewards with known geometry)
  • Unit tests for _check_gear_position with controlled inputs
  • Integration test that the environment can be created and stepped

6. 💡 Nit: body_quat_canonical sign convention may produce discontinuities

File: isaaclab_arena/tasks/observations/gear_insertion_observations.py

```python
sign = torch.where(quat[:, 3:4] < 0, -1.0, 1.0)
return quat * sign
```

This canonicalizes by flipping so w >= 0. But for orientations near w ≈ 0, small noise flips the sign, creating observation discontinuities. This is a known issue with quaternion canonicalization and may hurt LSTM-based policies. Consider using the make_quat_unique utility from Isaac Lab which handles this more robustly.


7. 💡 Nit: gear_peg_height default unused in environment

The NistGearInsertionTask constructor defaults gear_peg_height=0.02, but the environment never passes this parameter — it uses the default. Meanwhile, success_z_fraction=0.20 is explicitly set. The actual success threshold is 0.02 * 0.20 = 0.004m (4mm Z tolerance). This should be documented since it's a critical tuning parameter that's easy to miss.


8. 💡 Nit: Duplicate copyright headers

File: isaaclab_arena/scripts/reinforcement_learning/train_rl_games.py

```python
# Copyright (c) 2026, The Isaac Lab Arena Project Developers ...
# Copyright (c) 2025-2026, The Isaac Lab Arena Project Developers.
```

Two copyright headers at the top — one with 2026 only and one with 2025-2026.


Summary

The gear insertion task is well-structured and follows established patterns (Factory-style keypoint rewards, OSC action space, assembly peg-insert benchmarks). The multi-file decomposition (task / observations / rewards / action / environment) is clean and consistent with the existing Arena architecture.

Main concerns: The _pred_scale global state bug (Finding #1) could cause training instabilities. The lack of tests for the core new feature (Finding #5) is a significant gap given the complexity of the reward/observation logic. The hot-path tensor allocations (Finding #2) will hurt training throughput at scale.

The PR's scope is very large (541 files) — consider breaking unrelated changes (CI, docs restructuring, new embodiments) into separate PRs for easier review.

