From d461c5dfb606f5a567d97146649980a7997271a6 Mon Sep 17 00:00:00 2001
From: Qian Lin
Date: Wed, 8 Apr 2026 23:26:34 -0700
Subject: [PATCH 1/4] Add docs for resuming training for the lift object task

---
 .../step_2_policy_training.rst | 27 ++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst b/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
index c0e68c051..70fa5e6d9 100644
--- a/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
+++ b/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
@@ -101,6 +101,31 @@ During training, each iteration prints a summary to the console:
 
        ETA: 00:00:49
 
+Resuming from a Checkpoint
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To resume training from a previously saved checkpoint, use the ``--resume`` flag
+together with ``--load_run`` (run folder name) and ``--checkpoint`` (model filename).
+Both arguments are optional; when omitted, the most recent run and latest checkpoint
+are used automatically.
+
+.. code-block:: bash
+
+   python submodules/IsaacLab/scripts/reinforcement_learning/rsl_rl/train.py \
+      --external_callback isaaclab_arena.environments.isaaclab_interop.environment_registration_callback \
+      --task lift_object \
+      --rl_training_mode \
+      --num_envs 4096 \
+      --max_iterations 4000 \
+      --resume \
+      --load_run <run_folder> \
+      --checkpoint model_1999.pt
+
+Replace ``<run_folder>`` with the run folder name under ``logs/rsl_rl/generic_experiment/``.
+If ``--load_run`` is omitted, the latest run is selected. If ``--checkpoint`` is omitted,
+the latest checkpoint in that run is loaded.
+
+
 Multi-GPU Training
 ^^^^^^^^^^^^^^^^^^
 
@@ -112,7 +137,7 @@ Add ``--distributed`` to spread environments across all available GPUs:
       --external_callback isaaclab_arena.environments.isaaclab_interop.environment_registration_callback \
       --task lift_object \
       --rl_training_mode \
-      --num_envs 4096\
+      --num_envs 4096 \
       --max_iterations 2000 \
       --distributed

From 62cbec36f0c64e521244c0e4e70852703e52a76d Mon Sep 17 00:00:00 2001
From: Qian Lin
Date: Wed, 8 Apr 2026 23:34:41 -0700
Subject: [PATCH 2/4] Add docs for resuming training to the dexsuite example

---
 .../dexsuite_lift/step_2_policy_training.rst | 23 +++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/docs/pages/example_workflows/dexsuite_lift/step_2_policy_training.rst b/docs/pages/example_workflows/dexsuite_lift/step_2_policy_training.rst
index e55073460..c81454d47 100644
--- a/docs/pages/example_workflows/dexsuite_lift/step_2_policy_training.rst
+++ b/docs/pages/example_workflows/dexsuite_lift/step_2_policy_training.rst
@@ -64,6 +64,29 @@ Hyperparameters can be overridden with Hydra-style CLI arguments:
 
       agent.max_iterations=20000 agent.save_interval=500 agent.algorithm.learning_rate=0.0005
 
+Resuming from a Checkpoint
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To resume training from a previously saved checkpoint, use the ``--resume`` flag
+together with ``--load_run`` (run folder name) and ``--checkpoint`` (model filename).
+Both arguments are optional; when omitted, the most recent run and latest checkpoint
+are used automatically.
+
+.. code-block:: bash
+
+   python submodules/IsaacLab/scripts/reinforcement_learning/rsl_rl/train.py \
+      --task Isaac-Dexsuite-Kuka-Allegro-Lift-v0 \
+      --num_envs 512 \
+      --resume \
+      --load_run <run_folder> \
+      --checkpoint model_5000.pt \
+      presets=newton presets=cube
+
+Replace ``<run_folder>`` with the run folder name under ``logs/rsl_rl/dexsuite_kuka_allegro/``.
+If ``--load_run`` is omitted, the latest run is selected. If ``--checkpoint`` is omitted,
+the latest checkpoint in that run is loaded.
+
+
 Monitoring Training
 ^^^^^^^^^^^^^^^^^^^
 

From a479985e88baebbe6baf9d76eb91ccbd7dc6872d Mon Sep 17 00:00:00 2001
From: Qian Lin
Date: Wed, 8 Apr 2026 23:40:04 -0700
Subject: [PATCH 3/4] reorder

---
 .../step_2_policy_training.rst | 65 ++++++++++++-------
 1 file changed, 41 insertions(+), 24 deletions(-)

diff --git a/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst b/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
index 70fa5e6d9..7608cda9a 100644
--- a/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
+++ b/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
@@ -61,6 +61,47 @@ For example, to train with relu activation and a higher learning rate:
 
       agent.algorithm.learning_rate=0.001
 
+Resuming from a Checkpoint
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To resume training from a previously saved checkpoint, use the ``--resume`` flag
+together with ``--load_run`` (run folder name) and ``--checkpoint`` (model filename).
+Both arguments are optional; when omitted, the most recent run and latest checkpoint
+are used automatically.
+
+.. code-block:: bash
+
+   python submodules/IsaacLab/scripts/reinforcement_learning/rsl_rl/train.py \
+      --external_callback isaaclab_arena.environments.isaaclab_interop.environment_registration_callback \
+      --task lift_object \
+      --rl_training_mode \
+      --num_envs 4096 \
+      --max_iterations 4000 \
+      --resume \
+      --load_run <run_folder> \
+      --checkpoint model_1999.pt
+
+Replace ``<run_folder>`` with the run folder name under ``logs/rsl_rl/generic_experiment/``.
+If ``--load_run`` is omitted, the latest run is selected. If ``--checkpoint`` is omitted,
+the latest checkpoint in that run is loaded.
+
+.. tip::
+
+   You can also combine ``--resume`` with Hydra overrides to change hyperparameters mid-training,
+   e.g. lowering the learning rate for fine-tuning:
+
+   .. code-block:: bash
+
+      python submodules/IsaacLab/scripts/reinforcement_learning/rsl_rl/train.py \
+         --external_callback isaaclab_arena.environments.isaaclab_interop.environment_registration_callback \
+         --task lift_object \
+         --rl_training_mode \
+         --num_envs 4096 \
+         --max_iterations 4000 \
+         --resume \
+         agent.algorithm.learning_rate=0.00005
+
+
 Monitoring Training
 ^^^^^^^^^^^^^^^^^^^
 
@@ -101,30 +142,6 @@ During training, each iteration prints a summary to the console:
 
        ETA: 00:00:49
 
-Resuming from a Checkpoint
-^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-To resume training from a previously saved checkpoint, use the ``--resume`` flag
-together with ``--load_run`` (run folder name) and ``--checkpoint`` (model filename).
-Both arguments are optional; when omitted, the most recent run and latest checkpoint
-are used automatically.
-
-.. code-block:: bash
-
-   python submodules/IsaacLab/scripts/reinforcement_learning/rsl_rl/train.py \
-      --external_callback isaaclab_arena.environments.isaaclab_interop.environment_registration_callback \
-      --task lift_object \
-      --rl_training_mode \
-      --num_envs 4096 \
-      --max_iterations 4000 \
-      --resume \
-      --load_run <run_folder> \
-      --checkpoint model_1999.pt
-
-Replace ``<run_folder>`` with the run folder name under ``logs/rsl_rl/generic_experiment/``.
-If ``--load_run`` is omitted, the latest run is selected. If ``--checkpoint`` is omitted,
-the latest checkpoint in that run is loaded.
-
 
 Multi-GPU Training
 ^^^^^^^^^^^^^^^^^^

From 21421830bd386e3861cb8d0a4151fbf3ed00b59f Mon Sep 17 00:00:00 2001
From: Qian Lin
Date: Thu, 9 Apr 2026 00:18:51 -0700
Subject: [PATCH 4/4] small cleanup

---
 .../dexsuite_lift/step_3_evaluation.rst          |  2 +-
 .../reinforcement_learning/index.rst             |  2 +-
 .../step_2_policy_training.rst                   | 16 ----------------
 .../reinforcement_learning/step_3_evaluation.rst |  2 +-
 4 files changed, 3 insertions(+), 19 deletions(-)

diff --git a/docs/pages/example_workflows/dexsuite_lift/step_3_evaluation.rst b/docs/pages/example_workflows/dexsuite_lift/step_3_evaluation.rst
index 7e3439b2d..703621432 100644
--- a/docs/pages/example_workflows/dexsuite_lift/step_3_evaluation.rst
+++ b/docs/pages/example_workflows/dexsuite_lift/step_3_evaluation.rst
@@ -9,7 +9,7 @@ Once inside the container, set the models directory:
 
 .. code-block:: bash
 
-   export MODELS_DIR=models/isaaclab_arena/dexsuite_lift
+   export MODELS_DIR=/models/isaaclab_arena/dexsuite_lift
    mkdir -p $MODELS_DIR
 
 This step evaluates a checkpoint using Arena's ``dexsuite_lift`` environment.
diff --git a/docs/pages/example_workflows/reinforcement_learning/index.rst b/docs/pages/example_workflows/reinforcement_learning/index.rst
index 0ca679321..be15b2b2a 100644
--- a/docs/pages/example_workflows/reinforcement_learning/index.rst
+++ b/docs/pages/example_workflows/reinforcement_learning/index.rst
@@ -72,7 +72,7 @@ You'll need to create folders for logs, checkpoints, and models:
 
    export LOG_DIR=logs/rsl_rl
    mkdir -p $LOG_DIR
-   export MODELS_DIR=models/isaaclab_arena/reinforcement_learning
+   export MODELS_DIR=/models/isaaclab_arena/reinforcement_learning
    mkdir -p $MODELS_DIR
 
 Workflow Steps
diff --git a/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst b/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
index 7608cda9a..d9c748d3c 100644
--- a/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
+++ b/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
@@ -85,22 +85,6 @@ Replace ``<run_folder>`` with the run folder name under ``logs/rsl_rl/generic_exp
 If ``--load_run`` is omitted, the latest run is selected. If ``--checkpoint`` is omitted,
 the latest checkpoint in that run is loaded.
 
-.. tip::
-
-   You can also combine ``--resume`` with Hydra overrides to change hyperparameters mid-training,
-   e.g. lowering the learning rate for fine-tuning:
-
-   .. code-block:: bash
-
-      python submodules/IsaacLab/scripts/reinforcement_learning/rsl_rl/train.py \
-         --external_callback isaaclab_arena.environments.isaaclab_interop.environment_registration_callback \
-         --task lift_object \
-         --rl_training_mode \
-         --num_envs 4096 \
-         --max_iterations 4000 \
-         --resume \
-         agent.algorithm.learning_rate=0.00005
-
 
 Monitoring Training
 ^^^^^^^^^^^^^^^^^^^
diff --git a/docs/pages/example_workflows/reinforcement_learning/step_3_evaluation.rst b/docs/pages/example_workflows/reinforcement_learning/step_3_evaluation.rst
index 601ae47ab..fed818ec3 100644
--- a/docs/pages/example_workflows/reinforcement_learning/step_3_evaluation.rst
+++ b/docs/pages/example_workflows/reinforcement_learning/step_3_evaluation.rst
@@ -9,7 +9,7 @@ Once inside the container, set the models directory if you plan to download pre-
 
 .. code:: bash
 
-   export MODELS_DIR=models/isaaclab_arena/reinforcement_learning
+   export MODELS_DIR=/models/isaaclab_arena/reinforcement_learning
    mkdir -p $MODELS_DIR
 
 This tutorial assumes you've completed :doc:`step_2_policy_training` and have a trained checkpoint,