From d461c5dfb606f5a567d97146649980a7997271a6 Mon Sep 17 00:00:00 2001
From: Qian Lin
Date: Wed, 8 Apr 2026 23:26:34 -0700
Subject: [PATCH 1/4] Add docs for resuming training for the lift object task

---
 .../step_2_policy_training.rst | 27 ++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst b/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
index c0e68c051..70fa5e6d9 100644
--- a/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
+++ b/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
@@ -101,6 +101,31 @@ During training, each iteration prints a summary to the console:
 
        ETA: 00:00:49
 
+Resuming from a Checkpoint
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To resume training from a previously saved checkpoint, use the ``--resume`` flag
+together with ``--load_run`` (run folder name) and ``--checkpoint`` (model filename).
+Both arguments are optional; when omitted, the most recent run and latest checkpoint
+are used automatically.
+
+.. code-block:: bash
+
+   python submodules/IsaacLab/scripts/reinforcement_learning/rsl_rl/train.py \
+      --external_callback isaaclab_arena.environments.isaaclab_interop.environment_registration_callback \
+      --task lift_object \
+      --rl_training_mode \
+      --num_envs 4096 \
+      --max_iterations 4000 \
+      --resume \
+      --load_run <run_folder> \
+      --checkpoint model_1999.pt
+
+Replace ``<run_folder>`` with the run folder name under ``logs/rsl_rl/generic_experiment/``.
+If ``--load_run`` is omitted, the latest run is selected. If ``--checkpoint`` is omitted,
+the latest checkpoint in that run is loaded.
+
+
 Multi-GPU Training
 ^^^^^^^^^^^^^^^^^^
 
@@ -112,7 +137,7 @@ Add ``--distributed`` to spread environments across all available GPUs:
       --external_callback isaaclab_arena.environments.isaaclab_interop.environment_registration_callback \
       --task lift_object \
       --rl_training_mode \
-      --num_envs 4096\
+      --num_envs 4096 \
       --max_iterations 2000 \
       --distributed

From 62cbec36f0c64e521244c0e4e70852703e52a76d Mon Sep 17 00:00:00 2001
From: Qian Lin
Date: Wed, 8 Apr 2026 23:34:41 -0700
Subject: [PATCH 2/4] Add docs for resuming training to the dexsuite example

---
 .../dexsuite_lift/step_2_policy_training.rst | 23 +++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/docs/pages/example_workflows/dexsuite_lift/step_2_policy_training.rst b/docs/pages/example_workflows/dexsuite_lift/step_2_policy_training.rst
index e55073460..c81454d47 100644
--- a/docs/pages/example_workflows/dexsuite_lift/step_2_policy_training.rst
+++ b/docs/pages/example_workflows/dexsuite_lift/step_2_policy_training.rst
@@ -64,6 +64,29 @@ Hyperparameters can be overridden with Hydra-style CLI arguments:
 
       agent.max_iterations=20000 agent.save_interval=500 agent.algorithm.learning_rate=0.0005
 
+Resuming from a Checkpoint
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To resume training from a previously saved checkpoint, use the ``--resume`` flag
+together with ``--load_run`` (run folder name) and ``--checkpoint`` (model filename).
+Both arguments are optional; when omitted, the most recent run and latest checkpoint
+are used automatically.
+
+.. code-block:: bash
+
+   python submodules/IsaacLab/scripts/reinforcement_learning/rsl_rl/train.py \
+      --task Isaac-Dexsuite-Kuka-Allegro-Lift-v0 \
+      --num_envs 512 \
+      --resume \
+      --load_run <run_folder> \
+      --checkpoint model_5000.pt \
+      presets=newton presets=cube
+
+Replace ``<run_folder>`` with the run folder name under ``logs/rsl_rl/dexsuite_kuka_allegro/``.
+If ``--load_run`` is omitted, the latest run is selected. If ``--checkpoint`` is omitted,
+the latest checkpoint in that run is loaded.
+
+
 Monitoring Training
 ^^^^^^^^^^^^^^^^^^^
 

From a479985e88baebbe6baf9d76eb91ccbd7dc6872d Mon Sep 17 00:00:00 2001
From: Qian Lin
Date: Wed, 8 Apr 2026 23:40:04 -0700
Subject: [PATCH 3/4] reorder

---
 .../step_2_policy_training.rst | 65 ++++++++++++-------
 1 file changed, 41 insertions(+), 24 deletions(-)

diff --git a/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst b/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
index 70fa5e6d9..7608cda9a 100644
--- a/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
+++ b/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
@@ -61,6 +61,47 @@ For example, to train with relu activation and a higher learning rate:
 
       agent.algorithm.learning_rate=0.001
 
+Resuming from a Checkpoint
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To resume training from a previously saved checkpoint, use the ``--resume`` flag
+together with ``--load_run`` (run folder name) and ``--checkpoint`` (model filename).
+Both arguments are optional; when omitted, the most recent run and latest checkpoint
+are used automatically.
+
+.. code-block:: bash
+
+   python submodules/IsaacLab/scripts/reinforcement_learning/rsl_rl/train.py \
+      --external_callback isaaclab_arena.environments.isaaclab_interop.environment_registration_callback \
+      --task lift_object \
+      --rl_training_mode \
+      --num_envs 4096 \
+      --max_iterations 4000 \
+      --resume \
+      --load_run <run_folder> \
+      --checkpoint model_1999.pt
+
+Replace ``<run_folder>`` with the run folder name under ``logs/rsl_rl/generic_experiment/``.
+If ``--load_run`` is omitted, the latest run is selected. If ``--checkpoint`` is omitted,
+the latest checkpoint in that run is loaded.
+
+.. tip::
+
+   You can also combine ``--resume`` with Hydra overrides to change hyperparameters mid-training,
+   e.g. lowering the learning rate for fine-tuning:
+
+   .. code-block:: bash
+
+      python submodules/IsaacLab/scripts/reinforcement_learning/rsl_rl/train.py \
+         --external_callback isaaclab_arena.environments.isaaclab_interop.environment_registration_callback \
+         --task lift_object \
+         --rl_training_mode \
+         --num_envs 4096 \
+         --max_iterations 4000 \
+         --resume \
+         agent.algorithm.learning_rate=0.00005
+
+
 Monitoring Training
 ^^^^^^^^^^^^^^^^^^^
 
@@ -101,30 +142,6 @@ During training, each iteration prints a summary to the console:
 
        ETA: 00:00:49
 
-Resuming from a Checkpoint
-^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-To resume training from a previously saved checkpoint, use the ``--resume`` flag
-together with ``--load_run`` (run folder name) and ``--checkpoint`` (model filename).
-Both arguments are optional; when omitted, the most recent run and latest checkpoint
-are used automatically.
-
-.. code-block:: bash
-
-   python submodules/IsaacLab/scripts/reinforcement_learning/rsl_rl/train.py \
-      --external_callback isaaclab_arena.environments.isaaclab_interop.environment_registration_callback \
-      --task lift_object \
-      --rl_training_mode \
-      --num_envs 4096 \
-      --max_iterations 4000 \
-      --resume \
-      --load_run <run_folder> \
-      --checkpoint model_1999.pt
-
-Replace ``<run_folder>`` with the run folder name under ``logs/rsl_rl/generic_experiment/``.
-If ``--load_run`` is omitted, the latest run is selected. If ``--checkpoint`` is omitted,
-the latest checkpoint in that run is loaded.
-
 
 Multi-GPU Training
 ^^^^^^^^^^^^^^^^^^

From 21421830bd386e3861cb8d0a4151fbf3ed00b59f Mon Sep 17 00:00:00 2001
From: Qian Lin
Date: Thu, 9 Apr 2026 00:18:51 -0700
Subject: [PATCH 4/4] small cleanup

---
 .../dexsuite_lift/step_3_evaluation.rst          |  2 +-
 .../reinforcement_learning/index.rst             |  2 +-
 .../step_2_policy_training.rst                   | 16 ----------------
 .../reinforcement_learning/step_3_evaluation.rst |  2 +-
 4 files changed, 3 insertions(+), 19 deletions(-)

diff --git a/docs/pages/example_workflows/dexsuite_lift/step_3_evaluation.rst b/docs/pages/example_workflows/dexsuite_lift/step_3_evaluation.rst
index 7e3439b2d..703621432 100644
--- a/docs/pages/example_workflows/dexsuite_lift/step_3_evaluation.rst
+++ b/docs/pages/example_workflows/dexsuite_lift/step_3_evaluation.rst
@@ -9,7 +9,7 @@ Once inside the container, set the models directory:
 
 .. code-block:: bash
 
-   export MODELS_DIR=models/isaaclab_arena/dexsuite_lift
+   export MODELS_DIR=/models/isaaclab_arena/dexsuite_lift
    mkdir -p $MODELS_DIR
 
 This step evaluates a checkpoint using Arena's ``dexsuite_lift`` environment.
diff --git a/docs/pages/example_workflows/reinforcement_learning/index.rst b/docs/pages/example_workflows/reinforcement_learning/index.rst
index 0ca679321..be15b2b2a 100644
--- a/docs/pages/example_workflows/reinforcement_learning/index.rst
+++ b/docs/pages/example_workflows/reinforcement_learning/index.rst
@@ -72,7 +72,7 @@ You'll need to create folders for logs, checkpoints, and models:
 
    export LOG_DIR=logs/rsl_rl
    mkdir -p $LOG_DIR
-   export MODELS_DIR=models/isaaclab_arena/reinforcement_learning
+   export MODELS_DIR=/models/isaaclab_arena/reinforcement_learning
    mkdir -p $MODELS_DIR
 
 Workflow Steps
diff --git a/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst b/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
index 7608cda9a..d9c748d3c 100644
--- a/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
+++ b/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
@@ -85,22 +85,6 @@ Replace ``<run_folder>`` with the run folder name under ``logs/rsl_rl/generic_exp
 If ``--load_run`` is omitted, the latest run is selected. If ``--checkpoint`` is omitted,
 the latest checkpoint in that run is loaded.
 
-.. tip::
-
-   You can also combine ``--resume`` with Hydra overrides to change hyperparameters mid-training,
-   e.g. lowering the learning rate for fine-tuning:
-
-   .. code-block:: bash
-
-      python submodules/IsaacLab/scripts/reinforcement_learning/rsl_rl/train.py \
-         --external_callback isaaclab_arena.environments.isaaclab_interop.environment_registration_callback \
-         --task lift_object \
-         --rl_training_mode \
-         --num_envs 4096 \
-         --max_iterations 4000 \
-         --resume \
-         agent.algorithm.learning_rate=0.00005
-
 
 Monitoring Training
 ^^^^^^^^^^^^^^^^^^^
diff --git a/docs/pages/example_workflows/reinforcement_learning/step_3_evaluation.rst b/docs/pages/example_workflows/reinforcement_learning/step_3_evaluation.rst
index 601ae47ab..fed818ec3 100644
--- a/docs/pages/example_workflows/reinforcement_learning/step_3_evaluation.rst
+++ b/docs/pages/example_workflows/reinforcement_learning/step_3_evaluation.rst
@@ -9,7 +9,7 @@ Once inside the container, set the models directory if you plan to download pre-
 
 .. code:: bash
 
-   export MODELS_DIR=models/isaaclab_arena/reinforcement_learning
+   export MODELS_DIR=/models/isaaclab_arena/reinforcement_learning
    mkdir -p $MODELS_DIR
 
 This tutorial assumes you've completed :doc:`step_2_policy_training` and have a trained checkpoint,