diff --git a/docs/pages/example_workflows/dexsuite_lift/step_2_policy_training.rst b/docs/pages/example_workflows/dexsuite_lift/step_2_policy_training.rst
index e55073460..c81454d47 100644
--- a/docs/pages/example_workflows/dexsuite_lift/step_2_policy_training.rst
+++ b/docs/pages/example_workflows/dexsuite_lift/step_2_policy_training.rst
@@ -64,6 +64,29 @@ Hyperparameters can be overridden with Hydra-style CLI arguments:
       agent.max_iterations=20000
       agent.save_interval=500
       agent.algorithm.learning_rate=0.0005
 
+Resuming from a Checkpoint
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To resume training from a previously saved checkpoint, use the ``--resume`` flag
+together with ``--load_run`` (run folder name) and ``--checkpoint`` (model filename).
+Both arguments are optional — when omitted, the most recent run and latest checkpoint
+are used automatically.
+
+.. code-block:: bash
+
+   python submodules/IsaacLab/scripts/reinforcement_learning/rsl_rl/train.py \
+       --task Isaac-Dexsuite-Kuka-Allegro-Lift-v0 \
+       --num_envs 512 \
+       --resume \
+       --load_run <run_name> \
+       --checkpoint model_5000.pt \
+       presets=newton presets=cube
+
+Replace ``<run_name>`` with the run folder name under ``logs/rsl_rl/dexsuite_kuka_allegro/``.
+If ``--load_run`` is omitted, the latest run is selected. If ``--checkpoint`` is omitted,
+the latest checkpoint in that run is loaded.
+
+
 Monitoring Training
 ^^^^^^^^^^^^^^^^^^^
diff --git a/docs/pages/example_workflows/dexsuite_lift/step_3_evaluation.rst b/docs/pages/example_workflows/dexsuite_lift/step_3_evaluation.rst
index 7e3439b2d..703621432 100644
--- a/docs/pages/example_workflows/dexsuite_lift/step_3_evaluation.rst
+++ b/docs/pages/example_workflows/dexsuite_lift/step_3_evaluation.rst
@@ -9,7 +9,7 @@ Once inside the container, set the models directory:
 
 .. code-block:: bash
 
-   export MODELS_DIR=models/isaaclab_arena/dexsuite_lift
+   export MODELS_DIR=/models/isaaclab_arena/dexsuite_lift
    mkdir -p $MODELS_DIR
 
 This step evaluates a checkpoint using Arena's ``dexsuite_lift`` environment.
diff --git a/docs/pages/example_workflows/reinforcement_learning/index.rst b/docs/pages/example_workflows/reinforcement_learning/index.rst
index 0ca679321..be15b2b2a 100644
--- a/docs/pages/example_workflows/reinforcement_learning/index.rst
+++ b/docs/pages/example_workflows/reinforcement_learning/index.rst
@@ -72,7 +72,7 @@ You'll need to create folders for logs, checkpoints, and models:
 
    export LOG_DIR=logs/rsl_rl
    mkdir -p $LOG_DIR
-   export MODELS_DIR=models/isaaclab_arena/reinforcement_learning
+   export MODELS_DIR=/models/isaaclab_arena/reinforcement_learning
    mkdir -p $MODELS_DIR
 
 Workflow Steps
diff --git a/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst b/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
index c0e68c051..d9c748d3c 100644
--- a/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
+++ b/docs/pages/example_workflows/reinforcement_learning/step_2_policy_training.rst
@@ -61,6 +61,31 @@ For example, to train with relu activation and a higher learning rate:
 
       agent.algorithm.learning_rate=0.001
 
+Resuming from a Checkpoint
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To resume training from a previously saved checkpoint, use the ``--resume`` flag
+together with ``--load_run`` (run folder name) and ``--checkpoint`` (model filename).
+Both arguments are optional — when omitted, the most recent run and latest checkpoint
+are used automatically.
+
+.. code-block:: bash
+
+   python submodules/IsaacLab/scripts/reinforcement_learning/rsl_rl/train.py \
+       --external_callback isaaclab_arena.environments.isaaclab_interop.environment_registration_callback \
+       --task lift_object \
+       --rl_training_mode \
+       --num_envs 4096 \
+       --max_iterations 4000 \
+       --resume \
+       --load_run <run_name> \
+       --checkpoint model_1999.pt
+
+Replace ``<run_name>`` with the run folder name under ``logs/rsl_rl/generic_experiment/``.
+If ``--load_run`` is omitted, the latest run is selected. If ``--checkpoint`` is omitted,
+the latest checkpoint in that run is loaded.
+
+
 Monitoring Training
 ^^^^^^^^^^^^^^^^^^^
@@ -101,6 +126,7 @@ During training, each iteration prints a summary to the console:
 
       ETA: 00:00:49
 
+
 Multi-GPU Training
 ^^^^^^^^^^^^^^^^^^
@@ -112,7 +138,7 @@ Add ``--distributed`` to spread environments across all available GPUs:
       --external_callback isaaclab_arena.environments.isaaclab_interop.environment_registration_callback \
       --task lift_object \
       --rl_training_mode \
-      --num_envs 4096\
+      --num_envs 4096 \
       --max_iterations 2000 \
       --distributed
 
diff --git a/docs/pages/example_workflows/reinforcement_learning/step_3_evaluation.rst b/docs/pages/example_workflows/reinforcement_learning/step_3_evaluation.rst
index 601ae47ab..fed818ec3 100644
--- a/docs/pages/example_workflows/reinforcement_learning/step_3_evaluation.rst
+++ b/docs/pages/example_workflows/reinforcement_learning/step_3_evaluation.rst
@@ -9,7 +9,7 @@ Once inside the container, set the models directory if you plan to download pre-
 
 .. code:: bash
 
-   export MODELS_DIR=models/isaaclab_arena/reinforcement_learning
+   export MODELS_DIR=/models/isaaclab_arena/reinforcement_learning
    mkdir -p $MODELS_DIR
 
 This tutorial assumes you've completed :doc:`step_2_policy_training` and have a trained checkpoint,