Developing RL control for the IHMC Alex humanoid robot.
STATUS: WORKS
Train:
python train.py --timesteps 10000000 --n-envs 8
Evaluate:
python eval.py
TensorBoard:
tensorboard --logdir rl_models/tensorboard
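The `--timesteps` and `--n-envs` flags above can be sketched with `argparse`; the parser below is an assumption about how `train.py` reads them, not the repo's actual code:

```python
import argparse

def parse_args(argv=None):
    # CLI flags matching the train command above; defaults are assumptions.
    p = argparse.ArgumentParser(description="Train an RL walking policy for Alex")
    p.add_argument("--timesteps", type=int, default=10_000_000,
                   help="total environment steps to train for")
    p.add_argument("--n-envs", type=int, default=8,
                   help="number of parallel environments")
    return p.parse_args(argv)

# Parsing the exact command shown above:
args = parse_args(["--timesteps", "10000000", "--n-envs", "8"])
```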
STATUS: WORKS
Loads the trained stand pose, if available, as the curriculum starting point.
- Test the initial (untrained) reference gait used as the starting point for learning:
mjpython test-gait.py
- Start training to refine walking:
python train.py
Evaluate (after training):
python eval.py
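The curriculum warm-start described above ("loads the trained stand pose if available") can be sketched as a checkpoint check; the file path and return labels here are hypothetical, not the repo's actual names:

```python
from pathlib import Path

STAND_MODEL = Path("rl_models/stand_pose.zip")  # assumed checkpoint location

def initial_policy_source(stand_model: Path = STAND_MODEL) -> str:
    # Curriculum: warm-start walking training from the trained stand
    # policy when its checkpoint exists; otherwise start from scratch.
    return "stand_pose_checkpoint" if stand_model.exists() else "random_init"
```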
Since the Gymnasium MuJoCo Humanoid env learns to walk more easily, we built a similar env for Alex.
STATUS: WORKS (episodes run up to 300 steps)
Train:
python train.py
TensorBoard:
tensorboard --logdir rl_models/tensorboard
Eval:
python eval.py
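An env "similar to" Gymnasium's Humanoid typically means the same reward/termination shape: a healthy bonus, a forward-velocity term, and a control cost, with the episode ending when the torso leaves a height band. A minimal sketch of that logic, with weights and the height band as illustrative assumptions (not the repo's tuned values):

```python
import numpy as np

# Reward/termination shaped after Gymnasium's Humanoid envs, adapted for Alex.
# All constants here are assumptions for illustration.
HEALTHY_Z = (0.8, 2.0)      # torso height band; leaving it ends the episode
FORWARD_WEIGHT = 1.25       # reward per m/s of forward velocity
CTRL_COST_WEIGHT = 0.1      # penalty on squared actuator commands
HEALTHY_REWARD = 5.0        # per-step bonus while upright

def step_reward(vx: float, torso_z: float, action: np.ndarray):
    """Return (reward, terminated) for one control step."""
    terminated = not (HEALTHY_Z[0] <= torso_z <= HEALTHY_Z[1])
    reward = (HEALTHY_REWARD * (not terminated)
              + FORWARD_WEIGHT * vx
              - CTRL_COST_WEIGHT * float(np.square(action).sum()))
    return reward, terminated
```

The 300-step cap mentioned above would then be a time-limit truncation on top of this termination condition.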
STATUS: WORKS
Uses the built-in Gymnasium MuJoCo Humanoid-v5 env to learn to walk.
Training:
python train.py
- Logs and models are saved in rl_models/.
- The script uses SubprocVecEnv for multi-process training and VecNormalize for observation/reward normalization.
TensorBoard - monitor training progress:
tensorboard --logdir rl_models/tensorboard
Evaluation:
python eval.py
- Reference Gait: The robot follows a periodic sinusoidal target (a "fixed set of steps") for its legs. The RL policy learns residuals (offsets) on top of this gait to maintain balance and optimize forward velocity.
- Symmetry: The right leg automatically mirrors the left leg's motion with a 180-degree (0.5 of the cycle) phase shift.
- Minimum Actuation: Only the pitch axes of the legs are actuated: hip_y, knee, and ankle_y. All other joints are held at the stable STAND_PREP pose.
- Reward Function: Encourages forward velocity (vx) while penalizing lateral drift, vertical height deviation, and excessive control effort.
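The reference gait and mirroring described above can be sketched as follows; the stride frequency, amplitudes, and joint names are illustrative assumptions, not the repo's tuned values:

```python
import numpy as np

GAIT_HZ = 1.5                                        # stride frequency (assumed)
AMP = {"hip_y": 0.3, "knee": 0.5, "ankle_y": 0.2}    # rad, assumed amplitudes

def reference_gait(t: float) -> dict:
    """Target pitch angles for both legs at time t (seconds)."""
    phase_l = 2.0 * np.pi * GAIT_HZ * t
    phase_r = phase_l + np.pi   # right leg mirrors left: 0.5-cycle phase shift
    return {
        "left":  {j: a * np.sin(phase_l) for j, a in AMP.items()},
        "right": {j: a * np.sin(phase_r) for j, a in AMP.items()},
    }

def apply_residual(target: dict, residual: dict) -> dict:
    # The policy outputs small offsets that are added on top of the fixed gait.
    return {leg: {j: target[leg][j] + residual[leg][j] for j in target[leg]}
            for leg in target}
```

At any instant the right-leg targets are the negation of the left-leg targets, which is exactly the 0.5 phase shift for a sinusoid.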