Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies

The repo hosts the code for the experiments section in paper "Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies" by Ilyas Fatkhullin, Anas Barakat, Anastasia Kireeva, Niao He (2023).

This code contains implementations for N-PG-IGT, (N)-HARPG, and Vanilla-PG.

Prerequisites

This code is based on the garage repository. To install the code with our algorithms implementation, navigate to the directory containing the code and install garage as an editable package:

pip install -e '.[all,dev]'

To run experiments on mujoco environments (e.g., humanoid, hopper, halfcheetah), you additionally need to install it. The installation guide for mujoco can be found in mujoco-py repository.

Running the Experiments

You can find the main experiment file run_PG.py in directory examples_PolicyGradient. You can specify various parameters to control the experiment settings.

seed: the random seed for reproducibility.
batch_size: the number of samples per batch at each iteration.
gamma_0: the initial stepsize.
method: optimization method (sgd=Vanilla SGD, nigt=N-PG-IGT, nsgdm=Normalized SGD, nstormhess=N-HARPG, stormhess=HARPG).
eta_0: the initial momentum parameter
env: the environment (walker, acrobot, cartpole, halfcheetah, hopper, humanoid, reacher, swimmer)
logdir: the path to directory for logs

Example Command

To run an experiment with a specific configuration, navigate to your project's root directory and use the following command:

python examples/run_experiments.py -seed 11 -batch_size 5000 -gamma_0 0.1 -method sgd -env cartpole -epochs 100```

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies

Prerequisites

Running the Experiments

Example Command

FilesExpand file tree

README.md

Latest commit

History

README.md

File metadata and controls

Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies

Prerequisites

Running the Experiments

Example Command