
# Gradient routing

A companion repository for [Gradient Routing: Masking Gradients to Localize Computation in Neural Networks](https://arxiv.org/abs/2410.04332v1).

## Repo structure

- `factored_representations` is for shared functionality, although in practice, code for the different subprojects is mostly siloed.
  - `masklib.py` and `model_expansion.py` implement Expand, Route, Ablate for any TransformerLens model.
  - Includes some tests.
- `projects` contains the code to reproduce the results in the paper:
  - `minigrid`: localizing behavioral tendencies in a gridworld reinforcement learning agent
  - `mnist`: splitting the representations of an MNIST autoencoder
  - `nanoGPT-factrep`: training a model with a steering scalar, and unlearning virology
  - `tinystories`: unlearning a subset of TinyStories
- `shared_configs` is for commonly used configurations, e.g. model definitions and standard training options.
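For intuition about what Route does (this is a conceptual illustration only, not the repo's `masklib` API), the core idea of gradient routing is to mask gradients so that updates from designated data only reach designated parameters. A minimal numpy sketch on a toy two-parameter model:

```python
import numpy as np

# Toy model: pred = (w[0] + w[1]) * x, trained with squared error.
# Gradient routing: samples with region label 0 may only update w[0],
# and samples with region label 1 may only update w[1].
w = np.array([0.5, 0.5])
lr = 0.1
data = [(1.0, 2.0, 0), (1.0, 2.0, 1)]  # (x, target, region label)

for x, target, region in data:
    pred = w.sum() * x
    grad = 2 * (pred - target) * x * np.ones(2)  # dL/dw, same for both params
    mask = np.eye(2)[region]                     # route: zero the other param's gradient
    w -= lr * grad * mask                        # masked SGD update
```

After these two updates, each parameter has only been moved by "its own" region's data, which is what lets the ablation step later remove one capability without touching the other.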

## To use

1. Install PDM.
2. Install the PDM project (i.e., install the dependencies):
   ```shell
   pdm install
   ```
3. Install the recommended VSCode extensions.
4. Install the pre-commit git hooks:
   ```shell
   pdm run pre-commit install
   ```

You can then run Python scripts with `pdm run python <script.py>`, or by activating the virtual environment whose path is reported by `pdm info`, e.g.:

```shell
source /pdm-venvs/factored-representations-Dp430888-3.12/bin/activate
```

`.vscode/settings.json` is configured to automatically format and lint the code with Ruff (via the extension) on save.

## Tests

Run the tests with:

```shell
pdm run pytest
```

## Citation

```bibtex
@article{cloud2024gradient,
	title={Gradient Routing: Masking Gradients to Localize Computation in Neural Networks},
	author={Cloud, Alex and Goldman-Wetzler, Jacob and Wybitul, Evžen and Miller, Joseph and Turner, Alexander Matt},
	journal={arXiv preprint arXiv:2410.04332},
	url={https://arxiv.org/abs/2410.04332v1},
	year={2024},
}
```