# activation-engineering

Here are 4 public repositories matching this topic...

ZFancy / awesome-activation-engineering

A curated list of resources for activation engineering

control concept transparent ai-safety interpretability large-language-models llm llm-aligment activation-engineering concept-rep concept-activation-vector

Updated Oct 2, 2025

bassrehab / steering-vectors-agents

Runtime control of LLM agent behaviors through activation steering vectors. More calibrated than prompting.

machine-learning transformers pytorch steering-behaviors ai-safety interpretability langchain llm-agents activation-engineering steering-vectors contrastive-activation-addition

Updated Dec 19, 2025
Python

G-Art / matrix_steering_vector_research

Iterative Sparse Matrix Steering: Closed-Form Subspace Alignment for Multi-Layer LLM Control (No SGD required).

pytorch alignment interpretability llm activation-engineering steering-vectors

Updated Jan 5, 2026
Jupyter Notebook

Jason-Wang313 / RISER

A closed-loop control system for Large Language Models that steers internal activation states in real-time to prevent mode collapse and toxicity

reinforcement-learning pytorch control-theory ai-safety riser mechanistic-interpretability llm-steering activation-engineering

Updated Apr 8, 2026
Python
