Thought Anchors ⚓

We introduce a framework for interpreting the reasoning of large language models by attributing importance to individual sentences in their chain-of-thought. Using black-box, attention-based, and causal methods, we identify key reasoning steps, which we call thought anchors, that disproportionately influence downstream reasoning. These anchors are typically planning or backtracking sentences. Our work offers new tools and insights for understanding multi-step reasoning in language models.

See the paper for more details: https://arxiv.org/abs/2506.19143

Get Started

You can download our MATH rollout dataset or resample your own data.

Here's a quick rundown of the main scripts in this repository and what they do:

  1. generate_rollouts.py: Main script for generating reasoning rollouts. Our dataset was created with it.
  2. analyze_rollouts.py: Processes the generated rollouts and adds chunks_labeled.json and other metadata for each reasoning trace. It calculates metrics like forced answer importance, resampling importance, and counterfactual importance.
  3. step_attribution.py: Computes the sentence-to-sentence counterfactual importance score for all sentences in all reasoning traces.
  4. plots.py: Generates figures (e.g., the ones in the paper).
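The importance metrics above can be illustrated with a minimal sketch. Resampling importance, roughly, asks how the correct-answer rate changes between rollouts that keep a sentence and rollouts where it was resampled away (function name and inputs here are illustrative, not the repository's actual API):

```python
def resampling_importance(with_sentence, without_sentence):
    """Difference in correct-answer rate between rollouts that keep a
    sentence and rollouts where it was resampled away.

    Both arguments are lists of booleans: did the rollout reach the
    correct final answer?
    """
    p_with = sum(with_sentence) / len(with_sentence)
    p_without = sum(without_sentence) / len(without_sentence)
    return p_with - p_without

# Toy example: keeping the sentence raises accuracy from 0.25 to 0.75,
# so its importance is 0.5.
delta = resampling_importance([True, True, True, False],
                              [False, True, False, False])
print(delta)  # 0.5
```

Sentences with a large positive delta are candidate thought anchors.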

Here is what other files do:

  • selected_problems.json: A list of problems in the 25-75% accuracy range (i.e., challenging problems), sorted in increasing order of average sentence length. (Note: we use chunks, steps, and sentences interchangeably throughout the code.)
  • prompts.py: Auto-labeler LLM prompts used throughout this project. DAG_PROMPT is the one we used to generate labels (i.e., function tags or categories, e.g., uncertainty management) for each sentence.
  • utils.py: Includes utility and helper functions for reasoning trace analysis.
  • misc-experiments/: This folder includes miscellaneous experiment scripts. Some of them are ongoing work.
  • whitebox-analyses/: This folder includes the white-box experiments in the paper, including attention pattern analysis (e.g., receiver heads) and attention suppression.
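One ingredient of the attention-pattern analysis is measuring how much attention a sentence receives from later tokens (the basis of receiver-head analysis). A toy sketch of that aggregation over a hand-made token-level attention matrix (illustrative only, not the repository's implementation):

```python
def sentence_received_attention(attn, spans):
    """attn[q][k]: attention weight from query token q to key token k.
    spans: list of (start, end) token ranges, one per sentence.
    Returns the mean attention each sentence receives from later tokens."""
    scores = []
    for start, end in spans:
        total, count = 0.0, 0
        for q in range(end, len(attn)):   # query tokens after the sentence
            for k in range(start, end):   # key tokens inside the sentence
                total += attn[q][k]
                count += 1
        scores.append(total / count if count else 0.0)
    return scores

# 4 tokens forming two 2-token sentences; later tokens attend
# mostly back to sentence 0, making it a "receiver" of attention.
attn = [[1.0, 0.0, 0.0, 0.0],
        [0.5, 0.5, 0.0, 0.0],
        [0.4, 0.4, 0.2, 0.0],
        [0.3, 0.3, 0.2, 0.2]]
spans = [(0, 2), (2, 4)]
print(sentence_received_attention(attn, spans))
```

Sentences that receive unusually high attention from downstream tokens in specific heads are what the paper calls receiver-head patterns.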

Vision Rollout Generation

Added support for vision-language models and creative tasks by extending the rollout generation pipeline:

  • generate_vision_rollouts.py: New script that generates rollouts for creative/artistic analysis using vision-language models like Qwen2.5-VL-7B-Instruct
  • Multimodal input handling: Processes images alongside text prompts for vision-based creative analysis tasks
  • Compatible output format: Generates the same rollout structure as the original MATH pipeline, enabling seamless integration with existing analysis tools
  • Creative evaluation: Implements quality-based evaluation for subjective creative tasks instead of exact answer matching

The vision generator keeps the original's chunk-based rollout structure, so creative vision tasks can use the same importance-analysis framework.
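As a rough illustration of that shared chunk-based structure, a creative rollout record might look like the following (field names are hypothetical, not the exact schema, which is defined by the generation scripts):

```python
# Illustrative only: the real schema lives in generate_rollouts.py /
# generate_vision_rollouts.py; field names here are hypothetical.
rollout = {
    "problem_id": "creative_001",
    "prompt": "Describe the mood of this painting.",
    "chunks": [                      # one entry per sentence/step
        {"index": 0, "text": "The palette is muted."},
        {"index": 1, "text": "So the mood is melancholic."},
    ],
    "response": "The palette is muted. So the mood is melancholic.",
    "quality": 0.82,                 # replaces binary correctness
}

# Downstream analysis only assumes chunk-level granularity plus an outcome
# signal, so MATH (correct: bool) and creative (quality: float) both fit.
assert 0.0 <= rollout["quality"] <= 1.0
print(len(rollout["chunks"]))  # 2
```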

Usage:

# Using Qwen vision model
python generate_vision_rollouts.py -d vision_dataset.json -m Qwen/Qwen2.5-VL-7B-Instruct -np 10 -nr 50

# Using GPT-5 (requires OpenAI API key, automatically uses higher token limit)
# Rollouts are generated in parallel for 10-50x speedup
python generate_vision_rollouts.py -p OpenAI -m gpt-5 -d vision_dataset.json -np 1 -nr 50

# Control concurrency to manage API rate limits (default: 50 concurrent requests)
python generate_vision_rollouts.py -p OpenAI -m gpt-5 -d vision_dataset.json -np 1 -nr 50 -c 20
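The -c flag bounds the number of API requests in flight at once. That pattern can be sketched with an asyncio semaphore (illustrative, not the script's actual code):

```python
import asyncio

async def generate_rollout(i, sem):
    async with sem:               # at most `limit` requests in flight
        await asyncio.sleep(0)    # stand-in for the real API call
        return f"rollout-{i}"

async def generate_all(n_rollouts, limit=50):
    sem = asyncio.Semaphore(limit)
    tasks = [generate_rollout(i, sem) for i in range(n_rollouts)]
    return await asyncio.gather(*tasks)  # preserves submission order

results = asyncio.run(generate_all(5, limit=2))
print(results)  # ['rollout-0', 'rollout-1', 'rollout-2', 'rollout-3', 'rollout-4']
```

Lowering the limit (e.g., -c 20) trades throughput for staying under provider rate limits.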

Creative/Vision Analysis Extension

Extended the original MATH-focused analysis pipeline to handle subjective creative and artistic tasks. This extension:

  • Reuses the core algorithms: The same embedding models, similarity calculations, and importance metrics from the original MATH analysis are applied to creative responses
  • Replaces correctness with quality: Instead of binary correct/incorrect evaluation, we use GPT-5-nano to score creative response quality on a continuous scale (0-1)
  • Maintains full compatibility: Both MATH and creative analysis run through the same unified pipeline in analyze_rollouts.py
  • Supports vision-language models: Works with models like Qwen2.5-VL-7B-Instruct for image-based creative analysis

The extension demonstrates that the thought-anchor framework generalizes beyond mathematical reasoning to subjective creative domains while keeping the original analysis infrastructure intact.
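When a judge model returns a free-text quality score, the reply has to be parsed and clamped to the 0-1 scale before it can replace binary correctness. A sketch of that parsing step (an assumption about how scoring could be handled, not the repository's exact logic):

```python
import re

def parse_quality_score(reply):
    """Pull the first number out of a judge-model reply and clamp it
    to [0, 1]. Falls back to 0.0 when no number is found."""
    match = re.search(r"\d*\.?\d+", reply)
    if not match:
        return 0.0
    return min(max(float(match.group()), 0.0), 1.0)

print(parse_quality_score("Quality: 0.85"))   # 0.85
print(parse_quality_score("9/10"))            # takes 9, clamps to 1.0
print(parse_quality_score("no score given"))  # 0.0
```

The clamp matters because judge models occasionally reply on the wrong scale; downstream metrics assume scores stay in [0, 1].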

Usage:

# Analyze Qwen vision rollouts
python analyze_rollouts.py -vc vision_rollouts/Qwen2.5-VL-7B-Instruct/temperature_0.7_top_p_0.9/creative_analysis

# Analyze GPT-5 vision rollouts
python analyze_rollouts.py -vc vision_rollouts/gpt-5/temperature_0.7_top_p_0.9/creative_analysis

Extract Top Thought Anchors

After running creative analysis, extract the most important reasoning steps:

# Extract anchors from GPT-5 analysis (save full text to file)
python extract_creative_anchors.py -r analysis/basic/creative_analysis/vision_analysis_results.json -k 10 -o anchors_analysis.txt --patterns

# Generate visualizations (now uses resampling importance by default)
python plot_creative_analysis.py  

# Or specify custom path
python plot_creative_analysis.py -rd vision_rollouts/gpt-5/temperature_0.7_top_p_0.9/creative_analysis

# For Qwen analysis (if you have Qwen data)
python plot_creative_analysis.py -rd vision_rollouts/Qwen2.5-VL-7B-Instruct/temperature_0.7_top_p_0.9/creative_analysis
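Conceptually, extracting the top anchors is just sorting chunks by an importance score and keeping the top k (field names here are illustrative, not the exact keys in the results JSON):

```python
def top_anchors(chunks, k=10, key="resampling_importance"):
    """Return the k chunks with the highest importance score."""
    return sorted(chunks, key=lambda c: c[key], reverse=True)[:k]

# Toy chunks: planning and backtracking steps tend to score highest.
chunks = [
    {"text": "Let me plan the composition.", "resampling_importance": 0.9},
    {"text": "The sky is blue.",             "resampling_importance": 0.1},
    {"text": "Wait, reconsider the mood.",   "resampling_importance": 0.7},
]
for c in top_anchors(chunks, k=2):
    print(c["text"])
```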

Audio Similarity Analysis

Compare audio files (MP3) using multi-feature analysis. Supports MFCC-based features (fast) or CLAP embeddings (slower, with richer semantic matching).

cd suno-music

# Compare multiple songs for pair-wise similarity
python audio_similarity.py 01.mp3 02.mp3 03.mp3 --clap

# Find individual 3s windows that match across all files (similarity anchors)
python audio_similarity.py 01.mp3 02.mp3 03.mp3 --passages --clap

# Find 10-second sequences that connect all files
python audio_similarity.py 01.mp3 02.mp3 03.mp3 --sequences 10 --clap
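Conceptually, the window-matching modes reduce each audio window to a feature vector (MFCC statistics or a CLAP embedding) and match windows across files by cosine similarity. A minimal sketch with toy vectors (not the script's implementation):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def best_matching_window(query, windows):
    """Index of the window (feature vector) most similar to `query`."""
    sims = [cosine(query, w) for w in windows]
    return max(range(len(sims)), key=sims.__getitem__)

# Toy 3-dim "embeddings" for three windows of another track.
query = [1.0, 0.0, 0.2]
windows = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.3], [0.2, 0.8, 0.1]]
print(best_matching_window(query, windows))  # 1
```

Windows that match well across all input files play the role of "similarity anchors" in the --passages mode.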

Citation

Please cite our work if you use our code or dataset.

@misc{bogdan2025thoughtanchorsllmreasoning,
      title={Thought Anchors: Which LLM Reasoning Steps Matter?},
      author={Paul C. Bogdan and Uzay Macar and Neel Nanda and Arthur Conmy},
      year={2025},
      eprint={2506.19143},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2506.19143},
}

Contact

For any questions, thoughts, or feedback, please reach out to uzaymacar@gmail.com and paulcbogdan@gmail.com.

Miscellaneous

To upload the math_rollouts dataset to HuggingFace, I ran:

hf upload-large-folder uzaymacar/math_rollouts --repo-type=dataset math_rollouts

However, a folder uploaded this way is not compatible with the HuggingFace datasets library. misc-scripts/push_hf_dataset.py takes care of this instead, creating a dataset-compatible data repository on HuggingFace.

About

⚓️ Repository for the "Thought Anchors: Which LLM Reasoning Steps Matter?" paper.
