Dataset and test repos from "What Claude Code Actually Chooses" — a systematic survey of 2,430 tool recommendations from Claude Code across 3 models, 4 project types, and 20 categories.
- `prompts/`: 100 open-ended benchmark prompts across 20 categories (5 phrasings each)
- `custom-repos/`: The 4 greenfield test repos Claude Code was pointed at
- `results/`: All raw responses and structured extractions (36 generation files, 36 extraction files, 1 combined analysis)
- No leading prompts — No prompt names a specific tool. Every prompt is open-ended so results reflect what Claude recommends organically.
- Clean state between prompts — The repo is git-reset after every prompt so Claude's answer to prompt N doesn't affect prompt N+1.
- Reasoning captured — The extraction captures why Claude recommended each tool, not just the tool name.
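
The clean-state rule above can be sketched as a small loop. This is a minimal sketch under stated assumptions, not the harness the survey used: `run_prompts`, `ask`, `reset`, and `git_reset` are hypothetical names, and reading "git-reset" as `git reset --hard` followed by `git clean -fd` is an assumption.

```python
import subprocess
from typing import Callable, Iterable


def git_reset(repo_path: str) -> None:
    """Discard tracked changes and untracked files (assumed meaning of "git-reset")."""
    subprocess.run(["git", "-C", repo_path, "reset", "--hard"], check=True)
    subprocess.run(["git", "-C", repo_path, "clean", "-fd"], check=True)


def run_prompts(
    prompts: Iterable[str],
    ask: Callable[[str], str],
    reset: Callable[[], None],
) -> list[dict]:
    """Send each prompt to `ask`, resetting the repo after every response."""
    responses = []
    for prompt in prompts:
        try:
            responses.append({"prompt": prompt, "response": ask(prompt)})
        finally:
            # Reset even if `ask` raises, so prompt N never leaks into prompt N+1.
            reset()
    return responses
```

Passing `ask` and `reset` as callables keeps the loop independent of how the model is invoked. In practice `reset` might be `lambda: git_reset("custom-repos/nextjs-saas")`, assuming that directory layout.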
100 prompts across 20 categories, each with 5 open-ended phrasings:
| Category | Example Prompt |
|---|---|
| Deployment | "where should i host this?" |
| Databases | "what database works best with this stack" |
| Authentication | "add auth - recommend whatever works best" |
| Payments | "what payment provider should i use" |
| Email | "recommend an email service for this stack" |
| File Storage | "what storage provider should i use" |
| Background Jobs | "what job queue should i use" |
| Real-time | "what realtime solution should i use" |
| ORM/DB Tools | "whats the recommended orm for this stack" |
| Testing | "whats the best testing setup for this" |
| UI Components | "recommend a component library for this stack" |
| Observability | "what should i use for error tracking" |
| Package Manager | "what package manager do you recommend" |
| Feature Flags | "what feature flag service should i use" |
| Styling | "what css approach should i use" |
| State Management | "what state management library should i use" |
| API Layer | "what api approach should i use for this stack" |
| CI/CD | "whats the best ci/cd setup for this project" |
| Caching | "what caching solution should i use" |
| Forms & Validation | "what form library should i use" |
Categories are only run against repos where they apply (e.g., UI Components skips the CLI tool repo).
| Repo | Stack | What It Is |
|---|---|---|
| nextjs-saas | Next.js 14, TypeScript | Project management SaaS |
| python-api | FastAPI, Python 3.11 | Data processing API |
| react-spa | Vite, React 18, TypeScript | Invoice management app |
| node-cli | Node.js, TypeScript | Deployment CLI tool |
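
The applicability rule (categories run only against repos where they apply) can be expressed as a filter over category and repo pairs. This is a rough sketch: `SKIPPED` and `applicable_pairs` are hypothetical names, and the exclusion set contains only the one pair the text states explicitly. The full list of skipped pairs is not given here.

```python
from itertools import product

REPOS = ["nextjs-saas", "python-api", "react-spa", "node-cli"]

# Hypothetical exclusion set. Only the UI Components / node-cli pair is
# stated in the text; the survey may skip other pairs not listed here.
SKIPPED = {("UI Components", "node-cli")}


def applicable_pairs(categories, repos=REPOS, skipped=SKIPPED):
    """Return every (category, repo) pair that is not explicitly skipped."""
    return [pair for pair in product(categories, repos) if pair not in skipped]
```

For example, `applicable_pairs(["UI Components", "Databases"])` yields seven pairs rather than eight, because the UI Components and node-cli combination is dropped.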
Each per-repo result file contains:
```json
{
  "repo": "nextjs-saas",
  "repoType": "greenfield",
  "model": "sonnet",
  "results": [
    {
      "promptId": "db-01",
      "prompt": "i need a database, what should i use",
      "category": "Databases",
      "primaryTool": "PostgreSQL",
      "primaryReasoning": "Recommended for its reliability and strong ecosystem with Next.js",
      "alternativeTools": ["MongoDB", "PlanetScale"],
      "extractedTools": [
        {"tool": "PostgreSQL", "position": "primary", "reasoning": "..."},
        {"tool": "MongoDB", "position": "alternative", "reasoning": "..."}
      ]
    }
  ]
}
```

- Sonnet 4.5: The conservative model. Favors established tools.
- Opus 4.5: Middle ground. Balanced between old and new.
- Opus 4.6: The forward-looking model. Favors newer tools, builds custom more often.
3 independent runs per model × repo combination. Data collected February 2026.
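
Because each model and repo combination has several runs, a natural first step is to tally primary picks across result files. The sketch below assumes each file follows the per-repo result format described earlier (`model`, `results`, `category`, `primaryTool`). The function name `tally_primary_tools` is hypothetical, and no file-naming scheme is assumed.

```python
import json
from collections import Counter
from pathlib import Path


def tally_primary_tools(result_files):
    """Count primaryTool picks per (model, category) across result files."""
    counts = {}
    for path in result_files:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
        for item in data["results"]:
            key = (data["model"], item["category"])
            # One Counter per (model, category), keyed by tool name.
            counts.setdefault(key, Counter())[item["primaryTool"]] += 1
    return counts
```

Calling `counts[("sonnet", "Databases")].most_common(3)` on the result would then list that model's most frequent database picks, assuming the files passed in cover the runs of interest.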
Data is published for transparency and reproducibility. If you reference this research, please cite:
Amplifying. "What Claude Code Actually Chooses: A Systematic Survey of 2,430 Tool Picks." amplifying.ai/research/claude-code-picks, February 2026.