A multi-agent hypothesis validation system built on the Claude Agent SDK. It stress-tests business ideas across 5 dimensions: Market, Technical, Financial, Competitive, and Devil's Advocate.
- Multi-dimensional analysis: Validates hypotheses across 5 critical dimensions
- Evidence-based: Every claim must have a source URL (enforced via hooks)
- Iterative deep-dives: Automatically investigates low-confidence areas
- Market-first weighting: Market dimension gets 35% weight in final scoring
- Structured reports: Generates comprehensive markdown reports with verdicts
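
Two of the features above, the source-URL requirement and the selection of low-confidence areas for a deep-dive, can be illustrated with a minimal sketch. It does not use the project's hooks or the Claude Agent SDK: the claim shape (`text`, `source_url`), the function names, and the 0.8 default threshold are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def has_valid_source(claim: dict) -> bool:
    """Return True if the claim carries an http(s) URL with a host.

    Hypothetical claim shape: {"text": ..., "source_url": ...}.
    The project's real models may look different.
    """
    url = claim.get("source_url") or ""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def dimensions_to_revisit(confidence: dict[str, float], threshold: float = 0.8) -> list[str]:
    """Return dimensions whose confidence is below the threshold, lowest first.

    The threshold default is an assumption, not a documented project default.
    """
    low = [name for name, value in confidence.items() if value < threshold]
    return sorted(low, key=lambda name: confidence[name])


if __name__ == "__main__":
    print(has_valid_source({"text": "Market is growing", "source_url": "https://example.com/report"}))
    print(dimensions_to_revisit({"market": 0.9, "technical": 0.5, "financial": 0.7}))
```

A check like `has_valid_source` could reject unsourced claims before they reach the report, and `dimensions_to_revisit` could decide which areas get another research pass.
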
# Clone the repository
git clone https://github.com/Sarangk90/hypothesis-validator.git
cd hypothesis-validator
# Install with pip
pip install -e .
# Or with poetry
poetry install

- Copy the example environment file:
  cp .env.example .env
- Add your Anthropic API key to .env:
  ANTHROPIC_API_KEY=your_actual_api_key
# Validate a hypothesis from a string
python -m hypothesis_validator "Your hypothesis here"
# Validate from a file
python -m hypothesis_validator hypothesis.txt
# With options
python -m hypothesis_validator hypothesis.txt \
  --max-iterations 5 \
  --confidence 0.8 \
  --output report.md

import asyncio
from hypothesis_validator import HypothesisValidator

async def main():
    validator = HypothesisValidator()
    report, state = await validator.run("""
    Product: AI-powered code review tool
    Target: Enterprise development teams
    Problem: Code reviews are slow and inconsistent
    """)
    print(report)

asyncio.run(main())

| Dimension | Weight | Focus Areas |
|---|---|---|
| Market | 35% | TAM/SAM/SOM, pain points, timing, demand signals |
| Technical | 20% | Feasibility, architecture, MVP timeline |
| Financial | 15% | Pricing, unit economics, funding requirements |
| Competitive | 15% | Direct/indirect competitors, defensibility |
| Devil's Advocate | 15% | Assumption attacks, failure modes (inverted) |
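
As a rough illustration of how the weights in the table could combine into one score, here is a minimal sketch. The weights come from the table; the 0 to 10 scale, the dictionary keys, and the reading of "inverted" as `10 - score` for the Devil's Advocate dimension are assumptions, and the project may compute this differently.

```python
# Weights taken from the table above (they sum to 1.0).
WEIGHTS = {
    "market": 0.35,
    "technical": 0.20,
    "financial": 0.15,
    "competitive": 0.15,
    "devils_advocate": 0.15,
}


def weighted_score(scores: dict[str, float]) -> float:
    """Combine per-dimension scores (assumed 0-10) into one weighted score."""
    adjusted = dict(scores)
    # Assumption: "inverted" means a high devil's advocate score
    # (strong objections) should lower the total, so flip it first.
    adjusted["devils_advocate"] = 10.0 - scores["devils_advocate"]
    return sum(WEIGHTS[name] * adjusted[name] for name in WEIGHTS)


if __name__ == "__main__":
    example = {
        "market": 8.0,
        "technical": 7.0,
        "financial": 6.0,
        "competitive": 6.0,
        "devils_advocate": 3.0,
    }
    print(round(weighted_score(example), 2))
```

With these example inputs the market term contributes 2.8 of the total, which shows how strongly the 35% weight steers the result.
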
- STRONG_GO: Weighted score ≥7.0 AND devil's advocate ≤4.0
- CONDITIONAL_GO: Weighted score ≥6.0 AND devil's advocate ≤6.0
- NEEDS_MORE_DATA: Weighted score ≥5.0 OR many unresolved gaps
- WEAK_NO: Weighted score ≥3.5
- STRONG_NO: Weighted score <3.5

hypothesis-validator/
├── src/hypothesis_validator/
│   ├── __init__.py
│   ├── main.py            # CLI entry point
│   ├── orchestrator.py    # Main orchestration logic
│   ├── state.py           # Pydantic models
│   ├── tools.py           # Custom MCP tools
│   ├── agents/            # Validator agents
│   ├── prompts/           # Agent prompts
│   ├── hooks/             # Quality validation hooks
│   └── report/            # Report generator
├── examples/
│   └── example_hypothesis.py  # Example usage
├── outputs/               # Generated reports
└── tests/

MIT License