The Outbreak Analysis Orchestration System is a comprehensive, multi-agent pipeline for detecting, validating, and assessing potential disease outbreaks. It employs a rigorous, evidence-based approach that challenges assumptions, tests alternative hypotheses, and validates threats through systematic data collection from authoritative sources.
- Multi-stage validation: 5 specialized agents working in sequence
- Devil's advocate approach: Systematic challenging of outbreak assumptions
- Evidence-based assessment: Data-driven validation of hypotheses
- Automated data collection: Firecrawl integration for web scraping
- Confidence scoring: Transparent assessment of evidence strength
- Risk prioritization: Resource allocation based on validated threats
```
┌─────────────────────┐
│  Outbreak Catalog   │
│     (CSV Data)      │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│ 1. Outbreak Flagger │ ──► potential_outbreaks.md
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│ 2. Devil's Advocate │ ──► devils_advocate_analysis.md
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│ 3. Data Gatherer    │ ──► data_gathering_plan.json
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│ 4. Firecrawl Agent  │ ──► validation_results.json
└──────────┬──────────┘     outbreak_data/*.json
           │
           ▼
┌─────────────────────┐
│ 5. Validation Agent │ ──► final_outbreak_validation_report.md
└─────────────────────┘
```
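The orchestrator itself can be a thin sequencer over the five stages above. A minimal sketch, assuming each agent is a standalone script run with a per-agent timeout (the script names come from the stage descriptions below; the error handling is illustrative, not the repository's actual logic):

```python
import subprocess
import sys

# Script names taken from the pipeline stages described in this README
AGENTS = [
    "outbreak_flagger_argo.py",
    "devils_advocate_analyzer.py",
    "data_gatherer_agent.py",
    "firecrawl_validation_agent.py",
    "hypothesis_validation_agent.py",
]

def run_agent(script: str, timeout: int = 300) -> bool:
    """Run one agent script; return True only on a zero exit code."""
    try:
        result = subprocess.run([sys.executable, script], timeout=timeout)
        return result.returncode == 0
    except subprocess.TimeoutExpired:
        return False

def run_pipeline(agents=AGENTS, timeout: int = 300) -> None:
    """Run all agents in sequence, aborting on the first failure."""
    for script in agents:
        if not run_agent(script, timeout):
            raise RuntimeError(f"{script} failed; aborting pipeline")
```

Because each stage writes its output to disk, a failed run can be resumed by invoking the remaining agents individually.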
Analyzes outbreak catalog data to identify potential disease outbreaks.
Input: outbreak_data/catalog.csv
Output: potential_outbreaks.md
Function:
- Processes outbreak data entries
- Identifies disease patterns
- Generates initial hypotheses
- Suggests investigation URLs
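The pattern-identification step can be as simple as counting co-occurring disease/location entries. A minimal sketch, in which the `disease` and `location` column names and the case-count threshold are assumptions about the catalog schema:

```python
import csv
from collections import Counter
from io import StringIO

def flag_potential_outbreaks(csv_text: str, threshold: int = 3) -> dict:
    """Flag (disease, location) pairs whose row counts meet a threshold.

    Column names 'disease' and 'location' are assumed; adjust to the
    actual catalog.csv schema.
    """
    rows = csv.DictReader(StringIO(csv_text))
    counts = Counter((row["disease"], row["location"]) for row in rows)
    return {pair: n for pair, n in counts.items() if n >= threshold}
```

In the real agent, flagged pairs would then be passed to the LLM to generate initial hypotheses and candidate investigation URLs.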
Challenges conventional outbreak interpretations with alternative explanations.
Input: potential_outbreaks.md
Output: devils_advocate_analysis.md
Function:
- Proposes non-outbreak explanations
- Identifies potential biases and artifacts
- Creates validation tasks
- Questions assumptions systematically
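One way to drive this stage is a dedicated prompt that forces the model to argue against the outbreak interpretation. The wording below is illustrative, not the repository's actual prompt:

```python
def build_devils_advocate_prompt(outbreak_summary: str) -> str:
    """Build a prompt asking for non-outbreak explanations of a flagged signal."""
    return (
        "Act as a devil's advocate for the outbreak summary below.\n"
        "For each claim, propose plausible non-outbreak explanations "
        "(reporting artifacts, surveillance changes, diagnostic shifts, "
        "seasonal baselines) and list the data needed to rule each one out.\n\n"
        f"Outbreak summary:\n{outbreak_summary}"
    )
```

The model's answer is what becomes `devils_advocate_analysis.md`, with each proposed alternative turned into a validation task for the next stage.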
Plans comprehensive data collection strategy for hypothesis validation.
Input: devils_advocate_analysis.md
Output: data_gathering_plan.json
Function:
- Generates 5-10 search queries per outbreak
- Identifies 10-15 URLs to scrape per outbreak
- Prioritizes data sources
- Defines validation requirements
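The README does not specify the shape of `data_gathering_plan.json`; a plausible per-outbreak entry, with all field names assumed, might look like:

```json
{
  "outbreaks": [
    {
      "name": "Measles cluster, Region X",
      "search_queries": [
        "measles outbreak Region X 2025",
        "Region X health department measles advisory"
      ],
      "urls_to_scrape": [
        "https://www.who.int/emergencies/disease-outbreak-news"
      ],
      "priority": "high",
      "validation_requirements": [
        "laboratory confirmation",
        "case counts by week"
      ]
    }
  ]
}
```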
Executes web searches and crawls to collect validation data.
Input: data_gathering_plan.json
Output: validation_results.json, outbreak_data/*.json
Function:
- Executes Firecrawl searches
- Performs deep crawls on authoritative sources
- Saves intermediate results
- Stores all scraped content
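The Firecrawl calls themselves depend on the SDK version in use, but the persistence step can be sketched independently. In this sketch the output directory matches the README's `outbreak_data/` convention, while the file-naming scheme is an assumption:

```python
import json
from pathlib import Path

def save_scrape_result(outbreak_id: str, url: str, content: dict,
                       out_dir: str = "outbreak_data") -> Path:
    """Persist one scraped page so every piece of evidence is auditable."""
    path = Path(out_dir)
    path.mkdir(exist_ok=True)
    # Hypothetical naming scheme: outbreak id plus a hash of the source URL
    fname = path / f"{outbreak_id}_{abs(hash(url))}.json"
    fname.write_text(json.dumps({"url": url, "content": content}, indent=2))
    return fname
```

Saving every page, not just the extracted facts, is what makes the final report's evidence trail reproducible.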
Validates all hypotheses against collected evidence for final assessment.
Input: All prior reports and crawled data
Output: final_outbreak_validation_report.md
Function:
- Evaluates evidence strength
- Tests original vs. alternative hypotheses
- Assigns confidence levels
- Provides risk assessments
- Python 3.8+
- Firecrawl API key
- ARGO LLM access
- Clone the repository:

```bash
git clone <repository-url>
cd outbreak-orchestration
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

- Configure environment variables:

```bash
# Create .env file
echo "FIRECRAWL_API_KEY=your_api_key_here" > .env
```

- Set up ARGO credentials in `scripts/ARGO.py`

Run the full pipeline:

```bash
python outbreak_analysis_orchestrator.py
```

This runs all 5 agents in sequence, generating a complete outbreak validation report.
```bash
# Stage 1: Identify outbreaks
python outbreak_flagger_argo.py

# Stage 2: Generate alternative hypotheses
python devils_advocate_analyzer.py

# Stage 3: Plan data collection
python data_gatherer_agent.py

# Stage 4: Collect validation data
python firecrawl_validation_agent.py

# Stage 5: Validate hypotheses
python hypothesis_validation_agent.py
```

Most agents accept custom input paths:
```bash
# Use custom outbreak report
python devils_advocate_analyzer.py custom_report.md

# Use custom analysis for data gathering
python data_gatherer_agent.py custom_analysis.md

# Use custom plan for validation
python firecrawl_validation_agent.py custom_plan.json
```

| File | Description |
|---|---|
| `potential_outbreaks.md` | Initial outbreak analysis report |
| `devils_advocate_analysis.md` | Alternative hypotheses and validation tasks |
| `data_gathering_plan.json` | Structured data collection plan |
| `data_gathering_plan.md` | Human-readable collection plan |
| `validation_results.json` | Firecrawl execution results |
| `validation_summary.md` | Data collection summary |
| `outbreak_data/*.json` | All scraped content |
| `final_outbreak_validation_report.md` | Final evidence-based assessment |
| `validation_summary.json` | Final validation metadata |
| `pipeline_summary.md` | Orchestration execution summary |
Evidence strength:
- STRONG: Multiple consistent sources, high laboratory confirmation
- MODERATE: Some consistent sources, moderate confirmation
- WEAK: Limited sources, low confirmation
- INSUFFICIENT: Not enough data for assessment

Hypothesis confidence:
- HIGH: Strong evidence, alternative hypotheses refuted
- MEDIUM: Moderate evidence, some alternatives possible
- LOW: Weak evidence, alternatives equally likely

Risk levels:
- CRITICAL: Immediate action required
- HIGH: Urgent attention needed
- MODERATE: Enhanced monitoring recommended
- LOW: Standard surveillance sufficient
- MINIMAL: No immediate concern
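The evidence-strength tiers above could be operationalized with simple thresholds. The cutoffs in this sketch are illustrative assumptions, not the system's actual scoring rules:

```python
def evidence_strength(num_sources: int, lab_confirmed: bool) -> str:
    """Map source count and lab confirmation to the report's evidence tiers.

    Thresholds are illustrative; the real agent scores evidence via the LLM.
    """
    if num_sources == 0:
        return "INSUFFICIENT"
    if num_sources >= 3 and lab_confirmed:
        return "STRONG"
    if num_sources >= 2:
        return "MODERATE"
    return "WEAK"
```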
```markdown
# Final Outbreak Validation Report

## Executive Summary
High-level findings and risk assessment

## Validation Results by Outbreak
### [Outbreak Name]
- Evidence Assessment
- Original Hypotheses Validation
- Alternative Hypotheses Validation
- Data Quality Assessment
- Final Assessment

## Comparative Analysis
- Confirmed vs. Alternative Explanations
- Data Gaps and Uncertainties

## Final Recommendations
- Immediate Actions
- Resource Allocation
- Surveillance Enhancement
- Further Investigation Needs
```

Edit `outbreak_analysis_orchestrator.py`:

```python
# Default: 5 minutes per agent
timeout=300  # seconds
```

Edit `firecrawl_validation_agent.py`:

```python
# Search settings
num_results = 5  # Results per search
max_depth = 3    # Crawl depth for authoritative sources

# Rate limiting
time.sleep(2)  # Delay between searches
time.sleep(3)  # Delay between crawls
```

Each agent can use a different model:

```python
self.argo = ArgoWrapper(model="gpt4o")  # or "claude-3", etc.
```

- Firecrawl timeout: Increase the timeout or reduce the number of searches
- ARGO rate limits: Add delays between agent calls
- Memory issues: Process smaller batches of data
- Missing dependencies: Run `pip install -r requirements.txt`
Enable verbose logging:

```python
# In any agent file
DEBUG = True  # Add at top of file
```

- Catalog Data → Outbreak identification
- Outbreak Report → Alternative hypothesis generation
- Hypotheses → Data collection planning
- Collection Plan → Web scraping execution
- Scraped Data → Hypothesis validation
- Validation → Final risk assessment
- Regular Updates: Keep outbreak catalog current
- Source Verification: Prioritize authoritative sources
- Hypothesis Testing: Test both original and alternative explanations
- Evidence Documentation: Save all scraped data for audit
- Confidence Transparency: Always report uncertainty levels
- Create an agent file following the pattern:

```python
class YourAgent:
    def __init__(self, input_path):
        self.input_path = input_path
        self.argo = ArgoWrapper(model="gpt4o")

    def run(self):
        # Agent logic
        pass
```

- Add it to the orchestrator pipeline in `outbreak_analysis_orchestrator.py`
- Enhance prompts in agent files
- Add validation logic
- Improve error handling
- Optimize API usage
[Your License Here]
For issues or questions:
- Create an issue in the repository
- Contact: [your-email@example.com]
- ARGO LLM for analysis capabilities
- Firecrawl for web scraping
- Public health organizations for data sources
Version: 1.0.0
Last Updated: November 2025
Status: Production Ready