---
marp: true
theme: vibeminds
paginate: true
style: |
  /* Mermaid diagram styling */
  .mermaid-container { display: flex; justify-content: center; align-items: center; width: 100%; margin: 0.5em 0; }
  .mermaid { text-align: center; }
  .mermaid svg { max-height: 280px; width: auto; }
  .mermaid .node rect, .mermaid .node polygon { rx: 5px; ry: 5px; }
  .mermaid .nodeLabel { padding: 0 10px; }
  /* Two-column layout */
  .columns { display: flex; gap: 40px; align-items: flex-start; }
  .column-left { flex: 1; }
  .column-right { flex: 1; }
  .column-left .mermaid svg { min-height: 400px; height: auto; max-height: 500px; }
  /* Section divider slides */
  section.section-divider { display: flex; flex-direction: column; justify-content: center; align-items: center; text-align: center; background: linear-gradient(135deg, #1a1a3e 0%, #4a3f8a 50%, #2d2d5a 100%); }
  section.section-divider h1 { font-size: 3.5em; margin-bottom: 0.2em; }
  section.section-divider h2 { font-size: 1.5em; color: #b39ddb; font-weight: 400; }
  section.section-divider p { font-size: 1.1em; color: #9575cd; margin-top: 1em; }
---
A Production-Ready Implementation
Built with Google ADK, Eino, and Multi-LLM Support
Understanding the challenge of verified statistics
Challenge: Finding verified, numerical statistics from reputable web sources
Pain Points
- ❌ LLMs hallucinate statistics and sources
- ❌ URLs from LLM memory are often outdated or wrong
- ❌ No verification that excerpts actually exist
- ❌ Hard to distinguish reputable vs unreliable sources
Goal: Build a system that provides provably accurate statistics
- ✅ Search web for statistics on any topic
- ✅ Extract numerical values with context
- ✅ Verify excerpts exist in source documents
- ✅ Validate numerical accuracy
- ✅ Prioritize reputable sources (.gov, .edu, research orgs)
- ✅ 60-90% verification rate (vs 0% for direct LLM)
- ✅ Response time: under 60 seconds
- ✅ Support multiple LLM providers
- ✅ Containerized deployment
Four specialized agents working together
```mermaid
graph TD
    B["Orchestrator<br/>⚙️ Graph<br/>:8000 | :9000"]
    M["AI Assistant"] -.->|optional| MCP["MCP Server<br/>🔌 Protocol"]
    MCP -.-> B
    B -->|HTTP or A2A| C["Research<br/>⚡ Tool<br/>:8001 | :9001"]
    B -->|HTTP or A2A| D["Synthesis<br/>🧠 LLM<br/>:8004 | :9004"]
    B -->|HTTP or A2A| E["Verification<br/>🧠 LLM<br/>:8002 | :9002"]
    C --> F["URLs"]
    D --> G["Statistics"]
    E --> H["Verified"]
    classDef agent fill:#00bfa5,stroke:#00897b,color:#fff
    class B,C,D,E agent
```
4 Specialized Agents with dual protocol support:
- Research - Tool-based (Search API) - HTTP :8001 | A2A :9001
- Synthesis - LLM-based extraction - HTTP :8004 | A2A :9004
- Verification - LLM-based validation - HTTP :8002 | A2A :9002
- Orchestration - Graph-based workflow - HTTP :8000 | A2A :9000
Graph-Based Orchestration (not inter-agent communication)
What This Means
- ✅ Hub-and-spoke: Orchestrator coordinates all communication
- ✅ Sequential pipeline: Predictable execution order
- ✅ Easy to debug: Clear data flow, reproducible behavior
- ❌ No peer-to-peer: Agents don't message each other directly
- ❌ No negotiation: No agent-to-agent collaboration protocols
Trade-off: Predictability over flexibility (right choice for production)
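The hub-and-spoke model can be sketched in a few lines of Go. The type and function names below are illustrative (the real project defines its own models and agent clients); the point is that the orchestrator alone wires research → synthesis → verification, and no agent ever calls another.

```go
package main

import "fmt"

// Candidate is a hypothetical stand-in for the project's pkg/models types.
type Candidate struct {
	Name     string
	Verified bool
}

// Each stage is invoked only by the orchestrator (hub-and-spoke, not peer-to-peer).
func research(topic string) []string { return []string{"https://example.gov/report"} }

func synthesize(urls []string) []Candidate {
	return []Candidate{{Name: "sample stat"}}
}

func verify(cands []Candidate) []Candidate {
	for i := range cands {
		cands[i].Verified = true // placeholder: the real agent re-fetches the source
	}
	return cands
}

// Orchestrate runs the fixed, sequential pipeline - predictable by construction.
func Orchestrate(topic string) []Candidate {
	return verify(synthesize(research(topic)))
}

func main() {
	fmt.Println(Orchestrate("AI trends"))
}
```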
Every agent exposes both protocols simultaneously
| Protocol | Ports | Purpose |
|---|---|---|
| HTTP | 800x | Custom security (SPIFFE, KYA, XAA), observability |
| A2A | 900x | Standard agent interoperability (Google protocol) |
A2A Endpoints per Agent
- `GET /.well-known/agent-card.json` - Agent discovery
- `POST /invoke` - JSON-RPC execution
Why Both?
- ✅ A2A: Standard protocol, agent discovery, interoperability
- ✅ HTTP: Flexibility for security layers, LLM observability
- ✅ Compare: Evaluate implementation complexity side-by-side
Configuration: A2A_ENABLED=true activates A2A servers
| Agent | Type | Technology | Why? |
|---|---|---|---|
| Orchestrator | ⚙️ Graph | Eino workflow | Deterministic, predictable |
| Research | ⚡ Tool | Serper/SerpAPI | No reasoning needed |
| Synthesis | 🧠 LLM | Gemini/Claude/etc | Language understanding |
| Verification | 🧠 LLM | Gemini/Claude/etc | Fuzzy text matching |
Key Insight: Use the right tool for each job
- ❌ Don't force everything through an LLM
- ✅ Graph for coordination (fast, predictable)
- ✅ Tool for API calls (simple, reliable)
- ✅ LLM for language tasks (intelligent, flexible)
Original Question: ADK is for inter-agent communication. Do we need it?
Answer: Yes! A2A protocol support requires ADK.
| Agent | ADK Role | A2A Benefit |
|---|---|---|
| Synthesis | LLM + A2A server | Standard invocation |
| Verification | LLM + A2A server | Agent discovery |
| Research | Tool wrapper + A2A | Interoperability |
| Orchestrator | Eino wrapped in ADK | A2A compatibility |
What ADK Provides for A2A
- `adka2a.NewExecutor()` - Bridges ADK to A2A protocol
- `adka2a.BuildAgentSkills()` - Generates agent card skills
- `remoteagent.NewA2A()` - A2A client for calling remote agents
Verdict: ADK is the right choice. A2A support justifies the framework.
Responsibility: Find relevant web sources
Implementation (Google ADK)
- No LLM required (pure search)
- Integrates with Serper/SerpAPI via `omniserp` library
- Filters for reputable domains
- Returns 30 URLs by default
Key Decision: Separate search from extraction
- Allows caching of search results
- Different providers don't need LLM changes
- Faster iteration on search queries
Port: 8001
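Reputable-domain filtering can be a pure function over the URL host. A sketch assuming a small suffix allowlist; the project's actual rules (and its full list of research-org domains) may differ.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// IsReputable applies a simple allowlist heuristic over the host suffix.
// The suffix list here is illustrative, not the project's actual list.
func IsReputable(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil || u.Host == "" {
		return false
	}
	host := strings.ToLower(u.Host)
	for _, suffix := range []string{".gov", ".edu", ".who.int", ".oecd.org"} {
		if strings.HasSuffix(host, suffix) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(IsReputable("https://www.cdc.gov/data")) // true
}
```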
Responsibility: Extract statistics from web pages
Implementation (Google ADK)
- Fetches webpage content (30K chars per page)
- LLM analyzes text for numerical statistics
- Extracts verbatim excerpts
- Processes 15+ pages for comprehensive coverage
- Returns candidates with metadata
Key Challenge: Getting complete extraction
- ❌ Initial: Only returned 5-8 statistics
- ✅ Solution: Increased pages (5→15), content (15K→30K), multiplier (2x→5x)
Port: 8004
Problem: Low statistical yield (5-8 stats vs ChatGPT's 20+)
Root Cause Analysis
- Too few pages processed (only 5)
- Too little content per page (15K chars)
- Too conservative multiplier (2x)
Solution - Aggressive extraction:
```go
minPagesToProcess := 15 // increased from 5
maxContentLen := 30000  // increased from 15K
multiplier := 5         // increased from 2x
```

Result: Now matches ChatGPT.com performance!
Responsibility: Validate statistics against sources
Implementation (Google ADK)
- Re-fetches source URLs
- Checks excerpts exist verbatim
- Validates numerical values match exactly
- Uses light LLM assistance for fuzzy matching
- Returns pass/fail with detailed reasons
Key Decision: Always fetch original source
- No trusting LLM claims
- Catches hallucinations
- Verifies pages haven't changed
Port: 8002
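The verbatim check reduces to whitespace-normalized substring matching before any LLM is involved. Below is a sketch of that first, cheap pass; `ExcerptExists` is an assumed name, and the real agent falls back to light LLM assistance for fuzzier matches.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize collapses runs of whitespace so line-wrapped page text still matches.
func normalize(s string) string {
	return strings.Join(strings.Fields(s), " ")
}

// ExcerptExists reports whether the claimed excerpt appears verbatim
// (modulo whitespace) in the fetched page content.
func ExcerptExists(page, excerpt string) bool {
	return strings.Contains(normalize(page), normalize(excerpt))
}

func main() {
	page := "Global surface temperature has increased\nby approximately 1.1C since pre-industrial\ntimes."
	fmt.Println(ExcerptExists(page, "increased by approximately 1.1C")) // true
}
```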
Two Implementations Available

LLM-based orchestrator
- Uses LLM to decide workflow steps
- Adaptive retry logic
- More flexible but slower

Eino graph orchestrator
- Type-safe graph-based workflow
- Predictable, reproducible behavior
- Faster and lower cost
- No LLM for orchestration decisions

Both run on Port 8000 (choose one)
```mermaid
graph LR
    B["Research<br/>30 URLs"] --> C["Synthesis<br/>15+ pages -> candidates"]
    C --> D["Verification<br/>validate each"]
    D --> E["QualityCheck<br/>>= min verified?"]
    E --> F["FormatOutput"] --> G["User"]
```
Why Eino?
- Type-safe operations
- No non-deterministic LLM decisions
- Easier to debug and test
- Production-ready reliability
Benefits:
- Predictable execution path
- No hidden LLM costs for orchestration
- Easy to trace and monitor
- Reproducible results
From LLM hallucinations to multi-provider support
Initial Idea: Let LLM answer from memory (like ChatGPT)
Implementation
```shell
./stats-agent search "AI trends" --direct
```

The Problem
- LLM returns statistics from training data (up to Jan 2025)
- URLs are guessed - not from real search
- Pages have moved, changed, or are paywalled
- 0% verification rate when validated
The Lesson: Real-time web search is essential for statistics
Same Query: "AI trends"
| System | Statistics Found | Verification Rate | Why? |
|---|---|---|---|
| ChatGPT.com | 20+ | ✅ 90%+ | Real-time Bing search |
| Direct Mode | 10 | ❌ 0% | LLM memory (outdated URLs) |
| Pipeline Mode | 15-25 | ✅ 60-90% | Real-time Google search |
Key Insight: ChatGPT.com's success comes from web search, not just LLM quality!
Our Solution: Pipeline mode with Serper/SerpAPI
What We Changed
- Made Pipeline mode the default
- Added warnings to Direct mode docs
- Implemented hybrid mode (Direct + Verification)
README Warning
⚠️ Direct Mode - Not Recommended for Statistics
- ❌ Uses LLM memory (training data)
- ❌ Outdated URLs
- ❌ 0% verification rate
- ✅ For statistics, use Pipeline mode instead

Result: Clear expectations, better user experience
Requirement: Support multiple LLM providers
Supported Providers
- Google Gemini (default) - `gemini-2.5-flash` / `gemini-2.5-pro`
- Anthropic Claude - `claude-sonnet-4-20250514` / `claude-opus-4-1-20250805`
- OpenAI - `gpt-4o` / `gpt-5`
- xAI Grok - `grok-4-1-fast-reasoning` / `grok-4-1-fast-non-reasoning`
- Ollama - `llama3:8b` / `mistral:7b` (local)
Challenge: Each provider has different APIs, models, rate limits
Solution: Abstraction via omnillm library
Factory Pattern in pkg/llm/factory.go:
```go
func CreateLLM(cfg *config.Config) (*genai.Client, string, error) {
	switch cfg.LLMProvider {
	case "gemini":
		return createGeminiClient(cfg)
	case "claude":
		return createClaudeClient(cfg)
	case "openai":
		return createOpenAIClient(cfg)
	case "xai":
		return createXAIClient(cfg)
	case "ollama":
		return createOllamaClient(cfg)
	default:
		return nil, "", fmt.Errorf("unsupported provider: %s", cfg.LLMProvider)
	}
}
```

Benefit: Agents are provider-agnostic
Simple Environment Variables
```shell
# Use Gemini (default)
export GOOGLE_API_KEY="your-key"

# Switch to Claude
export LLM_PROVIDER="claude"
export ANTHROPIC_API_KEY="your-key"

# Switch to local Ollama
export LLM_PROVIDER="ollama"
export OLLAMA_URL="http://localhost:11434"
export LLM_MODEL="llama3:8b"
```

No code changes required!
Requirement: Support multiple search providers
Options
- Serper API - $50/month, 5K queries (recommended)
- SerpAPI - Alternative with different pricing
- Mock - For development without API keys
Challenge: Different APIs, different response formats
Solution: omniserp library abstraction
```go
// Unified interface - works with any provider
result, err := searchClient.SearchNormalized(ctx, params)
```

Initial Design: Client-side LLM (❌ Bad)

```shell
# Client needs API key!
export GOOGLE_API_KEY="key"
./stats-agent search "topic" --direct
```

Problem
- Clients need API keys (security risk)
- Hard to update prompts
- No centralized rate limiting
Solution: Server-side Direct Agent (✅ Good)
- Direct Agent server on port 8005
- Client makes HTTP requests
- Server holds API keys
- Centralized control
Built with Huma v2 + Chi router
- OpenAPI 3.1 automatic generation
- Interactive Swagger UI at `/docs`
- Type-safe request/response handling
- Proper HTTP timeouts
Example
```go
type DirectSearchInput struct {
	Body struct {
		Topic    string `json:"topic" minLength:"1"`
		MinStats int    `json:"min_stats" minimum:"1"`
	}
}

huma.Register(api, operation, handler)
```

Port 8005 - Production-ready with docs!
The Bug
```json
{
  "value": 2,537  // ❌ Invalid JSON!
}
```

Root Cause: LLM formats numbers like humans (2,537)
The Fix - Explicit prompt instructions:
```
CRITICAL: The "value" field must be a plain number
with NO commas (e.g., 2537 not 2,537)

REMEMBER: Numbers like 75,000 should be written
as 75000 (no comma).
```
Result: Valid JSON every time! ✅
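Prompt instructions sharply reduce this failure but cannot guarantee it, so a defensive post-processing pass is a cheap belt-and-braces addition. This sketch (not part of the original fix described above) strips thousands separators from the `value` field before JSON parsing:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Matches the "value" field followed by a comma-grouped number like 2,537.
var thousandsInValue = regexp.MustCompile(`("value":\s*)(\d{1,3}(,\d{3})+)`)

// SanitizeValueField removes commas from the numeric "value" field so the
// payload parses as JSON even when the LLM ignores the prompt instruction.
func SanitizeValueField(raw string) string {
	return thousandsInValue.ReplaceAllStringFunc(raw, func(m string) string {
		return strings.ReplaceAll(m, ",", "")
	})
}

func main() {
	fmt.Println(SanitizeValueField(`{"value": 2,537}`)) // {"value": 2537}
}
```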
Problem: LLM returns 1-2 statistics, stops
Bad Prompt

```
Find statistics about climate change.
```

Good Prompt

```
Extract EVERY statistic you find, not just one or two. Be thorough and comprehensive.
If the page contains 10 statistics, return 10 items in the array.
Return empty array [] ONLY if absolutely no statistics are found.
```
Impact: 2-3x more statistics extracted per page
From local development to production
Two Deployment Methods
```shell
make run-all-eino                 # Start all 4 agents (HTTP + A2A)
./bin/stats-agent search "topic"
```

```shell
docker-compose up -d              # All agents containerized
curl -X POST http://localhost:8000/orchestrate   # HTTP
# or via A2A: POST http://localhost:9000/invoke
```

Same code, same config - seamless transition!
| Agent | HTTP | A2A |
|---|---|---|
| Orchestrator | :8000 | :9000 |
| Research | :8001 | :9001 |
| Verification | :8002 | :9002 |
| Synthesis | :8004 | :9004 |
Model Context Protocol support for AI tool integration
Use Case: Claude Code can search for verified statistics
```json
{
  "mcpServers": {
    "stats-agent": {
      "command": "go",
      "args": ["run", "mcp/server/main.go"]
    }
  }
}
```

Tools Available

- `search_statistics` - Full pipeline search
- `verify_statistic` - Single verification
Integration: Works with Claude Code, other MCP clients
| Metric | Direct Mode | Pipeline Mode |
|---|---|---|
| Verification Rate | ❌ 0-30% | ✅ 60-90% |
| Response Time | ⚡ 5-10s | ⚡ 30-60s |
| URLs Searched | 0 (LLM memory) | 30 (real search) |
| Pages Processed | 0 | 15+ |
| Cost per Query | Low | Medium |
| Accuracy | ❌ Low | ✅ High |
Sweet Spot: Pipeline mode for statistics, Direct for general Q&A
Query: "climate change statistics"
Result
```json
{
  "name": "Global temperature increase",
  "value": 1.1,
  "unit": "C",
  "source": "IPCC Sixth Assessment Report",
  "source_url": "https://www.ipcc.ch/...",
  "excerpt": "Global surface temperature has increased by approximately 1.1C since pre-industrial times...",
  "verified": true
}
```

Verification: Excerpt found verbatim in source! ✅
Language & Runtime
- Go 1.21+ - Concurrency, performance, simple deployment
Agent Frameworks
- Google ADK - LLM agents + A2A protocol support
- Eino - Deterministic graph orchestration
- A2A Protocol - Agent-to-agent interoperability (Google)
API & Protocols
- HTTP - Custom security, observability (ports 800x)
- A2A/JSON-RPC - Standard agent invocation (ports 900x)
- Huma v2 - OpenAPI 3.1 generation
Integrations
- omnillm - Multi-provider LLM abstraction
- omniobserve - Unified LLM observability (Opik, Langfuse, Phoenix)
- omniserp - Unified search API
- Real-time search > LLM memory for current data
- 0% vs 60-90% verification rate
- Verification is non-negotiable for accuracy
- Always fetch and validate sources
- Separation of concerns enables optimization
- Search, extract, verify are independent
- Prompt engineering matters at scale
- Explicit completeness instructions needed
- Flexibility enables adoption
- Multi-LLM, multi-search provider support
Current Limitations
- ❌ Paywalled content inaccessible
- ❌ Non-English sources need translation
- ⚠️ Range statistics (e.g., "79-96%") need schema updates
Future Enhancements
- ✨ Add `value_max` field for ranges
- ✨ Perplexity API integration (built-in search)
- ✨ Caching layer for search results
- ✨ Streaming responses for faster perceived performance
- ✨ Multi-language support
Running the system in production
```shell
./stats-agent search "renewable energy" --min-stats 10
```

What Happens
- Orchestrator validates input
- Research searches 30 URLs via Serper
- Synthesis processes 15+ pages (450K+ chars total)
- Synthesis extracts 50+ candidate statistics
- Verification validates each candidate
- Verification returns 12 verified (60% rate)
- Orchestrator checks: 12 ≥ 10 ✅
- User receives JSON output
Total time: ~45 seconds
Structured Logging at each stage:
```
Research Agent: Found 30 search results
Synthesis Agent: Extracted 8 statistics from nature.com
Synthesis Agent: Total candidates: 52 from 15 pages
Verification Agent: Verified 10/15 candidates (67%)
Orchestration: Target met (10 verified)
```
Health Checks
- `/health` endpoint on each agent
- Docker health checks in production
- Timeout monitoring (60s max)
LLM Observability (via OmniObserve)
- Automatic tracing of all LLM calls
- Token usage and cost tracking
- Supports: Comet Opik, Langfuse, Arize Phoenix
Metrics to Track
- Verification rate per query
- Average response time
- Cost per query (API calls)
Simple Commands
```shell
# Install dependencies
make install

# Build all agents
make build

# Run everything (Eino orchestrator)
make run-all-eino

# Run direct + verification only
make run-direct-verify

# Run tests
make test
```

Clean Abstractions: Agents don't know about each other's internals
Easy Debugging: Run individual agents in separate terminals
Environment-Based
```shell
# .env file
LLM_PROVIDER=gemini
GOOGLE_API_KEY=your-key
SEARCH_PROVIDER=serper
SERPER_API_KEY=your-key
```

Override per Agent

```shell
# Use different LLM for synthesis
export SYNTHESIS_LLM_PROVIDER=claude
export SYNTHESIS_LLM_MODEL=claude-sonnet-4-20250514
```

Docker-Friendly: All config via environment variables
| Feature | Direct | Hybrid | Pipeline |
|---|---|---|---|
| Speed | ⚡⚡⚡ 5s | ⚡⚡ 15s | ⚡ 45s |
| Accuracy | ❌ Low | ✅ High | ✅ High |
| Verification | ❌ No | ✅ Real URLs | ✅ Real URLs |
| Cost | $ | $$ | $$$ |
| Use Case | Brainstorm | Quick check | Production |
| Agents Needed | 1 | 2 | 4 |
Recommendation: Pipeline mode for statistics that matter
- Unit Tests
  - Individual function validation
  - LLM provider factory
  - JSON parsing edge cases
- Integration Tests
  - Agent-to-agent communication
  - HTTP endpoint validation
  - Error handling flows
- End-to-End Tests
  - Complete pipeline execution
  - Verification rate validation
  - Performance benchmarks
- Manual Testing
  - Known statistics verification
  - Multi-provider compatibility
  - Edge case exploration
Graceful Degradation
```go
// If the source is unreachable, mark the candidate as failed
if err := fetchURL(url); err != nil {
	return VerificationResult{
		Verified: false,
		Reason:   "Source unreachable",
	}
}
```

Retry Logic
- HTTP retries with exponential backoff
- Automatic quality check retries
- Human-in-the-loop for partial results
User-Friendly Messages
- "Found 8 of 10 requested, continue? (y/n)"
- Clear error messages with remediation steps
API Key Management
- Environment variables only (never in code)
- Server-side storage (clients don't need keys)
- Per-agent key rotation possible
Input Validation
- Topic length limits (500 chars)
- Min/max stats bounds (1-100)
- URL validation before fetching
Timeouts
- HTTP request timeouts (30-60s)
- LLM generation timeouts
- Overall query timeout (120s)
Future: Add rate limiting, authentication
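The bounds above translate to a simple validation function; `ValidateQuery` is an assumed name and the error wording is illustrative.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ValidateQuery enforces the input bounds listed above:
// topic length 1-500 chars, min_stats in [1, 100].
func ValidateQuery(topic string, minStats int) error {
	topic = strings.TrimSpace(topic)
	if topic == "" || len(topic) > 500 {
		return errors.New("topic must be 1-500 characters")
	}
	if minStats < 1 || minStats > 100 {
		return errors.New("min_stats must be between 1 and 100")
	}
	return nil
}

func main() {
	fmt.Println(ValidateQuery("climate change", 10)) // <nil>
}
```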
- Research Agent
  - Parallel URL searches where supported
  - Connection pooling for HTTP clients
- Synthesis Agent
  - Parallel page fetching (up to 5 concurrent)
  - Content truncation (30K chars max)
  - Efficient JSON parsing
- Verification Agent
  - Batch verification where possible
  - Early exit on clear failures
  - LLM only for fuzzy matching
- Overall
  - 45-second average for 10 verified statistics
  - Scales linearly with min_stats target
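The "up to 5 concurrent" fetch limit is the classic counting-semaphore pattern. A sketch with a stubbed fetch function standing in for the real HTTP call:

```go
package main

import (
	"fmt"
	"sync"
)

// FetchAll processes urls with at most maxConcurrent goroutines in flight,
// preserving input order in the results slice.
func FetchAll(urls []string, maxConcurrent int, fetch func(string) string) []string {
	results := make([]string, len(urls))
	sem := make(chan struct{}, maxConcurrent) // counting semaphore caps concurrency
	var wg sync.WaitGroup
	for i, u := range urls {
		wg.Add(1)
		sem <- struct{}{} // acquire a slot (blocks when maxConcurrent in flight)
		go func(i int, u string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			results[i] = fetch(u)    // stand-in for the real page fetch
		}(i, u)
	}
	wg.Wait()
	return results
}

func main() {
	out := FetchAll([]string{"a", "b", "c"}, 2, func(u string) string { return "page:" + u })
	fmt.Println(out) // [page:a page:b page:c]
}
```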
```
agents/                  # Each agent is independent
├── research/
├── synthesis/
├── verification/
├── direct/
└── orchestration-eino/

pkg/                     # Shared libraries
├── config/              # Centralized configuration
├── llm/                 # Multi-provider factory
├── models/              # Shared data structures
├── search/              # Search abstraction
└── direct/              # Direct search service

main.go                  # CLI entry point
```
Principle: High cohesion, low coupling
README.md
- Comprehensive setup instructions
- Clear mode comparisons
- Warning callouts for limitations
Code Documentation
- Inline comments for complex logic
- Function documentation (godoc format)
- Architecture decision records (ADRs)
API Documentation
- OpenAPI 3.1 specification (Huma)
- Interactive Swagger UI at `/docs`
- Example requests/responses
Presentation: Architecture overview (this!)
Easy to Extend
- Add new LLM provider: Implement `omnillm` interface
- Add new search provider: Implement `omniserp` interface
- Add new agent: Follow existing patterns
- Add new verification rules: Extend verification agent
Contribution Areas
- 🔧 New LLM providers (e.g., Perplexity)
- 🌍 Multi-language support
- 📊 Range statistics (`value_max`)
- ⚡ Performance optimizations
- 📚 Documentation improvements
License: MIT (permissive)
Costs, scaling, and enterprise considerations
- Use Case 1: Research Reports
  - Pipeline mode with `--reputable-only`
  - Export to JSON for analysis
  - Cite sources with URLs
- Use Case 2: Data Analysis
  - Bulk queries via API
  - Process results in pandas/R
  - Visualization of trends
- Use Case 3: AI Assistant Integration
  - MCP server with Claude Code
  - LLM asks stats-agent for verified data
  - Compose into reports
- Use Case 4: Quick Fact-Checking
  - Direct mode for fast lookup
  - Accept unverified for speed
Per Query Costs (estimates):
| Component | Direct | Hybrid | Pipeline |
|---|---|---|---|
| Search API | $0.00 | $0.00 | $0.02 |
| LLM Calls | $0.01 | $0.03 | $0.08 |
| Total | $0.01 | $0.03 | $0.10 |
Cost Drivers
- Number of pages processed (15+)
- LLM provider choice (Gemini < Claude < GPT-4o/GPT-5)
- Verification attempts
Optimization: Use Gemini 2.5 Flash (fast + cheap)
Horizontal Scaling
- Each agent scales independently
- Load balancer per agent type
- Stateless design enables easy scaling
Vertical Scaling
- Increase concurrency limits
- Larger content chunks (current: 30K)
- More parallel page fetching
Optimizations for Scale
- Cache search results (1 hour TTL)
- Queue-based processing for bulk queries
- Database for results persistence
Example: 10 orchestrators + 20 synthesis agents
- Metrics to Collect
  - Verification rate by source domain
  - Response time percentiles (p50, p95, p99)
  - Error rate by agent
  - API cost per query
  - Throughput (queries/minute)
- Alerting
  - Verification rate < 50% (alert)
  - Response time > 120s (alert)
  - Agent health check failures
  - API quota exhaustion
- Tools
  - OmniObserve for LLM tracing (Opik, Langfuse, Phoenix) ✅
  - Prometheus for metrics (future)
  - Grafana for dashboards (future)
  - Jaeger for distributed tracing (future)
Responsible Web Scraping
- Respect `robots.txt`
- Rate limiting on URL fetches
- User-Agent identification
- No aggressive crawling
Data Privacy
- No PII collection
- No user query logging (optional)
- API keys stored securely
- GDPR compliance considerations
Source Attribution
- Always cite original sources
- Provide full URLs
- Verbatim excerpts (fair use)
Ethics: Promote verified information, combat misinformation
| System | Search | Verify | Multi-LLM | Open Source |
|---|---|---|---|---|
| ChatGPT.com | ✅ Bing | ❌ No | ❌ GPT only | ❌ Closed |
| Perplexity | ✅ Multiple | ❌ Limited | ❌ Limited | ❌ Closed |
| Our System | ✅ Google | ✅ Strong | ✅ 5+ | ✅ MIT |
| Direct LLM | ❌ Memory | ❌ None | ✅ Any | N/A |
Key Differentiator: Rigorous verification + flexibility
Open Source: Community can audit, extend, trust
From Direct LLM Usage
```go
// Before: Client-side LLM
resp, err := llmClient.Generate(ctx, "Find climate statistics")

// After: Stats Agent Direct mode
body := DirectSearchRequest{Topic: "climate change", MinStats: 10}
resp, err := http.Post("http://localhost:8005/search", "application/json", body)
```

From Other APIs

```go
// Before: Direct LLM call (no verification)
stats, err := getLLMStats(ctx, "climate statistics")

// After: Stats Agent Pipeline (verified)
body := OrchestrationRequest{Topic: "climate change", MinVerifiedStats: 10}
resp, err := http.Post("http://localhost:8000/orchestrate", "application/json", body)
```

Q1 2026
- ✨ Perplexity API integration (built-in search)
- ✨ Range statistics (`value_max` field)
- ✨ Response streaming for faster UX
Q2 2026
- ✨ Multi-language support (ES, FR, DE, ZH)
- ✨ Caching layer for search results
- ✨ GraphQL API option
Q3 2026
- ✨ Browser extension for fact-checking
- ✨ Notion/Confluence integrations
- ✨ Advanced citation formats (APA, MLA)
Community Driven: Submit feature requests on GitHub!
Summary, resources, and next steps
Development Approach
- Agent-based architecture enables parallel work
- Clear interfaces between components
- Code reviews for quality
- Continuous integration (GitHub Actions)
Best Practices
- Branch protection on main
- Required passing tests for merge
- Semantic versioning
- Changelog maintenance
Communication
- Architecture decisions documented
- Weekly sync meetings
- GitHub issues for tracking
- Technical
  - Real-time data > LLM memory for facts
  - Verification is essential, not optional
  - Modular architecture enables optimization
  - Prompt engineering is critical at scale
- Process
  - Clear requirements prevent scope creep
  - Early testing reveals issues sooner
  - Documentation enables adoption
  - User feedback drives priorities
- Product
  - Be honest about limitations (builds trust)
  - Provide flexibility (multi-LLM, multi-search)
  - Developer experience matters
What We Built
- Production-ready statistics verification system
- 60-90% verification rate (vs 0% for LLM alone)
- Multi-agent architecture with clear separation
- Flexible (multi-LLM, multi-search)
- Open source (MIT license)
Key Success Factors
- Real-time web search for current data
- Rigorous verification against sources
- Modular, extensible design
- Comprehensive testing & documentation
Impact: Enables verified statistics for research, reporting, analysis
Repository: github.com/agentplexus/stats-agent-team
Quick Start
```shell
git clone https://github.com/agentplexus/stats-agent-team
cd stats-agent-team
make install
make build
make run-all-eino
```

Contribute
- 🐛 Report bugs
- 💡 Suggest features
- 📝 Improve docs
- 🔧 Submit PRs
License: MIT (permissive, commercial-friendly)
Contact & Resources
- 📧 GitHub Issues for questions
- 📚 Full documentation in README.md
- 🔗 OpenAPI docs at `localhost:8005/docs`
- 💬 Discussions tab for community chat
Thank You! 🙏
Special Thanks
- Google ADK team
- Eino framework contributors
- Open source LLM providers
- The Go community
Documentation
- `README.md` - Setup & usage guide
- `ROADMAP.md` - Planned features & enhancements
- `4_AGENT_ARCHITECTURE.md` - Architecture deep dive
- `LLM_CONFIGURATION.md` - Multi-LLM setup
- `SEARCH_INTEGRATION.md` - Search provider setup
- `MCP_SERVER.md` - MCP integration guide
- `DOCKER.md` - Container deployment
Example Queries
- Climate change statistics
- AI industry trends
- Healthcare outcomes
- Economic indicators
- Educational metrics
Try it yourself!