OpenAnalytics

AI-powered AEO (Answer Engine Optimization) analytics platform with health checks and visibility analysis.

Architecture

Stage Health (per URL)
     ↓
┌────┴────┐
▼         ▼
[29 checks]   parallel
  │
  ▼
Tiered Score → Grade → Band

Stage Mentions (per company)
     ↓
┌────┴────┐
▼         ▼
[Query Gen] → [AI Test]  parallel queries
  │
  ▼
Visibility Score → Quality

Stages

Stage	Name	AI Calls	Purpose
Health	AEO Health Check	0	29 checks across 4 categories, tiered scoring
Mentions	AI Visibility	1 + N	Generate hyperniche queries, test with Gemini

Data Flow

HealthStageInput: URL → FetchResult → Checks → TieredScore → HealthStageOutput
MentionsStageInput: Company info → Query generation → AI testing → MentionsStageOutput
Pipeline: Can run both stages in parallel

Project Structure

openanalytics/
├── shared/                    # Shared components
│   ├── gemini_client.py       # Unified Gemini client
│   ├── models.py              # Request/Response schemas
│   ├── scoring.py             # Tiered AEO scoring system
│   ├── fetcher.py             # Async HTML/robots.txt fetcher
│   ├── constants.py           # GEMINI_MODEL, AI_CRAWLERS, etc.
│   └── __init__.py            # Package exports
├── stage health/              # Health Check stage
│   ├── stage_health.py        # Main orchestrator
│   ├── health_models.py       # Stage-specific models
│   └── __init__.py            # Package exports
├── stage mentions/            # Mentions Check stage
│   ├── stage_mentions.py      # Main orchestrator
│   ├── mentions_models.py     # Stage-specific models
│   └── __init__.py            # Package exports
├── checks/                    # Individual check modules
│   ├── technical.py           # 16 technical SEO checks
│   ├── structured_data.py     # 6 schema.org checks
│   ├── aeo_crawler.py         # 4 AI crawler access checks
│   └── authority.py           # 3 authority signal checks
├── pipeline/                  # Pipeline orchestration
│   ├── run_pipeline.py        # Main orchestrator + CLI
│   └── __init__.py            # Package exports
├── service/                   # Service layer
│   ├── analytics_service.py   # High-level business logic
│   └── __init__.py            # Package exports
├── api.py                     # FastAPI endpoints
├── requirements.txt           # Python dependencies
└── CLAUDE.md                  # This file

Key Design Decisions

Shared GeminiClient: All stages use shared.gemini_client.GeminiClient for consistency
Tiered Scoring: Score is minimum of Tier 0/1/2 caps and base score
Micro-API Pattern: Each stage is JSON in → JSON out, can run standalone
Parallel Processing: Health and Mentions stages can run in parallel
Service Layer: High-level business logic in service/ for API endpoints

Tiered Scoring System

The AEO funnel:

CAN AI ACCESS? → If blocked, nothing else matters (Tier 0)
CAN AI UNDERSTAND? → Schema.org is essential (Tier 1)
IS CONTENT STRUCTURED? → Technical SEO (Tier 2)
IS IT TRUSTWORTHY? → Authority signals (Tier 3)

Tier	Name	Cap if Failed	Example
0	Critical	10	Blocks all AI crawlers
1	Essential	45-55	No Organization schema
2	Important	70-95	Incomplete schema, thin content
3	Excellence	100	Full optimization

Environment Variables

GEMINI_API_KEY=your-gemini-api-key
SERPER_API_KEY=optional-serper-key-for-search

Usage

API Server

# Start server
python api.py

# Or with uvicorn
uvicorn api:app --host 0.0.0.0 --port 8000

CLI

# Health check only
python pipeline/run_pipeline.py --url https://example.com --health-only

# Mentions check only
python pipeline/run_pipeline.py --company "Example Corp" --industry "SaaS" --mentions-only

# Full analysis
python pipeline/run_pipeline.py --url https://example.com --company "Example Corp"

# With output file
python pipeline/run_pipeline.py --url https://example.com -o results/output.json

API Endpoints

# Health check
curl -X POST http://localhost:8000/health \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

# Mentions check
curl -X POST http://localhost:8000/mentions \
  -H "Content-Type: application/json" \
  -d '{"company_name": "Example Corp", "industry": "SaaS"}'

# Full analysis
curl -X POST http://localhost:8000/analyze \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "company_name": "Example Corp"}'

Dependencies

google-genai>=1.0 - Gemini API client
pydantic>=2.0 - Data validation
httpx>=0.25 - Async HTTP client
python-dotenv>=1.0 - Environment variables
beautifulsoup4>=4.12 - HTML parsing
fastapi>=0.100 - API framework
uvicorn>=0.23 - ASGI server
playwright>=1.40 - JS rendering (optional)

Health Check Categories

Technical SEO (16 checks)

Title tag, meta description, canonical, robots meta
H1/H2 structure, HTTPS, response time, sitemap
Content word count, hreflang, etc.

Structured Data (6 checks)

Organization schema completeness
FAQ/HowTo schemas, sameAs links
Content freshness dates

AI Crawler Access (4 checks)

GPTBot, Claude-Web, PerplexityBot, CCBot
Checks robots.txt for blocking

Authority Signals (3 checks)

About page, contact info, social proof

Mentions Check Query Distribution

70% UNBRANDED: Test real organic discovery
20% COMPETITIVE: Comparison/alternative queries
10% BRANDED: Brand awareness queries

Grade Scale

Grade	Score	Description
A+	90+	Exceptional - passes all tiers with excellence
A	80-89	Excellent - full schema, good optimization
B	65-79	Good - has schema, some gaps
C	45-64	Fair - missing schema or has issues
D	25-44	Poor - major gaps, partial AI access
F	<25	Critical - blocks AI or fundamental issues

Visibility Bands

Band	Score	Color
Excellent	80+	#22c55e (green)
Strong	65-79	#84cc16 (lime)
Moderate	45-64	#eab308 (yellow)
Weak	25-44	#f97316 (orange)
Critical	<25	#ef4444 (red)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

OpenAnalytics

Architecture

Stages

Data Flow

Project Structure

Key Design Decisions

Tiered Scoring System

Environment Variables

Usage

API Server

CLI

API Endpoints

Dependencies

Health Check Categories

Technical SEO (16 checks)

Structured Data (6 checks)

AI Crawler Access (4 checks)

Authority Signals (3 checks)

Mentions Check Query Distribution

Grade Scale

Visibility Bands

FilesExpand file tree

CLAUDE.md

Latest commit

History

CLAUDE.md

File metadata and controls

OpenAnalytics

Architecture

Stages

Data Flow

Project Structure

Key Design Decisions

Tiered Scoring System

Environment Variables

Usage

API Server

CLI

API Endpoints

Dependencies

Health Check Categories

Technical SEO (16 checks)

Structured Data (6 checks)

AI Crawler Access (4 checks)

Authority Signals (3 checks)

Mentions Check Query Distribution

Grade Scale

Visibility Bands