From 78e1fcc733f9b6553242569f9f14f7bfda3d10b2 Mon Sep 17 00:00:00 2001 From: Aria Pramesi Date: Tue, 17 Mar 2026 19:43:00 -0500 Subject: [PATCH] feat(skills): add GEO interview and technical SEO audit skills Two new AIDD skills for AI search visibility optimization: - aidd-geo-interview: Reflexive AI interview measuring share of voice, competitive positioning, sentiment analysis, and citation strategy. Includes fan-out query templates (9 categories, 30+ templates) and 7-dimension GEO scoring methodology. - aidd-technical-seo: 8-step technical SEO audit covering page speed, headers, meta tags, schema markup, AI crawler access (7 bots), fan-out coverage analysis, and content quality with traffic-light scoring. Includes 43-item checklist and AI crawler remediation guide. Both skills are model-agnostic (work with any AI agent), require no code execution or external APIs, and cross-reference each other. Co-Authored-By: Claude Opus 4.6 (1M context) --- ai/skills/aidd-geo-interview/SKILL.md | 158 ++++++++++++ .../aidd-geo-interview/fan-out-queries.md | 143 +++++++++++ ai/skills/aidd-geo-interview/geo-scoring.md | 114 +++++++++ ai/skills/aidd-geo-interview/index.md | 23 ++ ai/skills/aidd-technical-seo/SKILL.md | 234 ++++++++++++++++++ .../aidd-technical-seo/ai-crawler-audit.md | 193 +++++++++++++++ ai/skills/aidd-technical-seo/index.md | 23 ++ ai/skills/aidd-technical-seo/seo-checklist.md | 191 ++++++++++++++ ai/skills/index.md | 8 + 9 files changed, 1087 insertions(+) create mode 100644 ai/skills/aidd-geo-interview/SKILL.md create mode 100644 ai/skills/aidd-geo-interview/fan-out-queries.md create mode 100644 ai/skills/aidd-geo-interview/geo-scoring.md create mode 100644 ai/skills/aidd-geo-interview/index.md create mode 100644 ai/skills/aidd-technical-seo/SKILL.md create mode 100644 ai/skills/aidd-technical-seo/ai-crawler-audit.md create mode 100644 ai/skills/aidd-technical-seo/index.md create mode 100644 ai/skills/aidd-technical-seo/seo-checklist.md diff --git 
a/ai/skills/aidd-geo-interview/SKILL.md b/ai/skills/aidd-geo-interview/SKILL.md new file mode 100644 index 0000000..2dc3d79 --- /dev/null +++ b/ai/skills/aidd-geo-interview/SKILL.md @@ -0,0 +1,158 @@ +--- +name: aidd-geo-interview +description: Interview AI models about GEO (Generative Engine Optimization) visibility for a product or topic. Measures share of voice, competitive positioning, and generates citation strategy recommendations. +--- + +# 🔍 GEO Interview + +Act as a top-tier GEO (Generative Engine Optimization) analyst. Your job is to +interview AI models — including yourself — and measure how they recommend products +in a given category. You produce a structured visibility report with share of voice +scores, competitive positioning, sentiment analysis, and actionable recommendations. + +Competencies { + share of voice analysis, + competitive positioning, + AI citation strategy, + fan-out query generation, + sentiment detection +} + +Constraints { + Do ONE STEP at a time. Get user approval before moving on. + Use reflexive interview: you answer the queries yourself, genuinely. + For multi-model coverage, the user runs this skill with different AI assistants. + Output markdown tables and structured reports. + No external APIs or code execution needed. + Reference fan-out-queries.md for extended query templates. + Reference geo-scoring.md for scoring methodology. +} + +## Process + +### 1. defineContext() => State + +Gather the following from the user (ask if not provided): + +``` +State { + product: string // e.g. "Notion" + category: string // e.g. "project management software" + competitors: string[] // e.g. ["Asana", "Monday.com", "ClickUp"] + audience: string // e.g. "startup founders" + year: string // current year +} +``` + +### 2. generateQueries(State) => queries[] + +Generate 7 queries using these templates: + +| Type | Template | +|------|----------| +| category_discovery | "What is the best {category}? Compare the top options." 
| +| recommendation | "I need a {category}. What do you recommend?" | +| product_awareness | "What is {product}? Is it any good?" | +| product_review | "{product} review — is it worth it?" | +| comparison | "Compare {product} to {competitors[0]}. Which is better?" | +| alternatives | "What are the alternatives to {competitors[0]}?" | +| gap_analysis | "What would {product} need to beat {competitors[0]} and {competitors[1]}?" | + +See `fan-out-queries.md` for 9 extended query categories with 30+ templates. + +### 3. interview(queries) => responses[] + +Answer each query **genuinely and completely** as if a real user asked you. +Do not game the responses. Be honest about what you know. Format: + +``` +For each query: + 📝 Query: {query text} + 💬 Response: {your genuine answer} +``` + +### 4. analyzeResponses(responses) => analysis + +For each response, extract: + +- **Mention detection**: Does the response mention {product}? (yes/no) +- **Position**: If mentioned, what numeric position? Match patterns: "1. product", "#1: product", "1) product" +- **Sentiment**: Classify mentions using signal words (see geo-scoring.md): + - ✅ Positive: recommend, best, leading, excellent, powerful, impressive, top, standout + - ❌ Negative: avoid, lacking, weak, limited, outdated, disappointing, issues, behind + - ➖ Neutral: otherwise +- **Competitor mentions**: Which competitors appear and at what positions? + +Build a **competitive matrix**: + +| Query Type | {product} Position | {competitor1} Position | {competitor2} Position | +|------------|-------------------|----------------------|----------------------| +| category_discovery | #2 | #1 | #4 | +| ... | ... | ... | ... | + +Calculate **Share of Voice** (0–10): +``` +SoV = (mention_count / total_queries) * 10 +``` + +### 5. 
generateReport(analysis) => markdown + +Produce a structured report: + +```markdown +# GEO Visibility Report: {product} + +## Share of Voice: {sov}/10 + +## Mention Rate: {mentions}/{total} queries ({pct}%) + +## Competitive Matrix +{table from step 4} + +## Per-Query Results +{for each query: type, query text, mentioned?, position, sentiment, key excerpt} + +## Sentiment Summary +- Positive signals: {count} ({list}) +- Negative signals: {count} ({list}) +- Neutral: {count} +``` + +### 6. generateRecommendations(analysis) => actions[] + +Apply threshold-based recommendations: + +| Condition | Priority | Action | +|-----------|----------|--------| +| SoV < 3 | 🔴 CRITICAL | Create authoritative structured content with schema markup, FAQ sections, and citation-ready formatting | +| SoV < 5 | 🟡 HIGH | Create comparison pages, listicle content, and "best of" guides targeting fan-out queries | +| Negative sentiment detected | 🟡 HIGH | Publish updated case studies, audit reviews, address specific criticisms with evidence | +| Competitor outperforms on >50% queries | 🟡 HIGH | Create direct comparison content, highlight differentiators, build authority signals | +| Not mentioned in >50% of responses | 🟠 MEDIUM | Publish schema-rich pages targeting unmentioned query types, build topical authority | +| Position > 3 when mentioned | 🟠 MEDIUM | Strengthen authority signals, add structured data, improve E-E-A-T indicators | +| SoV >= 7 | 🟢 MAINTAIN | Refresh content quarterly, monitor for competitor gains, maintain citation-ready formatting | + +Output as a prioritized action list with specific content recommendations. 
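The skill itself needs no code execution, but the Share of Voice formula and the threshold table above can be sketched in a few lines for readers who want to check the arithmetic. This is a minimal illustration under stated assumptions, not part of the patch: the function names `share_of_voice` and `recommend` are hypothetical, the SoV tiers are treated as mutually exclusive (the table does not say whether `SoV < 3` also triggers the `SoV < 5` row), and "Position > 3 when mentioned" is read as "any mentioned position greater than 3".

```python
def share_of_voice(mention_count, total_queries):
    """SoV = (mention_count / total_queries) * 10, on a 0-10 scale."""
    if total_queries <= 0:
        raise ValueError("total_queries must be positive")
    return mention_count / total_queries * 10


def recommend(sov, negative_sentiment=False, positions=()):
    """Map analysis values onto priority labels from the threshold table.

    Assumption: the SoV tiers are exclusive, so only the most severe one fires.
    `positions` holds the numeric rank of the product in each response that
    mentioned it (hypothetical input shape).
    """
    actions = []
    if sov < 3:
        actions.append(("CRITICAL", "SoV < 3"))
    elif sov < 5:
        actions.append(("HIGH", "SoV < 5"))
    elif sov >= 7:
        actions.append(("MAINTAIN", "SoV >= 7"))
    if negative_sentiment:
        actions.append(("HIGH", "negative sentiment detected"))
    if any(p > 3 for p in positions):
        actions.append(("MEDIUM", "position > 3 when mentioned"))
    return actions


# Example: product mentioned in 2 of the 7 core queries, once at position 4.
sov = share_of_voice(2, 7)
print(round(sov, 1), recommend(sov, positions=[2, 4]))
```

With 2 mentions out of 7 queries the score falls below 3, so the sketch reports the critical tier plus the position finding; an agent following the skill would reach the same result by hand.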
+ +## Pipeline + +``` +geoInterview = defineContext + |> generateQueries + |> interview + |> analyzeResponses + |> generateReport + |> generateRecommendations +``` + +## Cross-References + +- Use /aidd-technical-seo to audit and fix the technical SEO issues surfaced by recommendations +- See `fan-out-queries.md` for extended query templates across 9 categories +- See `geo-scoring.md` for the full GEO scoring methodology and citation signals + +Commands { + 🔍 /geo-interview - Run the full GEO interview pipeline + 📝 /geo-queries - Generate fan-out queries only (steps 1-2) + ❓ /help - List commands +} diff --git a/ai/skills/aidd-geo-interview/fan-out-queries.md b/ai/skills/aidd-geo-interview/fan-out-queries.md new file mode 100644 index 0000000..ccc82b0 --- /dev/null +++ b/ai/skills/aidd-geo-interview/fan-out-queries.md @@ -0,0 +1,143 @@ +# Fan-Out Query Templates + +Extended query templates for comprehensive GEO visibility analysis. Use these to +expand beyond the 7 core queries in SKILL.md for deeper coverage assessment. + +## Query Categories + +### 1. best_of (Weight: 9/10) + +High-value discovery queries — these are how users find new products. + +``` +"best {category} for {audience}" +"best {category} {year}" +"top {category}" +"best {category} near me" +"best {category} for startups" +"best {category} for enterprise" +``` + +### 2. comparison (Weight: 9/10) + +Direct head-to-head queries — high commercial intent. + +``` +"{product} vs {competitor}" +"{competitor} vs {product}" +"{product} compared to {competitor}" +"{product} or {competitor}" +``` + +### 3. alternative (Weight: 8/10) + +Competitor displacement queries — users actively seeking switches. + +``` +"{competitor} alternatives" +"{competitor} alternatives {year}" +"places like {competitor}" +"{product} alternatives" +``` + +### 4. problem (Weight: 8/10) + +Pain-point queries — users looking for solutions, not brands. 
+ +``` +"{pain_point}" +"why is {pain_point}" +"how to fix {pain_point}" +"solve {pain_point}" +``` + +### 5. how_to (Weight: 7/10) + +Informational queries that build topical authority. + +``` +"how to choose a {category}" +"how to use {product}" +"how to {solve_problem}" +``` + +### 6. pricing (Weight: 7/10) + +Commercial queries — strong purchase intent signals. + +``` +"{product} pricing" +"{product} pricing {year}" +"{product} cost breakdown" +"how much does {product} cost" +``` + +### 7. what_is (Weight: 6/10) + +Awareness queries — these drive top-of-funnel visibility. + +``` +"what is {product}" +"{product} review" +"{product} review {year}" +"{product} features" +``` + +### 8. when_to (Weight: 5/10) + +Decision-timing queries. + +``` +"is {product} worth it" +"when to use {product}" +"{product} use cases" +``` + +### 9. integration (Weight: 4/10) + +Ecosystem queries — lower volume but high conversion. + +``` +"{product} integrations" +"{product} API" +"{product} with {other_tool}" +``` + +## Type Weight Scoring + +Use weights to prioritize which query types to target first: + +| Type | Weight | Rationale | +|------|--------|-----------| +| best_of | 9 | Highest discovery volume, drives recommendations | +| comparison | 9 | High commercial intent, direct conversion | +| alternative | 8 | Active switchers, displacement opportunity | +| problem | 8 | Solution-seekers, builds authority | +| how_to | 7 | Topical authority, informational trust | +| pricing | 7 | Purchase intent, commercial queries | +| what_is | 6 | Awareness building, top-of-funnel | +| when_to | 5 | Decision support, moderate volume | +| integration | 4 | Ecosystem fit, lower volume | + +## Query Scoring Dimensions + +Each generated query can be scored on 4 dimensions: + +| Dimension | Description | Scale | +|-----------|-------------|-------| +| volume_signal | Estimated search demand for this query pattern | 0–10 | +| citation_opportunity | How likely AI models include citations in answers 
| 0–10 | +| current_gap | How poorly the product currently covers this query | 0–10 | +| commercial_intent | How close to purchase decision this query sits | 0–10 | + +**Composite score**: `(volume * 0.3) + (citation_opp * 0.3) + (gap * 0.25) + (commercial * 0.15)` + +## Industry-Specific Adjustments + +- **B2B SaaS**: Emphasize comparison, integration, and pricing queries +- **E-commerce**: Emphasize best_of, pricing, and alternative queries +- **Local business**: Emphasize best_of ("near me"), problem, and what_is queries +- **Content/Media**: Emphasize how_to, what_is, and problem queries + +Exclude query types that don't apply to the product's industry (e.g., skip +"integration" for local restaurants, skip "near me" for pure software products). diff --git a/ai/skills/aidd-geo-interview/geo-scoring.md b/ai/skills/aidd-geo-interview/geo-scoring.md new file mode 100644 index 0000000..47e15a3 --- /dev/null +++ b/ai/skills/aidd-geo-interview/geo-scoring.md @@ -0,0 +1,114 @@ +# GEO Scoring Methodology + +Reference document for the GEO interview skill's scoring, sentiment analysis, +and citation readiness assessment. 
+ +## 7-Dimension GEO Score + +Overall GEO visibility score (0–100) composed of weighted dimensions: + +| Dimension | Weight | What It Measures | +|-----------|--------|-----------------| +| Citation Readiness | 25% | Structured content that AI models can cite (FAQ, tables, stats, schema) | +| Technical Infrastructure | 15% | Page speed, mobile-friendliness, crawlability, AI bot access | +| Schema Coverage | 15% | JSON-LD structured data (Organization, FAQPage, HowTo, Article, Product) | +| SEO Fundamentals | 15% | Title tags, meta descriptions, headers, canonical URLs, internal linking | +| Content Quality | 15% | Depth, specificity, readability, E-E-A-T signals, freshness | +| Social Proof | 5% | Reviews, testimonials, awards, media mentions, trust signals | +| Fan-Out Coverage | 10% | How many fan-out query types the site's content addresses | + +## Share of Voice Calculation + +``` +SoV = (mention_count / total_queries) * 10 +``` + +| SoV Score | Rating | Interpretation | +|-----------|--------|---------------| +| 8–10 | Excellent | Dominant presence, regularly recommended | +| 6–7 | Good | Solid visibility, mentioned in most contexts | +| 4–5 | Moderate | Inconsistent presence, room to grow | +| 2–3 | Poor | Rarely mentioned, significant gaps | +| 0–1 | Critical | Invisible to AI models | + +## Replaceability Score + +How easily an AI could replace a page's answer with a different source (1–10, lower = better): + +**Scoring signals** (each reduces replaceability by 1–2 points): +- Contains specific numbers, percentages, or data points +- Includes tables or structured comparisons +- References named frameworks or proprietary methodologies +- Has clear author attribution with credentials +- Provides a TL;DR or executive summary +- Contains original research or first-party data +- Includes timestamps or freshness indicators + +**Replaceability thresholds**: +- 1–3: Hard to replace — unique data, proprietary insights +- 4–6: Moderate — good content but could be 
sourced elsewhere +- 7–10: Easily replaced — generic advice available anywhere + +## Citation Signals Checklist + +Content elements that make AI models more likely to cite a source: + +| Signal | Why It Works | +|--------|-------------| +| ✅ Data tables with specific numbers | AI models prefer citing concrete data | +| ✅ FAQ sections with clear Q&A pairs | Direct question-answer format matches query patterns | +| ✅ Numbered/percentage statistics | Specific claims are more citable than generalizations | +| ✅ Author byline with credentials | E-E-A-T signals increase citation trustworthiness | +| ✅ TL;DR or summary section | Provides extractable snippet for AI responses | +| ✅ Step-by-step instructions | Procedural content is cited for how-to queries | +| ✅ Comparison tables | Side-by-side data is cited for comparison queries | +| ✅ Published/updated dates | Freshness signals increase citation preference | +| ✅ Schema markup (JSON-LD) | Machine-readable metadata aids AI extraction | +| ✅ Blockquote testimonials | Social proof that AI models can reference | + +## Sentiment Signal Words + +### Positive Signals +Words indicating favorable AI perception: + +``` +excellent, great, best, recommend, top, leading, popular, +powerful, impressive, solid, strong, well-known, highly rated, +standout, favorite, reliable, innovative, trusted, preferred, +comprehensive, versatile, efficient +``` + +### Negative Signals +Words indicating unfavorable AI perception: + +``` +poor, worst, avoid, lacking, weak, limited, outdated, +expensive, disappointing, behind, issues, problems, concern, +drawback, difficult, complex, unreliable, frustrating, +cumbersome, overpriced +``` + +### Sentiment Scoring + +``` +For each AI response mentioning the product: + positive_count = count of positive signal words in context + negative_count = count of negative signal words in context + + sentiment = positive if positive_count > negative_count + sentiment = negative if negative_count > positive_count + 
sentiment = neutral otherwise +``` + +## Advanced GEO Tactics + +Content strategies that increase AI citation likelihood: + +| Tactic | Description | +|--------|-------------| +| Stat-bait tables | Tables with specific metrics that AI models cite directly | +| Information gap targeting | Content addressing questions competitors don't answer | +| Agent-directed schema | JSON-LD specifically designed for AI extraction | +| Comparison positioning | "X vs Y" content where your product wins objectively | +| Authority stacking | Multiple trust signals on a single page (author + data + testimonials) | +| Freshness signals | Regular updates with visible timestamps | diff --git a/ai/skills/aidd-geo-interview/index.md b/ai/skills/aidd-geo-interview/index.md new file mode 100644 index 0000000..11fcc66 --- /dev/null +++ b/ai/skills/aidd-geo-interview/index.md @@ -0,0 +1,23 @@ +# aidd-geo-interview + +This index provides an overview of the contents in this directory. + +## Files + +### 🔍 GEO Interview + +**File:** `SKILL.md` + +Interview AI models about GEO (Generative Engine Optimization) visibility for a product or topic. Measures share of voice, competitive positioning, and generates citation strategy recommendations. + +### 📝 Fan-Out Queries + +**File:** `fan-out-queries.md` + +Extended query template library with 9 categories, 30+ templates, type weights, and scoring dimensions for comprehensive GEO coverage analysis. + +### 📊 GEO Scoring + +**File:** `geo-scoring.md` + +Full GEO scoring methodology including 7-dimension scoring, replaceability formula, citation signals checklist, sentiment word lists, and share of voice calculation. 
diff --git a/ai/skills/aidd-technical-seo/SKILL.md b/ai/skills/aidd-technical-seo/SKILL.md new file mode 100644 index 0000000..477c31e --- /dev/null +++ b/ai/skills/aidd-technical-seo/SKILL.md @@ -0,0 +1,234 @@ +--- +name: aidd-technical-seo +description: Audit technical SEO for both traditional search and AI answer engine optimization (AEO/GEO). Covers page speed, headers, meta tags, schema, AI crawler access, fan-out coverage, and content quality with traffic-light scoring. +--- + +# 🛠️ Technical SEO Audit + +Act as a top-tier technical SEO engineer specializing in both traditional search +and AI answer engine optimization (AEO/GEO). You audit websites for technical +compliance, content quality, and AI visibility — then provide specific remediation +for every finding. + +Competencies { + technical SEO audit, + schema markup validation, + AI crawler access analysis, + content quality assessment, + fan-out query coverage strategy, + Core Web Vitals optimization +} + +Constraints { + Do ONE STEP at a time. Get user approval before moving on. + No code execution needed — analyze provided content or describe what to check. + Use traffic-light scoring: 🟢 pass, 🟡 warn, 🔴 fail. + Every finding MUST include specific remediation steps. + Reference seo-checklist.md for the full audit checklist. + Reference ai-crawler-audit.md for AI bot audit details. + Prioritize findings as P0 (critical), P1 (high), P2 (medium). +} + +## Process + +### 1. 
auditPageSpeed(url) => speedResult + +Evaluate page performance against Core Web Vitals thresholds: + +| Metric | 🟢 Good | 🟡 Needs Work | 🔴 Poor | +|--------|---------|---------------|---------| +| LCP (Largest Contentful Paint) | < 2.5s | 2.5–4.0s | > 4.0s | +| CLS (Cumulative Layout Shift) | < 0.1 | 0.1–0.25 | > 0.25 | +| INP (Interaction to Next Paint) | < 200ms | 200–500ms | > 500ms | +| TTFB (Time to First Byte) | < 800ms | 800–1800ms | > 1800ms | + +Check: total page load target < 2s, image optimization, render-blocking resources, +compression (gzip/brotli), caching headers. + +### 2. auditHeaders(content) => headerResult + +Validate heading structure: + +| Check | Requirement | Priority | +|-------|-------------|----------| +| H1 tag | Exactly 1 per page, contains primary keyword | P0 | +| H2 tags | ≥ 3 per page, descriptive subheadings | P1 | +| Hierarchy | No skipped levels (H1 → H2 → H3, not H1 → H3) | P1 | +| H2/H3 as questions | At least 1 question-format heading for AEO | P2 | +| Keyword placement | Primary keyword in H1 and at least 1 H2 | P1 | + +### 3. auditMeta(content) => metaResult + +Validate meta tags against specifications: + +| Tag | Specification | Priority | +|-----|--------------|----------| +| Title | 50–60 characters, includes primary keyword, unique per page | P0 | +| Meta description | 150–160 characters, includes CTA, unique per page | P0 | +| Canonical URL | Present, self-referencing or pointing to preferred version | P0 | +| Viewport | `width=device-width, initial-scale=1` | P0 | +| Open Graph | og:title, og:description, og:image, og:url | P1 | +| Twitter Card | twitter:card, twitter:title, twitter:description | P2 | +| Robots | No unintentional `noindex` or `nofollow` | P0 | + +### 4. 
auditSchema(content) => schemaResult + +Validate JSON-LD structured data: + +**Required types** (at least one should be present): + +| Schema Type | When Required | Key Properties | +|------------|---------------|----------------| +| Organization | Homepage, About | name, url, logo, sameAs, contactPoint | +| Article | Blog posts, news | headline, author, datePublished, dateModified | +| FAQPage | FAQ sections | mainEntity with Question + acceptedAnswer | +| HowTo | Tutorials, guides | step, name, text, image | +| Product | Product pages | name, description, offers, review, aggregateRating | +| BreadcrumbList | All pages | itemListElement with position, name, item | +| LocalBusiness | Local businesses | name, address, telephone, openingHours, geo | + +**Validation checks**: +- JSON-LD is valid JSON (no syntax errors) +- Required properties are present for each type +- `@context` is `https://schema.org` +- No `@graph` nesting issues +- Dates are ISO 8601 format + +### 5. auditCrawlerAccess(url) => crawlerResult + +Check access for 7 AI crawlers (see `ai-crawler-audit.md` for full details): + +| Bot | User-Agent | What It Powers | +|-----|-----------|----------------| +| GPTBot | GPTBot | ChatGPT training + search | +| ChatGPT-User | ChatGPT-User | ChatGPT live browsing | +| ClaudeBot | ClaudeBot | Claude training data | +| PerplexityBot | PerplexityBot | Perplexity AI search | +| Google-Extended | Google-Extended | Gemini AI training | +| Googlebot | Googlebot | Google Search + AI Overviews | +| Bingbot | bingbot | Bing Search + Copilot | + +For each bot, check: +- robots.txt status: allowed / blocked / blocked_by_wildcard / not_mentioned +- WAF/Cloudflare challenge detection +- Content mismatch (JS-rendered vs raw HTML) + +**Recommendation**: Block none. Every blocked bot is lost AI visibility. + +### 6. auditFanoutCoverage(content, product, category) => coverageResult + +Check if the site's content covers common fan-out query patterns: + +1. 
Generate fan-out queries using templates from /aidd-geo-interview +2. For each query, check if a matching H2/H3 heading or page exists +3. Calculate coverage percentage + +| Coverage | Rating | Action | +|----------|--------|--------| +| > 80% | 🟢 Excellent | Maintain and refresh | +| 50–80% | 🟡 Gaps exist | Create content for uncovered queries | +| < 50% | 🔴 Major gaps | Prioritize content creation sprint | + +Output uncovered query types as **recommended new page titles**. + +### 7. auditContentQuality(content) => qualityResult + +Evaluate content across 5 dimensions: + +**Readability** +- Flesch Reading Ease ≥ 60 (target: general audience) +- Average sentence length < 20 words +- Paragraph length: 2–4 sentences +- Grade level ≤ 12 (avoid academic writing) + +**Word Count** +- Minimum 1,500 words for pillar content +- Minimum 800 words for supporting pages +- No thin content pages (< 300 words) + +**AI Phrase Detection** +Flag overuse of generic AI-generated phrases (see `seo-checklist.md` for full list): +``` +"when it comes to", "leverage", "utilize", "synergy", +"holistic", "robust", "seamless", "game-changer", +"unlock the power", "take to the next level", "paradigm", +"navigate the complexities", "it's important to note", +"in today's digital landscape", "elevate your" +``` +Threshold: > 3 AI phrases per 1000 words = 🔴 fail + +**Specificity** +- Contains specific numbers, percentages, or data points +- Names real tools, companies, or methodologies +- Avoids vague qualifiers ("very", "really", "quite", "somewhat") + +**Trust Signals** (see `seo-checklist.md`): +- Testimonials with attribution +- Social proof (customer counts, results) +- Risk reversals (free trial, guarantee) +- Authority indicators (awards, certifications, media mentions) +- Security signals (privacy policy, compliance badges) + +### 8. 
generateReport(allResults) => markdown + +Produce the final audit report: + +```markdown +# Technical SEO Audit: {url} + +## Summary +| Section | Status | Score | +|---------|--------|-------| +| Page Speed | {traffic_light} | {score}/100 | +| Headers | {traffic_light} | {score}/100 | +| Meta Tags | {traffic_light} | {score}/100 | +| Schema | {traffic_light} | {score}/100 | +| AI Crawlers | {traffic_light} | {allowed}/{total} | +| Fan-Out Coverage | {traffic_light} | {pct}% | +| Content Quality | {traffic_light} | {score}/100 | + +## P0 — Critical Issues +{findings with specific remediation} + +## P1 — High Priority +{findings with specific remediation} + +## P2 — Medium Priority +{findings with specific remediation} + +## Fan-Out Query Recommendations +{uncovered queries → recommended page titles} + +## Remediation Roadmap +| Priority | Issue | Fix | Effort | +|----------|-------|-----|--------| +| P0 | ... | ... | S/M/L | +``` + +## Pipeline + +``` +technicalSEO = auditPageSpeed + |> auditHeaders + |> auditMeta + |> auditSchema + |> auditCrawlerAccess + |> auditFanoutCoverage + |> auditContentQuality + |> generateReport +``` + +## Cross-References + +- Use /aidd-geo-interview to measure actual AI visibility after fixing technical issues +- See `seo-checklist.md` for the complete 40+ item audit checklist +- See `ai-crawler-audit.md` for AI crawler User-Agent strings and remediation guides + +Commands { + 🛠️ /technical-seo - Run the full 8-step audit pipeline + 📑 /seo-headers - Audit headers only (step 2) + 🏗️ /seo-schema - Audit schema markup only (step 4) + 🤖 /seo-crawlers - Audit AI crawler access only (step 5) + 📝 /seo-content - Audit content quality only (step 7) + ❓ /help - List commands +} diff --git a/ai/skills/aidd-technical-seo/ai-crawler-audit.md b/ai/skills/aidd-technical-seo/ai-crawler-audit.md new file mode 100644 index 0000000..05c82f2 --- /dev/null +++ b/ai/skills/aidd-technical-seo/ai-crawler-audit.md @@ -0,0 +1,193 @@ +# AI Crawler Audit Guide + 
+Comprehensive reference for auditing AI crawler access to your website. Every +blocked AI bot is lost visibility in AI-generated answers and recommendations. + +## 7 AI Crawlers + +### Full User-Agent Strings + +| Bot | User-Agent String | Platform | +|-----|------------------|----------| +| GPTBot | `Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)` | ChatGPT training + search | +| ChatGPT-User | `Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ChatGPT-User/1.0; +https://openai.com/bot)` | ChatGPT live browsing | +| ClaudeBot | `ClaudeBot/1.0 (https://www.anthropic.com)` | Claude training data | +| PerplexityBot | `PerplexityBot` | Perplexity AI search | +| Google-Extended | `Google-Extended` | Gemini AI features | +| Googlebot | `Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)` | Google Search + AI Overviews | +| Bingbot | `Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)` | Bing Search + Copilot | + +## robots.txt Rules + +### Status Classification + +| Status | Meaning | Action | +|--------|---------|--------| +| `allowed` | Explicitly or implicitly allowed | 🟢 No action | +| `not_mentioned` | No rule for this bot | 🟡 Allowed by default, but add explicit Allow | +| `blocked` | `Disallow: /` for this User-Agent | 🔴 Remove block | +| `blocked_by_wildcard` | `User-agent: *` with `Disallow: /` | 🔴 Add explicit Allow for AI bots | + +### Example: Recommended robots.txt + +``` +User-agent: * +Allow: / + +# AI Crawlers — explicitly allowed +User-agent: GPTBot +Allow: / + +User-agent: ChatGPT-User +Allow: / + +User-agent: ClaudeBot +Allow: / + +User-agent: PerplexityBot +Allow: / + +User-agent: Google-Extended +Allow: / + +# Block sensitive paths only +User-agent: * +Disallow: /admin/ +Disallow: /api/ +Disallow: /private/ + +Sitemap: https://example.com/sitemap.xml +``` + +### Parsing Rules + +1. 
**Specific beats general**: `User-agent: GPTBot` rules override `User-agent: *` +2. **Longest match wins**: `/public/docs` overrides `/public` +3. **Wildcards**: `Disallow: /*.pdf$` blocks PDF files +4. **Case sensitivity**: User-agent matching is case-insensitive; path matching is case-sensitive +5. **Empty Disallow**: `Disallow:` (empty) = allow everything +6. **No robots.txt**: No file = everything allowed (but add one explicitly) + +## WAF & Cloudflare Challenge Detection + +AI bots can be blocked at the WAF level even when robots.txt allows them. + +### Challenge Patterns to Detect + +| Pattern | Indicator | +|---------|-----------| +| `"Attention Required"` in HTML | Cloudflare block page | +| `"Just a moment"` in HTML | Cloudflare challenge page | +| `"Checking your browser"` in HTML | Generic bot challenge | +| HTTP 403 response | Access denied | +| HTTP 429 response | Rate limited | +| HTTP 503 + challenge page | Service-level bot detection | +| Response < 1KB for known-content page | Content stripped by WAF | + +### Detection Method + +For each AI bot User-Agent: + +``` +1. Fetch URL with bot's User-Agent string +2. Fetch URL with standard browser User-Agent (baseline) +3. Compare: + - HTTP status codes (should match) + - Response size (bot response should be > 50% of baseline) + - Check for challenge patterns in bot response + - Check for redirects to challenge pages +``` + +### Content Mismatch Detection + +JavaScript-heavy sites may serve different content to bots: + +``` +Mismatch if: + bot_html_length < (baseline_html_length * 0.5) + OR key elements missing (H1, main content, schema) + OR bot response contains only shell HTML + JS bundles +``` + +This affects AI visibility because bots see empty/minimal content. 
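The detection method and mismatch rule above can be sketched as a pure comparison over two already-fetched responses. This is a hedged illustration, not part of the patch: `audit_bot_response` and its parameters are hypothetical names, fetching is left out so the sketch runs without network access, the H1 check is a simple regular expression rather than a real HTML parser, and the 503 row is covered only through the status comparison and the challenge markers.

```python
import re

# Challenge markers taken from the "Challenge Patterns to Detect" table.
CHALLENGE_MARKERS = ("Attention Required", "Just a moment", "Checking your browser")
H1_PATTERN = re.compile(r"<h1[\s>]", re.IGNORECASE)


def audit_bot_response(bot_status, bot_html, base_status, base_html):
    """Compare a bot-fetched response with a browser baseline.

    Returns a list of finding strings; an empty list means nothing was flagged.
    """
    findings = []
    if bot_status != base_status:
        findings.append(f"status mismatch: bot={bot_status}, baseline={base_status}")
    if bot_status in (403, 429):
        findings.append(f"blocking status: {bot_status}")
    lowered = bot_html.lower()
    for marker in CHALLENGE_MARKERS:
        if marker.lower() in lowered:
            findings.append(f"challenge marker: {marker}")
    # Mismatch rule: bot HTML shorter than 50% of the baseline HTML.
    if len(bot_html) < len(base_html) * 0.5:
        findings.append("content mismatch: bot HTML under 50% of baseline")
    # Key element check, limited here to the H1 heading.
    if H1_PATTERN.search(base_html) and not H1_PATTERN.search(bot_html):
        findings.append("content mismatch: H1 missing from bot HTML")
    return findings


baseline = "<html><body><h1>Pricing</h1>" + "<p>content</p>" * 50 + "</body></html>"
js_shell = "<html><body><div id='root'></div></body></html>"
for finding in audit_bot_response(200, js_shell, 200, baseline):
    print(finding)
```

Keeping the comparison separate from the fetch step means the same function can be fed responses gathered by `curl`, a headless browser, or pasted HTML, which fits the skill's "analyze provided content" constraint.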
+ +## Platform-Specific Optimization + +### ChatGPT (GPTBot + ChatGPT-User) +- Allow both User-Agents (GPTBot for training, ChatGPT-User for live browse) +- Structured FAQ content is frequently cited in ChatGPT responses +- JSON-LD schema increases extraction accuracy +- ChatGPT Browse uses live fetching — page speed matters + +### Perplexity (PerplexityBot) +- Primary citation-based AI search engine +- Values structured, factual content with data points +- FAQ sections and comparison tables are frequently cited +- Freshness signals (updated dates) increase citation preference + +### Google AI Overviews (Googlebot + Google-Extended) +- Googlebot access is essential for all search visibility +- Google-Extended controls Gemini AI feature access specifically +- Standard SEO best practices apply plus schema markup +- AI Overviews favor comprehensive, well-structured content + +### Claude (ClaudeBot) +- Training data crawler for Anthropic's Claude models +- Respects robots.txt strictly +- Broad content access improves model knowledge of your brand + +### Bing Copilot (Bingbot) +- Bingbot powers both Bing Search and Microsoft Copilot +- OpenGraph and meta descriptions are used in Copilot summaries +- Bing Webmaster Tools verification improves crawl efficiency + +## Remediation by Scenario + +### Scenario 1: Bot Blocked in robots.txt + +**Fix**: Add explicit Allow rule + +``` +User-agent: {blocked_bot} +Allow: / +``` + +### Scenario 2: Blocked by Wildcard + +**Fix**: Add specific User-Agent rules BEFORE the wildcard block + +``` +# Allow AI bots explicitly +User-agent: GPTBot +Allow: / + +User-agent: ClaudeBot +Allow: / + +# General rule +User-agent: * +Disallow: /admin/ +``` + +### Scenario 3: WAF/Cloudflare Blocking + +**Fix options** (platform-dependent): +1. Cloudflare: Dashboard → Security → Bots → Add verified bot exceptions +2. AWS WAF: Create rule to allow known AI bot User-Agents +3. 
Custom WAF: Whitelist AI bot IP ranges + User-Agent strings + +### Scenario 4: JS Rendering Mismatch + +**Fix**: Ensure critical content is in initial HTML, not JS-only: +1. Use server-side rendering (SSR) or static site generation (SSG) +2. Place H1, meta tags, and JSON-LD in the initial HTML response +3. Ensure main content text is present before JavaScript executes +4. Test with `curl` (raw HTML) vs browser (rendered) — content should match + +### Scenario 5: Content Behind Login/Paywall + +**Fix**: Implement `article` meta tags or Google's flexible sampling: +1. Make at least the first 2-3 paragraphs freely accessible +2. Use `` for preview length +3. Ensure JSON-LD schema is always in the public HTML +4. Consider a metered paywall (X free articles) instead of hard gate diff --git a/ai/skills/aidd-technical-seo/index.md b/ai/skills/aidd-technical-seo/index.md new file mode 100644 index 0000000..e4552ad --- /dev/null +++ b/ai/skills/aidd-technical-seo/index.md @@ -0,0 +1,23 @@ +# aidd-technical-seo + +This index provides an overview of the contents in this directory. + +## Files + +### 🛠️ Technical SEO Audit + +**File:** `SKILL.md` + +Audit technical SEO for both traditional search and AI answer engine optimization (AEO/GEO). Covers page speed, headers, meta tags, schema, AI crawler access, fan-out coverage, and content quality with traffic-light scoring. + +### 📋 SEO Checklist + +**File:** `seo-checklist.md` + +Complete 40+ item audit checklist organized by category with exact thresholds, AI phrase detection patterns, content quality targets, trust signal categories, and engagement criteria. + +### 🤖 AI Crawler Audit + +**File:** `ai-crawler-audit.md` + +Comprehensive AI crawler audit guide with 7 bot User-Agent strings, robots.txt parsing rules, WAF detection patterns, platform-specific optimization, and remediation steps. 
diff --git a/ai/skills/aidd-technical-seo/seo-checklist.md b/ai/skills/aidd-technical-seo/seo-checklist.md new file mode 100644 index 0000000..24b5054 --- /dev/null +++ b/ai/skills/aidd-technical-seo/seo-checklist.md @@ -0,0 +1,191 @@ +# SEO Audit Checklist + +Complete technical SEO checklist for traditional search and AI answer engine +optimization. Use alongside SKILL.md for the full audit process. + +## Page Speed & Performance + +| # | Check | Target | Priority | +|---|-------|--------|----------| +| 1 | LCP (Largest Contentful Paint) | < 2.5s | P0 | +| 2 | CLS (Cumulative Layout Shift) | < 0.1 | P0 | +| 3 | INP (Interaction to Next Paint) | < 200ms | P0 | +| 4 | TTFB (Time to First Byte) | < 800ms | P1 | +| 5 | Total page load time | < 2s | P0 | +| 6 | Image optimization | WebP/AVIF, lazy loading | P1 | +| 7 | Compression | gzip or brotli enabled | P1 | +| 8 | Render-blocking resources | Deferred or async JS/CSS | P1 | +| 9 | Browser caching | Cache-Control headers set | P2 | + +## Content Structure + +| # | Check | Target | Priority | +|---|-------|--------|----------| +| 10 | H1 tag | Exactly 1, contains primary keyword | P0 | +| 11 | H2 tags | ≥ 3, descriptive, keyword-relevant | P1 | +| 12 | Heading hierarchy | No skipped levels (H1→H2→H3) | P1 | +| 13 | Question headings | ≥ 1 H2/H3 in question format (for AEO) | P2 | +| 14 | Word count (pillar) | ≥ 1,500 words | P1 | +| 15 | Word count (support) | ≥ 800 words | P1 | +| 16 | Thin content | No pages < 300 words | P0 | +| 17 | Paragraph length | 2–4 sentences | P2 | + +## Meta Tags + +| # | Check | Target | Priority | +|---|-------|--------|----------| +| 18 | Title tag | 50–60 chars, primary keyword, unique | P0 | +| 19 | Meta description | 150–160 chars, CTA, unique | P0 | +| 20 | Canonical URL | Present, correct | P0 | +| 21 | Viewport meta | `width=device-width, initial-scale=1` | P0 | +| 22 | Robots meta | No unintentional noindex/nofollow | P0 | +| 23 | Open Graph tags | og:title, og:description, 
og:image, og:url | P1 | +| 24 | Twitter Card tags | twitter:card, twitter:title, twitter:description | P2 | +| 25 | Hreflang | Present if multilingual | P1 | + +## Schema Markup (JSON-LD) + +| # | Check | Target | Priority | +|---|-------|--------|----------| +| 26 | JSON-LD present | At least 1 schema type per page | P0 | +| 27 | Valid JSON | No syntax errors | P0 | +| 28 | @context | `https://schema.org` | P0 | +| 29 | Organization | On homepage: name, url, logo, sameAs | P1 | +| 30 | Article | On blog posts: headline, author, dates | P1 | +| 31 | FAQPage | On FAQ sections: mainEntity array | P1 | +| 32 | BreadcrumbList | On all pages: itemListElement | P2 | +| 33 | Product | On product pages: name, offers, rating | P1 | +| 34 | HowTo | On tutorials: step array | P2 | + +## AI Crawler Access + +| # | Check | Target | Priority | +|---|-------|--------|----------| +| 35 | GPTBot | Allowed in robots.txt | P0 | +| 36 | ChatGPT-User | Allowed in robots.txt | P0 | +| 37 | ClaudeBot | Allowed in robots.txt | P0 | +| 38 | PerplexityBot | Allowed in robots.txt | P1 | +| 39 | Google-Extended | Allowed in robots.txt | P1 | +| 40 | Googlebot | Allowed in robots.txt | P0 | +| 41 | Bingbot | Allowed in robots.txt | P0 | +| 42 | No WAF/Cloudflare blocks | Bots not challenged | P0 | +| 43 | JS rendering parity | Raw HTML ≈ rendered HTML | P1 | + +## Content Quality + +### AI Phrase Detection (35 Patterns) + +Flag content that overuses generic AI-generated phrases. Threshold: > 3 per 1,000 words = fail. 
+ +``` +"when it comes to" +"leverage" +"utilize" +"synergy" +"holistic" +"robust" +"seamless" +"game-changer" +"unlock the power" +"take to the next level" +"paradigm" +"facilitate" +"meticulous" +"navigate the complexities" +"it's important to note" +"in today's digital landscape" +"elevate your" +"cutting-edge" +"best-in-class" +"deep dive" +"at the end of the day" +"move the needle" +"low-hanging fruit" +"circle back" +"thought leader" +"innovative solution" +"streamline" +"empower" +"actionable insights" +"ecosystem" +"scalable" +"world-class" +"state-of-the-art" +"next-generation" +"transformative" +``` + +### Readability Targets + +| Metric | Target | Audience | +|--------|--------|----------| +| Flesch Reading Ease | ≥ 60 | General audience | +| Grade level | ≤ 12 | Non-academic | +| Avg sentence length | < 20 words | Readable | +| Complex word ratio | < 15% | Accessible | +| Passive voice | < 25% | Active writing | +| Transition words | ≥ 2 per section | Connected flow | +| Sentence length variance | stdev ≥ 3.0 | Natural rhythm | + +### Specificity Indicators + +Content scores higher when it contains: +- Specific percentages: "increased conversion by 23%" +- Dollar amounts: "saved $50,000 annually" +- Named tools/companies: "integrates with Salesforce" +- Time-specific data: "as of Q1 2024" +- Methodology references: "using the RICE framework" + +Penalize vague qualifiers: "very", "really", "quite", "somewhat", "basically", +"actually", "literally", "just" + +## Trust Signals + +### Testimonials (35 pts max) +- Quoted text 20–300 characters with attribution +- Specific results: percentages, dollar amounts, time metrics +- Named person with title/company + +### Social Proof (30 pts max) +- Customer counts: "trusted by 10,000+ companies" +- Specific results: "grew revenue 45%", "increased leads 3x" +- Time results: "saves 4 hours per week" +- Logo bars with recognized brands + +### Risk Reversals (25 pts max) +Keywords to detect: +``` +"free trial", "no 
credit card", "cancel anytime", +"money-back guarantee", "full refund", "risk-free", +"no obligation", "satisfaction guarantee", "try free", +"no commitment" +``` + +### Authority Signals (10 pts max) +- Media mentions: "as seen in Forbes, TechCrunch" +- Awards and certifications +- Partnership badges +- Years in business: "since 2015", "10+ years" +- "trusted by", "industry leader", "award-winning" + +### Security Signals +- Privacy policy link +- Compliance badges: GDPR, SOC 2, HIPAA, PCI-DSS +- "data protection", "encryption", "secure checkout" + +## Engagement Criteria + +### Hook Quality (opening paragraph) +- Addresses a specific pain point +- Contains a question or surprising stat +- Length: 2–3 sentences max + +### CTA Distribution +- At least 1 CTA per 500 words +- Final CTA in last section +- CTAs use action verbs: "get started", "try free", "learn more" + +### Sentence Rhythm +- Mix of short (≤ 10 words), medium, and long (> 20 words) sentences +- No more than 3 consecutive long sentences +- Short sentences for emphasis after complex ideas diff --git a/ai/skills/index.md b/ai/skills/index.md index d74c56d..279edbb 100644 --- a/ai/skills/index.md +++ b/ai/skills/index.md @@ -32,6 +32,10 @@ See [`aidd-fix/index.md`](./aidd-fix/index.md) for contents. See [`aidd-functional-requirements/index.md`](./aidd-functional-requirements/index.md) for contents. +### 📁 aidd-geo-interview/ + +See [`aidd-geo-interview/index.md`](./aidd-geo-interview/index.md) for contents. + ### 📁 aidd-javascript/ See [`aidd-javascript/index.md`](./aidd-javascript/index.md) for contents. @@ -100,6 +104,10 @@ See [`aidd-structure/index.md`](./aidd-structure/index.md) for contents. See [`aidd-sudolang-syntax/index.md`](./aidd-sudolang-syntax/index.md) for contents. +### 📁 aidd-technical-seo/ + +See [`aidd-technical-seo/index.md`](./aidd-technical-seo/index.md) for contents. + ### 📁 aidd-task-creator/ See [`aidd-task-creator/index.md`](./aidd-task-creator/index.md) for contents.