feat: add metadata-ai-readiness skill and enrich mart model YAML #2

Open
aboyalejandro wants to merge 2 commits into main from feature/metadata-ai-readiness-skill

Conversation

aboyalejandro (Owner) commented Apr 4, 2026

Summary

  • Enriched all 46 column descriptions across campaign_performance and daily_summary with [Business Purpose] and [Known Issues / Caveats] sections
  • Applied dbt Agent Skills writing-documentation standards — no description restates its column name
  • Documented COALESCE masking behavior, synthetic data uniformity, date gaps, and misleading zero values discovered via database profiling

AI Readiness Audit Report

campaign_performance

dbt Schema

| Check | Status | Detail |
| --- | --- | --- |
| Model description exists | PASS | Has grain, key metrics, business context |
| All 26 SQL columns in YAML | PASS | Full coverage |
| Descriptions pass writing-documentation | PASS (was FAIL) | All 26 columns now enriched with [Business Purpose] and caveats |
| Grain tests (composite: campaign_id + date) | FAIL | Both have not_null but no composite unique test; grain is unverified |

Query Guidance

| Check | Finding |
| --- | --- |
| Grain holds | PASS: 400 rows, 400 distinct (campaign_id, date) pairs |
| Date range | 2025-12-01 to 2025-12-21 (21 days expected, 20 present) |
| Date gap | 2025-12-20 missing; absent from source stg_campaigns_daily |
| Sessions constant | total_sessions = 110 for ALL 400 rows; likely synthetic/uniform data |
| Zero conversions | 41/400 rows (10.3%) have 0 conversions → avg_order_value COALESCE'd to 0 (misleading: 0 ≠ "no data") |
| COALESCE masking | Session and conversion columns LEFT JOIN'd with COALESCE(x, 0); zeros hide NULLs from missing joins |
| ROAS spread | 0.066 to 6.20; wide range, no outlier flagging |
| Channels | 8 channels, 20 campaigns (2 paused: tiktok, linkedin) |
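
The COALESCE masking finding can be reproduced in isolation. The following is a minimal sketch using Python's built-in sqlite3 module rather than the project's Postgres database; the `campaigns` and `conversions` tables and their rows are hypothetical stand-ins, not the actual mart SQL. It shows how a measured zero and a missed join become indistinguishable once `COALESCE(x, 0)` is applied.

```python
import sqlite3

# Hypothetical toy tables; the real models and their SQL are not shown in this PR.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaigns (campaign_id TEXT, date TEXT);
    CREATE TABLE conversions (campaign_id TEXT, date TEXT, conversions INTEGER);
    INSERT INTO campaigns VALUES ('c1', '2025-12-01'), ('c2', '2025-12-01');
    -- c1 has a real measurement of zero; c2 has no matching row at all.
    INSERT INTO conversions VALUES ('c1', '2025-12-01', 0);
""")

rows = conn.execute("""
    SELECT
        c.campaign_id,
        COALESCE(v.conversions, 0) AS conversions_coalesced,
        v.conversions IS NULL      AS join_missed
    FROM campaigns AS c
    LEFT JOIN conversions AS v
        ON v.campaign_id = c.campaign_id AND v.date = c.date
    ORDER BY c.campaign_id
""").fetchall()

for campaign_id, coalesced, join_missed in rows:
    # Both campaigns report 0, but only c2's zero comes from a missing join.
    print(campaign_id, coalesced, bool(join_missed))
```

Keeping a flag such as `join_missed` next to the coalesced value is one possible way to preserve the distinction; whether that fits these models is a judgment call for the maintainers.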

daily_summary

dbt Schema

| Check | Status | Detail |
| --- | --- | --- |
| Model description exists | PASS | Has grain, key metrics, business context |
| All 20 SQL columns in YAML | PASS | Full coverage |
| Descriptions pass writing-documentation | PASS (was FAIL) | All 20 columns now enriched |
| Grain tests (date) | PASS | not_null + unique present |

Query Guidance

| Check | Finding |
| --- | --- |
| Grain holds | PASS: 20 rows, 20 distinct dates |
| Date gap | 2025-12-20 missing; same source gap |
| Sessions constant | total_sessions = 2,200 every day (20 campaigns × 110) |
| Conversions constant | total_conversions = 150 every day; suspiciously uniform |
| NULLIF columns | budget_utilization, overall_conversion_rate, overall_roas, overall_cpa: 0 NULLs currently (no divide-by-zero days exist, but would produce NULLs if they did) |
| Spend range | $33,878 – $50,386/day |
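
To illustrate the NULLIF finding, here is a small sketch, again using sqlite3 as a stand-in for Postgres. The `daily` table, its columns, and the `revenue / spend` formula are assumptions made for illustration; the PR does not show how `overall_roas` is actually computed. It contrasts a NULLIF-guarded ratio, which yields NULL on a zero-spend day, with the same ratio wrapped in COALESCE, which would report 0.

```python
import sqlite3

# Hypothetical table and ratio; not the real daily_summary SQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily (date TEXT, revenue REAL, spend REAL);
    INSERT INTO daily VALUES
        ('2025-12-01', 100.0, 50.0),
        ('2025-12-02',   0.0,  0.0);  -- a zero-spend day
""")

roas_rows = conn.execute("""
    SELECT
        date,
        revenue / NULLIF(spend, 0)              AS roas_nullif,
        COALESCE(revenue / NULLIF(spend, 0), 0) AS roas_coalesced
    FROM daily
    ORDER BY date
""").fetchall()

for date, roas_nullif, roas_coalesced in roas_rows:
    # On the zero-spend day the NULLIF form is None (undefined),
    # while the COALESCE form reports 0 as if it were a measured value.
    print(date, roas_nullif, roas_coalesced)
```

Consumers of the NULLIF-style columns therefore need to handle NULLs explicitly if zero-denominator days ever appear, even though none exist in the current data.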

Summary

PASS: 9/10 | Auto-fixed: 3 (descriptions enriched) | Manual: 1 (composite unique test on campaign_performance)

Remaining manual action

campaign_performance needs a composite uniqueness test on its grain. Add to _marts.yml:

    data_tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - campaign_id
            - date
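
As a rough sketch of what a composite uniqueness check verifies, the snippet below groups by the grain columns and reports any combination that appears more than once. It uses sqlite3 with a hypothetical four-row table containing a deliberate duplicate; it is not the dbt_utils implementation, only an approximation of the idea. Note that both columns are NOT NULL, so per-column not_null tests alone would not catch the duplicate.

```python
import sqlite3

# Hypothetical sample rows; the real model has 400 rows and currently no duplicates.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaign_performance (
        campaign_id TEXT NOT NULL,
        date        TEXT NOT NULL
    );
    INSERT INTO campaign_performance VALUES
        ('c1', '2025-12-01'),
        ('c1', '2025-12-02'),
        ('c2', '2025-12-01'),
        ('c2', '2025-12-01');  -- deliberate duplicate of the grain
""")

# Any row returned here is a (campaign_id, date) pair that violates the grain.
duplicates = conn.execute("""
    SELECT campaign_id, date, COUNT(*) AS n
    FROM campaign_performance
    GROUP BY campaign_id, date
    HAVING COUNT(*) > 1
""").fetchall()

print(duplicates)
```

An empty result corresponds to the grain holding, which matches the profiling finding of 400 rows and 400 distinct pairs; the test makes that guarantee continuous rather than a one-time observation.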

Test plan

  • Verify _marts.yml parses without errors: dbt parse --profiles-dir . --project-dir .
  • Confirm enriched descriptions render correctly in dbt docs
  • Review flagged caveats against business knowledge (especially uniform session/conversion data)

🤖 Generated with Claude Code

aboyalejandro and others added 2 commits April 4, 2026 12:12
Add a Claude Code skill that audits and enriches dbt schema YAML for
AI consumption. The skill automates dbt Agent Skills' writing-documentation
and discovering-data standards via Postgres MCP.

Enrich _marts.yml with structured descriptions using bracketed headers
([Business Purpose], [Data Grain], [Known Issues / Caveats]) on both
campaign_performance and daily_summary models. Key caveats surfaced:
COALESCE vs NULLIF inconsistency between models, composite grain
testing gap, and averaged-averages in daily_summary aggregations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…caveats

Apply writing-documentation standards to all 46 columns across both mart models.
Each description now includes [Business Purpose] context and [Known Issues / Caveats]
discovered via database profiling (COALESCE masking, uniform session data, date gaps,
misleading zero values on calculated KPIs).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>