From 7027ca4157eab00daf71a4831f6dcfedcc1bb97d Mon Sep 17 00:00:00 2001
From: Sourabh Chourasia
Date: Thu, 14 May 2026 20:27:30 +0530
Subject: [PATCH 1/2] feat: pipeline scripts + April 2026 draft entries (30, flagged draft)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds a reproducible release-to-entries pipeline and 30 draft changelog
entries for April 2026 under generated/april-2026/. All drafts carry
`draft: true` so the website skips them until reviewers promote files
into entries/.

Pipeline (scripts/):
- fetch_releases.py: pulls GitHub releases for 7 source repos
  (altimate-backend, altimate-frontend, vscode-dbt-power-user,
  altimate-code, altimate-mcp-engine, vscode-altimate-mcp-server,
  altimate-dbt-snowflake-query-tags) with primary-product mapping.
- fetch_pr_bodies.py: resumable, threaded PR body cache.
- reparse_releases.py: re-parses PR refs in releases.json after regex
  tweaks without re-fetching.
- build_features.py: classifier (PR → product), dedup by Jira ID +
  paired-PR refs, feat:-prefix filter.
- build_curation.py: emits a [x]/[ ] checklist grouped by
  product → month.
- polish_entries.py: reads `[x]` marks and writes drafts.
- generate_entries.py: one-shot draft writer.

Drafts:
- generated/april-2026/ holds 30 hand-polished entries covering 58
  merged PRs across the 6 active April repos.
- All 30 pass `.github/scripts/validate.py`.
- Review doc on Notion mirrors these files for team verification
  before promotion to entries/.

.gitignore:
- Ignores data/ (raw release + PR bodies from private repos).
- Ignores generated/* but tracks `generated//` subdirectories so
  monthly curated drafts can be reviewed in PRs while polish drafts
  containing private PR refs stay local.

Co-Authored-By: Claude Opus 4.7 (1M context) --- .gitignore | 9 + .../2026-04-01-dbt-1-11-udf-lineage-in-mcp.md | 11 + .../2026-04-02-dbt-cloud-sync-consolidated.md | 13 + .../2026-04-02-llm-guard-prompt-injection.md | 13 + ...26-04-02-remote-mcp-http-sse-transports.md | 13 + ...26-04-04-altimate-code-gitlab-mr-review.md | 13 + ...ltimate-code-login-and-project-profiles.md | 13 + ...026-04-06-dbt-cloud-environment-aliases.md | 13 + ...dio-citations-sharing-scheduled-reports.md | 13 + ...26-04-06-table-level-lineage-csv-export.md | 13 + ...dbt-syntax-grammar-and-python-detection.md | 13 + .../2026-04-09-tokens-tab-and-deposit-flow.md | 13 + .../2026-04-11-datamates-chat-panel-polish.md | 13 + ...-altimate-code-dbt-unit-test-generation.md | 13 + .../2026-04-14-auto-suspend-savings-range.md | 11 + ...scriptions-webhooks-cloning-team-alerts.md | 13 + ...026-04-14-team-attribution-in-qtp-slack.md | 13 + ...-cortex-ai-services-and-warehouse-names.md | 13 + ...26-04-15-cte-profiler-in-dbt-power-user.md | 13 + ...5-schedule-name-validation-and-send-now.md | 13 + ...04-15-schema-yaml-hovers-and-codelenses.md | 13 + ...ta-parity-skill-mssql-fabric-clickhouse.md | 13 + ...1-databricks-ai-gateway-as-llm-provider.md | 13 + .../april-2026/2026-04-21-referral-program.md | 13 + ...2-databricks-jobs-spark-users-deep-dive.md | 13 + ...ate-skills-push-to-cursor-copilot-cline.md | 13 + .../2026-04-22-sso-inactivity-timeout.md | 13 + ...23-altimate-code-chat-in-dbt-power-user.md | 13 + .../2026-04-23-risingwave-adapter-support.md | 11 + ...04-28-share-studio-to-slack-via-webhook.md | 13 + ...26-04-29-custom-date-range-on-workloads.md | 11 + scripts/build_curation.py | 153 +++++++ scripts/build_features.py | 411 ++++++++++++++++++ scripts/fetch_pr_bodies.py | 129 ++++++ scripts/fetch_releases.py | 185 ++++++++ scripts/generate_entries.py | 202 +++++++++ scripts/polish_entries.py | 230 ++++++++++ scripts/reparse_releases.py | 27 ++ 38 files changed, 1728 insertions(+) create mode 100644 
generated/april-2026/2026-04-01-dbt-1-11-udf-lineage-in-mcp.md create mode 100644 generated/april-2026/2026-04-02-dbt-cloud-sync-consolidated.md create mode 100644 generated/april-2026/2026-04-02-llm-guard-prompt-injection.md create mode 100644 generated/april-2026/2026-04-02-remote-mcp-http-sse-transports.md create mode 100644 generated/april-2026/2026-04-04-altimate-code-gitlab-mr-review.md create mode 100644 generated/april-2026/2026-04-04-altimate-code-login-and-project-profiles.md create mode 100644 generated/april-2026/2026-04-06-dbt-cloud-environment-aliases.md create mode 100644 generated/april-2026/2026-04-06-studio-citations-sharing-scheduled-reports.md create mode 100644 generated/april-2026/2026-04-06-table-level-lineage-csv-export.md create mode 100644 generated/april-2026/2026-04-07-dbt-syntax-grammar-and-python-detection.md create mode 100644 generated/april-2026/2026-04-09-tokens-tab-and-deposit-flow.md create mode 100644 generated/april-2026/2026-04-11-datamates-chat-panel-polish.md create mode 100644 generated/april-2026/2026-04-13-altimate-code-dbt-unit-test-generation.md create mode 100644 generated/april-2026/2026-04-14-auto-suspend-savings-range.md create mode 100644 generated/april-2026/2026-04-14-subscriptions-webhooks-cloning-team-alerts.md create mode 100644 generated/april-2026/2026-04-14-team-attribution-in-qtp-slack.md create mode 100644 generated/april-2026/2026-04-15-cortex-ai-services-and-warehouse-names.md create mode 100644 generated/april-2026/2026-04-15-cte-profiler-in-dbt-power-user.md create mode 100644 generated/april-2026/2026-04-15-schedule-name-validation-and-send-now.md create mode 100644 generated/april-2026/2026-04-15-schema-yaml-hovers-and-codelenses.md create mode 100644 generated/april-2026/2026-04-21-data-parity-skill-mssql-fabric-clickhouse.md create mode 100644 generated/april-2026/2026-04-21-databricks-ai-gateway-as-llm-provider.md create mode 100644 generated/april-2026/2026-04-21-referral-program.md create mode 
100644 generated/april-2026/2026-04-22-databricks-jobs-spark-users-deep-dive.md create mode 100644 generated/april-2026/2026-04-22-datamate-skills-push-to-cursor-copilot-cline.md create mode 100644 generated/april-2026/2026-04-22-sso-inactivity-timeout.md create mode 100644 generated/april-2026/2026-04-23-altimate-code-chat-in-dbt-power-user.md create mode 100644 generated/april-2026/2026-04-23-risingwave-adapter-support.md create mode 100644 generated/april-2026/2026-04-28-share-studio-to-slack-via-webhook.md create mode 100644 generated/april-2026/2026-04-29-custom-date-range-on-workloads.md create mode 100644 scripts/build_curation.py create mode 100644 scripts/build_features.py create mode 100644 scripts/fetch_pr_bodies.py create mode 100644 scripts/fetch_releases.py create mode 100644 scripts/generate_entries.py create mode 100644 scripts/polish_entries.py create mode 100644 scripts/reparse_releases.py diff --git a/.gitignore b/.gitignore index 9002caf..5a6defc 100644 --- a/.gitignore +++ b/.gitignore @@ -3,3 +3,12 @@ __pycache__/ *.pyc .venv/ .lycheecache + +# Pipeline artifacts (raw release/PR data fetched from private repos) +data/ + +# Auto-generated polish drafts that include private PR refs in HTML comments. +# Curated monthly drafts under generated// are tracked explicitly. +generated/* +!generated/april-2026/ +!generated/*/ diff --git a/generated/april-2026/2026-04-01-dbt-1-11-udf-lineage-in-mcp.md b/generated/april-2026/2026-04-01-dbt-1-11-udf-lineage-in-mcp.md new file mode 100644 index 0000000..6d2b339 --- /dev/null +++ b/generated/april-2026/2026-04-01-dbt-1-11-udf-lineage-in-mcp.md @@ -0,0 +1,11 @@ +--- +title: dbt 1.11 UDF lineage in Datamates MCP +date: 2026-04-01 +products: [datamates] +tag: new +emoji: 🔗 +draft: true +description: User-defined functions from dbt 1.11 now appear as first-class nodes in lineage graphs served through MCP. 
+--- + +Datamates MCP now parses dbt 1.11 function nodes and serves them as first-class lineage nodes alongside models, sources, and exposures. UDFs registered in a dbt project show up in lineage queries automatically — no new MCP tools, no configuration. Projects on older dbt versions see no change; the parser produces an empty map and falls through. diff --git a/generated/april-2026/2026-04-02-dbt-cloud-sync-consolidated.md b/generated/april-2026/2026-04-02-dbt-cloud-sync-consolidated.md new file mode 100644 index 0000000..ab6e1bd --- /dev/null +++ b/generated/april-2026/2026-04-02-dbt-cloud-sync-consolidated.md @@ -0,0 +1,13 @@ +--- +title: dbt Cloud sync, 5x faster +date: 2026-04-02 +products: [dbt-power-user, datamates] +tag: improved +emoji: 🧰 +draft: true +description: dbt Cloud sync consolidates ~1000 daily tasks per project/environment into a single ingestion run, cutting sync time from ~250 min to under 50. +--- + +dbt Cloud sync no longer creates one ingestion task per dbt Cloud run. The sync now consolidates to one task per `(project, environment)` per cycle — so a project firing 1000 runs a day produces one ingestion run instead of 1000 redundant ones. + +Manifest parsing, health checks, and PostgreSQL upserts each happen once per node per cycle instead of ~20 times, dropping a typical 4-worker sync from ~250 min to under 50. 
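Reviewer note on the dbt Cloud sync entry above: the consolidation it describes amounts to grouping runs by `(project, environment)` before scheduling ingestion. The following is a minimal sketch of that idea only, not the backend's implementation; the `Run` record and its field names are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Run(NamedTuple):
    # Hypothetical shape of a dbt Cloud run record; field names are illustrative.
    run_id: int
    project: str
    environment: str


def consolidate(runs: list[Run]) -> dict[tuple[str, str], list[int]]:
    """Group run IDs so each (project, environment) pair maps to one ingestion task."""
    tasks: dict[tuple[str, str], list[int]] = defaultdict(list)
    for run in runs:
        tasks[(run.project, run.environment)].append(run.run_id)
    return dict(tasks)


# 1000 production runs plus one staging run collapse into two tasks.
runs = [Run(i, "analytics", "prod") for i in range(1000)]
runs.append(Run(1000, "analytics", "staging"))
tasks = consolidate(runs)
```

Keying on the pair rather than on the run ID is what lets per-cycle work such as manifest parsing happen once per group instead of once per run.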
diff --git a/generated/april-2026/2026-04-02-llm-guard-prompt-injection.md b/generated/april-2026/2026-04-02-llm-guard-prompt-injection.md new file mode 100644 index 0000000..22ab980 --- /dev/null +++ b/generated/april-2026/2026-04-02-llm-guard-prompt-injection.md @@ -0,0 +1,13 @@ +--- +title: LLM Guard blocks prompt injection and poisoning +date: 2026-04-02 +products: [snowflake-app, databricks-app, datamates] +tag: improved +emoji: 🛡️ +draft: true +description: Studio's agent gateway now blocks nine new injection categories — instruction override, role hijacking, system-prompt extraction, payload obfuscation, multi-turn poisoning, and more. +--- + +The agent gateway used by Studio's `/chat/completions` now blocks nine new categories of prompt-injection and prompt-poisoning attempts before they reach the model: instruction override, role hijacking, system-prompt extraction, security bypass, fake system tokens, payload obfuscation, indirect injection (via tool output or documents), social engineering, and multi-turn poisoning. + +Existing AI-model-identification detection continues to work. The added checks run in front of the same agent endpoint, so all Studio surfaces get the upgraded coverage automatically. diff --git a/generated/april-2026/2026-04-02-remote-mcp-http-sse-transports.md b/generated/april-2026/2026-04-02-remote-mcp-http-sse-transports.md new file mode 100644 index 0000000..4f0cf40 --- /dev/null +++ b/generated/april-2026/2026-04-02-remote-mcp-http-sse-transports.md @@ -0,0 +1,13 @@ +--- +title: Remote MCP setup with HTTP and SSE transport options +date: 2026-04-02 +products: [datamates, altimate-code] +tag: improved +emoji: 🔌 +draft: true +description: The remote MCP setup screen now offers Streamable HTTP and SSE transport tabs, each pre-filled with the right config and auth headers. +--- + +The remote MCP setup screen now lets you pick between two transports. 
The Streamable HTTP tab uses `type: "http"` with the `/mcp` endpoint and is the default for Claude and most clients; the SSE tab uses `type: "sse"` with the `/sse` endpoint for clients that don't speak streamable HTTP — Cursor users in particular. + +Each tab fills in the matching config snippet and auth headers, so it's a copy-paste away from a working connection. diff --git a/generated/april-2026/2026-04-04-altimate-code-gitlab-mr-review.md b/generated/april-2026/2026-04-04-altimate-code-gitlab-mr-review.md new file mode 100644 index 0000000..fc9ed4f --- /dev/null +++ b/generated/april-2026/2026-04-04-altimate-code-gitlab-mr-review.md @@ -0,0 +1,13 @@ +--- +title: AI code review for GitLab merge requests +date: 2026-04-04 +products: [altimate-code] +tag: new +emoji: 🔎 +draft: true +description: The Altimate Code CLI fetches a GitLab MR diff, runs the AI review, and posts the result back as MR notes. +--- + +Altimate Code's CLI now reviews GitLab merge requests end-to-end. Run `altimate-code gitlab review ` and the CLI fetches the diff through GitLab REST API v4, runs the AI review, and posts the results back as MR notes — the same flow that has been available for GitHub since launch. + +URL parsing handles gitlab.com, self-hosted instances on custom ports, and nested group project paths. A `--no-post-comment` flag dry-runs the review locally, and `--model` picks which model handles the analysis. 
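Reviewer note on the GitLab MR entry above: the URL cases it lists (gitlab.com, self-hosted instances on custom ports, nested group paths) can be illustrated with a small parser. This is a sketch under assumptions, not the CLI's actual code: `parse_mr_url` is a hypothetical helper, and it only accepts the `/-/merge_requests/<iid>` URL form. The API path it builds follows GitLab REST API v4, where the project ID may be the URL-encoded project path.

```python
import re
from urllib.parse import quote, urlsplit

# Matches "/<group>/<subgroup...>/<project>/-/merge_requests/<iid>".
MR_PATH = re.compile(r"^/(?P<project>.+?)/-/merge_requests/(?P<iid>\d+)/?$")


def parse_mr_url(url: str) -> tuple[str, str, int, str]:
    """Return (base URL, project path, MR iid, REST API v4 URL) for an MR page URL."""
    parts = urlsplit(url)
    match = MR_PATH.match(parts.path)
    if parts.scheme not in ("http", "https") or not parts.netloc or match is None:
        raise ValueError(f"not a GitLab merge request URL: {url!r}")
    base = f"{parts.scheme}://{parts.netloc}"  # netloc keeps any custom port
    project = match["project"]
    iid = int(match["iid"])
    # safe="" forces "/" to be encoded as %2F, as the API expects for project paths.
    api_url = f"{base}/api/v4/projects/{quote(project, safe='')}/merge_requests/{iid}"
    return base, project, iid, api_url


result = parse_mr_url(
    "https://gitlab.example.com:8443/data/platform/dbt-models/-/merge_requests/42"
)
```

Encoding the whole project path avoids a separate lookup of the numeric project ID, which keeps nested groups and self-hosted instances on the same code path.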
diff --git a/generated/april-2026/2026-04-04-altimate-code-login-and-project-profiles.md b/generated/april-2026/2026-04-04-altimate-code-login-and-project-profiles.md new file mode 100644 index 0000000..8e16d9f --- /dev/null +++ b/generated/april-2026/2026-04-04-altimate-code-login-and-project-profiles.md @@ -0,0 +1,13 @@ +--- +title: Sign in to Altimate and find project-local dbt profiles +date: 2026-04-04 +products: [altimate-code] +tag: new +emoji: 🔐 +draft: true +description: A `/login` command for the Altimate provider, plus dbt-standard profile discovery that picks up project-local and `DBT_PROFILES_DIR` paths. +--- + +Altimate Code's TUI now includes the Altimate platform as a first-class LLM provider. A new `/login` dialog asks for instance name, API key, and URL (defaults to the Altimate cloud endpoint), validates credentials against `/auth_health` before saving, and writes `~/.altimate/altimate.json` with `0600` permissions. The provider is selectable from `/connect` and re-bootstraps the session immediately on save. + +dbt profile discovery also catches up to dbt's standard resolution order. `/discover` now finds profiles in priority order: an explicit path argument, `DBT_PROFILES_DIR`, a project-local `profiles.yml` sitting next to `dbt_project.yml`, and finally `~/.dbt/profiles.yml`. The common CI/CD pattern of committing a project-local profile no longer falls back silently to the global one. diff --git a/generated/april-2026/2026-04-06-dbt-cloud-environment-aliases.md b/generated/april-2026/2026-04-06-dbt-cloud-environment-aliases.md new file mode 100644 index 0000000..67a8a38 --- /dev/null +++ b/generated/april-2026/2026-04-06-dbt-cloud-environment-aliases.md @@ -0,0 +1,13 @@ +--- +title: dbt Cloud environment aliases +date: 2026-04-06 +products: [dbt-power-user] +tag: improved +emoji: 🧰 +draft: true +description: Map differently-named dbt environments to the same logical environment so warehouse data lines up with model data. 
+--- + +dbt Cloud environments can now carry aliases. If your warehouse refers to "Prod" but your dbt Cloud environment is "Production_5505", set an alias and both names resolve to the same logical environment — no more disjoint joins or missing model rows on the warehouse side. + +Multiple aliases per environment are supported; setting none leaves behavior unchanged. diff --git a/generated/april-2026/2026-04-06-studio-citations-sharing-scheduled-reports.md b/generated/april-2026/2026-04-06-studio-citations-sharing-scheduled-reports.md new file mode 100644 index 0000000..4bad9b8 --- /dev/null +++ b/generated/april-2026/2026-04-06-studio-citations-sharing-scheduled-reports.md @@ -0,0 +1,13 @@ +--- +title: Studio gets citations, shareable links, and scheduled reports +date: 2026-04-06 +products: [snowflake-app, databricks-app] +tag: new +emoji: ✨ +draft: true +description: Inline source citations on every Studio answer, shareable session links, and recurring AI reports delivered to email. +--- + +Studio answers now cite their sources inline. Click any reference to jump to the underlying table, query, or dashboard the answer was built from — no more taking the AI's word for it. + +Sharing a session is now a single dialog: generate a time-limited read-only link and send it to anyone who has access, no copy-pasting screenshots. The new schedule action turns any Studio conversation into a recurring email report — pick a cadence, pick recipients, and the same analysis lands in their inbox each week with fresh data and inline charts. 
diff --git a/generated/april-2026/2026-04-06-table-level-lineage-csv-export.md b/generated/april-2026/2026-04-06-table-level-lineage-csv-export.md new file mode 100644 index 0000000..6149ca6 --- /dev/null +++ b/generated/april-2026/2026-04-06-table-level-lineage-csv-export.md @@ -0,0 +1,13 @@ +--- +title: Table-level lineage now exports to CSV +date: 2026-04-06 +products: [datamates] +tag: new +emoji: 🔗 +draft: true +description: Download upstream or downstream table lineage as CSV — same export contract as column lineage, one row per edge. +--- + +Table-level lineage now exports to CSV alongside the existing column-level export. Pick any table, choose upstream or downstream, and download the full BFS traversal as a CSV with Source DB / Schema / Table → Target DB / Schema / Table columns — one row per edge. + +The lineage modal also got a z-index fix so it renders above the sidebar backdrop, and long resource keys in the export dialog no longer overflow. diff --git a/generated/april-2026/2026-04-07-dbt-syntax-grammar-and-python-detection.md b/generated/april-2026/2026-04-07-dbt-syntax-grammar-and-python-detection.md new file mode 100644 index 0000000..317c27f --- /dev/null +++ b/generated/april-2026/2026-04-07-dbt-syntax-grammar-and-python-detection.md @@ -0,0 +1,13 @@ +--- +title: dbt SQL+Jinja syntax highlighting and "Detect Python from terminal" +date: 2026-04-07 +products: [dbt-power-user] +tag: improved +emoji: 🎨 +draft: true +description: Proper TextMate grammars for dbt SQL+Jinja and YAML+Jinja, plus a one-click way to fix mismatched Python interpreters. +--- + +dbt Power User ships dedicated TextMate grammars for dbt SQL+Jinja and YAML+Jinja. `ref`, `source`, `config`, SQL aggregates, window functions, Jinja `{{ }}`/`{% %}`/`{# #}` blocks, and operators each get distinct scopes — schema YAML files highlight Jinja inline alongside YAML, and SQL files no longer fall back to a generic grammar that mis-colored half the keywords. 
+ +The most-reported onboarding bug — dbt works in the terminal but not in the extension because the Python extension picked a different interpreter — gets a dedicated **Detect Python from terminal** action. It spawns a login shell to find where `dbt` actually lives and writes that path to `dbtPythonPathOverride`. The button appears on every Python/dbt error dialog and in the onboarding prerequisites step. diff --git a/generated/april-2026/2026-04-09-tokens-tab-and-deposit-flow.md b/generated/april-2026/2026-04-09-tokens-tab-and-deposit-flow.md new file mode 100644 index 0000000..d7d8dfb --- /dev/null +++ b/generated/april-2026/2026-04-09-tokens-tab-and-deposit-flow.md @@ -0,0 +1,13 @@ +--- +title: Tokens tab with live conversion and deposit flow +date: 2026-04-09 +products: [snowflake-app, databricks-app, datamates] +tag: new +emoji: 💰 +draft: true +description: A single Credits page tab that shows monthly allowance, grant balance, and Token Wallet — plus an Add Tokens form with live dollar-to-token conversion. +--- + +The Credits page now has a Tokens tab that consolidates monthly allowance, grant balance, and Token Wallet into one view. A progress bar shows what's left of the monthly allowance, and the Token Wallet card shows both dollar and token equivalents so finance and engineering see the same number. + +Adding tokens used to mean leaving the page. The new inline form previews the conversion as you type ($10 = 1.2M tokens) and the transaction history below shows every grant, top-up, and consumption event with timestamps and token amounts. 
diff --git a/generated/april-2026/2026-04-11-datamates-chat-panel-polish.md b/generated/april-2026/2026-04-11-datamates-chat-panel-polish.md new file mode 100644 index 0000000..3a846cc --- /dev/null +++ b/generated/april-2026/2026-04-11-datamates-chat-panel-polish.md @@ -0,0 +1,13 @@ +--- +title: Chat panel polish — instant open, title bar icon, token usage indicator +date: 2026-04-11 +products: [datamates] +tag: improved +emoji: ⚡ +draft: true +description: The Altimate MCP chat panel opens with no perceptible delay, lives on the editor title bar, and shows live token usage in the header. +--- + +Opening the Altimate MCP chat panel used to wait for an `isInstalled()` check before rendering. The panel now appears immediately and runs the install check on the webview-ready handler, so there's no perceptible delay between clicking and seeing the panel. + +A new Altimate icon lands on the editor title bar (right after the run button) for one-click access without the command palette. The header also picks up a compact token usage indicator pulled from `/payment/token-usage` — usage percentage color-coded blue / orange / red against your monthly threshold, or "Unlimited" on unlimited plans. Click it for a detailed popover with allowance, grants, overage, billing period, and wallet balance. diff --git a/generated/april-2026/2026-04-13-altimate-code-dbt-unit-test-generation.md b/generated/april-2026/2026-04-13-altimate-code-dbt-unit-test-generation.md new file mode 100644 index 0000000..0c4963f --- /dev/null +++ b/generated/april-2026/2026-04-13-altimate-code-dbt-unit-test-generation.md @@ -0,0 +1,13 @@ +--- +title: Generate dbt unit tests from your manifest +date: 2026-04-13 +products: [altimate-code, dbt-power-user] +tag: new +emoji: 🧪 +draft: true +description: A new `dbt_unit_test_gen` tool inspects compiled SQL and writes dbt unit tests with type-correct mock data, including incremental and ephemeral cases. +--- + +Altimate Code now generates dbt unit tests for you. 
Point it at a model and the new `dbt_unit_test_gen` tool reads the manifest and compiled SQL, detects scenarios (CASE branches, JOINs, GROUP BY, division-by-zero, incremental loads), and emits a `unit_tests:` block in the model's schema YAML with type-correct mock data — happy path, null variants, and boundary cases. + +Incremental models get an `input: this` mock for the prior-state row; ephemeral upstream models use `format: sql` even when their column types aren't yet known. Cross-database — Snowflake, Databricks, BigQuery, Redshift, Postgres — works through the same `schema.inspect` adapter. diff --git a/generated/april-2026/2026-04-14-auto-suspend-savings-range.md b/generated/april-2026/2026-04-14-auto-suspend-savings-range.md new file mode 100644 index 0000000..b3bf80c --- /dev/null +++ b/generated/april-2026/2026-04-14-auto-suspend-savings-range.md @@ -0,0 +1,11 @@ +--- +title: Auto-suspend savings shown as a range +date: 2026-04-14 +products: [snowflake-app] +tag: improved +emoji: 💰 +draft: true +description: Warehouse auto-suspend savings now report a min/max range that accounts for Snowflake's 30-second polling delay. +--- + +Warehouse auto-suspend savings now report a range instead of a single point estimate. Snowflake polls for suspendable warehouses roughly every 30 seconds, so the actual suspend happens between 0 and 30 seconds after the `auto_suspend` timer fires. The recommendation card shows both the best case (immediate suspend) and the expected case (mid-poll, +15s) so the savings number isn't optimistic by 15 seconds of runtime per cycle. 
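Reviewer note on the auto-suspend entry above: the range it describes follows from simple arithmetic on the polling delay. The sketch below assumes the roughly 30-second poll interval stated in the entry and computes the idle window per suspend cycle; `idle_seconds_range` is a hypothetical helper, and the real recommendation card may model savings differently.

```python
def idle_seconds_range(
    auto_suspend_s: float, poll_interval_s: float = 30.0
) -> tuple[float, float, float]:
    """Return (best, expected, worst) idle seconds before a warehouse actually suspends.

    best:     the poll lands exactly when the auto_suspend timer fires
    expected: the poll lands halfway through the interval
    worst:    the poll has just been missed, so a full interval elapses
    """
    best = auto_suspend_s
    expected = auto_suspend_s + poll_interval_s / 2
    worst = auto_suspend_s + poll_interval_s
    return best, expected, worst


# With auto_suspend = 60 seconds, the expected case adds 15 seconds per cycle.
best, expected, worst = idle_seconds_range(60)
```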
diff --git a/generated/april-2026/2026-04-14-subscriptions-webhooks-cloning-team-alerts.md b/generated/april-2026/2026-04-14-subscriptions-webhooks-cloning-team-alerts.md new file mode 100644 index 0000000..e4fcbf7 --- /dev/null +++ b/generated/april-2026/2026-04-14-subscriptions-webhooks-cloning-team-alerts.md @@ -0,0 +1,13 @@ +--- +title: Subscriptions — webhooks, cloning, team alerts, same-weekday comparisons +date: 2026-04-14 +products: [snowflake-app, databricks-app, datamates] +tag: improved +emoji: 🔔 +draft: true +description: Webhook delivery, one-click clone for any alert, team-level cost alerts, and same-weekday comparison for warehouse metrics. +--- + +Subscriptions picks up four shippable improvements this month. Webhooks join Slack and email as a first-class delivery channel — paste a URL into any rule and get a typed JSON payload with tenant, timestamp, and a deep-link back to the dashboard. Clone duplicates any alert or report you own in one click; the copy lands in the wizard ready to edit. + +Team is now a first-class entity in alerting, so a rule like _"total daily cost for team data-products > $100"_ works without per-warehouse plumbing. The new same-weekday comparison mode compares today against the prior week's same day — much better for warehouses with weekly seasonality than the old day-over-day baseline. Tag filters in the wizard switched to server-side search, so high-cardinality tag spaces no longer freeze the picker. 
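Reviewer note on the Subscriptions entry above: the difference between the same-weekday and day-over-day baselines can be shown with a few lines of date arithmetic. This is an illustrative sketch; `relative_change` and the sample costs are hypothetical and not taken from the product.

```python
from datetime import date, timedelta
from typing import Optional


def relative_change(
    costs: dict[date, float], day: date, lookback_days: int
) -> Optional[float]:
    """Relative change of `day` against the value `lookback_days` earlier.

    Returns None when the baseline is missing or zero.
    """
    baseline = costs.get(day - timedelta(days=lookback_days))
    if not baseline:
        return None
    return (costs[day] - baseline) / baseline


# A warehouse that is quiet on Mondays and busy on Tuesdays.
costs = {
    date(2026, 4, 7): 100.0,   # previous Tuesday
    date(2026, 4, 13): 40.0,   # Monday
    date(2026, 4, 14): 120.0,  # Tuesday
}
same_weekday = relative_change(costs, date(2026, 4, 14), lookback_days=7)
day_over_day = relative_change(costs, date(2026, 4, 14), lookback_days=1)
```

In this sample the day-over-day baseline reports a 200% jump that is only the weekly pattern, while the same-weekday baseline reports a 20% increase.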
diff --git a/generated/april-2026/2026-04-14-team-attribution-in-qtp-slack.md b/generated/april-2026/2026-04-14-team-attribution-in-qtp-slack.md new file mode 100644 index 0000000..0992567 --- /dev/null +++ b/generated/april-2026/2026-04-14-team-attribution-in-qtp-slack.md @@ -0,0 +1,13 @@ +--- +title: Team attribution in QTP Slack threads +date: 2026-04-14 +products: [snowflake-app] +tag: improved +emoji: 👥 +draft: true +description: Query Timeout Prediction Slack threads now show the owning team even when the query tag is missing — using your tenant's ownership rules as a fallback. +--- + +When QTP posts a long-running query to Slack, the alert now identifies the owning team even on queries that don't carry a `team` tag. Your tenant's existing ownership rules — equals, startsWith, contains, in, and tag matchers, composable with AND/OR — apply as a fallback whenever the query tag itself is missing. + +Tag-derived team always wins when present; the rules engine only fills the gap, so existing tagging conventions keep working unchanged. diff --git a/generated/april-2026/2026-04-15-cortex-ai-services-and-warehouse-names.md b/generated/april-2026/2026-04-15-cortex-ai-services-and-warehouse-names.md new file mode 100644 index 0000000..0ae5e36 --- /dev/null +++ b/generated/april-2026/2026-04-15-cortex-ai-services-and-warehouse-names.md @@ -0,0 +1,13 @@ +--- +title: New Cortex AI services and readable warehouse names +date: 2026-04-15 +products: [snowflake-app] +tag: new +emoji: ❄️ +draft: true +description: Cortex Code CLI, Cortex Agent, and Snowflake Intelligence now appear in AI Services. Cortex Functions usage shows warehouse names instead of numeric IDs. +--- + +The AI Services page now tracks three new Snowflake Cortex service types: **Cortex Code CLI**, **Cortex Agent**, and **Snowflake Intelligence**. 
Each has its own cost summary, usage table, filter set, and place on the summary-page graph, so spend on the newer Cortex surfaces is no longer lumped under generic "Cortex". + +Cortex Functions usage now displays warehouse names (`EDS_HUMAN_WH_LARGE`) instead of opaque numeric IDs (`223`), and warehouse names work as a filter dimension. Same data, finally readable. diff --git a/generated/april-2026/2026-04-15-cte-profiler-in-dbt-power-user.md b/generated/april-2026/2026-04-15-cte-profiler-in-dbt-power-user.md new file mode 100644 index 0000000..ecc2c4c --- /dev/null +++ b/generated/april-2026/2026-04-15-cte-profiler-in-dbt-power-user.md @@ -0,0 +1,13 @@ +--- +title: CTE Profiler — per-CTE timing and row counts in the editor +date: 2026-04-15 +products: [dbt-power-user] +tag: new +emoji: ⏱️ +draft: true +description: Run a dbt model and the editor decorates each CTE with its wall-clock time, row count, and a hot/warm/cool heat tier so you can find the slow one without leaving the file. +--- + +The dbt Power User extension now profiles every CTE in a model in one click. The CTE Profiler runs cumulative `SELECT COUNT(*)` queries per CTE against your warehouse, measures wall-clock time, calculates the marginal time each CTE adds, and decorates the editor inline (`⏱ 1.7s · 100 rows`). Hot CTEs go red, warm yellow, cool grey; a total time and row count lands at the bottom of the file. + +No Altimate API key required — the profiler runs SQL directly against the user's warehouse using the existing dbt connection. 
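Reviewer note on the CTE Profiler entry above: because each probe query runs a CTE together with everything it depends on, the marginal time of a CTE is its cumulative time minus the previous cumulative time. The sketch below shows that subtraction; the heat-tier thresholds (share of total time) are assumptions for illustration, since the entry does not state how tiers are assigned.

```python
def profile(cumulative: list[tuple[str, float]]) -> list[tuple[str, float, str]]:
    """Turn cumulative wall-clock times per CTE into (name, marginal_s, tier)."""
    marginals = []
    previous = 0.0
    for name, elapsed in cumulative:
        # Clamp at zero: timing noise can make a later probe finish faster.
        marginals.append((name, round(max(0.0, elapsed - previous), 3)))
        previous = elapsed

    total = sum(seconds for _, seconds in marginals)
    result = []
    for name, seconds in marginals:
        share = seconds / total if total else 0.0
        # Hypothetical thresholds: at least 50% of total is hot, at least 20% is warm.
        tier = "hot" if share >= 0.5 else "warm" if share >= 0.2 else "cool"
        result.append((name, seconds, tier))
    return result


rows = profile([("src", 0.4), ("joined", 1.1), ("final", 2.8)])
```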
diff --git a/generated/april-2026/2026-04-15-schedule-name-validation-and-send-now.md b/generated/april-2026/2026-04-15-schedule-name-validation-and-send-now.md new file mode 100644 index 0000000..5ffcaa7 --- /dev/null +++ b/generated/april-2026/2026-04-15-schedule-name-validation-and-send-now.md @@ -0,0 +1,13 @@ +--- +title: Schedule name validation and "Send Now" +date: 2026-04-15 +products: [snowflake-app, databricks-app] +tag: improved +emoji: ⏱️ +draft: true +description: Auto-generated schedule names now stay under the limit, and a Send Now option fires a scheduled report on demand. +--- + +Auto-generated schedule names that ran past the 255-character limit no longer fail silently — names are truncated to fit, and the schedule form blocks names over the limit with a clear error instead of a backend rejection. + +The schedule "More options" menu also picks up a **Send Now** action, so a report can be fired on demand without waiting for the next scheduled run. diff --git a/generated/april-2026/2026-04-15-schema-yaml-hovers-and-codelenses.md b/generated/april-2026/2026-04-15-schema-yaml-hovers-and-codelenses.md new file mode 100644 index 0000000..b8249fd --- /dev/null +++ b/generated/april-2026/2026-04-15-schema-yaml-hovers-and-codelenses.md @@ -0,0 +1,13 @@ +--- +title: Model hovers and Run/Test codelenses inside YAML schema files +date: 2026-04-15 +products: [dbt-power-user] +tag: improved +emoji: 🧰 +draft: true +description: Hover any model in `schema.yml` to see its description, columns, and types — and run or test it without switching to the SQL file. +--- + +YAML schema files in dbt Power User now behave like first-class model surfaces. Hover any model name in a `models:` section or source table in a `sources:` block to see the model's description, columns, and data types as a popup — no need to jump to the `.sql` file to remember what's in it. 
+ +Each model entry also gets Run, Test, and Document codelenses inline, so the most common actions are one click away from the schema file you're already editing. diff --git a/generated/april-2026/2026-04-21-data-parity-skill-mssql-fabric-clickhouse.md b/generated/april-2026/2026-04-21-data-parity-skill-mssql-fabric-clickhouse.md new file mode 100644 index 0000000..418d088 --- /dev/null +++ b/generated/april-2026/2026-04-21-data-parity-skill-mssql-fabric-clickhouse.md @@ -0,0 +1,13 @@ +--- +title: Data-parity diffs across SQL Server, Fabric, and ClickHouse +date: 2026-04-21 +products: [altimate-code] +tag: new +emoji: 🔁 +draft: true +description: Altimate Code can now diff data across SQL Server / Azure Fabric and ClickHouse with partition-aware execution and seven Azure AD auth flows. +--- + +Altimate Code's `data_diff` tool now handles three more warehouses end-to-end. SQL Server and Azure Fabric drop in with full T-SQL support — `TOP` injection, `sys.*` catalog queries, `DATETRUNC()` and `CONVERT(DATE, …, 23)` for date partitioning. Azure AD authentication covers seven flows (`default`, `password`, `access-token`, `service-principal-secret`, `msi-vm`, `msi-app-service`), with shorthand aliases (`cli`, `msi`, `service-principal`) for the common cases. + +The orchestrator that drives the diff is now a TypeScript layer that runs SQL tasks produced by the Rust state machine and feeds results back — so the algorithm and the database access stay independently swappable, and partitioned diffs run independently per partition before merging outcomes. 
diff --git a/generated/april-2026/2026-04-21-databricks-ai-gateway-as-llm-provider.md b/generated/april-2026/2026-04-21-databricks-ai-gateway-as-llm-provider.md new file mode 100644 index 0000000..88cdf23 --- /dev/null +++ b/generated/april-2026/2026-04-21-databricks-ai-gateway-as-llm-provider.md @@ -0,0 +1,13 @@ +--- +title: Databricks AI Gateway as an LLM provider +date: 2026-04-21 +products: [altimate-code, databricks-app] +tag: new +emoji: 🧱 +draft: true +description: Use any of 11 Databricks-hosted foundation models — Llama 3.1, Claude, GPT-5, Gemini, DBRX, Mixtral — as the backing model for Altimate Code. +--- + +Altimate Code now treats Databricks serving endpoints as a first-class LLM provider. Authenticate with a Databricks PAT in `host::token` format and the provider validates the workspace host for AWS, Azure, or GCP Databricks deployments, then resolves the workspace URL from the PAT or from `DATABRICKS_HOST` / `DATABRICKS_TOKEN` environment variables. + +Eleven foundation models register out of the box: Meta Llama 3.1 (405B / 70B / 8B), Claude Sonnet and Opus, GPT-5 variants, Gemini 3.1 Pro, DBRX Instruct, and Mixtral 8x7B. Request bodies are normalized between `max_completion_tokens` and `max_tokens` so each model gets the parameter shape it expects. diff --git a/generated/april-2026/2026-04-21-referral-program.md b/generated/april-2026/2026-04-21-referral-program.md new file mode 100644 index 0000000..e17d434 --- /dev/null +++ b/generated/april-2026/2026-04-21-referral-program.md @@ -0,0 +1,13 @@ +--- +title: Referral signups with bundled token credit +date: 2026-04-21 +products: [snowflake-app, databricks-app, datamates] +tag: new +emoji: 🎁 +draft: true +description: Sign up with a referral code to get a free token grant, and create codes from a new admin page with optional expiry and usage caps. +--- + +New signups can now redeem a referral code at registration. 
Valid codes unlock the sign-up form for personal email providers that the company-email gate normally blocks, while disposable and temp-mail domains stay rejected. Successful signups receive a community token grant credited to their wallet on email verification. + +Admins manage codes from a new `/referrals` page: generate a code with optional max-uses and expiry, see active/inactive/expired/exhausted status badges at a glance, and click any code to copy it. Codes can be auto-generated with high-entropy IDs or supplied manually for hand-tracked campaigns. diff --git a/generated/april-2026/2026-04-22-databricks-jobs-spark-users-deep-dive.md b/generated/april-2026/2026-04-22-databricks-jobs-spark-users-deep-dive.md new file mode 100644 index 0000000..3e30f08 --- /dev/null +++ b/generated/april-2026/2026-04-22-databricks-jobs-spark-users-deep-dive.md @@ -0,0 +1,13 @@ +--- +title: Databricks Jobs, Spark analysis, and Users deep-dive +date: 2026-04-22 +products: [databricks-app] +tag: new +emoji: 🧱 +draft: true +description: Run trends, Spark stage-level analysis, user attribution, and a dedicated Databricks summary. +--- + +The Databricks App now drills down to the rows that explain a job's cost: a Jobs page with success/failure run trends and per-job duration and cost; a Spark analysis surface with stage-level gantt charts, task drill-downs, and config tabs; and a Users page with per-user cost attribution and a summary chart. + +A Databricks-specific Summary page replaces the Snowflake layout for Databricks tenants — workspace cost moves to the top of the sidebar, and the navigation hides Snowflake-only sections that didn't apply. Open a job to follow it from run trends → stages → tasks → individual run, without losing the cost context on the way down. 
diff --git a/generated/april-2026/2026-04-22-datamate-skills-push-to-cursor-copilot-cline.md b/generated/april-2026/2026-04-22-datamate-skills-push-to-cursor-copilot-cline.md new file mode 100644 index 0000000..6fb0a07 --- /dev/null +++ b/generated/april-2026/2026-04-22-datamate-skills-push-to-cursor-copilot-cline.md @@ -0,0 +1,13 @@ +--- +title: Datamate Skills push to Cursor, Copilot, and Cline +date: 2026-04-22 +products: [datamates] +tag: new +emoji: 📚 +draft: true +description: Skills configured on a Teammate now deliver as Cursor `.mdc`, Copilot `.instructions.md`, or Cline `.clinerules/skills//SKILL.md` files to your workspace automatically. +--- + +Datamate Skills are push-based markdown instructions that tell an AI agent when and how to use Datamate MCP tools. From April, the MCP server extension reads each Teammate's `skills[]` from the API and writes them as instruction files into the workspace in the right format for whichever IDE you're running. + +Cursor gets `.mdc` files with conditional activation (`alwaysApply` and globs); Copilot gets `.instructions.md`; Cline picks up native `.clinerules/skills//SKILL.md` files with YAML frontmatter, matching Claude Code's layout. Custom skills slug their file names with an ID suffix so two skills with the same display name don't collide. The DatamateCard shows the custom-skill count alongside Assists and Guardrails, and `always_active` skills attach to every conversation regardless of context. diff --git a/generated/april-2026/2026-04-22-sso-inactivity-timeout.md b/generated/april-2026/2026-04-22-sso-inactivity-timeout.md new file mode 100644 index 0000000..6124a5b --- /dev/null +++ b/generated/april-2026/2026-04-22-sso-inactivity-timeout.md @@ -0,0 +1,13 @@ +--- +title: SSO inactivity timeout (per-tenant) +date: 2026-04-22 +products: [snowflake-app, databricks-app, datamates] +tag: new +emoji: 🔐 +draft: true +description: Auto-log-out SSO users after N minutes of inactivity, configured per tenant. 
+--- + +Tenants can now set a hard inactivity timeout for SSO users. The new `ssoUserRefreshMins` tenant setting, when positive, logs SSO users out after that many minutes of inactivity — either when the tab is closed and reopened, or when the cursor leaves the window long enough. + +The control is per-tenant: tenants with stricter compliance requirements can pick a tight window without affecting tenants that don't need one. Default behavior is unchanged (no forced timeout) when the setting is unset. diff --git a/generated/april-2026/2026-04-23-altimate-code-chat-in-dbt-power-user.md b/generated/april-2026/2026-04-23-altimate-code-chat-in-dbt-power-user.md new file mode 100644 index 0000000..d210e53 --- /dev/null +++ b/generated/april-2026/2026-04-23-altimate-code-chat-in-dbt-power-user.md @@ -0,0 +1,13 @@ +--- +title: Altimate Code chat lives inside dbt Power User +date: 2026-04-23 +products: [dbt-power-user, altimate-code] +tag: new +emoji: ✨ +draft: true +description: Execute, Explain, Optimize, Document, and Review codelenses on every SQL file — and a Troubleshoot button on failed queries that opens Altimate Code with the SQL and error preloaded. +--- + +dbt Power User now has Altimate Code chat actions on every SQL and YAML file. Inline codelenses for **Execute**, **Explain**, **Optimize**, **Document**, and **Review** sit at the top of each file, and the same actions appear in the editor and explorer context menus and the editor title bar. The Review action only surfaces when git detects unstaged changes, so it's quiet on clean files. + +When a query fails in the Query Results panel, the new **Troubleshoot with Altimate Code** button opens an Altimate Code chat with the compiled SQL and error message already in the prompt — one click from "this query errored" to a working conversation about what went wrong. 
diff --git a/generated/april-2026/2026-04-23-risingwave-adapter-support.md b/generated/april-2026/2026-04-23-risingwave-adapter-support.md new file mode 100644 index 0000000..7baaec1 --- /dev/null +++ b/generated/april-2026/2026-04-23-risingwave-adapter-support.md @@ -0,0 +1,11 @@ +--- +title: RisingWave adapter support during onboarding +date: 2026-04-23 +products: [dbt-power-user] +tag: new +emoji: 🌊 +draft: true +description: Pick `risingwave` from the adapter list and the extension installs `dbt-risingwave` for you. +--- + +dbt Power User's onboarding walkthrough now lists RisingWave as a supported adapter. Pick it and the extension installs `dbt-risingwave` for you and configures the project to treat it as a Postgres-compatible dialect — no manual `pip install` or path fiddling. diff --git a/generated/april-2026/2026-04-28-share-studio-to-slack-via-webhook.md b/generated/april-2026/2026-04-28-share-studio-to-slack-via-webhook.md new file mode 100644 index 0000000..77e5ed9 --- /dev/null +++ b/generated/april-2026/2026-04-28-share-studio-to-slack-via-webhook.md @@ -0,0 +1,13 @@ +--- +title: Share Studio sessions to Slack +date: 2026-04-28 +products: [snowflake-app, databricks-app] +tag: new +emoji: 💬 +draft: true +description: Send a Studio session or scheduled report into any Slack channel using an incoming webhook — no OAuth, no admin setup. +--- + +Studio's Share dialog now has a Slack tab. Paste an incoming-webhook URL once and the same session, with answer and inline charts, posts into the destination channel. The Schedule dialog supports the same webhook target, so recurring reports can land in a Slack channel on a cadence without anyone configuring an integration. + +No OAuth, no bot tokens, no settings page — the share flow reuses the same notification webhook plumbing that drives alerts and scheduled reports. 
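For readers unfamiliar with incoming webhooks, the sketch below shows the general shape of posting a message to one from Python using only the standard library. It is not Studio's code: `build_payload`, `post_to_slack`, and the message layout are hypothetical, and the only behavior taken from Slack itself is that an incoming webhook accepts an HTTP POST with a JSON body containing a `text` field.

```python
import json
import urllib.request


def build_payload(title: str, answer: str) -> bytes:
    """Encode a minimal Slack message body (hypothetical layout)."""
    message = {"text": f"*{title}*\n{answer}"}
    return json.dumps(message).encode("utf-8")


def post_to_slack(webhook_url: str, title: str, answer: str) -> int:
    """POST the message to an incoming-webhook URL and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=build_payload(title, answer),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Keeping payload construction separate from the network call makes the message format testable without a live webhook URL.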
diff --git a/generated/april-2026/2026-04-29-custom-date-range-on-workloads.md b/generated/april-2026/2026-04-29-custom-date-range-on-workloads.md new file mode 100644 index 0000000..9c36a32 --- /dev/null +++ b/generated/april-2026/2026-04-29-custom-date-range-on-workloads.md @@ -0,0 +1,11 @@ +--- +title: Pick any date range on workloads pages +date: 2026-04-29 +products: [snowflake-app, databricks-app] +tag: improved +emoji: 📆 +draft: true +description: Calendar date selection now works across every workloads page — Snowflake Jobs, dbt, Notebooks, Stored Procedures, Custom Workloads, and Streamlit. +--- + +Every workloads page now supports custom date ranges from the calendar picker. Snowflake Jobs, dbt, Notebooks, Stored Procedures, Custom Workloads, and Streamlit all accept any start/end pair instead of forcing you onto the preset 7/30/90-day buckets. diff --git a/scripts/build_curation.py b/scripts/build_curation.py new file mode 100644 index 0000000..2967e77 --- /dev/null +++ b/scripts/build_curation.py @@ -0,0 +1,153 @@ +#!/usr/bin/env python3 +"""Generate `data/curation.md` — a checklist for picking which features get entries. + +For each feature group from data/features.json, emit: + - [ ] **Title** _(date, repos, PRs, Jira, link)_ + +Strong candidates are pre-marked `[x]`: + - cross-repo (backend + frontend = paired customer-facing feature) + - has a Jira ID (tracked work, usually substantive) + - has 3+ PRs (substantial multi-PR effort) + +Grouped by product → month (newest first). Unclassified features are last. + +Usage: + python scripts/build_curation.py + +Workflow: + 1. Run this script → review data/curation.md + 2. Edit data/curation.md: toggle `[x]`/`[ ]` to choose what gets an entry + 3. 
Run scripts/polish_entries.py → drafts land in generated/ +""" + +from __future__ import annotations + +import json +from collections import defaultdict +from pathlib import Path + +ROOT = Path(__file__).resolve().parent.parent +FEATURES_PATH = ROOT / "data" / "features.json" +OUT_PATH = ROOT / "data" / "curation.md" + +PRODUCT_ORDER = [ + "snowflake-app", + "databricks-app", + "dbt-power-user", + "datamates", + "altimate-code", +] + + +def is_strong_candidate(feat: dict) -> bool: + if len(feat["repos"]) > 1: + return True + if feat["jira_ids"]: + return True + if feat["pr_count"] >= 3: + return True + return False + + +def feature_line(feat: dict) -> str: + marker = "[x]" if is_strong_candidate(feat) else "[ ]" + date = (feat.get("latest_published_at") or "")[:10] + repos = "+".join(r.replace("altimate-", "")[:2] for r in feat["repos"]) + jira = " ".join(feat["jira_ids"]) if feat["jira_ids"] else "" + pr_links = ", ".join( + f"[{p['repo'].replace('altimate-', '')}#{p['number']}]({p['url']})" + for p in feat["prs"][:3] + ) + extra = f", {jira}" if jira else "" + fid = feat["id"] + title = feat["title"].strip() + return ( + f"- {marker} `{fid}` **{title}** " + f"_(date: {date}, repos: {repos}{extra}, PRs: {feat['pr_count']}, " + f"src: {pr_links})_" + ) + + +def main() -> None: + data = json.loads(FEATURES_PATH.read_text()) + features = data["features"] + + by_product: dict[str, dict[str, list[dict]]] = defaultdict(lambda: defaultdict(list)) + unclassified_by_month: dict[str, list[dict]] = defaultdict(list) + + for f in features: + month = (f.get("latest_published_at") or "0000-00")[:7] + if not f["products"]: + unclassified_by_month[month].append(f) + continue + # Place feature under EACH of its products (so a cross-cutting feature + # appears in both sections — curator marks it once in whichever section + # makes most sense; duplicate marks are deduped at polish time). 
+ for product in f["products"]: + by_product[product][month].append(f) + + lines: list[str] = [] + lines.append("# Changelog curation") + lines.append("") + lines.append( + f"_{len(features)} feature groups from {data['total_pr_count']} PRs " + f"(filtered: feat: prefix only, internal/CI/test removed)._" + ) + lines.append("") + lines.append("## How to use") + lines.append("") + lines.append( + "1. Toggle `[x]` next to features you want as customer changelog entries.\n" + "2. Strong candidates (cross-repo, Jira-tracked, or multi-PR) are pre-marked.\n" + "3. Run `python scripts/polish_entries.py` to emit polished drafts to `generated/`.\n" + "4. The same feature appearing under multiple products is deduped by id — mark once." + ) + lines.append("") + lines.append(f"**Pre-marked strong candidates: " + f"{sum(1 for f in features if is_strong_candidate(f))} / {len(features)}**") + lines.append("") + lines.append("---") + lines.append("") + + for product in PRODUCT_ORDER: + months = by_product.get(product, {}) + if not months: + continue + total = sum(len(v) for v in months.values()) + lines.append(f"## {product} ({total})") + lines.append("") + for month in sorted(months.keys(), reverse=True): + lines.append(f"### {month}") + lines.append("") + for feat in months[month]: + lines.append(feature_line(feat)) + lines.append("") + + if unclassified_by_month: + total = sum(len(v) for v in unclassified_by_month.values()) + lines.append(f"## unclassified ({total})") + lines.append("") + lines.append( + "_These didn't match any product keyword. 
Mostly internal infra " + "(LLM gateway, PostHog tracking, dev tooling) — but skim for missed " + "customer features._" + ) + lines.append("") + for month in sorted(unclassified_by_month.keys(), reverse=True): + lines.append(f"### {month}") + lines.append("") + for feat in unclassified_by_month[month]: + lines.append(feature_line(feat)) + lines.append("") + + OUT_PATH.write_text("\n".join(lines) + "\n") + print(f"Wrote {OUT_PATH}") + print(f" total features: {len(features)}") + print( + f" pre-marked: " + f"{sum(1 for f in features if is_strong_candidate(f))}" + ) + + +if __name__ == "__main__": + main() diff --git a/scripts/build_features.py b/scripts/build_features.py new file mode 100644 index 0000000..9d5f812 --- /dev/null +++ b/scripts/build_features.py @@ -0,0 +1,411 @@ +#!/usr/bin/env python3 +"""Group PRs into feature units for the changelog. + +Reads: + data/releases.json + data/prs.json (produced by fetch_pr_bodies.py) + +Writes: + data/features.json — one entry per feature group, deduped across repos + data/features.md — human-readable summary for review + +Pipeline: + 1. For each PR: classify products, detect Jira ID + paired-PR refs + 2. Drop trivial PRs (chore/test/ci/docs/refactor/dependabot) + 3. Group PRs: by Jira ID, else by paired-PR ref, else singleton + 4. Attach release metadata (which tag(s) shipped each PR) + 5. 
Emit features.json sorted by latest ship date desc +""" + +from __future__ import annotations + +import json +import re +from collections import defaultdict +from pathlib import Path + +ROOT = Path(__file__).resolve().parent.parent +RELEASES_PATH = ROOT / "data" / "releases.json" +PRS_PATH = ROOT / "data" / "prs.json" +FEATURES_JSON = ROOT / "data" / "features.json" +FEATURES_MD = ROOT / "data" / "features.md" + +# ---------- classification ---------- + +PRODUCT_RULES = [ + # order matters: first match wins on title; body adds extras + ("databricks-app", re.compile( + r"(?i)\b(databricks|\bdbx\b|dbx[_\s\-]|/dbx/|\bdbr\b|" + r"\bspark\b|\bphoton\b|auto[\s-]?tune|databricks_jobs|workspace_id|" + r"unity[\s-]?catalog|all[\s-]?purpose[\s-]?cluster|job[\s-]?cluster|" + r"sku[\s-]?cost|dbu\b|dbus\b)" + )), + ("snowflake-app", re.compile( + r"(?i)\b(snowflake|snowflake_jobs|\bsnow\b|warehouse|" + r"tableau|bi[\s_-]?dashboard|looker|powerbi|power[\s-]?bi|" + r"query[\s_-]?usage|query[\s_-]?tag|query[\s_-]?routing|" + r"sf[\s-]copilot|cortex|account[\s-]?usage)" + )), + ("dbt-power-user", re.compile( + r"(?i)\b(dbt[\s_-]power[\s_-]user|dbt[\s-]?powe[r]?|\bdbt[\s_-]model|" + r"dbt_model|dbt_cloud|dbt[\s-]cloud|dbt[\s-]docs|dbt[\s-]core|" + r"dbt[\s-]project|dbt[\s-]profile|datapilot)" + )), + ("altimate-code", re.compile( + r"(?i)\b(altimate[\s-]?code|vscode[\s-]?extension|code[\s-]?extension|" + r"ide[\s-]?integration|cursor[\s-]?integration|vsx|vs[\s-]?code)" + )), + ("datamates", re.compile( + r"(?i)\b(datamates|data[\s-]?mate|\blineage\b|\bcatalog\b|" + r"column[\s-]?lineage|glossary|metadata[\s-]?explorer)" + )), +] + +# Studio is shared platform — mark it as cross-product for both major apps +STUDIO_RE = re.compile( + r"(?i)\b(studio|knowledge[\s-]?engine|prompt[\s-]?library|" + r"agent[\s-]?ops|subscription)" +) + +# Reach for default product when nothing else hits. Many backend PRs touch +# shared infra; we'll mark these as "unclassified" so the curator can decide. 
+DEFAULT_PRODUCT = None + +# ---------- trivial filter ---------- + +TRIVIAL_TITLE_PREFIXES = ( + "chore:", "test:", "tests:", "ci:", "docs:", "doc:", "build:", "perf:", + "style:", "refactor:", "revert:", +) +TRIVIAL_TITLE_SUBSTRINGS = ( + "bump ", "dependabot", "[skip ci]", "merge branch", + "release ", "release:", +) +TRIVIAL_AUTHORS = { + "dependabot[bot]", "renovate[bot]", "github-actions[bot]", + "altimate-harness-bot[bot]", "altimate-bot[bot]", +} +# Words/phrases that signal an internal-only PR (CI, tests, infra, observability, +# logs, regression suites, alembic, etc.) — drop these from the customer changelog. +INTERNAL_ONLY_RE = re.compile( + r"(?i)\b(" + r"langfuse|observability|telemetry|metric[s]?[\s-]?endpoint|" + r"sentry|datadog|signoz|logging|loguru|" + r"alembic|migration[s]?\b|schema[\s-]?migration|" + r"regression[\s-]?test|flaky|smoke[\s-]?test|integration[\s-]?test|" + r"\bci\b|github[\s-]?action|workflow[\s-]?(file|yaml|yml)|" + r"typescript[\s-]?error|typecheck|tsconfig|eslint|prettier|" + r"dependabot|vanta|soc2|sast|snyk|" + r"unit[\s-]?test|jest[\s-]?test|pytest|" + r"backfill|ingestion[\s-]?pipeline|dbt[\s-]?build|" + r"warehouse[\s-]?adapter|connection[\s-]?pool|" + r"\bmcp[\s-]?(server|tool|engine)|claude[\s-]?code|" + r"refactor|cleanup|dead[\s-]?code|unused[\s-]?import" + r")\b" +) + +# ---------- extraction ---------- + +JIRA_RE = re.compile(r"\b(AI-\d{2,5})\b") +PAIRED_PR_RE = re.compile( + r"AltimateAI/(altimate-(?:backend|frontend))(?:#|/pull/)(\d+)" +) +CONVENTIONAL_PREFIX_RE = re.compile( + r"^\s*(feat|fix|chore|test|ci|docs|build|perf|refactor|revert|style)" + r"(?:\([^)]*\))?\s*:\s*", + re.IGNORECASE, +) + + +def classify_products(title: str, body: str, labels: list[dict]) -> list[str]: + text = f"{title}\n{body or ''}" + products: list[str] = [] + for product, pat in PRODUCT_RULES: + if pat.search(text): + if product not in products: + products.append(product) + # Studio touches both major apps — assign cross-cutting if no 
specific app matched + if STUDIO_RE.search(text): + if not products: + products = ["snowflake-app", "databricks-app"] + # Label hints from GitHub + label_names = " ".join((l.get("name") or "").lower() for l in (labels or [])) + if "databricks" in label_names and "databricks-app" not in products: + products.append("databricks-app") + if "snowflake" in label_names and "snowflake-app" not in products: + products.append("snowflake-app") + return products + + +def is_trivial(title: str, body: str, author: str) -> bool: + t = (title or "").strip() + tl = t.lower() + if author in TRIVIAL_AUTHORS: + return True + for p in TRIVIAL_TITLE_PREFIXES: + if tl.startswith(p): + return True + for s in TRIVIAL_TITLE_SUBSTRINGS: + if s in tl: + return True + # Internal-only signal: title or body matches infra/CI/observability keywords + # AND title doesn't contain a clear user-visible signal + text = f"{title}\n{body or ''}" + if INTERNAL_ONLY_RE.search(t): + return True + return False + + +def clean_title(title: str) -> str: + t = CONVENTIONAL_PREFIX_RE.sub("", title or "").strip() + # drop [AI-1234] / [AI-0000] + t = re.sub(r"\[AI-\d{2,5}\]\s*", "", t) + # drop trailing PR number + t = re.sub(r"\s*\(#\d+\)\s*$", "", t) + return t.strip() + + +def extract_jira_ids(title: str, body: str) -> list[str]: + ids = set() + for s in (title, body or ""): + for m in JIRA_RE.findall(s or ""): + if m.upper() != "AI-0000": + ids.add(m.upper()) + return sorted(ids) + + +def extract_paired_prs(body: str) -> list[tuple[str, int]]: + out = [] + for repo, num in PAIRED_PR_RE.findall(body or ""): + out.append((repo, int(num))) + return out + + +# ---------- pipeline ---------- + +def load_data(): + releases = json.loads(RELEASES_PATH.read_text()) + prs = json.loads(PRS_PATH.read_text()) + return releases, prs + + +def build_pr_index(releases) -> dict[tuple[str, int], dict]: + """Map (short_repo, pr_number) -> {tag, published_at}.""" + idx = {} + for short_name, info in releases["repos"].items(): + for 
r in info["releases"]: + for pr in r["prs"]: + key = (short_name, pr["number"]) + # Keep the EARLIEST release that shipped this PR (chronologically first) + if key not in idx or r["published_at"] < idx[key]["published_at"]: + idx[key] = { + "tag": r["tag"], + "published_at": r["published_at"], + } + return idx + + +def main() -> None: + releases, prs = load_data() + pr_release_idx = build_pr_index(releases) + + # Build enriched PR list + pr_records = [] + for key, pr in prs.items(): + if pr.get("_error"): + continue + short_repo, number_s = key.split(":", 1) + number = int(number_s) + title = pr.get("title") or "" + body = pr.get("body") or "" + author_obj = pr.get("author") or {} + author = author_obj.get("login") or "" + labels = pr.get("labels") or [] + + prefix_match = CONVENTIONAL_PREFIX_RE.match(title) + prefix = (prefix_match.group(1).lower() if prefix_match else "").strip() + + rec = { + "key": key, + "repo": short_repo, + "number": number, + "url": pr.get("url"), + "title_raw": title, + "title": clean_title(title), + "body": body, + "author": author, + "labels": [l.get("name") for l in labels], + "merged_at": pr.get("mergedAt"), + "prefix": prefix, # feat / fix / chore / ... 
+ "products": classify_products(title, body, labels), + "jira_ids": extract_jira_ids(title, body), + "paired": extract_paired_prs(body), + "trivial": is_trivial(title, body, author), + } + rec.update(pr_release_idx.get((short_repo, number), {})) + pr_records.append(rec) + + # Sort by ship date desc (newest first) + pr_records.sort(key=lambda r: r.get("published_at") or "", reverse=True) + + # --- Group into feature units --- + # Union-find over PR keys + parent: dict[str, str] = {} + + def find(x): + while parent.setdefault(x, x) != x: + parent[x] = parent[parent[x]] + x = parent[x] + return x + + def union(a, b): + ra, rb = find(a), find(b) + if ra != rb: + parent[rb] = ra + + by_key = {r["key"]: r for r in pr_records} + + # 1) Group by Jira ID + jira_to_keys: dict[str, list[str]] = defaultdict(list) + for r in pr_records: + for jid in r["jira_ids"]: + jira_to_keys[jid].append(r["key"]) + for keys in jira_to_keys.values(): + if len(keys) > 1: + base = keys[0] + for k in keys[1:]: + union(base, k) + + # 2) Group by paired PR mention (e.g., body mentions altimate-frontend#1234) + for r in pr_records: + for repo_paired, num_paired in r["paired"]: + other_key = f"{repo_paired}:{num_paired}" + if other_key in by_key: + union(r["key"], other_key) + + # Bucket by root + groups: dict[str, list[dict]] = defaultdict(list) + for r in pr_records: + root = find(r["key"]) + groups[root].append(r) + + # Build feature records + features = [] + for root, members in groups.items(): + # Drop groups that are entirely trivial + non_trivial = [m for m in members if not m["trivial"]] + if not non_trivial: + continue + # Require at least one feat: PR — fixes are too noisy for a changelog + has_feat = any(m["prefix"] == "feat" for m in non_trivial) + if not has_feat: + continue + members.sort(key=lambda m: m.get("published_at") or "", reverse=True) + latest = members[0] + # Union products across all members + prods: list[str] = [] + for m in members: + for p in m["products"]: + if p not 
in prods: + prods.append(p) + # Union jira ids + jiras: list[str] = [] + for m in members: + for j in m["jira_ids"]: + if j not in jiras: + jiras.append(j) + # Pick representative title — prefer longest non-trivial cleaned title + title_pick = max( + (m["title"] for m in non_trivial if m["title"]), + key=lambda t: len(t), + default=latest["title"], + ) + # Cross-repo flag + repos_involved = sorted({m["repo"] for m in members}) + features.append({ + "id": root, + "title": title_pick, + "products": prods, + "jira_ids": jiras, + "latest_published_at": latest.get("published_at"), + "latest_tag": latest.get("tag"), + "repos": repos_involved, + "pr_count": len(members), + "trivial_pr_count": sum(1 for m in members if m["trivial"]), + "prs": [ + { + "repo": m["repo"], + "number": m["number"], + "title": m["title"], + "title_raw": m["title_raw"], + "author": m["author"], + "products": m["products"], + "url": m["url"], + "trivial": m["trivial"], + "published_at": m.get("published_at"), + "tag": m.get("tag"), + } + for m in members + ], + }) + + # Sort features by latest ship date desc + features.sort(key=lambda f: f.get("latest_published_at") or "", reverse=True) + + FEATURES_JSON.parent.mkdir(parents=True, exist_ok=True) + FEATURES_JSON.write_text(json.dumps({ + "feature_count": len(features), + "total_pr_count": sum(f["pr_count"] for f in features), + "trivial_pr_count": sum(f["trivial_pr_count"] for f in features), + "features": features, + }, indent=2) + "\n") + + # Human-readable summary + lines = [ + f"# Feature plan — {len(features)} feature groups", + "", + f"_Generated from {sum(f['pr_count'] for f in features)} PRs " + f"({sum(f['trivial_pr_count'] for f in features)} trivial, filtered)._", + "", + ] + # Group features by month + by_month: dict[str, list[dict]] = defaultdict(list) + for f in features: + m = (f.get("latest_published_at") or "0000-00")[:7] + by_month[m].append(f) + for month in sorted(by_month.keys(), reverse=True): + lines.append(f"## {month}") + 
lines.append("") + for f in by_month[month]: + prods = ", ".join(f["products"]) or "(unclassified)" + jira = " ".join(f["jira_ids"]) if f["jira_ids"] else "" + repos = "+".join(r.replace("altimate-", "") for r in f["repos"]) + lines.append( + f"- **{f['title']}** " + f"_(repos: {repos}, products: {prods}{', ' + jira if jira else ''}, " + f"PRs: {f['pr_count']}, tag: {f['latest_tag']})_" + ) + lines.append("") + FEATURES_MD.write_text("\n".join(lines) + "\n") + + # Stats + unclassified = sum(1 for f in features if not f["products"]) + cross_repo = sum(1 for f in features if len(f["repos"]) > 1) + print(f"features: {len(features)}") + print(f" cross-repo (deduped): {cross_repo}") + print(f" unclassified: {unclassified}") + print(f" trivial PRs filtered: {sum(f['trivial_pr_count'] for f in features)}") + by_prod = defaultdict(int) + for f in features: + for p in f["products"]: + by_prod[p] += 1 + if not f["products"]: + by_prod["(unclassified)"] += 1 + for p, n in sorted(by_prod.items(), key=lambda kv: -kv[1]): + print(f" {p}: {n}") + print(f"\nWrote {FEATURES_JSON}") + print(f"Wrote {FEATURES_MD}") + + +if __name__ == "__main__": + main() diff --git a/scripts/fetch_pr_bodies.py b/scripts/fetch_pr_bodies.py new file mode 100644 index 0000000..6535caf --- /dev/null +++ b/scripts/fetch_pr_bodies.py @@ -0,0 +1,129 @@ +#!/usr/bin/env python3 +"""Fetch title/body/labels/closedAt for every PR referenced in data/releases.json. + +Resumable — already-fetched PRs are skipped. Writes to data/prs.json keyed by +":". 
+ +Usage: + python scripts/fetch_pr_bodies.py [--workers N] +""" + +from __future__ import annotations + +import argparse +import json +import subprocess +import sys +import time +from concurrent.futures import ThreadPoolExecutor, as_completed +from pathlib import Path + +ROOT = Path(__file__).resolve().parent.parent +DEFAULT_RELEASES_PATH = ROOT / "data" / "releases.json" +DEFAULT_PRS_PATH = ROOT / "data" / "prs.json" + +REPO_MAP = { + "altimate-backend": "AltimateAI/altimate-backend", + "altimate-frontend": "AltimateAI/altimate-frontend", + "vscode-dbt-power-user": "AltimateAI/vscode-dbt-power-user", + "altimate-code": "AltimateAI/altimate-code", + "altimate-mcp-engine": "AltimateAI/altimate-mcp-engine", + "vscode-altimate-mcp-server": "AltimateAI/vscode-altimate-mcp-server", + "altimate-dbt-snowflake-query-tags": "AltimateAI/altimate-dbt-snowflake-query-tags", +} + + +def fetch_pr(repo: str, number: int) -> dict | None: + cmd = [ + "gh", "pr", "view", str(number), + "--repo", repo, + "--json", "number,title,body,labels,author,closedAt,mergedAt,state,url", + ] + result = subprocess.run(cmd, capture_output=True, text=True, check=False) + if result.returncode != 0: + sys.stderr.write(f" ! 
{repo}#{number}: {result.stderr.strip()[:120]}\n") + return None + return json.loads(result.stdout) + + +def main() -> None: + parser = argparse.ArgumentParser() + parser.add_argument("--workers", type=int, default=8) + parser.add_argument("--releases", default=str(DEFAULT_RELEASES_PATH)) + parser.add_argument("--out", default=str(DEFAULT_PRS_PATH)) + args = parser.parse_args() + + RELEASES_PATH = Path(args.releases) + PRS_PATH = Path(args.out) + + releases_data = json.loads(RELEASES_PATH.read_text()) + + # Gather all (short_repo, repo, number) triples + targets: list[tuple[str, str, int]] = [] + for short_name, info in releases_data["repos"].items(): + repo = REPO_MAP[short_name] + for r in info["releases"]: + for pr in r["prs"]: + targets.append((short_name, repo, pr["number"])) + + # Dedupe (PR can appear in only one release, but safe-guard) + seen = set() + unique = [] + for t in targets: + key = (t[0], t[2]) + if key in seen: + continue + seen.add(key) + unique.append(t) + + # Load existing cache + cache: dict[str, dict] = {} + if PRS_PATH.exists(): + cache = json.loads(PRS_PATH.read_text()) + + to_fetch = [ + (short, repo, num) for short, repo, num in unique + if f"{short}:{num}" not in cache + ] + sys.stderr.write( + f"Total PRs: {len(unique)}; cached: {len(cache)}; to fetch: {len(to_fetch)}\n" + ) + + if not to_fetch: + sys.stderr.write("Nothing to fetch.\n") + return + + start = time.time() + completed = 0 + save_every = 50 + + def task(t): + short, repo, num = t + data = fetch_pr(repo, num) + return short, num, data + + with ThreadPoolExecutor(max_workers=args.workers) as ex: + futures = [ex.submit(task, t) for t in to_fetch] + for fut in as_completed(futures): + short, num, data = fut.result() + key = f"{short}:{num}" + cache[key] = data or {"_error": True, "number": num, "repo": short} + completed += 1 + if completed % save_every == 0: + PRS_PATH.write_text(json.dumps(cache, indent=2) + "\n") + rate = completed / (time.time() - start) + eta = 
(len(to_fetch) - completed) / rate if rate else 0 + sys.stderr.write( + f" [{completed}/{len(to_fetch)}] " + f"rate={rate:.1f}/s eta={eta:.0f}s\n" + ) + + PRS_PATH.write_text(json.dumps(cache, indent=2) + "\n") + elapsed = time.time() - start + sys.stderr.write( + f"Done. Fetched {completed} PRs in {elapsed:.0f}s. Cache: {PRS_PATH}\n" + ) + + +if __name__ == "__main__": + main() diff --git a/scripts/fetch_releases.py b/scripts/fetch_releases.py new file mode 100644 index 0000000..35309eb --- /dev/null +++ b/scripts/fetch_releases.py @@ -0,0 +1,185 @@ +#!/usr/bin/env python3 +"""Fetch GitHub releases from every repo listed in REPOS. + +Usage: + python scripts/fetch_releases.py [--since YYYY-MM-DD] [--out data/releases.json] + +Requires `gh` CLI authenticated with access to AltimateAI org. +""" + +from __future__ import annotations + +import argparse +import json +import re +import subprocess +import sys +from datetime import datetime, timedelta, timezone +from pathlib import Path + +REPOS = [ + # (short_name, repo_full_name, primary_product_slug_or_None) + ("altimate-backend", "AltimateAI/altimate-backend", None), + ("altimate-frontend", "AltimateAI/altimate-frontend", None), + ("vscode-dbt-power-user", "AltimateAI/vscode-dbt-power-user", "dbt-power-user"), + ("altimate-code", "AltimateAI/altimate-code", "altimate-code"), + ("altimate-mcp-engine", "AltimateAI/altimate-mcp-engine", "datamates"), + ("vscode-altimate-mcp-server", "AltimateAI/vscode-altimate-mcp-server", "datamates"), + ("altimate-dbt-snowflake-query-tags", + "AltimateAI/altimate-dbt-snowflake-query-tags", "datamates"), +] + +PR_REF_RE = re.compile(r"https://github\.com/AltimateAI/[\w-]+/pull/(\d+)") +# Format A (autogen): "* title by @author in https://github.com/.../pull/123" +PR_LINE_A_RE = re.compile( + r"^\*\s+(?P<title>.*?)\s+by\s+@(?P<author>[\w\-\[\]]+)\s+in\s+" + r"https://github\.com/AltimateAI/[\w-]+/pull/(?P<num>\d+)\s*$" +) +# Format B (git-cliff style): "- [sha ]?title (#123) 
[(sha)]?" +PR_LINE_B_RE = re.compile( + r"^-\s+(?:[0-9a-f]{7,40}\s+)?(?P<title>.+?)\s+\(#(?P<num>\d+)\)" + r"(?:\s+\([0-9a-f]{7,40}\))?\s*$" +) + + +def run_gh(args: list[str], repo: str) -> str: + cmd = ["gh", *args, "--repo", repo] + result = subprocess.run(cmd, capture_output=True, text=True, check=False) + if result.returncode != 0: + sys.stderr.write(f"gh failed: {' '.join(cmd)}\n{result.stderr}\n") + sys.exit(1) + return result.stdout + + +def list_releases(repo: str, since: datetime) -> list[dict]: + raw = run_gh( + [ + "release", "list", + "--limit", "500", + "--json", "tagName,name,publishedAt,isPrerelease,isDraft", + ], + repo, + ) + items = json.loads(raw) + out = [] + for r in items: + published = datetime.fromisoformat(r["publishedAt"].replace("Z", "+00:00")) + if published < since: + continue + out.append(r) + return out + + +def fetch_release_body(repo: str, tag: str) -> str: + raw = run_gh( + ["release", "view", tag, "--json", "body"], + repo, + ) + return json.loads(raw).get("body", "") or "" + + +def parse_prs(body: str) -> list[dict]: + prs = [] + seen = set() + for line in body.splitlines(): + s = line.strip() + m = PR_LINE_A_RE.match(s) + if m: + num = int(m.group("num")) + if num in seen: + continue + seen.add(num) + prs.append({ + "number": num, + "title": m.group("title").strip(), + "author": m.group("author"), + }) + continue + m = PR_LINE_B_RE.match(s) + if m: + num = int(m.group("num")) + if num in seen: + continue + seen.add(num) + prs.append({ + "number": num, + "title": m.group("title").strip(), + "author": None, + }) + return prs + + +def main() -> None: + parser = argparse.ArgumentParser() + parser.add_argument( + "--since", + default=(datetime.now(timezone.utc) - timedelta(days=183)).date().isoformat(), + help="ISO date; releases published on or after this date are included (default: ~6 months ago).", + ) + parser.add_argument( + "--out", + default="data/releases.json", + help="Output path relative to repo root.", + ) + 
parser.add_argument( + "--include-freemium", + action="store_true", + help="Include `freemium-*` tags from the frontend repo (excluded by default).", + ) + args = parser.parse_args() + + since = datetime.fromisoformat(args.since).replace(tzinfo=timezone.utc) + repo_root = Path(__file__).resolve().parent.parent + out_path = repo_root / args.out + out_path.parent.mkdir(parents=True, exist_ok=True) + + result = { + "generated_at": datetime.now(timezone.utc).isoformat(), + "since": since.date().isoformat(), + "include_freemium": args.include_freemium, + "repos": {}, + } + + for short_name, repo, primary_product in REPOS: + sys.stderr.write(f"Listing releases for {repo}...\n") + releases = list_releases(repo, since) + sys.stderr.write(f" {len(releases)} releases since {since.date()}\n") + + enriched = [] + for r in releases: + tag = r["tagName"] + if not args.include_freemium and tag.startswith("freemium-"): + continue + sys.stderr.write(f" fetching {tag}...\n") + body = fetch_release_body(repo, tag) + prs = parse_prs(body) + enriched.append({ + "tag": tag, + "name": r["name"], + "published_at": r["publishedAt"], + "is_prerelease": r["isPrerelease"], + "is_draft": r["isDraft"], + "body": body, + "prs": prs, + "pr_count": len(prs), + }) + + result["repos"][short_name] = { + "repo": repo, + "primary_product": primary_product, + "release_count": len(enriched), + "pr_count": sum(r["pr_count"] for r in enriched), + "releases": enriched, + } + + out_path.write_text(json.dumps(result, indent=2) + "\n") + sys.stderr.write(f"\nWrote {out_path}\n") + for short_name, data in result["repos"].items(): + sys.stderr.write( + f" {short_name}: {data['release_count']} releases, " + f"{data['pr_count']} PRs\n" + ) + + +if __name__ == "__main__": + main() diff --git a/scripts/generate_entries.py b/scripts/generate_entries.py new file mode 100644 index 0000000..0ca9903 --- /dev/null +++ b/scripts/generate_entries.py @@ -0,0 +1,202 @@ +#!/usr/bin/env python3 +"""Generate proposed changelog 
entries from data/features.json. + +Writes Markdown files to `generated/` (NOT `entries/`), with one file per +feature group. Frontmatter matches the schema in products.yml + STYLE.md. + +The generated content is a *draft* — titles/bodies pull from PR data and +will not match the polished customer-facing voice required by STYLE.md. +Use these as a starting point and edit/promote into entries/ manually. + +Usage: + python scripts/generate_entries.py [--limit N] [--sample] + --sample writes only 5 demo files for review. +""" + +from __future__ import annotations + +import argparse +import json +import re +from pathlib import Path + +ROOT = Path(__file__).resolve().parent.parent +FEATURES_PATH = ROOT / "data" / "features.json" +OUT_DIR = ROOT / "generated" + +# Map from internal product slugs (must match products.yml) +VALID_PRODUCTS = { + "dbt-power-user", "snowflake-app", "databricks-app", + "altimate-code", "datamates", +} + +EMOJI_BY_KEYWORD = [ + (re.compile(r"(?i)\b(cost|saving|spend|budget|billing|dbu)\b"), "💰"), + (re.compile(r"(?i)\b(alert|notification|email|slack|subscribe)\b"), "🔔"), + (re.compile(r"(?i)\b(filter|search|sort|column)\b"), "🔎"), + (re.compile(r"(?i)\b(dashboard|chart|graph|visual|insight)\b"), "📊"), + (re.compile(r"(?i)\b(auto[\s-]?tune|tune|optimi[sz]e|recommendation)\b"), "🤖"), + (re.compile(r"(?i)\b(query|sql|warehouse)\b"), "🗄️"), + (re.compile(r"(?i)\b(job|task|workload|schedule)\b"), "⏱️"), + (re.compile(r"(?i)\b(studio|prompt|agent|knowledge)\b"), "✨"), + (re.compile(r"(?i)\b(lineage|catalog|metadata)\b"), "🔗"), + (re.compile(r"(?i)\b(security|sso|auth|permission|access)\b"), "🔐"), + (re.compile(r"(?i)\b(databricks|spark|photon|dbx)\b"), "🧱"), + (re.compile(r"(?i)\b(snowflake)\b"), "❄️"), + (re.compile(r"(?i)\b(dbt)\b"), "🧰"), +] +DEFAULT_EMOJI = "🚀" + + +def pick_emoji(text: str) -> str: + for pat, emoji in EMOJI_BY_KEYWORD: + if pat.search(text): + return emoji + return DEFAULT_EMOJI + + +def slugify(title: str) -> str: + s = 
title.lower() + s = re.sub(r"[^a-z0-9\s-]+", "", s) + s = re.sub(r"\s+", "-", s).strip("-") + return s[:60] or "untitled" + + +def first_paragraph(body: str) -> str: + """Pull the first meaningful paragraph from a PR body.""" + if not body: + return "" + cleaned = re.sub(r"<!--.*?-->", "", body, flags=re.DOTALL) + # Skip standard headings like "## Summary" + paragraphs = [] + for chunk in re.split(r"\n\s*\n", cleaned.strip()): + chunk = chunk.strip() + if not chunk: + continue + # Skip pure headings + if re.match(r"^#+\s+\w+\s*$", chunk): + continue + # Skip code/tables/checkboxes-only blocks + if chunk.startswith("```") or chunk.startswith("|"): + continue + paragraphs.append(chunk) + if len(paragraphs) >= 2: + break + if not paragraphs: + return "" + # Drop heading prefix like "## Summary\n..." -> keep body + out = "\n\n".join(paragraphs) + out = re.sub(r"^#+\s+[\w\s]+\n+", "", out) + # Strip Jira links / internal markers + out = re.sub(r"\[?AI-\d{2,5}\]?", "", out) + out = re.sub(r"@[\w-]+", "", out) + # Trim + out = re.sub(r"\s+", " ", out).strip() + return out + + +def normalize_products(products: list[str]) -> list[str]: + keep = [p for p in products if p in VALID_PRODUCTS] + return keep or ["snowflake-app"] # default per user's note + + +def build_entry(feature: dict) -> tuple[str, str]: + title = feature["title"].strip() + # Strip lingering noise + title = re.sub(r"\(#?\d+\)$", "", title).strip() + title = re.sub(r"^\W+", "", title) + # Capitalize first letter + title = title[:1].upper() + title[1:] if title else "Untitled" + if len(title) > 80: + title = title[:77] + "..." 
+ + date = (feature.get("latest_published_at") or "")[:10] + products = normalize_products(feature.get("products") or []) + tag = "new" # we filtered to feat: prefix + emoji = pick_emoji(title) + slug = slugify(title) + filename = f"{date}-{slug}.md" + + # Body: pick the longest first-paragraph from any PR in the group + bodies = [(first_paragraph(p.get("title_raw", "")), first_paragraph(_load_body(p))) + for p in feature["prs"]] + body_candidates = [] + for _, b in bodies: + if b and 40 < len(b) < 400: + body_candidates.append(b) + body = max(body_candidates, key=len, default="") + if not body: + # Fall back to title as the body + body = title + + pr_links = "\n".join( + f"- [{p['repo'].replace('altimate-', '')}#{p['number']}]({p['url']}) — {p['title']}" + for p in feature["prs"][:6] + ) + + frontmatter = ( + "---\n" + f"title: {title}\n" + f"date: {date}\n" + f"products: [{', '.join(products)}]\n" + f"tag: {tag}\n" + f"emoji: {emoji}\n" + "draft: true\n" + f"description: {body[:180]}\n" + "---\n\n" + ) + content = ( + f"{body}\n\n" + f"<!-- source PRs (remove before publishing):\n{pr_links}\n-->\n" + ) + return filename, frontmatter + content + + +_pr_body_cache: dict[str, str] = {} + + +def _load_body(pr_summary: dict) -> str: + """Look up full PR body from data/prs.json on demand.""" + if not _pr_body_cache: + prs = json.loads((ROOT / "data" / "prs.json").read_text()) + for k, v in prs.items(): + _pr_body_cache[k] = (v or {}).get("body") or "" + key = f"{pr_summary['repo']}:{pr_summary['number']}" + return _pr_body_cache.get(key, "") + + +def main() -> None: + parser = argparse.ArgumentParser() + parser.add_argument("--limit", type=int, default=None) + parser.add_argument("--sample", action="store_true", + help="Write only 5 demo files for review.") + args = parser.parse_args() + + OUT_DIR.mkdir(exist_ok=True) + + data = json.loads(FEATURES_PATH.read_text()) + features = data["features"] + if args.sample: + features = features[:5] + elif args.limit: + features 
= features[: args.limit] + + seen_filenames: set[str] = set() + written = 0 + for feat in features: + filename, content = build_entry(feat) + # Dedupe filenames + base = filename + idx = 2 + while filename in seen_filenames: + filename = base.replace(".md", f"-{idx}.md") + idx += 1 + seen_filenames.add(filename) + (OUT_DIR / filename).write_text(content) + written += 1 + + print(f"Wrote {written} entries to {OUT_DIR}/") + + +if __name__ == "__main__": + main() diff --git a/scripts/polish_entries.py b/scripts/polish_entries.py new file mode 100644 index 0000000..0587a1c --- /dev/null +++ b/scripts/polish_entries.py @@ -0,0 +1,230 @@ +#!/usr/bin/env python3 +"""Read `data/curation.md`, extract `[x]`-marked features, write drafts to `generated/`. + +Reads: + data/curation.md + data/features.json + data/prs.json + +Writes: + generated/YYYY-MM-DD-<slug>.md (one per marked feature, deduped) + +The "polish" applied here is rule-based: + - Strip `feat:` / `fix:` / `[AI-XXXX]` / `(#1234)` prefixes/suffixes + - Pull the first meaningful paragraph from the longest PR body + - Strip Jira links, @mentions, internal headings + - Truncate description to ≤200 chars (CI validator's limit) + - Default `draft: true` so the website skips entries until human-reviewed + +For true STYLE.md voice rewrites (active voice, no marketing-speak), run each +draft through Claude in a subsequent pass. 
+ +Usage: + python scripts/polish_entries.py +""" + +from __future__ import annotations + +import json +import re +from pathlib import Path + +ROOT = Path(__file__).resolve().parent.parent +CURATION_PATH = ROOT / "data" / "curation.md" +FEATURES_PATH = ROOT / "data" / "features.json" +PRS_PATH = ROOT / "data" / "prs.json" +OUT_DIR = ROOT / "generated" + +VALID_PRODUCTS = { + "dbt-power-user", "snowflake-app", "databricks-app", + "altimate-code", "datamates", +} + +EMOJI_BY_KEYWORD = [ + (re.compile(r"(?i)\b(cost|saving|spend|budget|billing|dbu)\b"), "💰"), + (re.compile(r"(?i)\b(alert|notification|email|slack|subscribe)\b"), "🔔"), + (re.compile(r"(?i)\b(filter|search|sort|column)\b"), "🔎"), + (re.compile(r"(?i)\b(dashboard|chart|graph|visual|insight)\b"), "📊"), + (re.compile(r"(?i)\b(auto[\s-]?tune|tune|optimi[sz]e|recommendation)\b"), "🤖"), + (re.compile(r"(?i)\b(query|sql|warehouse)\b"), "🗄️"), + (re.compile(r"(?i)\b(job|task|workload|schedule)\b"), "⏱️"), + (re.compile(r"(?i)\b(studio|prompt|agent|knowledge)\b"), "✨"), + (re.compile(r"(?i)\b(lineage|catalog|metadata)\b"), "🔗"), + (re.compile(r"(?i)\b(security|sso|auth|permission|access)\b"), "🔐"), + (re.compile(r"(?i)\b(databricks|spark|photon|dbx)\b"), "🧱"), + (re.compile(r"(?i)\b(snowflake)\b"), "❄️"), + (re.compile(r"(?i)\b(dbt)\b"), "🧰"), +] + +MARK_RE = re.compile(r"^- \[x\] `([^`]+)`", re.IGNORECASE) + + +def slugify(title: str) -> str: + s = title.lower() + s = re.sub(r"[^a-z0-9\s-]+", "", s) + s = re.sub(r"\s+", "-", s).strip("-") + return s[:60] or "untitled" + + +def pick_emoji(text: str) -> str: + for pat, emoji in EMOJI_BY_KEYWORD: + if pat.search(text): + return emoji + return "🚀" + + +def clean_title(title: str) -> str: + t = title.strip() + t = re.sub(r"^\s*(feat|fix|chore|test|ci|docs|build|perf|refactor|revert|style)" + r"(?:\([^)]*\))?\s*:\s*", "", t, flags=re.IGNORECASE) + t = re.sub(r"\[?AI-\d{2,5}\]?", "", t) + t = re.sub(r"\(#\d+\)$", "", t).strip() + t = re.sub(r"\s+", " ", t) + if t: + t = 
t[:1].upper() + t[1:] + return t.strip() + + +def first_paragraph(body: str) -> str: + if not body: + return "" + cleaned = re.sub(r"<!--.*?-->", "", body, flags=re.DOTALL) + paragraphs = [] + for chunk in re.split(r"\n\s*\n", cleaned.strip()): + chunk = chunk.strip() + if not chunk: + continue + if re.match(r"^#+\s+\w+\s*$", chunk): # bare heading + continue + if chunk.startswith("```") or chunk.startswith("|"): + continue + # Skip task lists / checkboxes + if chunk.lstrip().startswith("- [ ]") or chunk.lstrip().startswith("- [x]"): + continue + paragraphs.append(chunk) + if len(paragraphs) >= 2: + break + if not paragraphs: + return "" + out = "\n\n".join(paragraphs) + out = re.sub(r"^#+\s+[\w\s]+\n+", "", out) + out = re.sub(r"\[?AI-\d{2,5}\]?", "", out) + out = re.sub(r"@[\w-]+", "", out) + out = re.sub(r"https?://altimateai\.atlassian\.net/[^\s)]+", "", out) + out = re.sub(r"\s+", " ", out).strip() + return out + + +def normalize_products(products: list[str]) -> list[str]: + keep = [p for p in products if p in VALID_PRODUCTS] + return keep or ["snowflake-app"] + + +def truncate_for_description(text: str, limit: int = 180) -> str: + text = text.strip().replace("\n", " ") + if len(text) <= limit: + return text + # Cut at sentence boundary if possible + cut = text[:limit] + last_period = cut.rfind(". ") + if last_period > 80: + return cut[: last_period + 1] + return cut.rsplit(" ", 1)[0] + "..." + + +def build_entry(feat: dict, pr_bodies: dict[str, str]) -> tuple[str, str]: + title = clean_title(feat["title"]) + if len(title) > 80: + title = title[:77] + "..." 
+ + date = (feat.get("latest_published_at") or "")[:10] + products = normalize_products(feat.get("products") or []) + tag = "new" + emoji = pick_emoji(title) + slug = slugify(title) + filename = f"{date}-{slug}.md" + + body_candidates: list[str] = [] + for p in feat["prs"]: + key = f"{p['repo']}:{p['number']}" + bp = first_paragraph(pr_bodies.get(key, "")) + if bp and 60 < len(bp) < 800: + body_candidates.append(bp) + body = max(body_candidates, key=len, default=title) + description = truncate_for_description(body) + + pr_links = "\n".join( + f"- [{p['repo'].replace('altimate-', '')}#{p['number']}]({p['url']}) — " + f"{clean_title(p['title'])}" + for p in feat["prs"][:8] + ) + + frontmatter = ( + "---\n" + f"title: {title}\n" + f"date: {date}\n" + f"products: [{', '.join(products)}]\n" + f"tag: {tag}\n" + f"emoji: {emoji}\n" + "draft: true\n" + f"description: {description}\n" + "---\n\n" + ) + content = ( + f"{body}\n\n" + f"<!-- source PRs (review and remove before publishing):\n{pr_links}\n-->\n" + ) + return filename, frontmatter + content + + +def parse_marks(curation: str) -> list[str]: + ids: list[str] = [] + for line in curation.splitlines(): + m = MARK_RE.match(line.strip()) + if m: + ids.append(m.group(1)) + return ids + + +def main() -> None: + curation = CURATION_PATH.read_text() + marked_ids = parse_marks(curation) + print(f"Marked features: {len(marked_ids)}") + if not marked_ids: + print("No `[x]` marks found in data/curation.md. 
Edit it first.") + return + + features = json.loads(FEATURES_PATH.read_text())["features"] + by_id = {f["id"]: f for f in features} + + prs = json.loads(PRS_PATH.read_text()) + pr_bodies = {k: (v or {}).get("body") or "" for k, v in prs.items()} + + OUT_DIR.mkdir(exist_ok=True) + written = 0 + skipped = 0 + seen_filenames: set[str] = set() + + for fid in marked_ids: + feat = by_id.get(fid) + if not feat: + print(f" WARN: unknown id {fid!r}, skipping") + skipped += 1 + continue + filename, content = build_entry(feat, pr_bodies) + base = filename + idx = 2 + while filename in seen_filenames: + filename = base.replace(".md", f"-{idx}.md") + idx += 1 + seen_filenames.add(filename) + (OUT_DIR / filename).write_text(content) + written += 1 + + print(f"Wrote {written} entries to {OUT_DIR}/") + if skipped: + print(f" skipped {skipped} unknown ids") + + +if __name__ == "__main__": + main() diff --git a/scripts/reparse_releases.py b/scripts/reparse_releases.py new file mode 100644 index 0000000..e7cf972 --- /dev/null +++ b/scripts/reparse_releases.py @@ -0,0 +1,27 @@ +#!/usr/bin/env python3 +"""Re-parse PRs in data/releases.json using the current fetch_releases.parse_prs.""" + +import json +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).resolve().parent)) +from fetch_releases import parse_prs # noqa: E402 + +root = Path(__file__).resolve().parent.parent +path = root / "data" / "releases.json" +data = json.loads(path.read_text()) + +for repo_data in data["repos"].values(): + pr_total = 0 + for r in repo_data["releases"]: + prs = parse_prs(r["body"]) + r["prs"] = prs + r["pr_count"] = len(prs) + pr_total += len(prs) + repo_data["pr_count"] = pr_total + +path.write_text(json.dumps(data, indent=2) + "\n") +print(f"Re-parsed {path}") +for short, info in data["repos"].items(): + print(f" {short}: {info['release_count']} releases, {info['pr_count']} PRs") From 3043b7d1ac1a16893ac0a9ac59acb2ab1024075c Mon Sep 17 00:00:00 2001 From: Sourabh Chourasia 
<sourabh@altimate.ai> Date: Thu, 14 May 2026 20:44:40 +0530 Subject: [PATCH 2/2] feat: apply reviewer trims to April 2026 drafts + split syntax entry MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Apply paragraph-level revisions to 13 entries based on team reviewer feedback. The structure, curation set, tags, and product classifications were correct; ~8 entries leaked engineering vocabulary (internal endpoint names, file extensions, env vars, framework references) and ~5 needed lighter trims. Hard trims (drop implementation mechanics): - data-parity-skill-mssql-fabric-clickhouse: drop TOP injection, sys.* catalog queries, DATETRUNC/CONVERT calls, the seven Azure AD flow names, and Rust/TypeScript orchestrator sentence. - altimate-code-dbt-unit-test-generation: drop tool name, input:this, format:sql, schema.inspect adapter; keep scenario list. - databricks-ai-gateway: drop host::token, DATABRICKS_HOST/TOKEN env vars, max_completion_tokens normalization. - altimate-code-login-and-project-profiles: drop 0600 permissions, ~/.altimate/altimate.json, /auth_health endpoint. - remote-mcp-http-sse: drop type:http / type:sse / /mcp / /sse endpoints. - datamate-skills-push: drop .mdc / .instructions.md / .clinerules/skills/<id>/SKILL.md file extensions and alwaysApply. - datamates-chat-panel-polish: drop isInstalled() / webview-ready handler narrative and /payment/token-usage endpoint. - dbt-cloud-sync-consolidated: drop "PostgreSQL upserts" mechanic; keep the 250min → 50min number. Light trims: - table-level-lineage-csv-export: drop "BFS traversal" and z-index fix. - cte-profiler-in-dbt-power-user: drop "cumulative SELECT COUNT(*) queries" mechanic. - llm-guard-prompt-injection: drop /chat/completions endpoint. - team-attribution-in-qtp-slack: drop matcher DSL list (equals, startsWith, contains, in, AND/OR). - sso-inactivity-timeout: replace `ssoUserRefreshMins` setting name with plain-English "SSO inactivity timeout setting". 
Split (one entry → two for discoverability): - dbt-syntax-grammar-and-python-detection.md (deleted) → - dbt-jinja-syntax-highlighting.md (improvement) - detect-python-from-terminal.md (new feature, separate emoji/tag) .gitignore: also ignore .github/meta/ (local commit-message scratch). Validator: 31 entries pass (was 30; +1 from the split). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --- .gitignore | 3 +++ .../2026-04-02-dbt-cloud-sync-consolidated.md | 4 ++-- .../2026-04-02-llm-guard-prompt-injection.md | 4 ++-- .../2026-04-02-remote-mcp-http-sse-transports.md | 4 +--- ...4-04-altimate-code-login-and-project-profiles.md | 2 +- .../2026-04-06-table-level-lineage-csv-export.md | 4 +--- .../2026-04-07-dbt-jinja-syntax-highlighting.md | 11 +++++++++++ ...04-07-dbt-syntax-grammar-and-python-detection.md | 13 ------------- .../2026-04-07-detect-python-from-terminal.md | 13 +++++++++++++ .../2026-04-11-datamates-chat-panel-polish.md | 4 ++-- ...-04-13-altimate-code-dbt-unit-test-generation.md | 4 ++-- .../2026-04-14-team-attribution-in-qtp-slack.md | 2 +- .../2026-04-15-cte-profiler-in-dbt-power-user.md | 4 ++-- ...-21-data-parity-skill-mssql-fabric-clickhouse.md | 4 ++-- ...6-04-21-databricks-ai-gateway-as-llm-provider.md | 4 ++-- ...-datamate-skills-push-to-cursor-copilot-cline.md | 4 ++-- .../april-2026/2026-04-22-sso-inactivity-timeout.md | 4 ++-- 17 files changed, 49 insertions(+), 39 deletions(-) create mode 100644 generated/april-2026/2026-04-07-dbt-jinja-syntax-highlighting.md delete mode 100644 generated/april-2026/2026-04-07-dbt-syntax-grammar-and-python-detection.md create mode 100644 generated/april-2026/2026-04-07-detect-python-from-terminal.md diff --git a/.gitignore b/.gitignore index 5a6defc..62fc36d 100644 --- a/.gitignore +++ b/.gitignore @@ -4,6 +4,9 @@ __pycache__/ .venv/ .lycheecache +# Local commit-message scratch (referenced by general.md commit workflow) +.github/meta/ + # Pipeline artifacts (raw release/PR data fetched 
from private repos) data/ diff --git a/generated/april-2026/2026-04-02-dbt-cloud-sync-consolidated.md b/generated/april-2026/2026-04-02-dbt-cloud-sync-consolidated.md index ab6e1bd..f98541e 100644 --- a/generated/april-2026/2026-04-02-dbt-cloud-sync-consolidated.md +++ b/generated/april-2026/2026-04-02-dbt-cloud-sync-consolidated.md @@ -8,6 +8,6 @@ draft: true description: dbt Cloud sync consolidates ~1000 daily tasks per project/environment into a single ingestion run, cutting sync time from ~250 min to under 50. --- -dbt Cloud sync no longer creates one ingestion task per dbt Cloud run. The sync now consolidates to one task per `(project, environment)` per cycle — so a project firing 1000 runs a day produces one ingestion run instead of 1000 redundant ones. +dbt Cloud sync no longer creates one ingestion task per dbt Cloud run. It now consolidates to a single ingestion per `(project, environment)` per cycle — so a project firing 1000 runs a day produces one ingestion instead of 1000 redundant ones. -Manifest parsing, health checks, and PostgreSQL upserts each happen once per node per cycle instead of ~20 times, dropping a typical 4-worker sync from ~250 min to under 50. +A typical 4-worker sync drops from ~250 minutes to under 50. diff --git a/generated/april-2026/2026-04-02-llm-guard-prompt-injection.md b/generated/april-2026/2026-04-02-llm-guard-prompt-injection.md index 22ab980..cf09ec7 100644 --- a/generated/april-2026/2026-04-02-llm-guard-prompt-injection.md +++ b/generated/april-2026/2026-04-02-llm-guard-prompt-injection.md @@ -8,6 +8,6 @@ draft: true description: Studio's agent gateway now blocks nine new injection categories — instruction override, role hijacking, system-prompt extraction, payload obfuscation, multi-turn poisoning, and more. 
--- -The agent gateway used by Studio's `/chat/completions` now blocks nine new categories of prompt-injection and prompt-poisoning attempts before they reach the model: instruction override, role hijacking, system-prompt extraction, security bypass, fake system tokens, payload obfuscation, indirect injection (via tool output or documents), social engineering, and multi-turn poisoning. +The Studio agent gateway now blocks nine new categories of prompt-injection and prompt-poisoning attempts before they reach the model: instruction override, role hijacking, system-prompt extraction, security bypass, fake system tokens, payload obfuscation, indirect injection (via tool output or documents), social engineering, and multi-turn poisoning. -Existing AI-model-identification detection continues to work. The added checks run in front of the same agent endpoint, so all Studio surfaces get the upgraded coverage automatically. +Existing AI-model-identification detection continues to work, and the new checks apply to every Studio conversation automatically. diff --git a/generated/april-2026/2026-04-02-remote-mcp-http-sse-transports.md b/generated/april-2026/2026-04-02-remote-mcp-http-sse-transports.md index 4f0cf40..8e884c6 100644 --- a/generated/april-2026/2026-04-02-remote-mcp-http-sse-transports.md +++ b/generated/april-2026/2026-04-02-remote-mcp-http-sse-transports.md @@ -8,6 +8,4 @@ draft: true description: The remote MCP setup screen now offers Streamable HTTP and SSE transport tabs, each pre-filled with the right config and auth headers. --- -The remote MCP setup screen now lets you pick between two transports. The Streamable HTTP tab uses `type: "http"` with the `/mcp` endpoint and is the default for Claude and most clients; the SSE tab uses `type: "sse"` with the `/sse` endpoint for clients that don't speak streamable HTTP — Cursor users in particular. 
- -Each tab fills in the matching config snippet and auth headers, so it's a copy-paste away from a working connection. +The remote MCP setup screen now offers two transports: Streamable HTTP (the default for Claude and most clients) and SSE (for clients that don't speak streamable HTTP — Cursor in particular). Each tab pre-fills the matching config snippet and auth headers, so connecting is a single copy-paste. diff --git a/generated/april-2026/2026-04-04-altimate-code-login-and-project-profiles.md b/generated/april-2026/2026-04-04-altimate-code-login-and-project-profiles.md index 8e16d9f..1a0063a 100644 --- a/generated/april-2026/2026-04-04-altimate-code-login-and-project-profiles.md +++ b/generated/april-2026/2026-04-04-altimate-code-login-and-project-profiles.md @@ -8,6 +8,6 @@ draft: true description: A `/login` command for the Altimate provider, plus dbt-standard profile discovery that picks up project-local and `DBT_PROFILES_DIR` paths. --- -Altimate Code's TUI now includes the Altimate platform as a first-class LLM provider. A new `/login` dialog asks for instance name, API key, and URL (defaults to the Altimate cloud endpoint), validates credentials against `/auth_health` before saving, and writes `~/.altimate/altimate.json` with `0600` permissions. The provider is selectable from `/connect` and re-bootstraps the session immediately on save. +Altimate Code's TUI now includes the Altimate platform as a first-class LLM provider. A new `/login` dialog asks for instance name, API key, and URL, validates the credentials before saving, and re-bootstraps the session immediately — no restart, no manual config edits. dbt profile discovery also catches up to dbt's standard resolution order. `/discover` now finds profiles in priority order: an explicit path argument, `DBT_PROFILES_DIR`, a project-local `profiles.yml` sitting next to `dbt_project.yml`, and finally `~/.dbt/profiles.yml`. 
The common CI/CD pattern of committing a project-local profile no longer falls back silently to the global one. diff --git a/generated/april-2026/2026-04-06-table-level-lineage-csv-export.md b/generated/april-2026/2026-04-06-table-level-lineage-csv-export.md index 6149ca6..a5f7448 100644 --- a/generated/april-2026/2026-04-06-table-level-lineage-csv-export.md +++ b/generated/april-2026/2026-04-06-table-level-lineage-csv-export.md @@ -8,6 +8,4 @@ draft: true description: Download upstream or downstream table lineage as CSV — same export contract as column lineage, one row per edge. --- -Table-level lineage now exports to CSV alongside the existing column-level export. Pick any table, choose upstream or downstream, and download the full BFS traversal as a CSV with Source DB / Schema / Table → Target DB / Schema / Table columns — one row per edge. - -The lineage modal also got a z-index fix so it renders above the sidebar backdrop, and long resource keys in the export dialog no longer overflow. +Table-level lineage now exports to CSV alongside the existing column-level export. Pick any table, choose upstream or downstream, and download the full traversal as a CSV — one row per edge with Source DB / Schema / Table → Target DB / Schema / Table. diff --git a/generated/april-2026/2026-04-07-dbt-jinja-syntax-highlighting.md b/generated/april-2026/2026-04-07-dbt-jinja-syntax-highlighting.md new file mode 100644 index 0000000..a7eaf7b --- /dev/null +++ b/generated/april-2026/2026-04-07-dbt-jinja-syntax-highlighting.md @@ -0,0 +1,11 @@ +--- +title: dbt-aware syntax highlighting for SQL+Jinja and YAML+Jinja +date: 2026-04-07 +products: [dbt-power-user] +tag: improved +emoji: 🎨 +draft: true +description: Dedicated syntax highlighting for dbt SQL+Jinja and YAML+Jinja, so `ref`, `source`, `config`, and Jinja blocks each get their own colors. +--- + +dbt Power User now ships dedicated syntax highlighting for dbt SQL+Jinja and YAML+Jinja. 
`ref`, `source`, `config`, SQL aggregates, window functions, and Jinja `{{ }}` / `{% %}` / `{# #}` blocks each get distinct colors. Schema YAML files highlight Jinja inline alongside YAML, and SQL files no longer fall back to a generic grammar that mis-colored half the keywords. diff --git a/generated/april-2026/2026-04-07-dbt-syntax-grammar-and-python-detection.md b/generated/april-2026/2026-04-07-dbt-syntax-grammar-and-python-detection.md deleted file mode 100644 index 317c27f..0000000 --- a/generated/april-2026/2026-04-07-dbt-syntax-grammar-and-python-detection.md +++ /dev/null @@ -1,13 +0,0 @@ ---- -title: dbt SQL+Jinja syntax highlighting and "Detect Python from terminal" -date: 2026-04-07 -products: [dbt-power-user] -tag: improved -emoji: 🎨 -draft: true -description: Proper TextMate grammars for dbt SQL+Jinja and YAML+Jinja, plus a one-click way to fix mismatched Python interpreters. ---- - -dbt Power User ships dedicated TextMate grammars for dbt SQL+Jinja and YAML+Jinja. `ref`, `source`, `config`, SQL aggregates, window functions, Jinja `{{ }}`/`{% %}`/`{# #}` blocks, and operators each get distinct scopes — schema YAML files highlight Jinja inline alongside YAML, and SQL files no longer fall back to a generic grammar that mis-colored half the keywords. - -The most-reported onboarding bug — dbt works in the terminal but not in the extension because the Python extension picked a different interpreter — gets a dedicated **Detect Python from terminal** action. It spawns a login shell to find where `dbt` actually lives and writes that path to `dbtPythonPathOverride`. The button appears on every Python/dbt error dialog and in the onboarding prerequisites step. 
diff --git a/generated/april-2026/2026-04-07-detect-python-from-terminal.md b/generated/april-2026/2026-04-07-detect-python-from-terminal.md new file mode 100644 index 0000000..a60d169 --- /dev/null +++ b/generated/april-2026/2026-04-07-detect-python-from-terminal.md @@ -0,0 +1,13 @@ +--- +title: One-click fix for "dbt works in terminal but not in the extension" +date: 2026-04-07 +products: [dbt-power-user] +tag: new +emoji: 🐍 +draft: true +description: A "Detect Python from terminal" action finds the Python interpreter where dbt actually lives and points the extension at it. +--- + +The most-reported onboarding bug — dbt works in your terminal but not in the extension because VS Code's Python extension picked a different interpreter — now has a one-click fix. The new **Detect Python from terminal** action runs through your login shell to find where `dbt` actually lives and writes that path into the extension's Python override. + +The button appears on every Python or dbt error dialog and in the onboarding prerequisites step, so the failure mode that used to require Stack Overflow now resolves in seconds. diff --git a/generated/april-2026/2026-04-11-datamates-chat-panel-polish.md b/generated/april-2026/2026-04-11-datamates-chat-panel-polish.md index 3a846cc..7140451 100644 --- a/generated/april-2026/2026-04-11-datamates-chat-panel-polish.md +++ b/generated/april-2026/2026-04-11-datamates-chat-panel-polish.md @@ -8,6 +8,6 @@ draft: true description: The Altimate MCP chat panel opens with no perceptible delay, lives on the editor title bar, and shows live token usage in the header. --- -Opening the Altimate MCP chat panel used to wait for an `isInstalled()` check before rendering. The panel now appears immediately and runs the install check on the webview-ready handler, so there's no perceptible delay between clicking and seeing the panel. +The Altimate MCP chat panel now opens instantly — no waiting on a startup check before the panel renders. 
-A new Altimate icon lands on the editor title bar (right after the run button) for one-click access without the command palette. The header also picks up a compact token usage indicator pulled from `/payment/token-usage` — usage percentage color-coded blue / orange / red against your monthly threshold, or "Unlimited" on unlimited plans. Click it for a detailed popover with allowance, grants, overage, billing period, and wallet balance. +A new Altimate icon lands on the editor title bar for one-click access without the command palette. The header also picks up a compact token usage indicator — usage percentage color-coded against your monthly threshold, or "Unlimited" on unlimited plans. Click it for a detailed popover with allowance, grants, overage, billing period, and wallet balance. diff --git a/generated/april-2026/2026-04-13-altimate-code-dbt-unit-test-generation.md b/generated/april-2026/2026-04-13-altimate-code-dbt-unit-test-generation.md index 0c4963f..7227822 100644 --- a/generated/april-2026/2026-04-13-altimate-code-dbt-unit-test-generation.md +++ b/generated/april-2026/2026-04-13-altimate-code-dbt-unit-test-generation.md @@ -8,6 +8,6 @@ draft: true description: A new `dbt_unit_test_gen` tool inspects compiled SQL and writes dbt unit tests with type-correct mock data, including incremental and ephemeral cases. --- -Altimate Code now generates dbt unit tests for you. Point it at a model and the new `dbt_unit_test_gen` tool reads the manifest and compiled SQL, detects scenarios (CASE branches, JOINs, GROUP BY, division-by-zero, incremental loads), and emits a `unit_tests:` block in the model's schema YAML with type-correct mock data — happy path, null variants, and boundary cases. +Altimate Code now generates dbt unit tests for you. 
Point it at a model and it reads the manifest and compiled SQL, detects the scenarios that matter (CASE branches, JOINs, GROUP BY, division-by-zero, incremental loads), and writes a `unit_tests:` block in the model's schema YAML with type-correct mock data — happy path, null variants, and boundary cases. -Incremental models get an `input: this` mock for the prior-state row; ephemeral upstream models use `format: sql` even when their column types aren't yet known. Cross-database — Snowflake, Databricks, BigQuery, Redshift, Postgres — works through the same `schema.inspect` adapter. +Incremental models also get a prior-state mock so the generated test exercises the merge logic, not just the initial load. Snowflake, Databricks, BigQuery, Redshift, and Postgres are all supported. diff --git a/generated/april-2026/2026-04-14-team-attribution-in-qtp-slack.md b/generated/april-2026/2026-04-14-team-attribution-in-qtp-slack.md index 0992567..d6cb975 100644 --- a/generated/april-2026/2026-04-14-team-attribution-in-qtp-slack.md +++ b/generated/april-2026/2026-04-14-team-attribution-in-qtp-slack.md @@ -8,6 +8,6 @@ draft: true description: Query Timeout Prediction Slack threads now show the owning team even when the query tag is missing — using your tenant's ownership rules as a fallback. --- -When QTP posts a long-running query to Slack, the alert now identifies the owning team even on queries that don't carry a `team` tag. Your tenant's existing ownership rules — equals, startsWith, contains, in, and tag matchers, composable with AND/OR — apply as a fallback whenever the query tag itself is missing. +When QTP posts a long-running query to Slack, the alert now identifies the owning team even on queries that don't carry a `team` tag. Your tenant's existing ownership rules apply as a fallback whenever the query tag itself is missing. Tag-derived team always wins when present; the rules engine only fills the gap, so existing tagging conventions keep working unchanged. 
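Review note: the precedence this entry describes (the tag-derived team wins, ownership rules only fill the gap) can be sketched as below. This is a minimal illustration, not the product's rules engine. The function names, the `field` / `op` / `value` / `team` rule keys, and the query dictionary shape are all hypothetical; only the matcher names (equals, startsWith, contains, in) come from the earlier wording of the entry. AND/OR composition is left out for brevity.

```python
from typing import Optional

# Hypothetical matcher table; names mirror the matchers listed in the entry.
MATCHERS = {
    "equals": lambda actual, expected: actual == expected,
    "startsWith": lambda actual, expected: actual.startswith(expected),
    "contains": lambda actual, expected: expected in actual,
    "in": lambda actual, expected: actual in expected,
}


def rule_matches(rule: dict, query: dict) -> bool:
    """Return True when the query attribute named by the rule satisfies it."""
    actual = query.get(rule["field"])
    if actual is None:
        return False
    return MATCHERS[rule["op"]](actual, rule["value"])


def resolve_team(query: dict, rules: list[dict]) -> Optional[str]:
    """Tag-derived team wins; ownership rules are only a fallback."""
    tag_team = (query.get("tag") or {}).get("team")
    if tag_team:
        return tag_team
    for rule in rules:  # first matching rule wins (an assumption)
        if rule_matches(rule, query):
            return rule["team"]
    return None
```

With a single hypothetical rule mapping users that start with `etl_` to `data-platform`, a query tagged `team: finance` still resolves to `finance`, while an untagged query from `etl_loader` resolves to `data-platform`.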
diff --git a/generated/april-2026/2026-04-15-cte-profiler-in-dbt-power-user.md b/generated/april-2026/2026-04-15-cte-profiler-in-dbt-power-user.md index ecc2c4c..6722221 100644 --- a/generated/april-2026/2026-04-15-cte-profiler-in-dbt-power-user.md +++ b/generated/april-2026/2026-04-15-cte-profiler-in-dbt-power-user.md @@ -8,6 +8,6 @@ draft: true description: Run a dbt model and the editor decorates each CTE with its wall-clock time, row count, and a hot/warm/cool heat tier so you can find the slow one without leaving the file. --- -The dbt Power User extension now profiles every CTE in a model in one click. The CTE Profiler runs cumulative `SELECT COUNT(*)` queries per CTE against your warehouse, measures wall-clock time, calculates the marginal time each CTE adds, and decorates the editor inline (`⏱ 1.7s · 100 rows`). Hot CTEs go red, warm yellow, cool grey; a total time and row count lands at the bottom of the file. +The dbt Power User extension now profiles every CTE in a model in one click. The CTE Profiler measures per-CTE wall-clock time and row count against your warehouse and decorates the editor inline (`⏱ 1.7s · 100 rows`). Hot CTEs go red, warm yellow, cool grey; a total time and row count lands at the bottom of the file. -No Altimate API key required — the profiler runs SQL directly against the user's warehouse using the existing dbt connection. +No Altimate API key required — the profiler runs through your existing dbt connection. 
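Review note: the removed wording said the profiler runs cumulative queries per CTE and derives the marginal time each CTE adds. One plausible reading of that step is a simple difference between consecutive cumulative timings, sketched below. The function names and the clamp at zero are assumptions made for illustration; the extension's actual calculation is not shown in this patch. The decoration format is copied from the entry text.

```python
def marginal_times(cumulative: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Convert cumulative timings (CTE name, seconds up to and including
    that CTE) into per-CTE marginal timings."""
    result = []
    previous = 0.0
    for name, seconds in cumulative:
        # Clamp at zero (an assumption): measurement noise could make a
        # later cumulative run appear faster than an earlier one.
        result.append((name, max(seconds - previous, 0.0)))
        previous = seconds
    return result


def decoration(seconds: float, rows: int) -> str:
    """Format the inline editor decoration in the style the entry shows."""
    return f"⏱ {seconds:.1f}s · {rows} rows"
```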
diff --git a/generated/april-2026/2026-04-21-data-parity-skill-mssql-fabric-clickhouse.md b/generated/april-2026/2026-04-21-data-parity-skill-mssql-fabric-clickhouse.md index 418d088..2fd190d 100644 --- a/generated/april-2026/2026-04-21-data-parity-skill-mssql-fabric-clickhouse.md +++ b/generated/april-2026/2026-04-21-data-parity-skill-mssql-fabric-clickhouse.md @@ -8,6 +8,6 @@ draft: true description: Altimate Code can now diff data across SQL Server / Azure Fabric and ClickHouse with partition-aware execution and seven Azure AD auth flows. --- -Altimate Code's `data_diff` tool now handles three more warehouses end-to-end. SQL Server and Azure Fabric drop in with full T-SQL support — `TOP` injection, `sys.*` catalog queries, `DATETRUNC()` and `CONVERT(DATE, …, 23)` for date partitioning. Azure AD authentication covers seven flows (`default`, `password`, `access-token`, `service-principal-secret`, `msi-vm`, `msi-app-service`), with shorthand aliases (`cli`, `msi`, `service-principal`) for the common cases. +Altimate Code's `data_diff` tool now handles three more warehouses end-to-end: SQL Server, Azure Fabric, and ClickHouse — all with partition-aware execution so large tables diff in independent chunks instead of one monolithic scan. -The orchestrator that drives the diff is now a TypeScript layer that runs SQL tasks produced by the Rust state machine and feeds results back — so the algorithm and the database access stay independently swappable, and partitioned diffs run independently per partition before merging outcomes. +Azure AD authentication is supported for SQL Server and Fabric, covering the common service-principal, MSI, and CLI flows. 
diff --git a/generated/april-2026/2026-04-21-databricks-ai-gateway-as-llm-provider.md b/generated/april-2026/2026-04-21-databricks-ai-gateway-as-llm-provider.md index 88cdf23..ae17218 100644 --- a/generated/april-2026/2026-04-21-databricks-ai-gateway-as-llm-provider.md +++ b/generated/april-2026/2026-04-21-databricks-ai-gateway-as-llm-provider.md @@ -8,6 +8,6 @@ draft: true description: Use any of 11 Databricks-hosted foundation models — Llama 3.1, Claude, GPT-5, Gemini, DBRX, Mixtral — as the backing model for Altimate Code. --- -Altimate Code now treats Databricks serving endpoints as a first-class LLM provider. Authenticate with a Databricks PAT in `host::token` format and the provider validates the workspace host for AWS, Azure, or GCP Databricks deployments, then resolves the workspace URL from the PAT or from `DATABRICKS_HOST` / `DATABRICKS_TOKEN` environment variables. +Altimate Code now treats Databricks serving endpoints as a first-class LLM provider. Authenticate with a Databricks personal access token and the provider works against any AWS, Azure, or GCP Databricks workspace. -Eleven foundation models register out of the box: Meta Llama 3.1 (405B / 70B / 8B), Claude Sonnet and Opus, GPT-5 variants, Gemini 3.1 Pro, DBRX Instruct, and Mixtral 8x7B. Request bodies are normalized between `max_completion_tokens` and `max_tokens` so each model gets the parameter shape it expects. +Eleven foundation models register out of the box: Meta Llama 3.1 (405B / 70B / 8B), Claude Sonnet and Opus, GPT-5 variants, Gemini 3.1 Pro, DBRX Instruct, and Mixtral 8x7B. 
diff --git a/generated/april-2026/2026-04-22-datamate-skills-push-to-cursor-copilot-cline.md b/generated/april-2026/2026-04-22-datamate-skills-push-to-cursor-copilot-cline.md index 6fb0a07..8d07ecf 100644 --- a/generated/april-2026/2026-04-22-datamate-skills-push-to-cursor-copilot-cline.md +++ b/generated/april-2026/2026-04-22-datamate-skills-push-to-cursor-copilot-cline.md @@ -8,6 +8,6 @@ draft: true description: Skills configured on a Teammate now deliver as Cursor `.mdc`, Copilot `.instructions.md`, or Cline `.clinerules/skills/<id>/SKILL.md` files to your workspace automatically. --- -Datamate Skills are push-based markdown instructions that tell an AI agent when and how to use Datamate MCP tools. From April, the MCP server extension reads each Teammate's `skills[]` from the API and writes them as instruction files into the workspace in the right format for whichever IDE you're running. +Datamate Skills are push-based markdown instructions that tell an AI agent when and how to use Datamate MCP tools. The MCP server now reads each Teammate's skills and lands them in the right format for whichever IDE you're running — Cursor, Copilot, and Cline are all supported, with conditional activation so each skill only applies to the files it's scoped to. -Cursor gets `.mdc` files with conditional activation (`alwaysApply` and globs); Copilot gets `.instructions.md`; Cline picks up native `.clinerules/skills/<id>/SKILL.md` files with YAML frontmatter, matching Claude Code's layout. Custom skills slug their file names with an ID suffix so two skills with the same display name don't collide. The DatamateCard shows the custom-skill count alongside Assists and Guardrails, and `always_active` skills attach to every conversation regardless of context. +Custom skills count toward the per-Teammate budget shown on the DatamateCard alongside Assists and Guardrails, and skills marked "always active" attach to every conversation regardless of which file you're editing. 
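Review note: the removed wording mentioned that custom skills slug their file names with an ID suffix so two skills with the same display name do not collide. A minimal sketch of that idea follows. The function name and the exact slug rules (lowercasing, collapsing non-alphanumerics to hyphens) are assumptions, not the extension's implementation.

```python
import re


def skill_slug(display_name: str, skill_id: str) -> str:
    """Build a file-name slug from a display name plus an ID suffix."""
    # Collapse every run of non-alphanumeric characters into one hyphen.
    base = re.sub(r"[^a-z0-9]+", "-", display_name.lower()).strip("-")
    # Fall back to the bare ID when the name has no usable characters.
    return f"{base}-{skill_id}" if base else skill_id
```

Two skills both named "SQL Review" with hypothetical IDs `42` and `43` would slug to `sql-review-42` and `sql-review-43`, so their files stay distinct.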
diff --git a/generated/april-2026/2026-04-22-sso-inactivity-timeout.md b/generated/april-2026/2026-04-22-sso-inactivity-timeout.md index 6124a5b..8d6616e 100644 --- a/generated/april-2026/2026-04-22-sso-inactivity-timeout.md +++ b/generated/april-2026/2026-04-22-sso-inactivity-timeout.md @@ -8,6 +8,6 @@ draft: true description: Auto-log-out SSO users after N minutes of inactivity, configured per tenant. --- -Tenants can now set a hard inactivity timeout for SSO users. The new `ssoUserRefreshMins` tenant setting, when positive, logs SSO users out after that many minutes of inactivity — either when the tab is closed and reopened, or when the cursor leaves the window long enough. +Tenants can now set a hard inactivity timeout for SSO users. The new SSO inactivity timeout setting logs SSO users out after the configured number of minutes of inactivity — either when the tab is closed and reopened, or when the cursor leaves the window long enough. -The control is per-tenant: tenants with stricter compliance requirements can pick a tight window without affecting tenants that don't need one. Default behavior is unchanged (no forced timeout) when the setting is unset. +The control is per-tenant, so tenants with stricter compliance requirements can pick a tight window without affecting tenants that don't need one. Default behavior is unchanged (no forced timeout) when the setting is unset.
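Review note: the behavior this entry describes (a positive per-tenant value forces a logout after that many idle minutes, and an unset value changes nothing) reduces to a small check. The sketch below is illustrative only; `should_log_out` and its parameters are hypothetical names, and how the product tracks the last activity time is not covered by this patch.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_log_out(
    last_activity: datetime, now: datetime, timeout_mins: Optional[int]
) -> bool:
    """Return True when an SSO session has been idle past the tenant limit."""
    # Unset or non-positive setting: default behavior, no forced timeout.
    if timeout_mins is None or timeout_mins <= 0:
        return False
    return now - last_activity >= timedelta(minutes=timeout_mins)
```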