datnguye · Jun 15, 2026
diff --git a/.claude/agents/docs-dev.md b/.claude/agents/docs-dev.md
@@ -20,11 +20,12 @@ mkdocs-material, mike, or Jinja2 templating — those are gone.
 - `dbdocs/cli/main.py` — the click command group and subcommands
   (`generate`, `serve`, `deploy`).
 - `dbdocs/extract/` — derive doc data from artifacts: `nodes` (models/sources/
-  seeds/snapshots → display records + nav tree), `erd` + `erd_json` (structured
-  ERD `{nodes, edges}` via a dbterd `json` target adapter — not Mermaid text; the
-  SPA renders it with React Flow), `graph` (the node-level DAG), `column_lineage`
-  + `_sqlglot_lineage` (column-level lineage via sqlglot), and the `health/`
-  sub-package (the always-built Health Check section from `run_results.json`).
+  seeds/snapshots → display records + nav tree), `erd` (structured ERD
+  `{nodes, edges}` via dbterd's built-in `json` target ≥ 1.28.0 — not Mermaid
+  text; the SPA renders it with React Flow), `graph` (the node-level DAG),
+  `column_lineage` + `_sqlglot_lineage` (column-level lineage via sqlglot), and
+  the `health/` sub-package (the always-built Health Check section from
+  `run_results.json`).
 - `dbdocs/site/` — `builder` (assemble the one data dict + write the site),
   `inject` (`strip_marker` removes the `<!-- DBDOCS_DATA -->` placeholder — the
   data is external, not inlined), `deploy` (hand-rolled versioning), and the

diff --git a/.claude/design_patterns.md b/.claude/design_patterns.md
@@ -32,21 +32,24 @@ authoritative; grep it.
   - [Windowed graph rendering](#windowed-graph-rendering)
     - [Theory](#theory-7)
     - [Example](#example-7)
-  - [Bundled SPA directory resolution](#bundled-spa-directory-resolution)
+  - [ERD from dbterd's built-in json target](#erd-from-dbterds-built-in-json-target)
     - [Theory](#theory-8)
     - [Example](#example-8)
-  - [Versioned deploy without mike](#versioned-deploy-without-mike)
+  - [Bundled SPA directory resolution](#bundled-spa-directory-resolution)
     - [Theory](#theory-9)
     - [Example](#example-9)
-  - [Click group entrypoint](#click-group-entrypoint)
+  - [Versioned deploy without mike](#versioned-deploy-without-mike)
     - [Theory](#theory-10)
     - [Example](#example-10)
-  - [Singleton colored logger](#singleton-colored-logger)
+  - [Click group entrypoint](#click-group-entrypoint)
     - [Theory](#theory-11)
     - [Example](#example-11)
-  - [Always-built artifact-derived data-dict section (Health Check)](#always-built-artifact-derived-data-dict-section-health-check)
+  - [Singleton colored logger](#singleton-colored-logger)
     - [Theory](#theory-12)
     - [Example](#example-12)
+  - [Always-built artifact-derived data-dict section (Health Check)](#always-built-artifact-derived-data-dict-section-health-check)
+    - [Theory](#theory-13)
+    - [Example](#example-13)
 
 ## Pipeline-stage package layout
 
@@ -354,6 +357,55 @@ const dagKeep = useMemo(() => {
 - `frontend/src/components/GraphApp.tsx` — `MAX_UNFOCUSED_DAG_NODES`, `dagKeep`, `erdNodeEmpty`
 - `dbdocs/site/builder.py` / `dbdocs/extract/erd.py` — `erd_algo` (metadata)
 
+## ERD from dbterd's built-in json target
+
+### Theory
+
+The SPA renders its ERD with React Flow, which needs structured node/edge data
+— not the diagram *text* dbterd's other targets emit. dbterd ≥ 1.28 ships a
+**built-in, schema-validated `json` target** that emits `{nodes, edges,
+metadata}`; `build_erd(target="json")` forces it, and `build_erd_data` maps that
+into the SPA's `{nodes, edges}`. Do **not** reintroduce a custom
+`@register_target("json")` adapter — dbterd owns this contract now. Two dbterd
+quirks `build_erd_data` patches (verify after any dbterd bump with
+`task frontend:e2e`):
+
+1. **Short-name edge ids.** With `entity_name_format` configured, dbterd emits
+   edge `from_id`/`to_id` as the *formatted entity name* (e.g. `orders`), not the
+   full unique_id (e.g. `model.jaffle_shop.orders`). `_resolve_edge_id` maps those
+   back through `name_to_id` (built from each node's `name`, skipping names shared
+   by more than one node) so the SPA's `source`/`target` reference a valid node
+   `id`. An id already in `node_ids` passes through (the no-`entity_name_format` case).
+2. **Missing FK flags.** Some algos (e.g. `model_contract`) don't set
+   `is_foreign_key` on node columns even when those columns appear in FK edges.
+   `_backfill_fk_flags` sets `is_foreign_key=True` on any column named in an
+   edge's `from_columns` (the FK/child side), matching names case-insensitively;
+   nodes are indexed by id, so each edge finds its child node in O(1).
+
+**SPA edge direction:** `source` = the referenced/parent side (dbterd `to_id`),
+`target` = the FK/child side (dbterd `from_id`). For the per-column connector
+handles, `buildErdFlow` in `frontend/src/lib/data.ts` resolves each handle
+against whichever endpoint actually owns the named column (`owned(...)`), so a
+join whose FK/PK columns differ in name still lands on the right rows.
+
+### Example
+
+```python
+# dbdocs/extract/erd.py — official {nodes, edges}; resolve short names + back-fill FK flags
+payload = json.loads(erd.get_erd())
+raw_nodes = payload.get("nodes", [])
+nodes = [_build_node(n) for n in raw_nodes]
+node_ids = {n["id"] for n in nodes}
+name_to_id = {n.get("name"): n["id"] for n in raw_nodes if n.get("name")}  # abridged: the real map also skips ambiguous names
+edges = [_build_edge(e, i, node_ids, name_to_id) for i, e in enumerate(payload.get("edges", []))]
+_backfill_fk_flags(nodes, edges)
+return {"nodes": nodes, "edges": edges}
+```
+
+- `dbdocs/extract/erd.py` — `def build_erd` (forces `target="json"`), `def build_erd_data`, `def _build_node`, `def _build_edge`, `def _resolve_edge_id`, `def _backfill_fk_flags`
+- `frontend/src/lib/data.ts` — `buildErdFlow` (consumes `source`/`target`/`from_columns`/`to_columns`/`is_foreign_key`; `owned()` picks the handle column each endpoint owns)
+- `pyproject.toml` — `dbterd>=1.28` (the built-in `json` target floor)
+
 ## Bundled SPA directory resolution
 
 ### Theory

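The short-name resolution and edge direction described in the new design-pattern section can be sketched in isolation. This is a minimal illustration, not the module itself: `build_name_to_id` and `resolve_edge_id` are hypothetical stand-ins for the logic in `build_erd_data` and `_resolve_edge_id`, and the payload (including the second `customers` node in an assumed `other_pkg` package) is made up to show the collision case.

```python
from collections import Counter


def build_name_to_id(raw_nodes):
    """Map node name -> node id, skipping names shared by more than one node."""
    counts = Counter(n["name"] for n in raw_nodes if n.get("name"))
    return {
        n["name"]: n["id"]
        for n in raw_nodes
        if n.get("name") and counts[n["name"]] == 1
    }


def resolve_edge_id(raw, node_ids, name_to_id):
    """Keep a valid node id as-is; otherwise try the name map, else return raw."""
    if raw in node_ids:
        return raw
    return name_to_id.get(raw, raw)


# Hypothetical payload: "customers" is ambiguous, "orders" is not.
raw_nodes = [
    {"id": "model.jaffle_shop.orders", "name": "orders"},
    {"id": "model.jaffle_shop.customers", "name": "customers"},
    {"id": "model.other_pkg.customers", "name": "customers"},
]
# from_id is the FK/child side, to_id the referenced/parent side.
raw_edge = {"from_id": "orders", "to_id": "model.jaffle_shop.customers"}

node_ids = {n["id"] for n in raw_nodes}
name_to_id = build_name_to_id(raw_nodes)

# SPA convention: source = parent (to_id), target = child (from_id).
spa_edge = {
    "source": resolve_edge_id(raw_edge["to_id"], node_ids, name_to_id),
    "target": resolve_edge_id(raw_edge["from_id"], node_ids, name_to_id),
}
unresolved = resolve_edge_id("customers", node_ids, name_to_id)
```

Here the short name `orders` resolves to its unique_id, while the ambiguous `customers` is returned unchanged instead of being silently bound to one of the two candidates.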
diff --git a/.claude/skills/spa-site/SKILL.md b/.claude/skills/spa-site/SKILL.md
@@ -95,10 +95,9 @@ SPA loads (via `data.js`, which fetches `dbdocs-data.json.gz` and exposes it on
   entities with their columns (`is_primary_key` / `is_foreign_key` flags),
   `edges` are foreign-key relationships — all keyed by dbt `unique_id`. Built by
   `dbdocs/extract/erd.py` (`build_erd` / `build_erd_data`), which runs dbterd's
-  `json` target — a custom `@register_target("json")` adapter in
-  `dbdocs/extract/erd_json.py` that emits `{tables, relationships}`. The React
-  Flow bundle derives all three graph surfaces (full DAG, global ERD, per-node
-  ERD) from `lineage` + `erd`.
+  built-in `json` target (dbterd ≥ 1.28.0; emits `{nodes, edges, metadata}`).
+  The React Flow bundle derives all three graph surfaces (full DAG, global ERD,
+  per-node ERD) from `lineage` + `erd`.
 
 ## Payload (external gzip — `dbdocs/site/inject.py` + `builder.generate`)
 

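Because the SPA expects every edge endpoint to be keyed by dbt `unique_id`, a quick consistency check on the built `erd` dict can catch endpoints that were never resolved. The helper below is a hypothetical sketch, not part of `dbdocs`; it assumes only the `{nodes, edges}` shape with `id`, `source`, and `target` fields described above.

```python
def dangling_edge_ids(erd):
    """Return ids of edges whose source or target is not a known node id."""
    node_ids = {n["id"] for n in erd.get("nodes", [])}
    return [
        e["id"]
        for e in erd.get("edges", [])
        if e["source"] not in node_ids or e["target"] not in node_ids
    ]


# Hypothetical ERD: edge "e1" still carries an unresolved short name.
erd = {
    "nodes": [
        {"id": "model.jaffle_shop.orders"},
        {"id": "model.jaffle_shop.customers"},
    ],
    "edges": [
        {
            "id": "e0",
            "source": "model.jaffle_shop.customers",
            "target": "model.jaffle_shop.orders",
        },
        {"id": "e1", "source": "customers", "target": "model.jaffle_shop.orders"},
    ],
}
dangling = dangling_edge_ids(erd)
```

A non-empty result would suggest a name collision or a change in how dbterd formats edge ids.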
diff --git a/dbdocs/extract/erd.py b/dbdocs/extract/erd.py
@@ -1,19 +1,21 @@
-"""Structured ERD data via dbterd's ``json`` target.
-
-dbterd's built-in targets emit diagram text; the SPA renders its ERD with React
-Flow, which needs structured node/edge data. We register a ``json`` target
-(:mod:`dbdocs.extract.erd_json`) and parse its ``{tables, relationships}`` output
-into the SPA's ``{nodes, edges}`` — entities with columns (PK/FK flags) and
-foreign-key edges between them, all keyed by dbt unique_id.
+"""Structured ERD data via dbterd's official ``json`` target.
+
+dbterd 1.28.0 ships a built-in, schema-validated ``json`` target that emits
+``{nodes, edges, metadata}``. Each node carries ``id`` (the dbt unique_id),
+``name``, ``schema_name``, ``database``, ``resource_type``, and ``columns``
+(with ``data_type``, ``is_primary_key``, ``is_foreign_key``). Each edge carries
+``id``, ``from_id`` (FK/child side), ``to_id`` (referenced/parent side),
+``from_columns``, ``to_columns``, ``label``, and ``cardinality``.
+
+``build_erd_data`` maps that shape into the SPA's ``{nodes, edges}`` — the
+React Flow bundle reads ``nodes`` (entities + column flags) and ``edges``
+(FK relationships, ``source``/``target`` keyed by dbt unique_id).
 """
 
 import json
 
 from dbterd.api import DbtErd, default
 
-# Importing the module registers the "json" target with dbterd's PluginRegistry.
-from dbdocs.extract import erd_json  # noqa: F401
-
 
 def erd_algo(dbterd_options: "dict | None" = None) -> str:
     """The dbterd algorithm that detected the ERD relationships.
@@ -46,66 +48,103 @@ def build_erd(dbterd_options: "dict | None" = None, artifacts_dir: "str | None"
 
 
 def build_erd_data(erd: DbtErd) -> dict:
-    """Parse the json target into ``{"nodes": [...], "edges": [...]}``.
-
-    Nodes are entities (with columns, ``is_primary_key``/``is_foreign_key`` flags
-    and the resolved dbt unique_id); edges are foreign-key relationships between
-    them. dbterd's relationships reference tables by *name*, so we map those back
-    to unique_ids via each table's ``node_name``.
+    """Parse dbterd's official ``{nodes, edges}`` payload into the SPA shape.
+
+    dbterd's ``json`` target emits nodes keyed by dbt unique_id (``id`` field).
+    When ``entity_name_format`` is configured, dbterd emits edge ``from_id`` /
+    ``to_id`` as the formatted entity name (e.g. ``orders``) rather than the
+    full unique_id (e.g. ``model.jaffle_shop.orders``). A ``name_to_id`` map
+    resolves unambiguous short names back to the node id before building edges,
+    so the SPA's ``source``/``target`` reference a valid node ``id``.
+
+    Some dbterd algos (e.g. ``model_contract``) do not set ``is_foreign_key``
+    on node columns even when those columns appear in FK edges. After building
+    edges, any column named in an edge's ``from_columns`` (the FK/child side)
+    has its ``is_foreign_key`` flag back-filled to ``True`` on the edge's
+    ``target`` (child) node.
+
+    SPA edge direction: ``source`` is the referenced (parent) side, ``target``
+    is the FK (child) side — matching dbterd's ``to_id`` and ``from_id``
+    respectively.
     """
     payload = json.loads(erd.get_erd())
-    tables = payload.get("tables", [])
-    relationships = payload.get("relationships", [])
-
-    # table name (as dbterd refers to it in relationships) → dbt unique_id.
-    name_to_id = {t["name"]: (t.get("node_name") or t["name"]) for t in tables}
-
-    edges, fk_columns = _build_edges(relationships, name_to_id)
-    nodes = [_build_node(t, fk_columns.get(t.get("node_name") or t["name"], set())) for t in tables]
+    raw_nodes = payload.get("nodes", [])
+    raw_edges = payload.get("edges", [])
+    nodes = [_build_node(n) for n in raw_nodes]
+    node_ids = {n["id"] for n in nodes}
+    # Count occurrences first; ambiguous names (more than one node) are excluded
+    # so a collision can't silently resolve to the wrong node.
+    name_counts: dict[str, int] = {}
+    for n in raw_nodes:
+        nm = n.get("name")
+        if nm:
+            name_counts[nm] = name_counts.get(nm, 0) + 1
+    name_to_id = {
+        n.get("name"): n["id"]
+        for n in raw_nodes
+        if n.get("name") and name_counts[n.get("name")] == 1
+    }
+    edges = [_build_edge(e, i, node_ids, name_to_id) for i, e in enumerate(raw_edges)]
+    _backfill_fk_flags(nodes, edges)
     return {"nodes": nodes, "edges": edges}
 
 
-def _build_edges(relationships: list, name_to_id: dict) -> "tuple[list, dict]":
-    """Map relationships → edges and collect each node's FK column names."""
-    edges = []
-    fk_columns: dict = {}
-    for index, rel in enumerate(relationships):
-        parent_name, child_name = rel["table_map"]
-        parent_cols, child_cols = rel["column_map"]
-        source = name_to_id.get(parent_name, parent_name)
-        target = name_to_id.get(child_name, child_name)
-        # The child side holds the foreign key columns.
-        fk_columns.setdefault(target, set()).update(child_cols)
-        edges.append(
-            {
-                "id": rel.get("name") or f"e{index}",
-                "source": source,
-                "target": target,
-                "from_columns": list(parent_cols),
-                "to_columns": list(child_cols),
-                "label": rel.get("relationship_label"),
-                "type": rel.get("type", ""),
-            }
-        )
-    return edges, fk_columns
-
+def _backfill_fk_flags(nodes: "list[dict]", edges: "list[dict]") -> None:
+    """Set is_foreign_key=True on columns named in each edge's from_columns.
 
-def _build_node(table: dict, fk_cols: set) -> dict:
-    node_id = table.get("node_name") or table["name"]
+    Nodes are indexed by id, so each edge finds its child (target) node in O(1).
+    """
+    nodes_by_id = {n["id"]: n for n in nodes}
+    for edge in edges:
+        target_node = nodes_by_id.get(edge["target"])
+        if target_node is None:
+            continue
+        fk_cols = {c.lower() for c in edge.get("from_columns", [])}
+        if not fk_cols:
+            continue
+        for col in target_node["columns"]:
+            if col["name"].lower() in fk_cols:
+                col["is_foreign_key"] = True
+
+
+def _build_node(node: dict) -> dict:
     return {
-        "id": node_id,
-        "label": table["name"],
-        "database": table.get("database") or "",
-        "schema": table.get("schema") or "",
-        "resource_type": table.get("resource_type") or "model",
+        "id": node["id"],
+        "label": node.get("name") or "",
+        "database": node.get("database") or "",
+        "schema": node.get("schema_name") or "",
+        "resource_type": node.get("resource_type") or "model",
         "columns": [
             {
                 "name": c["name"],
                 "type": c.get("data_type") or "",
                 "description": c.get("description") or "",
                 "is_primary_key": bool(c.get("is_primary_key")),
-                "is_foreign_key": c["name"] in fk_cols,
+                "is_foreign_key": bool(c.get("is_foreign_key")),
             }
-            for c in table.get("columns", [])
+            for c in node.get("columns", [])
         ],
     }
+
+
+def _resolve_edge_id(raw: str, node_ids: "set[str]", name_to_id: "dict[str, str]") -> str:
+    # If raw is already a valid node id, keep it (the no-entity_name_format case).
+    # Otherwise resolve through the name→id map built from node names.
+    if raw in node_ids:
+        return raw
+    return name_to_id.get(raw, raw)
+
+
+def _build_edge(edge: dict, index: int, node_ids: "set[str]", name_to_id: "dict[str, str]") -> dict:
+    # from_id is the FK/child side; to_id is the referenced/parent side.
+    # SPA convention: source = parent (to_id), target = child (from_id).
+    raw_from = edge.get("from_id") or ""
+    raw_to = edge.get("to_id") or ""
+    return {
+        "id": edge.get("id") or f"e{index}",
+        "source": _resolve_edge_id(raw_to, node_ids, name_to_id),
+        "target": _resolve_edge_id(raw_from, node_ids, name_to_id),
+        "from_columns": edge.get("from_columns") or [],
+        "to_columns": edge.get("to_columns") or [],
+        "label": edge.get("label") or "",
+        "type": edge.get("cardinality") or "",
+    }
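
The back-fill step in this file can be exercised on its own with a small payload. The sketch below re-implements the same idea under a hypothetical name (`backfill_fk_flags`, without the leading underscore) and uses made-up nodes whose column names differ in case from the edge's `from_columns`, to show that the match is case-insensitive.

```python
def backfill_fk_flags(nodes, edges):
    """Set is_foreign_key=True on child-node columns named in from_columns."""
    nodes_by_id = {n["id"]: n for n in nodes}
    for edge in edges:
        child = nodes_by_id.get(edge["target"])  # target = FK/child side
        if child is None:
            continue
        fk_cols = {c.lower() for c in edge.get("from_columns", [])}
        for col in child["columns"]:
            if col["name"].lower() in fk_cols:
                col["is_foreign_key"] = True


# Hypothetical SPA-shaped nodes and one edge: orders.CUSTOMER_ID -> customers.customer_id
nodes = [
    {
        "id": "model.jaffle_shop.orders",
        "columns": [
            {"name": "ORDER_ID", "is_foreign_key": False},
            {"name": "CUSTOMER_ID", "is_foreign_key": False},
        ],
    },
    {
        "id": "model.jaffle_shop.customers",
        "columns": [{"name": "customer_id", "is_foreign_key": False}],
    },
]
edges = [
    {
        "source": "model.jaffle_shop.customers",
        "target": "model.jaffle_shop.orders",
        "from_columns": ["customer_id"],
        "to_columns": ["customer_id"],
    }
]

backfill_fk_flags(nodes, edges)
flags = {(n["id"], c["name"]): c["is_foreign_key"] for n in nodes for c in n["columns"]}
```

Only the child node's column is flagged; the parent's `customer_id` keeps its original flag because `from_columns` is applied to the edge's `target` alone.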
diff --git a/dbdocs/extract/erd_json.py b/dbdocs/extract/erd_json.py