feat(markdown): index documentation graph links #361

Open
QingNagi wants to merge 1 commit into colbymchenry:main from QingNagi:feature/md-index-graph-links


QingNagi commented May 23, 2026

Why this feature is needed

Agent and skill workflows often mix Markdown instructions, phase documents, script templates, and implementation scripts. Important logic is not always in code; it is frequently defined in SKILL.md, runbooks, checklists, tables, and workflow docs.

This change extends CodeGraph’s Markdown indexing so Markdown files can participate in the same graph-based lookup flow as source code.

Markdown files are now indexed structurally: headings become searchable section nodes, stable table rows such as API-AUTH become searchable nodes, and Markdown references to files, scripts, command templates, and file::symbol targets are extracted as graph edges.
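As a rough illustration of what structural indexing can look like, here is a minimal sketch that turns headings and ID-like table rows into nodes. It is not the code in src/extraction/markdown-extractor.ts: the `DocNode` shape, `slugify`, `extractDocNodes`, and the rule that a stable row ID is an uppercase hyphenated token such as API-AUTH are all assumptions made for this example.

```typescript
// Hypothetical node shape for this sketch; the PR's real node types may differ.
interface DocNode {
  kind: "heading" | "table-row";
  name: string; // heading text, or a row ID such as API-AUTH
  anchor?: string; // slug used for #anchor lookups (headings only)
  line: number; // 1-based line number in the Markdown file
}

// Simplified GitHub-style slug: lowercase, drop punctuation, spaces to hyphens.
function slugify(text: string): string {
  return text
    .trim()
    .toLowerCase()
    .replace(/[^\w\s-]/g, "")
    .replace(/\s+/g, "-");
}

function extractDocNodes(markdown: string): DocNode[] {
  const nodes: DocNode[] = [];
  let inFence = false;
  markdown.split(/\r?\n/).forEach((raw, i) => {
    const line = raw.trim();
    // Ignore everything inside fenced code blocks.
    if (/^(```|~~~)/.test(line)) {
      inFence = !inFence;
      return;
    }
    if (inFence) return;

    // ATX heading: one to six '#' characters followed by the heading text.
    const heading = /^(#{1,6})\s+(.+?)\s*#*$/.exec(line);
    if (heading) {
      const name = heading[2];
      nodes.push({ kind: "heading", name, anchor: slugify(name), line: i + 1 });
      return;
    }

    // Table row whose first cell looks like a stable ID (assumed convention).
    const row = /^\|\s*([A-Z][A-Z0-9]*(?:-[A-Z0-9]+)+)\s*\|/.exec(line);
    if (row) {
      nodes.push({ kind: "table-row", name: row[1], line: i + 1 });
    }
  });
  return nodes;
}
```

With this sketch, a `## Database Setup` heading yields a node with anchor `database-setup`, and a row starting with `| API-AUTH |` yields a table-row node named `API-AUTH`. Table header and separator rows are skipped because their first cell does not match the assumed ID pattern.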

It also adds reverse mapping from code back to Markdown. For example, a Python function that opens docs/setup.md#database-setup now creates a graph edge from that function to the Markdown heading node. This lets agents move both ways: from docs to implementation, and from scripts back to the documentation rules or templates they depend on.
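The code-to-Markdown direction can be pictured as a scan over string literals in source code. The sketch below is only an assumption about how such a scan might work, not the implementation added to src/extraction/tree-sitter.ts; `DocRef` and `findMarkdownRefs` are hypothetical names, and the regular expression only handles simple quoted paths.

```typescript
// Hypothetical reference shape for this sketch.
interface DocRef {
  path: string; // e.g. docs/setup.md
  anchor?: string; // e.g. database-setup
}

// Find quoted string literals that look like Markdown paths,
// optionally followed by a #anchor.
function findMarkdownRefs(source: string): DocRef[] {
  const refs: DocRef[] = [];
  const pattern = /["'`]([\w./-]+\.(?:md|mdx|markdown))(?:#([\w-]+))?["'`]/g;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(source)) !== null) {
    // Only set `anchor` when the literal actually carries one.
    refs.push(m[2] ? { path: m[1], anchor: m[2] } : { path: m[1] });
  }
  return refs;
}
```

Each returned reference would then be attached to the enclosing function or constant as the edge source, which matches the `initializeDatabase` and `SETUP_DOC` relationships described in this PR. How the enclosing symbol is determined is outside the scope of this sketch.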

Main benefits:

  • Direct lookup of Markdown headings and table rows.
  • Fewer grep and full-file reads when locating documentation rules.
  • Faster discovery of scripts/functions referenced by Markdown.
  • Easier reverse tracing from code to related Markdown files or sections.
  • Better validation that documentation and implementation stay aligned.

Example graph relationships:

API-AUTH           -> src/auth.ts::loginUser
API-AUTH           -> src/auth.ts::refreshToken
initializeDatabase -> docs/setup.md#database-setup
SETUP_DOC          -> docs/setup.md

Validation covered Markdown extraction, file-symbol resolution, Markdown anchor resolution, and full-pipeline doc/code graph edges.

Infrastructure
src/types.ts — added markdown to the Language union
src/extraction/grammars.ts — added .md, .mdx, and .markdown extension mapping; marked Markdown as a supported custom extractor language
src/extraction/markdown-extractor.ts — new lightweight Markdown extractor for headings, table rows, links, command templates, and file::symbol references
src/extraction/tree-sitter.ts — added Markdown extractor dispatch; added code-string scanning for Markdown path references such as docs/setup.md#database-setup
src/resolution/name-matcher.ts — added resolution for Markdown anchors and file::symbol references into target files
src/resolution/index.ts — updated fast pre-filtering so path-like references with #anchor can still resolve correctly
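To make the two reference forms concrete, the following sketch classifies a raw reference string before resolution. It is an illustration under stated assumptions, not the logic in src/resolution/name-matcher.ts or src/resolution/index.ts: `ParsedRef` and `parseReference` are hypothetical, and the rule that the left side of `::` must end in a file extension is a guess that keeps names like `std::fs` from being treated as file references.

```typescript
// Hypothetical parsed-reference shapes for this sketch.
type ParsedRef =
  | { kind: "file-symbol"; file: string; symbol: string }
  | { kind: "doc-anchor"; file: string; anchor: string }
  | { kind: "file"; file: string }
  | { kind: "name"; name: string };

// True when the text ends in something like ".ts" or ".md".
const hasExtension = (text: string): boolean => /\.\w+$/.test(text);

function parseReference(ref: string): ParsedRef {
  // file::symbol, e.g. src/auth.ts::loginUser
  const sep = ref.indexOf("::");
  if (sep > 0 && sep + 2 < ref.length && hasExtension(ref.slice(0, sep))) {
    return { kind: "file-symbol", file: ref.slice(0, sep), symbol: ref.slice(sep + 2) };
  }
  // path#anchor, e.g. docs/setup.md#database-setup
  const hash = ref.indexOf("#");
  if (hash > 0 && hash + 1 < ref.length && hasExtension(ref.slice(0, hash))) {
    return { kind: "doc-anchor", file: ref.slice(0, hash), anchor: ref.slice(hash + 1) };
  }
  // Plain path, e.g. docs/setup.md
  if (ref.includes("/") || hasExtension(ref)) {
    return { kind: "file", file: ref };
  }
  // Anything else is treated as an ordinary symbol name.
  return { kind: "name", name: ref };
}
```

A pre-filter built on a classification like this can keep path-like references with a `#anchor` suffix in the candidate set by matching on the `file` part alone, then checking the anchor against heading nodes in that file.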

Test updates
__tests__/extraction.test.ts — added Markdown language detection, heading/link extraction, table row extraction, file-symbol extraction, and code-to-Markdown reference tests
__tests__/resolution.test.ts — added Markdown file, anchor, and file::symbol resolution tests
__tests__/integration/full-pipeline.test.ts — added full-pipeline tests for Markdown-to-script/function edges and code-to-Markdown heading edges

Add Markdown extraction for headings, table rows, command templates, and file-symbol references.

Resolve Markdown anchors and file-symbol references, plus code string references back to Markdown files/headings.

Cover Markdown extraction/resolution and full-pipeline md/code graph edges with tests.