diff --git a/CLAUDE.md b/CLAUDE.md
index 0ec3ed4..cdb8617 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -104,21 +104,21 @@ The XML you provide is wrapped in a minimal `w:document > w:body` structure auto
## MCP Server
-Cloudflare Worker exposing two flavors of MCP tools backed by the same database.
+Cloudflare Worker exposing two tool families over MCP, backed by the same database.
-Semantic search over the spec PDF (powered by `spec_content`):
+Prose search over the spec PDFs (powered by `spec_content`):
-- `search_ecma_spec` - semantic vector search across 18,000+ spec chunks
-- `get_section` - fetch a specific section by ID (e.g., "17.3.1.24")
-- `list_parts` - browse the spec structure
+- `ooxml_search` - semantic vector search across 18,000+ spec chunks
+- `ooxml_section` - fetch a specific section by ID (e.g., "17.3.1.24")
+- `ooxml_parts` - browse the spec structure
Structural queries over the XSD schema graph (powered by `xsd_*` tables):
-- `ooxml_lookup_element` / `ooxml_lookup_type` - canonical symbol info
+- `ooxml_element` / `ooxml_type` - canonical symbol info
- `ooxml_children` - legal children of an element/type/group, in document order
- `ooxml_attributes` - attributes including those inherited and unfolded from attributeGroup refs
- `ooxml_enum` - simpleType enumeration values
-- `ooxml_namespace_info` - vocabularies and per-profile symbol counts for a namespace URI
+- `ooxml_namespace` - vocabularies and per-profile symbol counts for a namespace URI
Uses PostgreSQL with pgvector (Neon serverless in production, Docker locally).
diff --git a/README.md b/README.md
index 768affc..ba11176 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
-[](https://ooxml.dev)
-[](https://api.ooxml.dev/mcp)
+[](https://ooxml.dev)
+[](https://api.ooxml.dev/mcp)
[](https://opensource.org/licenses/MIT)
The OOXML spec, explained by people who actually implemented it.
@@ -23,16 +23,41 @@ We faced this at SuperDoc — building a document engine on native OOXML with no
## MCP Server
-Ask questions in natural language and get answers grounded in the spec, or query the schema graph for precise structural answers.
+Ask questions in natural language and get answers grounded in the spec, or query the schema graph for precise structural answers. Works with Claude Code, Codex CLI, Cursor, and any MCP-compatible client.
+
+**Claude Code**
+
+```bash
+claude mcp add --transport http ooxml https://api.ooxml.dev/mcp
+```
+
+**Codex CLI**
```bash
-claude mcp add --transport http ecma-spec https://api.ooxml.dev/mcp
+codex mcp add ooxml --url https://api.ooxml.dev/mcp
+```
+
+Or in `~/.codex/config.toml`:
+
+```toml
+[mcp_servers.ooxml]
+url = "https://api.ooxml.dev/mcp"
+```
+
+**Cursor** — add to your MCP settings:
+
+```json
+{
+ "mcpServers": {
+ "ooxml": { "url": "https://api.ooxml.dev/mcp" }
+ }
+}
```
-Works with Claude Code, Cursor, and any MCP-compatible client. Two flavors of tools share one server:
+Two tool families share one server:
-- **Semantic** (over the spec PDF): `search_ecma_spec`, `get_section`, `list_parts`
-- **Structural** (over the parsed XSDs): `ooxml_lookup_element`, `ooxml_lookup_type`, `ooxml_children`, `ooxml_attributes`, `ooxml_enum`, `ooxml_namespace_info`
+- **Prose search** (over the spec PDFs): `ooxml_search`, `ooxml_section`, `ooxml_parts`
+- **Schema lookup** (over the parsed XSDs): `ooxml_element`, `ooxml_type`, `ooxml_children`, `ooxml_attributes`, `ooxml_enum`, `ooxml_namespace`
## Development
diff --git a/apps/mcp-server/README.md b/apps/mcp-server/README.md
index bc0db96..64cfd60 100644
--- a/apps/mcp-server/README.md
+++ b/apps/mcp-server/README.md
@@ -1,35 +1,85 @@
-# ECMA-376 Spec MCP Server
+# OOXML Reference MCP Server
-**The world's first ECMA-376 MCP server** - semantic search across the entire Office Open XML specification.
+Cloudflare Worker that exposes ECMA-376 (Office Open XML) over the Model Context Protocol. Two tool families share one server:
-- 18,000+ chunks from all 4 parts of ECMA-376
-- Vector search powered by Voyage embeddings + pgvector
-- Hosted on Cloudflare Workers
+- **Prose search** — semantic search across the four ECMA-376 part PDFs (~18,000 chunks, embedded with Voyage, queried with pgvector).
+- **Schema lookup** — deterministic queries over the parsed XSD graph (profiles, namespaces, symbols, content models, attributes, enums).
-## Connect in Claude Code
+Hosted at `https://api.ooxml.dev/mcp`.
+
+## Connect
+
+### Claude Code
+
+```bash
+claude mcp add --transport http ooxml https://api.ooxml.dev/mcp
+```
+
+### Codex CLI
```bash
-claude mcp add --transport http ecma-spec https://api.ooxml.dev/mcp
+codex mcp add ooxml --url https://api.ooxml.dev/mcp
+```
+
+Or add to `~/.codex/config.toml`:
+
+```toml
+[mcp_servers.ooxml]
+url = "https://api.ooxml.dev/mcp"
+```
+
+### Cursor
+
+Add to your Cursor MCP settings:
+
+```json
+{
+ "mcpServers": {
+ "ooxml": {
+ "url": "https://api.ooxml.dev/mcp"
+ }
+ }
+}
```
-## Endpoints
+### Other clients
+
+Any MCP-compatible client that speaks Streamable HTTP can connect to the endpoint directly.
+
+## Tools
-| Endpoint | Method | Description |
-|----------|--------|-------------|
-| `/mcp` | GET | MCP server info |
-| `/search` | POST | Semantic search (`{query, part?, limit?}`) |
-| `/section` | GET | Get section (`?id=17.3.2&part=1`) |
-| `/stats` | GET | Database stats |
+### Prose search
+
+| Tool | Returns |
+| --- | --- |
+| `ooxml_search` | Semantic search over the spec PDFs |
+| `ooxml_section` | Specific section by ID (e.g. `17.3.2`) |
+| `ooxml_parts` | Spec part / section structure |
+
+### Schema lookup
+
+| Tool | Returns |
+| --- | --- |
+| `ooxml_element` | Canonical info for an element by qname |
+| `ooxml_type` | Canonical info for a complexType or simpleType |
+| `ooxml_children` | Legal children of an element, type, or group (walks inheritance) |
+| `ooxml_attributes` | Attributes including inherited + attributeGroup refs |
+| `ooxml_enum` | Enumeration values for a simpleType |
+| `ooxml_namespace` | Vocabularies and per-profile symbol counts for a namespace URI |
+
+Default profile is `transitional`. Future profiles will compose Transitional with Office extension schemas.
## Development
```bash
-# Install
+# Install (from repo root)
bun install
-# Run locally (needs .dev.vars with DATABASE_URL, VOYAGE_API_KEY)
-wrangler dev
+# Local dev — needs .dev.vars with DATABASE_URL and VOYAGE_API_KEY
+bun run dev:mcp
-# Deploy
-wrangler deploy
+# Deploy (from this directory)
+bun run deploy
```
+
+Database setup, ingest pipelines, and tests live at the repo root — see the top-level `README.md`.
diff --git a/apps/mcp-server/src/index.ts b/apps/mcp-server/src/index.ts
index f50d025..41fa8bf 100644
--- a/apps/mcp-server/src/index.ts
+++ b/apps/mcp-server/src/index.ts
@@ -1,17 +1,16 @@
/**
- * ECMA-376 Spec MCP Server
+ * OOXML Reference MCP Server
*
- * Cloudflare Worker that exposes ECMA-376 specification search via MCP protocol.
- *
- * Tools:
- * - search_ecma_spec: Semantic search across the spec
- * - get_section: Get specific section by ID
- * - list_parts: List spec parts and sections
+ * Cloudflare Worker exposing two tool families over MCP:
+ * - prose search over ECMA-376 PDFs (ooxml_search, ooxml_section, ooxml_parts)
+ * - schema lookup over the parsed XSD graph (ooxml_element, ooxml_type,
+ * ooxml_children, ooxml_attributes, ooxml_enum, ooxml_namespace)
*/
import { createDb } from "./db";
import { embedQuery } from "./embeddings";
-import { handleMcpRequest } from "./mcp";
+import { handleMcpRequest, TOOLS } from "./mcp";
+import { OOXML_TOOL_DEFS } from "./ooxml-tools";
export interface Env {
DATABASE_URL: string;
@@ -169,7 +168,7 @@ export default {
return addCorsHeaders(
new Response(
JSON.stringify({
- name: "ECMA-376 Spec MCP Server",
+ name: "OOXML Reference MCP Server",
version: "0.1.0",
endpoints: {
mcp: "/mcp",
@@ -188,50 +187,15 @@ export default {
},
};
-// MCP info endpoint (GET for debugging)
+// MCP info endpoint (GET for debugging). Tool list is derived from the same
+// canonical exports as the JSON-RPC tools/list response so they can't drift.
function handleMcpInfo(): Response {
return new Response(
JSON.stringify({
- name: "ecma-spec",
+ name: "ooxml",
version: "0.1.0",
- description: "ECMA-376 (Office Open XML) specification search server",
- tools: [
- {
- name: "search_ecma_spec",
- description: "Search the ECMA-376 specification semantically",
- inputSchema: {
- type: "object",
- properties: {
- query: { type: "string", description: "Natural language search query" },
- part: { type: "number", description: "Filter by part number (1-4)" },
- limit: { type: "number", description: "Max results (default: 5)" },
- },
- required: ["query"],
- },
- },
- {
- name: "get_section",
- description: "Get a specific section by ID",
- inputSchema: {
- type: "object",
- properties: {
- section_id: { type: "string", description: "Section ID (e.g., '17.3.2')" },
- part: { type: "number", description: "Part number (1-4)" },
- },
- required: ["section_id"],
- },
- },
- {
- name: "list_parts",
- description: "List spec parts and sections",
- inputSchema: {
- type: "object",
- properties: {
- part: { type: "number", description: "Filter by part number (1-4)" },
- },
- },
- },
- ],
+ description: "OOXML (ECMA-376) reference server: prose search + schema lookup",
+ tools: [...TOOLS, ...OOXML_TOOL_DEFS],
}),
{
headers: { "Content-Type": "application/json" },
diff --git a/apps/mcp-server/src/mcp.ts b/apps/mcp-server/src/mcp.ts
index 6af618f..0621c4c 100644
--- a/apps/mcp-server/src/mcp.ts
+++ b/apps/mcp-server/src/mcp.ts
@@ -37,10 +37,22 @@ const PART_DESCRIPTIONS: Record<number, string> = {
4: "Transitional Migration Features",
};
+/** Shape of an MCP tool definition. Shared with OOXML_TOOL_DEFS so a future
+ * field added to one (annotations, outputSchema, etc.) widens both arrays. */
+export interface ToolDef {
+ name: string;
+ description: string;
+ inputSchema: {
+ type: "object";
+ properties: Record<string, unknown>;
+ required?: string[];
+ };
+}
+
// Tool definitions
-const TOOLS = [
+export const TOOLS: ToolDef[] = [
{
- name: "search_ecma_spec",
+ name: "ooxml_search",
description:
"Semantic search across the ECMA-376 (Office Open XML) specification. Returns relevant sections based on natural language queries about WordprocessingML, SpreadsheetML, PresentationML, and more.",
inputSchema: {
@@ -61,7 +73,7 @@ const TOOLS = [
},
},
{
- name: "get_section",
+ name: "ooxml_section",
description:
"Get a specific section of the ECMA-376 specification by section ID (e.g., '17.3.2' for paragraph properties).",
inputSchema: {
@@ -77,7 +89,7 @@ const TOOLS = [
},
},
{
- name: "list_parts",
+ name: "ooxml_parts",
description: "List ECMA-376 specification parts and their top-level sections.",
inputSchema: {
type: "object" as const,
@@ -124,11 +136,11 @@ function handleInitialize(id: number | string | null): JsonRpcResponse {
tools: {},
},
serverInfo: {
- name: "ecma-spec",
+ name: "ooxml",
version: "0.1.0",
},
instructions:
- "ECMA-376 (Office Open XML) specification search server. Use search_ecma_spec for semantic search, get_section for specific sections, or list_parts to browse the spec structure.",
+ "OOXML (ECMA-376 / Office Open XML) reference server. Two tool families: prose search over the spec PDFs (ooxml_search, ooxml_section, ooxml_parts) and deterministic schema lookup over the parsed XSDs (ooxml_element, ooxml_type, ooxml_children, ooxml_attributes, ooxml_enum, ooxml_namespace).",
},
};
}
@@ -173,7 +185,7 @@ async function handleToolsCall(
}
switch (name) {
- case "search_ecma_spec": {
+ case "ooxml_search": {
const query = args?.query as string;
const part = args?.part as number | undefined;
const limit = Math.min((args?.limit as number) || 5, 20);
@@ -194,7 +206,7 @@ async function handleToolsCall(
break;
}
- case "get_section": {
+ case "ooxml_section": {
const sectionId = args?.section_id as string;
const part = args?.part as number | undefined;
@@ -213,7 +225,7 @@ async function handleToolsCall(
break;
}
- case "list_parts": {
+ case "ooxml_parts": {
const part = args?.part as number | undefined;
const db = createDb(env.DATABASE_URL);
diff --git a/apps/mcp-server/src/ooxml-queries.ts b/apps/mcp-server/src/ooxml-queries.ts
index 7950974..07a83df 100644
--- a/apps/mcp-server/src/ooxml-queries.ts
+++ b/apps/mcp-server/src/ooxml-queries.ts
@@ -1,7 +1,7 @@
/**
* Read-only schema-graph queries powering the OOXML MCP tools:
- * ooxml_lookup_element, ooxml_lookup_type, ooxml_children,
- * ooxml_attributes, ooxml_enum, ooxml_namespace_info.
+ * ooxml_element, ooxml_type, ooxml_children,
+ * ooxml_attributes, ooxml_enum, ooxml_namespace.
*
* These take a tagged-template SQL function (Neon in the deployed Worker,
* postgres.js in local tests). All queries are profile-scoped and walk
diff --git a/apps/mcp-server/src/ooxml-tools.ts b/apps/mcp-server/src/ooxml-tools.ts
index 3dcaad4..1608cc0 100644
--- a/apps/mcp-server/src/ooxml-tools.ts
+++ b/apps/mcp-server/src/ooxml-tools.ts
@@ -2,14 +2,15 @@
* Read-only structural MCP tools backed by the OOXML schema graph.
*
* Tools:
- * ooxml_lookup_element, ooxml_lookup_type, ooxml_children,
- * ooxml_attributes, ooxml_enum, ooxml_namespace_info.
+ * ooxml_element, ooxml_type, ooxml_children,
+ * ooxml_attributes, ooxml_enum, ooxml_namespace.
*
* Default profile is `transitional`. Future profiles (e.g. word-compatible-docx)
* will compose Transitional with Office extension schemas.
*/
import { neon } from "@neondatabase/serverless";
+import type { ToolDef } from "./mcp";
import {
type AttrEntry,
type ChildEdge,
@@ -33,9 +34,9 @@ export interface OoxmlEnv {
DATABASE_URL: string;
}
-export const OOXML_TOOL_DEFS = [
+export const OOXML_TOOL_DEFS: ToolDef[] = [
{
- name: "ooxml_lookup_element",
+ name: "ooxml_element",
description:
"Look up an OOXML element by qname in a profile. Returns canonical symbol info (vocabulary, namespace, declared @type, profile membership, source). Accepts 'w:tbl', '{namespace}localName' (Clark form), or bare 'localName' (defaults to wml-main).",
inputSchema: {
@@ -51,7 +52,7 @@ export const OOXML_TOOL_DEFS = [
},
},
{
- name: "ooxml_lookup_type",
+ name: "ooxml_type",
description:
"Look up a complexType or simpleType by qname in a profile. Tries complexType first, then simpleType.",
inputSchema: {
@@ -107,7 +108,7 @@ export const OOXML_TOOL_DEFS = [
},
},
{
- name: "ooxml_namespace_info",
+ name: "ooxml_namespace",
description:
"Show what's known about a namespace URI: vocabularies, profiles that include it, and how many symbols each profile contributes.",
inputSchema: {
@@ -121,12 +122,12 @@ export const OOXML_TOOL_DEFS = [
];
export type OoxmlToolName =
- | "ooxml_lookup_element"
- | "ooxml_lookup_type"
+ | "ooxml_element"
+ | "ooxml_type"
| "ooxml_children"
| "ooxml_attributes"
| "ooxml_enum"
- | "ooxml_namespace_info";
+ | "ooxml_namespace";
const OOXML_TOOL_NAMES: ReadonlySet<string> = new Set(OOXML_TOOL_DEFS.map((t) => t.name));
@@ -164,7 +165,7 @@ export async function runOoxmlTool(
const profile = (args.profile as string | undefined) ?? DEFAULT_PROFILE;
switch (name) {
- case "ooxml_lookup_element": {
+ case "ooxml_element": {
const q = parseQName(String(args.qname ?? ""));
if (!q.ok) return formatNotFound(`could not parse qname: ${q.reason}`);
const hit = await lookupElement(sql, q.qname.namespace, q.qname.localName, profile);
@@ -177,7 +178,7 @@ export async function runOoxmlTool(
return formatSymbolReport("Element", hit, profile);
}
- case "ooxml_lookup_type": {
+ case "ooxml_type": {
const q = parseQName(String(args.qname ?? ""));
if (!q.ok) return formatNotFound(`could not parse qname: ${q.reason}`);
const hit = await lookupType(sql, q.qname.namespace, q.qname.localName, profile);
@@ -255,7 +256,7 @@ export async function runOoxmlTool(
return formatEnumReport(sym, enums, profile);
}
- case "ooxml_namespace_info": {
+ case "ooxml_namespace": {
const uri = String(args.uri ?? "");
if (!uri) return formatNotFound("namespace URI not provided");
const info = await getNamespaceInfo(sql, uri);
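
The three qname spellings that `ooxml_element` and `ooxml_type` accept (prefixed `w:tbl`, Clark form `{namespace}localName`, and a bare local name that defaults to wml-main) can be illustrated with a small parser. This is a hypothetical sketch, not the repository's `parseQName`: the function name `parseQNameSketch`, the `PREFIXES` table, and the error strings are assumptions. Only the `{ ok, qname }` / `{ ok, reason }` result shape and the wml-main default come from the diff above.

```typescript
// Hypothetical sketch; the real parseQName in the repo may differ.
type QName = { namespace: string; localName: string };
type ParseResult = { ok: true; qname: QName } | { ok: false; reason: string };

// Assumed prefix table. Only the WordprocessingML main namespace is listed.
const WML_MAIN = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
const PREFIXES: Record<string, string> = { w: WML_MAIN };

function parseQNameSketch(input: string): ParseResult {
  const s = input.trim();
  if (s === "") return { ok: false, reason: "empty qname" };

  // Clark form: {namespace}localName
  if (s.startsWith("{")) {
    const close = s.indexOf("}");
    // Rejects a missing brace, an empty namespace, and an empty local name.
    if (close <= 1 || close === s.length - 1) {
      return { ok: false, reason: "malformed Clark name" };
    }
    return {
      ok: true,
      qname: { namespace: s.slice(1, close), localName: s.slice(close + 1) },
    };
  }

  // Prefixed form: prefix:localName
  const colon = s.indexOf(":");
  if (colon !== -1) {
    const prefix = s.slice(0, colon);
    const localName = s.slice(colon + 1);
    const namespace = PREFIXES[prefix];
    if (namespace === undefined) return { ok: false, reason: `unknown prefix: ${prefix}` };
    if (localName === "") return { ok: false, reason: "missing local name" };
    return { ok: true, qname: { namespace, localName } };
  }

  // Bare local name: default to the wml-main namespace.
  return { ok: true, qname: { namespace: WML_MAIN, localName: s } };
}
```

Returning a tagged result instead of throwing matches how the tool handlers above turn a failed parse into a `formatNotFound` message.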
diff --git a/apps/mcp-server/wrangler.toml b/apps/mcp-server/wrangler.toml
index c4cd3eb..faacc0d 100644
--- a/apps/mcp-server/wrangler.toml
+++ b/apps/mcp-server/wrangler.toml
@@ -1,4 +1,4 @@
-name = "ecma-spec-mcp"
+name = "ooxml-mcp"
main = "src/index.ts"
compatibility_date = "2026-01-28"
compatibility_flags = ["nodejs_compat"]
diff --git a/apps/web/public/llms.txt b/apps/web/public/llms.txt
index a10b24d..e5e65f2 100644
--- a/apps/web/public/llms.txt
+++ b/apps/web/public/llms.txt
@@ -33,11 +33,19 @@ Every page combines XML structure, live rendered previews, and implementation no
## MCP Server
-Search the ECMA-376 spec with AI. 18,000+ spec chunks, searchable by meaning.
-
-- `search_ecma_spec`: Semantic search — ask questions in natural language
-- `get_section`: Retrieve a specific section by ID (e.g., "17.3.2")
-- `list_parts`: Browse the specification structure by part (1-4)
+OOXML reference for AI assistants. Two tool families: prose search across 18,000+ ECMA-376 spec chunks, and deterministic schema lookup over the parsed XSDs.
+
+Prose search (over the spec PDFs):
+- `ooxml_search`: Semantic search — ask questions in natural language
+- `ooxml_section`: Retrieve a specific section by ID (e.g., "17.3.2")
+- `ooxml_parts`: Browse the specification structure by part (1-4)
+
+Schema lookup (over the parsed XSDs):
+- `ooxml_element` / `ooxml_type`: Canonical symbol info by qname
+- `ooxml_children`: Legal children of an element, type, or group
+- `ooxml_attributes`: Attributes including inherited + attributeGroup refs
+- `ooxml_enum`: Enumeration values for a simpleType
+- `ooxml_namespace`: Vocabularies and symbol counts for a namespace URI
## About
diff --git a/apps/web/scripts/prerender.ts b/apps/web/scripts/prerender.ts
index 8069d6c..32b6b1b 100644
--- a/apps/web/scripts/prerender.ts
+++ b/apps/web/scripts/prerender.ts
@@ -124,16 +124,25 @@ ${navHtml()}
function mcpPageHtml(): string {
return `
-Search the ECMA-376 spec with AI
-Connect your MCP-compatible client to search 18,000+ specification chunks using natural language queries.
-Available Tools
+OOXML reference for AI assistants
+Connect your MCP-compatible client and get both natural-language search across the ECMA-376 spec and deterministic schema lookup over the parsed XSDs.
+Prose search
-search_ecma_spec — Semantic search across the specification.
-get_section — Retrieve a specific section by ID.
-list_parts — Browse the specification structure.
+ooxml_search — Semantic search across 18,000+ spec chunks.
+ooxml_section — Retrieve a specific section by ID.
+ooxml_parts — Browse the specification structure.
+
+Schema lookup
+
+ooxml_element — Canonical info for an element by qname.
+ooxml_type — Canonical info for a complexType or simpleType.
+ooxml_children — Legal children of an element, type, or group.
+ooxml_attributes — Attributes including inherited + attributeGroup refs.
+ooxml_enum — Enumeration values for a simpleType.
+ooxml_namespace — What's known about a namespace URI.
What is MCP?
-The Model Context Protocol (MCP) is an open standard that lets AI assistants connect to external data sources and tools.
+The Model Context Protocol (MCP) is an open standard that lets AI assistants connect to external data sources and tools. Works with Claude Code, Codex CLI, Cursor, and any MCP-compatible client.
${navHtml()}
`;
}
diff --git a/apps/web/src/pages/Home.tsx b/apps/web/src/pages/Home.tsx
index 017a9df..d50a641 100644
--- a/apps/web/src/pages/Home.tsx
+++ b/apps/web/src/pages/Home.tsx
@@ -35,13 +35,13 @@ export function Home() {
NEW
- Bidirectional Text — RTL layout, logical alignment, tab stop flipping, and bidi pitfalls
+ OOXML MCP — deterministic schema lookup for elements, attributes, types, enums
- Read it →
+ Connect →
diff --git a/apps/web/src/pages/Mcp.tsx b/apps/web/src/pages/Mcp.tsx
index 96a3821..2a4d32e 100644
--- a/apps/web/src/pages/Mcp.tsx
+++ b/apps/web/src/pages/Mcp.tsx
@@ -5,32 +5,66 @@ import { getSeoMeta } from "../data/seo";
import { useDocumentTitle } from "../hooks/useDocumentTitle";
const MCP_ENDPOINT = `${import.meta.env.VITE_API_URL}/mcp`;
-const CLAUDE_COMMAND = `claude mcp add --transport http ecma-spec ${MCP_ENDPOINT}`;
+const CLAUDE_COMMAND = `claude mcp add --transport http ooxml ${MCP_ENDPOINT}`;
+const CODEX_COMMAND = `codex mcp add ooxml --url ${MCP_ENDPOINT}`;
+const CODEX_TOML = `[mcp_servers.ooxml]
+url = "${MCP_ENDPOINT}"`;
-const TOOLS = [
+const PROSE_TOOLS = [
{
- name: "search_ecma_spec",
+ name: "ooxml_search",
description:
- 'Semantic search across the specification. Ask questions like "How do paragraph borders work?" or "What controls table cell margins?"',
+ 'Semantic search across the spec PDFs. Ask questions like "How do paragraph borders work?" or "What controls table cell margins?"',
},
{
- name: "get_section",
+ name: "ooxml_section",
description: 'Retrieve a specific section by ID (e.g., "17.3.2" for paragraph properties).',
},
{
- name: "list_parts",
+ name: "ooxml_parts",
description: "Browse the specification structure. Filter by part (1-4) to explore sections.",
},
];
+const SCHEMA_TOOLS = [
+ {
+ name: "ooxml_element",
+ description:
+ "Look up an OOXML element by qname. Returns vocabulary, namespace, declared @type, and source.",
+ },
+ {
+ name: "ooxml_type",
+ description:
+ "Look up a complexType or simpleType by qname. Tries complexType first, then simpleType.",
+ },
+ {
+ name: "ooxml_children",
+ description:
+ "List the legal children of an element, complexType, or group in document order. Walks inheritance to union content from base types.",
+ },
+ {
+ name: "ooxml_attributes",
+ description:
+ "List the attributes of an element or complexType. Walks inheritance and unfolds attributeGroup refs recursively.",
+ },
+ {
+ name: "ooxml_enum",
+ description: "List enumeration values for a simpleType, in declaration order.",
+ },
+ {
+ name: "ooxml_namespace",
+ description: "Show what's known about a namespace URI: vocabularies, profiles, symbol counts.",
+ },
+];
+
const EXAMPLE_QUERIES = [
"How do I add borders to a table cell?",
- "What's the difference between w:pPr and w:rPr?",
"How does numbering work in WordprocessingML?",
- "Explain the content model for w:document",
+ "What are the legal children of w:CT_Tbl?",
+ "List all attributes of w:CT_R, including inherited ones.",
];
-type TabId = "claude" | "cursor" | "other";
+type TabId = "claude" | "codex" | "cursor" | "other";
export function Mcp() {
useDocumentTitle(getSeoMeta("/mcp").title);
@@ -60,10 +94,10 @@ export function Mcp() {
⚡ MCP Server
-Search the ECMA-376 spec with AI
+OOXML reference for AI assistants
- 18,000+ spec chunks, searchable by meaning. Ask questions in natural language, get the
- relevant sections back.
+ Two tool families: prose search across 18,000+ spec chunks, and deterministic schema
+ lookup over the parsed XSDs. Ask in natural language, or query the structure directly.
@@ -92,6 +126,9 @@ export function Mcp() {
setActiveTab("claude")}>
Claude Code
+ setActiveTab("codex")}>
+ Codex CLI
+ setActiveTab("cursor")}>
Cursor
@@ -123,6 +160,26 @@ export function Mcp() {
@@ -215,9 +295,9 @@ export function Mcp() {
tools.
- By connecting to this MCP server, your AI assistant gains the ability to search and
- retrieve information from the ECMA-376 specification—making it much easier to work with
- Office Open XML.
+ By connecting to this MCP server, your AI assistant gains both prose search across the
+ ECMA-376 specification and deterministic schema lookup over the parsed XSDs—making it
+ much easier to work with Office Open XML.
diff --git a/brand.md b/brand.md
index a37e6be..56dd73f 100644
--- a/brand.md
+++ b/brand.md
@@ -52,7 +52,7 @@ The commercial document vendors (Aspose, Syncfusion, TX Text Control, Nutrient)
**Structural differentiators**:
- **Live previews** — Every XML example renders in real-time via SuperDoc. No other OOXML reference shows you what the XML actually produces.
- **Implementation notes from production** — Not spec commentary. Notes from building a shipping document engine against real-world documents.
-- **AI-native search** — MCP server with semantic vector search across 18,000+ spec chunks. The spec is searchable by meaning, not just keywords.
+- **AI-native reference** — MCP server with two tool families: prose search across 18,000+ spec chunks (ask questions in natural language) and deterministic schema lookup over the parsed XSDs (legal children, attribute lists, enum values, namespaces — exact answers, no hallucination).
- **Real document corpus** — Backed by docx-corpus (1M+ real documents). Observations are tested against actual documents in the wild, not just spec examples.
- **Format-first, tool-agnostic** — Useful whether you're building on SuperDoc, Aspose, your own renderer, or just trying to understand a .docx file.
@@ -133,7 +133,7 @@ _Use on homepage hero, social bios, link previews._
**Slogans for different contexts**:
- Developer discovery: "5,000 pages of spec. The 200 that matter. The notes you actually need."
-- AI/MCP context: "Ask the spec anything. Get answers grounded in implementation experience."
+- AI/MCP context: "Ask the spec anything, or query the schema directly. Two tool families, one server."
- Community pitch: "Hard-won OOXML knowledge, shared freely."
- SuperDoc connection: "Built by SuperDoc — DOCX editing and tooling. Open to everyone."
- Credibility: "Every example is a working document."
diff --git a/scripts/ingest-pdf/README.md b/scripts/ingest-pdf/README.md
index 8e051c0..a14cb5e 100644
--- a/scripts/ingest-pdf/README.md
+++ b/scripts/ingest-pdf/README.md
@@ -1,7 +1,7 @@
# PDF ingest (ECMA-376 prose corpus)
-Builds the semantic-search corpus that powers `search_ecma_spec` /
-`get_section` / `list_parts`. Each ECMA-376 part PDF is extracted into
+Builds the prose-search corpus that powers `ooxml_search` /
+`ooxml_section` / `ooxml_parts`. Each ECMA-376 part PDF is extracted into
section-aware markdown, chunked at ~6 KB boundaries, embedded with the
configured provider, and uploaded into `spec_content`.
diff --git a/scripts/ingest-xsd/README.md b/scripts/ingest-xsd/README.md
index 78cdba2..a417f5a 100644
--- a/scripts/ingest-xsd/README.md
+++ b/scripts/ingest-xsd/README.md
@@ -1,6 +1,6 @@
# XSD ingest (ECMA-376 schema graph)
-Builds the structural-query corpus that powers `ooxml_lookup_element`,
+Builds the structural-query corpus that powers `ooxml_element`,
`ooxml_children`, `ooxml_attributes`, etc. The XSDs published by Ecma
International for ECMA-376 Transitional are parsed and persisted as a
profile-scoped relational graph.
diff --git a/scripts/ingest-xsd/ingest.ts b/scripts/ingest-xsd/ingest.ts
index d4bb8be..caa2859 100644
--- a/scripts/ingest-xsd/ingest.ts
+++ b/scripts/ingest-xsd/ingest.ts
@@ -257,7 +257,7 @@ export async function ingestSchemaSet(opts: IngestSchemaSetOptions): Promise inside CT_Para.
- // Should have type_ref AND profile membership so ooxml_lookup_element finds it.
+ // Should have type_ref AND profile membership so ooxml_element finds it.
const [textSym] = await db.sql`
SELECT s.id, s.type_ref FROM xsd_symbols s
WHERE s.local_name = 'text' AND s.kind = 'element' AND s.vocabulary_id = 'wml-main'
diff --git a/tests/mcp-server/ooxml-queries.test.ts b/tests/mcp-server/ooxml-queries.test.ts
index 7d5c5f2..93c8678 100644
--- a/tests/mcp-server/ooxml-queries.test.ts
+++ b/tests/mcp-server/ooxml-queries.test.ts
@@ -285,7 +285,7 @@ test("local element symbols are scoped per-owner (no cross-CT collapse)", async
test("xsd-builtin symbols have profile membership (lookupSymbolByTypeRef can follow xsd:string)", async () => {
// Built-ins like xsd:string are auto-created during inheritance resolution and
- // must be linked to xsd_symbol_profiles, otherwise ooxml_lookup_type for
+ // must be linked to xsd_symbol_profiles, otherwise ooxml_type for
// 'xsd:string' and lookupSymbolByTypeRef for {...XMLSchema}string return null.
const t = await lookupSymbolByTypeRef(
db.sql,
diff --git a/tests/mcp-server/tools-list.test.ts b/tests/mcp-server/tools-list.test.ts
new file mode 100644
index 0000000..ae02a94
--- /dev/null
+++ b/tests/mcp-server/tools-list.test.ts
@@ -0,0 +1,54 @@
+/**
+ * Snapshot the public MCP surface so a future rename or accidental drop
+ * (between TOOLS, OOXML_TOOL_DEFS, and the docs) fails CI.
+ *
+ * No DB access; we exercise tools/list and the initialize handler. tools/call
+ * is covered by the per-tool tests in ooxml-queries.test.ts.
+ */
+
+import { expect, test } from "bun:test";
+import { handleMcpRequest } from "../../apps/mcp-server/src/mcp.ts";
+
+const EXPECTED_TOOL_NAMES = [
+ // Prose search (over the spec PDFs)
+ "ooxml_search",
+ "ooxml_section",
+ "ooxml_parts",
+ // Schema lookup (over the parsed XSDs)
+ "ooxml_element",
+ "ooxml_type",
+ "ooxml_children",
+ "ooxml_attributes",
+ "ooxml_enum",
+ "ooxml_namespace",
+] as const;
+
+interface JsonRpcResponse {
+ jsonrpc: string;
+ id: number | string | null;
+ result?: { tools?: Array<{ name: string }>; serverInfo?: { name: string } };
+ error?: { code: number; message: string };
+}
+
+async function rpc(method: string, params?: unknown): Promise<JsonRpcResponse> {
+ const req = new Request("https://example.invalid/mcp", {
+ method: "POST",
+ headers: { "Content-Type": "application/json" },
+ body: JSON.stringify({ jsonrpc: "2.0", id: 1, method, params }),
+ });
+ // Env is unused for tools/list and initialize.
+ const env = { DATABASE_URL: "", VOYAGE_API_KEY: "" } as never;
+ const res = await handleMcpRequest(req, env);
+ return (await res.json()) as JsonRpcResponse;
+}
+
+test("tools/list returns the full ooxml_* tool set in the documented order", async () => {
+ const r = await rpc("tools/list");
+ const names = r.result?.tools?.map((t) => t.name) ?? [];
+ expect(names).toEqual([...EXPECTED_TOOL_NAMES]);
+});
+
+test("initialize advertises serverInfo.name as 'ooxml'", async () => {
+ const r = await rpc("initialize");
+ expect(r.result?.serverInfo?.name).toBe("ooxml");
+});
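
The idea behind the shared `ToolDef` type and the snapshot test above, that one canonical list feeds both the GET info endpoint and `tools/list` so the two cannot drift, can be sketched without the Worker or a database. The arrays below are trimmed stand-ins for `TOOLS` and `OOXML_TOOL_DEFS` (names only, empty schemas), and `def`, `listToolNames`, and `findDuplicateNames` are hypothetical helpers that do not exist in the repository.

```typescript
// Minimal stand-in for the ToolDef shape; the loose `properties` typing is an assumption.
interface ToolDef {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, unknown>;
    required?: string[];
  };
}

// Hypothetical helper: builds a placeholder definition from a tool name.
const def = (name: string): ToolDef => ({
  name,
  description: "",
  inputSchema: { type: "object", properties: {} },
});

// Trimmed stand-ins for the two canonical exports.
const PROSE_DEFS: ToolDef[] = ["ooxml_search", "ooxml_section", "ooxml_parts"].map(def);
const SCHEMA_DEFS: ToolDef[] = [
  "ooxml_element",
  "ooxml_type",
  "ooxml_children",
  "ooxml_attributes",
  "ooxml_enum",
  "ooxml_namespace",
].map(def);

// Every consumer derives its list from the same arrays, in the same order.
function listToolNames(...families: ToolDef[][]): string[] {
  return ([] as ToolDef[]).concat(...families).map((t) => t.name);
}

// Returns each name that appears more than once, in first-repeat order.
function findDuplicateNames(names: string[]): string[] {
  const seen = new Set<string>();
  const dupes = new Set<string>();
  for (const n of names) {
    if (seen.has(n)) dupes.add(n);
    seen.add(n);
  }
  return Array.from(dupes);
}

const names = listToolNames(PROSE_DEFS, SCHEMA_DEFS);
```

A duplicate-name check like this is not part of the diff; it is shown only as one extra invariant a snapshot test could assert alongside the exact ordering.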