The OOXML spec, explained by people who actually implemented it.
An interactive reference for ECMA-376 (Office Open XML) built by the SuperDoc team. Live previews are rendered with SuperDoc itself — every example on the site is a working document.
The official ECMA-376 spec is 5,000+ pages. Most of it you'll never need, and the parts you do need often omit critical rendering details that only surface when you compare against Word's actual behavior. This site fills that gap with implementation notes from building SuperDoc — a document engine that renders OOXML natively in the browser.
This is also how people discover SuperDoc. By sharing what we've learned, we position ourselves as the OOXML experts. Every page should reflect that authority: practical, specific, from-experience.
Write for implementers, not spec lawyers. The audience is developers building document tools who need to know what the spec doesn't tell them.
Every doc page should answer:
- What does the XML look like? — Structure tree and live examples
- What does Word actually do? — Rendering behavior, especially where it diverges from the spec
- What will trip you up? — Implementation notes from real experience
Keep notes concise (1-2 sentences). Lead with the insight, not the backstory. Use app: "Word" when the behavior is Word-specific.
apps/
  web/                  React app (Vite, React Router, Tailwind)
    src/data/docs.ts    ← All doc pages live here (single source of truth)
    src/components/     UI components (Sidebar, SuperDocPreview, etc.)
    src/pages/          Route pages (Home, Docs, SpecExplorer, Mcp)
  mcp-server/           Cloudflare Worker: MCP server for AI spec search
packages/
  shared/               Database client, embedding client, types
scripts/
  ingest/               PDF → chunks → embeddings → database pipeline
db/
  schema.sql            PostgreSQL + pgvector schema
dev/
  data/                 Extracted/chunked/embedded spec content
bun install # Install dependencies
bun dev # Web app at http://localhost:5173
bun dev:mcp # MCP server at http://localhost:8787
bun run build # Production build (web)
bun run typecheck # Type-check all packages

All documentation lives in `apps/web/src/data/docs.ts` as a keyed object. Each page has a `title`, optional `badge` (OOXML element), and `content` array of typed blocks.
| Type | Purpose |
|---|---|
| `heading` | Section heading (level 2, 3, or 4) |
| `paragraph` | Prose text (supports markdown links) |
| `code` | Code/structure block with optional language |
| `preview` | Live OOXML rendered by SuperDoc (editable XML + preview) |
| `note` | Implementation note (critical / warning / info / tip) with optional app |
| `table` | Data table with headers and rows |
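
As a rough illustration of the keyed object and the block types in the table above, the shapes might look something like the sketch below. Only `title`, `badge`, `content`, and the six `type` values come from this README; every other field name (`text`, `level`, `xml`, `severity`, and so on) and the example page itself are assumptions, so check `docs.ts` for the real definitions.

```typescript
// Hypothetical shapes; the actual definitions live in apps/web/src/data/docs.ts.
type NoteSeverity = "critical" | "warning" | "info" | "tip";

type ContentBlock =
  | { type: "heading"; level: 2 | 3 | 4; text: string }
  | { type: "paragraph"; text: string }
  | { type: "code"; code: string; language?: string }
  | { type: "preview"; xml: string }
  | { type: "note"; severity: NoteSeverity; text: string; app?: string }
  | { type: "table"; headers: string[]; rows: string[][] };

interface DocPage {
  title: string;
  badge?: string; // OOXML element name, e.g. "w:p"
  content: ContentBlock[];
}

// Pages are keyed by route segment: this one would be served at /docs/paragraph.
const docs: Record<string, DocPage> = {
  paragraph: {
    title: "Paragraph",
    badge: "w:p",
    content: [
      { type: "heading", level: 2, text: "Structure" },
      { type: "paragraph", text: "A paragraph is the basic block of body content." },
      { type: "preview", xml: "<w:p><w:r><w:t>Hello</w:t></w:r></w:p>" },
    ],
  },
};
```

A discriminated union on `type` lets the renderer switch on the block kind and have the compiler flag any block type it forgot to handle.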
- Add an entry to the `docs` object in `apps/web/src/data/docs.ts`
- Add a sidebar link in `apps/web/src/components/Sidebar.tsx` under the right section
- The page auto-routes to `/docs/{key}`
Follow this order (see existing pages for examples):
- Intro paragraph: what the element does, one sentence on why it matters
- Structure: element tree showing hierarchy and attributes
- Examples: live `preview` blocks, start simple, build complexity
- Implementation Notes: the real value; what the spec doesn't tell you
- Schema: reference table of elements/attributes
- critical — things that will break your implementation if you get them wrong
- warning — non-obvious behavior that affects rendering
- info — good to know, won't break things
- tip — helpful shortcuts or techniques
Use app: "Word" (or "Word, LibreOffice") when the behavior is application-specific. Omit app for universal observations.
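
To make the severity levels and the `app` convention concrete, here is a small sketch of what note blocks could look like. The `severity` and `text` field names, the example note wording, and the `sortBySeverity` helper are all assumptions for illustration, not the project's actual types or content.

```typescript
// Hypothetical note shape; field names other than `app` are assumed.
type Severity = "critical" | "warning" | "info" | "tip";

interface Note {
  type: "note";
  severity: Severity;
  text: string;
  app?: string; // omit for universal observations
}

const notes: Note[] = [
  {
    type: "note",
    severity: "warning",
    // No `app`: the unit is defined by the spec, so it applies everywhere.
    text: "w:sz is in half-points, so a value of 24 renders as 12 pt.",
  },
  {
    type: "note",
    severity: "critical",
    app: "Word", // application-specific behavior
    text: "A table cell without at least one paragraph makes Word report the file as corrupt.",
  },
];

// Illustrative helper: order notes from most to least severe without mutating the input.
const SEVERITY_ORDER: Severity[] = ["critical", "warning", "info", "tip"];

function sortBySeverity(input: Note[]): Note[] {
  return [...input].sort(
    (a, b) => SEVERITY_ORDER.indexOf(a.severity) - SEVERITY_ORDER.indexOf(b.severity),
  );
}

const sortedNotes = sortBySeverity(notes);
```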
The preview block type renders XML with SuperDoc loaded from unpkg. It creates a minimal .docx in memory (via JSZip), passes it to SuperDoc, and shows a split view: editable XML on the left, live rendering on the right.
The XML you provide is wrapped in a minimal w:document > w:body structure automatically. Just provide the body content (paragraphs, tables, etc.).
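
The wrapping step described above might look roughly like the following. `wrapBody` is a hypothetical helper name, not the project's actual function; the namespace URI is the standard WordprocessingML main namespace from ECMA-376.

```typescript
// WordprocessingML main namespace, as defined by ECMA-376.
const W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

// Hypothetical helper: wraps body-level XML (paragraphs, tables, etc.)
// in a minimal w:document > w:body structure.
function wrapBody(bodyXml: string): string {
  return [
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>',
    `<w:document xmlns:w="${W_NS}">`,
    "<w:body>",
    bodyXml,
    "</w:body>",
    "</w:document>",
  ].join("\n");
}

const documentXml = wrapBody("<w:p><w:r><w:t>Hello</w:t></w:r></w:p>");
```

The result corresponds to the `word/document.xml` part. A complete .docx package also needs `[Content_Types].xml` and `_rels/.rels`, which this sketch leaves out.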
The MCP server is a Cloudflare Worker exposing three MCP tools for semantic spec search:
- `search_ecma_spec`: semantic vector search across 18,000+ spec chunks
- `get_section`: fetch a specific section by ID (e.g., "17.3.1.24")
- `list_parts`: browse the spec structure
The server uses PostgreSQL with pgvector (Neon serverless in production, Docker locally).
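
For reference, MCP clients invoke tools through a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for the three tools listed above. The tool names come from this README; the argument name `section_id` and the `buildToolCall` helper are assumptions, so consult the server's tool schemas for the real parameter names.

```typescript
type ToolName = "search_ecma_spec" | "get_section" | "list_parts";

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: ToolName; arguments: Record<string, unknown> };
}

// Hypothetical helper: builds the JSON-RPC envelope for an MCP tool call.
function buildToolCall(
  id: number,
  name: ToolName,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// `section_id` is an assumed argument name, used here only for illustration.
const toolRequest = buildToolCall(1, "get_section", { section_id: "17.3.1.24" });
const toolRequestBody = JSON.stringify(toolRequest);
```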
The ingest pipeline loads ECMA-376 PDFs into the vector database:
PDF → extract (Python) → chunk (6KB) → embed (Voyage) → upload (PostgreSQL)
Run the full pipeline: bun scripts/ingest/pipeline.ts
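
As an illustration of the chunk step in the pipeline above, here is one simple way text could be split into pieces of at most 6KB. This is a sketch, not the pipeline's actual chunker: `chunkText` is a hypothetical name, splitting on blank lines is an assumption, and size is measured in string length, which only approximates bytes for mostly-ASCII text.

```typescript
const MAX_CHUNK_SIZE = 6 * 1024; // 6KB, approximated as character count

// Hypothetical chunker: packs whole paragraphs into chunks of at most maxSize characters.
function chunkText(text: string, maxSize: number = MAX_CHUNK_SIZE): string[] {
  const chunks: string[] = [];
  let current = "";

  for (const para of text.split(/\n{2,}/)) {
    const candidate = current ? `${current}\n\n${para}` : para;
    if (candidate.length <= maxSize) {
      current = candidate; // paragraph still fits in the current chunk
      continue;
    }
    if (current) chunks.push(current);
    current = para; // a single oversized paragraph is kept whole in this sketch
  }

  if (current) chunks.push(current);
  return chunks;
}

const sampleChunks = chunkText("aaaa\n\nbbbb\n\ncccc", 10);
```

Keeping paragraphs intact means each embedded chunk carries a complete thought, at the cost of uneven chunk sizes.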
Local dev uses Docker (docker-compose.yml). Production uses NeonDB.
bun run db:up # Start PostgreSQL + pgvector
bun run db:down # Stop
bun run db:reset # Fresh database

- Web app: Cloudflare Pages (`wrangler pages deploy dist`)
- MCP server: Cloudflare Workers (`wrangler deploy` from `apps/mcp-server/`)
- Database: NeonDB (serverless PostgreSQL)