CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Botdocs is an npm CLI tool that converts markdown documentation into a static website with an optional client-side semantic search chatbot. The chatbot uses vector embeddings and cosine similarity search, with Transformers.js running entirely in the browser.

Development Commands

Building

# Build TypeScript (CLI and builder code)
npm run build

# Build client-side code (browser bundle)
npm run build:client

# Watch mode for development
npm run dev

Testing Locally

# Link package globally for testing
npm link

# Run the CLI
botdocs ./test-docs

# Run with verbose output for debugging
botdocs ./test-docs -v

# Disable chatbot
botdocs ./test-docs --no-chat

Publishing

# Prepublish hook automatically runs npm run build
npm publish

Architecture

Dual Build System

The project has two separate build configurations that must both be executed:

  1. Server/CLI Build (npm run build):

    • Uses TypeScript compiler (tsc)
    • Config: tsconfig.json
    • Input: src/ (excludes src/client/)
    • Output: dist/
    • Builds Node.js CLI and builder code
  2. Client Build (npm run build:client):

    • Uses Vite bundler
    • Config: tsconfig.client.json + vite.config.ts
    • Input: src/client/
    • Output: dist-client/
    • Creates browser-ready bundle.js with Transformers.js
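
For orientation, a minimal sketch of what the client build's vite.config.ts might contain, consistent with the inputs and outputs listed above (the entry point and output options here are assumptions, not the actual config):

// Hypothetical vite.config.ts sketch; the real config may differ.
import { defineConfig } from 'vite';

export default defineConfig({
  build: {
    outDir: 'dist-client',
    rollupOptions: {
      input: 'src/client/main.ts',
      output: {
        // Single browser bundle copied into the generated site's assets/
        entryFileNames: 'bundle.js',
      },
    },
  },
});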

Code Organization

src/
├── cli/           # CLI entry point (Commander.js)
├── builder/       # Build pipeline (runs at build-time)
│   ├── index.ts              # Build orchestrator
│   ├── markdown-processor.ts # Parses markdown → HTML
│   ├── chunker.ts            # Splits docs into chunks
│   ├── embedder.ts           # Generates embeddings (build-time)
│   ├── vector-db-builder.ts  # Creates vector-db.json
│   ├── site-generator.ts     # Generates HTML pages
│   └── template-engine.ts    # HTML templating
├── client/        # Browser code (runs at runtime)
│   ├── main.ts               # Entry point
│   ├── navigation.ts         # Site navigation
│   ├── theme/                # Dark mode, etc.
│   └── chat/                 # Semantic search chatbot
│       ├── embedder.ts       # Client-side query embeddings
│       ├── vector-search.ts  # Cosine similarity search
│       ├── rag-engine.ts     # Search orchestration and result formatting
│       └── chatbox.ts        # UI component
├── types/         # TypeScript type definitions
├── templates/     # HTML templates
└── styles/        # CSS files (bundled at build time)

Build Pipeline (4 Phases)

When a user runs botdocs ./docs, the build process executes:

  1. Scan & Parse: markdown-processor.ts finds .md files, parses front matter, converts to HTML with markdown-it
  2. Chunk & Embed: chunker.ts splits by headings, embedder.ts generates 384-dim embeddings, vector-db-builder.ts creates vector-db.json
  3. Generate Site: site-generator.ts applies templates, builds navigation, writes HTML files
  4. Bundle Assets: Copies dist-client/bundle.js and CSS to output/assets/
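
A minimal, self-contained sketch of Phase 1 using the build-time dependencies listed under Dependencies below; the parseDoc shape is illustrative, and the real implementation lives in src/builder/markdown-processor.ts:

// Minimal sketch of Phase 1 (Scan & Parse); shape is illustrative.
import { readFile } from 'node:fs/promises';
import matter from 'gray-matter';
import MarkdownIt from 'markdown-it';

const md = new MarkdownIt();

export async function parseDoc(filePath: string) {
  const raw = await readFile(filePath, 'utf8');
  const { data: frontMatter, content } = matter(raw); // parse YAML front matter
  const html = md.render(content);                    // markdown → HTML
  return { frontMatter, html };
}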

Client-side Search Flow

When a user opens the generated site and uses the chatbot:

  1. Initialization: embedder.ts loads e5-small-v2 embedding model via Transformers.js (cached after first load)
  2. User asks question → embedder.ts embeds query into 384-dim vector
  3. vector-search.ts searches vector-db.json via cosine similarity → retrieves top K most relevant chunks (sketched below)
  4. rag-engine.ts formats results into a conversational response → chatbox.ts displays with citations

Note: The system uses semantic search only - no text generation. Results are presented in a natural format but are direct excerpts from the documentation.
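
Step 3 is a straightforward linear scan over the chunk embeddings. A self-contained sketch of the idea (the Chunk shape and field names are assumptions; see vector-search.ts for the real types):

// Sketch of cosine-similarity top-K retrieval; field names are illustrative.
interface Chunk {
  text: string;
  source: string;      // page the excerpt came from, used for citations
  embedding: number[]; // 384-dim vector produced at build time
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function topK(query: number[], chunks: Chunk[], k = 5): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}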

Key Architectural Decisions

  • ES Modules: Package uses "type": "module" - all imports require .js extensions even for .ts files
  • Path Resolution: Builder code uses fileURLToPath(import.meta.url) and resolve(__dirname, '../../..') to navigate from dist/ back to the project root (see the sketch after this list)
  • No Backend: Generated sites are fully static - vector DB and embedding model run in browser
  • Semantic Search Only: No LLM text generation - the chatbot retrieves and formats relevant documentation chunks
  • Build-time vs Runtime: Embeddings generated twice - once at build time (Node.js, for all chunks), once at runtime (browser, for user queries only)
  • Two tsconfigs: tsconfig.json for Node.js (no DOM), tsconfig.client.json for browser (DOM, no emit)
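
For reference, the path-resolution pattern mentioned above looks roughly like this; the exact relative depth depends on where the compiled file sits inside dist/:

// ES-module equivalent of __dirname, as used by the builder code.
import { fileURLToPath } from 'node:url';
import { dirname, resolve } from 'node:path';

const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

// Navigate from the compiled file in dist/ back to the project root.
const projectRoot = resolve(__dirname, '../../..');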

Configuration

Users can create botdocs.config.json in their docs directory:

{
  "title": "My Documentation",
  "chat": { "enabled": true },
  "build": { "chunkSize": 500, "chunkOverlap": 50, "topK": 5 }
}

CLI options (--no-chat, -v, -o) override config file settings.
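
A sketch of the corresponding TypeScript shape and the override rule; the real definitions live in src/types/config.ts and may differ:

// Illustrative config shape and precedence rule; real types are in src/types/config.ts.
interface BotdocsConfig {
  title?: string;
  chat?: { enabled?: boolean };
  build?: { chunkSize?: number; chunkOverlap?: number; topK?: number };
}

// --no-chat on the CLI wins over botdocs.config.json; chat defaults to enabled.
function resolveChatEnabled(fileConfig: BotdocsConfig, cliNoChat: boolean): boolean {
  if (cliNoChat) return false;
  return fileConfig.chat?.enabled ?? true;
}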

Common Development Workflows

Adding a New CLI Option

  1. Update src/cli/options.ts with new option type
  2. Add the option to the Commander program in src/cli/index.ts (see the sketch after this list)
  3. Pass through to build() in src/builder/index.ts
  4. Update BuildOptions type in src/types/config.ts
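
Step 2 in Commander terms, using the options already documented above as examples; the real program lives in src/cli/index.ts and may be structured differently:

// Illustrative Commander setup; see src/cli/index.ts for the actual program.
import { Command } from 'commander';

const program = new Command('botdocs');

program
  .argument('<docsDir>', 'directory containing markdown docs')
  .option('-v, --verbose', 'enable verbose output')
  .option('-o, --output <dir>', 'output directory')
  .option('--no-chat', 'disable the chatbot')
  .action((docsDir, options) => {
    // options.chat is false when --no-chat is passed (Commander negated option)
    console.log(docsDir, options);
  });

program.parse();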

Modifying the Build Pipeline

  1. Edit relevant builder in src/builder/ (markdown-processor, chunker, embedder, etc.)
  2. Run npm run build to recompile
  3. Test with botdocs ./test-docs -v to see verbose output
  4. Check generated files in output/

Updating Client Code (Chatbot, UI, etc.)

  1. Edit files in src/client/
  2. Run npm run build:client to rebuild browser bundle
  3. Run full build: botdocs ./test-docs
  4. Open output/index.html in browser to test

Debugging Build Issues

  • Use -v flag for verbose logging
  • Check that both builds succeeded: ls dist/ and ls dist-client/
  • Inspect generated output/vector-db.json to verify embeddings
  • Use browser DevTools to debug client-side code (sourcemaps enabled)
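
A quick way to sanity-check the vector DB from Node (the JSON schema assumed here is a guess; adjust to whatever vector-db-builder.ts actually writes):

// Quick sanity check for output/vector-db.json; the schema below is an assumption.
import { readFile } from 'node:fs/promises';

const db = JSON.parse(await readFile('output/vector-db.json', 'utf8'));
const chunks = Array.isArray(db) ? db : db.chunks ?? [];
console.log('chunks:', chunks.length);
console.log('embedding dims:', chunks[0]?.embedding?.length); // expect 384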

Dependencies

Build-time (Node.js)

  • @xenova/transformers: Embedding generation during build
  • markdown-it + plugins: Markdown parsing, syntax highlighting (Shiki)
  • gray-matter: Front matter parsing
  • commander: CLI argument parsing
  • fs-extra, glob: File operations

Client-side (Browser)

  • Transformers.js: Loads e5-small-v2 embedding model (384-dim, 2.2x faster than all-MiniLM-L6-v2)
  • Bundled into single bundle.js via Vite
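
A minimal sketch of what the client-side query embedding step might look like with Transformers.js (the model id and the "query: " prefix convention are assumptions; see src/client/chat/embedder.ts for the actual code):

// Sketch of client-side query embedding; model id is an assumption.
import { pipeline } from '@xenova/transformers';

// Loaded once; the browser caches the model files after the first load.
const extractor = await pipeline('feature-extraction', 'Xenova/e5-small-v2');

export async function embedQuery(query: string): Promise<number[]> {
  // e5 models conventionally expect a "query: " prefix for search queries.
  const output = await extractor(`query: ${query}`, { pooling: 'mean', normalize: true });
  return Array.from(output.data); // 384-dim Float32Array → plain number array
}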

Note: @mlc-ai/web-llm is listed in dependencies but may not be actively used - check model-loader.ts for current implementation.