Docs: nlu-architecture.md and tool-calling-runtime.md describe old FunctionGemma system #11

## Priority: Medium

Two docs in `docs/` still describe the pre-EmbeddingGemma architecture:

1. **`docs/nlu-architecture.md`**: Proposes `all-MiniLM-L6-v2` (23 MB) as the embedding model, with regex-only parameter extraction. The actual system uses EmbeddingGemma 308M for embeddings and Qwen3 for slot filling. The status header now notes that this is a historical design doc, but the body still reads as a current proposal.

2. **`docs/tool-calling-runtime.md`**: Describes FunctionGemma 270M's 2K context budget and two-tier optimization. The current system does not use FunctionGemma at all.

### Options

- **Option A**: Rewrite both docs to reflect the current EmbeddingGemma + Qwen3 pipeline
- **Option B**: Keep as historical design docs with a clear header banner: "This document describes the original design. The current implementation uses EmbeddingGemma + Qwen3 — see architecture.md"
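
If Option B is chosen, the banner could be added with a small script. The following is a minimal sketch, not part of the repo: the `add_banner` helper, the blockquote styling, and the exact banner wording (adapted from the Option B text) are assumptions, and it simply places the banner at the very top of each file.

```python
from pathlib import Path

# Banner wording adapted from Option B; the blockquote styling is an assumption.
BANNER = (
    "> This document describes the original design. "
    "The current implementation uses EmbeddingGemma + Qwen3 (see architecture.md).\n"
)

# The two files named in this issue.
DOCS = ("docs/nlu-architecture.md", "docs/tool-calling-runtime.md")


def add_banner(text: str, banner: str = BANNER) -> str:
    """Return text with the banner at the top, unless it is already present."""
    if banner.strip() in text:
        return text  # idempotent: never add the banner twice
    return banner + "\n" + text


def main() -> None:
    for name in DOCS:
        path = Path(name)
        if not path.exists():
            print(f"skipped (not found): {name}")
            continue
        original = path.read_text(encoding="utf-8")
        path.write_text(add_banner(original), encoding="utf-8")
        print(f"updated: {name}")


if __name__ == "__main__":
    main()
```

Run from the repository root. Because `add_banner` checks for the banner text before inserting, running the script a second time leaves the files unchanged.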

### Files

- `docs/nlu-architecture.md`
- `docs/tool-calling-runtime.md`
