diff --git a/CHANGELOG.md b/CHANGELOG.md index 5b2d621..6785e47 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -38,6 +38,30 @@ tool does not recognize the token and cannot expand the template. ## OVOS-INTENT-2 — Locale Resource Formats +### 2 + +**Breaking change.** Version 2 reinterprets brace handling in `.prompt` and +`.dialog` files. A `.prompt` authored against version 1 changes meaning under +version 2: its `{{ … }}` sequences become substitution points, and its +`<!-- … -->` sequences are no longer stripped. + +- §4.4 (`.prompt`) — the substitution variable is now the **double-brace** + `{{name}}` form **only**. A single `{name}`, a lone `{` or `}`, and literal + JSON/markup pass through unchanged, which is why prompts (full of literal + single braces) require double braces. The previous author-comment exception + is **removed**: a `.prompt` has no comment handling and `<!-- … -->` is + literal text. `{{name}}` substitution is now the *only* special handling a + `.prompt` receives, dropping the prior fenced-code-block rule (a single-brace + `{name}` is already literal everywhere). +- §4.2 (`.dialog`) — now recognizes **both** named-slot forms, `{name}` and the + equivalent `{{name}}`, per OVOS-INTENT-1 §3.4, treating them identically. The + prior "single-brace only; no `{{ }}` form" restriction is dropped. +- §4.1 (`.intent`) — clarified to match OVOS-INTENT-1 §5.5: lines MAY declare + different sets of named slots, the intent's slot set is their union, and the + engine extracts only the slots of the matched template. The prior "every line + MUST declare the same set of slots; use a separate file" rule is dropped for + `.intent` (it still holds for `.dialog`, §4.2). + ### 1 - The locale folder layout and the plain-text resource file formats diff --git a/README.md b/README.md index e010d99..dc944dd 100644 --- a/README.md +++ b/README.md @@ -96,7 +96,7 @@ pre-spec status quo, `2` for one that is not backwards compatible. 
| ID | Document | Version | Status | |----|----------|---------|--------| | OVOS-INTENT-1 | [Sentence Template Grammar](intent-1.md) | 2 | Draft | -| OVOS-INTENT-2 | [Locale Resource Formats](intent-2.md) | 1 | Draft | +| OVOS-INTENT-2 | [Locale Resource Formats](intent-2.md) | 2 | Draft | | OVOS-INTENT-3 | [Intent Definition](intent-3.md) | 1 | Draft | | OVOS-INTENT-4 | [Intent and Entity Registration](intent-4.md) | 2 | Draft | diff --git a/intent-2.md b/intent-2.md index e4ba66c..6bd0409 100644 --- a/intent-2.md +++ b/intent-2.md @@ -1,6 +1,6 @@ # Locale Resource Formats Specification -**Spec ID:** OVOS-INTENT-2 · **Version:** 1 · **Status:** Draft +**Spec ID:** OVOS-INTENT-2 · **Version:** 2 · **Status:** Draft This document defines the **locale folder layout** and the **plain-text resource file formats** a skill ships so a voice assistant can recognize what the user @@ -166,9 +166,11 @@ grammar — expansion `(a|b)` / `[x]` and named slots `{name}`. **Role.** Defines an intent: the templates whose expanded samples train the engine to recognize one skill action. Matched against **ASR input**; named slots are filled by the engine at match time (OVOS-INTENT-1 §5.1). The file base name -is the intent name. Every line in the file MUST declare the **same set of named -slots** (OVOS-INTENT-1 §5.5); a phrasing that needs different slots belongs in a -separate `.intent` file. +is the intent name. Lines in the file **MAY** declare **different sets of named +slots** (OVOS-INTENT-1 §5.5); the intent's slot set is the **union** of the +slots declared across its templates, and the engine extracts only the slots of +the template that matched (OVOS-INTENT-1 §5.5, OVOS-INTENT-3 §5.1). A phrasing +that needs different slots therefore MAY live in the same `.intent` file. **Loads as.** The union of the sample sets of all lines (OVOS-INTENT-1 §4) — training data for the intent, with named slots intact. 
The engine generalizes @@ -203,8 +205,9 @@ a dialog slot, though an implementation MAY consult one. **Limitation.** A `.dialog` phrase uses the metacharacters `( ) [ ] { } |` structurally and therefore cannot contain any of them as literal spoken text. This is an accepted constraint; spoken responses rarely require them. A -`.dialog` file recognizes only the single-brace slot form `{name}` (§4.1); -there is no `{{ }}` double-brace form. +`.dialog` file recognizes **both** named-slot forms defined by OVOS-INTENT-1 +§3.4 — `{name}` and the equivalent double-brace `{{name}}` — and treats them +identically, since the two forms denote the same slot. **Rendering.** To render a dialog, an implementation selects one phrase, fills its named slots with caller-supplied values, and expands its `(a|b)` / `[x]` @@ -279,80 +282,76 @@ trailer **Format.** Prompt: the **whole file**, verbatim, is one prompt. It is plain text — **not** a template in the OVOS-INTENT-1 grammar — so `(`, `)`, `[`, -`]`, `<`, `>`, `|`, every newline, and all other characters are literal. There -is no expansion and no line filtering: a `.prompt` is read whole (§3), `#` -lines and blank lines included, because every character is part of the prompt. - -**Author-only comments.** The one exception to "every character is part of -the prompt" is **HTML-style comments**: a substring matching `<!-- … -->` is -**stripped** before the file is handed to a language model. Comments may -span multiple lines; nesting is not supported. A `<!--` with no matching `-->` in the same file is malformed — loaders **MUST** report it and -**SHOULD** treat the whole file as literal text (skipping comment -processing) rather than rendering a half-stripped prompt. Comments are -author-only notes (titling, attribution, reviewer cues) that never reach -the model. +`]`, `<`, `>`, `|`, every newline, a single `{` or `}`, and all other +characters are literal. 
There is no expansion and no line filtering: a +`.prompt` is read whole (§3), `#` lines and blank lines included, because every +character is part of the prompt. The **only** special handling a `.prompt` +receives is the **`{{name}}`** substitution described below; nothing else in +the file is interpreted. In particular there is **no comment handling**: an +HTML-style `<!-- … -->` sequence is ordinary literal text and reaches the +language model unchanged. **Role.** The localized prompt a skill feeds to a language model. Like every other resource it is shipped per language under `locale//` and resolved through the override precedence of §2.1, so a prompt can be translated, adjusted per region, or overridden by a user. -**Substitution.** The one special construct is the **`{name}`** substitution -point. A slot **name** consists of lowercase ASCII letters, digits, and -underscores, and MUST NOT begin with a digit. At render time a caller supplies -values keyed by name; an occurrence of `{name}` is replaced by the -caller-supplied value **only when all three hold**: - -1. it forms a complete, well-formed `{name}` (a `{` not followed by a valid - name and a closing `}` is literal text — so `{}`, `{ }`, and JSON such as - `{"key": 1}` pass through untouched); -2. the caller supplied a value for that name; -3. it does **not** lie inside a fenced code block — text between a pair of - ```` ``` ```` fences (CommonMark fenced code block) is literal, so a - `{name}` shown as an example inside a code block is never substituted. - Fence detection follows CommonMark: a line whose first non-whitespace - content is three or more backticks opens or closes a fence; an open - fence with no closing fence in the file extends to end-of-file. - Implementations **MAY** use simpler heuristics (counting triple - backticks) so long as the well-formed cases of §4.4 render identically; - pathological inputs (nested fences, indented fences) are - implementation-defined. 
- -**Slots are optional.** A `{name}` for which the caller supplied no value is +**Substitution.** The one special construct is the **`{{name}}`** substitution +point — the **double-brace** form only. A slot **name** consists of lowercase +ASCII letters, digits, and underscores, and MUST NOT begin with a digit. At +render time a caller supplies values keyed by name; an occurrence of +`{{name}}` is replaced by the caller-supplied value **only when both hold**: + +1. it forms a complete, well-formed `{{name}}` — a double-brace pair enclosing + a valid name; +2. the caller supplied a value for that name. + +A **single** `{name}`, a lone `{` or `}`, and literal JSON or markup such as +`{}`, `{ }`, or `{"key": 1}` are **never** substitution points — they pass +through unchanged. This single-brace pass-through is precisely **why** a +`.prompt` requires the double-brace form: prompts routinely contain literal +single braces (JSON examples, code, set notation), and reserving substitution +to `{{name}}` lets that literal text survive untouched. (This is the opposite +convention to `.intent` / `.dialog`, where `{name}` and `{{name}}` are +equivalent per OVOS-INTENT-1 §3.4, because those templates cannot contain a +literal brace at all.) + +**Slots are optional.** A `{{name}}` for which the caller supplied no value is left **as literal text** — an unfilled slot is not an error. This is the deliberate opposite of `.dialog` (§4.2), where the caller MUST fill every slot and an unfilled one MUST NOT be rendered. A prompt is free-form text that may -legitimately contain brace characters the author never intended as slots, so -substitution is conservative: it touches only the names the caller explicitly -provides. +legitimately contain brace sequences the author never intended as slots, so +substitution is conservative: it touches only the `{{name}}` occurrences whose +names the caller explicitly provides. 
**Loads as.** The single whole-file string, with substitution applied per the rules above. The full content of a file `weather_report.prompt`, rendered with the caller -value `{"query": "weather in Lisbon"}` (a `#` line is ordinary prompt text -here — it is **not** stripped): +value `{"query": "weather in Lisbon"}` (a `#` line and an `<!-- … -->` +sequence are both ordinary prompt text here — neither is stripped): ```` # Weather assistant + You are a concise weather assistant. Answer the user's question. -User asked: {query} +User asked: {{query}} Reply as JSON shaped like {"summary": "...", "temp_c": 0}. The {response} -placeholder shown in the code block below is illustrative only: +single-brace placeholder below is illustrative only: ``` {"summary": "{response}", "temp_c": 18} ``` ```` -`{query}` is substituted; the `# Weather assistant` heading is kept, the -literal JSON braces are left untouched, and the `{response}` inside the fenced -block is not substituted. A `{tone}` slot the caller passed no value for would -likewise stay literal. +`{{query}}` is substituted; the `# Weather assistant` heading and the +`<!-- … -->` line are kept verbatim, the literal single-brace JSON is left +untouched, and the single-brace `{response}` is not a substitution point and so +is unchanged (whether or not it sits in a code block). A `{{tone}}` slot the +caller passed no value for would likewise stay literal. --- @@ -391,4 +390,4 @@ specification. the slot fill modes. Five resource roles — `.intent`, `.dialog`, `.entity`, `.voc`, `.blacklist` — are lists of templates written in this grammar. The `.prompt` role (§4.4) is not a template file; it is verbatim - text with optional `{name}` substitution only. + text with optional `{{name}}` double-brace substitution only.
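
Reviewer note: the version 2 `.prompt` rule from §4.4 (substitute only a well-formed `{{name}}` whose value the caller supplied, leave everything else literal) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the reference implementation. `render_prompt` and `SLOT` are hypothetical names, and the sketch assumes no whitespace is permitted inside the double braces, which the text in this diff does not settle.

```python
import re

# Slot name rule from section 4.4: lowercase ASCII letters, digits and
# underscores, not starting with a digit. Only the double-brace form matches.
# Assumption: no whitespace is tolerated between the braces and the name.
SLOT = re.compile(r"\{\{([a-z_][a-z0-9_]*)\}\}")


def render_prompt(text: str, values: dict[str, str]) -> str:
    """Hypothetical helper: fill supplied {{name}} slots, keep the rest literal."""

    def fill(match: re.Match) -> str:
        name = match.group(1)
        # An unfilled slot is not an error: the original text is kept as-is.
        return str(values[name]) if name in values else match.group(0)

    # A replacement function is used so that backslashes in caller values
    # are inserted verbatim instead of being read as regex escapes.
    return SLOT.sub(fill, text)


prompt = (
    "<!-- author note -->\n"
    "User asked: {{query}}\n"
    'Reply as JSON shaped like {"summary": "{response}", "temp_c": 0}.\n'
    "Tone: {{tone}}\n"
)
rendered = render_prompt(prompt, {"query": "weather in Lisbon"})
print(rendered)
```

In this sketch only `{{query}}` changes. The HTML-style comment, the single-brace JSON and `{response}`, and the unfilled `{{tone}}` all pass through unchanged, matching the behaviour the diff describes for version 2.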