From c1142f4d441ff6baa36430800130c5d4174aa6dd Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Sun, 24 May 2026 03:03:39 +0100 Subject: [PATCH 01/27] =?UTF-8?q?docs:=20APPENDIX=20=E2=80=94=20audit-driv?= =?UTF-8?q?en=20corrections=20(pipeline=20+=20registration=20model)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Applies corrections found by auditing claims against actual OVOS source code: 1. **§6.7 enable/disable_intent legacy names corrected** to the real `mycroft.skill.enable_intent` / `mycroft.skill.disable_intent`. 2. **§6.4 direct-bus-subscribe claim broadened** — verified the standard ovos-padatious-pipeline-plugin and ovos-adapt-pipeline-plugin both subscribe directly to registration topics, not just downstream plugins. 3. **§6.4 "side-effects during match" softened** — audit confirms the official match_* methods are already side-effect-free; the skill-activation emit is orchestrator-side, not plugin-side. Rule reframed as forward-looking discipline. 4. **§3 / §4 / §6.4: PIPELINE-1 *refines* the plugin model rather than *introducing* it.** OVOSPipelineFactory, pipeline_plugins dict, _PIPELINE_MIGRATION_MAP, and the official plugin set already exist. PIPELINE-1's actual contribution narrows to: formalizing the contract, `:` polymorphism, universal `ovos.utterance.handled` end-marker, and the renames. 5. **§3 / §4 / §6.4: tier convention is compatible, not a divergence.** From the bus each tier is already a distinct `pipeline_id` in `Session.pipeline`. How a Python plugin class internally serves multiple `pipeline_id`s (one class with match_high/medium/low methods, an orchestrator-side suffix-decoder, separate plugin instances, etc.) is implementation choice the spec does not constrain. 6. **§4 / §6.4: registrations-are-broadcast is compatible, not a divergence.** OVOS already broadcasts registrations on the bus; plugins already subscribe directly. 
INTENT-4 does not change this — it only renames topics into the `ovos.intent.*` namespace (see §6.7). Migration is a string replacement. What IS new is the orchestrator's passive registration index that backs `ovos.intent.list` / `.describe` — that's added as a separate §6.4 divergence ("new orchestrator responsibility, not a change to existing behaviour"). 7. **§6.6 adds note on engine-specific introspection topics** (`intent.service.adapt.*`, `intent.service.padatious.get`) — plugin-defined surface; spec does not claim authority over them. No spec-body changes; APPENDIX only. Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 564 ++++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 411 insertions(+), 153 deletions(-) diff --git a/APPENDIX.md b/APPENDIX.md index 0f9fc71..b54300d 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -4,10 +4,15 @@ specifications. It records design rationale, comparisons with other systems, the catalogue of *deliberate* divergences from current OVOS code, and topics worth discussing that do not belong in a normative -specification. Nothing here is binding — OVOS-INTENT-1, OVOS-INTENT-2, -OVOS-INTENT-3, and OVOS-MSG-1 are the only normative documents. This -appendix exists so the specs themselves can stay terse and -requirement-focused. +specification. Nothing here is binding — the normative documents are +OVOS-INTENT-1, OVOS-INTENT-2, OVOS-INTENT-3, OVOS-INTENT-4, +OVOS-MSG-1, and OVOS-PIPELINE-1. This appendix exists so the specs +themselves can stay terse and requirement-focused. + +Pointers to specific OVOS code (file paths, class names, function +names) are deliberately kept *out* of the spec bodies and collected +here where appropriate, because implementation code moves and +specifications must not. --- @@ -122,32 +127,81 @@ engine-agnostic contract and the pipeline. --- -## 3. 
The pipeline — what these specs do not cover - -The intent specs (OVOS-INTENT-1/2/3) formalize **intent definition**: -the grammar, the resource files, what an intent is, the intent-engine -contract. OVOS-MSG-1 formalizes the bus that carries the result. -The piece that sits *around* both — the multi-stage **pipeline** that -decides which intent engine even gets a turn, interleaves -confidence tiers, runs `converse` / `fallback` / `common_query` / -`ocp` / `persona` stages, and produces the universal -`ovos.utterance.handled` end-marker — is not formalized by any spec -in this repository yet. - -That gap is what makes OVOS structurally distinctive (HA and Rhasspy -have no equivalent layer), and what most reviewers ask about -first. The natural next formalization is a pipeline / utterance- -lifecycle specification; see §7 known gaps. - -One observation worth flagging here: **the engine-agnostic intent -contract is already realized**, not hypothetical. `ovos-persona` plugs -into the pipeline as a first-class LLM stage (`persona-high`, -`persona-low`) — the OVOS-INTENT-3 §6.2 non-normative note about -LLM-backed engines describes something that ships today. The -ordered confidence-tier chain (deterministic Adapt before fuzzy -Padatious before an LLM persona last) is also how the system -*bounds* engine generalization in practice: generalization is not -unconstrained, it is bounded by where an engine sits. +## 3. The pipeline-plugin model + +The piece that sits *around* the intent and bus stacks — the +multi-stage orchestrator that decides which engine even gets a +turn, runs `converse` / `fallback` / `common_query` / `ocp` / +`persona` stages, and produces the universal +`ovos.utterance.handled` end-marker — is what makes OVOS +structurally distinctive (Home Assistant and Rhasspy have no +equivalent layer). 
+ +The plugin abstraction is **already in current code**: +`OVOSPipelineFactory` loads pipeline plugins by id at startup, +the orchestrator holds them in a `pipeline_plugins` dict keyed +on `pipeline_id`, and the default `Session.pipeline` is an +ordered list of plugin identifiers (with a migration map +translating legacy `padatious_high`-style names into +modern `ovos-padatious-pipeline-plugin-high`-style ones). The +official `ovos-padatious-pipeline-plugin`, +`ovos-adapt-pipeline-plugin`, `ovos-converse-pipeline-plugin`, +`ovos-fallback-pipeline-plugin`, `ovos-common-query-pipeline-plugin`, +`ovos-ocp-pipeline-plugin`, and the persona plugins all +already conform to this model. + +OVOS-PIPELINE-1's contribution is therefore a **prescriptive +refinement**, not a wholesale new abstraction. It: + +- formalizes the plugin contract (the `match` shape, the `Match` + result, the side-effect-free discipline); +- defines `:` **dispatch polymorphism** so + a plugin can bundle its own handler (a language-model persona, + a chatbot) as a first-class participant alongside skill-owned + handlers; +- prescribes the **universal `ovos.utterance.handled` end-marker** + on every terminal path; +- renames `session.pipeline` → `session.pipeline_stages` and the + `mycroft.skill.handler.*` trio → `ovos.intent.handler.*`. + +The current high/medium/low confidence-tier convention is +**compatible** with PIPELINE-1 and out of scope for the spec. +From the bus's perspective each tier is already a distinct +`pipeline_id` in the session's pipeline list (e.g. +`padatious_high`, `padatious_medium`, `padatious_low`), which is +exactly what the spec prescribes. How a Python plugin class +internally serves multiple `pipeline_id`s — for example one class +with `match_high` / `match_medium` / `match_low` methods, an +orchestrator-side suffix-decoding helper, three separate plugin +instances, etc. — is implementation choice this spec does not +constrain. 
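As an illustration only, the tier paragraph above can be sketched in a few lines of Python. Everything here is hypothetical and is not OVOS code: the `Match` dataclass, the `TieredKeywordPlugin` class, the `first_match` helper, the `demo_*` identifiers, and the plain-dict session are stand-ins invented for this sketch. It assumes a plugin exposes one side-effect-free matcher per `pipeline_id` and that the orchestrator iterates opaque identifiers until the first match wins.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Match:
    """Hypothetical stand-in for the spec's Match result."""
    owner_id: str       # a skill_id or a pipeline_id
    intent_name: str
    confidence: float


# A matcher takes (utterance, session) and returns a Match or None.
Matcher = Callable[[str, dict], Optional[Match]]


class TieredKeywordPlugin:
    """Toy plugin: one class serving several pipeline_ids via thresholds."""

    def __init__(self, owner_id: str, intents: Dict[str, List[str]]):
        self.owner_id = owner_id
        self.intents = intents  # intent name -> non-empty keyword list

    def _best(self, utterance: str) -> Optional[Match]:
        # Pure computation: score every intent, keep the best one.
        words = set(utterance.lower().split())
        best = None
        for name, keywords in self.intents.items():
            conf = sum(1 for k in keywords if k in words) / len(keywords)
            if best is None or conf > best.confidence:
                best = Match(self.owner_id, name, conf)
        return best

    def matcher(self, threshold: float) -> Matcher:
        # Each tier is the same scoring logic behind a different threshold.
        def match(utterance: str, session: dict) -> Optional[Match]:
            best = self._best(utterance)
            if best is not None and best.confidence >= threshold:
                return best
            return None
        return match


def first_match(utterance: str, session: dict,
                plugins: Dict[str, Matcher]) -> Optional[Tuple[str, Match]]:
    """Iterate opaque pipeline_ids in session order; first match wins."""
    for pipeline_id in session["pipeline"]:
        matcher = plugins.get(pipeline_id)
        if matcher is None:
            continue  # identifier with no loaded plugin is skipped
        result = matcher(utterance, session)
        if result is not None:
            return pipeline_id, result
    return None


plugin = TieredKeywordPlugin("demo-skill", {"lights_on": ["turn", "on", "lights"]})
plugins = {
    "demo_high": plugin.matcher(0.9),
    "demo_medium": plugin.matcher(0.6),
    "demo_low": plugin.matcher(0.3),
}
session = {"pipeline": ["demo_high", "demo_medium", "demo_low"]}
```

In this sketch the orchestrator never learns that three identifiers share one class; replacing the single instance with three separate plugin objects would leave `first_match` unchanged, which is the sense in which the tier convention is an implementation choice.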
+ +Three properties make the resulting model unusually expressive: + +- **All plugins are equivalent.** No spec-level distinction + between intent engines, converse handlers, fallbacks, + language-model personas, classic chatbots, anything else. + They all expose the same `match` contract. A deployment loads + whichever plugins its skills need. +- **Skills and plugin-bundled handlers are indistinguishable as + handler owners.** From outside, the assistant responded — the + user does not know or care whether a skill matched against a + registered intent or a language-model plugin generated the + response on the fly. +- **The engine-agnostic intent contract is already realized**, + not hypothetical. OVOS persona plugins (`ovos-persona`, + `ovos-persona-server`, `ovos-claude-plugin`, + `ovos-openai-plugin`, etc.) plug into the pipeline as + first-class language-model stages. The ordered chain + (deterministic keyword engines before fuzzy template engines + before language-model fallbacks last) is also how the system + *bounds* generalization in practice. + +What OVOS-PIPELINE-1 deliberately leaves out: **per-plugin +behavioural contracts**. A `converse` plugin, a `fallback` +plugin, a persona plugin: each defines itself. PIPELINE-1 only +defines the contract every plugin conforms to and the universal +utterance lifecycle around the iteration. --- @@ -237,6 +291,73 @@ reasoning, not the requirement. state, and similar concerns are deferred to future specifications; see §5.2 for the model and §7 for the list of planned work. +### Intent registration broadcast (INTENT-4) + +- **Registrations are broadcast — already how OVOS works.** Skills + emit registration messages on the bus; plugins that care about a + particular registration kind subscribe to the corresponding + topic. There has never been a central routing party in OVOS; + INTENT-4 just gives this existing model normative topic names. + The legacy bus topics (`padatious:register_intent`, + `register_vocab`, etc.) 
are renamed into the `ovos.intent.*` + namespace — see §6.7 for the mapping. A migration to the + prescribed topic names is mostly a string replacement. +- **No "no plugin claimed" error.** Following from the + broadcast model: a registration that no plugin consumes is + silently dropped. The producer gets no signal — the + introspection topics (`ovos.intent.list` / + `ovos.intent.describe`) are the supported way to verify what + the orchestrator's passive index recorded. +- **The orchestrator passively indexes; it does not gate.** The + introspection topics serve from a passive registration index + built by listening to broadcasts (this *is* new — current OVOS + has no central index). The index reflects what skills + *declared*, not what plugins actually match against — + observability-only. + +### Pipeline plugins (PIPELINE-1) + +- **The plugin model is already in place; PIPELINE-1 refines it** + (see §3). The current orchestrator already loads plugins by id + through `OVOSPipelineFactory` and iterates `Session.pipeline`. + PIPELINE-1 tightens the contract rather than introducing the + abstraction. +- **Orchestrator and plugin contracts live in one spec**, since + the orchestrator's job *is* iterating plugins and translating + their matches into bus events. Splitting them would leave + neither coherent. +- **Plugin contract is minimal.** `match(utterance, session) → + Match | None`. Side-effect-free during `match`; everything + else (state, registrations, language-model calls, response + generation) is plugin-internal black box. The smaller the + contract, the wider the set of plugins it accommodates. +- **Tier conventions are out of scope.** The current high / + medium / low suffix is implementation strategy: from the bus, + each tier is already a distinct `pipeline_id` in + `Session.pipeline`. 
PIPELINE-1 prescribes only that the + orchestrator iterates opaque `pipeline_id`s; whether a Python + plugin class internally serves multiple tiers via + `match_high` / `match_medium` / `match_low` methods, separate + plugin instances, or anything else is implementation choice the + spec does not constrain. The current convention is compatible + with PIPELINE-1 unchanged. +- **Skills and plugins are equivalent handler owners.** Dispatch + topic `:` polymorphism (owner is + `skill_id` or `pipeline_id`) lets a plugin bundle its own + handler — for example, a language-model persona plugin that + has no skills behind it — and still be addressed uniformly. + From outside, the assistant responded; the internal owner + type is invisible. +- **Universal `ovos.utterance.handled` end-marker on every + terminal path.** One reserved invariant lets observers count + turns, route fallbacks, and know "the assistant is idle now" + without per-stage knowledge. +- **`session.pipeline_stages` is per-session.** Different + sessions can carry different pipeline configurations — for + example, a remote-peer session may run a restricted pipeline + that excludes destructive plugins. This composes with the + layer-2 substrate (§5) without orchestrator-side changes. + --- ## 5. The OVOS bus as a substrate @@ -378,84 +499,213 @@ These specifications are *prescriptive*. Some of what they prescribe matches what runs in OVOS today verbatim; some is a deliberate cleanup the implementations are expected to grow into. This section catalogues every known divergence so implementers know what to -migrate and reviewers know what to expect. (OVOS-MSG-1 is by far the -spec closest to current code; the catalogue below is correspondingly -short. Later specs will add more entries.) +migrate and reviewers know what to expect. 
### 6.1 Already aligned -The following are formalizations of behaviour that already exists in -current OVOS code paths and need no implementation change: +Formalizations of behaviour that exists in current OVOS code and +needs no implementation change: - The Message envelope (`type` / `data` / `context`) — matches `ovos-bus-client.Message`. -- `source`, `destination` semantics, including the - `Message.reply` swap — matches `ovos-bus-client/message.py`. +- `source`, `destination` semantics including the `Message.reply` + swap — matches `ovos-bus-client/message.py`. - `context.session` as a serialized Session object — matches - `ovos-bus-client/client/client.py`'s `message.context["session"] = - sess.serialize()`. -- `session.session_id == "default"` for device-local origin — matches - `ovos-audio/utils.py`'s `require_default_session` decorator. + `ovos-bus-client/client/client.py`'s + `message.context["session"] = sess.serialize()`. +- `session.session_id == "default"` for device-local origin — + matches `ovos-audio/utils.py`'s `require_default_session` + decorator. - `session.lang` as the user's preferred language — matches the - Session class's `lang` attribute and existing OVOS read paths. + Session class's `lang` attribute. - `forward` / `reply` / `response` derivation semantics — matches `ovos-bus-client.Message.{forward,reply,response}`. -- The `.response` suffix convention — pervasive across OVOS topics - today. - -### 6.2 New, no legacy - -The only thing OVOS-MSG-1 introduces that has no direct precedent in -current code: - -- The **materialize-default-session** rule on `forward` / `reply` / - `response` (MSG-1 §4.3) — formalizes a "MAY" convenience for - in-process subsystems; not currently implemented, but compatible - with current behaviour (today `session` is propagated only when - present, never materialized). 
- -### 6.3 Things the spec does *not* change - -- The session object's internal shape beyond `session_id` and `lang` - — every other field current OVOS puts inside `context.session` - remains opaque under this spec until the future session - specification. -- The Mycroft-era `mycroft.*` topic prefix outside the intent layer - (e.g. `mycroft.audio.*`) — these are not part of any spec here and - are out of scope. +- The `.response` suffix convention — pervasive across OVOS + topics today. +- The `recognizer_loop:utterance` entry point and + `complete_intent_failure` no-match topic (PIPELINE-1) — match + current topic names verbatim. +- `ovos.utterance.cancelled` and `ovos.utterance.handled` + (PIPELINE-1) — match current topic names verbatim. +- Per-utterance first-match-wins iteration (PIPELINE-1) — matches + `ovos-core/intent_services/service.py`'s + `handle_utterance` / `get_pipeline`. +- Per-session pipeline configuration (PIPELINE-1) — matches + `Session.pipeline` (modulo the field rename in §6.2 below). +- The `:` dispatch topic shape (PIPELINE-1) + — matches current OVOS practice; skills already subscribe to + these topics. + +### 6.2 Prescriptive renames + +| Spec | Current | Prescribed | Notes | +|------|---------|------------|-------| +| INTENT-3 v1.1 | "host" | "orchestrator" | Editorial; conformance unchanged. | +| PIPELINE-1 | `session.pipeline` | `session.pipeline_stages` | The current field name is ambiguous; the new name is explicit. | +| PIPELINE-1 | `mycroft.skill.handler.start` / `.complete` / `.error` | `ovos.intent.handler.start` / `.complete` / `.error` | Renamed into the `ovos.intent.*` namespace for uniformity. Breaks every existing handler-lifecycle observer; the migration cost is real (see §B in PR #11 discussion). | + +### 6.3 Prescriptive shape changes + +- **Keyword intent registration is atomic** (INTENT-4 §5). 
Today + a keyword intent is built up via multiple `register_vocab` + messages followed by a `register_intent` with an Adapt + `IntentBuilder.__dict__` payload. INTENT-4 collapses this into + a single message with structured `{required, optional, one_of, + excluded}` arrays of vocabulary descriptors. Every skill's + keyword-intent path needs to be rewritten in the workshop layer. +- **Template intent registration uses structured identity** + (INTENT-4 §6). Today `padatious:register_intent` carries + `{name, samples, file_name, lang, blacklisted_words}`; the + prescribed shape uses the structured `(skill_id, intent_name, + lang)` triple plus `samples|file` and `blacklist|blacklist_file`. +- **Dispatch payload uses polymorphic `owner_id`** (PIPELINE-1 + §7.1). Today dispatch carries `skill_id` only. PIPELINE-1's + `owner_id` is either a `skill_id` or a `pipeline_id` — same + field, polymorphic value. +- **Handler-lifecycle payload includes `owner_id`** (PIPELINE-1 + §8.2). Today the trio payload is `{name: }`. + Prescribed: `{owner_id, intent_name, optional exception}`. + +### 6.4 Architectural divergences + +- **Intent stages and non-intent stages dissolve into one + abstraction** (PIPELINE-1 §2, §3). Today the orchestrator + treats every loaded plugin uniformly (calls `match`, dispatches + on the returned `match_type`); the conceptual distinction + between "intent engine" and "non-intent stage" is internal to + plugin authors, not to the orchestrator. PIPELINE-1 makes the + uniform model normative by defining `:` + polymorphism for plugin-bundled handlers — letting a + plugin-bundled handler (e.g. a language-model persona) be + addressed on the bus the same way a skill-owned handler is. +- **The orchestrator maintains a passive registration index** + (INTENT-4 §10). Today there is no central index — each plugin + knows what it consumed; nothing aggregates that view. 
INTENT-4 + prescribes the orchestrator subscribe to all registration + topics in parallel with plugins and serve + `ovos.intent.list` / `ovos.intent.describe` from the passive + view. This is a new orchestrator responsibility, not a change + to existing behaviour. +- **Plugins are side-effect-free during `match`** (PIPELINE-1 + §4.2). This is a forward-looking rule rather than a fix for + current code. The standard `match_high`/`match_medium`/ + `match_low` methods in the official plugins are already + side-effect-free (they compute and return). Where side effects + do happen today, they are orchestrator-side after the match + wins (e.g. the `.activate` emit in + `ovos-core/intent_services/service.py:365`), or in *other* bus + handlers a plugin subscribes to. The spec rule keeps the + current discipline normative as alternative plugin types + (LLM-backed, agent-backed) are written. +- **`ovos.utterance.handled` on every terminal path** (PIPELINE-1 + §9.6). Current `ovos-workshop`'s `_on_event_error` does not + emit it on the handler-error path (`ovos.py:1478-1497`). The + spec requires it. Fix tracked separately as a workshop + implementation bug. + +### 6.5 New topics with no direct precedent + +- **`ovos.intent.matched`** (PIPELINE-1 §9.2). The + positive-match broadcast notification. Current OVOS has + `complete_intent_failure` for the negative case but no + positive equivalent. +- **`ovos.intent.list` / `ovos.intent.describe`** (INTENT-4 §10). + Introspection topics served from the orchestrator's passive + registration index. +- **Materialize-default-session rule** on `forward` / `reply` / + `response` (MSG-1 §4.3). Formalizes a "MAY" convenience for + in-process subsystems; not currently implemented but compatible + with current behaviour. + +### 6.6 Things the specs do *not* change + +- The session object's internal shape beyond `session_id`, + `lang`, and `pipeline_stages` (deferred to a future session + spec). 
+- The `mycroft.*` topic prefix outside the intent layer (e.g. + `mycroft.audio.*`) — these are not part of any spec here. +- The `:` dispatch topic — kept verbatim + from current OVOS so no skill needs to migrate its handler + subscription. +- **Engine-specific introspection topics.** The standard plugins + expose their own debug / inspection topics — for example + `intent.service.adapt.reply`, + `intent.service.adapt.manifest`, + `intent.service.adapt.vocab.manifest`, and + `intent.service.padatious.get`. These are + plugin-specific surface, parallel to the spec's generic + `ovos.intent.list` / `ovos.intent.describe` (INTENT-4 §10). + The specs do not claim authority over them — they remain + plugin-defined and may continue to coexist with the + orchestrator's generic index. + +### 6.7 Predecessor-topic mapping + +The bus topics formalized by INTENT-4 and PIPELINE-1 replace a +number of legacy names. Implementer migration aid: + +#### Registration topics (INTENT-4) + +| Legacy topic | v1 replacement | Notes | +|--------------|---------------|-------| +| `register_vocab` | folded into `ovos.intent.register.keyword` | Vocabularies in v1 are inline `samples` or `file`-by-path inside the registration. | +| `register_intent` (Adapt parser) | `ovos.intent.register.keyword` | Adapt's `IntentBuilder.__dict__` payload replaced by the structured shape. | +| `padatious:register_intent` | `ovos.intent.register.template` | Same content, structured payload. | +| `padatious:register_entity` | `ovos.entity.register` | Entities are not Padatious-specific. | +| `detach_intent` | `ovos.intent.deregister` | Identity now expressed as the structured triple, not the munged `skill_id:intent_name` string. | +| `detach_skill` | `ovos.skill.deregister` | | +| `mycroft.skill.enable_intent` / `mycroft.skill.disable_intent` | `ovos.intent.enable` / `ovos.intent.disable` | First-class topics under v1, with the prefix dropped. 
| + +#### Utterance-lifecycle topics (PIPELINE-1) + +| Topic | Status | +|-------|--------| +| `recognizer_loop:utterance` | **unchanged** — kept as the entry point. | +| `complete_intent_failure` | **unchanged** — kept as the no-match signal. | +| `ovos.utterance.cancelled` | **unchanged** — kept as the cancellation signal. | +| `ovos.utterance.handled` | **unchanged** — kept as the universal end-marker. | +| `:` | **unchanged** — kept as the dispatch topic; PIPELINE-1 extends the shape to `:` so plugins can also own handlers. | +| `mycroft.skill.handler.start` / `.complete` / `.error` | renamed to `ovos.intent.handler.start` / `.complete` / `.error` | + +#### Out of scope + +| Topic | Status | +|-------|--------| +| `add_context` / `remove_context` | Adapt conversational context — not part of intent registration. A future spec may define it. | +| `.activate` | Activity-tracking emit currently in `ovos-core`; not part of any spec here. | --- ## 7. Known gaps and planned work -- **A bus-level intent registration and dispatch spec.** OVOS-MSG-1 - defines the envelope and the routing/session keys, but the - *concrete topics* for intent registration, match notification, - handler dispatch, and the handler-lifecycle messages - (`mycroft.skill.handler.{start,complete,error}` etc.) are still - informal. The natural next bus spec is OVOS-INTENT-4, which builds - on OVOS-MSG-1 + OVOS-INTENT-3. -- **A pipeline specification.** Stage ordering, the confidence-tier - model, and the contracts for `converse`, `fallback`, - `common_query`, `ocp`, and `persona` stages are unspecified (§3). +- **Per-plugin behavioural specs.** PIPELINE-1 defines the plugin + contract (the `match` shape, the orchestrator's iteration + semantics) but explicitly defers what each non-trivial plugin + type actually *does*. Real candidates for their own + specifications: `converse`, `fallback`, `common_query`, `ocp`, + `persona`, `stop`. 
Each defines its own internal behaviour and + its own bus emissions beyond the universal lifecycle PIPELINE-1 + prescribes. - **A session specification.** MSG-1 §4 carries `session` opaquely - and names only `session_id` and `lang`. Everything else about the - session is deferred — see §5.2 for the explicit list: session - lifecycle (start, end, expiry, resumption), the full set of - session preferences current OVOS already carries (`pipeline`, - `site_id`, `persona_id`, `time_format`, `date_format`, + and names only `session_id` and `lang`; PIPELINE-1 §5 adds + `pipeline_stages`. Everything else about the session is + deferred — session lifecycle (start, end, expiry, resumption), + the full set of session preferences current OVOS already carries + (`site_id`, `persona_id`, `time_format`, `date_format`, `system_unit`, `tts_preferences`, …), and the shape of any - conversational state. The future session specification will pick - these up; MSG-1's job is to make sure the carrier is in place. -- **A multi-turn conversation specification.** When a skill asks a - question and waits for the next utterance, the "next utterance - belongs to that pending question" link is not formalized today - (handled informally by `converse` + skill-side state). MSG-1's - async-by-default stance (§5.2) leaves room for this to be - formalized either in the session spec or as a separate one. -- **Intent context.** Adapt's `add_context` / `remove_context` - feature — where one intent's match influences a later intent's - eligibility — is not formalized at the spec level. See §5.2. + conversational state. A future session specification picks + these up. +- **A multi-turn conversation specification.** When a skill asks + the user a question and waits for the next utterance, the "next + utterance belongs to that pending question" link is not + formalized today (handled informally by the `converse` plugin + type plus skill-side state). 
MSG-1's async-by-default stance + (§5.2) leaves room for this to be formalized either in the + session spec or as a separate one. +- **Intent context.** The Adapt-era `add_context` / + `remove_context` feature — where one intent's match influences a + later intent's eligibility — is not formalized at the spec + level. - **Text normalization of ASR output.** The basis for slot value typing (OVOS-INTENT-1 §5.3). Deferred to its own specification. - **A machine-checkable conformance corpus** of `template → sample @@ -503,19 +753,26 @@ grammar-level conformance corpus (§7). How the specification set was arrived at — context that explains the *why*, but that has no place in a normative document. -### 9.1 The set, in two stacks +### 9.1 The set, in three stacks -Built bottom-up in two stacks: +Built bottom-up in three stacks: -- The **intent stack**, in dependency order: OVOS-INTENT-1 (template - grammar) → OVOS-INTENT-2 (resource files built on it) → - OVOS-INTENT-3 (the intent concept, built on both). +- The **intent stack**, in dependency order: OVOS-INTENT-1 + (template grammar) → OVOS-INTENT-2 (resource files) → + OVOS-INTENT-3 (the intent concept) → OVOS-INTENT-4 (the + registration wire format on the bus). - The **bus stack**, anchored on existing `ovos-bus-client` wire format: OVOS-MSG-1 formalizes the envelope, routing, session carrier, and `forward`/`reply`/`response` derivations. Originally drafted as two specs (envelope + session/routing) and merged once it became clear the derivations could only meaningfully be defined where the routing keys lived. +- The **orchestrator stack**: OVOS-PIPELINE-1 defines the + orchestrator, the pipeline-plugin abstraction, the utterance + lifecycle, and the handler-lifecycle trio. Sits on top of the + bus stack (uses MSG-1's envelope and routing) and around the + intent stack (intent registrations are one kind of input + pipeline plugins consume). 
Each was a formalization pass over machinery already running in production (§1), not a greenfield design. @@ -523,68 +780,69 @@ production (§1), not a greenfield design. ### 9.2 The reference implementation The specs are implementation-agnostic, but a spec benefits from -one conformant implementation. **ovos-spec-tools** is that for the -intent stack — expander, resource loader, dialog renderer, language -matching, locale linter, in one dependency-light package. It -exists because the same machinery had drifted across six separate -copies in the ecosystem; ovos-spec-tools is what those components -are meant to converge on, and the intended home of the planned -conformance corpus. - -The bus stack does not yet have a comparable reference; -`ovos-bus-client` is the closest match for MSG-1 but predates the -spec. +one conformant implementation. **ovos-spec-tools** is that for +the intent stack — expander, resource loader, dialog renderer, +language matching, locale linter, in one dependency-light +package. It exists because the same machinery had drifted across +six separate copies in the ecosystem; ovos-spec-tools is what +those components are meant to converge on, and the intended home +of the planned conformance corpus. + +The bus and orchestrator stacks do not yet have a comparable +reference; `ovos-bus-client` is the closest match for MSG-1 and +`ovos-core` is the closest match for PIPELINE-1 + INTENT-4, but +both predate the specs. ### 9.3 Audit-driven refinement -Before initial release, each spec was revised across several review -rounds — malformed-form rules, the expansion algorithm, slot -handling, the envelope/routing split (later un-split, see §9.1), -cross-spec terminology. Those rounds happened pre-release, so they -left no intermediate version numbers behind: the audited result -*is* version 1. The CHANGELOG records versioned changes from there -on. 
+Before initial release, each spec was revised across several +review rounds — malformed-form rules, the expansion algorithm, +slot handling, the envelope/routing split (later un-split, see +§9.1), the host → orchestrator rename, the +intent-stage-vs-non-intent-stage distinction (later dissolved +into the uniform pipeline-plugin abstraction), cross-spec +terminology. Those rounds happened pre-release, so they left no +intermediate version numbers behind: the audited result *is* +version 1 (or 1.1 where editorial-only). The CHANGELOG records +versioned changes from there on. --- ## 10. Compatibility levels -Each specification carries its own integer `Version`, bumped per -PR per the contributing rules in the README. The architecture as a -whole is also spoken of at **compatibility levels** — versioned -snapshots a tool may target, and that `ovos-spec-lint` checks -against. - -The levels defined to date apply to the **intent stack** -(OVOS-INTENT-1/2/3): - -- **V0** — *informal.* The undocumented, de-facto behaviour of - Mycroft- and OVOS-derived code from before these specifications - existed. V0 is not specified anywhere; it is the baseline the - formalization started from, named here only so tools can refer to - "pre-spec" behaviour. V0 has no notion of the `.blacklist` - resource role or of `` references. -- **V1** — the specifications as first formalized: OVOS-INTENT-1, - -2 and -3, each at version 1. V1's headline addition over V0 is - the `.blacklist` role — formalized intent suppression. -- **V2** — V1 plus **inline vocabulary references** (the `` - token): OVOS-INTENT-1 and OVOS-INTENT-2 at version 2. A V2 - template cannot be expanded by a V1 tool, so V2 is not backward - compatible with V1. - -A specification that does not change between levels keeps its -lower version number — OVOS-INTENT-3 is at version 1 in both V1 -and V2. 
- -### How the bus stack will be layered in - -OVOS-MSG-1 introduces the bus envelope, which is structurally -orthogonal to the intent stack — a tool can implement the intent -stack without the bus envelope and vice versa. As more bus-layer -specs land, the compatibility-level model is expected to evolve; -the current V0–V2 ladder may grow a second axis or be replaced -with per-stack ladders. - -Until that's settled, the bus-layer specs (OVOS-MSG-1 and the -others in the pipeline behind it) are versioned individually but -not yet placed on a compatibility ladder. +Each specification carries its own integer (or minor) `Version`, +bumped per PR per the contributing rules in the README. The +architecture as a whole was previously spoken of at +**compatibility levels** — versioned snapshots a tool may target, +checked against by `ovos-spec-lint`. + +The compatibility-level model was designed when the architecture +was one stack (the intent grammar / resources / intent definition +chain) and a single integer cleanly identified "all the specs at +once." With the addition of the bus and orchestrator stacks, that +single-axis model no longer describes the architecture. + +The historical intent-stack ladder: + +- **V0** — *informal.* The undocumented, de-facto behaviour from + before these specifications existed. V0 is not specified + anywhere; it is the baseline the formalization started from. + V0 has no notion of the `.blacklist` resource role or of + `` references. +- **V1** — the intent stack as first formalized: OVOS-INTENT-1, + -2 and -3, each at version 1. V1's headline addition over V0 + is the `.blacklist` role. +- **V2** — V1 plus **inline vocabulary references** (the + `` token): OVOS-INTENT-1 and OVOS-INTENT-2 at version 2. + A V2 template cannot be expanded by a V1 tool. + +These intent-stack levels continue to make sense in isolation. 
+The bus stack (OVOS-MSG-1), the registration spec (OVOS-INTENT-4), +and the orchestrator spec (OVOS-PIPELINE-1) are versioned +**individually** and not placed on a unified compatibility +ladder. A tool targeting them today cites per-spec versions: +"MSG-1 v1, INTENT-4 v1, PIPELINE-1 v1." Whether the compat-level +model evolves into a multi-axis grid, per-stack ladders, or is +quietly deprecated in favour of per-spec versions only, is +deferred. + From 4af7908fb03f111229079fad9452f17421519fe9 Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Sun, 24 May 2026 04:06:50 +0100 Subject: [PATCH 02/27] =?UTF-8?q?docs:=20APPENDIX=20=C2=A76.4=20=E2=80=94?= =?UTF-8?q?=20drop=20the=20"dissolution"=20divergence?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Same logic as the broadcast-registrations correction: the orchestrator already treats every loaded plugin uniformly, and `IntentHandlerMatch.match_type` is an opaque string the plugin chooses — nothing in current code prevents a plugin from setting `match_type = ":"` and being dispatched to itself. The `:` polymorphism PIPELINE-1 names is therefore already supported; the spec only writes down a convention current code allows but does not document. Design rationale around the polymorphism stays in §3/§4 — it is useful explicit naming. But it is not a divergence and should not sit in the divergence catalogue. §6.4 now contains a single real divergence: the orchestrator's new passive registration index backing `ovos.intent.list` / `.describe`. Everything else in §6.4 is forward-looking discipline or a workshop bug, not an architectural change. 
Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 10 ---------- 1 file changed, 10 deletions(-) diff --git a/APPENDIX.md b/APPENDIX.md index b54300d..44f1579 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -568,16 +568,6 @@ needs no implementation change: ### 6.4 Architectural divergences -- **Intent stages and non-intent stages dissolve into one - abstraction** (PIPELINE-1 §2, §3). Today the orchestrator - treats every loaded plugin uniformly (calls `match`, dispatches - on the returned `match_type`); the conceptual distinction - between "intent engine" and "non-intent stage" is internal to - plugin authors, not to the orchestrator. PIPELINE-1 makes the - uniform model normative by defining `:` - polymorphism for plugin-bundled handlers — letting a - plugin-bundled handler (e.g. a language-model persona) be - addressed on the bus the same way a skill-owned handler is. - **The orchestrator maintains a passive registration index** (INTENT-4 §10). Today there is no central index — each plugin knows what it consumed; nothing aggregates that view. INTENT-4 From e3897066353f3d44160761cadf76a1c8b1a31eac Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Sun, 24 May 2026 04:17:51 +0100 Subject: [PATCH 03/27] APPENDIX: keep session.pipeline (revert the rename row) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit PIPELINE-1 now keeps the existing `session.pipeline` field name instead of renaming it to `pipeline_stages`. Drop the §6.2 rename row and revert the prose mentions. Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/APPENDIX.md b/APPENDIX.md index 44f1579..0a30fdf 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -161,8 +161,7 @@ refinement**, not a wholesale new abstraction. 
It: handlers; - prescribes the **universal `ovos.utterance.handled` end-marker** on every terminal path; -- renames `session.pipeline` → `session.pipeline_stages` and the - `mycroft.skill.handler.*` trio → `ovos.intent.handler.*`. +- renames the `mycroft.skill.handler.*` trio → `ovos.intent.handler.*`. The current high/medium/low confidence-tier convention is **compatible** with PIPELINE-1 and out of scope for the spec. @@ -352,7 +351,7 @@ reasoning, not the requirement. terminal path.** One reserved invariant lets observers count turns, route fallbacks, and know "the assistant is idle now" without per-stage knowledge. -- **`session.pipeline_stages` is per-session.** Different +- **`session.pipeline` is per-session.** Different sessions can carry different pipeline configurations — for example, a remote-peer session may run a restricted pipeline that excludes destructive plugins. This composes with the @@ -541,7 +540,6 @@ needs no implementation change: | Spec | Current | Prescribed | Notes | |------|---------|------------|-------| | INTENT-3 v1.1 | "host" | "orchestrator" | Editorial; conformance unchanged. | -| PIPELINE-1 | `session.pipeline` | `session.pipeline_stages` | The current field name is ambiguous; the new name is explicit. | | PIPELINE-1 | `mycroft.skill.handler.start` / `.complete` / `.error` | `ovos.intent.handler.start` / `.complete` / `.error` | Renamed into the `ovos.intent.*` namespace for uniformity. Breaks every existing handler-lifecycle observer; the migration cost is real (see §B in PR #11 discussion). | ### 6.3 Prescriptive shape changes @@ -610,7 +608,7 @@ needs no implementation change: ### 6.6 Things the specs do *not* change - The session object's internal shape beyond `session_id`, - `lang`, and `pipeline_stages` (deferred to a future session + `lang`, and `pipeline` (deferred to a future session spec). - The `mycroft.*` topic prefix outside the intent layer (e.g. `mycroft.audio.*`) — these are not part of any spec here. 
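The handler-lifecycle rename recorded in the §6.2 table can be sketched as a prefix swap. This is a minimal illustration, assuming an observer that only needs to translate topic strings; `migrate_topic`, `LEGACY_PREFIX`, `SPEC_PREFIX`, and `HANDLER_EVENTS` are hypothetical names and do not come from any OVOS package. Topics outside the trio, including `mycroft.*` topics outside the intent layer, pass through unchanged.

```python
# Hypothetical migration helper: names are illustrative only and are
# not part of ovos-core, ovos-workshop, or ovos-bus-client.
LEGACY_PREFIX = "mycroft.skill.handler."
SPEC_PREFIX = "ovos.intent.handler."
HANDLER_EVENTS = ("start", "complete", "error")


def migrate_topic(topic: str) -> str:
    """Map a legacy handler-lifecycle topic to its PIPELINE-1 name.

    Only the start / complete / error trio is rewritten. Every other
    topic is returned as-is, because the specs leave it untouched.
    """
    if topic.startswith(LEGACY_PREFIX):
        event = topic[len(LEGACY_PREFIX):]
        if event in HANDLER_EVENTS:
            return SPEC_PREFIX + event
    return topic
```

An observer subscribed to the legacy names could run its subscription list through such a helper once, rather than special-casing each topic; whether a deployment also keeps listening on the legacy names during a transition is a policy choice this sketch does not make.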
@@ -678,7 +676,7 @@ number of legacy names. Implementer migration aid: prescribes. - **A session specification.** MSG-1 §4 carries `session` opaquely and names only `session_id` and `lang`; PIPELINE-1 §5 adds - `pipeline_stages`. Everything else about the session is + `pipeline`. Everything else about the session is deferred — session lifecycle (start, end, expiry, resumption), the full set of session preferences current OVOS already carries (`site_id`, `persona_id`, `time_format`, `date_format`, From 285a1340f8661b38513f4c6f1cd7672bbadf4780 Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Sun, 24 May 2026 04:45:03 +0100 Subject: [PATCH 04/27] =?UTF-8?q?APPENDIX=20=C2=A77:=20note=20utterance-tr?= =?UTF-8?q?ansformer=20chain=20as=20a=20deferred=20spec=20(out=20of=20scop?= =?UTF-8?q?e=20for=20PIPELINE-1)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- APPENDIX.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/APPENDIX.md b/APPENDIX.md index 0a30fdf..b229f47 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -694,6 +694,15 @@ number of legacy names. Implementer migration aid: `remove_context` feature — where one intent's match influences a later intent's eligibility — is not formalized at the spec level. +- **The utterance-transformer chain.** Current OVOS runs an + ordered chain of *transformers* before the pipeline that can + rewrite the utterance, mutate `message.context`, or cancel the + utterance entirely (via `context["canceled"] = true`, observed + on the bus as `ovos.utterance.cancelled`). PIPELINE-1 + intentionally **does not** cover this — transformers don't + match, don't dispatch, and don't own a handler; their loading, + ordering, and contract are a separate concern. A future + transformer specification picks this up. - **Text normalization of ASR output.** The basis for slot value typing (OVOS-INTENT-1 §5.3). Deferred to its own specification. 
- **A machine-checkable conformance corpus** of `template → sample From a7a4e02a5c1188bfa367690cb47b78b7f4af9270 Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Sun, 24 May 2026 18:34:58 +0100 Subject: [PATCH 05/27] =?UTF-8?q?APPENDIX=20=C2=A74=20/=20=C2=A77:=20desig?= =?UTF-8?q?n=20notes=20for=20OVOS-CONTEXT-1=20and=20OVOS-TRANSFORM-1?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Per the new "dedicated APPENDIX PR" policy, consolidating the prior-art and design-deviation notes from the OVOS-CONTEXT-1 (PR #18) and OVOS-TRANSFORM-1 (PR #20) work into this PR. Those spec PRs are now scoped to their own spec files only; the discussion / cross-spec touchups / in-tree prior art all live here. Adds to §4 Design rationale: - "Intent context (CONTEXT-1)" — the Adapt-only origins, the two-scope (private/shared) formalization, jurebes / nebulento / palavreado as prior art for excludes_context, the engine-side §5.3 mutation pathway resolving the PIPELINE-1 §4.2 contradiction. - "Transformer plugins (TRANSFORM-1)" — the architectural- pattern framing, intent transformers as the system-typing home, the nine concrete in-tree plugins as prior art, the ascending-vs-descending priority deviation called out, cancellation alignment with existing plugin convention, and the language disambiguation hierarchy mirroring current ovos-core code paths. Removes from §7 Known gaps: - "Intent context" bullet (formalized in CONTEXT-1). - "The utterance-transformer chain" bullet (formalized in TRANSFORM-1). --- APPENDIX.md | 128 ++++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 115 insertions(+), 13 deletions(-) diff --git a/APPENDIX.md b/APPENDIX.md index b229f47..85007cb 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -357,6 +357,113 @@ reasoning, not the requirement. that excludes destructive plugins. This composes with the layer-2 substrate (§5) without orchestrator-side changes. 
+### Intent context (CONTEXT-1) + +- **Lifts intent context out of Adapt.** The Adapt-era + `add_context` / `remove_context` mechanism, and the + Mycroft-era `mycroft.skill.set_cross_context` / + `mycroft.skill.remove_cross_context` fan-out for cross-skill + use, are Adapt-only at the matcher level — Padatious and + other engines ignore them. CONTEXT-1 generalizes the + mechanism into a session-bound, decaying flat key/value store + consumed by every intent engine uniformly via + `requires_context` and `excludes_context` declarations. +- **Two explicit scopes.** `private` (orchestrator + auto-prefixes with `:`) and `shared` (flat, + cross-skill). The current OVOS code models the same distinction + informally (`MycroftSkill.set_context` auto-prefixes with + `alphanumeric_skill_id`; `set_cross_skill_context` fans out via + a bus event); CONTEXT-1 names the scopes explicitly and routes + both through one bus surface (`intent.context.set` / `.unset` / + `.clear` / `.list`). +- **Prior art for the negative gate.** Three in-tree intent + engines under `/plugins-pipeline/` — + [jurebes](https://github.com/OpenJarbas/jurebes), + [nebulento](https://github.com/OpenJarbas/nebulento), and + [palavreado](https://github.com/OpenJarbas/palavreado) — + independently implement `exclude_context` as a first-class + negative gate. CONTEXT-1's `excludes_context` adopts the same + primitive at the spec level, addressing patterns ("fire once", + "modal suppression") that positive gating alone cannot express. +- **Engine-side mutation as a sanctioned non-bus pathway.** The + Adapt pipeline plugin auto-injects matched entities into context + *inside* `match()`, which conflicts with PIPELINE-1 §4.2's + side-effect-free `match` rule. CONTEXT-1 §5.3 carves an explicit + window between match-accept and dispatch-emit for engine-side + session mutation, with the orchestrator (not the bus) carrying + the write. 
This both legitimizes the established practice and + resolves the PIPELINE-1 contradiction. + +### Transformer plugins (TRANSFORM-1) + +- **Spec'd as an architectural pattern, not a feature list.** An + orchestrator MAY implement chains at any subset of six + injection points (audio, utterance, metadata, intent, dialog, + TTS); a null-implementation is conformant. For each chain it + does implement, the per-type contract binds. Each injection + point's existence is justified by what the lifecycle holds at + that exact moment — what's possible there that isn't possible + elsewhere. +- **Intent transformers as the system-typing home.** + OVOS-INTENT-1 §5.3 defers slot value typing pending a text + normalization specification. TRANSFORM-1 §3.4 is the spec'd + injection home for typing: a deployer ships date / number / + duration parsing once, and every skill receives typed values + in `Match.captures` regardless of which engine matched. The + OVOS analogue of ASK's `AMAZON.DATE` and Dialogflow's + `@sys.date-time`, but as an injected enrichment rather than a + built-in engine feature. +- **Concrete in-tree plugins as prior art.** Nine plugins live + under `/plugins-transformer/` today, covering five of the six + injection points: utterance transformers + (`ovos-utterance-normalizer`, `ovos-utterance-corrections-plugin`, + `ovos-transcription-validator-plugin`, + `ovos-utterance-plugin-cancel`, + `ovos-bidirectional-translation-plugin`); dialog transformers + (`ovos-dialog-normalizer-plugin`, + `ovos-bidirectional-translation-plugin`, + `ovos-dialog-transformer-openai-plugin`); audio transformers + (`ovos-audio-transformer-plugin-speechbrain-langdetect`, + `ovos-audio-transformer-plugin-ggwave`, + `ovos-audio-transformer-redis-publish`); intent transformers + (`ovos-keyword-template-matcher`, + `ovos-ahocorasick-ner-plugin`). The + `bidirectional-translation` plugin exercises the cross-chain + coordination via `Message.context` that TRANSFORM-1 §7 + formalizes. 
+- **Ascending priority.** TRANSFORM-1 §4 specifies ascending + priority (lower = earlier, default 50). Current OVOS sorts + transformer chains **descending** + (`ovos_core/transformers.py:53,117,205`, `reverse=True`); the + spec aligns with the **ascending** convention already used by + fallback skills (`fallback_service.py:49`, default 101 = run + last) and the natural "stages count up" reading. Bringing + current plugins into conformance only requires flipping + relative priorities, not rewriting. +- **Cancellation aligned with prior plugin convention.** Two + existing utterance transformers + (`ovos-utterance-plugin-cancel`, + `ovos-transcription-validator-plugin`) already signal the + lifecycle should abort by returning empty utterance lists with + `{canceled: true, cancel_word: }` context keys. + TRANSFORM-1 §8 keeps the convention, renaming `cancel_word` to + `cancel_reason` (the structured concept the field encodes) and + adding orchestrator-stamped `cancel_by: `. The + spec's `ovos.utterance.cancelled` terminal event sits alongside + the existing `complete_intent_failure` from PIPELINE-1, keeping + cancellation and failure observably distinct on the bus. +- **Language disambiguation.** TRANSFORM-1 §7.1 spec'd a + precedence hierarchy for resolving the in-flight utterance's + operative language: `stt_lang` (STT-attested) > + `request_lang` (source-channel-volunteered) > `detected_lang` + (transformer-derived) > `data.lang` (Message producer) > + existing `session.lang` > config default, gated by + `valid_langs`. Mirrors what current OVOS does informally in + `ovos_core/intent_services/service.py:197-222` + (`disambiguate_lang`); the spec elevates it to a normative + hierarchy with reserved context keys, and explicitly deprecates + the legacy top-level `Message.context["lang"]` shortcut. + --- ## 5. The OVOS bus as a substrate @@ -690,19 +797,14 @@ number of legacy names. Implementer migration aid: type plus skill-side state). 
MSG-1's async-by-default stance (§5.2) leaves room for this to be formalized either in the session spec or as a separate one. -- **Intent context.** The Adapt-era `add_context` / - `remove_context` feature — where one intent's match influences a - later intent's eligibility — is not formalized at the spec - level. -- **The utterance-transformer chain.** Current OVOS runs an - ordered chain of *transformers* before the pipeline that can - rewrite the utterance, mutate `message.context`, or cancel the - utterance entirely (via `context["canceled"] = true`, observed - on the bus as `ovos.utterance.cancelled`). PIPELINE-1 - intentionally **does not** cover this — transformers don't - match, don't dispatch, and don't own a handler; their loading, - ordering, and contract are a separate concern. A future - transformer specification picks this up. +- **Intent context.** Formalized in **OVOS-CONTEXT-1** — see §4 + *Intent context* above. The Adapt-era `add_context` / + `remove_context` feature is lifted to a session-bound, + decaying, engine-agnostic primitive. +- **The utterance-transformer chain.** Formalized in + **OVOS-TRANSFORM-1** — see §4 *Transformer plugins* above — + covering six injection points (audio, utterance, metadata, + intent, dialog, TTS) and their cancellation contract. - **Text normalization of ASR output.** The basis for slot value typing (OVOS-INTENT-1 §5.3). Deferred to its own specification. - **A machine-checkable conformance corpus** of `template → sample From 8ec037b9494f8b17accb87a43dce23d0c7e472e9 Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Sun, 24 May 2026 20:31:10 +0100 Subject: [PATCH 06/27] APPENDIX: SESSION-1 rationale; introspection patterns; revised divergences MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit §4 — new 'Session (SESSION-1)' rationale subsection: why it exists, prescriptive-not-descriptive scope, omission-as-deferral semantics, four language signals. 
§4 'Transformer plugins' — language-disambiguation note updated: hierarchy moved out of TRANSFORM-1 to SESSION-1 §3.2; transformer types now just named as natural producers of signals, consolidation is consumer's stage-dependent choice. §6.4 architectural divergences — add: handler-trio ownership shifted to orchestrator (third-party handler code carries no obligation); per-pipeline_id intent introspection (PIPELINE-1 §10); CONTEXT-1 scope discriminator. Update ovos.utterance.handled note to reflect the trio-ownership shift (workshop fix is now in the wrapper, not the handler). §6.5.1 (new) — introspection-patterns table comparing INTENT-4, PIPELINE-1, CONTEXT-1, TRANSFORM-1 surfaces. Three shared properties (pull-query is source of truth, no completeness signal, per-process slices under split orchestrators). Notes naming-convention inconsistency as candidate follow-up. §6.6 — remove obsolete 'session shape deferred' note; replace with SESSION-1 ownership statement. Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 133 +++++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 115 insertions(+), 18 deletions(-) diff --git a/APPENDIX.md b/APPENDIX.md index 85007cb..5c43edd 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -452,17 +452,50 @@ reasoning, not the requirement. spec's `ovos.utterance.cancelled` terminal event sits alongside the existing `complete_intent_failure` from PIPELINE-1, keeping cancellation and failure observably distinct on the bus. -- **Language disambiguation.** TRANSFORM-1 §7.1 spec'd a - precedence hierarchy for resolving the in-flight utterance's - operative language: `stt_lang` (STT-attested) > - `request_lang` (source-channel-volunteered) > `detected_lang` - (transformer-derived) > `data.lang` (Message producer) > - existing `session.lang` > config default, gated by - `valid_langs`. 
Mirrors what current OVOS does informally in - `ovos_core/intent_services/service.py:197-222` - (`disambiguate_lang`); the spec elevates it to a normative - hierarchy with reserved context keys, and explicitly deprecates - the legacy top-level `Message.context["lang"]` shortcut. +- **Language signals moved to SESSION-1.** Earlier TRANSFORM-1 + drafts spec'd a binding language-disambiguation hierarchy and + reserved `Message.context` keys for `stt_lang`, `request_lang`, + `detected_lang`. These have moved to OVOS-SESSION-1 §3.2 as + session-scoped fields with normative meanings but a non-binding + consolidation order — the right priority is stage-dependent. + TRANSFORM-1 §7.1 now only names which transformer types are + natural producers of which signals; consolidation is the + consumer's decision per SESSION-1 §3.2.7. + +### Session (SESSION-1) + +- **Why SESSION-1 now.** OVOS-MSG-1 §4 originally named two + internal session fields (`session_id`, `lang`) and deferred the + rest. As PIPELINE-1, CONTEXT-1, and TRANSFORM-1 each claimed + fields (`pipeline`, `context`, the six `*_transformers`), the + session became a load-bearing carrier with no single owner of + its wire contract. SESSION-1 consolidates the wire shape and + fixes a **registry mechanism** so future specs claim fields + without amending SESSION-1 itself. +- **Prescriptive, not descriptive.** Only the fields normatively + claimed by other specs are recognized. Implementations + carrying extra per-session state (current OVOS Session class + has `site_id`, `persona_id`, `system_unit`, `time_format`, + `date_format`, `location`, `is_speaking`, `is_recording`, + `blacklisted_skills`, `blacklisted_intents`) are non-normative + under v1 — they ride through as opaque pass-through (§2.3) and + can be claimed by future per-domain specs. 
+- **Omission means "let the orchestrator decide".** Single + deferral mechanism: omitted single field, empty `session: {}`, + absent `session`, explicit `session_id: "default"` — all + equivalent on the wire, all resolve at consumption to deployment + defaults filled by each consumer. No `null`, no sentinels. +- **Language signals.** Four BCP-47 fields with normative meanings + but stage-dependent consolidation: `lang` (user preference, + base), `secondary_langs` (additional understood languages, + constrains lang-detect predictions and fallback selection), + `output_lang` (renderer's preferred output language; simplifies + the bidirectional-translation transformer to a fallback role), + `stt_lang` / `request_lang` / `detected_lang` (per-utterance + signals from STT, emitter, and lang-detect respectively). + `request_lang` is an emitter-reported hint (per-wakeword + language assignment in multi-wakeword setups), not an + override. --- @@ -693,10 +726,39 @@ needs no implementation change: current discipline normative as alternative plugin types (LLM-backed, agent-backed) are written. - **`ovos.utterance.handled` on every terminal path** (PIPELINE-1 - §9.6). Current `ovos-workshop`'s `_on_event_error` does not - emit it on the handler-error path (`ovos.py:1478-1497`). The - spec requires it. Fix tracked separately as a workshop - implementation bug. + §9.5). Current `ovos-workshop`'s `_on_event_error` does not + emit it on the handler-error path (`ovos.py:1478-1497`). Under + the revised PIPELINE-1 §8 (handler-trio is orchestrator-owned, + not handler-owned), this concern dissolves at the spec level: + the orchestrator that invokes the handler wraps the call and + emits the trio itself, then emits `ovos.utterance.handled` + unconditionally. Workshop today plays an orchestrator-wrapper + role for some dispatch paths and is missing the wrapper + events; the fix is in the wrapper, not in third-party handler + code. 
+- **Handler-trio ownership shifted to orchestrator** (PIPELINE-1 + §8). Earlier drafts asked the handler-owning component (skill + or plugin-bundled) to emit `ovos.intent.handler.start` / + `.complete` / `.error`. The revised spec puts that obligation + on the orchestrator that invokes the handler: third-party + handler code carries **no normative obligation**, the + orchestrator wraps every invocation and emits the trio itself. + This is the right ownership — skill authors should not be + writing protocol code — and aligns with how a wrapper would + naturally observe start / return / exception around an opaque + callable. +- **Per-pipeline_id intent introspection** (PIPELINE-1 §10). New + pull-query / scatter-response surface keyed on `pipeline_id`, + giving consumers a way to see *which intents a particular + pipeline plugin's matcher has compiled*, distinct from the + orchestrator's manifest of declared intents (INTENT-4 §10). + No current OVOS analogue. +- **CONTEXT-1 scope discriminator on `requires_context`** + (CONTEXT-1 §6 / §6.1). Adds an OPTIONAL `scope: private|shared` + per entry, default `private`. Closes the footgun where an + unrelated skill's shared `Person` entry could accidentally + satisfy a private gate. Short-form `[Person]` keeps working + (interpreted as `{ key: Person, scope: private }`). ### 6.5 New topics with no direct precedent @@ -712,11 +774,46 @@ needs no implementation change: in-process subsystems; not currently implemented but compatible with current behaviour. +### 6.5.1 Introspection patterns across the specs (informative) + +Four specs in this set define pull-query / scatter-response +introspection surfaces. 
The shapes are intentionally similar but +serve different scopes: + +| Spec | Topic | Scope | Authoritative responder | +|------|-------|-------|-------------------------| +| INTENT-4 §10 | `ovos.intent.list` / `.describe` | Declared intents observed on the bus | Orchestrator (the manifest) | +| PIPELINE-1 §10 | `ovos.pipeline..intents.list` | Intents currently compiled inside a specific plugin's matcher | The pipeline plugin | +| CONTEXT-1 §5.4 | `intent.context.list` | Post-decay session-context snapshot | The orchestrator process owning the match round | +| TRANSFORM-1 §6 | `transformer..list` | Loaded transformers per injection point | The orchestrator process implementing that chain | + +Three properties hold across all four: + +1. **Pull-query is the source of truth.** Producers MAY broadcast + load-time announcements; consumers MUST NOT rely on having + received them. The bus is asynchronous and gives no delivery + guarantee; a consumer that started late missed the broadcast. +2. **No completeness signal.** A consumer that wants completeness + keeps its own roster of expected responders and times out + non-responders. +3. **Per-process slices under split orchestrators.** When the + orchestrator is split (PIPELINE-1 §2), each process responds + from its own slice; consumers aggregate. + +The naming convention is **not yet uniform** across the four +surfaces — INTENT-4 / PIPELINE-1 use the `ovos.` prefix while +CONTEXT-1 / TRANSFORM-1 do not. Standardization is a candidate +follow-up PR; the wire contract is otherwise consistent. + ### 6.6 Things the specs do *not* change -- The session object's internal shape beyond `session_id`, - `lang`, and `pipeline` (deferred to a future session - spec). +- The session object's internal shape is now owned by + OVOS-SESSION-1; the field set is the closed set defined there + plus whatever future specs claim via SESSION-1 §2.1. 
The "extra" + fields current OVOS Session carries (`site_id`, `persona_id`, + `system_unit`, `time_format`, `date_format`, etc.) ride through + as non-normative pass-through and may be claimed by future + per-domain specs. - The `mycroft.*` topic prefix outside the intent layer (e.g. `mycroft.audio.*`) — these are not part of any spec here. - The `:` dispatch topic — kept verbatim From 804080f9aa72fe64f69e393ce21a5349df61c95e Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Sun, 24 May 2026 21:05:39 +0100 Subject: [PATCH 07/27] =?UTF-8?q?APPENDIX:=20update=20=C2=A76.5.1=20topic-?= =?UTF-8?q?naming=20(resolved);=20add=20new=20=C2=A76.4=20divergences?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit §6.5.1: topic-naming inconsistency is now resolved — all four .list surfaces use ovos... Update the table and replace the 'not yet uniform' note with a rename log. §6.4: add four new divergence entries: - Skill self-identification on every emission (INTENT-4 §3.1) - recognizer_loop:utterance de-prescribed (PIPELINE-1 §9.1) - .list topics standardized - (keeps the existing scope-discriminator / handler-trio / per-pipeline_id / utterance.handled entries) Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 35 +++++++++++++++++++++++++++++------ 1 file changed, 29 insertions(+), 6 deletions(-) diff --git a/APPENDIX.md b/APPENDIX.md index 5c43edd..e10ec8b 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -759,6 +759,27 @@ needs no implementation change: unrelated skill's shared `Person` entry could accidentally satisfy a private gate. Short-form `[Person]` keeps working (interpreted as `{ key: Person, scope: private }`). +- **Skill self-identification on every emission** (INTENT-4 §3.1). + Every Message a skill emits MUST carry + `Message.context["skill_id"]`. Current OVOS skills set this on + some emissions (registrations, handler responses) but not + uniformly. 
The spec makes it the authoritative attribution + surface for skill-originated bus traffic — observers attribute + by `context["skill_id"]`, not by parsing topic names. Drives + CONTEXT-1 §5.2 origin-stamping (the orchestrator stamps `origin` + from this field, not from `source`). +- **`recognizer_loop:utterance` de-prescribed** (PIPELINE-1 §9.1). + Earlier drafts treated this topic as normative. The revision + defers the entry-topic name to a future audio-input ↔ + assistant-core wire spec; current deployments use the legacy + name for compatibility but conformant orchestrators MAY adopt + whatever name the future spec settles on. +- **All `.list` topics standardized to `ovos..`** + (TRANSFORM-1 §6, CONTEXT-1 §5). Renames: + `transformer..list` → `ovos.transformer..list`; + `intent.context.set/.unset/.clear/.list` → + `ovos.context.set/.unset/.clear/.list`. INTENT-4 / PIPELINE-1 + topics unchanged. ### 6.5 New topics with no direct precedent @@ -784,8 +805,8 @@ serve different scopes: |------|-------|-------|-------------------------| | INTENT-4 §10 | `ovos.intent.list` / `.describe` | Declared intents observed on the bus | Orchestrator (the manifest) | | PIPELINE-1 §10 | `ovos.pipeline..intents.list` | Intents currently compiled inside a specific plugin's matcher | The pipeline plugin | -| CONTEXT-1 §5.4 | `intent.context.list` | Post-decay session-context snapshot | The orchestrator process owning the match round | -| TRANSFORM-1 §6 | `transformer..list` | Loaded transformers per injection point | The orchestrator process implementing that chain | +| CONTEXT-1 §5.4 | `ovos.context.list` | Post-decay session-context snapshot | The orchestrator process owning the match round | +| TRANSFORM-1 §6 | `ovos.transformer..list` | Loaded transformers per injection point | The orchestrator process implementing that chain | Three properties hold across all four: @@ -800,10 +821,12 @@ Three properties hold across all four: orchestrator is split (PIPELINE-1 §2), each process 
responds from its own slice; consumers aggregate. -The naming convention is **not yet uniform** across the four -surfaces — INTENT-4 / PIPELINE-1 use the `ovos.` prefix while -CONTEXT-1 / TRANSFORM-1 do not. Standardization is a candidate -follow-up PR; the wire contract is otherwise consistent. +All four surfaces use the unified `ovos..` (or +`ovos...` for per-id introspection) naming +convention. CONTEXT-1's prior `intent.context.*` topics were +renamed to `ovos.context.*`, and TRANSFORM-1's prior +`transformer..list` topics were renamed to +`ovos.transformer..list`, in this round. ### 6.6 Things the specs do *not* change From ee12c4b1d75861c0fb1d9ce3b5ee5485b1d917c6 Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Sun, 24 May 2026 21:09:48 +0100 Subject: [PATCH 08/27] =?UTF-8?q?APPENDIX:=20cleanup=20=E2=80=94=20drop=20?= =?UTF-8?q?draft-history=20meta-commentary?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Stand-alone design notes, not a changelog. §4 design rationale: rewrite Session block and TRANSFORM-1 lang bullet to describe current design, not 'moved from earlier draft'. §6.4 divergences: rewrite handler-trio / trio-ownership / scope-discriminator / skill_id-emission / recognizer_loop / topic-naming entries to state current design, not contrast with earlier drafts. §6.5.1 introspection patterns: drop 'in this round' rename note. §9 (rewritten 'Design history' → 'The spec set, in three stacks'): drop §9.3 audit-driven-refinement entirely (changelog content); merge §9.1 + §9.2 into one tighter section about how the eight specs partition and what reference implementations exist. §10 compatibility levels: soften 'was previously spoken of at' to 'is spoken of at'; replace the 'no longer describes' framing with a forward-looking 'tuple covering all eight specs is a planned follow-up'.
Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 197 ++++++++++++++++++++-------------------------------- 1 file changed, 75 insertions(+), 122 deletions(-) diff --git a/APPENDIX.md b/APPENDIX.md index e10ec8b..8af6fe1 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -452,26 +452,22 @@ reasoning, not the requirement. spec's `ovos.utterance.cancelled` terminal event sits alongside the existing `complete_intent_failure` from PIPELINE-1, keeping cancellation and failure observably distinct on the bus. -- **Language signals moved to SESSION-1.** Earlier TRANSFORM-1 - drafts spec'd a binding language-disambiguation hierarchy and - reserved `Message.context` keys for `stt_lang`, `request_lang`, - `detected_lang`. These have moved to OVOS-SESSION-1 §3.2 as - session-scoped fields with normative meanings but a non-binding - consolidation order — the right priority is stage-dependent. - TRANSFORM-1 §7.1 now only names which transformer types are - natural producers of which signals; consolidation is the - consumer's decision per SESSION-1 §3.2.7. +- **Language signals live in SESSION-1.** Language signals + (`stt_lang`, `request_lang`, `detected_lang`, alongside `lang`, + `secondary_langs`, `output_lang`) are session-scoped fields with + normative meanings but a non-binding consolidation order — the + right priority is stage-dependent. TRANSFORM-1 §7.1 names which + transformer types are natural producers of which signals; + consolidation is the consumer's decision per SESSION-1 §3.2.7. ### Session (SESSION-1) -- **Why SESSION-1 now.** OVOS-MSG-1 §4 originally named two - internal session fields (`session_id`, `lang`) and deferred the - rest. As PIPELINE-1, CONTEXT-1, and TRANSFORM-1 each claimed - fields (`pipeline`, `context`, the six `*_transformers`), the - session became a load-bearing carrier with no single owner of - its wire contract. 
SESSION-1 consolidates the wire shape and - fixes a **registry mechanism** so future specs claim fields - without amending SESSION-1 itself. +- **Why a separate session spec.** `Message.context.session` is a + load-bearing carrier claimed by multiple specs (PIPELINE-1, + CONTEXT-1, TRANSFORM-1) — without a single owner, its wire + contract drifts. SESSION-1 consolidates the wire shape and fixes + a **registry mechanism** so future specs claim fields without + amending SESSION-1 itself. - **Prescriptive, not descriptive.** Only the fields normatively claimed by other specs are recognized. Implementations carrying extra per-session state (current OVOS Session class @@ -727,59 +723,48 @@ needs no implementation change: (LLM-backed, agent-backed) are written. - **`ovos.utterance.handled` on every terminal path** (PIPELINE-1 §9.5). Current `ovos-workshop`'s `_on_event_error` does not - emit it on the handler-error path (`ovos.py:1478-1497`). Under - the revised PIPELINE-1 §8 (handler-trio is orchestrator-owned, - not handler-owned), this concern dissolves at the spec level: - the orchestrator that invokes the handler wraps the call and - emits the trio itself, then emits `ovos.utterance.handled` - unconditionally. Workshop today plays an orchestrator-wrapper - role for some dispatch paths and is missing the wrapper - events; the fix is in the wrapper, not in third-party handler - code. -- **Handler-trio ownership shifted to orchestrator** (PIPELINE-1 - §8). Earlier drafts asked the handler-owning component (skill - or plugin-bundled) to emit `ovos.intent.handler.start` / - `.complete` / `.error`. The revised spec puts that obligation - on the orchestrator that invokes the handler: third-party - handler code carries **no normative obligation**, the - orchestrator wraps every invocation and emits the trio itself. 
- This is the right ownership — skill authors should not be - writing protocol code — and aligns with how a wrapper would - naturally observe start / return / exception around an opaque - callable. -- **Per-pipeline_id intent introspection** (PIPELINE-1 §10). New - pull-query / scatter-response surface keyed on `pipeline_id`, - giving consumers a way to see *which intents a particular + emit it on the handler-error path (`ovos.py:1478-1497`). + PIPELINE-1 §8 places trio emission on the orchestrator-wrapper + around the handler, not on the handler itself — workshop is the + wrapper in current OVOS, and the spec contract requires the + wrapper to emit `ovos.utterance.handled` unconditionally. +- **Handler-trio is orchestrator-owned** (PIPELINE-1 §8). The + orchestrator that invokes the handler wraps the call and emits + `ovos.intent.handler.start` / `.complete` / `.error` around it. + Third-party handler code carries **no normative obligation** to + participate in trio emission. Skill authors are not protocol + authors; the wrapper observes start / return / exception around + an opaque callable. +- **Per-pipeline_id intent introspection** (PIPELINE-1 §10). + Pull-query / scatter-response surface keyed on `pipeline_id`, + giving consumers visibility into *which intents a particular pipeline plugin's matcher has compiled*, distinct from the - orchestrator's manifest of declared intents (INTENT-4 §10). - No current OVOS analogue. + orchestrator's manifest of declared intents (INTENT-4 §10). No + current OVOS analogue. - **CONTEXT-1 scope discriminator on `requires_context`** - (CONTEXT-1 §6 / §6.1). Adds an OPTIONAL `scope: private|shared` - per entry, default `private`. Closes the footgun where an - unrelated skill's shared `Person` entry could accidentally - satisfy a private gate. Short-form `[Person]` keeps working - (interpreted as `{ key: Person, scope: private }`). + (CONTEXT-1 §6 / §6.1). OPTIONAL `scope: private|shared` per + entry, default `private`. 
Prevents an unrelated skill's shared + `Person` entry from accidentally satisfying a private gate. + Short-form `[Person]` is interpreted as + `{ key: Person, scope: private }`. - **Skill self-identification on every emission** (INTENT-4 §3.1). - Every Message a skill emits MUST carry + Every Message a skill emits carries `Message.context["skill_id"]`. Current OVOS skills set this on some emissions (registrations, handler responses) but not uniformly. The spec makes it the authoritative attribution surface for skill-originated bus traffic — observers attribute - by `context["skill_id"]`, not by parsing topic names. Drives - CONTEXT-1 §5.2 origin-stamping (the orchestrator stamps `origin` - from this field, not from `source`). -- **`recognizer_loop:utterance` de-prescribed** (PIPELINE-1 §9.1). - Earlier drafts treated this topic as normative. The revision - defers the entry-topic name to a future audio-input ↔ - assistant-core wire spec; current deployments use the legacy - name for compatibility but conformant orchestrators MAY adopt - whatever name the future spec settles on. -- **All `.list` topics standardized to `ovos..`** - (TRANSFORM-1 §6, CONTEXT-1 §5). Renames: - `transformer..list` → `ovos.transformer..list`; - `intent.context.set/.unset/.clear/.list` → - `ovos.context.set/.unset/.clear/.list`. INTENT-4 / PIPELINE-1 - topics unchanged. + by `context["skill_id"]`, not by parsing topic names. Enforced + loader-side where possible (orchestrator intercepts the emit + pathway). Drives CONTEXT-1 §5.2 origin-stamping. +- **Entry-point topic is not prescribed** (PIPELINE-1 §9.1). The + utterance-layer entry-topic name is deferred to a future + audio-input ↔ assistant-core wire spec; current deployments + use `recognizer_loop:utterance` for compatibility. +- **All `.list` topics under one prefix**: `ovos..` + (or `ovos...` for per-id introspection). 
+ CONTEXT-1's mutation/introspection events live under + `ovos.context.*`; TRANSFORM-1's introspection events live under + `ovos.transformer.*`; INTENT-4 / PIPELINE-1 use the same prefix. ### 6.5 New topics with no direct precedent @@ -823,10 +808,7 @@ Three properties hold across all four: All four surfaces use the unified `ovos..` (or `ovos...` for per-id introspection) naming -convention. CONTEXT-1's prior `intent.context.*` topics were -renamed to `ovos.context.*`, and TRANSFORM-1's prior -`transformer..list` topics were renamed to -`ovos.transformer..list`, in this round. +convention. ### 6.6 Things the specs do *not* change @@ -967,12 +949,7 @@ grammar-level conformance corpus (§7). --- -## 9. Design history - -How the specification set was arrived at — context that explains -the *why*, but that has no place in a normative document. - -### 9.1 The set, in three stacks +## 9. The spec set, in three stacks Built bottom-up in three stacks: @@ -980,68 +957,44 @@ Built bottom-up in three stacks: (template grammar) → OVOS-INTENT-2 (resource files) → OVOS-INTENT-3 (the intent concept) → OVOS-INTENT-4 (the registration wire format on the bus). -- The **bus stack**, anchored on existing `ovos-bus-client` wire - format: OVOS-MSG-1 formalizes the envelope, routing, session - carrier, and `forward`/`reply`/`response` derivations. - Originally drafted as two specs (envelope + session/routing) and - merged once it became clear the derivations could only - meaningfully be defined where the routing keys lived. +- The **bus stack**: OVOS-MSG-1 formalizes the envelope, routing, + session carrier, and `forward`/`reply`/`response` derivations. + OVOS-SESSION-1 formalizes the wire shape of the session carrier. - The **orchestrator stack**: OVOS-PIPELINE-1 defines the orchestrator, the pipeline-plugin abstraction, the utterance - lifecycle, and the handler-lifecycle trio. 
Sits on top of the - bus stack (uses MSG-1's envelope and routing) and around the - intent stack (intent registrations are one kind of input - pipeline plugins consume). - -Each was a formalization pass over machinery already running in -production (§1), not a greenfield design. - -### 9.2 The reference implementation - -The specs are implementation-agnostic, but a spec benefits from -one conformant implementation. **ovos-spec-tools** is that for -the intent stack — expander, resource loader, dialog renderer, -language matching, locale linter, in one dependency-light -package. It exists because the same machinery had drifted across -six separate copies in the ecosystem; ovos-spec-tools is what -those components are meant to converge on, and the intended home -of the planned conformance corpus. - + lifecycle, and the handler-lifecycle trio. OVOS-CONTEXT-1 + defines per-session intent-context state. OVOS-TRANSFORM-1 + defines the six injection-point transformer chains. Sits on top + of the bus stack (uses MSG-1's envelope and routing, SESSION-1's + session carrier) and around the intent stack (intent + registrations are one kind of input pipeline plugins consume). + +The **reference implementation** for the intent stack is +**ovos-spec-tools** — expander, resource loader, dialog renderer, +language matching, locale linter, in one dependency-light package. The bus and orchestrator stacks do not yet have a comparable reference; `ovos-bus-client` is the closest match for MSG-1 and `ovos-core` is the closest match for PIPELINE-1 + INTENT-4, but both predate the specs. -### 9.3 Audit-driven refinement - -Before initial release, each spec was revised across several -review rounds — malformed-form rules, the expansion algorithm, -slot handling, the envelope/routing split (later un-split, see -§9.1), the host → orchestrator rename, the -intent-stage-vs-non-intent-stage distinction (later dissolved -into the uniform pipeline-plugin abstraction), cross-spec -terminology. 
Those rounds happened pre-release, so they left no -intermediate version numbers behind: the audited result *is* -version 1 (or 1.1 where editorial-only). The CHANGELOG records -versioned changes from there on. - --- ## 10. Compatibility levels Each specification carries its own integer (or minor) `Version`, bumped per PR per the contributing rules in the README. The -architecture as a whole was previously spoken of at -**compatibility levels** — versioned snapshots a tool may target, -checked against by `ovos-spec-lint`. - -The compatibility-level model was designed when the architecture -was one stack (the intent grammar / resources / intent definition -chain) and a single integer cleanly identified "all the specs at -once." With the addition of the bus and orchestrator stacks, that -single-axis model no longer describes the architecture. - -The historical intent-stack ladder: +architecture as a whole is spoken of at **compatibility levels** — +versioned snapshots a tool may target, checked against by +`ovos-spec-lint`. + +The compatibility-level model works cleanly for the **intent +stack**, where a single integer identifies a coherent grammar / +resources / intent-definition snapshot. The bus and orchestrator +stacks do not yet map onto the same single-axis ladder; a +specification-set-wide version tuple covering all eight specs is +a planned follow-up. + +The intent-stack ladder: - **V0** — *informal.* The undocumented, de-facto behaviour from before these specifications existed. V0 is not specified From e1964f656256f9f6c332e4bd9ed55e4df766cf39 Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Sun, 24 May 2026 21:56:09 +0100 Subject: [PATCH 09/27] APPENDIX: update divergence catalog for CONTEXT-1 key-shape collapse + dispatch stamping MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit §6.4: rewrite the CONTEXT-1 scope-discriminator entry to reflect the bigger change — scope AND origin both collapsed into the key shape. 
requires_context discriminator is the surviving surface (default private). §6.4: rewrite the skill_id-on-every-emission entry to lead with the structural enforcement (dispatch stamping + forward/reply inheritance), with loader interception as a follow-up rather than the primary path. Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 33 +++++++++++++++++++-------------- 1 file changed, 19 insertions(+), 14 deletions(-) diff --git a/APPENDIX.md b/APPENDIX.md index 8af6fe1..9e1831f 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -741,21 +741,26 @@ needs no implementation change: pipeline plugin's matcher has compiled*, distinct from the orchestrator's manifest of declared intents (INTENT-4 §10). No current OVOS analogue. -- **CONTEXT-1 scope discriminator on `requires_context`** - (CONTEXT-1 §6 / §6.1). OPTIONAL `scope: private|shared` per - entry, default `private`. Prevents an unrelated skill's shared - `Person` entry from accidentally satisfying a private gate. - Short-form `[Person]` is interpreted as - `{ key: Person, scope: private }`. -- **Skill self-identification on every emission** (INTENT-4 §3.1). - Every Message a skill emits carries +- **CONTEXT-1 scope and ownership encoded in the key shape** + (CONTEXT-1 §2, §3). A bare key `Person` is shared; a prefixed + key `music.skill:Person` is private to `music.skill`. The `:` + is load-bearing — mirroring the `:` + dispatch topic. Drops separate `scope` and `origin` fields on + stored entries (both were redundant with the key shape). + `requires_context` and `excludes_context` declarations take an + OPTIONAL `scope: private|shared` discriminator (default + `private`) to express which lookup the gate uses; bare-string + declarations default to private to prevent shared-leak. +- **Skill self-identification on every emission** (INTENT-4 + §3.1). Every Message a skill emits carries `Message.context["skill_id"]`. 
Current OVOS skills set this on - some emissions (registrations, handler responses) but not - uniformly. The spec makes it the authoritative attribution - surface for skill-originated bus traffic — observers attribute - by `context["skill_id"]`, not by parsing topic names. Enforced - loader-side where possible (orchestrator intercepts the emit - pathway). Drives CONTEXT-1 §5.2 origin-stamping. + some emissions but not uniformly. Enforcement is structural on + the dispatch path: the orchestrator stamps + `context.skill_id` from the `:` dispatch + topic prefix (PIPELINE-1 §7.1), and skill emissions via + `forward`/`reply` inherit automatically. Loader-side + interception covers off-dispatch emissions. Drives CONTEXT-1 + §5.2 stored-key computation. - **Entry-point topic is not prescribed** (PIPELINE-1 §9.1). The utterance-layer entry-topic name is deferred to a future audio-input ↔ assistant-core wire spec; current deployments From 82be02bf846c09409240471ffb868c9432a6beee Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Sun, 24 May 2026 23:42:34 +0100 Subject: [PATCH 10/27] APPENDIX: clarify topic-naming claim as prefix-uniform, verb depth varies --- APPENDIX.md | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/APPENDIX.md b/APPENDIX.md index 9e1831f..5e7bf8f 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -765,11 +765,13 @@ needs no implementation change: utterance-layer entry-topic name is deferred to a future audio-input ↔ assistant-core wire spec; current deployments use `recognizer_loop:utterance` for compatibility. -- **All `.list` topics under one prefix**: `ovos..` - (or `ovos...` for per-id introspection). - CONTEXT-1's mutation/introspection events live under - `ovos.context.*`; TRANSFORM-1's introspection events live under - `ovos.transformer.*`; INTENT-4 / PIPELINE-1 use the same prefix. 
+- **All introspection topics share the `ovos..` prefix.** + Verb segments vary by domain — INTENT-4 nests under + `ovos.intent.register.` / `ovos.intent.list`; PIPELINE-1 + uses `ovos.pipeline..intents.list`; CONTEXT-1 uses + `ovos.context.`; TRANSFORM-1 uses + `ovos.transformer..`. Uniformity is at the prefix + level, not at verb depth. ### 6.5 New topics with no direct precedent @@ -811,9 +813,9 @@ Three properties hold across all four: orchestrator is split (PIPELINE-1 §2), each process responds from its own slice; consumers aggregate. -All four surfaces use the unified `ovos..` (or -`ovos...` for per-id introspection) naming -convention. +All four surfaces share the `ovos..` prefix; verb segments +vary by domain (some nest, some don't). The uniformity is in the +namespace, not in a fixed depth. ### 6.6 Things the specs do *not* change From 4999cd8c70dffb76fd15ae0985103e98d2cdae78 Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Mon, 25 May 2026 02:40:51 +0100 Subject: [PATCH 11/27] =?UTF-8?q?APPENDIX=20=C2=A76.5.1:=20flag=20the=20'i?= =?UTF-8?q?ntent'=20word=20collision=20across=20three=20introspection=20to?= =?UTF-8?q?pics?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cross-spec audit B1: 'intent' plays three different roles across the four-spec introspection table — registered intents (INTENT-4), compiled-in-a-matcher intents (PIPELINE-1), and intent-transformer plugins (TRANSFORM-1). The shapes are deliberate and the payloads are distinct, but the topic strings read confusingly at a glance. Added an informative paragraph naming the three meanings and clarifying that ovos.transformer.intent.list follows the per-chain ovos.transformer..list pattern, where 'intent' is the chain type — not a listing of intents. 
Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/APPENDIX.md b/APPENDIX.md index 5e7bf8f..461e9fb 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -817,6 +817,27 @@ All four surfaces share the `ovos..` prefix; verb segments vary by domain (some nest, some don't). The uniformity is in the namespace, not in a fixed depth. +The word **"intent"** appears in three of the four topic strings +above with three different meanings, which is worth flagging for +implementers wiring observers: + +- `ovos.intent.list` (INTENT-4 §10) — list of registered *intents* + (the things skills declare; `data` entries name `intent_name`). +- `ovos.pipeline..intents.list` (PIPELINE-1 §10) — + list of *intents currently compiled by one plugin's matcher* + (`data` entries name `intent_name`). +- `ovos.transformer.intent.list` (TRANSFORM-1 §6) — list of + *intent-transformer plugins* loaded at the intent-transformer + injection point (`data` entries name `transformer_id`). Despite + the topic shape, this is **not** an intent-listing surface; it + follows the per-chain pattern `ovos.transformer..list` + where `` happens to be `intent` for this chain (alongside + `audio`, `utterance`, `metadata`, `dialog`, `tts`). + +The collision is at the human-reading level only; payload shapes +are distinct and a consumer subscribing to one cannot accidentally +parse responses from another. 
+ ### 6.6 Things the specs do *not* change - The session object's internal shape is now owned by From 01935448e465e4640c70b91688c1672ca1f7a677 Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Mon, 25 May 2026 03:03:29 +0100 Subject: [PATCH 12/27] =?UTF-8?q?APPENDIX=20=C2=A74=20Transformer:=20desig?= =?UTF-8?q?n=20note=20on=20the=20six=20per-type=20self-identification=20ke?= =?UTF-8?q?ys?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Document the rationale for TRANSFORM-1 §1.3 claiming six per-type context keys (audio_transformer_id, utterance_transformer_id, ...) rather than a single generic transformer_id. Two arguments: (1) role preservation across the six-stage chain, mirroring the per-type partition that already exists in §1.1 registries, §5 session overrides, and §6 introspection topics; (2) multi-type-plugin disambiguation, since §1.1 permits a single transformer_id across types and a generic context key would erase the role at emit time. Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/APPENDIX.md b/APPENDIX.md index 461e9fb..d3dd641 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -459,6 +459,23 @@ reasoning, not the requirement. right priority is stage-dependent. TRANSFORM-1 §7.1 names which transformer types are natural producers of which signals; consolidation is the consumer's decision per SESSION-1 §3.2.7. +- **Per-type self-identification keys.** TRANSFORM-1 §1.3 claims + six `Message.context` keys — one per transformer type + (`audio_transformer_id`, `utterance_transformer_id`, + `metadata_transformer_id`, `intent_transformer_id`, + `dialog_transformer_id`, `tts_transformer_id`) — rather than a + single generic `transformer_id`. Two reasons.
First, the role + matters: a Message at the dialog stage may have been touched by + five transformer types in sequence, and lumping them into one + slot loses the role partitioning that exists in every other + surface of the spec (the per-type registries of §1.1, the + per-type `*_transformers` overrides of SESSION-1 §3, the + per-type introspection topics of §6). Second, multi-type + plugins disambiguate: a plugin shipping both an utterance and + a dialog transformer under the same `transformer_id` (permitted + by §1.1) would, with a single generic key, leave consumers + unable to tell which role emitted; per-type keys make the role + unambiguous on the wire. ### Session (SESSION-1) From 311b53ce7778c82f91c31d274ac31aeee6fec5ce Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Mon, 25 May 2026 03:31:49 +0100 Subject: [PATCH 13/27] =?UTF-8?q?APPENDIX=20=C2=A74=20Transformer:=20recor?= =?UTF-8?q?d=20list-valued=20attribution,=20denylist=20symmetry,=20and=20t?= =?UTF-8?q?he=20per-type=20field-count=20tradeoff?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Four design notes capturing the recent TRANSFORM-1 evolution: - Update the existing per-type self-id bullet to reflect the plural list-valued context keys (audio_transformer_ids etc., not the older singular names). - New bullet: list-valued attribution preserves full chain provenance per type; the last entry is the most-recent stamp. Skills and pipelines stay single-string because they originate rather than chain. - New bullet: per-type denylists (six blacklisted_*_transformers) complete the policy surface, mirroring PIPELINE-1's pipeline/blacklisted_pipelines pair. Three-stage composition (preference → availability → policy) parallels PIPELINE-1 §5.5. 
- New bullet: acknowledge the per-type 'explosion' (12 session fields + 6 context keys), defend the choice against the transformer_: prefix-encoding alternative (direct lookup vs prefix parsing), note that SHOULD-omit makes the common case zero-cost on the wire, and document the object-valued form as a clean fallback if the field count ever proves painful in practice. Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 59 ++++++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 49 insertions(+), 10 deletions(-) diff --git a/APPENDIX.md b/APPENDIX.md index d3dd641..6208b7a 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -461,21 +461,60 @@ reasoning, not the requirement. consolidation is the consumer's decision per SESSION-1 §3.2.7. - **Per-type self-identification keys.** TRANSFORM-1 §1.3 claims six `Message.context` keys — one per transformer type - (`audio_transformer_id`, `utterance_transformer_id`, - `metadata_transformer_id`, `intent_transformer_id`, - `dialog_transformer_id`, `tts_transformer_id`) — rather than a - single generic `transformer_id`. Two reasons. First, the role - matters: a Message at the dialog stage may have been touched by - five transformer types in sequence, and lumping them into one - slot loses the role partitioning that exists in every other - surface of the spec (the per-type registries of §1.1, the - per-type `*_transformers` overrides of SESSION-1 §3, the - per-type introspection topics of §6). Second, multi-type + (`audio_transformer_ids`, `utterance_transformer_ids`, + `metadata_transformer_ids`, `intent_transformer_ids`, + `dialog_transformer_ids`, `tts_transformer_ids`) — rather than + a single generic `transformer_ids`. Two reasons. 
First, the + role matters: a Message at the dialog stage may have been + touched by five transformer types in sequence, and lumping + them into one slot loses the role partitioning that exists in + every other surface of the spec (the per-type registries of + §1.1, the per-type `*_transformers` overrides of SESSION-1 §3, + the per-type introspection topics of §6). Second, multi-type plugins disambiguate: a plugin shipping both an utterance and a dialog transformer under the same `transformer_id` (permitted by §1.1) would, with a single generic key, leave consumers unable to tell which role emitted; per-type keys make the role unambiguous on the wire. +- **List-valued attribution preserves chain provenance.** + Each of the six attribution context keys is a *list* of + `transformer_id` strings, not a single string. Transformers + chain by design — multiple transformers of the same type run + sequentially against the same Message-in-flight (§4) — and the + list preserves the full chain on the wire, in order of touch. + The last entry is the most-recent stamper. Skill and pipeline + identity keys (`context["skill_id"]`, `context["pipeline_id"]`) + remain single strings because skills and pipeline plugins + *originate* Messages rather than chain over them. +- **Per-type denylists complete the policy surface.** + TRANSFORM-1 §5.2 claims six `blacklisted__transformers` + session fields, paralleling the six `_transformers` + chain-ordering fields of §5.1 and the + `pipeline`/`blacklisted_pipelines` pair of OVOS-PIPELINE-1 §5. + Three-stage composition (preference → availability → policy) + in §5.3 mirrors PIPELINE-1 §5.5 exactly. +- **The per-type "explosion" of session fields is deliberate.** + Counting transformer-related session-field claims: six chain + orderings (§5.1) + six denylists (§5.2) = twelve fields, plus + six `Message.context` attribution keys. 
That is a lot of + wire-surface names, and it is a deliberate tradeoff against + the alternative of a `transformer_:` prefix-encoded + single namespace. The per-type partition gives direct key + lookup, avoids prefix parsing in CONTEXT-1 §5.2 attribution and + in §5.3 chain composition, and matches the per-type + partitioning that already exists in the §1.1 registries, the + §4 chain ordering rules, and the §6 introspection topics. + Under the canonical SHOULD-omit rule of SESSION-1 §3.4, the + common case carries zero of these fields on the wire — a + session diverges from deployment defaults only as needed. If + the field count ever proves painful in practice, the cleanest + fallback is an object-valued form + (`session.transformers: {audio: [...], ...}` and + `session.blacklisted_transformers: {audio: [...], ...}`), + collapsing twelve flat fields into two structured ones with + the per-type partition preserved as object keys. The flat form + was chosen for parallelism with `pipeline` (array, not object) + and for direct field access. ### Session (SESSION-1) From e2cce68e0b5e1cb54836e9ab69d59c5eb6a79df2 Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Mon, 25 May 2026 03:55:29 +0100 Subject: [PATCH 14/27] =?UTF-8?q?APPENDIX=20=C2=A74=20CONTEXT-1:=20rationa?= =?UTF-8?q?le=20for=20default-private=20scope?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add design-rationale paragraph explaining why ovos.context.set defaults to private scope when the canonical worked example (Person → Bob) is naturally cross-skill. Three reasons: migration fidelity (current Adapt set_context is effectively skill-private), safer footgun direction (accidental shared-leak is harder to debug than accidental cross-skill miss), and authorability (cross-skill coordination deserves a conscious explicit scope). 
Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/APPENDIX.md b/APPENDIX.md index 6208b7a..ad4e3e3 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -376,6 +376,21 @@ reasoning, not the requirement. a bus event); CONTEXT-1 names the scopes explicitly and routes both through one bus surface (`intent.context.set` / `.unset` / `.clear` / `.list`). +- **Why private is the default.** A skill that calls + `ovos.context.set` without specifying `scope` gets a private + entry. This optimises for the safer case at the cost of being + the less-useful case: the spec's own worked example + (Person → Bob) is naturally cross-skill, and a reader might + expect shared to be the default. The choice favours migration + fidelity (the current Adapt `set_context` pattern is + effectively skill-private — keyed under `alphanumeric_skill_id`), + the safer footgun direction (a cross-skill leak from an + accidentally-shared entry is harder to debug than a + cross-skill miss from an accidentally-private entry), and + authorability (cross-skill coordination is a conscious decision + that deserves an explicit `scope: "shared"`). Skills that + routinely act across the skill boundary set the scope + explicitly; skills that don't get safety by default. 
- **Prior art for the negative gate.** Three in-tree intent engines under `/plugins-pipeline/` — [jurebes](https://github.com/OpenJarbas/jurebes), From 4e6e277c46ea67571bc5472e6e5a6d01f316eda4 Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Mon, 25 May 2026 04:04:55 +0100 Subject: [PATCH 15/27] =?UTF-8?q?APPENDIX=20=C2=A76:=20record=20recognizer?= =?UTF-8?q?=5Floop:utterance=20->=20ovos.utterance.handle=20rename?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Move the entry-topic from §6.1 'already aligned' to §6.4 'architectural divergences' — it is no longer a name kept verbatim, since PIPELINE-1 §9.1 now prescribes ovos.utterance.handle. Rationale paragraph cites the three MSG-1 §2.1.2 naming convention violations: ':' as separator, implementation-role leading segment, missing request/terminal verb pairing. Migration cost spelled out (every audio-input service emits, every intent-service handler subscribes: ovos-dinkum-listener, ovos-simple-listener, ovos-audio, ovos-core/intent_services). §6.7 predecessor-topic table updated. Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 29 +++++++++++++++++++++-------- 1 file changed, 21 insertions(+), 8 deletions(-) diff --git a/APPENDIX.md b/APPENDIX.md index ad4e3e3..168ed1d 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -728,9 +728,8 @@ needs no implementation change: `ovos-bus-client.Message.{forward,reply,response}`. - The `.response` suffix convention — pervasive across OVOS topics today. -- The `recognizer_loop:utterance` entry point and - `complete_intent_failure` no-match topic (PIPELINE-1) — match - current topic names verbatim. +- The `complete_intent_failure` no-match topic (PIPELINE-1) — + matches current topic name verbatim. - `ovos.utterance.cancelled` and `ovos.utterance.handled` (PIPELINE-1) — match current topic names verbatim. 
- Per-utterance first-match-wins iteration (PIPELINE-1) — matches @@ -832,10 +831,24 @@ needs no implementation change: `forward`/`reply` inherit automatically. Loader-side interception covers off-dispatch emissions. Drives CONTEXT-1 §5.2 stored-key computation. -- **Entry-point topic is not prescribed** (PIPELINE-1 §9.1). The - utterance-layer entry-topic name is deferred to a future - audio-input ↔ assistant-core wire spec; current deployments - use `recognizer_loop:utterance` for compatibility. +- **Entry-point topic renamed `ovos.utterance.handle`** + (PIPELINE-1 §9.1). Current deployments use the Mycroft-era + `recognizer_loop:utterance`. That name fails the naming + conventions of OVOS-MSG-1 §2.1.2 on three counts: it uses `:` + as a segment separator (where `:` is reserved for + `:` dispatch topics, §2.1.1); its + leading segment names an implementation role (the audio-input + "recognizer loop") rather than a stable assistant root; and + it does not pair with the past-tense terminal event + `ovos.utterance.handled`. The rename to + `ovos.utterance.handle` fixes all three: dot-separated + hierarchy, stable `ovos.` root, request/terminal pair + (`handle` ↔ `handled`) sharing a root verb. Migration cost + is real — every audio-input service emits this, every + intent-service handler subscribes — touching + `ovos-dinkum-listener`, `ovos-simple-listener`, `ovos-audio`, + and `ovos-core/intent_services/service.py`. A transitional + deployment MAY subscribe to both names during migration. - **All introspection topics share the `ovos..` prefix.** Verb segments vary by domain — INTENT-4 nests under `ovos.intent.register.` / `ovos.intent.list`; PIPELINE-1 @@ -956,7 +969,7 @@ number of legacy names. Implementer migration aid: | Topic | Status | |-------|--------| -| `recognizer_loop:utterance` | **unchanged** — kept as the entry point. | +| `recognizer_loop:utterance` | renamed to `ovos.utterance.handle` — see §6.4 above. 
| | `complete_intent_failure` | **unchanged** — kept as the no-match signal. | | `ovos.utterance.cancelled` | **unchanged** — kept as the cancellation signal. | | `ovos.utterance.handled` | **unchanged** — kept as the universal end-marker. | From a7735d3de8de5a5e854aff226f0296e88577b716 Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Mon, 25 May 2026 15:24:29 +0100 Subject: [PATCH 16/27] =?UTF-8?q?APPENDIX:=20=C2=A72.5=20Rasa/hassil/ASK/M?= =?UTF-8?q?ycroft=20comparisons;=20=C2=A76.5.2=20session-field=20+=20stamp?= =?UTF-8?q?-rule=20cheat-sheets?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two informative additions: - §2.5 (new): extends the §2 comparison set with Rasa, hassil, ASK / Dialogflow, and Mycroft. Locates the CONTEXT-1 design against Rasa's policy-engine-coupled forms; locates TRANSFORM-1 §3.4 against ASK/Dialogflow built-in entity types as the injectable open contract; documents Mycroft as the predecessor whose ad-hoc model the spec family formalizes. - §6.5.2 (new): session-field cheat-sheet consolidating the 26 fields claimed across SESSION-1, PIPELINE-1, TRANSFORM-1, and CONTEXT-1 into a single reference table — owner spec, role (preference / policy / signal / identity), empty-array semantics. Followed by a stamp-rule cheat-sheet covering the three component-identity context-key surfaces (skill_id, pipeline_id, _transformer_ids) and their behaviour across origination, .reply / .response, and .forward. Both reduce cross-spec bouncing for implementers. Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 113 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 113 insertions(+) diff --git a/APPENDIX.md b/APPENDIX.md index 168ed1d..79aece0 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -125,6 +125,55 @@ not an architectural one, and one OVOS now has tooling for in ovos-localize (§8). 
The grammar itself is a commodity shared by all three; the OVOS bet is the engine-agnostic contract and the pipeline. +### 2.5 Comparison with Rasa, ASK / Dialogflow, and Mycroft + +Beyond HA and Rhasspy, four more comparators are worth a brief note since the +specifications make decisions in territory they each occupy: + +- **Rasa.** The closest comparator for OVOS-CONTEXT-1. Rasa's "active forms" + and slot mappings perform context-aware matching, but they are baked into + the policy engine; you cannot run a Rasa NLU pipeline without Rasa policies. + CONTEXT-1 separates **gating** (`requires_context` / `excludes_context`, + §6 / §6.1) from **match-time capture** (the context-supplied capture rule, + §7) from **engine matching hints** (engine-internal use of values, §6), so + every intent engine that consumes OVOS-INTENT-3 registrations can gate + uniformly without buying into a particular dialog policy. Rasa wins, + however, on conversation-level evaluation infrastructure — story-based + testing, end-to-end success metrics — for which the OVOS specs have no + analogue yet (APPENDIX §7 catalogues this as a known gap). +- **hassil.** The Home Assistant template-matcher, comparable only to + OVOS-INTENT-1 / -2 / -3 (grammar + locale resources + intent concept). + hassil has no equivalent of OVOS-MSG-1 (no bus envelope), OVOS-PIPELINE-1 + (no pipeline notion — HA runs a single matcher), OVOS-TRANSFORM-1 (no + per-utterance transformers), or OVOS-CONTEXT-1 (no decaying session state). + The grammar layer is broadly equivalent (§2.1 above); everything above the + grammar is OVOS-only. +- **Amazon ASK / Alexa Skills Kit, Google Dialogflow.** Both are closed- + domain centrally-trained stacks. 
Their built-in entity-type systems + (`AMAZON.DATE`, `@sys.date-time`) are what OVOS-TRANSFORM-1 §3.4 replicates + as an *injectable, deployer-replaceable, engine-agnostic* contract — at the + spec level OVOS is strictly more flexible, though OVOS defers the **typed + value formats themselves** (date encoding, number representation, duration + units) to a future text-normalization spec (APPENDIX §7), while ASK and + Dialogflow ship them as built-ins. Neither ASK nor Dialogflow has a + `session.pipeline`-equivalent (the assistant picks one matcher per skill); + neither has anything like the layer-2 substrate of OVOS-MSG-1 §3.4. +- **Mycroft (the predecessor).** The merged OVOS specifications are + effectively Mycroft plus corrections: single-flip routing (formalized in + OVOS-MSG-1 §5), the `:` dispatch shape generalized + beyond skills (OVOS-PIPELINE-1 §7), the per-injection-point transformer + contracts of OVOS-TRANSFORM-1, the explicit gating semantics of + OVOS-CONTEXT-1, the entry-topic rename to `ovos.utterance.handle` + (PIPELINE-1 §9.1). Mycroft had `Message.context` and ad-hoc + audio / utterance / TTS hooks but no normative contracts. The OVOS spec + family is the formalization Mycroft never produced. + +The novelty concentration across the family: PIPELINE-1 §5 / §5.5 (preference / +availability / policy composition), TRANSFORM-1 §3 (all six per-type +contracts), and CONTEXT-1 §3 / §6 / §7 (key-shape-encoded scope, gating +decoupled from capture, decaying state). Each of these moves is either novel +or meaningfully cleaner than what the comparators do. + --- ## 3. The pipeline-plugin model @@ -922,6 +971,70 @@ The collision is at the human-reading level only; payload shapes are distinct and a consumer subscribing to one cannot accidentally parse responses from another. +### 6.5.2 Session-field cheat-sheet (informative) + +Every spec in the family that claims a `session` field does so +via the OVOS-SESSION-1 §2.1 registry mechanism. 
The full set +spans four specs; this table consolidates them for +implementer reference. All fields follow the canonical +SHOULD-omit / `[]`-equivalent-to-omission wire-weight rule of +OVOS-SESSION-1 §3.4. + +| Field | Owner spec | Role | Empty-array semantics | +|-------|------------|------|------------------------| +| `session_id` | SESSION-1 §3.1 | identity / channel | n/a (string; `"default"` reserved) | +| `lang` | SESSION-1 §3.2.1 | preference (user) | n/a (string) | +| `secondary_langs` | SESSION-1 §3.2.2 | preference (user) | ≡ absent | +| `output_lang` | SESSION-1 §3.2.3 | preference (renderer) | n/a (string) | +| `stt_lang` | SESSION-1 §3.2.4 | signal (per-utterance) | n/a (string) | +| `request_lang` | SESSION-1 §3.2.5 | signal (emitter hint) | n/a (string) | +| `detected_lang` | SESSION-1 §3.2.6 | signal (lang-detect) | n/a (string) | +| `site_id` | SESSION-1 §3.3 | opaque group identifier | n/a (string) | +| `pipeline` | PIPELINE-1 §5.1 | preference (ordering) | ≡ absent | +| `blacklisted_pipelines` | PIPELINE-1 §5.2 | policy (denylist) | ≡ absent | +| `blacklisted_skills` | PIPELINE-1 §5.3 | policy (denylist) | ≡ absent | +| `blacklisted_intents` | PIPELINE-1 §5.4 | policy (denylist) | ≡ absent | +| `audio_transformers` | TRANSFORM-1 §5.1 | preference (chain) | ≡ absent | +| `utterance_transformers` | TRANSFORM-1 §5.1 | preference (chain) | ≡ absent | +| `metadata_transformers` | TRANSFORM-1 §5.1 | preference (chain) | ≡ absent | +| `intent_transformers` | TRANSFORM-1 §5.1 | preference (chain) | ≡ absent | +| `dialog_transformers` | TRANSFORM-1 §5.1 | preference (chain) | ≡ absent | +| `tts_transformers` | TRANSFORM-1 §5.1 | preference (chain) | ≡ absent | +| `blacklisted_audio_transformers` | TRANSFORM-1 §5.2 | policy (denylist) | ≡ absent | +| `blacklisted_utterance_transformers` | TRANSFORM-1 §5.2 | policy (denylist) | ≡ absent | +| `blacklisted_metadata_transformers` | TRANSFORM-1 §5.2 | policy (denylist) | ≡ absent | +|
`blacklisted_intent_transformers` | TRANSFORM-1 §5.2 | policy (denylist) | ≡ absent | +| `blacklisted_dialog_transformers` | TRANSFORM-1 §5.2 | policy (denylist) | ≡ absent | +| `blacklisted_tts_transformers` | TRANSFORM-1 §5.2 | policy (denylist) | ≡ absent | +| `intent_context` | CONTEXT-1 §2 | per-session state | object; absent ≡ empty | + +**Role glossary:** + +- *Preference* — populated by the session origin to request + specific behaviour. Orchestrator narrows the request by + availability and policy. +- *Policy* — populated by deployment / layer-2 substrate to + enforce constraints. Overrides preference at the composition + stage (PIPELINE-1 §5.5, TRANSFORM-1 §5.3). +- *Signal* — recorded by a producer or earlier lifecycle stage + to communicate information about this specific utterance. +- *Identity / channel* — names the session itself; not a + preference or policy knob. + +**Stamp-rule cheat-sheet (component identities, not session +fields — for reference alongside the table above):** + +| Context key | Owner spec | Stamps on | Stamps on .reply / .response | Stamps on .forward | +|-------------|------------|-----------|-------------------------------|---------------------| +| `skill_id` | INTENT-4 §3.1 | every origination + modify-in-place | yes (authorial) | no (preserves inherited) | +| `pipeline_id` | PIPELINE-1 §3.1 | every origination + modify-in-place | yes (authorial) | no (preserves inherited) | +| `_transformer_ids` (six) | TRANSFORM-1 §1.3 | every origination + modify-in-place | yes (append to list) | no (list rides through) | + +All three identity surfaces coexist freely on a single Message +when the derivation chain crosses component boundaries. +Attribution consumers apply the eight-level precedence of +CONTEXT-1 §5.2 to pick a single owner when needed. 
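The stamp rules in the table above can be sketched in a few lines of Python. This is an illustrative model only, not the `ovos-bus-client` implementation: the `Message` dataclass and the `originate` / `reply` / `forward` helpers are hypothetical names invented for this sketch, and the list-valued transformer keys (which append rather than overwrite) are deliberately not modelled.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Message:
    """Toy stand-in for a bus message (hypothetical, not the real class)."""
    msg_type: str
    data: Dict[str, Any] = field(default_factory=dict)
    context: Dict[str, Any] = field(default_factory=dict)


def originate(msg_type: str, identity_key: str, identity: str) -> Message:
    """Origination: the emitting component stamps its own identity."""
    return Message(msg_type, {}, {identity_key: identity})


def reply(parent: Message, msg_type: str,
          identity_key: str, identity: str) -> Message:
    """Reply / response is authorial: the replier stamps its own identity,
    overwriting an inherited value under the same key."""
    context = dict(parent.context)
    context[identity_key] = identity
    return Message(msg_type, {}, context)


def forward(parent: Message, msg_type: str) -> Message:
    """Forward is not authorial: inherited identity keys ride through."""
    return Message(msg_type, {}, dict(parent.context))


# A skill originates, a pipeline plugin replies, something relays the result.
origin = originate("demo.request", "skill_id", "demo-skill")
answered = reply(origin, "demo.request.response", "pipeline_id", "demo-pipeline")
relayed = forward(answered, "demo.relay")
restamped = reply(origin, "demo.request.response", "skill_id", "other-skill")

print(answered.context)   # both identity surfaces coexist on one message
print(relayed.context)    # unchanged by the forward
print(restamped.context)  # same-key reply overwrites the inherited value
```

Under these assumptions, `skill_id` and `pipeline_id` end up side by side on the relayed message, which is the situation the attribution precedence of CONTEXT-1 §5.2 is meant to resolve.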
+ ### 6.6 Things the specs do *not* change - The session object's internal shape is now owned by From 5fafadc186f7a3207ca604548c6fc613aa6dadbe Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Mon, 25 May 2026 15:35:28 +0100 Subject: [PATCH 17/27] APPENDIX: reorganize from 10 sections to 7, restructure for flow MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The appendix had become a dumping ground after multiple rounds of additions. Restructured with clear narrative flow: §1 About the OVOS specifications — formalization framing, the three-stack overview (was §9), compatibility levels (was §10), reference implementations + ecosystem tooling (folds in ovos-spec-tools from §9 and ovos-localize from §8). §2 Comparison with other voice-assistant systems — merges the HA/Rhasspy material (was §2) with the Rasa/ASK/ Dialogflow/Mycroft/hassil material (was §2.5) into a single comparator section, ordered by relevance: HA & Rhasspy (shared lineage) → open-vs-closed structural argument → Mycroft (predecessor) → Rasa (CONTEXT-1 comparator) → ASK/Dialogflow → hassil (grammar-only) → summary of where OVOS leads/follows/differs. §3 Architectural patterns — the bus as substrate (was §5) and the pipeline-plugin model (was §3) grouped as the two cross-cutting architectural moves. Bus-substrate section gains an explicit subsection on the layer-2 authorization story (preference / policy split). §4 Design rationale, per specification — was §4 itself but now systematically per-spec (INTENT-1+2+3 grouped, MSG-1, SESSION-1, INTENT-4, PIPELINE-1, CONTEXT-1, TRANSFORM-1). Stale references purged; recently added rationales (most-specific-wins precedence, bidirectional lang propagation, per-type denylists, etc.) folded in. §5 Where the specs differ from current OVOS code — was §6 but reorganized: removed the §6.5.1 introspection- patterns table and §6.5.2 cheat-sheets (they aren't divergences from code, they're implementer reference — moved to §6). 
Renumbered to §5.1–§5.7. §6 Implementer reference — new top-level section gathering the cross-spec reference tables that were scattered: topic-name conventions (with the 'intent' overload clarification), session-field cheat-sheet, component-identity stamp-rule cheat-sheet, introspection patterns table. These don't belong inside a 'divergences from code' section; they're how-to material for fresh implementers. §7 Known gaps and planned work — unchanged content, last section. Trimmed stale entries about CONTEXT-1 and TRANSFORM-1 as 'planned' (they've shipped); added conversation-level evaluation infrastructure as a gap. Net: same content, far more navigable. Cross-references updated throughout. Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 2303 +++++++++++++++++++++++++++------------------------ 1 file changed, 1232 insertions(+), 1071 deletions(-) diff --git a/APPENDIX.md b/APPENDIX.md index 79aece0..e485f87 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -3,985 +3,1269 @@ **Non-normative.** This document is a companion to the OVOS formal specifications. It records design rationale, comparisons with other systems, the catalogue of *deliberate* divergences from current OVOS -code, and topics worth discussing that do not belong in a normative -specification. Nothing here is binding — the normative documents are -OVOS-INTENT-1, OVOS-INTENT-2, OVOS-INTENT-3, OVOS-INTENT-4, -OVOS-MSG-1, and OVOS-PIPELINE-1. This appendix exists so the specs -themselves can stay terse and requirement-focused. +code, and implementer-facing reference material that does not belong +in a normative specification body. Nothing here is binding — the +normative documents are OVOS-INTENT-1, OVOS-INTENT-2, +OVOS-INTENT-3, OVOS-INTENT-4, OVOS-MSG-1, OVOS-SESSION-1, +OVOS-PIPELINE-1, OVOS-CONTEXT-1, and OVOS-TRANSFORM-1. 
Pointers to specific OVOS code (file paths, class names, function -names) are deliberately kept *out* of the spec bodies and collected -here where appropriate, because implementation code moves and -specifications must not. +names) and to specific real projects (HiveMind, Mycroft, Adapt, +padatious, ovos-audio, ovos-workshop, …) are deliberately kept +*out* of the spec bodies and collected here, because implementation +code moves and specifications must not. --- -## 1. These specifications formalize an existing system - -The OVOS stack — the engines (padatious, Adapt), the skill ecosystem, -the resource file formats, the pipeline, the bus, the session model — -already exists and runs in production. These specifications were -written **after** the system they describe. They are a *formalization -pass*: they document an existing design implementation-agnostically, -tighten under-defined corners, and remove accidental inconsistencies, -so the contracts can be implemented by new engines, new hosts, and -adopted by other assistants. - -This matters for how to read them. They are **prescriptive** — each -spec states a clean target, and where it diverges from current OVOS -behaviour the divergence is a deliberate cleanup (catalogued in §6) — -but they are not speculative. The target is a lightly-cleaned version -of a working system, not a greenfield design. `padacioso`, -`ovos-workshop`, and `ovos-bus-client` are the closest existing -implementations; none yet fully conforms, and bringing them into -conformance is planned work. OVOS-MSG-1 is the closest to current code -of all the specs — it is largely a verbatim formalization of what -`ovos-bus-client` already does. +## 1. About the OVOS specifications + +### 1.1 Formalization of an existing system + +The OVOS stack — the engines (padatious, Adapt), the skill +ecosystem, the resource file formats, the pipeline, the bus, the +session model — already exists and runs in production. 
The +specifications were written **after** the system they describe. +They are a *formalization pass*: they document an existing design +implementation-agnostically, tighten under-defined corners, and +remove accidental inconsistencies, so the contracts can be +implemented by new engines, new hosts, and adopted by other +assistants. + +This matters for how to read them. They are **prescriptive** — +each spec states a clean target, and where it diverges from +current OVOS behaviour the divergence is a deliberate cleanup +(catalogued in §5) — but they are not speculative. The target is +a lightly-cleaned version of a working system, not a greenfield +design. `padacioso`, `ovos-workshop`, and `ovos-bus-client` are +the closest existing implementations; none yet fully conforms, +and bringing them into conformance is planned work. OVOS-MSG-1 +is the closest to current code of all the specs — it is largely +a verbatim formalization of what `ovos-bus-client` already does. + +### 1.2 The spec set, in three stacks + +The specifications are built bottom-up in three stacks: + +- **The intent stack**, in dependency order: OVOS-INTENT-1 + (template grammar) → OVOS-INTENT-2 (resource files) → + OVOS-INTENT-3 (the intent concept) → OVOS-INTENT-4 (the + registration wire format on the bus). +- **The bus stack**: OVOS-MSG-1 formalizes the envelope, routing, + session carrier, and `forward`/`reply`/`response` derivations. + OVOS-SESSION-1 formalizes the wire shape of the session + carrier and the field-registry mechanism by which other specs + claim session fields. +- **The orchestrator stack**: OVOS-PIPELINE-1 defines the + orchestrator, the pipeline-plugin abstraction, the utterance + lifecycle, and the handler-lifecycle trio. OVOS-CONTEXT-1 + defines per-session intent-context state. OVOS-TRANSFORM-1 + defines the six injection-point transformer chains. 
The + orchestrator stack sits on top of the bus stack (uses MSG-1's + envelope and routing, SESSION-1's session carrier) and around + the intent stack (intent registrations are one kind of input + pipeline plugins consume). + +### 1.3 Compatibility levels + +Each specification carries its own integer `Version`, bumped per +PR per the contributing rules in the README. The architecture as +a whole is spoken of at **compatibility levels** — versioned +snapshots a tool may target, checked against by +`ovos-spec-lint`. + +The compatibility-level model works cleanly for the **intent +stack**, where a single integer identifies a coherent grammar / +resources / intent-definition snapshot. The bus and orchestrator +stacks do not yet map onto the same single-axis ladder; a +specification-set-wide version tuple covering every spec is a +planned follow-up. + +The intent-stack ladder: + +- **V0** — *informal.* The undocumented, de-facto behaviour from + before these specifications existed. V0 is not specified + anywhere; it is the baseline the formalization started from. + V0 has no notion of the `.blacklist` resource role or of + `` references. +- **V1** — the intent stack as first formalized: + OVOS-INTENT-1, -2 and -3, each at version 1. V1's headline + addition over V0 is the `.blacklist` role. +- **V2** — V1 plus **inline vocabulary references** (the + `` token): OVOS-INTENT-1 and OVOS-INTENT-2 at version 2. + A V2 template cannot be expanded by a V1 tool. + +The bus stack (OVOS-MSG-1), the registration spec +(OVOS-INTENT-4), the session spec (OVOS-SESSION-1), the +orchestrator spec (OVOS-PIPELINE-1), the context spec +(OVOS-CONTEXT-1), and the transformer spec (OVOS-TRANSFORM-1) +are versioned **individually** and not placed on a unified +compatibility ladder. A tool targeting them today cites per-spec +versions: "MSG-1 v2, INTENT-4 v1, PIPELINE-1 v2." 
Whether the +compat-level model evolves into a multi-axis grid, per-stack +ladders, or is quietly deprecated in favour of per-spec +versions only, is deferred. + +### 1.4 Reference implementations and ecosystem tooling + +The **reference implementation for the intent stack** is +**`ovos-spec-tools`** — expander, resource loader, dialog +renderer, language matching, locale linter — in one +dependency-light Python package. New tools that consume locale +folders or expand templates should depend on it rather than +reimplementing. + +The bus and orchestrator stacks do not yet have a comparable +ground-up reference implementation; `ovos-bus-client` is the +closest match for OVOS-MSG-1 and `ovos-core` is the closest +match for OVOS-PIPELINE-1 + OVOS-INTENT-4, but both predate the +specs. + +**`ovos-localize`** is the i18n-operation layer atop the intent +stack: a GitHub-native localization platform for OVOS skills, +built specifically around the resource roles of OVOS-INTENT-2. +It scans skill repositories for locale files; analyzes each +skill's Python source (via AST) to recover the **handler +context** of a resource — which function uses a file, what its +slots mean, what dialog it triggers, which is exactly the +intent↔handler binding of OVOS-INTENT-3 §1; validates +translations against a rule set (slot preservation, expansion +validity, variant counts); and lets translators browse, edit, +preview, and submit translations as pull requests. It is the +OVOS counterpart to Home Assistant's managed `intents` +repository. + +Two honest notes on `ovos-localize`: it is currently +**descriptive** of real OVOS skills — it also handles legacy +file types these specs deliberately drop — so as the specs and +the ecosystem converge, its file-type coverage and the specs +will need to meet in the middle; and its translation validators +are a natural home for spec conformance checks, distinct from +but related to the planned grammar-level conformance corpus +(§7). --- -## 2. 
Comparison with Home Assistant and Rhasspy - -OVOS, Home Assistant (HA), and Rhasspy share a common lineage. The -bracket-expansion grammar of OVOS-INTENT-1 — `(a|b)` alternatives, `[optional]` -segments, `{slot}` placeholders — is the same family as HA's `hassil` sentence -templates and Rhasspy's `sentences.ini`. The *syntax* is not novel. What is -distinctive about the OVOS approach is everything around the grammar. - -### 2.1 What the OVOS design does differently - -- **An implementation-agnostic spec at all.** HA and Rhasspy have no - format-level specification independent of their implementation — the code is - the contract. OVOS now has one, which is what lets multiple engines (and - other assistants) implement the same contract. -- **Engine-agnostic matching.** OVOS-INTENT-1 §4 treats templates as *training - data* and leaves matching, scoring, and generalization to the engine. HA's - core matching is `hassil`, a deterministic template matcher; Rhasspy compiles - templates into a closed ASR grammar. The OVOS contract accommodates a - deterministic matcher, a neural classifier, or an LLM behind one interface. -- **Templates are training data, not a closed grammar.** A capable OVOS engine - generalizes beyond the authored samples. Rhasspy's closed-grammar model is - deterministic and offline-guaranteed but brittle — an utterance not derivable +## 2. Comparison with other voice-assistant systems + +The OVOS specifications occupy territory adjacent to several +existing voice-assistant systems. This section locates the +design choices against each comparator. The summary in §2.7 +records where OVOS leads architecturally, where it follows, and +where it makes a deliberately different choice. + +### 2.1 Home Assistant and Rhasspy — shared grammar lineage + +OVOS, Home Assistant (HA), and Rhasspy share a common lineage. 
+The bracket-expansion grammar of OVOS-INTENT-1 — `(a|b)` +alternatives, `[optional]` segments, `{slot}` placeholders — is +the same family as HA's `hassil` sentence templates and +Rhasspy's `sentences.ini`. The *syntax* is not novel. What is +distinctive about the OVOS approach is everything around the +grammar. + +**What OVOS does differently:** + +- **An implementation-agnostic spec at all.** HA and Rhasspy + have no format-level specification independent of their + implementation — the code is the contract. OVOS now has one, + which is what lets multiple engines (and other assistants) + implement the same contract. +- **Engine-agnostic matching.** OVOS-INTENT-1 §4 treats + templates as *training data* and leaves matching, scoring, + and generalization to the engine. HA's core matching is + `hassil`, a deterministic template matcher; Rhasspy compiles + templates into a closed ASR grammar. The OVOS contract + accommodates a deterministic matcher, a neural classifier, + or an LLM behind one interface. +- **Templates are training data, not a closed grammar.** A + capable OVOS engine generalizes beyond the authored samples. + Rhasspy's closed-grammar model is deterministic and + offline-guaranteed but brittle — an utterance not derivable from `sentences.ini` cannot be recognized at all. -- **A multi-stage pipeline** (see §3). Intent engines are two stage kinds among - many. Neither HA nor Rhasspy exposes an intent layer this structured. -- **An intent is bound to one handler, owned by one skill** (OVOS-INTENT-3 §1). - See §2.3 — this follows necessarily from the open skill ecosystem. -- **A bus substrate that is openable to layer-2 systems** (OVOS-MSG-1 - §3.4, §4.4). The `source`/`destination` boundary pair plus - `session.session_id` give third parties everything they need to - layer authentication, routing, and remote participation on top of - OVOS without modifying it. HiveMind is the canonical example. - Neither HA nor Rhasspy exposes their bus this openly. 
See §5. - -### 2.2 What Home Assistant and Rhasspy do better - -- **Reusable template fragments.** `hassil` has `expansion_rules` and Rhasspy - has `` references — named, reusable sub-templates that let authors share - common fragments (politeness prefixes, articles, recurring phrasings). The - version-1 OVOS grammar had no equivalent. **OVOS-INTENT-1 version 2 closes - this** with the `` inline vocabulary reference (issue #1), which expands - a named `.voc` in place — reusing the existing slot-free format rather than - adding a new construct (see §4). -- **i18n corpus maturity.** HA's community `intents` repository is a large, - managed, professionally-translated corpus covering many languages. OVOS has - the tooling counterpart in **ovos-localize** (§8) — a GitHub-native - localization platform built around the OVOS-INTENT-2 resource roles — so the - gap here is the *scale and maturity* of the corpus, not the absence of +- **A multi-stage pipeline** (§3.2). Intent engines are two + stage kinds among many. Neither HA nor Rhasspy exposes an + intent layer this structured. +- **An intent is bound to one handler, owned by one skill** + (OVOS-INTENT-3 §1). See §2.2 — this follows necessarily from + the open skill ecosystem. +- **A bus substrate openable to layer-2 systems** (§3.1). + Neither HA nor Rhasspy exposes their bus this openly. + +**What HA and Rhasspy do better:** + +- **Reusable template fragments.** `hassil` has + `expansion_rules` and Rhasspy has `` references — + named, reusable sub-templates that let authors share common + fragments (politeness prefixes, articles, recurring + phrasings). OVOS-INTENT-1 version 2 closes this with the + `` inline vocabulary reference, which expands a named + `.voc` in place — reusing the existing slot-free format + rather than adding a new construct. +- **i18n corpus maturity.** HA's community `intents` + repository is a large, managed, professionally-translated + corpus covering many languages. 
OVOS has the tooling + counterpart in `ovos-localize` (§1.4) — so the gap here is + the *scale and maturity* of the corpus, not the absence of tooling. -- **Concrete, testable completeness.** HA and Rhasspy ship systems where the - hard parts — matching, number and range handling, slot typing — are solved - concretely. The OVOS specs deliberately defer some of these (slot typing to a - future normalization spec; matching to the engine). That deferral is - intellectually consistent but means the specs' value depends on the engines - and tooling that fill the gaps. - -### 2.3 Closed domain vs open ecosystem - -The sharpest difference is not technical but structural. **Home Assistant is a -curated, closed domain**: home automation, with a vendor-managed intent -vocabulary. HA can treat an intent such as `HassTurnOn` as a *shared contract* -honoured uniformly across hundreds of integrations and many languages, because -HA controls and curates that vocabulary. - -**OVOS is an open ecosystem.** Skills are arbitrary third-party Python packages, -installed by pip, developed independently, running as arbitrary code in -process. A skill can do anything; OVOS voice-enables anything. In that setting a -shared global intent vocabulary is not a missing feature — it is incoherent. -When skills are unbounded, an intent *must* be private to the skill that -defines it and bound directly to that skill's handler. OVOS-INTENT-3's "an -intent is not an event" stance is therefore the correct model for an open -ecosystem, just as HA's shared-vocabulary model is correct for a curated one. -The two models are right for different platforms; neither is universally -better. - -### 2.4 Summary - -OVOS is not out-designed by HA or Rhasspy at the architecture level — at the -pipeline layer (§3) it is ahead of both, and its intent-as-handler-binding -model is the correct consequence of being an open platform. 
HA's real advantage -is the maturity and scale of its translation corpus — an ecosystem investment, -not an architectural one, and one OVOS now has tooling for in ovos-localize -(§8). The grammar itself is a commodity shared by all three; the OVOS bet is the -engine-agnostic contract and the pipeline. - -### 2.5 Comparison with Rasa, ASK / Dialogflow, and Mycroft - -Beyond HA and Rhasspy, four more comparators are worth a brief note since the -specifications make decisions in territory they each occupy: - -- **Rasa.** The closest comparator for OVOS-CONTEXT-1. Rasa's "active forms" - and slot mappings perform context-aware matching, but they are baked into - the policy engine; you cannot run a Rasa NLU pipeline without Rasa policies. - CONTEXT-1 separates **gating** (`requires_context` / `excludes_context`, - §6 / §6.1) from **match-time capture** (the context-supplied capture rule, - §7) from **engine matching hints** (engine-internal use of values, §6), so - every intent engine that consumes OVOS-INTENT-3 registrations can gate - uniformly without buying into a particular dialog policy. Rasa wins, - however, on conversation-level evaluation infrastructure — story-based - testing, end-to-end success metrics — for which the OVOS specs have no - analogue yet (APPENDIX §7 catalogues this as a known gap). -- **hassil.** The Home Assistant template-matcher, comparable only to - OVOS-INTENT-1 / -2 / -3 (grammar + locale resources + intent concept). - hassil has no equivalent of OVOS-MSG-1 (no bus envelope), OVOS-PIPELINE-1 - (no pipeline notion — HA runs a single matcher), OVOS-TRANSFORM-1 (no - per-utterance transformers), or OVOS-CONTEXT-1 (no decaying session state). - The grammar layer is broadly equivalent (§2.1 above); everything above the - grammar is OVOS-only. -- **Amazon ASK / Alexa Skills Kit, Google Dialogflow.** Both are closed- - domain centrally-trained stacks. 
Their built-in entity-type systems - (`AMAZON.DATE`, `@sys.date-time`) are what OVOS-TRANSFORM-1 §3.4 replicates - as an *injectable, deployer-replaceable, engine-agnostic* contract — at the - spec level OVOS is strictly more flexible, though OVOS defers the **typed - value formats themselves** (date encoding, number representation, duration - units) to a future text-normalization spec (APPENDIX §7), while ASK and - Dialogflow ship them as built-ins. Neither ASK nor Dialogflow has a - `session.pipeline`-equivalent (the assistant picks one matcher per skill); - neither has anything like the layer-2 substrate of OVOS-MSG-1 §3.4. -- **Mycroft (the predecessor).** The merged OVOS specifications are - effectively Mycroft plus corrections: single-flip routing (formalized in - OVOS-MSG-1 §5), the `:` dispatch shape generalized - beyond skills (OVOS-PIPELINE-1 §7), the per-injection-point transformer - contracts of OVOS-TRANSFORM-1, the explicit gating semantics of - OVOS-CONTEXT-1, the entry-topic rename to `ovos.utterance.handle` - (PIPELINE-1 §9.1). Mycroft had `Message.context` and ad-hoc - audio / utterance / TTS hooks but no normative contracts. The OVOS spec - family is the formalization Mycroft never produced. - -The novelty concentration across the family: PIPELINE-1 §5 / §5.5 (preference / -availability / policy composition), TRANSFORM-1 §3 (all six per-type -contracts), and CONTEXT-1 §3 / §6 / §7 (key-shape-encoded scope, gating -decoupled from capture, decaying state). Each of these moves is either novel -or meaningfully cleaner than what the comparators do. +- **Concrete, testable completeness.** HA and Rhasspy ship + systems where the hard parts — matching, number and range + handling, slot typing — are solved concretely. The OVOS + specs deliberately defer some of these (slot typing to a + future normalization spec; matching to the engine). 
That + deferral is intellectually consistent but means the specs' + value depends on the engines and tooling that fill the gaps. + +### 2.2 Closed domain vs open ecosystem + +The sharpest difference between OVOS and HA is not technical +but structural. **Home Assistant is a curated, closed domain**: +home automation, with a vendor-managed intent vocabulary. HA +can treat an intent such as `HassTurnOn` as a *shared contract* +honoured uniformly across hundreds of integrations and many +languages, because HA controls and curates that vocabulary. + +**OVOS is an open ecosystem.** Skills are arbitrary third-party +Python packages, installed by pip, developed independently, +running as arbitrary code in process. A skill can do anything; +OVOS voice-enables anything. In that setting a shared global +intent vocabulary is not a missing feature — it is incoherent. +When skills are unbounded, an intent *must* be private to the +skill that defines it and bound directly to that skill's +handler. OVOS-INTENT-3's "an intent is not an event" stance is +therefore the correct model for an open ecosystem, just as HA's +shared-vocabulary model is correct for a curated one. The two +models are right for different platforms; neither is +universally better. + +### 2.3 Mycroft — the predecessor + +The merged OVOS specifications are effectively Mycroft plus +corrections. The bus model, the `mycroft.skill.handler.*` +trio, the `recognizer_loop:utterance` entry topic, the session +concept — all inherited. Mycroft never wrote any of this down. + +OVOS's contribution is the *formalization*, plus the cleanups +of the prescriptive divergence catalogue (§5): + +- Single-flip routing (formalized in OVOS-MSG-1 §5). +- The `:` dispatch shape generalized + beyond skills (OVOS-PIPELINE-1 §7). +- The per-injection-point transformer contracts of + OVOS-TRANSFORM-1 (Mycroft had ad-hoc audio / utterance / + TTS hooks but no normative IO contract). 
+- The explicit gating semantics of OVOS-CONTEXT-1 (Mycroft + had `Message.context` adjacency rules but no normative + gate semantics). +- The handler-lifecycle trio renamed `mycroft.skill.handler.*` + → `ovos.intent.handler.*`. +- The entry-topic rename `recognizer_loop:utterance` → + `ovos.utterance.handle` (OVOS-PIPELINE-1 §9.1). + +### 2.4 Rasa — closest comparator for intent context + +Rasa's "active forms" and slot mappings perform context-aware +matching, but they are baked into the policy engine; you +cannot run a Rasa NLU pipeline without Rasa policies. +OVOS-CONTEXT-1 separates **gating** (`requires_context` / +`excludes_context`, §6 / §6.1 of that spec) from **match-time +capture** (the context-supplied capture rule, §7) from **engine +matching hints** (engine-internal use of values, §6), so every +intent engine that consumes OVOS-INTENT-3 registrations can +gate uniformly without buying into a particular dialog policy. + +Rasa wins on conversation-level evaluation infrastructure — +story-based testing, end-to-end success metrics — for which +the OVOS specs have no analogue yet (§7 catalogues this as a +known gap). + +Rasa's NLU pipeline is also the closest analogue to +OVOS-TRANSFORM-1's utterance / metadata / intent chains, but +it is a single sequence per language model and the +policy/preference split (TRANSFORM-1 §5.3) does not exist. +TRANSFORM-1's six-injection-point model is genuinely more +expressive. + +### 2.5 Amazon ASK / Alexa Skills Kit, Google Dialogflow + +Both are closed-domain centrally-trained stacks. Their +built-in entity-type systems (`AMAZON.DATE`, +`@sys.date-time`) are what OVOS-TRANSFORM-1 §3.4 replicates as +an *injectable, deployer-replaceable, engine-agnostic* +contract — at the spec level OVOS is strictly more flexible, +though OVOS defers the **typed value formats themselves** +(date encoding, number representation, duration units) to a +future text-normalization spec (§7), while ASK and Dialogflow +ship them as built-ins. 
+ +Neither ASK nor Dialogflow has a `session.pipeline`-equivalent +(the assistant picks one matcher per skill); neither has +anything like the layer-2 substrate of OVOS-MSG-1 §3.4. ASK +has built-in intents (`AMAZON.HelpIntent`) but they are +handled inside the skill; Dialogflow has fallback intents but +they do not have first-class dispatch identity. OVOS-PIPELINE-1's +`:` lets a non-skill component +advertise its own intent identity *to the user* on the bus, +indistinguishable from a skill — original to OVOS. + +### 2.6 hassil — comparable only at the grammar layer + +The Home Assistant template-matcher, comparable only to +OVOS-INTENT-1 / -2 / -3 (grammar + locale resources + intent +concept). hassil has no equivalent of OVOS-MSG-1 (no bus +envelope), OVOS-PIPELINE-1 (no pipeline notion — HA runs a +single matcher), OVOS-TRANSFORM-1 (no per-utterance +transformers), or OVOS-CONTEXT-1 (no decaying session state). +The grammar layer is broadly equivalent to OVOS-INTENT-1; +everything above the grammar is OVOS-only. + +### 2.7 Summary — where OVOS leads, follows, and differs + +**OVOS leads architecturally** in three places: + +- **The pipeline-plugin model with first-class dispatch + polymorphism.** No comparator lets a non-skill component + (LLM persona, chatbot, fallback) be a first-class handler + owner on the same dispatch surface. +- **The six-injection-point transformer chain with per-session + preference/policy separation.** Nothing in HA, Rhasspy, + Mycroft, Rasa, ASK, or Dialogflow has a comparable + lifecycle-uniform extensibility surface. +- **Negative gating (`excludes_context` "match if absent") + in CONTEXT-1.** ASK/Dialogflow contexts are purely + positive; Rasa forms are not engine-agnostic; HA has no + context model. The fire-once and modal-suppression patterns + fall out of negative gating. + +**OVOS follows** where ecosystem investment matters more than +architecture: + +- HA's translation corpus scale (the `intents` repository). 
+- ASK / Dialogflow's typed entity systems. +- Rasa's conversation-level evaluation infrastructure. + +**OVOS makes a deliberately different choice** in two places: + +- *Engine-agnostic templates as training data* (OVOS-INTENT-1 + §4) rather than Rhasspy-style closed grammars. The trade-off: + generalization beyond authored samples vs. offline-deterministic + recognition guarantees. +- *Open skill ecosystem with skill-private intents* + (OVOS-INTENT-3 §1) rather than HA-style curated vocabulary. + The trade-off: skill author freedom vs. cross-integration + vocabulary sharing. --- -## 3. The pipeline-plugin model +## 3. Architectural patterns + +Two patterns recur across the spec family and are worth a +dedicated treatment. + +### 3.1 The bus as a substrate + +Under OVOS-MSG-1's `source` / `destination` / `session` model, +the bus is not just an internal transport — it is the +**substrate higher-level systems plug into without modifying +the assistant core**. Two mechanics make that work: +**single-flip routing** (§3.1.1), which keeps the routing pair +correct end-to-end without per-component effort; and **no +central state or correlation** (§3.1.2), which makes layer-2 +systems composable. HiveMind is the canonical example of what +both together enable (§3.1.3). + +#### 3.1.1 The single-flip routing model + +The most important bus invariant in OVOS, and the one most +often reinvented incorrectly. The routing pair (`source`, +`destination`) flips **exactly once per conversational turn**, +performed by ovos-core, before the intent dispatch is emitted. +From that point on, every handler-side emission is *already* +addressed back at the user. + +Three steps: + +1. **The user side emits.** An external component — + microphone service, chat UI, satellite client, test harness + — emits an utterance with `source` set to itself: + + context: { source: "audio", destination: null, session: {...} } + +2. 
**ovos-core flips, then dispatches.** When the intent + service matches an intent it derives the dispatch via + `Message.reply(match_type, data)` + (`ovos-core/.../service.py:340`). The `.reply` rule of + MSG-1 §5.2 swaps the routing pair: + + context: { source: "ovos-core", destination: "audio", session: {...} } + + The dispatch goes out on the per-intent topic + `:`. The flip has already classified + the message as *going back at the user*, even though a + skill handler is what actually runs. + +3. **The handler `.forward`s.** Every message the skill emits + in response — `speak`, the handler lifecycle trio, GUI + events — uses `Message.forward(...)` + (`ovos-workshop/.../ovos.py:1461, 1472, …`). `.forward` + preserves `context` unchanged, so every handler emission is + already addressed back at the original user-side component. + +Two consequences fall out: + +- **The boundary is user ↔ assistant, not core ↔ handler.** + Skill handlers are on OVOS's side of the boundary; from + outside, OVOS is one thing. The user doesn't know or care + which skill answered them. +- **Handler authors never write addressing code.** Because + `.forward` preserves the flipped pair, no skill anywhere + needs to understand `source` / `destination`. Get the + inversion right once in ovos-core, and every downstream + skill is automatically correct. + +What this rules out: no per-hop addressing (handlers don't +pick their own `destination`); no second flip (handlers +`.forward`, they don't `.reply` to the dispatch); the dispatch +topic `:` selects the handler, not +`destination` (the destination belongs to the user). +Implementers using `.reply` where `.forward` is appropriate +produce mis-routed messages that work in local tests but +silently break layer-2 routing. + +#### 3.1.2 No central correlation, no central state + +The bus is **fully asynchronous**. OVOS does not centrally +correlate request/response chains, and does not centrally +track per-conversation state. 
There is no per-message +identifier, no in-reply-to field, no host-side index mapping +a `.response` back to its request, no shared "current +conversation" record. + +`session.session_id` identifies an **interaction channel** — +nothing more. Two messages sharing a `session_id` are on the +same channel, but the spec guarantees nothing about ordering, +state continuity, or pending requests. + +Every component — skills, pipeline plugins, external clients, +layer-2 systems — owns whatever state it needs. An asker that +wants `.response` correlation keeps its own outstanding-request +table; a skill that wants conversational memory keeps its own +per-session store; a layer-2 system that wants per-peer state +keys on `session_id`. Whatever a later consumer needs is **in +the Message** (`data` / `context` / `session`) or **out of +band** — never recovered from a hidden host-side index. + +This is what lets layer-2 systems plug in cleanly: if OVOS +kept a central correlation index or a central conversation +state, every layer-2 system would have to replicate it, hook +into it, or work around it. Because OVOS keeps neither, they +compose without contention. + +Several real concerns are deferred by this stance and are +listed under §7 (Known gaps): multi-turn conversation, the +other session knobs current OVOS carries beyond `session_id` +and `lang` (`persona_id`, `time_format`, `date_format`, +`system_unit`, `tts_preferences`, …), and the eventual shape +of conversational state. The async-by-default model means +those future specs only need to define *what* the state is, +not *how* it travels. + +#### 3.1.3 Why HiveMind works + +HiveMind is the canonical layer-2 system this design enables. +A HiveMind satellite is just another user-side emitter — it +sets `source` to its peer ID, populates `session` with a +per-peer session, and emits a Message. 
Inside OVOS: + +- ovos-core runs the same `.reply` flip (§3.1.1 step 2) — + `destination` becomes the satellite's peer ID instead of + the local microphone. +- Skills `.forward` as usual — `destination` stays the + satellite ID through every handler emission. +- HiveMind, watching the bus, sees each message addressed to + its peer and routes it back over the HiveMind transport. + +The pre-existing `session_id == "default"` rule keeps +device-local TTS on the device's speakers (per +`ovos-audio/utils.py`'s `require_default_session`), because +remote HiveMind sessions carry their own `session_id` and +never `"default"`. + +None of this required HiveMind to modify OVOS core. The +mechanism that makes it work — single-flip routing + opaque +per-session identifiers + no central state — was already in +`ovos-bus-client/message.py:194-198`; OVOS-MSG-1 just names +and formalizes it. + +A layer-2 substrate also has a uniform **authorization +surface** in the spec family without inventing a separate +channel: client sessions populate the preference fields of +OVOS-SESSION-1 (`pipeline`, the six `_transformers`) +to request behaviour, while the layer-2 substrate populates +the policy fields (`blacklisted_pipelines`, +`blacklisted_skills`, `blacklisted_intents`, the six +`blacklisted__transformers`) from the peer's grant. +OVOS-PIPELINE-1 §5.5 and OVOS-TRANSFORM-1 §5.3 compose them +deterministically (preference → availability → policy) at the +orchestrator without per-hop re-authorization. 
+ +### 3.2 The pipeline-plugin model The piece that sits *around* the intent and bus stacks — the -multi-stage orchestrator that decides which engine even gets a -turn, runs `converse` / `fallback` / `common_query` / `ocp` / +multi-stage orchestrator that decides which engine even gets +a turn, runs `converse` / `fallback` / `common_query` / `ocp` / `persona` stages, and produces the universal `ovos.utterance.handled` end-marker — is what makes OVOS -structurally distinctive (Home Assistant and Rhasspy have no -equivalent layer). +structurally distinctive (HA and Rhasspy have no equivalent +layer). The plugin abstraction is **already in current code**: `OVOSPipelineFactory` loads pipeline plugins by id at startup, -the orchestrator holds them in a `pipeline_plugins` dict keyed -on `pipeline_id`, and the default `Session.pipeline` is an -ordered list of plugin identifiers (with a migration map -translating legacy `padatious_high`-style names into -modern `ovos-padatious-pipeline-plugin-high`-style ones). The +the orchestrator holds them in a `pipeline_plugins` dict +keyed on `pipeline_id`, and the default `Session.pipeline` is +an ordered list of plugin identifiers (with a migration map +translating legacy `padatious_high`-style names into modern +`ovos-padatious-pipeline-plugin-high`-style ones). The official `ovos-padatious-pipeline-plugin`, `ovos-adapt-pipeline-plugin`, `ovos-converse-pipeline-plugin`, -`ovos-fallback-pipeline-plugin`, `ovos-common-query-pipeline-plugin`, +`ovos-fallback-pipeline-plugin`, +`ovos-common-query-pipeline-plugin`, `ovos-ocp-pipeline-plugin`, and the persona plugins all already conform to this model. OVOS-PIPELINE-1's contribution is therefore a **prescriptive refinement**, not a wholesale new abstraction. 
It: -- formalizes the plugin contract (the `match` shape, the `Match` - result, the side-effect-free discipline); -- defines `:` **dispatch polymorphism** so - a plugin can bundle its own handler (a language-model persona, - a chatbot) as a first-class participant alongside skill-owned - handlers; -- prescribes the **universal `ovos.utterance.handled` end-marker** - on every terminal path; -- renames the `mycroft.skill.handler.*` trio → `ovos.intent.handler.*`. +- formalizes the plugin contract (the `match` shape, the + `Match` result, the side-effect-free discipline); +- defines `:` **dispatch + polymorphism** so a plugin can bundle its own handler (a + language-model persona, a chatbot) as a first-class + participant alongside skill-owned handlers; +- prescribes the **universal `ovos.utterance.handled` + end-marker** on every terminal path; +- renames the `mycroft.skill.handler.*` trio → + `ovos.intent.handler.*`. The current high/medium/low confidence-tier convention is **compatible** with PIPELINE-1 and out of scope for the spec. From the bus's perspective each tier is already a distinct `pipeline_id` in the session's pipeline list (e.g. -`padatious_high`, `padatious_medium`, `padatious_low`), which is -exactly what the spec prescribes. How a Python plugin class -internally serves multiple `pipeline_id`s — for example one class -with `match_high` / `match_medium` / `match_low` methods, an -orchestrator-side suffix-decoding helper, three separate plugin -instances, etc. — is implementation choice this spec does not -constrain. +`padatious_high`, `padatious_medium`, `padatious_low`), which +is exactly what the spec prescribes. How a Python plugin +class internally serves multiple `pipeline_id`s — one class +with `match_high` / `match_medium` / `match_low` methods, +three separate plugin instances, an orchestrator-side +suffix-decoding helper — is implementation choice the spec +does not constrain. 
-Three properties make the resulting model unusually expressive: +Three properties make the resulting model unusually +expressive: - **All plugins are equivalent.** No spec-level distinction between intent engines, converse handlers, fallbacks, language-model personas, classic chatbots, anything else. - They all expose the same `match` contract. A deployment loads - whichever plugins its skills need. -- **Skills and plugin-bundled handlers are indistinguishable as - handler owners.** From outside, the assistant responded — the - user does not know or care whether a skill matched against a - registered intent or a language-model plugin generated the - response on the fly. -- **The engine-agnostic intent contract is already realized**, - not hypothetical. OVOS persona plugins (`ovos-persona`, - `ovos-persona-server`, `ovos-claude-plugin`, - `ovos-openai-plugin`, etc.) plug into the pipeline as - first-class language-model stages. The ordered chain - (deterministic keyword engines before fuzzy template engines - before language-model fallbacks last) is also how the system - *bounds* generalization in practice. + They all expose the same `match` contract. +- **Skills and plugin-bundled handlers are indistinguishable + as handler owners.** From outside, the assistant + responded — the user does not know or care whether a skill + matched against a registered intent or a language-model + plugin generated the response on the fly. +- **The engine-agnostic intent contract is already + realized**, not hypothetical. OVOS persona plugins + (`ovos-persona`, `ovos-persona-server`, + `ovos-claude-plugin`, `ovos-openai-plugin`, etc.) plug into + the pipeline as first-class language-model stages. The + ordered chain (deterministic keyword engines before fuzzy + template engines before language-model fallbacks last) is + also how the system *bounds* generalization in practice. What OVOS-PIPELINE-1 deliberately leaves out: **per-plugin behavioural contracts**. 
A `converse` plugin, a `fallback` -plugin, a persona plugin: each defines itself. PIPELINE-1 only -defines the contract every plugin conforms to and the universal -utterance lifecycle around the iteration. +plugin, a persona plugin: each defines itself. PIPELINE-1 +only defines the contract every plugin conforms to and the +universal utterance lifecycle around the iteration. --- -## 4. Design rationale - -Short notes on *why* the specifications make the choices they do — the -reasoning, not the requirement. - -### Intent grammar and resources (INTENT-1, -2, -3) - -- **ASR-normalized input, no escaping** (OVOS-INTENT-1 §2). The grammar targets - voice input. By contract, text reaching an engine is already lowercased, - punctuation-stripped, single-spaced. Bracket metacharacters therefore cannot - occur as literal input, so no escape mechanism is needed. This is a - simplification *bought* by scoping the grammar to voice. -- **Templates are training data** (OVOS-INTENT-1 §4). Enumerating every - phrasing is futile for natural speech. A template describes the *shape* of - the training data; the engine generalizes. This is why expansion is defined - precisely but matching is not. -- **An intent is not an event** (OVOS-INTENT-3 §1). See §2.3 — necessary for an - open skill ecosystem. -- **Two non-interoperable methods** (OVOS-INTENT-3 §2). Keyword and template - intents describe a command in fundamentally different shapes. Rather than - forcing one model, the spec keeps both and makes engines declare which they - accept. The cost is that a developer must choose per intent and know which - engines an installation runs. -- **Slot typing is deferred** (OVOS-INTENT-1 §5.3). Interpreting a slot value - as a number or date is inseparable from how ASR output is normalized — and - normalization is not yet specified. Specifying typing first would be - incoherent, so a value is, for now, an opaque sequence of words. -- **`.blacklist` vs `excluded`** (OVOS-INTENT-3 §4.2, §5.4). 
The template - grammar is purely generative — it cannot express "not this". Template intents - therefore need a separate `.blacklist` artifact for suppression. Keyword - intents express the same idea natively with the `excluded` constraint role. - The asymmetry follows from the grammar, not from inconsistency. -- **No regular expressions** (OVOS-INTENT-3 §4.4). Free-form structured text is - a slot — use a template intent and the slot extractor. Regexes are also - notoriously hard to localize, which conflicts with the per-language model. -- **Inline vocabulary references reuse `.voc`** (OVOS-INTENT-1 §3.7). A - reusable template fragment and a keyword vocabulary are the same thing — a - named, slot-free phrase set — so `` resolves to a `.voc` rather than - introducing a new file role. The change is one grammar token plus an - expander step. - -### Bus, session, and routing (MSG-1) - -- **One spec, not two.** Envelope + routing + session + derivations - are tightly coupled — every routing key lives in `context`, every - derivation manipulates routing or session, and all of them - formalize *existing* OVOS code. Splitting them was tried; the split - did not survive the derivations (which can only meaningfully be - defined where the routing keys are), so they were merged into a - single bus-message spec. -- **`context` is extensible by design.** Only the keys other systems - already key behaviour off (`source`, `destination`, `session`) are - given normative meaning. Everything else — GUI routing, tracing, - security — is layered by other specs without touching the - envelope. -- **`source`/`destination` are informational, not authorization** - (MSG-1 §3.3). The bus is not a security boundary. Layer-2 systems - (HiveMind) build authentication and routing enforcement on top of - the pair without OVOS itself learning about peers. 
-- **The boundary is user ↔ assistant, not core ↔ handler.** The - `(source, destination)` pair marks who is currently talking to whom - across one boundary only: the external participant (user, chat UI, - satellite client, test harness) on one side, the assistant — OVOS - core *and* every skill handler — on the other. Skills are not on the - other side of this boundary from OVOS core; from the user's - perspective the assistant is one thing. The flip happens **once** - per conversational turn (§5.1), not on every internal hop. -- **`session_id == "default"` is the only normative-magic value** - (MSG-1 §4.1). It marks "originated by the device itself" and is the - hook `ovos-audio` already uses to decide whether to play TTS - locally. One reserved string, one well-defined consequence — enough - for layer-2 routing without specifying a full session model. -- **Absent `session` equals `session_id: "default"`** (MSG-1 §4.3). - Code paths that never set a session shouldn't accidentally get - treated as untrusted; the rule makes the substrate forgiving for - in-process subsystems while keeping the policy hook intact. -- **No central correlation, no central state** (MSG-1 §5.4). The bus - is fully asynchronous. There is no per-message ID, no - in-reply-to chain, no host-managed request/response index, and no - spec-level state tracking of any kind. Components that need to - correlate or remember things do it themselves, keyed on - `session.session_id` (the interaction-channel identifier — §5.2 - below). Multi-turn conversation, intent context, cross-skill - state, and similar concerns are deferred to future specifications; - see §5.2 for the model and §7 for the list of planned work. - -### Intent registration broadcast (INTENT-4) - -- **Registrations are broadcast — already how OVOS works.** Skills - emit registration messages on the bus; plugins that care about a - particular registration kind subscribe to the corresponding - topic. 
There has never been a central routing party in OVOS; - INTENT-4 just gives this existing model normative topic names. - The legacy bus topics (`padatious:register_intent`, - `register_vocab`, etc.) are renamed into the `ovos.intent.*` - namespace — see §6.7 for the mapping. A migration to the - prescribed topic names is mostly a string replacement. +## 4. Design rationale, per specification + +Short notes on *why* the specifications make the choices they +do — the reasoning, not the requirement. Cross-reference into +the normative sections. + +### 4.1 Intent grammar and resources (INTENT-1, -2, -3) + +- **ASR-normalized input, no escaping** (INTENT-1 §2). The + grammar targets voice input. By contract, text reaching an + engine is already lowercased, punctuation-stripped, + single-spaced. Bracket metacharacters therefore cannot + occur as literal input, so no escape mechanism is needed. + A simplification *bought* by scoping the grammar to voice. +- **Templates are training data** (INTENT-1 §4). Enumerating + every phrasing is futile for natural speech. A template + describes the *shape* of the training data; the engine + generalizes. This is why expansion is defined precisely + but matching is not. +- **An intent is not an event** (INTENT-3 §1). Necessary for + an open skill ecosystem — see §2.2. +- **Two non-interoperable methods** (INTENT-3 §2). Keyword + and template intents describe a command in fundamentally + different shapes. Rather than forcing one model, the spec + keeps both and makes engines declare which they accept. + The cost is that a developer must choose per intent and + know which engines an installation runs. +- **Slot typing is deferred** (INTENT-1 §5.3). Interpreting + a slot value as a number or date is inseparable from how + ASR output is normalized — and normalization is not yet + specified. Specifying typing first would be incoherent, so + a value is, for now, an opaque sequence of words. 
+- **`.blacklist` vs `excluded`** (INTENT-3 §4.2, §5.4). The + template grammar is purely generative — it cannot express + "not this". Template intents therefore need a separate + `.blacklist` artifact for suppression. Keyword intents + express the same idea natively with the `excluded` + constraint role. The asymmetry follows from the grammar, + not from inconsistency. +- **No regular expressions** (INTENT-3 §4.4). Free-form + structured text is a slot — use a template intent and the + slot extractor. Regexes are also notoriously hard to + localize, which conflicts with the per-language model. +- **Inline vocabulary references reuse `.voc`** (INTENT-1 + §3.7). A reusable template fragment and a keyword + vocabulary are the same thing — a named, slot-free phrase + set — so `` resolves to a `.voc` rather than + introducing a new file role. The change is one grammar + token plus an expander step. + +### 4.2 Bus message envelope (MSG-1) + +- **One spec, not two.** Envelope + routing + derivations + are tightly coupled — every routing key lives in + `context`, every derivation manipulates routing, and all + of them formalize *existing* OVOS code. Splitting them + was tried; the split did not survive the derivations + (which can only meaningfully be defined where the routing + keys are), so they were merged into a single bus-message + spec. The session carrier, by contrast, did split out + cleanly into OVOS-SESSION-1. +- **`context` is extensible by design.** Only the keys + other systems already key behaviour off (`source`, + `destination`, `session`) are given normative meaning. + Everything else — GUI routing, tracing, security — is + layered by other specs without touching the envelope. +- **`source`/`destination` are informational, not + authorization** (MSG-1 §3.3). The bus is not a security + boundary. Layer-2 systems (HiveMind) build authentication + and routing enforcement on top of the pair without OVOS + itself learning about peers. 
+- **The boundary is user ↔ assistant, not core ↔ handler.** + The `(source, destination)` pair marks who is currently + talking to whom across one boundary only: the external + participant on one side, the assistant — core and every + skill handler — on the other. The flip happens **once** + per conversational turn (§3.1.1), not on every internal + hop. +- **No central correlation, no central state** (MSG-1 §5.4, + §3.1.2 above). The bus is fully asynchronous. Components + that need correlation or state own it themselves, keyed + on `session.session_id`. Multi-turn conversation, intent + context, cross-skill state, and similar concerns are + deferred to other specifications. +- **Topic naming conventions** (MSG-1 v2 §2.1.2). The + conventions other specs in the family already follow are + now codified as SHOULD-rules: dot-separated hierarchy + with `:` reserved for component-pair shapes; stable + ecosystem-identifying root; verb-tense pattern for the + trailing segment; request/terminal pairs sharing a root + verb (`handle` ↔ `handled`); `.response` suffix for + response derivations; per-instance + `...` form. + +### 4.3 Session carrier (SESSION-1) + +- **Why a separate session spec.** `Message.context.session` + is a load-bearing carrier claimed by multiple specs + (PIPELINE-1, CONTEXT-1, TRANSFORM-1) — without a single + owner, its wire contract drifts. SESSION-1 consolidates + the wire shape and fixes a **registry mechanism** so + future specs claim fields without amending SESSION-1 + itself. +- **Prescriptive, not descriptive.** Only the fields + normatively claimed by other specs are recognized. + Implementations carrying extra per-session state + (current OVOS Session has `persona_id`, `system_unit`, + `time_format`, `date_format`, `location`, `is_speaking`, + `is_recording`, …) are non-normative under v1 — they + ride through as opaque pass-through and can be claimed + by future per-domain specs. 
+- **Omission means "let the orchestrator decide".** Single + deferral mechanism: omitted single field, empty + `session: {}`, absent `session`, explicit + `session_id: "default"` — all equivalent on the wire, + all resolve at consumption to deployment defaults filled + by each consumer. No `null`, no sentinels. +- **Language signals.** Six BCP-47 fields with normative + meanings but stage-dependent consolidation: `lang` (user + preference, base), `secondary_langs` (additional + understood languages, constrains lang-detect predictions + and fallback selection), `output_lang` (renderer's + preferred output language; simplifies the + bidirectional-translation transformer to a fallback role), + `stt_lang` / `request_lang` / `detected_lang` + (per-utterance signals from STT, emitter, and lang-detect + respectively). `request_lang` is an emitter-reported hint + (per-wakeword language assignment in multi-wakeword + setups), not an override. + +### 4.4 Intent registration broadcast (INTENT-4) + +- **Registrations are broadcast — already how OVOS works.** + Skills emit registration messages on the bus; plugins + that care about a particular registration kind subscribe + to the corresponding topic. There has never been a + central routing party in OVOS; INTENT-4 just gives this + existing model normative topic names. The legacy bus + topics (`padatious:register_intent`, `register_vocab`, + etc.) are renamed into the `ovos.intent.*` namespace — + see §5.7 for the mapping. Migration is mostly a string + replacement. - **No "no plugin claimed" error.** Following from the - broadcast model: a registration that no plugin consumes is - silently dropped. The producer gets no signal — the + broadcast model: a registration that no plugin consumes + is silently dropped. The producer gets no signal — the introspection topics (`ovos.intent.list` / - `ovos.intent.describe`) are the supported way to verify what - the orchestrator's passive index recorded. 
-- **The orchestrator passively indexes; it does not gate.** The - introspection topics serve from a passive registration index - built by listening to broadcasts (this *is* new — current OVOS - has no central index). The index reflects what skills - *declared*, not what plugins actually match against — - observability-only. - -### Pipeline plugins (PIPELINE-1) - -- **The plugin model is already in place; PIPELINE-1 refines it** - (see §3). The current orchestrator already loads plugins by id - through `OVOSPipelineFactory` and iterates `Session.pipeline`. - PIPELINE-1 tightens the contract rather than introducing the - abstraction. -- **Orchestrator and plugin contracts live in one spec**, since - the orchestrator's job *is* iterating plugins and translating - their matches into bus events. Splitting them would leave - neither coherent. -- **Plugin contract is minimal.** `match(utterance, session) → - Match | None`. Side-effect-free during `match`; everything - else (state, registrations, language-model calls, response - generation) is plugin-internal black box. The smaller the - contract, the wider the set of plugins it accommodates. -- **Tier conventions are out of scope.** The current high / - medium / low suffix is implementation strategy: from the bus, - each tier is already a distinct `pipeline_id` in - `Session.pipeline`. PIPELINE-1 prescribes only that the - orchestrator iterates opaque `pipeline_id`s; whether a Python - plugin class internally serves multiple tiers via - `match_high` / `match_medium` / `match_low` methods, separate - plugin instances, or anything else is implementation choice the - spec does not constrain. The current convention is compatible - with PIPELINE-1 unchanged. 
-- **Skills and plugins are equivalent handler owners.** Dispatch - topic `:` polymorphism (owner is - `skill_id` or `pipeline_id`) lets a plugin bundle its own - handler — for example, a language-model persona plugin that - has no skills behind it — and still be addressed uniformly. - From outside, the assistant responded; the internal owner - type is invisible. + `ovos.intent.describe`) are the supported way to verify + what the orchestrator's passive index recorded. +- **The orchestrator passively indexes; it does not + gate.** The introspection topics serve from a passive + registration index built by listening to broadcasts + (this *is* new — current OVOS has no central index). The + index reflects what skills *declared*, not what plugins + actually match against — observability-only. +- **Skill self-identification on every emission** + (INTENT-4 §3.1). Every Message a skill emits or + modifies in place carries `Message.context["skill_id"]`. + Enforcement is structural on the dispatch path: the + orchestrator stamps `context.skill_id` from the + `:` dispatch topic prefix + (PIPELINE-1 §7.1), and skill emissions via + `forward`/`reply` inherit automatically. + +### 4.5 Pipeline and lifecycle (PIPELINE-1) + +- **The plugin model is already in place; PIPELINE-1 + refines it** (§3.2). The current orchestrator already + loads plugins by id through `OVOSPipelineFactory` and + iterates `Session.pipeline`. PIPELINE-1 tightens the + contract rather than introducing the abstraction. +- **Orchestrator and plugin contracts live in one spec**, + since the orchestrator's job *is* iterating plugins and + translating their matches into bus events. Splitting + them would leave neither coherent. +- **Plugin contract is minimal.** `match(utterance, lang, + session) → Match | None`. Side-effect-free during + `match`; everything else (state, registrations, + language-model calls, response generation) is + plugin-internal black box. 
The smaller the contract, the + wider the set of plugins it accommodates. +- **`lang` parameter is propagation-only.** The + orchestrator passes `lang` through from + `Message.data.lang`; it **MUST NOT** synthesize a value + from `session.lang` or any per-utterance signal field + when `data.lang` is absent. Absence is a faithful + "unknown" signal; consumer-side fallback policy is the + consumer's. +- **Tier conventions are out of scope.** The current + high / medium / low suffix is implementation strategy: + from the bus, each tier is already a distinct + `pipeline_id` in `Session.pipeline`. The current + convention is compatible with PIPELINE-1 unchanged. +- **Skills and plugins are equivalent handler owners.** + Dispatch topic `:` polymorphism + (owner is `skill_id` or `pipeline_id`) lets a plugin + bundle its own handler — for example, a language-model + persona plugin that has no skills behind it — and still + be addressed uniformly. - **Universal `ovos.utterance.handled` end-marker on every - terminal path.** One reserved invariant lets observers count - turns, route fallbacks, and know "the assistant is idle now" - without per-stage knowledge. -- **`session.pipeline` is per-session.** Different - sessions can carry different pipeline configurations — for - example, a remote-peer session may run a restricted pipeline - that excludes destructive plugins. This composes with the - layer-2 substrate (§5) without orchestrator-side changes. - -### Intent context (CONTEXT-1) + terminal path.** One reserved invariant lets observers + count turns, route fallbacks, and know "the assistant + is idle now" without per-stage knowledge. +- **Three-stage composition** (PIPELINE-1 §5.5) — + preference (from `session.pipeline` or default-session + pipeline) → availability (drop unloaded plugins) → + policy (drop denylisted). Mirrors TRANSFORM-1 §5.3 + exactly. The same shape supports the + client-requests/layer-2-enforces split (§3.1). 
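
The three-stage composition named in the last bullet of §4.5 can be sketched as below. This is an illustration under stated assumptions, not the orchestrator's actual code: the helper name `compose_pipeline`, its parameters, and the example `pipeline_id` strings are hypothetical, an absent `session.pipeline` is modelled as `None`, and the denylist is assumed to correspond to `blacklisted_pipelines`.

```python
from typing import Iterable, List, Optional, Set


def compose_pipeline(
    session_pipeline: Optional[List[str]],
    default_pipeline: List[str],
    loaded_plugins: Set[str],
    denylist: Iterable[str] = (),
) -> List[str]:
    """Hypothetical helper: resolve the ordered pipeline_ids for one utterance."""
    # Stage 1, preference: use the session's ordering when it carries one,
    # otherwise fall back to the default-session pipeline (assumption:
    # omission is represented here as None).
    preferred = session_pipeline if session_pipeline is not None else default_pipeline

    # Stage 2, availability: drop ids whose plugin is not loaded.
    available = [pid for pid in preferred if pid in loaded_plugins]

    # Stage 3, policy: drop ids that are denylisted.
    denied = set(denylist)
    return [pid for pid in available if pid not in denied]


# Example with made-up pipeline ids: "persona" is not loaded and
# "padatious_high" is denylisted, so only "adapt_high" survives.
result = compose_pipeline(
    session_pipeline=["adapt_high", "padatious_high", "persona"],
    default_pipeline=["adapt_high", "fallback_low"],
    loaded_plugins={"adapt_high", "padatious_high", "fallback_low"},
    denylist=["padatious_high"],
)
print(result)  # ['adapt_high']
```

Each stage only removes entries, so the ordering stated by the preference stage is preserved in the output.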
+ +### 4.6 Intent context (CONTEXT-1) - **Lifts intent context out of Adapt.** The Adapt-era `add_context` / `remove_context` mechanism, and the Mycroft-era `mycroft.skill.set_cross_context` / - `mycroft.skill.remove_cross_context` fan-out for cross-skill - use, are Adapt-only at the matcher level — Padatious and - other engines ignore them. CONTEXT-1 generalizes the - mechanism into a session-bound, decaying flat key/value store + `remove_cross_context` fan-out for cross-skill use, are + Adapt-only at the matcher level — Padatious and other + engines ignore them. CONTEXT-1 generalizes the mechanism + into a session-bound, decaying flat key/value store consumed by every intent engine uniformly via `requires_context` and `excludes_context` declarations. -- **Two explicit scopes.** `private` (orchestrator - auto-prefixes with `:`) and `shared` (flat, - cross-skill). The current OVOS code models the same distinction - informally (`MycroftSkill.set_context` auto-prefixes with - `alphanumeric_skill_id`; `set_cross_skill_context` fans out via - a bus event); CONTEXT-1 names the scopes explicitly and routes - both through one bus surface (`intent.context.set` / `.unset` / - `.clear` / `.list`). +- **Two explicit scopes encoded in the key shape.** + `private` (orchestrator auto-prefixes with + `:`) and `shared` (flat, cross-skill). The + current OVOS code models the same distinction informally + (`MycroftSkill.set_context` auto-prefixes with + `alphanumeric_skill_id`; `set_cross_skill_context` fans + out via a bus event); CONTEXT-1 names the scopes + explicitly and routes both through one bus surface. - **Why private is the default.** A skill that calls - `ovos.context.set` without specifying `scope` gets a private - entry. This optimises for the safer case at the cost of being - the less-useful case: the spec's own worked example - (Person → Bob) is naturally cross-skill, and a reader might - expect shared to be the default. 
The choice favours migration - fidelity (the current Adapt `set_context` pattern is - effectively skill-private — keyed under `alphanumeric_skill_id`), - the safer footgun direction (a cross-skill leak from an - accidentally-shared entry is harder to debug than a - cross-skill miss from an accidentally-private entry), and - authorability (cross-skill coordination is a conscious decision - that deserves an explicit `scope: "shared"`). Skills that - routinely act across the skill boundary set the scope - explicitly; skills that don't get safety by default. -- **Prior art for the negative gate.** Three in-tree intent - engines under `/plugins-pipeline/` — + `ovos.context.set` without specifying `scope` gets a + private entry. This optimises for the safer case: a + cross-skill leak from an accidentally-shared entry is + harder to debug than a cross-skill miss from an + accidentally-private entry. The current Adapt + `set_context` pattern is effectively skill-private; the + default preserves migration fidelity. Cross-skill + coordination is a conscious decision that deserves an + explicit `scope: "shared"`. +- **Prior art for the negative gate.** Three in-tree + intent engines under `/plugins-pipeline/` — [jurebes](https://github.com/OpenJarbas/jurebes), - [nebulento](https://github.com/OpenJarbas/nebulento), and - [palavreado](https://github.com/OpenJarbas/palavreado) — - independently implement `exclude_context` as a first-class - negative gate. CONTEXT-1's `excludes_context` adopts the same - primitive at the spec level, addressing patterns ("fire once", - "modal suppression") that positive gating alone cannot express. -- **Engine-side mutation as a sanctioned non-bus pathway.** The - Adapt pipeline plugin auto-injects matched entities into context - *inside* `match()`, which conflicts with PIPELINE-1 §4.2's - side-effect-free `match` rule. 
CONTEXT-1 §5.3 carves an explicit - window between match-accept and dispatch-emit for engine-side - session mutation, with the orchestrator (not the bus) carrying - the write. This both legitimizes the established practice and - resolves the PIPELINE-1 contradiction. - -### Transformer plugins (TRANSFORM-1) - -- **Spec'd as an architectural pattern, not a feature list.** An - orchestrator MAY implement chains at any subset of six - injection points (audio, utterance, metadata, intent, dialog, - TTS); a null-implementation is conformant. For each chain it - does implement, the per-type contract binds. Each injection - point's existence is justified by what the lifecycle holds at - that exact moment — what's possible there that isn't possible - elsewhere. + [nebulento](https://github.com/OpenJarbas/nebulento), + and [palavreado](https://github.com/OpenJarbas/palavreado) + — independently implement `exclude_context` as a + first-class negative gate. CONTEXT-1's `excludes_context` + adopts the same primitive at the spec level, addressing + patterns ("fire once", "modal suppression") that + positive gating alone cannot express. +- **Engine-side mutation as a sanctioned non-bus + pathway.** The Adapt pipeline plugin auto-injects matched + entities into context *inside* `match()`, which conflicts + with PIPELINE-1 §4.2's side-effect-free `match` rule. + CONTEXT-1 §5.3 carves an explicit window between + match-accept and dispatch-emit for engine-side session + mutation, with the orchestrator (not the bus) carrying + the write. This both legitimizes the established + practice and resolves the PIPELINE-1 contradiction. +- **Eight-level lifecycle-position owner precedence** + (CONTEXT-1 §5.2). When a Message carries multiple + component-identity keys (skill_id, pipeline_id, the six + `_transformer_ids`) from a derivation chain that + crossed component boundaries, the orchestrator picks the + owner by lifecycle position: the latest stage to run is + the most specific. 
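
The `requires_context` / `excludes_context` gating discussed in this section can be illustrated with a small sketch. It assumes, purely for illustration, that an engine sees the session's context store as a flat mapping and that an intent is eligible when every required key is set and no excluded key is set. The function name `context_allows` and the key `greeted` are hypothetical; decay and private-scope prefixing are left out.

```python
from typing import Iterable, Mapping


def context_allows(
    active_context: Mapping[str, str],
    requires_context: Iterable[str] = (),
    excludes_context: Iterable[str] = (),
) -> bool:
    """Hypothetical gate: may an intent with these declarations match right now?"""
    # Positive gate: every required key must currently be set.
    if any(key not in active_context for key in requires_context):
        return False
    # Negative gate: any excluded key that is set suppresses the intent.
    if any(key in active_context for key in excludes_context):
        return False
    return True


# "Fire once" pattern: the intent excludes a key that its own handler sets,
# so it is eligible before the first run and suppressed afterwards.
context = {}
first = context_allows(context, excludes_context=["greeted"])
context["greeted"] = "yes"
second = context_allows(context, excludes_context=["greeted"])
print(first, second)  # True False
```

The second call shows why positive gating alone is not enough: there is no key whose presence could express "this has not happened yet".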
+ +### 4.7 Transformer plugins (TRANSFORM-1) + +- **Spec'd as an architectural pattern, not a feature + list.** An orchestrator MAY implement chains at any + subset of six injection points (audio, utterance, + metadata, intent, dialog, TTS); a null-implementation is + conformant. For each chain it does implement, the + per-type contract binds. Each injection point's + existence is justified by what the lifecycle holds at + that exact moment — what's possible there that isn't + possible elsewhere. - **Intent transformers as the system-typing home.** - OVOS-INTENT-1 §5.3 defers slot value typing pending a text - normalization specification. TRANSFORM-1 §3.4 is the spec'd - injection home for typing: a deployer ships date / number / - duration parsing once, and every skill receives typed values - in `Match.captures` regardless of which engine matched. The - OVOS analogue of ASK's `AMAZON.DATE` and Dialogflow's - `@sys.date-time`, but as an injected enrichment rather than a - built-in engine feature. -- **Concrete in-tree plugins as prior art.** Nine plugins live - under `/plugins-transformer/` today, covering five of the six - injection points: utterance transformers - (`ovos-utterance-normalizer`, `ovos-utterance-corrections-plugin`, + INTENT-1 §5.3 defers slot value typing pending a text + normalization specification. TRANSFORM-1 §3.4 is the + spec'd injection home for typing: a deployer ships + date / number / duration parsing once, and every skill + receives typed values in `Match.captures` regardless of + which engine matched. The OVOS analogue of ASK's + `AMAZON.DATE` and Dialogflow's `@sys.date-time`, but as + an injected enrichment rather than a built-in engine + feature. 
+- **Concrete in-tree plugins as prior art.** Nine plugins + live under `/plugins-transformer/` today, covering five + of the six injection points: utterance transformers + (`ovos-utterance-normalizer`, + `ovos-utterance-corrections-plugin`, `ovos-transcription-validator-plugin`, `ovos-utterance-plugin-cancel`, - `ovos-bidirectional-translation-plugin`); dialog transformers - (`ovos-dialog-normalizer-plugin`, + `ovos-bidirectional-translation-plugin`); dialog + transformers (`ovos-dialog-normalizer-plugin`, `ovos-bidirectional-translation-plugin`, - `ovos-dialog-transformer-openai-plugin`); audio transformers + `ovos-dialog-transformer-openai-plugin`); audio + transformers (`ovos-audio-transformer-plugin-speechbrain-langdetect`, `ovos-audio-transformer-plugin-ggwave`, - `ovos-audio-transformer-redis-publish`); intent transformers - (`ovos-keyword-template-matcher`, + `ovos-audio-transformer-redis-publish`); intent + transformers (`ovos-keyword-template-matcher`, `ovos-ahocorasick-ner-plugin`). The - `bidirectional-translation` plugin exercises the cross-chain - coordination via `Message.context` that TRANSFORM-1 §7 - formalizes. -- **Ascending priority.** TRANSFORM-1 §4 specifies ascending - priority (lower = earlier, default 50). Current OVOS sorts - transformer chains **descending** - (`ovos_core/transformers.py:53,117,205`, `reverse=True`); the - spec aligns with the **ascending** convention already used by - fallback skills (`fallback_service.py:49`, default 101 = run - last) and the natural "stages count up" reading. Bringing - current plugins into conformance only requires flipping - relative priorities, not rewriting. -- **Cancellation aligned with prior plugin convention.** Two - existing utterance transformers + `bidirectional-translation` plugin exercises the + cross-chain coordination via `Message.context` that + TRANSFORM-1 §7 formalizes. +- **Ascending priority.** TRANSFORM-1 §4 specifies + ascending priority (lower = earlier, default 50). 
+ Current OVOS sorts transformer chains **descending** + (`ovos_core/transformers.py:53,117,205`, `reverse=True`); + the spec aligns with the **ascending** convention + already used by fallback skills (`fallback_service.py:49`, + default 101 = run last) and the natural "stages count + up" reading. Bringing current plugins into conformance + only requires flipping relative priorities, not + rewriting. +- **Cancellation aligned with prior plugin convention.** + Two existing utterance transformers (`ovos-utterance-plugin-cancel`, - `ovos-transcription-validator-plugin`) already signal the - lifecycle should abort by returning empty utterance lists with - `{canceled: true, cancel_word: }` context keys. - TRANSFORM-1 §8 keeps the convention, renaming `cancel_word` to - `cancel_reason` (the structured concept the field encodes) and - adding orchestrator-stamped `cancel_by: `. The - spec's `ovos.utterance.cancelled` terminal event sits alongside - the existing `complete_intent_failure` from PIPELINE-1, keeping - cancellation and failure observably distinct on the bus. -- **Language signals live in SESSION-1.** Language signals - (`stt_lang`, `request_lang`, `detected_lang`, alongside `lang`, - `secondary_langs`, `output_lang`) are session-scoped fields with - normative meanings but a non-binding consolidation order — the - right priority is stage-dependent. TRANSFORM-1 §7.1 names which - transformer types are natural producers of which signals; - consolidation is the consumer's decision per SESSION-1 §3.2.7. -- **Per-type self-identification keys.** TRANSFORM-1 §1.3 claims - six `Message.context` keys — one per transformer type - (`audio_transformer_ids`, `utterance_transformer_ids`, - `metadata_transformer_ids`, `intent_transformer_ids`, - `dialog_transformer_ids`, `tts_transformer_ids`) — rather than - a single generic `transformer_ids`. Two reasons. 
First, the - role matters: a Message at the dialog stage may have been - touched by five transformer types in sequence, and lumping - them into one slot loses the role partitioning that exists in - every other surface of the spec (the per-type registries of - §1.1, the per-type `*_transformers` overrides of SESSION-1 §3, - the per-type introspection topics of §6). Second, multi-type - plugins disambiguate: a plugin shipping both an utterance and - a dialog transformer under the same `transformer_id` (permitted - by §1.1) would, with a single generic key, leave consumers - unable to tell which role emitted; per-type keys make the role - unambiguous on the wire. -- **List-valued attribution preserves chain provenance.** - Each of the six attribution context keys is a *list* of - `transformer_id` strings, not a single string. Transformers - chain by design — multiple transformers of the same type run - sequentially against the same Message-in-flight (§4) — and the - list preserves the full chain on the wire, in order of touch. - The last entry is the most-recent stamper. Skill and pipeline - identity keys (`context["skill_id"]`, `context["pipeline_id"]`) - remain single strings because skills and pipeline plugins - *originate* Messages rather than chain over them. + `ovos-transcription-validator-plugin`) already signal + the lifecycle should abort by returning empty utterance + lists with `{canceled: true, cancel_word: }` + context keys. TRANSFORM-1 §8 keeps the convention, + renaming `cancel_word` to `cancel_reason` (the structured + concept the field encodes) and adding orchestrator-stamped + `cancel_by: `. The spec's + `ovos.utterance.cancelled` terminal event sits alongside + `complete_intent_failure`, keeping cancellation and + failure observably distinct on the bus. +- **`lang` parameter is bidirectional** (TRANSFORM-1 §3.0). + Four of the six per-type contracts (audio, utterance, + dialog, TTS) take `lang` as input and return it as + output. 
A bidirectional-translation transformer that + takes Spanish in and produces English out returns the + destination language; the orchestrator writes the + chain's final `lang` back into `Message.data.lang` for + downstream stages. Language-detector and clearing cases + fall out of the same channel. +- **Per-type self-identification keys, list-valued.** + TRANSFORM-1 §1.3 claims six `Message.context` keys — one + per transformer type (`audio_transformer_ids`, …, + `tts_transformer_ids`) — rather than a single generic + key. Two reasons. First, role matters: a Message at the + dialog stage may have been touched by five transformer + types in sequence, and lumping them into one slot loses + the role partitioning that exists in every other + surface of the spec (per-type registries, per-type + `*_transformers` overrides, per-type introspection + topics). Second, multi-type plugins disambiguate: a + plugin shipping both an utterance and a dialog + transformer under the same `transformer_id` would, with + a single generic key, leave consumers unable to tell + which role emitted. The keys are *lists*, not single + strings, because transformers chain by design — the + list preserves the full per-type chain on the wire in + order of touch. - **Per-type denylists complete the policy surface.** - TRANSFORM-1 §5.2 claims six `blacklisted__transformers` - session fields, paralleling the six `_transformers` - chain-ordering fields of §5.1 and the - `pipeline`/`blacklisted_pipelines` pair of OVOS-PIPELINE-1 §5. - Three-stage composition (preference → availability → policy) - in §5.3 mirrors PIPELINE-1 §5.5 exactly. -- **The per-type "explosion" of session fields is deliberate.** - Counting transformer-related session-field claims: six chain - orderings (§5.1) + six denylists (§5.2) = twelve fields, plus - six `Message.context` attribution keys. 
That is a lot of - wire-surface names, and it is a deliberate tradeoff against - the alternative of a `transformer_:` prefix-encoded - single namespace. The per-type partition gives direct key - lookup, avoids prefix parsing in CONTEXT-1 §5.2 attribution and - in §5.3 chain composition, and matches the per-type - partitioning that already exists in the §1.1 registries, the - §4 chain ordering rules, and the §6 introspection topics. - Under the canonical SHOULD-omit rule of SESSION-1 §3.4, the - common case carries zero of these fields on the wire — a - session diverges from deployment defaults only as needed. If - the field count ever proves painful in practice, the cleanest - fallback is an object-valued form - (`session.transformers: {audio: [...], ...}` and - `session.blacklisted_transformers: {audio: [...], ...}`), - collapsing twelve flat fields into two structured ones with - the per-type partition preserved as object keys. The flat form - was chosen for parallelism with `pipeline` (array, not object) - and for direct field access. - -### Session (SESSION-1) - -- **Why a separate session spec.** `Message.context.session` is a - load-bearing carrier claimed by multiple specs (PIPELINE-1, - CONTEXT-1, TRANSFORM-1) — without a single owner, its wire - contract drifts. SESSION-1 consolidates the wire shape and fixes - a **registry mechanism** so future specs claim fields without - amending SESSION-1 itself. -- **Prescriptive, not descriptive.** Only the fields normatively - claimed by other specs are recognized. Implementations - carrying extra per-session state (current OVOS Session class - has `site_id`, `persona_id`, `system_unit`, `time_format`, - `date_format`, `location`, `is_speaking`, `is_recording`, - `blacklisted_skills`, `blacklisted_intents`) are non-normative - under v1 — they ride through as opaque pass-through (§2.3) and - can be claimed by future per-domain specs. 
-- **Omission means "let the orchestrator decide".** Single - deferral mechanism: omitted single field, empty `session: {}`, - absent `session`, explicit `session_id: "default"` — all - equivalent on the wire, all resolve at consumption to deployment - defaults filled by each consumer. No `null`, no sentinels. -- **Language signals.** Four BCP-47 fields with normative meanings - but stage-dependent consolidation: `lang` (user preference, - base), `secondary_langs` (additional understood languages, - constrains lang-detect predictions and fallback selection), - `output_lang` (renderer's preferred output language; simplifies - the bidirectional-translation transformer to a fallback role), - `stt_lang` / `request_lang` / `detected_lang` (per-utterance - signals from STT, emitter, and lang-detect respectively). - `request_lang` is an emitter-reported hint (per-wakeword - language assignment in multi-wakeword setups), not an - override. - ---- - -## 5. The OVOS bus as a substrate - -Under MSG-1's `source` / `destination` / `session` model, the bus is -not just an internal transport — it is the **substrate higher-level -systems plug into without modifying OVOS**. Two mechanics make that -work: **single-flip routing** (§5.1), which keeps the routing pair -correct end-to-end without per-component effort; and **no central -state or correlation** (§5.2), which makes layer-2 systems -composable. HiveMind is the canonical example of what both -together enable (§5.3). - -### 5.1 The single-flip routing model - -The most important bus invariant in OVOS, and the one most often -reinvented incorrectly. The routing pair (`source`, `destination`) -flips **exactly once per conversational turn**, performed by -ovos-core, before the intent dispatch is emitted. From that point -on, every handler-side emission is *already* addressed back at the -user. - -Three steps: - -1. 
**The user side emits.** An external component — microphone - service, chat UI, satellite client, test harness — emits an - utterance with `source` set to itself: - - context: { source: "audio", destination: null, session: {...} } - -2. **ovos-core flips, then dispatches.** When the intent service - matches an intent it derives the dispatch via - `Message.reply(match_type, data)` (`ovos-core/.../service.py:340`). - The `.reply` rule of MSG-1 §5.2 swaps the routing pair: - - context: { source: "ovos-core", destination: "audio", session: {...} } - - The dispatch goes out on the per-intent topic - `:`. The flip has already classified the - message as *going back at the user*, even though a skill handler - is what actually runs. - -3. **The handler `.forward`s.** Every message the skill emits in - response — `speak`, the handler lifecycle trio, GUI events — - uses `Message.forward(...)` (`ovos-workshop/.../ovos.py:1461, - 1472, …`). `.forward` preserves `context` unchanged, so every - handler emission is already addressed back at the original - user-side component. - -Two consequences fall out: - -- **The boundary is user ↔ assistant, not core ↔ handler.** Skill - handlers are on OVOS's side of the boundary; from outside, OVOS - is one thing. The user doesn't know or care which skill answered - them. -- **Handler authors never write addressing code.** Because - `.forward` preserves the flipped pair, no skill anywhere needs - to understand `source` / `destination`. Get the inversion right - once in ovos-core, and every downstream skill is automatically - correct. - -What this rules out: no per-hop addressing (handlers don't pick -their own `destination`); no second flip (handlers `.forward`, -they don't `.reply` to the dispatch); the dispatch topic -`:` selects the handler, not `destination` -(the destination belongs to the user). 
Implementers using `.reply` -where `.forward` is appropriate produce mis-routed messages that -work in local tests but silently break layer-2 routing. - -### 5.2 No central correlation, no central state - -The bus is **fully asynchronous**. OVOS does not centrally -correlate request/response chains, and does not centrally track -per-conversation state. There is no per-message identifier, no -in-reply-to field, no host-side index mapping a `.response` back to -its request, no shared "current conversation" record. - -`session.session_id` identifies an **interaction channel** — -nothing more. Two messages sharing a `session_id` are on the same -channel, but the spec guarantees nothing about ordering, state -continuity, or pending requests. - -Every component — skills, pipeline plugins, external clients, -layer-2 systems — owns whatever state it needs. An asker that -wants `.response` correlation keeps its own outstanding-request -table; a skill that wants conversational memory keeps its own -per-session store; a layer-2 system that wants per-peer state -keys on `session_id`. Whatever a later consumer needs is **in the -Message** (`data` / `context` / `session`) or **out of band** — -never recovered from a hidden host-side index. - -This is what lets layer-2 systems plug in cleanly: if OVOS kept a -central correlation index or a central conversation state, every -layer-2 system would have to replicate it, hook into it, or work -around it. Because OVOS keeps neither, they compose without -contention. - -Several real concerns are deferred by this stance and are listed -under §7 known gaps: multi-turn conversation, intent context -(adapt's `add_context`/`remove_context`), the other session knobs -current OVOS carries beyond `session_id` and `lang` (`pipeline`, -`site_id`, `persona_id`, `time_format`, `date_format`, -`system_unit`, `tts_preferences`, …), and the eventual shape of -conversational state. 
The async-by-default model means those -future specs only need to define *what* the state is, not *how* -it travels. - -### 5.3 Why HiveMind works - -HiveMind is the canonical layer-2 system this design enables. A -HiveMind satellite is just another user-side emitter — it sets -`source` to its peer ID, populates `session` with a per-peer -session, and emits a Message. Inside OVOS: - -- ovos-core runs the same `.reply` flip (§5.1 step 2) — - `destination` becomes the satellite's peer ID instead of the - local microphone. -- Skills `.forward` as usual — `destination` stays the satellite - ID through every handler emission. -- HiveMind, watching the bus, sees each message addressed to its - peer and routes it back over the HiveMind transport. - -The pre-existing `session_id == "default"` rule keeps device-local -TTS on the device's speakers (per `ovos-audio/utils.py`'s -`require_default_session`), because remote HiveMind sessions -carry their own `session_id` and never `"default"`. - -None of this required HiveMind to modify OVOS core. The mechanism -that makes it work — single-flip routing + opaque per-session -identifiers + no central state — was already in -`ovos-bus-client/message.py:194-198`; MSG-1 just names and -formalizes it. + TRANSFORM-1 §5.2 claims six + `blacklisted__transformers` session fields, + paralleling the six `_transformers` chain-ordering + fields of §5.1 and the + `pipeline` / `blacklisted_pipelines` pair of PIPELINE-1 + §5. Three-stage composition (preference → availability + → policy) in §5.3 mirrors PIPELINE-1 §5.5 exactly. +- **The per-type "explosion" is deliberate.** Counting + transformer-related session-field claims: six chain + orderings + six denylists = twelve fields, plus six + `Message.context` attribution keys. The alternative — a + `transformer_:` prefix-encoded single + namespace — would require prefix parsing at every + lookup. 
The per-type partition matches the partitioning + that already exists in the §1.1 registries, the §4 + chain ordering rules, and the §6 introspection topics. + Under the canonical SHOULD-omit rule of SESSION-1 §3.4, + the common case carries zero of these fields on the + wire. If the field count ever proves painful in + practice, the cleanest fallback is an object-valued form + (`session.transformers: {audio: [...], ...}`), + collapsing twelve flat fields into two structured ones + with the per-type partition preserved as object keys. +- **Language signals live in SESSION-1.** Language signals + (`stt_lang`, `request_lang`, `detected_lang`, alongside + `lang`, `secondary_langs`, `output_lang`) are + session-scoped fields with normative meanings but a + non-binding consolidation order — the right priority is + stage-dependent. TRANSFORM-1 §7.1 names which + transformer types are natural producers of which + signals; consolidation is the consumer's decision per + SESSION-1 §3.2.7. --- -## 6. Where the specs differ from current OVOS code +## 5. Where the specs differ from current OVOS code -These specifications are *prescriptive*. Some of what they prescribe -matches what runs in OVOS today verbatim; some is a deliberate -cleanup the implementations are expected to grow into. This section -catalogues every known divergence so implementers know what to -migrate and reviewers know what to expect. +These specifications are *prescriptive*. Some of what they +prescribe matches what runs in OVOS today verbatim; some is a +deliberate cleanup the implementations are expected to grow +into. This section catalogues every known divergence so +implementers know what to migrate and reviewers know what to +expect. 
-### 6.1 Already aligned +### 5.1 Already aligned -Formalizations of behaviour that exists in current OVOS code and -needs no implementation change: +Formalizations of behaviour that exists in current OVOS code +and needs no implementation change: - The Message envelope (`type` / `data` / `context`) — matches `ovos-bus-client.Message`. -- `source`, `destination` semantics including the `Message.reply` - swap — matches `ovos-bus-client/message.py`. +- `source`, `destination` semantics including the + `Message.reply` swap — matches `ovos-bus-client/message.py`. - `context.session` as a serialized Session object — matches `ovos-bus-client/client/client.py`'s `message.context["session"] = sess.serialize()`. - `session.session_id == "default"` for device-local origin — matches `ovos-audio/utils.py`'s `require_default_session` decorator. -- `session.lang` as the user's preferred language — matches the - Session class's `lang` attribute. -- `forward` / `reply` / `response` derivation semantics — matches - `ovos-bus-client.Message.{forward,reply,response}`. +- `session.lang` as the user's preferred language — matches + the Session class's `lang` attribute. +- `forward` / `reply` / `response` derivation semantics — + matches `ovos-bus-client.Message.{forward,reply,response}`. - The `.response` suffix convention — pervasive across OVOS topics today. - The `complete_intent_failure` no-match topic (PIPELINE-1) — matches current topic name verbatim. - `ovos.utterance.cancelled` and `ovos.utterance.handled` (PIPELINE-1) — match current topic names verbatim. -- Per-utterance first-match-wins iteration (PIPELINE-1) — matches - `ovos-core/intent_services/service.py`'s +- Per-utterance first-match-wins iteration (PIPELINE-1) — + matches `ovos-core/intent_services/service.py`'s `handle_utterance` / `get_pipeline`. - Per-session pipeline configuration (PIPELINE-1) — matches - `Session.pipeline` (modulo the field rename in §6.3 below). 
-- The `:` dispatch topic shape (PIPELINE-1) - — matches current OVOS practice; skills already subscribe to - these topics. + `Session.pipeline`. +- The `:` dispatch topic shape + (PIPELINE-1) — matches current OVOS practice; skills + already subscribe to these topics. -### 6.2 Prescriptive renames +### 5.2 Prescriptive renames | Spec | Current | Prescribed | Notes | |------|---------|------------|-------| | INTENT-3 v1.1 | "host" | "orchestrator" | Editorial; conformance unchanged. | -| PIPELINE-1 | `mycroft.skill.handler.start` / `.complete` / `.error` | `ovos.intent.handler.start` / `.complete` / `.error` | Renamed into the `ovos.intent.*` namespace for uniformity. Breaks every existing handler-lifecycle observer; the migration cost is real (see §B in PR #11 discussion). | - -### 6.3 Prescriptive shape changes - -- **Keyword intent registration is atomic** (INTENT-4 §5). Today - a keyword intent is built up via multiple `register_vocab` - messages followed by a `register_intent` with an Adapt - `IntentBuilder.__dict__` payload. INTENT-4 collapses this into - a single message with structured `{required, optional, one_of, - excluded}` arrays of vocabulary descriptors. Every skill's - keyword-intent path needs to be rewritten in the workshop layer. +| PIPELINE-1 | `mycroft.skill.handler.start` / `.complete` / `.error` | `ovos.intent.handler.start` / `.complete` / `.error` | Renamed into the `ovos.intent.*` namespace for uniformity. Breaks every existing handler-lifecycle observer; the migration cost is real. | +| PIPELINE-1 | `recognizer_loop:utterance` | `ovos.utterance.handle` | See §5.4 entry. Migration touches `ovos-dinkum-listener`, `ovos-simple-listener`, `ovos-audio`, and `ovos-core/intent_services/service.py`. | + +### 5.3 Prescriptive shape changes + +- **Keyword intent registration is atomic** (INTENT-4 §5).
+ Today a keyword intent is built up via multiple + `register_vocab` messages followed by a `register_intent` + with an Adapt `IntentBuilder.__dict__` payload. INTENT-4 + collapses this into a single message with structured + `{required, optional, one_of, excluded}` arrays of + vocabulary descriptors. Every skill's keyword-intent path + needs to be rewritten in the workshop layer. - **Template intent registration uses structured identity** (INTENT-4 §6). Today `padatious:register_intent` carries `{name, samples, file_name, lang, blacklisted_words}`; the - prescribed shape uses the structured `(skill_id, intent_name, - lang)` triple plus `samples|file` and `blacklist|blacklist_file`. -- **Dispatch payload uses polymorphic `owner_id`** (PIPELINE-1 - §7.1). Today dispatch carries `skill_id` only. PIPELINE-1's - `owner_id` is either a `skill_id` or a `pipeline_id` — same - field, polymorphic value. -- **Handler-lifecycle payload includes `owner_id`** (PIPELINE-1 - §8.2). Today the trio payload is `{name: }`. - Prescribed: `{owner_id, intent_name, optional exception}`. - -### 6.4 Architectural divergences + prescribed shape uses the structured `(skill_id, + intent_name, lang)` triple plus `samples|file` and + `blacklist|blacklist_file`. +- **Dispatch payload uses polymorphic `owner_id`** + (PIPELINE-1 §7.1). Today dispatch carries `skill_id` only. + PIPELINE-1's `owner_id` is either a `skill_id` or a + `pipeline_id` — same field, polymorphic value. +- **Handler-lifecycle payload includes `owner_id`** + (PIPELINE-1 §8.2). Today the trio payload is + `{name: }`. Prescribed: `{owner_id, + intent_name, optional exception}`. + +### 5.4 Architectural divergences - **The orchestrator maintains a passive registration index** - (INTENT-4 §10). Today there is no central index — each plugin - knows what it consumed; nothing aggregates that view. 
INTENT-4 - prescribes the orchestrator subscribe to all registration - topics in parallel with plugins and serve - `ovos.intent.list` / `ovos.intent.describe` from the passive - view. This is a new orchestrator responsibility, not a change - to existing behaviour. + (INTENT-4 §10). Today there is no central index — each + plugin knows what it consumed; nothing aggregates that + view. INTENT-4 prescribes the orchestrator subscribe to + all registration topics in parallel with plugins and serve + `ovos.intent.list` / `ovos.intent.describe` from the + passive view. This is a new orchestrator responsibility, + not a change to existing behaviour. - **Plugins are side-effect-free during `match`** (PIPELINE-1 - §4.2). This is a forward-looking rule rather than a fix for - current code. The standard `match_high`/`match_medium`/ - `match_low` methods in the official plugins are already - side-effect-free (they compute and return). Where side effects - do happen today, they are orchestrator-side after the match - wins (e.g. the `.activate` emit in - `ovos-core/intent_services/service.py:365`), or in *other* bus - handlers a plugin subscribes to. The spec rule keeps the - current discipline normative as alternative plugin types - (LLM-backed, agent-backed) are written. -- **`ovos.utterance.handled` on every terminal path** (PIPELINE-1 - §9.5). Current `ovos-workshop`'s `_on_event_error` does not - emit it on the handler-error path (`ovos.py:1478-1497`). - PIPELINE-1 §8 places trio emission on the orchestrator-wrapper - around the handler, not on the handler itself — workshop is the - wrapper in current OVOS, and the spec contract requires the - wrapper to emit `ovos.utterance.handled` unconditionally. -- **Handler-trio is orchestrator-owned** (PIPELINE-1 §8). The - orchestrator that invokes the handler wraps the call and emits - `ovos.intent.handler.start` / `.complete` / `.error` around it. 
- Third-party handler code carries **no normative obligation** to - participate in trio emission. Skill authors are not protocol - authors; the wrapper observes start / return / exception around - an opaque callable. + §4.2). This is a forward-looking rule rather than a fix + for current code. The standard + `match_high` / `match_medium` / `match_low` methods in the + official plugins are already side-effect-free (they + compute and return). Where side effects do happen today, + they are orchestrator-side after the match wins (e.g. the + `.activate` emit in + `ovos-core/intent_services/service.py:365`), or in + *other* bus handlers a plugin subscribes to. The spec + rule keeps the current discipline normative as alternative + plugin types (LLM-backed, agent-backed) are written. +- **`ovos.utterance.handled` on every terminal path** + (PIPELINE-1 §9.5). Current `ovos-workshop`'s + `_on_event_error` does not emit it on the handler-error + path (`ovos.py:1478-1497`). PIPELINE-1 §8 places trio + emission on the orchestrator-wrapper around the handler, + not on the handler itself — workshop is the wrapper in + current OVOS, and the spec contract requires the wrapper + to emit `ovos.utterance.handled` unconditionally. +- **Handler-trio is orchestrator-owned** (PIPELINE-1 §8). + The orchestrator that invokes the handler wraps the call + and emits `ovos.intent.handler.start` / `.complete` / + `.error` around it. Third-party handler code carries **no + normative obligation** to participate in trio emission. + Skill authors are not protocol authors; the wrapper + observes start / return / exception around an opaque + callable. - **Per-pipeline_id intent introspection** (PIPELINE-1 §10). - Pull-query / scatter-response surface keyed on `pipeline_id`, - giving consumers visibility into *which intents a particular - pipeline plugin's matcher has compiled*, distinct from the - orchestrator's manifest of declared intents (INTENT-4 §10). No - current OVOS analogue. 
+ Pull-query / scatter-response surface keyed on + `pipeline_id`, giving consumers visibility into *which + intents a particular pipeline plugin's matcher has + compiled*, distinct from the orchestrator's manifest of + declared intents (INTENT-4 §10). No current OVOS analogue. - **CONTEXT-1 scope and ownership encoded in the key shape** - (CONTEXT-1 §2, §3). A bare key `Person` is shared; a prefixed - key `music.skill:Person` is private to `music.skill`. The `:` - is load-bearing — mirroring the `:` - dispatch topic. Drops separate `scope` and `origin` fields on - stored entries (both were redundant with the key shape). - `requires_context` and `excludes_context` declarations take an - OPTIONAL `scope: private|shared` discriminator (default - `private`) to express which lookup the gate uses; bare-string + (CONTEXT-1 §2, §3). A bare key `Person` is shared; a + prefixed key `music.skill:Person` is private to + `music.skill`. The `:` is load-bearing — mirroring the + `:` dispatch topic. Drops separate + `scope` and `origin` fields on stored entries (both were + redundant with the key shape). `requires_context` and + `excludes_context` declarations take an OPTIONAL + `scope: private|shared` discriminator (default `private`) + to express which lookup the gate uses; bare-string declarations default to private to prevent shared-leak. - **Skill self-identification on every emission** (INTENT-4 - §3.1). Every Message a skill emits carries - `Message.context["skill_id"]`. Current OVOS skills set this on - some emissions but not uniformly. Enforcement is structural on + §3.1). Current OVOS skills set `context.skill_id` on some + emissions but not uniformly. Enforcement is structural on the dispatch path: the orchestrator stamps - `context.skill_id` from the `:` dispatch - topic prefix (PIPELINE-1 §7.1), and skill emissions via + `context.skill_id` from the `:` + dispatch topic prefix, and skill emissions via `forward`/`reply` inherit automatically. 
Loader-side - interception covers off-dispatch emissions. Drives CONTEXT-1 - §5.2 stored-key computation. + interception covers off-dispatch emissions. - **Entry-point topic renamed `ovos.utterance.handle`** - (PIPELINE-1 §9.1). Current deployments use the Mycroft-era - `recognizer_loop:utterance`. That name fails the naming - conventions of OVOS-MSG-1 §2.1.2 on three counts: it uses `:` - as a segment separator (where `:` is reserved for - `:` dispatch topics, §2.1.1); its - leading segment names an implementation role (the audio-input - "recognizer loop") rather than a stable assistant root; and - it does not pair with the past-tense terminal event - `ovos.utterance.handled`. The rename to - `ovos.utterance.handle` fixes all three: dot-separated - hierarchy, stable `ovos.` root, request/terminal pair - (`handle` ↔ `handled`) sharing a root verb. Migration cost - is real — every audio-input service emits this, every - intent-service handler subscribes — touching - `ovos-dinkum-listener`, `ovos-simple-listener`, `ovos-audio`, - and `ovos-core/intent_services/service.py`. A transitional + (PIPELINE-1 §9.1). Current deployments use the + Mycroft-era `recognizer_loop:utterance`. That name fails + the naming conventions of OVOS-MSG-1 §2.1.2 on three + counts: it uses `:` as a segment separator (where `:` is + reserved for `:` dispatch topics); + its leading segment names an implementation role (the + audio-input "recognizer loop") rather than a stable + assistant root; and it does not pair with the past-tense + terminal event `ovos.utterance.handled`. The rename fixes + all three: dot-separated hierarchy, stable `ovos.` root, + request/terminal pair (`handle` ↔ `handled`) sharing a + root verb. Migration cost is real — every audio-input + service emits this, every intent-service handler + subscribes — touching `ovos-dinkum-listener`, + `ovos-simple-listener`, `ovos-audio`, and + `ovos-core/intent_services/service.py`. 
A transitional deployment MAY subscribe to both names during migration. -- **All introspection topics share the `ovos..` prefix.** - Verb segments vary by domain — INTENT-4 nests under - `ovos.intent.register.` / `ovos.intent.list`; PIPELINE-1 - uses `ovos.pipeline..intents.list`; CONTEXT-1 uses - `ovos.context.`; TRANSFORM-1 uses - `ovos.transformer..`. Uniformity is at the prefix - level, not at verb depth. -### 6.5 New topics with no direct precedent +### 5.5 New topics with no direct precedent - **`ovos.intent.matched`** (PIPELINE-1 §9.2). The positive-match broadcast notification. Current OVOS has `complete_intent_failure` for the negative case but no positive equivalent. -- **`ovos.intent.list` / `ovos.intent.describe`** (INTENT-4 §10). - Introspection topics served from the orchestrator's passive - registration index. -- **Materialize-default-session rule** on `forward` / `reply` / - `response` (MSG-1 §4.3). Formalizes a "MAY" convenience for - in-process subsystems; not currently implemented but compatible - with current behaviour. +- **`ovos.intent.list` / `ovos.intent.describe`** (INTENT-4 + §10). Introspection topics served from the orchestrator's + passive registration index. +- **`ovos.context.set` / `.unset` / `.clear` / `.list`** + (CONTEXT-1 §5). Skill-facing API replacing Adapt-era + `add_context` / `remove_context` plus + `mycroft.skill.set_cross_context`. +- **`ovos.transformer.{type}.list`** (TRANSFORM-1 §6). + Per-type introspection of loaded transformers. +- **Materialize-default-session rule** on `forward` / + `reply` / `response` (MSG-1 §4.3). Formalizes a "MAY" + convenience for in-process subsystems; not currently + implemented but compatible with current behaviour. + +### 5.6 Things the specs do *not* change + +- The session object's internal shape is owned by + OVOS-SESSION-1; the field set is the closed set defined + there plus whatever future specs claim via SESSION-1 §2.1. 
+ The "extra" fields current OVOS Session carries + (`persona_id`, `system_unit`, `time_format`, `date_format`, + …) ride through as non-normative pass-through and may be + claimed by future per-domain specs. +- The `mycroft.*` topic prefix outside the intent layer (e.g. + `mycroft.audio.*`) — these are not part of any spec here. +- The `:` dispatch topic — kept + verbatim from current OVOS so no skill needs to migrate + its handler subscription. +- **Engine-specific introspection topics.** The standard + plugins expose their own debug / inspection topics — for + example `intent.service.adapt.reply`, + `intent.service.adapt.manifest`, + `intent.service.adapt.vocab.manifest`, and + `intent.service.padatious.get`. These are plugin-specific + surface, parallel to the spec's generic + `ovos.intent.list` / `ovos.intent.describe` (INTENT-4 + §10). The specs do not claim authority over them — they + remain plugin-defined and may continue to coexist with + the orchestrator's generic index. -### 6.5.1 Introspection patterns across the specs (informative) +### 5.7 Predecessor-topic mapping -Four specs in this set define pull-query / scatter-response -introspection surfaces. The shapes are intentionally similar but -serve different scopes: +The bus topics formalized by INTENT-4 and PIPELINE-1 replace +a number of legacy names. 
Implementer migration aid: -| Spec | Topic | Scope | Authoritative responder | -|------|-------|-------|-------------------------| -| INTENT-4 §10 | `ovos.intent.list` / `.describe` | Declared intents observed on the bus | Orchestrator (the manifest) | -| PIPELINE-1 §10 | `ovos.pipeline..intents.list` | Intents currently compiled inside a specific plugin's matcher | The pipeline plugin | -| CONTEXT-1 §5.4 | `ovos.context.list` | Post-decay session-context snapshot | The orchestrator process owning the match round | -| TRANSFORM-1 §6 | `ovos.transformer..list` | Loaded transformers per injection point | The orchestrator process implementing that chain | +#### Registration topics (INTENT-4) -Three properties hold across all four: +| Legacy topic | v1 replacement | Notes | +|--------------|---------------|-------| +| `register_vocab` | folded into `ovos.intent.register.keyword` | Vocabularies in v1 are inline `samples` or `file`-by-path inside the registration. | +| `register_intent` (Adapt parser) | `ovos.intent.register.keyword` | Adapt's `IntentBuilder.__dict__` payload replaced by the structured shape. | +| `padatious:register_intent` | `ovos.intent.register.template` | Same content, structured payload. | +| `padatious:register_entity` | `ovos.entity.register` | Entities are not Padatious-specific. | +| `detach_intent` | `ovos.intent.deregister` | Identity now expressed as the structured triple, not the munged `skill_id:intent_name` string. | +| `detach_skill` | `ovos.skill.deregister` | | +| `mycroft.skill.enable_intent` / `mycroft.skill.disable_intent` | `ovos.intent.enable` / `ovos.intent.disable` | First-class topics under v1, with the prefix dropped. | + +#### Utterance-lifecycle topics (PIPELINE-1) + +| Legacy topic | Status | +|--------------|--------| +| `recognizer_loop:utterance` | renamed to `ovos.utterance.handle` (see §5.4) | +| `complete_intent_failure` | **unchanged** — kept as the no-match signal. 
| +| `ovos.utterance.cancelled` | **unchanged** — kept as the cancellation signal. | +| `ovos.utterance.handled` | **unchanged** — kept as the universal end-marker. | +| `:` | **unchanged** — kept as the dispatch topic; PIPELINE-1 extends the shape to `:` so plugins can also own handlers. | +| `mycroft.skill.handler.start` / `.complete` / `.error` | renamed to `ovos.intent.handler.start` / `.complete` / `.error` | -1. **Pull-query is the source of truth.** Producers MAY broadcast - load-time announcements; consumers MUST NOT rely on having - received them. The bus is asynchronous and gives no delivery - guarantee; a consumer that started late missed the broadcast. -2. **No completeness signal.** A consumer that wants completeness - keeps its own roster of expected responders and times out - non-responders. -3. **Per-process slices under split orchestrators.** When the - orchestrator is split (PIPELINE-1 §2), each process responds - from its own slice; consumers aggregate. - -All four surfaces share the `ovos..` prefix; verb segments -vary by domain (some nest, some don't). The uniformity is in the -namespace, not in a fixed depth. - -The word **"intent"** appears in three of the four topic strings -above with three different meanings, which is worth flagging for -implementers wiring observers: - -- `ovos.intent.list` (INTENT-4 §10) — list of registered *intents* - (the things skills declare; `data` entries name `intent_name`). -- `ovos.pipeline..intents.list` (PIPELINE-1 §10) — - list of *intents currently compiled by one plugin's matcher* - (`data` entries name `intent_name`). +#### Out of scope + +| Legacy topic | Status | +|--------------|--------| +| `add_context` / `remove_context` | Replaced by `ovos.context.set` / `.unset` under CONTEXT-1. | +| `mycroft.skill.set_cross_context` / `remove_cross_context` | Replaced by `ovos.context.set` / `.unset` with `scope: "shared"` under CONTEXT-1. 
| +| `.activate` | Activity-tracking emit currently in `ovos-core`; not part of any spec here. | + +--- + +## 6. Implementer reference + +Material an implementer reaches for repeatedly: cross-spec +tables that don't fit cleanly in any single normative spec. + +### 6.1 Topic-name conventions across the family + +The naming conventions of OVOS-MSG-1 v2 §2.1.2 — dot-separated +hierarchy, stable root, verb-tense pattern for the trailing +segment, request/terminal pairs sharing a root verb, +`.response` suffix, per-instance +`...` form — apply across the family. +The three-way collision of the word "intent" in introspection +topics deserves an explicit callout: + +- `ovos.intent.list` (INTENT-4 §10) — list of registered + *intents* (skills declare them; `data` entries name + `intent_name`). +- `ovos.pipeline..intents.list` (PIPELINE-1 + §10) — list of *intents currently compiled by one plugin's + matcher* (`data` entries name `intent_name`). - `ovos.transformer.intent.list` (TRANSFORM-1 §6) — list of *intent-transformer plugins* loaded at the intent-transformer - injection point (`data` entries name `transformer_id`).
+ Despite the topic shape, this is **not** an intent-listing + surface; it follows the per-chain pattern + `ovos.transformer..list` where `` happens to + be `intent` for this chain (alongside `audio`, `utterance`, + `metadata`, `dialog`, `tts`). + +The collision is at the human-reading level only; payload +shapes are distinct and a consumer subscribing to one cannot +accidentally parse responses from another. + +### 6.2 Session-field cheat-sheet + +Every spec in the family that claims a `session` field does +so via the OVOS-SESSION-1 §2.1 registry mechanism. The full +set spans four specs; this table consolidates them. All +fields follow the canonical SHOULD-omit / +`[]`-equivalent-to-omission wire-weight rule of OVOS-SESSION-1 §3.4. -| Field | Owner spec | Role | Empty-array semantics | -|-------|------------|------|------------------------| +| Field | Owner | Role | Empty-array semantics | +|-------|-------|------|------------------------| | `session_id` | SESSION-1 §3.1 | identity / channel | n/a (string; `"default"` reserved) | | `lang` | SESSION-1 §3.2.1 | preference (user) | n/a (string) | | `secondary_langs` | SESSION-1 §3.2.2 | preference (user) | ≡ absent | @@ -1014,232 +1298,109 @@ OVOS-SESSION-1 §3.4. specific behaviour. Orchestrator narrows the request by availability and policy. - *Policy* — populated by deployment / layer-2 substrate to - enforce constraints. Overrides preference at the composition - stage (PIPELINE-1 §5.5, TRANSFORM-1 §5.3). -- *Signal* — recorded by a producer or earlier lifecycle stage - to communicate information about this specific utterance. + enforce constraints. Overrides preference at the + composition stage (PIPELINE-1 §5.5, TRANSFORM-1 §5.3). +- *Signal* — recorded by a producer or earlier lifecycle + stage to communicate information about this specific + utterance. - *Identity / channel* — names the session itself; not a preference or policy knob. 
-**Stamp-rule cheat-sheet (component identities, not session -fields — for reference alongside the table above):** - -| Context key | Owner spec | Stamps on | Stamps on .reply / .response | Stamps on .forward | -|-------------|------------|-----------|-------------------------------|---------------------| -| `skill_id` | INTENT-4 §3.1 | every origination + modify-in-place | yes (authorial) | no (preserves inherited) | -| `pipeline_id` | PIPELINE-1 §3.1 | every origination + modify-in-place | yes (authorial) | no (preserves inherited) | -| `_transformer_ids` (six) | TRANSFORM-1 §1.3 | every origination + modify-in-place | yes (append to list) | no (list rides through) | - -All three identity surfaces coexist freely on a single Message -when the derivation chain crosses component boundaries. -Attribution consumers apply the eight-level precedence of -CONTEXT-1 §5.2 to pick a single owner when needed. - -### 6.6 Things the specs do *not* change - -- The session object's internal shape is now owned by - OVOS-SESSION-1; the field set is the closed set defined there - plus whatever future specs claim via SESSION-1 §2.1. The "extra" - fields current OVOS Session carries (`site_id`, `persona_id`, - `system_unit`, `time_format`, `date_format`, etc.) ride through - as non-normative pass-through and may be claimed by future - per-domain specs. -- The `mycroft.*` topic prefix outside the intent layer (e.g. - `mycroft.audio.*`) — these are not part of any spec here. -- The `:` dispatch topic — kept verbatim - from current OVOS so no skill needs to migrate its handler - subscription. -- **Engine-specific introspection topics.** The standard plugins - expose their own debug / inspection topics — for example - `intent.service.adapt.reply`, - `intent.service.adapt.manifest`, - `intent.service.adapt.vocab.manifest`, and - `intent.service.padatious.get`. These are - plugin-specific surface, parallel to the spec's generic - `ovos.intent.list` / `ovos.intent.describe` (INTENT-4 §10). 
- The specs do not claim authority over them — they remain - plugin-defined and may continue to coexist with the - orchestrator's generic index. +### 6.3 Component-identity stamp-rule cheat-sheet -### 6.7 Predecessor-topic mapping +Each component type self-identifies via a reserved context +key. The keys coexist freely on a single Message when the +derivation chain crosses component boundaries; attribution +consumers apply the eight-level lifecycle-position precedence +of CONTEXT-1 §5.2 to pick a single owner when needed. -The bus topics formalized by INTENT-4 and PIPELINE-1 replace a -number of legacy names. Implementer migration aid: +| Context key | Owner | Stamps on (origination + modify-in-place) | `.reply` / `.response` | `.forward` | +|-------------|-------|------|----------|--------| +| `skill_id` | INTENT-4 §3.1 | yes | yes (authorial — overwrite) | no (preserve inherited) | +| `pipeline_id` | PIPELINE-1 §3.1 | yes | yes (authorial — overwrite) | no (preserve inherited) | +| six `_transformer_ids` (list-valued) | TRANSFORM-1 §1.3 | yes (append) | yes (append) | no (list rides through) | -#### Registration topics (INTENT-4) +The `_transformer_ids` list-valued form preserves the +full per-type chain provenance on the wire (every transformer +of that type that touched the Message, in order of touch). +Single-string `skill_id` / `pipeline_id` reflect that those +component types *originate* Messages rather than chain over +them. -| Legacy topic | v1 replacement | Notes | -|--------------|---------------|-------| -| `register_vocab` | folded into `ovos.intent.register.keyword` | Vocabularies in v1 are inline `samples` or `file`-by-path inside the registration. | -| `register_intent` (Adapt parser) | `ovos.intent.register.keyword` | Adapt's `IntentBuilder.__dict__` payload replaced by the structured shape. | -| `padatious:register_intent` | `ovos.intent.register.template` | Same content, structured payload. 
| -| `padatious:register_entity` | `ovos.entity.register` | Entities are not Padatious-specific. | -| `detach_intent` | `ovos.intent.deregister` | Identity now expressed as the structured triple, not the munged `skill_id:intent_name` string. | -| `detach_skill` | `ovos.skill.deregister` | | -| `mycroft.skill.enable_intent` / `mycroft.skill.disable_intent` | `ovos.intent.enable` / `ovos.intent.disable` | First-class topics under v1, with the prefix dropped. | +### 6.4 Introspection patterns -#### Utterance-lifecycle topics (PIPELINE-1) +Four specs in this set define pull-query / scatter-response +introspection surfaces. The shapes are intentionally similar +but serve different scopes: -| Topic | Status | -|-------|--------| -| `recognizer_loop:utterance` | renamed to `ovos.utterance.handle` — see §6.4 above. | -| `complete_intent_failure` | **unchanged** — kept as the no-match signal. | -| `ovos.utterance.cancelled` | **unchanged** — kept as the cancellation signal. | -| `ovos.utterance.handled` | **unchanged** — kept as the universal end-marker. | -| `:` | **unchanged** — kept as the dispatch topic; PIPELINE-1 extends the shape to `:` so plugins can also own handlers. 
| -| `mycroft.skill.handler.start` / `.complete` / `.error` | renamed to `ovos.intent.handler.start` / `.complete` / `.error` | +| Spec | Topic | Scope | Authoritative responder | +|------|-------|-------|-------------------------| +| INTENT-4 §10 | `ovos.intent.list` / `.describe` | Declared intents observed on the bus | Orchestrator (the manifest) | +| PIPELINE-1 §10 | `ovos.pipeline..intents.list` | Intents currently compiled inside a specific plugin's matcher | The pipeline plugin | +| CONTEXT-1 §5.4 | `ovos.context.list` | Post-decay session-context snapshot | The orchestrator process owning the match round | +| TRANSFORM-1 §6 | `ovos.transformer..list` | Loaded transformers per injection point | The orchestrator process implementing that chain | -#### Out of scope +Three properties hold across all four: -| Topic | Status | -|-------|--------| -| `add_context` / `remove_context` | Adapt conversational context — not part of intent registration. A future spec may define it. | -| `.activate` | Activity-tracking emit currently in `ovos-core`; not part of any spec here. | +1. **Pull-query is the source of truth.** Producers MAY + broadcast load-time announcements; consumers MUST NOT + rely on having received them. The bus is asynchronous + and gives no delivery guarantee; a consumer that started + late missed the broadcast. +2. **No completeness signal.** A consumer that wants + completeness keeps its own roster of expected responders + and times out non-responders. +3. **Per-process slices under split orchestrators.** When + the orchestrator is split (PIPELINE-1 §2), each process + responds from its own slice; consumers aggregate. + +All four surfaces share the `ovos..` prefix; verb +segments vary by domain (some nest, some don't). The +uniformity is in the namespace, not in a fixed depth. --- ## 7. 
Known gaps and planned work -- **Per-plugin behavioural specs.** PIPELINE-1 defines the plugin - contract (the `match` shape, the orchestrator's iteration - semantics) but explicitly defers what each non-trivial plugin - type actually *does*. Real candidates for their own - specifications: `converse`, `fallback`, `common_query`, `ocp`, - `persona`, `stop`. Each defines its own internal behaviour and - its own bus emissions beyond the universal lifecycle PIPELINE-1 - prescribes. -- **A session specification.** MSG-1 §4 carries `session` opaquely - and names only `session_id` and `lang`; PIPELINE-1 §5 adds - `pipeline`. Everything else about the session is - deferred — session lifecycle (start, end, expiry, resumption), - the full set of session preferences current OVOS already carries - (`site_id`, `persona_id`, `time_format`, `date_format`, - `system_unit`, `tts_preferences`, …), and the shape of any - conversational state. A future session specification picks - these up. -- **A multi-turn conversation specification.** When a skill asks - the user a question and waits for the next utterance, the "next - utterance belongs to that pending question" link is not - formalized today (handled informally by the `converse` plugin - type plus skill-side state). MSG-1's async-by-default stance - (§5.2) leaves room for this to be formalized either in the - session spec or as a separate one. -- **Intent context.** Formalized in **OVOS-CONTEXT-1** — see §4 - *Intent context* above. The Adapt-era `add_context` / - `remove_context` feature is lifted to a session-bound, - decaying, engine-agnostic primitive. -- **The utterance-transformer chain.** Formalized in - **OVOS-TRANSFORM-1** — see §4 *Transformer plugins* above — - covering six injection points (audio, utterance, metadata, - intent, dialog, TTS) and their cancellation contract. -- **Text normalization of ASR output.** The basis for slot value - typing (OVOS-INTENT-1 §5.3). Deferred to its own specification. 
-- **A machine-checkable conformance corpus** of `template → sample - set` pairs for OVOS-INTENT-1 expansion, so expander conformance - can be verified automatically. A parallel corpus of bus-message - fixtures for MSG-1 would be the equivalent at the bus layer. -- **An end-to-end worked example.** The specs have local examples; - none shows a single skill defining one keyword intent and one - template intent through the whole path — files, registration, - match, handler. -- **i18n corpus.** OVOS-INTENT-2 defines the locale file format, and - ovos-localize (§8) provides the operations layer; what remains is - the *scale* of the translated corpus. - ---- - -## 8. Ecosystem tooling: ovos-localize - -The specifications define formats and contracts; turning those into a working -i18n operation takes tooling. **ovos-localize** is that layer — a GitHub-native -localization platform for OVOS skills, built specifically around the resource -roles of OVOS-INTENT-2. - -It scans skill repositories for locale files; analyzes each skill's Python -source (via AST) to recover the **handler context** of a resource — which -function uses a file, what its slots mean, what dialog it triggers, which is -exactly the intent↔handler binding of OVOS-INTENT-3 §1; validates translations -against a rule set (slot preservation, expansion validity, variant counts); and -lets translators browse, edit, preview, and submit translations as pull -requests. It also exports a unified intent/dialog/vocabulary dataset. - -ovos-localize is the OVOS counterpart to Home Assistant's managed -`intents` repository. 
Two honest notes: it is currently -**descriptive** of real OVOS skills — it also handles legacy file -types these specs deliberately drop — so as the specs and the -ecosystem converge, its file-type coverage and the specs will need to -meet in the middle; and its translation validators are a natural home -for spec conformance checks, distinct from but related to the planned -grammar-level conformance corpus (§7). - ---- - -## 9. The spec set, in three stacks - -Built bottom-up in three stacks: - -- The **intent stack**, in dependency order: OVOS-INTENT-1 - (template grammar) → OVOS-INTENT-2 (resource files) → - OVOS-INTENT-3 (the intent concept) → OVOS-INTENT-4 (the - registration wire format on the bus). -- The **bus stack**: OVOS-MSG-1 formalizes the envelope, routing, - session carrier, and `forward`/`reply`/`response` derivations. - OVOS-SESSION-1 formalizes the wire shape of the session carrier. -- The **orchestrator stack**: OVOS-PIPELINE-1 defines the - orchestrator, the pipeline-plugin abstraction, the utterance - lifecycle, and the handler-lifecycle trio. OVOS-CONTEXT-1 - defines per-session intent-context state. OVOS-TRANSFORM-1 - defines the six injection-point transformer chains. Sits on top - of the bus stack (uses MSG-1's envelope and routing, SESSION-1's - session carrier) and around the intent stack (intent - registrations are one kind of input pipeline plugins consume). - -The **reference implementation** for the intent stack is -**ovos-spec-tools** — expander, resource loader, dialog renderer, -language matching, locale linter, in one dependency-light package. -The bus and orchestrator stacks do not yet have a comparable -reference; `ovos-bus-client` is the closest match for MSG-1 and -`ovos-core` is the closest match for PIPELINE-1 + INTENT-4, but -both predate the specs. - ---- - -## 10. Compatibility levels - -Each specification carries its own integer (or minor) `Version`, -bumped per PR per the contributing rules in the README. 
The -architecture as a whole is spoken of at **compatibility levels** — -versioned snapshots a tool may target, checked against by -`ovos-spec-lint`. - -The compatibility-level model works cleanly for the **intent -stack**, where a single integer identifies a coherent grammar / -resources / intent-definition snapshot. The bus and orchestrator -stacks do not yet map onto the same single-axis ladder; a -specification-set-wide version tuple covering all eight specs is -a planned follow-up. - -The intent-stack ladder: - -- **V0** — *informal.* The undocumented, de-facto behaviour from - before these specifications existed. V0 is not specified - anywhere; it is the baseline the formalization started from. - V0 has no notion of the `.blacklist` resource role or of - `` references. -- **V1** — the intent stack as first formalized: OVOS-INTENT-1, - -2 and -3, each at version 1. V1's headline addition over V0 - is the `.blacklist` role. -- **V2** — V1 plus **inline vocabulary references** (the - `` token): OVOS-INTENT-1 and OVOS-INTENT-2 at version 2. - A V2 template cannot be expanded by a V1 tool. - -These intent-stack levels continue to make sense in isolation. -The bus stack (OVOS-MSG-1), the registration spec (OVOS-INTENT-4), -and the orchestrator spec (OVOS-PIPELINE-1) are versioned -**individually** and not placed on a unified compatibility -ladder. A tool targeting them today cites per-spec versions: -"MSG-1 v1, INTENT-4 v1, PIPELINE-1 v1." Whether the compat-level -model evolves into a multi-axis grid, per-stack ladders, or is -quietly deprecated in favour of per-spec versions only, is -deferred. - +- **Per-plugin behavioural specs.** OVOS-PIPELINE-1 defines + the plugin contract (the `match` shape, the orchestrator's + iteration semantics) but explicitly defers what each + non-trivial plugin type actually *does*. Real candidates + for their own specifications: `converse`, `fallback`, + `common_query`, `ocp`, `persona`, `stop`. 
Each defines its
+  own internal behaviour and its own bus emissions beyond
+  the universal lifecycle PIPELINE-1 prescribes.
+- **A full session-lifecycle specification.** SESSION-1
+  defines the wire shape; lifecycle (start, end, expiry,
+  resumption) and the full set of session preferences
+  current OVOS already carries (`persona_id`,
+  `time_format`, `date_format`, `system_unit`,
+  `tts_preferences`, `location`, …) are deferred to a
+  future specification.
+- **A multi-turn conversation specification.** When a skill
+  asks the user a question and waits for the next utterance,
+  the "next utterance belongs to that pending question" link
+  is not formalized today (handled informally by the
+  `converse` plugin type plus skill-side state). MSG-1's
+  async-by-default stance (§3.1.2) leaves room for this to
+  be formalized either in the session spec or as a separate
+  one.
+- **Text normalization of ASR output.** The basis for slot
+  value typing (INTENT-1 §5.3). Deferred to its own
+  specification.
+- **A machine-checkable conformance corpus** of `template →
+  sample set` pairs for INTENT-1 expansion, so expander
+  conformance can be verified automatically. A parallel
+  corpus of bus-message fixtures for MSG-1 would be the
+  equivalent at the bus layer.
+- **An end-to-end worked example.** The specs have local
+  examples; none shows a single skill defining one keyword
+  intent and one template intent through the whole path —
+  files, registration, match, handler.
+- **Conversation-level evaluation infrastructure.** Rasa
+  has story-based testing and end-to-end success metrics;
+  the OVOS specs do not currently have a counterpart.
+- **i18n corpus.** OVOS-INTENT-2 defines the locale file
+  format, and `ovos-localize` (§1.4) provides the
+  operations layer; what remains is the *scale* of the
+  translated corpus.
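The pull-query properties listed under §6.4 (no completeness signal; a consumer keeps its own roster of expected responders and times out non-responders; split orchestrators answer per process) can be illustrated with a minimal sketch. It is not taken from any OVOS package: `collect_slices`, the `(responder_id, items)` tuple shape, and the responder names are all hypothetical, and a plain `queue.Queue` stands in for the bus subscription.

```python
import queue
import time


def collect_slices(inbox, expected, timeout_s=1.0):
    """Gather per-process response slices until every expected responder
    has answered or the deadline passes.

    `inbox` is a queue.Queue of (responder_id, items) tuples; this tuple
    shape is an assumption made for the sketch, not a wire format.
    """
    deadline = time.monotonic() + timeout_s
    slices = {}
    while not set(expected) <= set(slices):
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            responder_id, items = inbox.get(timeout=remaining)
        except queue.Empty:
            break
        slices[responder_id] = items
    # Responders on the roster that never answered are reported, not guessed.
    missing = sorted(set(expected) - set(slices))
    return slices, missing


# Two of three hypothetical orchestrator processes answer in time.
inbox = queue.Queue()
inbox.put(("core-a", ["intent.one"]))
inbox.put(("core-b", ["intent.two"]))
slices, missing = collect_slices(inbox, {"core-a", "core-b", "core-c"}, timeout_s=0.05)
```

The consumer aggregates whatever slices arrived and treats `missing` as its own completeness judgment, since the bus itself gives none.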
From a0066bda8dfad5a67951cc72c169551b46ce4d7b Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Mon, 25 May 2026 16:09:05 +0100 Subject: [PATCH 18/27] =?UTF-8?q?APPENDIX=20=C2=A72:=20drop=20Mycroft=20co?= =?UTF-8?q?mparator=20subsection;=20renumber=202.4-2.7=20to=202.3-2.6?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Mycroft AI Inc shut down in 2023; the fork is years old and the intervening design is not Mycroft's. Keeping a 'comparison to predecessor' subsection over-attributes the architecture and mis-frames OVOS as a derivative project rather than a long- running open project in its own right. Section §2 is now a comparison with currently-relevant voice-assistant systems only: - §2.1 Home Assistant and Rhasspy (shared grammar lineage) - §2.2 Closed domain vs open ecosystem - §2.3 Rasa - §2.4 Amazon ASK / Google Dialogflow - §2.5 hassil - §2.6 Summary Collateral: dropped Mycroft from the project-name list in the intro and from the comparator enumeration in the §2.6 summary. Legacy topic strings that happen to contain 'mycroft' in their literal name remain in the §5 divergence tables and §5.7 predecessor-topic mapping as factual code references. Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 55 ++++++++++++++++------------------------------------- 1 file changed, 16 insertions(+), 39 deletions(-) diff --git a/APPENDIX.md b/APPENDIX.md index e485f87..487c3af 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -10,7 +10,7 @@ OVOS-INTENT-3, OVOS-INTENT-4, OVOS-MSG-1, OVOS-SESSION-1, OVOS-PIPELINE-1, OVOS-CONTEXT-1, and OVOS-TRANSFORM-1. Pointers to specific OVOS code (file paths, class names, function -names) and to specific real projects (HiveMind, Mycroft, Adapt, +names) and to specific real projects (HiveMind, Adapt, padatious, ovos-audio, ovos-workshop, …) are deliberately kept *out* of the spec bodies and collected here, because implementation code moves and specifications must not. 
@@ -237,31 +237,7 @@ shared-vocabulary model is correct for a curated one. The two models are right for different platforms; neither is universally better. -### 2.3 Mycroft — the predecessor - -The merged OVOS specifications are effectively Mycroft plus -corrections. The bus model, the `mycroft.skill.handler.*` -trio, the `recognizer_loop:utterance` entry topic, the session -concept — all inherited. Mycroft never wrote any of this down. - -OVOS's contribution is the *formalization*, plus the cleanups -of the prescriptive divergence catalogue (§5): - -- Single-flip routing (formalized in OVOS-MSG-1 §5). -- The `:` dispatch shape generalized - beyond skills (OVOS-PIPELINE-1 §7). -- The per-injection-point transformer contracts of - OVOS-TRANSFORM-1 (Mycroft had ad-hoc audio / utterance / - TTS hooks but no normative IO contract). -- The explicit gating semantics of OVOS-CONTEXT-1 (Mycroft - had `Message.context` adjacency rules but no normative - gate semantics). -- The handler-lifecycle trio renamed `mycroft.skill.handler.*` - → `ovos.intent.handler.*`. -- The entry-topic rename `recognizer_loop:utterance` → - `ovos.utterance.handle` (OVOS-PIPELINE-1 §9.1). - -### 2.4 Rasa — closest comparator for intent context +### 2.3 Rasa — closest comparator for intent context Rasa's "active forms" and slot mappings perform context-aware matching, but they are baked into the policy engine; you @@ -285,7 +261,7 @@ policy/preference split (TRANSFORM-1 §5.3) does not exist. TRANSFORM-1's six-injection-point model is genuinely more expressive. -### 2.5 Amazon ASK / Alexa Skills Kit, Google Dialogflow +### 2.4 Amazon ASK / Alexa Skills Kit, Google Dialogflow Both are closed-domain centrally-trained stacks. Their built-in entity-type systems (`AMAZON.DATE`, @@ -307,7 +283,7 @@ they do not have first-class dispatch identity. OVOS-PIPELINE-1's advertise its own intent identity *to the user* on the bus, indistinguishable from a skill — original to OVOS. 
-### 2.6 hassil — comparable only at the grammar layer +### 2.5 hassil — comparable only at the grammar layer The Home Assistant template-matcher, comparable only to OVOS-INTENT-1 / -2 / -3 (grammar + locale resources + intent @@ -318,7 +294,7 @@ transformers), or OVOS-CONTEXT-1 (no decaying session state). The grammar layer is broadly equivalent to OVOS-INTENT-1; everything above the grammar is OVOS-only. -### 2.7 Summary — where OVOS leads, follows, and differs +### 2.6 Summary — where OVOS leads, follows, and differs **OVOS leads architecturally** in three places: @@ -328,8 +304,8 @@ everything above the grammar is OVOS-only. owner on the same dispatch surface. - **The six-injection-point transformer chain with per-session preference/policy separation.** Nothing in HA, Rhasspy, - Mycroft, Rasa, ASK, or Dialogflow has a comparable - lifecycle-uniform extensibility surface. + Rasa, ASK, or Dialogflow has a comparable lifecycle-uniform + extensibility surface. - **Negative gating (`excludes_context` "match if absent") in CONTEXT-1.** ASK/Dialogflow contexts are purely positive; Rasa forms are not engine-agnostic; HA has no @@ -491,10 +467,11 @@ remote HiveMind sessions carry their own `session_id` and never `"default"`. None of this required HiveMind to modify OVOS core. The -mechanism that makes it work — single-flip routing + opaque -per-session identifiers + no central state — was already in -`ovos-bus-client/message.py:194-198`; OVOS-MSG-1 just names -and formalizes it. +mechanism that makes it work — single-flip routing, opaque +per-session identifiers, no central state — was an OVOS +design, built into `ovos-bus-client/message.py:194-198` +before this spec family was written; OVOS-MSG-1 formalizes +the design rather than introducing it. A layer-2 substrate also has a uniform **authorization surface** in the spec family without inventing a separate @@ -798,9 +775,9 @@ the normative sections. 
### 4.6 Intent context (CONTEXT-1) -- **Lifts intent context out of Adapt.** The Adapt-era +- **Lifts intent context out of Adapt.** The Adapt-specific `add_context` / `remove_context` mechanism, and the - Mycroft-era `mycroft.skill.set_cross_context` / + legacy `mycroft.skill.set_cross_context` / `remove_cross_context` fan-out for cross-skill use, are Adapt-only at the matcher level — Padatious and other engines ignore them. CONTEXT-1 generalizes the mechanism @@ -1120,7 +1097,7 @@ and needs no implementation change: interception covers off-dispatch emissions. - **Entry-point topic renamed `ovos.utterance.handle`** (PIPELINE-1 §9.1). Current deployments use the - Mycroft-era `recognizer_loop:utterance`. That name fails + legacy `recognizer_loop:utterance` topic name. That name fails the naming conventions of OVOS-MSG-1 §2.1.2 on three counts: it uses `:` as a segment separator (where `:` is reserved for `:` dispatch topics); @@ -1147,7 +1124,7 @@ and needs no implementation change: §10). Introspection topics served from the orchestrator's passive registration index. - **`ovos.context.set` / `.unset` / `.clear` / `.list`** - (CONTEXT-1 §5). Skill-facing API replacing Adapt-era + (CONTEXT-1 §5). Skill-facing API replacing Adapt-specific `add_context` / `remove_context` plus `mycroft.skill.set_cross_context`. - **`ovos.transformer.{type}.list`** (TRANSFORM-1 §6). From a52151afaf63e65d20d7913bca49e11fa2dfe3af Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Mon, 25 May 2026 16:18:14 +0100 Subject: [PATCH 19/27] =?UTF-8?q?APPENDIX=20=C2=A73.3:=20external-protocol?= =?UTF-8?q?=20interoperability=20injection=20points?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Make the family's interop story explicit rather than implied. New §3.3 catalogues three injection points where external protocols plug into the spec family: 1. 
Pipeline plugins as the dispatch-layer adapter — LLM APIs (OpenAI Chat Completions and compatible), deterministic template matchers (hassil), external intent classifiers, agent-tool protocols (MCP). 2. Transformer chains as the artifact-pipeline adapter — bidirectional translation, STT validators, content-policy filters, acoustic-event detectors. 3. Bus boundary as the wire-level adapter — Wyoming bridges, MQTT-based stacks, HiveMind-style layer-2 substrates. Per-protocol notes for Wyoming, OpenAI, MCP, hassil, MQTT, A2A — naming where each plugs in. The single-flip routing and no-central-state stance (§3.1) are what make the bus-boundary adapter feasible without modifying the assistant core. Concrete suggestion: a translation tool between OVOS-INTENT-2 locale resources and HA's hassil/intents YAML would let the two corpora cross-pollinate mechanically. Added to §7 known gaps as planned tooling. The three injection points are intentionally not exhaustive — they're the points the spec family deliberately keeps clean. A protocol needing deeper integration is a signal of architectural overlap rather than complementarity. Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 102 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 102 insertions(+) diff --git a/APPENDIX.md b/APPENDIX.md index 487c3af..509bf03 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -562,6 +562,101 @@ plugin, a persona plugin: each defines itself. PIPELINE-1 only defines the contract every plugin conforms to and the universal utterance lifecycle around the iteration. +### 3.3 Interoperability with external protocols + +The spec family does not define new transport protocols and +does not aim to replace existing ones. 
Where an external
+voice-assistant protocol — Wyoming, OpenAI Chat Completions,
+MCP tool calls, hassil templates, MQTT-based stacks — already
+exists and serves a population, the spec family is designed to
+**interoperate** with it through three well-defined injection
+points. An adapter that plugs an external protocol into the
+right injection point is a third-party implementation concern;
+the spec family makes the integration shape predictable.
+
+**1. Pipeline plugins (OVOS-PIPELINE-1 §3) — the dispatch-layer
+adapter.** A pipeline plugin wraps an external matcher,
+consumes the utterance, and returns a `Match` with the
+plugin's own `pipeline_id` as `owner_id`. The external
+protocol becomes a first-class participant in the dispatch
+surface, indistinguishable from a skill from the bus's
+perspective. This is how language-model APIs, deterministic
+template matchers, and external intent classifiers attach.
+
+**2. Transformer chains (OVOS-TRANSFORM-1 §3) — the
+artifact-pipeline adapter.** A transformer wraps an external
+protocol that operates on an audio, text, or rendered-output
+artifact but does not claim intents. Examples: a
+bidirectional-translation service at the utterance and dialog
+chains; an external STT-confidence validator at the utterance
+chain; a content-policy filter at the dialog or TTS chain; an
+acoustic-event detector at the audio chain.
+
+**3. Bus boundary (OVOS-MSG-1 §3.4) — the wire-level
+adapter.** A bridge component subscribes to the bus, translates
+to and from an external transport, and either operates entirely
+external (Wyoming-style audio / STT / TTS services talking
+over TCP to a bridge that proxies the OVOS bus) or remotes the
+whole bus (HiveMind-style layer-2 substrates). The
+single-flip routing of §3.1.1 and the no-central-state stance
+of §3.1.2 are what make the bus-boundary adapter feasible
+without modifying the assistant core.
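Injection point 1 above can be sketched as a small adapter class. This is an illustration under stated assumptions, not the PIPELINE-1 contract or any OVOS API: the `Match` dataclass here is a stand-in whose fields other than `owner_id` are invented, and `ExternalMatcherPlugin`, `toy_matcher`, and the `example-external-matcher` identifier are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, Tuple


@dataclass
class Match:
    """Stand-in for the Match shape; only owner_id is named in the text above,
    the other fields are assumptions for this sketch."""
    owner_id: str
    intent_name: str
    slots: dict = field(default_factory=dict)


# An external matcher returns (intent_name, slots) or None when it declines.
ExternalMatcher = Callable[[str, str], Optional[Tuple[str, dict]]]


class ExternalMatcherPlugin:
    """Hypothetical adapter: wraps an external matcher as a pipeline plugin."""

    def __init__(self, pipeline_id: str, external_match: ExternalMatcher):
        self.pipeline_id = pipeline_id
        self._external_match = external_match

    def match(self, utterance: str, lang: str) -> Optional[Match]:
        result = self._external_match(utterance, lang)
        if result is None:
            return None  # declined: the orchestrator moves on to the next plugin
        intent_name, slots = result
        # The plugin claims the match under its own pipeline_id as owner_id.
        return Match(owner_id=self.pipeline_id, intent_name=intent_name, slots=slots)


def toy_matcher(utterance: str, lang: str) -> Optional[Tuple[str, dict]]:
    """Placeholder for a real external engine (LLM client, template matcher, ...)."""
    if "lights" in utterance.lower():
        return "lights.on", {}
    return None


plugin = ExternalMatcherPlugin("example-external-matcher", toy_matcher)
claimed = plugin.match("turn on the lights", "en-US")
declined = plugin.match("what time is it", "en-US")
```

The point of the shape is that the external engine never needs to know about the bus: the adapter alone decides identity (`owner_id`) and translates a decline into `None`.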
+
+#### Per-protocol notes
+
+- **Wyoming** (the component protocol used by Home Assistant
+  Voice and its ecosystem) operates at the audio-input / STT /
+  intent / TTS service boundary. A Wyoming bridge sits at the
+  bus boundary (§3.1, injection point 3 above): translate
+  Wyoming's `transcript` event into an `ovos.utterance.handle`
+  emission and translate the assistant's `speak` Messages
+  into Wyoming's `synthesize` event. Pipeline plugins are
+  unaffected; Wyoming components plug in *under* the
+  utterance lifecycle, not into it.
+- **OpenAI Chat Completions and compatible APIs** (the
+  de-facto LLM interface). A persona-style pipeline plugin
+  wraps an OpenAI-compatible client (§3 of PIPELINE-1,
+  injection point 1 above). The plugin emits `Match` with
+  `owner_id = ` and bundles its own handler
+  using the dispatch polymorphism of OVOS-PIPELINE-1 §7. The
+  user sees a normal response; the LLM is a first-class
+  intent owner.
+- **MCP (Model Context Protocol) and similar agent-tool
+  protocols.** A pipeline plugin can expose OVOS intents to
+  an MCP client (the OVOS-INTENT-4 §10 introspection topics
+  enumerate available intents) or call out to MCP tools from
+  within a plugin-bundled handler. Either direction sits at
+  injection point 1.
+- **hassil templates and the Home Assistant `intents`
+  corpus.** A pipeline plugin can wrap hassil as a
+  deterministic template matcher (injection point 1).
+  Separately, the OVOS-INTENT-1 / hassil grammar lineage is
+  close enough that a **translation tool** between
+  OVOS-INTENT-2 locale resources and HA's `intents` YAML is
+  mostly mechanical — both formats are template-and-vocabulary
+  YAML at the same level of abstraction. Such a tool would
+  let the HA `intents` corpus and the OVOS locale corpus
+  cross-pollinate without either project changing its
+  format. This is concrete planned tooling, not just an
+  architectural possibility (§7).
+- **MQTT-based stacks** (Rhasspy 2.x, miscellaneous IoT
+  voice systems).
Bridge at the bus boundary (injection + point 3), same shape as Wyoming. +- **A2A and other agent-bus protocols.** Same shape as MCP; + pipeline-plugin wrapper or bus-boundary bridge depending + on whether the protocol participates in intent dispatch + or in cross-process bus routing. + +The three injection points are not exhaustive of where +adapters *could* go — a determined integrator can hook +almost anywhere — but they are the points the spec family +deliberately designs to keep clean. Any new protocol that +needs deeper integration than the three points permit is a +signal that the protocol genuinely overlaps the assistant's +own architecture rather than complementing it, at which +point the integration is a co-architecture decision rather +than an adapter. + --- ## 4. Design rationale, per specification @@ -1377,6 +1472,13 @@ uniformity is in the namespace, not in a fixed depth. - **Conversation-level evaluation infrastructure.** Rasa has story-based testing and end-to-end success metrics; the OVOS specs do not currently have a counterpart. +- **OVOS-INTENT-2 ↔ hassil `intents` translation tool.** + The grammar lineage (§2.1) makes a mechanical translator + between OVOS-INTENT-2 locale resources and HA's `intents` + YAML feasible. Such a tool would let the two corpora + cross-pollinate without either format changing. Sits at + injection point 3 of §3.3 conceptually but is + build-time rather than runtime tooling. 
- **i18n corpus.** OVOS-INTENT-2 defines the locale file format, and `ovos-localize` (§1.4) provides the operations layer; what remains is the *scale* of the From e7b62b37e03b8acd60d0d476df798574262e7459 Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Tue, 26 May 2026 00:23:25 +0100 Subject: [PATCH 20/27] APPENDIX: add CONVERSE-1 to orchestrator-stack narrative; close multi-turn gap MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit OVOS-CONVERSE-1 (PR #25) fills the multi-turn conversation gap that §7 previously listed as planned work. Update §1.2 stack description to include it, and drop the §7 gap entry. Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 25 +++++++++++-------------- 1 file changed, 11 insertions(+), 14 deletions(-) diff --git a/APPENDIX.md b/APPENDIX.md index 509bf03..7dc4aef 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -58,12 +58,17 @@ The specifications are built bottom-up in three stacks: - **The orchestrator stack**: OVOS-PIPELINE-1 defines the orchestrator, the pipeline-plugin abstraction, the utterance lifecycle, and the handler-lifecycle trio. OVOS-CONTEXT-1 - defines per-session intent-context state. OVOS-TRANSFORM-1 - defines the six injection-point transformer chains. The - orchestrator stack sits on top of the bus stack (uses MSG-1's - envelope and routing, SESSION-1's session carrier) and around - the intent stack (intent registrations are one kind of input - pipeline plugins consume). + defines per-session intent-context state (the **declarative** + continuous-dialog primitive). OVOS-CONVERSE-1 defines the + active-handler recency stack, the converse plugin role, and + the interactive response-collection mechanism (the + **imperative** continuous-dialog primitive, complementary to + CONTEXT-1 — its §7 fixes the evaluation order between the two + surfaces). OVOS-TRANSFORM-1 defines the six injection-point + transformer chains. 
The orchestrator stack sits on top of the + bus stack (uses MSG-1's envelope and routing, SESSION-1's + session carrier) and around the intent stack (intent + registrations are one kind of input pipeline plugins consume). ### 1.3 Compatibility levels @@ -1449,14 +1454,6 @@ uniformity is in the namespace, not in a fixed depth. `time_format`, `date_format`, `system_unit`, `tts_preferences`, `location`, …) are deferred to a future specification. -- **A multi-turn conversation specification.** When a skill - asks the user a question and waits for the next utterance, - the "next utterance belongs to that pending question" link - is not formalized today (handled informally by the - `converse` plugin type plus skill-side state). MSG-1's - async-by-default stance (§3.1.2) leaves room for this to - be formalized either in the session spec or as a separate - one. - **Text normalization of ASR output.** The basis for slot value typing (INTENT-1 §5.3). Deferred to its own specification. From 671aa0480af432b65a3e0b696af83978fb8c4681 Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Tue, 26 May 2026 01:28:50 +0100 Subject: [PATCH 21/27] =?UTF-8?q?APPENDIX=20=C2=A75.3,=20=C2=A75.4:=20upda?= =?UTF-8?q?te=20for=20PIPELINE-1=20=C2=A74.2=20relaxation=20+=20=C2=A77.0?= =?UTF-8?q?=20polymorphism=20collapse?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two divergence-catalogue entries updated to reflect the PIPELINE-1 restructure: - The §5.4 'side-effect-free during match' entry is rewritten as 'match contract is the single obligation' — match's only MUST is returning Match-or-null; bus emissions during match are allowed; session mutation during match is via Match.updated_session (explicit channel). - New §5.4 entry: 'Match.updated_session as the match-phase session channel' — promotes the existing ovos-core code pattern `sess = match.updated_session or SessionManager.get(message)` to a normative Match field. 
Claiming plugin's mutations land; declined plugin's mutations drop at the boundary. - The §5.3 'Dispatch payload uses polymorphic owner_id' entry is rewritten as 'unified owner_id' — reflects PIPELINE-1 §7.0's collapse to two handler-owner shapes (plain skill, pipeline plugin with bundled handlers where pipeline_id == skill_id) plus the pure-matcher recognition. Notes the conceptual mapping skill_id ≈ voice_app_id, pipeline_id ≈ matching-engine id. Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 61 +++++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 45 insertions(+), 16 deletions(-) diff --git a/APPENDIX.md b/APPENDIX.md index 7dc4aef..52ff51b 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -1123,10 +1123,24 @@ and needs no implementation change: prescribed shape uses the structured `(skill_id, intent_name, lang)` triple plus `samples|file` and `blacklist|blacklist_file`. -- **Dispatch payload uses polymorphic `owner_id`** - (PIPELINE-1 §7.1). Today dispatch carries `skill_id` only. - PIPELINE-1's `owner_id` is either a `skill_id` or a - `pipeline_id` — same field, polymorphic value. +- **Dispatch payload uses unified `owner_id`** (PIPELINE-1 + §7.0, §7.1). Today dispatch carries `skill_id` only. + PIPELINE-1 §7.0 collapses handler-owner shapes to two: + plain skill (handler reached via its `skill_id`) and + pipeline plugin with bundled handlers (the plugin's + `pipeline_id == skill_id` — one identifier filling both + roles). Conceptually `skill_id` is the voice-app identity + (every handler-owner has one); `pipeline_id` is the + matching-engine identity (only loaded plugins have one). + Plugins-with-handlers MUST NOT register their intents + under INTENT-4 — they own the handler directly — and + SHOULD publish their intent_names via the per-pipeline + passive index (§7.0, §10) for observability. 
A + pure-matcher plugin (Padatious, Adapt, the converse + plugin) has only a `pipeline_id` and produces matches + whose `owner_id` is some other component's identity. The + dispatch payload uniformly carries `owner_id` regardless + of shape. - **Handler-lifecycle payload includes `owner_id`** (PIPELINE-1 §8.2). Today the trio payload is `{name: }`. Prescribed: `{owner_id, @@ -1142,18 +1156,33 @@ and needs no implementation change: `ovos.intent.list` / `ovos.intent.describe` from the passive view. This is a new orchestrator responsibility, not a change to existing behaviour. -- **Plugins are side-effect-free during `match`** (PIPELINE-1 - §4.2). This is a forward-looking rule rather than a fix - for current code. The standard - `match_high` / `match_medium` / `match_low` methods in the - official plugins are already side-effect-free (they - compute and return). Where side effects do happen today, - they are orchestrator-side after the match wins (e.g. the - `.activate` emit in - `ovos-core/intent_services/service.py:365`), or in - *other* bus handlers a plugin subscribes to. The spec - rule keeps the current discipline normative as alternative - plugin types (LLM-backed, agent-backed) are written. +- **The match contract is the single obligation** (PIPELINE-1 + §4.2). The plugin's `match` operation has one MUST: return + a `Match` (§4.1) or `null`. Bus emissions during `match` + are allowed — a plugin that polls other components, calls + out to a model server, or runs any matching strategy that + requires bus communication is conformant. This matches the + actual OVOS converse-plugin pattern (it polls active skills + during its match decision) and accommodates LLM-backed and + agent-backed plugin shapes that are inherently bus-active. + Session mutation during `match` is via the explicit + `Match.updated_session` channel — see the next entry — so + declined plugins' exploratory mutations never reach the + next iteration step. 
+- **`Match.updated_session` as the match-phase session channel** + (PIPELINE-1 §4.1, §4.2). Promotes the existing ovos-core + code pattern + `sess = match.updated_session or SessionManager.get(message)` + to a normative Match field. The plugin that produces a + claiming match composes any session mutations it needs + (decrementing a response-mode counter, pre-promoting an + active-handler to the head, setting intent_context + alongside the match) into a fresh snapshot returned in + `Match.updated_session`. The orchestrator uses that + snapshot for the dispatch and every downstream stage; a + declined-match (plugin returns `null`) drops the snapshot + at the plugin boundary. This is what makes match-phase + mutation safe under §6.2 first-match-wins iteration. - **`ovos.utterance.handled` on every terminal path** (PIPELINE-1 §9.5). Current `ovos-workshop`'s `_on_event_error` does not emit it on the handler-error From 7a37dca644e9ac3c624428ec3204947bb1cf0253 Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Tue, 26 May 2026 02:26:41 +0100 Subject: [PATCH 22/27] =?UTF-8?q?APPENDIX=20=C2=A71.2,=20=C2=A77:=20SESSIO?= =?UTF-8?q?N-2=20fills=20the=20lifecycle=20gap?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit OVOS-SESSION-2 (in flight at PR #27) defines session lifecycle and state ownership. Update: - §1.2 orchestrator-stack narrative adds SESSION-2 to the stack description with one-line summary of its scope (stateless orchestrator for named sessions, orchestrator-owned default session, projection mandate). - §7 gap entry rewritten: SESSION-2 lands the lifecycle piece; what remains deferred is the set of session preference fields that need to be claimed under SESSION-1 §2.1 by their owning specs (preferences / OCP / persona / locale). 
Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 28 ++++++++++++++++++---------- 1 file changed, 18 insertions(+), 10 deletions(-) diff --git a/APPENDIX.md b/APPENDIX.md index 52ff51b..85a5575 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -65,10 +65,15 @@ The specifications are built bottom-up in three stacks: **imperative** continuous-dialog primitive, complementary to CONTEXT-1 — its §7 fixes the evaluation order between the two surfaces). OVOS-TRANSFORM-1 defines the six injection-point - transformer chains. The orchestrator stack sits on top of the - bus stack (uses MSG-1's envelope and routing, SESSION-1's - session carrier) and around the intent stack (intent - registrations are one kind of input pipeline plugins consume). + transformer chains. OVOS-SESSION-2 defines the session + lifecycle and state-ownership model (stateless orchestrator + for named sessions, orchestrator-owned default session, + projection mandate forcing all cross-utterance state into + session-resident fields). The orchestrator stack sits on top + of the bus stack (uses MSG-1's envelope and routing, + SESSION-1's session carrier with SESSION-2's lifecycle) and + around the intent stack (intent registrations are one kind + of input pipeline plugins consume). ### 1.3 Compatibility levels @@ -1476,13 +1481,16 @@ uniformity is in the namespace, not in a fixed depth. `common_query`, `ocp`, `persona`, `stop`. Each defines its own internal behaviour and its own bus emissions beyond the universal lifecycle PIPELINE-1 prescribes. 
-- **A full session-lifecycle specification.** SESSION-1 - defines the wire shape; lifecycle (start, end, expiry, - resumption) and the full set of session preferences - current OVOS already carries (`persona_id`, +- **Session preference fields not yet claimed.** SESSION-1 + defines the wire shape and OVOS-SESSION-2 (in flight at + PR #27) defines the lifecycle and state-ownership model; + what remains deferred is the full set of session + preferences current OVOS already carries (`persona_id`, `time_format`, `date_format`, `system_unit`, - `tts_preferences`, `location`, …) are deferred to a - future specification. + `tts_preferences`, `location`, …) — these need to be + claimed under SESSION-1 §2.1's field registry by their + respective owning specs (a future preferences spec, + OCP / persona / locale specs as appropriate). - **Text normalization of ASR output.** The basis for slot value typing (INTENT-1 §5.3). Deferred to its own specification. From a260b83bc289a83fe3554e1b8a575fcc709b41d2 Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Tue, 26 May 2026 02:41:34 +0100 Subject: [PATCH 23/27] =?UTF-8?q?APPENDIX=20=C2=A71.2:=20SESSION-2=20narra?= =?UTF-8?q?tive=20=E2=80=94=20SHOULD-project=20+=20MAY-internal=20(not=20'?= =?UTF-8?q?mandate')?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sync with SESSION-2 §2.4 relaxation (commit 6a882c8). The projection pathway is SHOULD-when-practical; plugins MAY hold internal state with full lifecycle ownership and best-effort resumption. Co-Authored-By: Claude Opus 4.7 (1M context) --- APPENDIX.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/APPENDIX.md b/APPENDIX.md index 85a5575..1391493 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -68,8 +68,9 @@ The specifications are built bottom-up in three stacks: transformer chains. 
OVOS-SESSION-2 defines the session lifecycle and state-ownership model (stateless orchestrator for named sessions, orchestrator-owned default session, - projection mandate forcing all cross-utterance state into - session-resident fields). The orchestrator stack sits on top + SHOULD-project pathway for cross-utterance state with + MAY-internal as the alternative for state too large or + externally coupled to project). The orchestrator stack sits on top of the bus stack (uses MSG-1's envelope and routing, SESSION-1's session carrier with SESSION-2's lifecycle) and around the intent stack (intent registrations are one kind From 5490f6159204fa7bffbdace97f46431646f48195 Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Tue, 26 May 2026 16:42:11 +0100 Subject: [PATCH 24/27] =?UTF-8?q?APPENDIX=20=C2=A75.2.1:=20document=20ovos?= =?UTF-8?q?.session.sync=20/=20update=5Fdefault=20for=20removal?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit These ovos-core topics are not defined by any spec. SESSION-2 §6.4 explicitly avoids naming them. They should be retired in favour of clients reading session state from normal Message flow (ovos.utterance.handled or any other session-carrying Message). Co-Authored-By: Claude Sonnet 4.6 --- APPENDIX.md | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/APPENDIX.md b/APPENDIX.md index 1391493..04896a2 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -1113,6 +1113,25 @@ and needs no implementation change: | PIPELINE-1 | `mycroft.skill.handler.start` / `.complete` / `.error` | `ovos.intent.handler.start` / `.complete` / `.error` | Renamed into the `ovos.intent.*` namespace for uniformity. Breaks every existing handler-lifecycle observer; the migration cost is real. | | PIPELINE-1 | `recognizer_loop:utterance` | `ovos.utterance.handle` | See §5.4 entry. Migration touches `ovos-dinkum-listener`, `ovos-simple-listener`, `ovos-audio`, and `ovos-core/intent_services/service.py`. 
| +### 5.2.1 Topics to remove from ovos-core + +The following topics exist in current ovos-core but are **not +defined by any spec** and should be removed or replaced: + +- **`ovos.session.sync` / `ovos.session.update_default`** — + emitted by `SessionManager` to broadcast the current default + session to interested components. SESSION-2 §6.4 acknowledges + that an orchestrator MAY emit default-session state on a + deployer-defined topic but assigns no normative name. These + ad-hoc topics should be retired: any component that needs the + default-session state can subscribe to `ovos.utterance.handled` + (PIPELINE-1 §9.5) and read the session it carries, or listen + to any other assistant-emitted Message on the default session. + A named sync topic adds an implicit state-broadcast contract + that the specs deliberately avoid; clients are expected to + track session from Message flow, not from dedicated sync + broadcasts. + ### 5.3 Prescriptive shape changes - **Keyword intent registration is atomic** (INTENT-4 §5). From 92bcbf24e55233c1d478bf47df4e4638bfb28209 Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Tue, 26 May 2026 20:26:52 +0100 Subject: [PATCH 25/27] =?UTF-8?q?README=20+=20APPENDIX=20=C2=A71.0:=20esta?= =?UTF-8?q?blish=20voice=20OS=20framing?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit README intro replaced: "voice assistant ecosystem" → "voice operating system" with an OS-analogy table (scheduler, IPC, shared memory, process supervision, loadable modules, syscall ABI). APPENDIX §1.0 (new): The voice operating system concept — two conflations addressed: (1) voice assistant product (closed, vertically integrated vs open platform); (2) LLM wrapper (LLMs fit as pipeline plugins, utterance/dialog/metadata transformers — one possible multi-role deployment, not the architecture itself). 
Co-Authored-By: Claude Sonnet 4.6 --- APPENDIX.md | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ README.md | 38 ++++++++++++++++++++++++++++++++++---- 2 files changed, 82 insertions(+), 4 deletions(-) diff --git a/APPENDIX.md b/APPENDIX.md index 04896a2..e6a5d9b 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -19,6 +19,54 @@ code moves and specifications must not. ## 1. About the OVOS specifications +### 1.0 The voice operating system concept + +The term *voice operating system* is precise, not marketing. The +distinction matters because OVOS is routinely conflated with two +things it is not: + +**It is not a voice assistant product.** A voice assistant is a +closed, vertically-integrated product — a single vendor controls +the NLU, the dialogue policy, the skill ecosystem, and the output +layer. It answers questions. A voice operating system is a +*platform*: it defines contracts that arbitrary third-party +components implement independently, and the platform's job is to +arbitrate between them. The analogy to a general-purpose OS is +direct. The pipeline is a scheduler: it has a priority order, a +first-match-wins dispatch policy, and a circuit-breaker for failing +components. The bus is IPC: broadcast delivery, no central +authority, no guaranteed ordering beyond the single-flip routing +model. The session carrier is shared memory: it propagates opaquely +through every message and every component reads and writes its +owned slice. The handler-lifecycle trio is process supervision: the +orchestrator wraps every handler invocation with start/complete/error +events regardless of what the handler does. Pipeline plugins and +transformer plugins are loadable modules: swapped, replaced, and +composed at deployment time with no changes to the ABI. + +**It is not an LLM wrapper.** A language model fits the voice OS +model as a first-class plugin — and in multiple roles. 
As a +*pipeline plugin*, it implements `match(utterances, lang, session) +→ Match`, returning a match immediately and deferring generation to +its handler (PIPELINE-1 §4.4). As an *utterance transformer*, it +paraphrases, normalizes, or expands the inbound candidate list +before matching (TRANSFORM-1 §3.2). As a *dialog transformer*, it +rewrites the handler's natural-language response before delivery +(TRANSFORM-1 §3.5). As a *metadata transformer*, it enriches the +utterance with detected intent signals before the pipeline sees it +(TRANSFORM-1 §3.3). In each role, the model is one implementation +of a defined plugin contract — swappable, composable, and neutral +to the platform. Whether any LLM is loaded at all, and in which +roles and at what priority, is a deployment decision. An +architecture organized around a single model call is not a voice OS; +it is one possible single-plugin deployment of one. + +The consequence of the OS framing: a skill written against the +intent stack runs on any conformant orchestrator, under any pipeline +configuration, with any combination of NLU backends, in any language +the deployment supports. The platform's only invariant is the ABI — +the wire contracts these specifications define. + ### 1.1 Formalization of an existing system The OVOS stack — the engines (padatious, Adapt), the skill diff --git a/README.md b/README.md index 74749e3..7e31fc8 100644 --- a/README.md +++ b/README.md @@ -1,12 +1,42 @@ # OVOS Formal Specifications -Formal, implementation-agnostic specifications for the OpenVoiceOS -voice assistant ecosystem. +Formal, implementation-agnostic specifications for a **voice +operating system** — a platform that provides a stable application +binary interface for voice-interactive applications. -This repository is the **source of truth** for how OVOS components +This repository is the **source of truth** for how components talk to each other and what their data shapes mean. 
The specs are written generically so they can be implemented by any tool, in any -language, and adopted by voice assistants beyond OVOS. +language, and adopted beyond their origin project. + +### What a voice operating system is + +A voice OS is not a voice assistant. A voice assistant is a product +that answers questions. A voice OS is a **platform**: it defines the +boundary between user input and computation, arbitrates which +application handles each utterance, manages conversation state across +interactions, and provides a stable ABI that arbitrary third-party +applications run against without knowing anything about each other. + +The analogy to a general-purpose OS is direct: + +| OS concept | Voice OS equivalent | +|---|---| +| Process scheduler | Pipeline plugin ordering (PIPELINE-1 §5–6) | +| IPC / message passing | The bus and MSG-1 envelope | +| Shared memory | Session carrier (SESSION-1, SESSION-2) | +| Process supervision | Handler-lifecycle trio (PIPELINE-1 §8) | +| Loadable kernel modules | Pipeline plugins, transformer plugins | +| System call ABI | The `match(utterances, lang, session) → Match` contract | + +The consequence is that OVOS is not a chatbot, not an LLM wrapper, +and not a monolithic product. It is a **runtime**: swap the +scheduler (pipeline ordering), the NLU engines (pipeline plugins), +the dialogue policy (converse / context), the output layer (TTS, +display), or any combination — the ABI stays stable and the rest +keeps working. A skill written against the intent stack runs on any +conformant orchestrator, under any pipeline configuration, in any +language OVOS supports. > ⚠️ **Draft.** Specs in this repository are at **Draft** status. 
> Implementations are being brought into conformance progressively; From 1d6cc168adfeb7ad06274cc4b910ef6ee35c2dfd Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Tue, 26 May 2026 20:28:55 +0100 Subject: [PATCH 26/27] revert: move README voice-OS framing to its own PR (#28) Co-Authored-By: Claude Sonnet 4.6 --- README.md | 38 ++++---------------------------------- 1 file changed, 4 insertions(+), 34 deletions(-) diff --git a/README.md b/README.md index 7e31fc8..74749e3 100644 --- a/README.md +++ b/README.md @@ -1,42 +1,12 @@ # OVOS Formal Specifications -Formal, implementation-agnostic specifications for a **voice -operating system** — a platform that provides a stable application -binary interface for voice-interactive applications. +Formal, implementation-agnostic specifications for the OpenVoiceOS +voice assistant ecosystem. -This repository is the **source of truth** for how components +This repository is the **source of truth** for how OVOS components talk to each other and what their data shapes mean. The specs are written generically so they can be implemented by any tool, in any -language, and adopted beyond their origin project. - -### What a voice operating system is - -A voice OS is not a voice assistant. A voice assistant is a product -that answers questions. A voice OS is a **platform**: it defines the -boundary between user input and computation, arbitrates which -application handles each utterance, manages conversation state across -interactions, and provides a stable ABI that arbitrary third-party -applications run against without knowing anything about each other. 
- -The analogy to a general-purpose OS is direct: - -| OS concept | Voice OS equivalent | -|---|---| -| Process scheduler | Pipeline plugin ordering (PIPELINE-1 §5–6) | -| IPC / message passing | The bus and MSG-1 envelope | -| Shared memory | Session carrier (SESSION-1, SESSION-2) | -| Process supervision | Handler-lifecycle trio (PIPELINE-1 §8) | -| Loadable kernel modules | Pipeline plugins, transformer plugins | -| System call ABI | The `match(utterances, lang, session) → Match` contract | - -The consequence is that OVOS is not a chatbot, not an LLM wrapper, -and not a monolithic product. It is a **runtime**: swap the -scheduler (pipeline ordering), the NLU engines (pipeline plugins), -the dialogue policy (converse / context), the output layer (TTS, -display), or any combination — the ABI stays stable and the rest -keeps working. A skill written against the intent stack runs on any -conformant orchestrator, under any pipeline configuration, in any -language OVOS supports. +language, and adopted by voice assistants beyond OVOS. > ⚠️ **Draft.** Specs in this repository are at **Draft** status. 
> Implementations are being brought into conformance progressively; From 1911c01d5315cc1911537509a1e8d9e573934c92 Mon Sep 17 00:00:00 2001 From: JarbasAi Date: Tue, 26 May 2026 20:48:21 +0100 Subject: [PATCH 27/27] APPENDIX: fix stale PIPELINE-1 refs; slim redundant prose MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - owner_id → skill_id throughout (§3.2, §3.3, §4.5, §4.6) - match(utterance,…) → match(utterances,…) (§4.5) - Match.captures → Match.slots (§4.7) - complete_intent_failure → ovos.intent.unmatched in §5.1/§5.3/§5.7; add rename row to §5.2 table - Dispatch payload block in §5.3 rewritten: {lang, utterance, slots}, handler-lifecycle uses {skill_id, intent_name, optional exception} - §5.5: add ovos.intent.unmatched and ovos.utterance.speak entries - §2.5 hassil: drop standalone subsection; fix §2 intro cross-ref - §1.3 compat levels: condense to bullets - §1.4: drop ovos-localize "honest notes" paragraph - §3.1.3: trim to essential bus-substrate mechanics - §4.7: trim per-type-explosion and per-type-self-id bullets - §5.4: trim rename and match-contract entries Co-Authored-By: Claude Sonnet 4.6 --- APPENDIX.md | 308 ++++++++++++++++++---------------------------------- 1 file changed, 104 insertions(+), 204 deletions(-) diff --git a/APPENDIX.md b/APPENDIX.md index e6a5d9b..e608da9 100644 --- a/APPENDIX.md +++ b/APPENDIX.md @@ -127,42 +127,19 @@ The specifications are built bottom-up in three stacks: ### 1.3 Compatibility levels Each specification carries its own integer `Version`, bumped per -PR per the contributing rules in the README. The architecture as -a whole is spoken of at **compatibility levels** — versioned -snapshots a tool may target, checked against by -`ovos-spec-lint`. - -The compatibility-level model works cleanly for the **intent -stack**, where a single integer identifies a coherent grammar / -resources / intent-definition snapshot. 
The bus and orchestrator -stacks do not yet map onto the same single-axis ladder; a -specification-set-wide version tuple covering every spec is a -planned follow-up. - -The intent-stack ladder: - -- **V0** — *informal.* The undocumented, de-facto behaviour from - before these specifications existed. V0 is not specified - anywhere; it is the baseline the formalization started from. - V0 has no notion of the `.blacklist` resource role or of - `` references. -- **V1** — the intent stack as first formalized: - OVOS-INTENT-1, -2 and -3, each at version 1. V1's headline - addition over V0 is the `.blacklist` role. -- **V2** — V1 plus **inline vocabulary references** (the - `` token): OVOS-INTENT-1 and OVOS-INTENT-2 at version 2. - A V2 template cannot be expanded by a V1 tool. - -The bus stack (OVOS-MSG-1), the registration spec -(OVOS-INTENT-4), the session spec (OVOS-SESSION-1), the -orchestrator spec (OVOS-PIPELINE-1), the context spec -(OVOS-CONTEXT-1), and the transformer spec (OVOS-TRANSFORM-1) -are versioned **individually** and not placed on a unified -compatibility ladder. A tool targeting them today cites per-spec -versions: "MSG-1 v2, INTENT-4 v1, PIPELINE-1 v2." Whether the -compat-level model evolves into a multi-axis grid, per-stack -ladders, or is quietly deprecated in favour of per-spec -versions only, is deferred. +PR per the contributing rules in the README. + +For the **intent stack**, a single integer identifies a coherent +grammar / resources / intent-definition snapshot checked by +`ovos-spec-lint`. The ladder: + +- **V0** — undocumented pre-spec baseline; no `.blacklist`, no `` references. +- **V1** — INTENT-1, -2, -3 at v1; headline addition is the `.blacklist` role. +- **V2** — V1 plus inline vocabulary references (``); a V2 template cannot be expanded by a V1 tool. + +The bus and orchestrator stacks are versioned **individually** +and not placed on a unified ladder — a tool targeting them cites +per-spec versions ("MSG-1 v2, PIPELINE-1 v2"). 
### 1.4 Reference implementations and ecosystem tooling @@ -193,14 +170,6 @@ preview, and submit translations as pull requests. It is the OVOS counterpart to Home Assistant's managed `intents` repository. -Two honest notes on `ovos-localize`: it is currently -**descriptive** of real OVOS skills — it also handles legacy -file types these specs deliberately drop — so as the specs and -the ecosystem converge, its file-type coverage and the specs -will need to meet in the middle; and its translation validators -are a natural home for spec conformance checks, distinct from -but related to the planned grammar-level conformance corpus -(§7). --- @@ -208,9 +177,9 @@ but related to the planned grammar-level conformance corpus The OVOS specifications occupy territory adjacent to several existing voice-assistant systems. This section locates the -design choices against each comparator. The summary in §2.7 -records where OVOS leads architecturally, where it follows, and -where it makes a deliberately different choice. +design choices against each comparator. The summary in §2.5 +records where the voice OS leads architecturally, where it +follows, and where it makes a deliberately different choice. ### 2.1 Home Assistant and Rhasspy — shared grammar lineage @@ -337,23 +306,12 @@ Neither ASK nor Dialogflow has a `session.pipeline`-equivalent anything like the layer-2 substrate of OVOS-MSG-1 §3.4. ASK has built-in intents (`AMAZON.HelpIntent`) but they are handled inside the skill; Dialogflow has fallback intents but -they do not have first-class dispatch identity. OVOS-PIPELINE-1's -`:` lets a non-skill component -advertise its own intent identity *to the user* on the bus, -indistinguishable from a skill — original to OVOS. - -### 2.5 hassil — comparable only at the grammar layer - -The Home Assistant template-matcher, comparable only to -OVOS-INTENT-1 / -2 / -3 (grammar + locale resources + intent -concept). 
hassil has no equivalent of OVOS-MSG-1 (no bus -envelope), OVOS-PIPELINE-1 (no pipeline notion — HA runs a -single matcher), OVOS-TRANSFORM-1 (no per-utterance -transformers), or OVOS-CONTEXT-1 (no decaying session state). -The grammar layer is broadly equivalent to OVOS-INTENT-1; -everything above the grammar is OVOS-only. +they do not have first-class dispatch identity. OVOS-PIPELINE-1's dispatch polymorphism +(`skill_id == pipeline_id` for plugin-bundled handlers) lets a +non-skill component advertise its own intent identity on the bus, +indistinguishable from a skill — original to this architecture. -### 2.6 Summary — where OVOS leads, follows, and differs +### 2.5 Summary — where the voice OS leads, follows, and differs **OVOS leads architecturally** in three places: @@ -504,45 +462,28 @@ of conversational state. The async-by-default model means those future specs only need to define *what* the state is, not *how* it travels. -#### 3.1.3 Why HiveMind works - -HiveMind is the canonical layer-2 system this design enables. -A HiveMind satellite is just another user-side emitter — it -sets `source` to its peer ID, populates `session` with a -per-peer session, and emits a Message. Inside OVOS: - -- ovos-core runs the same `.reply` flip (§3.1.1 step 2) — - `destination` becomes the satellite's peer ID instead of - the local microphone. -- Skills `.forward` as usual — `destination` stays the - satellite ID through every handler emission. -- HiveMind, watching the bus, sees each message addressed to - its peer and routes it back over the HiveMind transport. - -The pre-existing `session_id == "default"` rule keeps -device-local TTS on the device's speakers (per -`ovos-audio/utils.py`'s `require_default_session`), because -remote HiveMind sessions carry their own `session_id` and -never `"default"`. - -None of this required HiveMind to modify OVOS core. 
The -mechanism that makes it work — single-flip routing, opaque -per-session identifiers, no central state — was an OVOS -design, built into `ovos-bus-client/message.py:194-198` -before this spec family was written; OVOS-MSG-1 formalizes -the design rather than introducing it. - -A layer-2 substrate also has a uniform **authorization -surface** in the spec family without inventing a separate -channel: client sessions populate the preference fields of -OVOS-SESSION-1 (`pipeline`, the six `_transformers`) -to request behaviour, while the layer-2 substrate populates -the policy fields (`blacklisted_pipelines`, -`blacklisted_skills`, `blacklisted_intents`, the six -`blacklisted__transformers`) from the peer's grant. -OVOS-PIPELINE-1 §5.5 and OVOS-TRANSFORM-1 §5.3 compose them -deterministically (preference → availability → policy) at the -orchestrator without per-hop re-authorization. +#### 3.1.3 Layer-2 substrates + +The single-flip routing model and the no-central-state +design make layer-2 federation composable without modifying +the assistant core. A remote peer is just another user-side +emitter: it sets `source` to its peer ID, populates `session` +with its own named session, and emits a Message. The +orchestrator runs the same `.reply` flip; response messages +carry `destination == peer ID`; the bridge (watching the bus) +routes them back over the transport. The +`session_id == "default"` rule keeps device-local TTS on the +device's speakers because remote sessions carry their own +`session_id` and never `"default"`. + +Layer-2 bridges also inherit the session-field +**preference/policy split** without extra mechanism: client +sessions populate the preference fields +(`pipeline`, `_transformers`) to request behaviour; +the bridge populates the policy fields +(`blacklisted_pipelines`, `blacklisted__transformers`) +from the peer's grant. PIPELINE-1 §5.5 and TRANSFORM-1 §5.3 +compose them deterministically at the orchestrator. 
### 3.2 The pipeline-plugin model @@ -573,7 +514,7 @@ refinement**, not a wholesale new abstraction. It: - formalizes the plugin contract (the `match` shape, the `Match` result, the side-effect-free discipline); -- defines `:` **dispatch +- defines `:` **dispatch polymorphism** so a plugin can bundle its own handler (a language-model persona, a chatbot) as a first-class participant alongside skill-owned handlers; @@ -636,7 +577,7 @@ the spec family makes the integration shape predictable. **1. Pipeline plugins (OVOS-PIPELINE-1 §3) — the dispatch-layer adapter.** A pipeline plugin wraps an external matcher, consumes the utterance, and returns a `Match` with the -plugin's own `pipeline_id` as `owner_id`. The external +plugin's own `pipeline_id` as `skill_id`. The external protocol becomes a first-class participant in the dispatch surface, indistinguishable from a skill from the bus's perspective. This is how language-model APIs, deterministic @@ -676,7 +617,7 @@ without modifying the assistant core. de-facto LLM interface). A persona-style pipeline plugin wraps an OpenAI-compatible client (§3 of PIPELINE-1, injection point 1 above). The plugin emits `Match` with - `owner_id = ` and bundles its own handler + `skill_id = ` and bundles its own handler using the dispatch polymorphism of OVOS-PIPELINE-1 §7. The user sees a normal response; the LLM is a first-class intent owner. @@ -892,7 +833,7 @@ the normative sections. since the orchestrator's job *is* iterating plugins and translating their matches into bus events. Splitting them would leave neither coherent. -- **Plugin contract is minimal.** `match(utterance, lang, +- **Plugin contract is minimal.** `match(utterances, lang, session) → Match | None`. Side-effect-free during `match`; everything else (state, registrations, language-model calls, response generation) is @@ -911,11 +852,11 @@ the normative sections. `pipeline_id` in `Session.pipeline`. The current convention is compatible with PIPELINE-1 unchanged. 
- **Skills and plugins are equivalent handler owners.** - Dispatch topic `:` polymorphism - (owner is `skill_id` or `pipeline_id`) lets a plugin - bundle its own handler — for example, a language-model - persona plugin that has no skills behind it — and still - be addressed uniformly. + The dispatch topic `:` is uniform: + for a pure-matcher plugin the `skill_id` is the matched + skill's id; for a plugin that bundles its own handler + (e.g. a language-model persona) `skill_id == pipeline_id`. + Both are addressed the same way. - **Universal `ovos.utterance.handled` end-marker on every terminal path.** One reserved invariant lets observers count turns, route fallbacks, and know "the assistant @@ -940,7 +881,7 @@ the normative sections. `requires_context` and `excludes_context` declarations. - **Two explicit scopes encoded in the key shape.** `private` (orchestrator auto-prefixes with - `:`) and `shared` (flat, cross-skill). The + `:`) and `shared` (flat, cross-skill). The current OVOS code models the same distinction informally (`MycroftSkill.set_context` auto-prefixes with `alphanumeric_skill_id`; `set_cross_skill_context` fans @@ -999,7 +940,7 @@ the normative sections. normalization specification. TRANSFORM-1 §3.4 is the spec'd injection home for typing: a deployer ships date / number / duration parsing once, and every skill - receives typed values in `Match.captures` regardless of + receives typed values in `Match.slots` regardless of which engine matched. The OVOS analogue of ASK's `AMAZON.DATE` and Dialogflow's `@sys.date-time`, but as an injected enrichment rather than a built-in engine @@ -1045,7 +986,7 @@ the normative sections. concept the field encodes) and adding orchestrator-stamped `cancel_by: `. The spec's `ovos.utterance.cancelled` terminal event sits alongside - `complete_intent_failure`, keeping cancellation and + `ovos.intent.unmatched`, keeping cancellation and failure observably distinct on the bus. 
- **`lang` parameter is bidirectional** (TRANSFORM-1 §3.0). Four of the six per-type contracts (audio, utterance, @@ -1057,23 +998,14 @@ the normative sections. downstream stages. Language-detector and clearing cases fall out of the same channel. - **Per-type self-identification keys, list-valued.** - TRANSFORM-1 §1.3 claims six `Message.context` keys — one - per transformer type (`audio_transformer_ids`, …, - `tts_transformer_ids`) — rather than a single generic - key. Two reasons. First, role matters: a Message at the - dialog stage may have been touched by five transformer - types in sequence, and lumping them into one slot loses - the role partitioning that exists in every other - surface of the spec (per-type registries, per-type - `*_transformers` overrides, per-type introspection - topics). Second, multi-type plugins disambiguate: a - plugin shipping both an utterance and a dialog - transformer under the same `transformer_id` would, with - a single generic key, leave consumers unable to tell - which role emitted. The keys are *lists*, not single - strings, because transformers chain by design — the - list preserves the full per-type chain on the wire in - order of touch. + TRANSFORM-1 §1.3 claims six `Message.context` keys + (one per transformer type) rather than a single generic + key. Role matters: a Message may have been touched by + multiple types in sequence, and a multi-type plugin + (e.g., both utterance and dialog) would be ambiguous + in a single-key model. Keys are lists because + transformers chain — the full per-type chain is + preserved in order. - **Per-type denylists complete the policy surface.** TRANSFORM-1 §5.2 claims six `blacklisted__transformers` session fields, @@ -1082,22 +1014,14 @@ the normative sections. `pipeline` / `blacklisted_pipelines` pair of PIPELINE-1 §5. Three-stage composition (preference → availability → policy) in §5.3 mirrors PIPELINE-1 §5.5 exactly. 
-- **The per-type "explosion" is deliberate.** Counting - transformer-related session-field claims: six chain - orderings + six denylists = twelve fields, plus six - `Message.context` attribution keys. The alternative — a - `transformer_:` prefix-encoded single - namespace — would require prefix parsing at every - lookup. The per-type partition matches the partitioning - that already exists in the §1.1 registries, the §4 - chain ordering rules, and the §6 introspection topics. - Under the canonical SHOULD-omit rule of SESSION-1 §3.4, - the common case carries zero of these fields on the - wire. If the field count ever proves painful in - practice, the cleanest fallback is an object-valued form - (`session.transformers: {audio: [...], ...}`), - collapsing twelve flat fields into two structured ones - with the per-type partition preserved as object keys. +- **The per-type "explosion" is deliberate.** Twelve flat + session fields (six chain-orderings + six denylists) plus + six `Message.context` attribution keys. A prefix-encoded + single namespace would require prefix parsing at every + lookup; the per-type partition matches the existing + registry and chain-ordering structure. Under + SESSION-1 §3.4's SHOULD-omit rule the common case carries + zero of these on the wire. - **Language signals live in SESSION-1.** Language signals (`stt_lang`, `request_lang`, `detected_lang`, alongside `lang`, `secondary_langs`, `output_lang`) are @@ -1140,8 +1064,6 @@ and needs no implementation change: matches `ovos-bus-client.Message.{forward,reply,response}`. - The `.response` suffix convention — pervasive across OVOS topics today. -- The `complete_intent_failure` no-match topic (PIPELINE-1) — - matches current topic name verbatim. - `ovos.utterance.cancelled` and `ovos.utterance.handled` (PIPELINE-1) — match current topic names verbatim. 
- Per-utterance first-match-wins iteration (PIPELINE-1) — @@ -1160,6 +1082,7 @@ and needs no implementation change: | INTENT-3 v1.1 | "host" | "orchestrator" | Editorial; conformance unchanged. | | PIPELINE-1 | `mycroft.skill.handler.start` / `.complete` / `.error` | `ovos.intent.handler.start` / `.complete` / `.error` | Renamed into the `ovos.intent.*` namespace for uniformity. Breaks every existing handler-lifecycle observer; the migration cost is real. | | PIPELINE-1 | `recognizer_loop:utterance` | `ovos.utterance.handle` | See §5.4 entry. Migration touches `ovos-dinkum-listener`, `ovos-simple-listener`, `ovos-audio`, and `ovos-core/intent_services/service.py`. | +| PIPELINE-1 | `complete_intent_failure` | `ovos.intent.unmatched` | Follows `ovos.intent.*` namespace; pairs with `ovos.intent.matched`. | ### 5.2.1 Topics to remove from ovos-core @@ -1196,28 +1119,17 @@ defined by any spec** and should be removed or replaced: prescribed shape uses the structured `(skill_id, intent_name, lang)` triple plus `samples|file` and `blacklist|blacklist_file`. -- **Dispatch payload uses unified `owner_id`** (PIPELINE-1 - §7.0, §7.1). Today dispatch carries `skill_id` only. - PIPELINE-1 §7.0 collapses handler-owner shapes to two: - plain skill (handler reached via its `skill_id`) and - pipeline plugin with bundled handlers (the plugin's - `pipeline_id == skill_id` — one identifier filling both - roles). Conceptually `skill_id` is the voice-app identity - (every handler-owner has one); `pipeline_id` is the - matching-engine identity (only loaded plugins have one). - Plugins-with-handlers MUST NOT register their intents - under INTENT-4 — they own the handler directly — and - SHOULD publish their intent_names via the per-pipeline - passive index (§7.0, §10) for observability. A - pure-matcher plugin (Padatious, Adapt, the converse - plugin) has only a `pipeline_id` and produces matches - whose `owner_id` is some other component's identity. 
The - dispatch payload uniformly carries `owner_id` regardless - of shape. -- **Handler-lifecycle payload includes `owner_id`** - (PIPELINE-1 §8.2). Today the trio payload is - `{name: }`. Prescribed: `{owner_id, - intent_name, optional exception}`. +- **Dispatch payload is minimal** (PIPELINE-1 §7.1). Today + dispatch carries `skill_id` and `intent_name` in the + payload. PIPELINE-1 drops both from the payload — they + are already in the topic (`:`); + a consumer that needs them splits the topic. The + prescribed payload is `{lang, utterance, slots}`. + For plugin-bundled handlers (`pipeline_id == skill_id`), + the same uniform dispatch applies. +- **Handler-lifecycle payload updated** (PIPELINE-1 §8.2). + Today the trio payload is `{name: }`. + Prescribed: `{skill_id, intent_name, optional exception}`. ### 5.4 Architectural divergences @@ -1231,17 +1143,11 @@ defined by any spec** and should be removed or replaced: not a change to existing behaviour. - **The match contract is the single obligation** (PIPELINE-1 §4.2). The plugin's `match` operation has one MUST: return - a `Match` (§4.1) or `null`. Bus emissions during `match` - are allowed — a plugin that polls other components, calls - out to a model server, or runs any matching strategy that - requires bus communication is conformant. This matches the - actual OVOS converse-plugin pattern (it polls active skills - during its match decision) and accommodates LLM-backed and - agent-backed plugin shapes that are inherently bus-active. - Session mutation during `match` is via the explicit - `Match.updated_session` channel — see the next entry — so - declined plugins' exploratory mutations never reach the - next iteration step. + a `Match` or `null`. Bus emissions during `match` are + allowed — converse plugins, LLM-backed matchers, and + agent-backed shapes are all conformant. Session mutation + during `match` goes via `Match.updated_session` so + declined matches' mutations never escape. 
- **`Match.updated_session` as the match-phase session channel** (PIPELINE-1 §4.1, §4.2). Promotes the existing ovos-core code pattern @@ -1298,30 +1204,24 @@ defined by any spec** and should be removed or replaced: `forward`/`reply` inherit automatically. Loader-side interception covers off-dispatch emissions. - **Entry-point topic renamed `ovos.utterance.handle`** - (PIPELINE-1 §9.1). Current deployments use the - legacy `recognizer_loop:utterance` topic name. That name fails - the naming conventions of OVOS-MSG-1 §2.1.2 on three - counts: it uses `:` as a segment separator (where `:` is - reserved for `:` dispatch topics); - its leading segment names an implementation role (the - audio-input "recognizer loop") rather than a stable - assistant root; and it does not pair with the past-tense - terminal event `ovos.utterance.handled`. The rename fixes - all three: dot-separated hierarchy, stable `ovos.` root, - request/terminal pair (`handle` ↔ `handled`) sharing a - root verb. Migration cost is real — every audio-input - service emits this, every intent-service handler - subscribes — touching `ovos-dinkum-listener`, - `ovos-simple-listener`, `ovos-audio`, and - `ovos-core/intent_services/service.py`. A transitional - deployment MAY subscribe to both names during migration. + (PIPELINE-1 §9.1). `recognizer_loop:utterance` fails + MSG-1 §2.1.2 naming conventions: `:` as a segment + separator, an implementation-role prefix, and no pairing + with the terminal `ovos.utterance.handled`. Migration cost + is real — every audio-input service and intent-service + handler is affected. A transitional deployment MAY + subscribe to both names during migration. ### 5.5 New topics with no direct precedent - **`ovos.intent.matched`** (PIPELINE-1 §9.2). The - positive-match broadcast notification. Current OVOS has - `complete_intent_failure` for the negative case but no - positive equivalent. + positive-match broadcast notification. No current equivalent. 
+- **`ovos.intent.unmatched`** (PIPELINE-1 §9.4). Renamed from + `complete_intent_failure`; follows the `ovos.intent.*` + namespace for symmetry with `ovos.intent.matched`. +- **`ovos.utterance.speak`** (PIPELINE-1 §9.6). The NL output + exit point; symmetric to `ovos.utterance.handle`. No current + equivalent — TTS trigger is currently implicit. - **`ovos.intent.list` / `ovos.intent.describe`** (INTENT-4 §10). Introspection topics served from the orchestrator's passive registration index. @@ -1384,10 +1284,10 @@ a number of legacy names. Implementer migration aid: | Legacy topic | Status | |--------------|--------| | `recognizer_loop:utterance` | renamed to `ovos.utterance.handle` (see §5.4) | -| `complete_intent_failure` | **unchanged** — kept as the no-match signal. | +| `complete_intent_failure` | renamed to `ovos.intent.unmatched` — follows `ovos.intent.*` namespace. | | `ovos.utterance.cancelled` | **unchanged** — kept as the cancellation signal. | | `ovos.utterance.handled` | **unchanged** — kept as the universal end-marker. | -| `:` | **unchanged** — kept as the dispatch topic; PIPELINE-1 extends the shape to `:` so plugins can also own handlers. | +| `:` | **unchanged** — dispatch topic; a plugin-bundled handler has `skill_id == pipeline_id`. | | `mycroft.skill.handler.start` / `.complete` / `.error` | renamed to `ovos.intent.handler.start` / `.complete` / `.error` | #### Out of scope