From ed86ce821cee0daa06e0fa7ee372c91717539888 Mon Sep 17 00:00:00 2001
From: JarbasAi
Date: Tue, 23 Jun 2026 06:19:31 +0100
Subject: [PATCH] COMMON-QUERY-1: common query pipeline plugin specification

---
 CHANGELOG.md            |  28 ++
 GLOSSARY.md             |   3 +
 appendix/comparisons.md |  39 +++
 appendix/rationale.md   |  64 ++++
 common-query.md         | 627 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 761 insertions(+)
 create mode 100644 common-query.md

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 8585e84..4da400a 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -121,3 +121,31 @@ tool does not recognize the token and cannot expand the template.
   consumer-side `ovos.mic.listen` row (defined in OVOS-AUDIO-1 §4.4).
 - See-also — cross-references OVOS-AUDIO-1 §4.4 as the defining spec for
   `ovos.mic.listen`.
+
+## OVOS-COMMON-QUERY-1 — Common Query Pipeline Plugin
+
+### 2
+
+- Initial draft. Specifies the common query pipeline plugin: a
+  scatter-gather contest that answers factual questions by
+  broadcasting the utterance, collecting competing answers from
+  skills, ranking them, and speaking the best. Reserves the
+  `common_query` intent_name (PIPELINE-1 §7.3). The full contest runs
+  in the plugin's blocking `match` (a deliberate, documented exception
+  to PIPELINE-1 §4.4 latency discipline, since the answer is the claim
+  decision): a fast `ovos.common_query.ping`/`pong` poll filters
+  skills down to plausible answerers using only cheap local checks,
+  then `:common_query` requests full answers (where network
+  and DB I/O are expected) collected on `.common_query.response`.
+  Filtering and selection (minimum confidence, denylist, fast-win,
+  optional reranker) run against the live session; if no answer
+  survives, `match` returns `None` so the pipeline reaches fallback.
+  A surviving answer is carried in `Match.slots.answer` and spoken by
+  the plugin's trivial handler — skills never speak. Defines an
+  optional question gate (SHOULD, for latency) and an early-start
+  optimisation subscribing to `ovos.utterance.handle` to overlap the
+  contest with upstream pipeline stages, caching only raw responses
+  keyed by `(session_id, utterance)`. All poll/answer messages carry
+  the `utterance` as correlation key and derive via MSG-1 `reply`,
+  with the session in `context.session`. Tunable defaults and
+  confidence-range guidance are collected in appendices.
diff --git a/GLOSSARY.md b/GLOSSARY.md
index 32e1294..9717acd 100644
--- a/GLOSSARY.md
+++ b/GLOSSARY.md
@@ -37,3 +37,6 @@ open a PR adding it.
 | **Context** | The assistant-metadata object on a Message; an extensible JSON object whose keys are defined by companion specs ([MSG-1 §2.3](msg-1.md)). |
 | **Session** | The per-conversation carrier in `context.session`; carries `session_id` (with `"default"` reserved for "originates from the device itself") and `lang` (the user's preferred language, distinct from any `data.lang` describing the payload's own language) ([MSG-1 §4](msg-1.md)). |
 | **Listening lifecycle signal** | A payload-free bus signal the audio input service emits or consumes around voice-command capture and sleep mode — `ovos.listener.record.started` / `.record.ended`, `ovos.listener.sleep`, `ovos.listener.awoken` ([AUDIO-IN-1 §6](audio-in.md)). |
+| **Common query** | A pipeline plugin that answers factual questions by holding a timed contest among skills — broadcast, collect competing answers, rank, speak the best ([COMMON-QUERY-1 §2](common-query.md)). |
+| **Scatter-gather** | The contest pattern: one broadcast fans out to many skills (scatter), their answers are collected and ranked (gather) ([COMMON-QUERY-1 §2](common-query.md)). |
+| **Wants-to-answer poll** | Common query's fast ping/pong phase — a cheap local filter where skills self-nominate before the expensive full-answer phase ([COMMON-QUERY-1 §6](common-query.md)). |
diff --git a/appendix/comparisons.md b/appendix/comparisons.md
index f49b48e..fa2c942 100644
--- a/appendix/comparisons.md
+++ b/appendix/comparisons.md
@@ -181,3 +181,42 @@ architecture:
   (OVOS-INTENT-3 §1) rather than HA-style curated vocabulary. The
   trade-off: skill author freedom vs. cross-integration vocabulary
   sharing.
+
+### 2.6 Mycroft CommonQuerySkill — the direct ancestor
+
+COMMON-QUERY-1's closest comparator is not another assistant but
+OVOS's own lineage: Mycroft's `CommonQuerySkill` base class, from
+which the scatter-gather question-answering pattern is inherited.
+The shapes rhyme — broadcast a query, let skills self-nominate,
+collect answers, speak the best — but the formalization diverges in
+three ways worth recording.
+
+**Two phases, different reason.** Mycroft's CommonQuery was also
+two-phase (a query broadcast, then answer collection), but the split
+was driven by **message-bus timeout management** — the framework
+needed a bounded window to gather responses from skills that might
+never reply. COMMON-QUERY-1 keeps a two-phase poll for a different,
+sharper reason (§6): the ping is a *cheap local filter* that exists
+to keep I/O-heavy skills from querying their backends on every
+utterance. The window is incidental; the filtering is the point.
+
+**Where the contest lives.** Mycroft ran the gather inside a skill
+handler — common query was itself a skill. COMMON-QUERY-1 lifts it
+into a pipeline plugin and runs the entire contest in `match`, so
+the no-answer case returns `None` and the pipeline reaches fallback
+(rationale §4.9). A skill-layer implementation cannot do this: by
+the time a skill handler runs, the claim is already made and
+fallback is foreclosed — the same layering argument that puts STOP-1
+in the pipeline (rationale §4.8).
+
+**Single speaker.** In COMMON-QUERY-1 the plugin is the only voice:
+skills return answer *strings* and the plugin speaks the winner
+(§10). This removes the ambiguity, present in the original, about
+which component renders speech, and lets the plugin re-rank or
+suppress answers without a skill having already spoken.
+
+No mainstream closed stack (Alexa, Google) exposes a comparable
+mechanism, because answer resolution there happens centrally in the
+cloud rather than as an open contest among independently authored
+local skills. The scatter-gather-over-a-bus shape is specific to the
+open-ecosystem voice OS.
diff --git a/appendix/rationale.md b/appendix/rationale.md
index 1851e17..17d1be9 100644
--- a/appendix/rationale.md
+++ b/appendix/rationale.md
@@ -449,3 +449,67 @@ subscribed to `:stop`. The pipeline plugin matches and selects; the
 skill stops. Stop is one of the few cases in the spec set where the
 pipeline / skill split is not substitutable.
 
+
+### 4.10 Common query pipeline plugin (COMMON-QUERY-1)
+
+Common query answers factual questions by holding a timed contest
+among skills — broadcast the question, collect competing answers,
+rank them, speak the best. Four of its design choices are
+unusual enough to be worth recording, because each one trades
+against an instinct a reader brings to the spec.
+
+**`match` blocks, and that is deliberate.** PIPELINE-1 §4.4 tells
+plugins to return from `match` quickly and defer expensive work to
+the handler, because match-phase latency is response latency. Common
+query openly violates that discipline, and it has to: the answer
+*is* the claim decision. The plugin cannot return a `Match` and
+collect afterwards, because whether it claims at all depends on
+whether any skill produced an answer above threshold. Routing and
+processing are the same act here, so both happen in `match`. This is
+the one place the spec set says "yes, this matcher blocks for
+seconds" — and it pays for that admission with the early-start
+optimisation and explicit pipeline positioning, rather than
+pretending the cost away.
+
+**Returning `None` on no-answer is what keeps fallback alive.** The
+earlier, discarded design had the plugin claim the utterance, then
+discover during the handler that no skill could answer, and speak a
+dead-end "I don't know." That permanently starves fallback: once a
+plugin claims, first-match-wins means no later stage runs. Moving the
+whole contest into `match` lets the plugin make an honest claim — it
+returns a `Match` only when it actually has an answer, and `None`
+otherwise — so a failed contest flows naturally to fallback. The
+correctness of the whole pipeline tail depends on the contest
+finishing before the claim is made.
+
+**Ping/pong is a cheap filter gating an expensive operation, not
+ceremony.** It would be simpler to broadcast the question once and
+let skills answer directly. The two-phase poll earns its place
+because the full-answer request invites real I/O — a knowledge skill
+will hit Wikipedia, Wolfram, or a database. Without the cheap local
+pong filter, every such skill performs that I/O for every question
+that passes the gate, including ones far outside its domain. The
+~500ms poll window buys the right to *not* hammer every backend on
+every utterance. (Mycroft's original CommonQuerySkill was also
+two-phase, but for a different reason — message-bus timeout
+management; see comparisons §2.6.)
+
+**Early start hides latency without shrinking the contest.** Because
+`match` blocks, the plugin MAY begin the contest the instant the
+utterance arrives (`ovos.utterance.handle`), running it in parallel
+with the upstream stop/converse/intent stages that get first refusal
+anyway. The subtle requirement is that the early-start cache holds
+only *raw* responses — never a selected answer — and all filtering
+and selection run at `match` time against the *live* session. That
+keeps the optimisation transparent: an upstream stage that
+blacklists a skill or changes session state still takes full effect,
+because the denylist and confidence filters never saw the stale
+snapshot.
+
+The question gate (COMMON-QUERY-1 §4) is the other half of the
+latency story: a cheap up-front classifier that rejects weather
+requests, music commands, timers, and plain statements before any
+broadcast. It is a SHOULD, not a MUST — the confidence filter
+guarantees correctness without it — but on mixed traffic it is the
+single largest latency win available, since it skips the entire
+contest for utterances no knowledge skill would answer anyway.
diff --git a/common-query.md b/common-query.md
new file mode 100644
index 0000000..8b266ee
--- /dev/null
+++ b/common-query.md
@@ -0,0 +1,627 @@
+# Common Query Pipeline Plugin Specification
+
+**Spec ID:** OVOS-COMMON-QUERY-1 · **Version:** 2 · **Status:** Draft
+
+This specification defines the **common query pipeline plugin** — a
+pipeline plugin that answers factual questions by holding a timed
+contest among skills. During its `match` phase it broadcasts the
+question, collects full answers from the skills that claim they can
+answer, ranks them, and — if any answer clears a confidence
+threshold — returns a `Match` carrying the winning answer for its
+own handler to speak. When no answer clears the threshold, `match`
+returns `None` and the pipeline continues to the next stage,
+including fallback.
+
+It builds on OVOS-MSG-1 (envelope, `reply` derivation, session
+carrier), OVOS-PIPELINE-1 (pipeline-plugin contract, `Match` shape,
+dispatch topic shape, handler-lifecycle trio, `Match.updated_session`,
+reserved intent_name registry, §4.4 blocking-match allowance), and
+OVOS-SESSION-1 (session field registry, omission rule).
+
+The key words **MUST**, **MUST NOT**, **SHOULD**, **SHOULD NOT**,
+**MAY**, and **RECOMMENDED** are used as in RFC 2119.
+
+---
+
+## 1. Scope
+
+This specification defines:
+
+- the **common query plugin role** (§2) — a pipeline plugin whose
+  `match` blocks while it runs a multi-skill contest;
+- the **reserved intent_name `common_query`** (§3);
+- the **question gate** (§4) — an optional pre-filter that rejects
+  non-question utterances before any broadcast;
+- the **early-start optimisation** (§5) — how the plugin overlaps
+  its contest with upstream pipeline stages;
+- the **wants-to-answer poll** (§6) — the fast ping/pong broadcast
+  that filters skills down to plausible answerers;
+- the **answer collection** (§7) — full-answer gathering;
+- the **filtering and selection** (§8) — confidence filtering,
+  denylist, fast-win, and ranking, applied against the live session;
+- the **match construction** (§9) — the `Match`, or `None`;
+- the **plugin handler** (§10) — the trivial handler that speaks the
+  selected answer;
+- the **skill-side protocol** (§11);
+- **pipeline positioning** (§12);
+- the **bus surface** (§13);
+- **conformance** (§14);
+- **tunable defaults** (Appendix A) and **confidence-range guidance
+  for skill authors** (Appendix B).
+
+This specification does **not** define:
+
+- **vocabulary file format or question-classification algorithm** —
+  the gate MAY use any method; only the observable behaviour
+  (accept / reject) is normative.
+- **skill-side answer generation** — what a skill does internally to
+  produce an answer is the skill's business. The spec fixes only the
+  bus contract by which the skill reports its answer.
+- **the framework decorator or base class** a skill author uses to
+  participate — these are conveniences, not normative. Any component
+  that honours the bus contract in §11 is a valid common-query skill.
+- **streaming answer delivery** — the plugin collects complete answer
+  strings before selecting; incremental assembly is out of scope.
+
+---
+
+## 2. The common query plugin role
+
+The common query plugin is a pipeline plugin (PIPELINE-1 §3) that
+**bundles its own handler**. Its two roles are structurally separate:
+
+- **Matcher role** (§6–§9): during `match`, the plugin runs the full
+  contest — ping/pong broadcast, parallel answer collection,
+  filtering, and ranking. If an answer wins, `match` returns a
+  `Match` pointing at the plugin itself, carrying the answer in
+  `slots`. If no answer wins, `match` returns `None` and the
+  orchestrator proceeds to the next pipeline stage, including
+  fallback.
+- **Handler role** (§10): on receiving `:common_query`,
+  the plugin reads the selected answer from the dispatch payload and
+  speaks it.
+
+Because `match` returns `None` when no good answer is found, common
+query **never blocks fallback**. This is its defining difference from
+a plugin that claims an utterance speculatively and only later
+discovers it cannot satisfy it.
+
+### 2.1 Blocking match — a deliberate exception to latency discipline
+
+OVOS-PIPELINE-1 §4.4 permits `match` to block on bus I/O but
+**SHOULD**s plugins to return quickly and defer expensive work to the
+handler, since match-phase latency is response latency. Common query
+is a deliberate, documented exception: the answer **is** the claim
+decision. The plugin cannot return a `Match` and defer collection to
+the handler, because whether it claims at all depends on whether any
+skill produces an answer above threshold (§9). The expensive work and
+the routing decision are the same act, so it must happen in `match`.
+
+Two consequences a deployer **MUST** accept:
+
+- **Bound interaction.** OVOS-PIPELINE-1 §4.4 lets the orchestrator
+  bound each `match` by a timeout and skip a plugin that exceeds it.
+  A deployment **MUST** set common query's match-timeout bound at or
+  above its collection-window ceiling (§7.2), or the stage will be
+  skipped mid-contest. The early-start optimisation (§5) is the
+  intended way to keep the observed `match` duration low without
+  shrinking the contest.
+- **Positioning.** Latency is bounded by the slowest claiming skill,
+  up to the ceiling; §12 positions the stage to contain that cost.
+
+Every other field of the pipeline-plugin contract applies unchanged:
+the plugin is loaded and iterated per `session.pipeline` ordering,
+subject to first-match-wins iteration (PIPELINE-1 §6.2) and denylist
+filtering (PIPELINE-1 §5.2–§5.4).
+
+### 2.2 Pipeline identity
+
+The plugin is loaded as one or more `pipeline_id` entries in
+`session.pipeline`. A deployment typically configures one entry
+(e.g., `common_query`). Confidence-tier variants are a deployment
+choice, not normative.
+
+---
+
+## 3. Reserved intent_name
+
+The intent_name `common_query` is **reserved** in the
+OVOS-PIPELINE-1 §7.3 registry.
+
+| Reserved intent_name | Dispatch topic | Meaning |
+|----------------------|---------------|---------|
+| `common_query` | `:common_query` | The plugin's own handler: speak the answer selected during `match` (§10). |
+
+This intent_name is **not** registered via OVOS-INTENT-4. A
+registration naming `common_query` via `ovos.intent.register.*` is
+malformed per PIPELINE-1 §7.3.
+
+The `:common_query` topic shape (colon form) is used by the
+plugin during `match` to request full answers from claiming skills
+(§7). These are sent by the plugin, not by the orchestrator.
+
+---
+
+## 4. The question gate
+
+The question gate is a **cost-optimisation pre-filter**, not the
+primary quality mechanism. The confidence filter (§8) is the primary
+quality gate: even with no gate, a non-question utterance produces no
+answer above threshold, `match` returns `None`, and the pipeline
+continues. The gate exists only to skip the broadcast cost — the
+ping/pong round-trip and parallel skill invocations — for utterances
+that obviously cannot produce a useful answer.
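
As an illustration of the question gate described in §4, here is a minimal sketch of a vocabulary-based pre-filter. The word lists and the function name `gate_accepts` are hypothetical; the specification leaves the classification method open and fixes only the accept/reject behaviour. The sketch rejects only utterances that open with an obvious command verb and accepts everything else, following the "when in doubt, accept" bias of §4.2.

```python
# Hypothetical vocabularies: the spec does not define any word list.
QUESTION_WORDS = {"what", "who", "when", "where", "why", "how", "which"}
COMMAND_WORDS = {"play", "set", "turn", "stop", "pause"}


def gate_accepts(utterance: str) -> bool:
    """Return True to proceed to the poll, False to make match return None."""
    words = utterance.lower().strip(" ?!.").split()
    if not words:
        return False  # nothing to ask about
    if utterance.strip().endswith("?") or words[0] in QUESTION_WORDS:
        return True
    if words[:3] == ["tell", "me", "about"]:
        return True
    if words[0] in COMMAND_WORDS:
        return False  # unambiguous action command
    return True  # fuzzy boundary: accept rather than silently fail the user
```

A rejected utterance costs nothing further: the plugin returns `None` before any ping is broadcast. A wrongly accepted one only costs a round-trip, since the confidence filter still guards the final answer.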
+
+A plugin **SHOULD** apply a gate — a sentence-type classifier or any
+other cheap short-circuit — to avoid running the contest for
+utterances that are not question-like. Weather requests, music
+commands, timers, and plain statements have no business reaching a
+knowledge skill, and querying them wastes the full ping/pong-plus-
+collection latency on every such utterance. A cheap up-front reject is
+the single largest latency win available to a deployment that sees
+mixed traffic.
+
+A deployment that omits the gate is still conformant — the confidence
+filter guarantees correctness either way — but pays the broadcast cost
+on every utterance. A gate trades that cost against the risk of false
+negatives (a genuine question wrongly rejected), so §4.2 biases the
+gate toward acceptance when in doubt.
+
+### 4.1 Gate semantics
+
+When configured, the gate is a binary pre-filter:
+
+- **Accept** — the utterance is plausibly a factual question. Proceed
+  to the poll (§6).
+- **Reject** — the utterance is clearly an action command or
+  otherwise not a factual question. `match` returns `None` without
+  broadcasting.
+
+The gate **MUST NOT** be used as a confidence scorer or ranking
+layer; scoring belongs to the responding skills (§7–§8). The gate MAY
+use any combination of classifiers, vocabulary heuristics, or
+length thresholds. A deployment MAY skip the plugin's own gate and
+rely on an upstream classifier.
+
+### 4.2 Gate conformance
+
+The gate **MUST** accept utterances that express a factual question
+("what is the capital of France", "who invented electricity", "tell
+me about France") and **SHOULD NOT** accept unambiguous action
+commands with no information intent ("play music", "set a timer",
+"turn off the lights").
+
+The question/command boundary is fuzzy. Over-acceptance wastes a
+round-trip; under-acceptance silently fails the user. When in doubt,
+accept.
+
+---
+
+## 5. Early-start optimisation
+
+Common query is a slow stage, but most of its latency can be hidden.
+The plugin **MAY** subscribe to the utterance-entry topic
+`ovos.utterance.handle` (OVOS-PIPELINE-1 §9.1) — the message the
+orchestrator consumes to begin a new utterance, before pipeline
+iteration starts. When this subscription is active, the plugin begins
+the contest (gate → poll → answer collection) immediately, **in
+parallel** with the upstream pipeline stages (stop, converse, intent
+matchers). By the time the orchestrator calls `match` for the common
+query stage, the raw responses are often already collected.
+
+### 5.1 What is cached, and what is not
+
+The early-start cache holds only the **raw skill responses** (§7) and
+the **utterance** they were collected for. It does **not** hold a
+selected answer. All filtering and selection (§8) is performed at
+`match` time against the **live session** the orchestrator passes in —
+never against the session snapshot the early start began with. This
+makes the optimisation transparent: an upstream stage that blacklists
+a skill or changes session state still takes full effect, because the
+denylist and confidence filters run on the live session after
+collection.
+
+If the live session's `lang` differs from the language the early
+start collected under, the cached responses **MUST** be discarded and
+the contest re-run.
+
+### 5.2 Cache keying and lifetime
+
+The cache **MUST** be keyed by the pair `(session_id, utterance)`,
+with `session_id` read from `context.session`. A cache entry is
+consumed and evicted when `match` reads it. An entry is evicted
+unconditionally when a new utterance arrives in the same session.
+
+A cache entry **MUST NOT** be returned for any utterance other than
+the exact string it was collected for. There is no time-based
+expiry — the cache exists to bridge a single pipeline iteration, and
+the new-utterance eviction bounds its lifetime precisely.
+
+---
+
+## 6. The wants-to-answer poll
+
+When the gate accepts (or no gate is configured), the plugin runs a
+**fast broadcast poll** to filter the skill set down to those that
+plausibly can answer. The poll exists to avoid invoking the
+**expensive** full-answer path (§7) — which may hit the network or a
+database — on skills that have no relevant knowledge. It is a cheap
+local filter gating an expensive operation; that is its entire
+justification.
+
+### 6.1 Ping
+
+The plugin broadcasts on `ovos.common_query.ping`:
+
+```json
+{ "utterance": "what is the capital of France" }
+```
+
+| Field | Type | Required | Meaning |
+|-------|------|----------|---------|
+| `utterance` | string | yes | The utterance being broadcast. Also the correlation key for the pong and the answer (§6.2, §7.1). |
+
+The language is read from `context.session.lang` per OVOS-SESSION-1.
+The broadcast carries no `destination`; any subscribed skill MAY
+respond. The session rides in `context.session` per OVOS-MSG-1 §4.
+
+### 6.2 Pong
+
+A skill that believes it can answer responds on
+`ovos.common_query.pong`, derived via `reply` (OVOS-MSG-1 §5):
+
+```json
+{
+  "utterance": "what is the capital of France",
+  "skill_id": "wiki.test",
+  "can_answer": true,
+  "latency_ms": 800
+}
+```
+
+| Field | Type | Required | Meaning |
+|-------|------|----------|---------|
+| `utterance` | string | yes | Echo of the ping's utterance; correlates the pong to its poll. |
+| `skill_id` | string | yes | The responding skill's identifier. |
+| `can_answer` | boolean | yes | Whether the skill claims it can answer. |
+| `latency_ms` | number | no | Expected time in milliseconds to produce a full answer. Sizes the collection window (§7.2). |
+
+**The pong check is a fast local decision.** A skill **MUST** base
+`can_answer` on local, synchronous operations only — keyword
+matching, vocabulary lookup, cached knowledge — and **MUST NOT**
+perform network requests, database queries, or other blocking I/O
+during the pong phase. The full answer comes later (§7), where I/O is
+expected.
+
+The plugin **MUST** enforce a poll-window ceiling and stop waiting
+when it elapses; the responses it has by then are the claimants.
+Skills **SHOULD** respond within the deployer-configured pong bound
+(Appendix A). A skill that cannot answer **SHOULD** stay silent;
+sending `can_answer: false` is permitted but pointless, since the
+window closes on timeout or sufficiency regardless. A skill that does
+not respond in time is treated as not claiming.
+
+### 6.3 Poll window and early close
+
+The plugin **MUST** enforce a maximum poll window (Appendix A) and
+**SHOULD** close it early once enough claimants are identified — a
+deployment MAY proceed as soon as one claims.
+
+State is keyed by `session_id` from `context.session`; pongs whose
+`utterance` or session does not match the active poll **MUST** be
+discarded.
+
+---
+
+## 7. Answer collection
+
+After the poll window closes, the plugin requests full answers from
+all claiming skills **in parallel**.
+
+### 7.1 Full-answer request and response
+
+The plugin sends `:common_query` to each claiming skill:
+
+```json
+{ "utterance": "what is the capital of France" }
+```
+
+| Field | Type | Required | Meaning |
+|-------|------|----------|---------|
+| `utterance` | string | yes | The utterance to answer. Correlation key for the response. |
+
+The language is read from `context.session.lang`. These are direct
+plugin-to-skill messages: the orchestrator does not participate, does
+not emit the handler-lifecycle trio for them, and skills **MUST NOT**
+emit lifecycle signals in response.
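
Looking back at the poll rules of §6.2 and §6.3, the bounded window, the early close, and the discarding of mismatched pongs could be sketched as follows. This is an illustration under stated assumptions, not the plugin's implementation: a `queue.Queue` of `(session_id, data)` tuples stands in for the bus subscription, and the names `collect_claimants` and `enough` are hypothetical.

```python
import queue
import time


def collect_claimants(pongs, utterance, session_id, window_s=0.5, enough=None):
    """Gather claimants until the window elapses or `enough` have claimed.

    `pongs` is a queue.Queue of (session_id, data) tuples: an assumed
    stand-in for a subscription to ovos.common_query.pong.
    Returns {skill_id: latency_ms or None}.
    """
    claimants = {}
    deadline = time.monotonic() + window_s  # poll-window ceiling (Appendix A)
    while enough is None or len(claimants) < enough:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # ceiling reached: whoever answered is the claimant set
        try:
            sid, data = pongs.get(timeout=remaining)
        except queue.Empty:
            break
        # Pongs for another session or another utterance are discarded.
        if sid != session_id or data.get("utterance") != utterance:
            continue
        skill_id = data.get("skill_id")
        if skill_id and data.get("can_answer") is True:
            claimants[skill_id] = data.get("latency_ms")
    return claimants
```

Passing `enough=1` models a deployment that proceeds as soon as one skill claims; leaving it as `None` waits for the full window.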
+
+Each skill emits its result on `.common_query.response`
+(dotted form, derived via `reply` per OVOS-MSG-1 §5):
+
+```json
+{
+  "utterance": "what is the capital of France",
+  "skill_id": "wiki.test",
+  "answer": "Paris is the capital of France.",
+  "conf": 0.85
+}
+```
+
+| Field | Type | Required | Meaning |
+|-------|------|----------|---------|
+| `utterance` | string | yes | Echo of the request's utterance; correlates the response to its request. |
+| `skill_id` | string | yes | The responding skill's identifier. |
+| `answer` | string | conditional | The natural-language answer. **MUST** be present when the skill has one. |
+| `conf` | number | conditional | Self-reported confidence in `[0, 1]`. **MUST** be present when `answer` is present (Appendix B). |
+
+A skill that cannot produce an answer after all **MUST** still
+respond, with no `answer` field, so early termination can fire.
+Responses whose `utterance` or session does not match the active
+collection **MUST** be discarded.
+
+### 7.2 Collection window
+
+The plugin **MUST** enforce a collection window with a hard ceiling
+(Appendix A). When `latency_ms` values are available from pongs
+(§6.2), the plugin **SHOULD** size the initial window to the maximum
+`latency_ms` across claimants, clamped to the ceiling; otherwise it
+**SHOULD** use the fixed initial window.
+
+The plugin **MUST** support early termination and **SHOULD** close
+the window as soon as every claiming skill has responded. A claimant
+that does not respond before the ceiling is treated as declining.
+
+---
+
+## 8. Filtering and selection
+
+Filtering and selection run at `match` time against the **live
+session** (§5.1), in order:
+
+1. **Minimum self-confidence.** Discard responses whose `conf` is
+   below the deployer-defined threshold (Appendix A).
+2. **Denylist.** Discard responses whose `skill_id` appears in the
+   live `session.blacklisted_skills` (PIPELINE-1 §5.3).
+3. **Fast-win.** If any surviving response carries `conf ≥`
+   fast-win threshold (Appendix A), the plugin **SHOULD** stop waiting
+   immediately and select it. The fast-win check MAY fire during
+   collection (§7.2), short-circuiting the window.
+4. **Selection.** Select the highest-`conf` survivor. Ties MAY be
+   broken by any deployer-defined heuristic; the algorithm is not
+   normative. When a reranker is configured, the plugin **SHOULD**
+   pass all survivors to it and use its ranking in place of raw
+   `conf` ordering; the reranker interface is a deployment concern.
+
+If no response survives, the contest has no winner — `match` returns
+`None` (§9).
+
+---
+
+## 9. Match construction
+
+After selection (§8):
+
+- If **no response survived**, the plugin **MUST** return `None`. The
+  orchestrator proceeds to the next pipeline stage, including
+  fallback. A contest with no winner is an expected outcome, not an
+  error.
+- If **an answer won**, the plugin **MUST** return a `Match` with:
+  - `skill_id`: the plugin's own `pipeline_id`
+  - `intent_name`: `"common_query"` (reserved, §3)
+  - `lang`: from `context.session.lang`
+  - `utterance`: the candidate string
+  - `slots`: `{ "answer": "" }` — the
+    only field the handler needs (§10)
+  - `updated_session`: the inbound session, unmodified
+
+The plugin **MUST NOT** mutate the session: common query does not
+activate handlers, change `persona_id`, or modify any session field.
+
+---
+
+## 10. The plugin handler
+
+When the orchestrator dispatches `:common_query`, the
+handler runs and fires the handler-lifecycle trio per PIPELINE-1 §8
+(`ovos.intent.handler.start`, `.complete`, `.error`).
+
+The handler is intentionally trivial — all contest work completed
+during `match` (§6–§8). It:
+
+1. Reads `answer` from `slots` in the dispatch payload.
+2. Speaks it via `ovos.utterance.speak` per OVOS-PIPELINE-1.
+3. Emits `ovos.intent.handler.complete`.
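
The filtering order of §8 and the match construction of §9, which produce the `slots.answer` the handler above reads, could be sketched as below. The `Match` dataclass is a hypothetical stand-in for the PIPELINE-1 shape, the session is assumed to be a plain dict, and a response missing `conf` is treated as confidence 0 (an assumption of this sketch). The fast-win short-circuit during collection and the optional reranker are left out; thresholds are the Appendix A defaults.

```python
from dataclasses import dataclass
from typing import Optional

MIN_CONF = 0.5  # Appendix A default: minimum self-confidence


@dataclass
class Match:
    """Hypothetical stand-in for the PIPELINE-1 Match shape."""
    skill_id: str
    intent_name: str
    lang: str
    utterance: str
    slots: dict
    updated_session: dict


def build_match(responses, utterance, session,
                pipeline_id="common_query") -> Optional[Match]:
    # Declines carry no `answer` field and never compete.
    answered = [r for r in responses if r.get("answer")]
    # Step 1: minimum self-confidence.
    confident = [r for r in answered if r.get("conf", 0.0) >= MIN_CONF]
    # Step 2: denylist, read from the live session passed to match.
    denied = set(session.get("blacklisted_skills", []))
    survivors = [r for r in confident if r["skill_id"] not in denied]
    if not survivors:
        return None  # no winner: the pipeline continues to fallback
    # Step 4: highest self-reported confidence wins.
    winner = max(survivors, key=lambda r: r["conf"])
    return Match(
        skill_id=pipeline_id,
        intent_name="common_query",
        lang=session["lang"],
        utterance=utterance,
        slots={"answer": winner["answer"]},
        updated_session=session,  # passed through, never mutated
    )
```

Because the denylist is read from the `session` argument at call time, a skill blacklisted by an upstream stage is excluded even when its response was collected earlier.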
+
+The handler **MUST NOT** re-dispatch to skills or perform additional
+collection. `ovos.intent.handler.error` is reserved for crashes and
+unrecoverable handler failures.
+
+---
+
+## 11. Skill-side protocol
+
+A skill participates by handling two topics (see §13 for the full bus
+surface):
+
+1. On `ovos.common_query.ping`, perform a **fast local check** for a
+   likely answer. If yes, respond on `ovos.common_query.pong` with
+   `can_answer: true`, the echoed `utterance`, and optionally
+   `latency_ms`. If no, stay silent.
+2. On `:common_query`, produce the best answer — network
+   calls, DB queries, and full generation are appropriate here — and
+   emit it on `.common_query.response` (via `reply`,
+   OVOS-MSG-1 §5) with the echoed `utterance`, `answer`, and `conf`.
+   If no answer can be produced, emit the response with no `answer`
+   field so early termination can fire.
+3. The skill **MUST NOT** call `ovos.utterance.speak` from its
+   `common_query` handler. Speaking is the plugin's responsibility
+   (§10).
+
+---
+
+## 12. Pipeline positioning
+
+Common query is a slow stage. A deployment **SHOULD** place it after
+all intent-matching stages and before the fallback stage(s): intent
+matchers are tried first, and fallback still runs if common query
+finds no answer. When a persona catch-all (OVOS-PERSONA-1 §10) is also
+present, common query precedes it, so deterministic question-answering
+is preferred over a persona's generated reply.
+
+```
+session.pipeline: [
+  "stop_high",
+  "converse",
+  "skill_high",
+  "skill_medium",
+  "common_query",
+  "fallback_medium",
+  "fallback_low"
+]
+```
+
+With early start enabled (§5), the contest begins as the utterance
+arrives, so its wall-clock cost is largely amortised against the
+upstream stages by the time the orchestrator reaches it. Without
+early start, the stage blocks for the full collection window.
+
+---
+
+## 13. Bus surface
+
+| Topic | Direction | Purpose | Defined in |
+|-------|-----------|---------|------------|
+| `ovos.common_query.ping` | plugin → all skills | Wants-to-answer poll | §6.1 |
+| `ovos.common_query.pong` | skill → plugin | Claim, via `reply` | §6.2 |
+| `:common_query` | plugin → claiming skill | Full-answer request (during match) | §7.1 |
+| `.common_query.response` | claiming skill → plugin | Full answer or decline, via `reply` | §7.1, §11 |
+| `:common_query` | orchestrator → plugin | Handler dispatch (reserved intent_name) | §3, §10 |
+
+Colon-form topics (`:common_query`, `:common_query`)
+follow the PIPELINE-1 §7 dispatch shape. Dotted-form topics
+(`.common_query.response`) are skill-emitted events per
+MSG-1 §2.1.1. `ovos.common_query.ping` is a broadcast. Pong and
+answer responses are both derived via `reply` (OVOS-MSG-1 §5). Every
+poll/response message carries the `utterance` as its correlation key.
+
+---
+
+## 14. Conformance
+
+### A common query pipeline plugin **MUST**:
+
+- expose a blocking `match(utterances, lang, session) → Match | None`
+  per PIPELINE-1 §4 (§2.1);
+- broadcast `ovos.common_query.ping` and collect
+  `ovos.common_query.pong` within a bounded poll window (§6.3);
+- discard pongs and responses whose `utterance` or session does not
+  match the active contest (§6.3, §7.1);
+- request full answers via `:common_query` from all
+  claimants in parallel and collect within a bounded window
+  (§7.1, §7.2);
+- apply confidence filtering and the denylist against the **live
+  session** passed to `match`, not against any early-start snapshot
+  (§5.1, §8);
+- honour the live `session.blacklisted_skills` (§8 step 2);
+- return `None` when no response survives, letting the pipeline reach
+  fallback (§9);
+- return a `Match` with `skill_id` = its own `pipeline_id`,
+  `intent_name` = `"common_query"`, and `slots.answer` = the selected
+  answer when one wins (§9);
+- not mutate the session — `Match.updated_session` MUST equal the
+  inbound session (§9);
+- key all contest state by `session_id` from `context.session` (§6.3);
+- speak the selected answer from `slots.answer` in the handler
+  without re-dispatching to skills (§10).
+
+### A common query pipeline plugin **SHOULD**:
+
+- apply a question gate — classifier or other cheap short-circuit —
+  to skip the contest for non-question-like utterances; gate-less
+  deployments are conformant but pay the broadcast cost on every
+  utterance (§4);
+- subscribe to the utterance-arrival event and run the contest early,
+  in parallel with upstream stages (§5);
+- discard early-start cache entries when the live `lang` differs, and
+  evict on every new utterance in the session (§5.1, §5.2);
+- close the poll window early when enough claimants respond (§6.3);
+- size the collection window from claimants' `latency_ms` (§7.2);
+- close the collection window on fast-win or all-responded (§7.2, §8);
+- use a reranker when configured (§8 step 4).
+
+### A skill that participates in common query **MUST**:
+
+- on `ovos.common_query.ping`, perform only a fast local check;
+  **MUST NOT** perform network requests or blocking I/O during the
+  pong phase (§6.2);
+- echo the `utterance` in every pong and response for correlation
+  (§6.2, §7.1);
+- emit answers on `.common_query.response` via `reply`
+  (§7.1, §11);
+- include `conf` whenever `answer` is present (§7.1);
+- respond even when no answer can be produced (no `answer` field), so
+  early termination can fire (§7.2);
+- not call `ovos.utterance.speak` from the `common_query` handler
+  (§11);
+- not emit handler-lifecycle signals in response to
+  `:common_query` (§7.1).
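
The skill-side obligations listed above can be sketched as two pure functions that build the pong and response payloads. This is a minimal illustration under stated assumptions, not the reference implementation: the function names `on_ping` and `on_common_query`, the `FACTS` lookup table, the fixed `latency_ms` value, and the idea that the utterance arrives in a plain `data` dict are all hypothetical. Only the payload field names (`utterance`, `can_answer`, `latency_ms`, `answer`, `conf`) come from this specification. Bus subscription and the `reply` derivation are deliberately left out.

```python
from typing import Optional

# Hypothetical local knowledge: stands in for the skill's cheap, in-memory check.
FACTS = {
    "what is the boiling point of water": ("100 degrees Celsius at sea level", 0.8),
}


def normalise(utterance: str) -> str:
    """Lower-case and trim the utterance so lookups are forgiving."""
    return utterance.strip().lower().rstrip("?")


def on_ping(data: dict) -> Optional[dict]:
    """Pong phase: fast local check only, no network or blocking I/O.

    Returns the pong payload, or None to stay silent.
    """
    utterance = data["utterance"]
    if normalise(utterance) not in FACTS:
        return None  # cannot plausibly answer: stay silent
    return {
        "utterance": utterance,  # echoed for correlation
        "can_answer": True,
        "latency_ms": 50,  # hypothetical estimate of full-answer latency
    }


def on_common_query(data: dict) -> dict:
    """Answer phase: slow work is allowed here. Always returns a payload.

    The skill never speaks; it only reports `answer` and `conf`.
    """
    utterance = data["utterance"]
    hit = FACTS.get(normalise(utterance))
    if hit is None:
        # Decline: respond without `answer` so early termination can fire.
        return {"utterance": utterance}
    answer, conf = hit
    return {"utterance": utterance, "answer": answer, "conf": conf}
```

Keeping payload construction separate from bus wiring makes the two phases easy to unit-test; a real skill would pass these payloads to whatever `reply` mechanism its framework provides.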
+
+### A skill that participates in common query **SHOULD**:
+
+- respond to the ping within the deployer-configured bound
+  (Appendix A, §6.2);
+- report `conf` using the Appendix B ranges so values interoperate;
+- include `latency_ms` in its pong so the plugin can size an adaptive
+  collection window (§6.2);
+- ignore unknown fields in `ovos.common_query.ping`.
+
+---
+
+## Appendix A — Tunable defaults (RECOMMENDED)
+
+All values are deployer-configurable; these are the RECOMMENDED
+defaults. They are guidance, not protocol — a deployment that tunes
+them is conformant.
+
+| Knob | Default | Section |
+|------|---------|---------|
+| Pong response bound (skill-side target) | 100 ms | §6.2 |
+| Poll-window ceiling | 500 ms | §6.3 |
+| Collection-window initial | 3 s | §7.2 |
+| Collection-window ceiling | 5 s | §7.2 |
+| Minimum self-confidence | 0.5 | §8 step 1 |
+| Fast-win threshold | 0.9 | §8 step 3 |
+
+---
+
+## Appendix B — Confidence-range guidance for skill authors
+
+`conf` is self-reported and not calibrated across skills. These
+ranges are RECOMMENDED so independently authored skills produce
+comparable values; a reranker (§8 step 4) is the proper fix when
+calibration matters.
+
+| Range | Meaning |
+|-------|---------|
+| 0.0–0.3 | weak signal; something, but low certainty |
+| 0.3–0.5 | partial match; can attempt an answer |
+| 0.5–0.7 | reasonable answer; fairly confident |
+| 0.7–0.9 | strong answer; confident |
+| 0.9–1.0 | definitive answer; certain (use sparingly) |
+
+---
+
+## See also
+
+- *Utterance Lifecycle and Pipeline Specification* (OVOS-PIPELINE-1)
+  — the pipeline-plugin contract, the §4.4 blocking-match allowance
+  and latency discipline, the `Match` shape, the dispatch model, the
+  handler-lifecycle trio, the `ovos.utterance.handle` entry topic
+  (§9.1), and the reserved intent_name registry.
+- *Bus Message Specification* (OVOS-MSG-1) — the envelope,
+  `context.session` carrier, and `reply` derivation used for pong and
+  answer responses.
+- *Session Carrier Wire Shape Specification* (OVOS-SESSION-1) — the
+  field-registry mechanism, the omission rule, and `session.lang`.
+- *Session Lifecycle and State Ownership Specification*
+  (OVOS-SESSION-2) — session-keyed state and mutation boundaries.