OpenVoiceOS · JarbasAl · Jun 26, 2026 · Jun 23, 2026 · Jun 26, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -251,6 +251,33 @@ version 2: its `{{ … }}` sequences become substitution points, and its
 - See-also — cross-references OVOS-AUDIO-1 §4.4 as the defining spec
   for `ovos.mic.listen`.
 
+## OVOS-COMMON-QUERY-1 — Common Query Pipeline Plugin
+
+### 2
+
+- Initial draft. Specifies the common query pipeline plugin: a
+  scatter-gather contest that answers factual questions by
+  broadcasting the utterance, collecting competing answers from
+  skills, ranking them, and speaking the best. Reserves the
+  `common_query` intent_name (PIPELINE-1 §7.3). The full contest runs
+  in the plugin's blocking `match` (a deliberate, documented exception
+  to PIPELINE-1 §4.4 latency discipline, since the answer is the claim
+  decision): a fast `ovos.common_query.ping`/`pong` poll filters
+  skills down to plausible answerers using only cheap local checks,
+  then `<skill_id>:common_query` requests full answers (where network
+  and DB I/O are expected) collected on `<skill_id>.common_query.response`.
+  Filtering and selection (minimum confidence, denylist, fast-win,
+  optional reranker) run against the live session; if no answer
+  survives, `match` returns `None` so the pipeline reaches fallback.
+  A surviving answer is carried in `Match.slots.answer` and spoken by
+  the plugin's trivial handler — skills never speak. Defines an
+  optional question gate (SHOULD, for latency) and an early-start
+  optimisation subscribing to `ovos.utterance.handle` to overlap the
+  contest with upstream pipeline stages, caching only raw responses
+  keyed by `(session_id, utterance)`. All poll/answer messages carry
+  the `utterance` as correlation key and derive via MSG-1 `reply`,
+  with the session in `context.session`. Tunable defaults and
+  confidence-range guidance are collected in appendices.
 ## OVOS-FALLBACK-1 — Fallback Pipeline Plugin
 
 ### 2

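The two-phase poll and the selection step summarised in the CHANGELOG entry above can be pictured with a small in-process sketch. It is illustrative only: there is no message bus here, and `ToySkill`, `run_contest`, `select`, the keyword check, and the `0.5` threshold are hypothetical names and values chosen for the example, not anything defined by COMMON-QUERY-1.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Sequence


@dataclass
class Answer:
    skill_id: str
    text: str
    confidence: float


class ToySkill:
    """Hypothetical stand-in for a skill taking part in the contest."""

    def __init__(self, skill_id: str, keywords: Sequence[str],
                 text: str, confidence: float) -> None:
        self.skill_id = skill_id
        self.keywords = list(keywords)
        self.text = text
        self.confidence = confidence
        self.backend_calls = 0  # counts simulated expensive I/O

    def pong(self, utterance: str) -> bool:
        # Phase 1: cheap local check only, no network or DB access.
        lowered = utterance.lower()
        return any(k in lowered for k in self.keywords)

    def answer(self, utterance: str) -> Answer:
        # Phase 2: where a real skill would query its backend.
        self.backend_calls += 1
        return Answer(self.skill_id, self.text, self.confidence)


def run_contest(utterance: str, skills: Sequence[ToySkill]) -> List[Answer]:
    """Ask for full answers only from skills that ponged."""
    candidates = [s for s in skills if s.pong(utterance)]
    return [s.answer(utterance) for s in candidates]


def select(answers: Sequence[Answer],
           min_confidence: float = 0.5,  # assumed value, not a spec default
           denylist: FrozenSet[str] = frozenset()) -> Optional[Answer]:
    """Return the best surviving answer, or None so fallback can run."""
    survivors = [a for a in answers
                 if a.confidence >= min_confidence
                 and a.skill_id not in denylist]
    if not survivors:
        return None
    return max(survivors, key=lambda a: a.confidence)
```

In this sketch a skill that does not pong never reaches `answer`, so its `backend_calls` counter stays at zero, which is the saving the poll is meant to buy. A `None` from `select` corresponds to `match` returning `None`.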
diff --git a/GLOSSARY.md b/GLOSSARY.md
@@ -37,3 +37,6 @@ open a PR adding it.
 | **Context** | The assistant-metadata object on a Message; an extensible JSON object whose keys are defined by companion specs ([MSG-1 §2.3](msg-1.md)). |
 | **Session** | The per-conversation carrier in `context.session`; carries `session_id` (with `"default"` reserved for "originates from the device itself") and `lang` (the user's preferred language, distinct from any `data.lang` describing the payload's own language) ([MSG-1 §4](msg-1.md)). |
 | **Listening lifecycle signal** | A payload-free bus signal the audio input service emits or consumes around voice-command capture and sleep mode — `ovos.listener.record.started` / `.record.ended`, `ovos.listener.sleep`, `ovos.listener.awoken` ([AUDIO-IN-1 §6](audio-in.md)). |
+| **Common query** | A pipeline plugin that answers factual questions by holding a timed contest among skills — broadcast, collect competing answers, rank, speak the best ([COMMON-QUERY-1 §2](common-query.md)). |
+| **Scatter-gather** | The contest pattern: one broadcast fans out to many skills (scatter), their answers are collected and ranked (gather) ([COMMON-QUERY-1 §2](common-query.md)). |
+| **Wants-to-answer poll** | Common query's fast ping/pong phase — a cheap local filter where skills self-nominate before the expensive full-answer phase ([COMMON-QUERY-1 §6](common-query.md)). |
diff --git a/appendix/comparisons.md b/appendix/comparisons.md
@@ -181,3 +181,42 @@ architecture:
   (OVOS-INTENT-3 §1) rather than HA-style curated vocabulary.
   The trade-off: skill author freedom vs. cross-integration
   vocabulary sharing.
+
+### 2.6 Mycroft CommonQuerySkill — the direct ancestor
+
+COMMON-QUERY-1's closest comparator is not another assistant but
+OVOS's own lineage: Mycroft's `CommonQuerySkill` base class, from
+which the scatter-gather question-answering pattern is inherited.
+The shapes rhyme — broadcast a query, let skills self-nominate,
+collect answers, speak the best — but the formalization diverges in
+three ways worth recording.
+
+**Two phases, different reason.** Mycroft's CommonQuery was also
+two-phase (a query broadcast, then answer collection), but the split
+was driven by **message-bus timeout management** — the framework
+needed a bounded window to gather responses from skills that might
+never reply. COMMON-QUERY-1 keeps a two-phase poll for a different,
+sharper reason (§6): the ping is a *cheap local filter* that exists
+to keep I/O-heavy skills from querying their backends on every
+utterance. The window is incidental; the filtering is the point.
+
+**Where the contest lives.** Mycroft ran the gather inside a skill
+handler — common query was itself a skill. COMMON-QUERY-1 lifts it
+into a pipeline plugin and runs the entire contest in `match`, so
+the no-answer case returns `None` and the pipeline reaches fallback
+(rationale §4.9). A skill-layer implementation cannot do this: by
+the time a skill handler runs, the claim is already made and
+fallback is foreclosed — the same layering argument that puts STOP-1
+in the pipeline (rationale §4.8).
+
+**Single speaker.** In COMMON-QUERY-1 the plugin is the only voice:
+skills return answer *strings* and the plugin speaks the winner
+(§10). This removes the ambiguity, present in the original, about
+which component renders speech, and lets the plugin re-rank or
+suppress answers without a skill having already spoken.
+
+No mainstream closed stack (Alexa, Google) exposes a comparable
+mechanism, because answer resolution there happens centrally in the
+cloud rather than as an open contest among independently authored
+local skills. The scatter-gather-over-a-bus shape is specific to the
+open-ecosystem voice OS.
diff --git a/appendix/rationale.md b/appendix/rationale.md
@@ -679,3 +679,67 @@ subscribed to `<own_skill_id>:stop`. The pipeline plugin matches
 and selects; the skill stops. Stop is one of the few cases in
 the spec set where the pipeline / skill split is not
 substitutable.
+
+### 4.9 Common query pipeline plugin (COMMON-QUERY-1)
+
+Common query answers factual questions by holding a timed contest
+among skills — broadcast the question, collect competing answers,
+rank them, speak the best. Four of its design choices are
+unusual enough to be worth recording, because each one trades
+against an instinct a reader brings to the spec.
+
+**`match` blocks, and that is deliberate.** PIPELINE-1 §4.4 tells
+plugins to return from `match` quickly and defer expensive work to
+the handler, because match-phase latency is response latency. Common
+query openly violates that discipline, and it has to: the answer
+*is* the claim decision. The plugin cannot return a `Match` and
+collect afterwards, because whether it claims at all depends on
+whether any skill produced an answer above threshold. Routing and
+processing are the same act here, so both happen in `match`. This is
+the one place the spec set says "yes, this matcher blocks for
+seconds" — and it pays for that admission with the early-start
+optimisation and explicit pipeline positioning, rather than
+pretending the cost away.
+
+**Returning `None` on no-answer is what keeps fallback alive.** The
+earlier, discarded design had the plugin claim the utterance, then
+discover during the handler that no skill could answer, and speak a
+dead-end "I don't know." That permanently starves fallback: once a
+plugin claims, first-match-wins means no later stage runs. Moving the
+whole contest into `match` lets the plugin make an honest claim — it
+returns a `Match` only when it actually has an answer, and `None`
+otherwise — so a failed contest flows naturally to fallback. The
+correctness of the whole pipeline tail depends on the contest
+finishing before the claim is made.
+
+**Ping/pong is a cheap filter gating an expensive operation, not
+ceremony.** It would be simpler to broadcast the question once and
+let skills answer directly. The two-phase poll earns its place
+because the full-answer request invites real I/O — a knowledge skill
+will hit Wikipedia, Wolfram, or a database. Without the cheap local
+pong filter, every such skill performs that I/O for every question
+that passes the gate, including ones far outside its domain. The
+~500 ms poll window buys the right to *not* hammer every backend on
+every utterance. (Mycroft's original CommonQuerySkill was also
+two-phase, but for a different reason — message-bus timeout
+management; see comparisons §2.6.)
+
+**Early start hides latency without shrinking the contest.** Because
+`match` blocks, the plugin MAY begin the contest the instant the
+utterance arrives (`ovos.utterance.handle`), running it in parallel
+with the upstream stop/converse/intent stages that get first refusal
+anyway. The subtle requirement is that the early-start cache holds
+only *raw* responses — never a selected answer — and all filtering
+and selection run at `match` time against the *live* session. That
+keeps the optimisation transparent: an upstream stage that
+denylists a skill or changes session state still takes full effect,
+because the denylist and confidence filters run only at `match`
+time and never operate on a stale snapshot.
+
+The question gate (COMMON-QUERY-1 §4) is the other half of the
+latency story: a cheap up-front classifier that rejects weather
+requests, music commands, timers, and plain statements before any
+broadcast. It is a SHOULD, not a MUST — the confidence filter
+guarantees correctness without it — but on mixed traffic it is the
+single largest latency win available, since it skips the entire
+contest for utterances no knowledge skill would answer anyway.
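The early-start rule described in the rationale above (cache only raw responses, filter at `match` time against the live session) can be sketched as follows. This is a minimal illustration under stated assumptions: `EarlyStartCache`, `match`, the tuple layout, the returned dict shape, and the `0.5` threshold are all hypothetical, and the real plugin would receive responses over the bus rather than through a direct call.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# Hypothetical raw response layout: (skill_id, answer, confidence).
RawResponse = Tuple[str, str, float]


class EarlyStartCache:
    """Holds unfiltered responses keyed by (session_id, utterance)."""

    def __init__(self) -> None:
        self._raw: Dict[Tuple[str, str], List[RawResponse]] = {}

    def store(self, session_id: str, utterance: str,
              responses: List[RawResponse]) -> None:
        # Only raw responses are kept; no winner is chosen here.
        self._raw[(session_id, utterance)] = list(responses)

    def take(self, session_id: str,
             utterance: str) -> Optional[List[RawResponse]]:
        # Remove and return the entry, or None on a cache miss.
        return self._raw.pop((session_id, utterance), None)


def match(cache: EarlyStartCache,
          session_id: str,
          utterance: str,
          live_denylist: Set[str],
          run_contest: Callable[[str], List[RawResponse]],
          min_confidence: float = 0.5) -> Optional[dict]:
    """Select at match time, using the session state as it is now."""
    raw = cache.take(session_id, utterance)
    if raw is None:
        # No early start happened: run the contest synchronously.
        raw = run_contest(utterance)
    survivors = [r for r in raw
                 if r[0] not in live_denylist and r[2] >= min_confidence]
    if not survivors:
        return None  # lets the pipeline continue to fallback
    _, answer, _ = max(survivors, key=lambda r: r[2])
    # Stand-in for a Match carrying the answer in its slots.
    return {"intent_name": "common_query", "slots": {"answer": answer}}
```

Because selection happens inside `match`, a skill added to the denylist after the early start still loses the contest even though its response is sitting in the cache.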