diff --git a/CHANGELOG.md b/CHANGELOG.md
index c66ab70..8585e84 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -104,3 +104,20 @@ tool does not recognize the token and cannot expand the template.
 the rejecting topic. File paths never cross the bus — INTENT-2 locale
 files are a producer-side authoring convenience expanded inline by the
 skill loader before emission.
+
+## OVOS-AUDIO-IN-1 — Audio Input Service
+
+### 2
+
+- §6 (new) — listening lifecycle signals. The audio input service
+  emits `ovos.listener.record.started` / `ovos.listener.record.ended` around
+  voice-command capture, accepts `ovos.listener.sleep` to enter sleep mode
+  and suspend capture, and emits `ovos.listener.awoken` on the sleep→awake
+  transition. These replace the legacy `recognizer_loop:record_begin`
+  / `recognizer_loop:record_end` / `recognizer_loop:sleep` /
+  `mycroft.awoken` topics. All carry no payload; the session is
+  identified by `context.session.session_id`.
+- §6.5 — bus surface table for the listener role, including the
+  consumer-side `ovos.mic.listen` row (defined in OVOS-AUDIO-1 §4.4).
+- See-also — cross-references OVOS-AUDIO-1 §4.4 as the defining spec
+  for `ovos.mic.listen`.
diff --git a/GLOSSARY.md b/GLOSSARY.md
index 6eebc3d..32e1294 100644
--- a/GLOSSARY.md
+++ b/GLOSSARY.md
@@ -36,3 +36,4 @@ open a PR adding it.
 | **Message** | The unit of communication on the bus: a JSON object with `type`, `data`, `context` ([MSG-1 §2](msg-1.md)). |
 | **Context** | The assistant-metadata object on a Message; an extensible JSON object whose keys are defined by companion specs ([MSG-1 §2.3](msg-1.md)). |
 | **Session** | The per-conversation carrier in `context.session`; carries `session_id` (with `"default"` reserved for "originates from the device itself") and `lang` (the user's preferred language, distinct from any `data.lang` describing the payload's own language) ([MSG-1 §4](msg-1.md)). |
+| **Listening lifecycle signal** | A payload-free bus signal the audio input service emits or consumes around voice-command capture and sleep mode — `ovos.listener.record.started` / `.record.ended`, `ovos.listener.sleep`, `ovos.listener.awoken` ([AUDIO-IN-1 §6](audio-in.md)). |
diff --git a/appendix/divergences.md b/appendix/divergences.md
index 16e0e17..b922541 100644
--- a/appendix/divergences.md
+++ b/appendix/divergences.md
@@ -271,3 +271,12 @@ a number of legacy names. Implementer migration aid:
 | `add_context` / `remove_context` | Replaced by `ovos.context.set` / `.unset` under CONTEXT-1. |
 | `mycroft.skill.set_cross_context` / `remove_cross_context` | Replaced by `ovos.context.set` / `.unset` with `scope: "shared"` under CONTEXT-1. |
 | `.activate` | Activity-tracking emit currently in `ovos-core`; not part of any spec here. |
+
+#### Listening-lifecycle topics (AUDIO-IN-1)
+
+| Legacy topic | v2 replacement | Notes |
+|--------------|---------------|-------|
+| `recognizer_loop:record_begin` | `ovos.listener.record.started` | Capture start. `:` segment separator and implementation-role prefix dropped; no payload. |
+| `recognizer_loop:record_end` | `ovos.listener.record.ended` | Capture end; pairs with the start signal. |
+| `recognizer_loop:sleep` | `ovos.listener.sleep` | Controller-to-listener sleep request. |
+| `mycroft.awoken` | `ovos.listener.awoken` | Sleep→awake transition; moved into the `ovos.listener.*` namespace. |
diff --git a/audio-in.md b/audio-in.md
index 4cdd89b..056b010 100644
--- a/audio-in.md
+++ b/audio-in.md
@@ -119,7 +119,78 @@ placed in `context.session` (**OVOS-MSG-1 §4**).
 
 ---
 
-## 6. Conformance
+## 6. Listening lifecycle signals
+
+The audio input service emits lifecycle signals around voice-command
+capture and sleep mode to notify other components of listener state.
+
+### 6.1 Capture start
+
+When voice-command capture begins, the audio input service **MUST**
+emit:
+
+`ovos.listener.record.started`
+
+Payload:
+
+No payload. The session is identified by `context.session.session_id`
+of this Message.
+
+### 6.2 Capture end
+
+When capture ends, the audio input service **MUST** emit:
+
+`ovos.listener.record.ended`
+
+Payload:
+
+No payload. The session is identified by `context.session.session_id`
+of this Message.
+
+This signal pairs with `ovos.listener.record.started` (§6.1); a component
+that subscribed to the start signal uses this to restore state.
+
+### 6.3 Sleep mode
+
+A controller (e.g. a naptime skill) requests sleep mode by emitting:
+
+`ovos.listener.sleep`
+
+Payload:
+
+No payload. The session is identified by `context.session.session_id`
+of this Message.
+
+On receipt the audio input service enters sleep mode and suspends
+capture until it is awoken (§6.4).
+
+### 6.4 Awoken
+
+When the audio input service leaves sleep mode, it **MUST** emit:
+
+`ovos.listener.awoken`
+
+Payload:
+
+No payload. The session is identified by `context.session.session_id`
+of this Message.
+
+This signal fires only on the sleep→awake transition; it is not
+emitted when the service is already awake.
+
+### 6.5 Bus surface
+
+| Topic | Direction | Purpose |
+|-------|-----------|---------|
+| `ovos.listener.record.started` | audio-input → broadcast | Voice-command capture began (§6.1). |
+| `ovos.listener.record.ended` | audio-input → broadcast | Voice-command capture ended (§6.2). |
+| `ovos.listener.sleep` | controller → audio-input | Enter sleep mode and suspend capture (§6.3). |
+| `ovos.listener.awoken` | audio-input → broadcast | Left sleep mode (§6.4). |
+| `ovos.mic.listen` | any component → audio-input | Re-open the user input channel; consumed here, defined in OVOS-AUDIO-1 §4.4. |
+
+---
+
+## 7. Conformance
 
 ### An audio input service **MUST**:
 
@@ -128,7 +199,10 @@ placed in `context.session` (**OVOS-MSG-1 §4**).
   STT (§4);
 - assign a session in `context.session` per §5.2;
 - emit `ovos.utterance.handle` with `data.utterances` and `data.lang`
-  (§5).
+  (§5);
+- emit `ovos.listener.record.started` when voice-command capture begins and
+  `ovos.listener.record.ended` when it ends (§6.1, §6.2);
+- emit `ovos.listener.awoken` on the sleep→awake transition (§6.4).
 
 ### An audio input service **SHOULD**:
 
@@ -147,7 +221,8 @@ placed in `context.session` (**OVOS-MSG-1 §4**).
 - **OVOS-PIPELINE-1** — utterance lifecycle entry point (§9.1); post-STT
   transformer chains are owned here.
 - **OVOS-AUDIO-1** — audio output service; owns dialog and TTS
-  transformer chains.
+  transformer chains, and defines `ovos.mic.listen` (§4.4) which the
+  audio input service consumes (§6.5).
 - **OVOS-TRANSFORM-1** — audio-transformer chain (§3.1).
 - **OVOS-SESSION-1** — `session.lang`, `session.stt_lang`,
   `session.detected_lang`, `session.request_lang`.
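
Reviewer sketch, not part of the patch: one way the §6 signals could be driven from a small state tracker, to illustrate the payload-free messages and the "awoken only on the sleep→awake transition" rule. The `Message` dataclass and the `ListenerLifecycle` class are hypothetical names introduced for illustration and are not an existing OVOS API; the `emit` callback stands in for whatever bus client an implementation uses. What happens to a capture already in progress when `ovos.listener.sleep` arrives is not stated in §6.3, so this sketch assumes it is left untouched.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Message:
    """Hypothetical stand-in for a bus Message: type, data, context."""
    type: str
    data: dict = field(default_factory=dict)
    context: dict = field(default_factory=dict)


class ListenerLifecycle:
    """Hypothetical tracker for capture and sleep state (illustration only)."""

    def __init__(self, emit: Callable[[Message], None]) -> None:
        self._emit = emit
        self.asleep = False
        self.recording = False

    def _signal(self, topic: str, session_id: str) -> None:
        # Lifecycle signals carry no payload; the session travels in context.
        self._emit(Message(topic, {}, {"session": {"session_id": session_id}}))

    def start_capture(self, session_id: str = "default") -> bool:
        # Capture is suspended while asleep (§6.3).
        if self.asleep or self.recording:
            return False
        self.recording = True
        self._signal("ovos.listener.record.started", session_id)
        return True

    def end_capture(self, session_id: str = "default") -> bool:
        # Pairs with the start signal (§6.2); ignored if nothing is recording.
        if not self.recording:
            return False
        self.recording = False
        self._signal("ovos.listener.record.ended", session_id)
        return True

    def handle(self, message: Message) -> None:
        # Consumer side: a controller asks the service to sleep (§6.3).
        # Assumption: an in-progress capture is left as it is.
        if message.type == "ovos.listener.sleep":
            self.asleep = True

    def wake(self, session_id: str = "default") -> bool:
        # Emit only on the sleep→awake transition (§6.4).
        if not self.asleep:
            return False
        self.asleep = False
        self._signal("ovos.listener.awoken", session_id)
        return True
```

Passing `list.append` as `emit` is enough to observe the sequence in a test; a real service would forward each `Message` to its bus connection instead.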