From 99c559c68769efb97ec4306670e949631ed44fa4 Mon Sep 17 00:00:00 2001 From: Paulus Schoutsen Date: Wed, 27 May 2026 14:18:43 +0200 Subject: [PATCH 1/3] Add server-initiated-pcm-24bit scenario (#60) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds the 24-bit PCM scenario from Sendspin/conformance#60. The server re-packs the fixture as 24-bit (3-byte packed, little-endian, two's complement) and the matrix compares canonical hashes after the client unpacks. Per the audit, sendspin-go has no 24-bit code path and Web Audio in sendspin-js is float32-only — the conformance signal is expected to be different per-SDK once `pcm-24bit-decode` is wired up in each adapter. `pcm-24bit-decode` is declared on the aiosendspin server only; every client case fails fast until the adapter declares the capability. --- README.md | 1 + docs/sdk-issues.md | 323 +++++++++++++++++++++++++++++ src/conformance/implementations.py | 1 + src/conformance/scenarios.py | 20 ++ 4 files changed, 345 insertions(+) create mode 100644 docs/sdk-issues.md diff --git a/README.md b/README.md index 0109e7b..53f7237 100644 --- a/README.md +++ b/README.md @@ -14,6 +14,7 @@ Current scenarios: - `server-initiated-drift-injection` (Server injects timestamp drift mid-stream): start the server first, then the client. The server deliberately skews audio chunk timestamps partway through the stream; a conformant client recovers the original audio within the spec's soft/hard correction thresholds. No SDK in the matrix declares the `stream-sync-drift-correction` capability yet, so every case currently fails fast and the matrix surfaces the gap - `server-initiated-burst-cadence` (Server observes client/time burst cadence): start the server first, then the client. The server counts `client/time` messages over a fixed observation window and asserts the distribution matches the recommended cadence (bursts of 8, ~10 s apart). 
No SDK in the matrix declares the `client-time-burst-cadence` capability yet, so every case currently fails fast and the matrix surfaces the gap - `server-initiated-request-format` (Client requests a mid-stream codec switch): start the server first, then the client. The server begins streaming in Opus; the client is instructed to emit `stream/request-format` with FLAC at a fixed timestamp and the server responds with a fresh `stream/start`. No SDK currently emits `stream/request-format`, so every case currently fails fast on the `stream-request-format-emit` capability +- `server-initiated-pcm-24bit` (Server initiates connection and client wants 24-bit PCM): start the server first, then the client. The server re-packs the fixture as 24-bit PCM (3-byte packed, little-endian, two's complement). No SDK in the matrix declares the `pcm-24bit-decode` capability yet, so every case currently fails fast and the matrix surfaces the gap (notably the `sendspin-go` SDK has no 24-bit code path per the audit) ## Current coverage diff --git a/docs/sdk-issues.md b/docs/sdk-issues.md new file mode 100644 index 0000000..df0152c --- /dev/null +++ b/docs/sdk-issues.md @@ -0,0 +1,323 @@ +# SDK Issues to File + +Per-SDK punch list extracted from the seven audit docs in this directory. +Each bullet here is a candidate GitHub issue against the named SDK's repo. +Spec-text recommendations and missing conformance scenarios are tracked +separately and are **not** included here — this list is only concrete, +SDK-side code work. + +Severity hint: +- **🔴 critical** — audible misbehavior or direct spec violation that + affects normal playback today +- **🟠 high** — wrong defaults, parse-but-ignore, or missing behavior the + rest of the SDK already assumes +- **🟡 medium** — missing API surface or divergence from spec where the + practical impact is bounded +- **⚪ latent** — code path that would break in a scenario not exercised + today (24-bit PCM, FLAC mid-stream, etc.) 
+ +Source docs: +[stream-sync-correction.md](./stream-sync-correction.md), +[clock-synchronization.md](./clock-synchronization.md), +[codec-format-negotiation.md](./codec-format-negotiation.md), +[goodbye-and-operational-state.md](./goodbye-and-operational-state.md), +[reconnection-and-multi-server.md](./reconnection-and-multi-server.md), +[static-delay.md](./static-delay.md), +[volume-curve.md](./volume-curve.md). + +--- + +## `Sendspin/aiosendspin` + +Transport+clock library. Most "missing" behavior is by design (the +embedder owns playback). No code bugs identified — see `sendspin-cli` +for the Python embedder's followups. + +One potential SDK-shaped followup, not strictly a bug: + +- ⚪ **Expose a hook for `state: 'error'` / `synchronized` underrun + reporting.** The library is the only place that knows when the + Kalman filter / chunk pipeline can't keep up; today there is no + callback for the embedder to drive the spec's underrun handshake. + Source: stream-sync-correction.md §`aiosendspin`. + +--- + +## `Sendspin/sendspin-cli` + +- 🟠 **Web UI sends `shutdown` on tab close instead of `restart`.** + `destroyPlayer()` defaults to `shutdown`; only the explicit + user-stop button sends `user_request`. A user who closes the + browser tab is reported to the server as "shut down permanently" + and is not auto-reconnected. + Source: goodbye-and-operational-state.md §divergence. +- 🟡 **Underrun does not send `client/state: 'error'`.** The + sounddevice callback at `sendspin/audio.py:432-437` detects + underrun and triggers a queue clear, but `send_player_state()` + (`audio_connector.py:456-460`) always sends `SYNCHRONIZED`. + Source: stream-sync-correction.md §`sendspin-cli`. +- ⚪ **Range coercion is silent.** Negative `static_delay_ms` is + coerced positive without a warning; out-of-range from a misbehaving + server is accepted. Either reject explicitly or log on coercion. + Source: static-delay.md §divergence. 
+ +--- + +## `Sendspin/sendspin-cpp` + +Reference SDK for most behaviors. One spec-rule gap: + +- 🟡 **Underrun does not send `client/state: 'error'`.** Detected at + `src/sync_task.cpp:683-687` and logged, but never converted to the + protocol message. `ERROR` exists in the enum but is unused. + Source: stream-sync-correction.md §`sendspin-cpp`. + +--- + +## `Sendspin/sendspin-dotnet` + +- 🟠 **Default goodbye reason is `user_request` (factory default).** + Should be `restart` so an unexplained graceful disconnect is + reconnectable; today the default disables server-side + auto-reconnect. Source: goodbye-and-operational-state.md + §divergence. +- 🟠 **`static_delay_ms` parsed but no playback wire-up visible.** + Field is parsed on `ClientStateMessage`; the audit could not find + the path that subtracts it before scheduling. Either it is missing + or it is hidden — needs a targeted audit and, if missing, a fix. + Source: static-delay.md §per-implementation summary. +- 🟠 **No `static_delay_ms` persistence across restarts.** Spec + requires the client to persist; today the embedder must re-supply + on every connect. Add an SDK-level persistence hook. + Source: static-delay.md §divergence. +- 🟠 **Internal `ErrorOccurred` / `ReanchorRequired` events not wired + to `client/state: 'error'`.** The plumbing exists; nothing + subscribes those events to emit the spec's underrun handshake. + Source: stream-sync-correction.md §`sendspin-dotnet`. +- 🟡 **Audit `volume` conversion.** Field is parsed in + `ClientStateMessage` but the audio pipeline is not reachable from + the public API in the audited version. If the curve isn't applied, + add `(vol/100)^1.5`. Source: volume-curve.md §divergence. +- 🟡 **No last-played-server persistence / no multi-server + arbitration / no auto `another_server` on switch.** Today the + decision collapses to "whichever connection happened first" and a + switch sends the wrong (or no) goodbye reason. 
Source: + reconnection-and-multi-server.md §divergence. +- 🟡 **No `external_source` API.** The state value parses but no + embedder-facing handle exists to enter/leave external-source. + Source: goodbye-and-operational-state.md §divergence. +- ⚪ **FLAC `codec_header` decode path not visible.** Field parsed on + the struct; audit could not find Base64 decode → micro-flac feed. + If absent, FLAC streams cannot start. Source: + codec-format-negotiation.md §divergence. +- ⚪ **No delta `client/state` updates.** Always sends full payload; + no test coverage for partial-update merge. Source: + goodbye-and-operational-state.md §divergence. +- ⚪ **No `stream/request-format` ever emitted.** Same gap across all + SDKs — listed here for completeness. Source: + codec-format-negotiation.md §divergence. + +--- + +## `Sendspin/sendspin-go` + +- 🔴 **Linear volume gain (`vol/100`).** `pkg/sendspin/scheduler.go` + multiplies the raw ratio into the buffer; at volume 50 outputs + 0.5 amplitude where a conformant player outputs ≈0.354. A + multi-room group containing one Go client will be audibly louder + than conformant peers. Fix: `math.Pow(v/100, 1.5)`. Source: + volume-curve.md §divergence. +- 🔴 **Server ignores `client/goodbye.reason`.** + `pkg/sendspin/server_dispatch.go:118-131` logs the reason but does + not set the retry flag. A client sending `restart` is not + auto-reconnected; a client sending `shutdown` is. One-line fix + mirroring aiosendspin's `connection.py:712`. Source: + reconnection-and-multi-server.md §divergence. +- 🔴 **Server does not handle `external_source` state.** No group + swap / `group/update` / `stream/end` on the incoming state change. + Match aiosendspin's `server/roles/controller/v1.py:93-119`. Source: + goodbye-and-operational-state.md §reference implementations. +- 🟠 **No 24-bit PCM sign-extension path.** Audio processing uses + `binary.LittleEndian` correctly for 16/32-bit only. A negotiated + 24-bit stream would be misread. 
Add a 24-bit-to-32-bit unpack with + sign-extension. Source: codec-format-negotiation.md §divergence. +- 🟡 **No drift correction in the SDK** (Kalman timestamp math only). + Whether to do this in the SDK or leave to the embedder is a design + call; flagging because peers like `sendspin-rs` do it in the SDK. + Source: stream-sync-correction.md §`sendspin-go`. +- 🟡 **No last-played-server persistence / no multi-server + arbitration / no auto `another_server` on switch / no reconnect + backoff.** Source: reconnection-and-multi-server.md §divergence. +- 🟡 **No `static_delay_ms` persistence across restarts.** Source: + static-delay.md §divergence. +- 🟡 **No `external_source` client API.** Source: + goodbye-and-operational-state.md §divergence. +- ⚪ **No `static_delay_ms` range validation (0–5000).** Source: + static-delay.md §summary. +- ⚪ **FLAC `codec_header` decode path not visible.** Source: + codec-format-negotiation.md §divergence. +- ⚪ **No delta `client/state` updates.** Source: + goodbye-and-operational-state.md §divergence. +- ⚪ **No `stream/request-format` ever emitted.** Source: + codec-format-negotiation.md §divergence. + +--- + +## `Sendspin/sendspin-js` + +- 🔴 **Linear volume gain on `GainNode`.** `src/audio/scheduler.ts:615` + sets `gainNode.gain = stateManager.volume / 100`. Should apply + `Math.pow(v/100, 1.5)` (ideally via a smoothed ramp) before + assignment. Source: volume-curve.md §divergence. +- 🟠 **Default goodbye reason is `shutdown`.** Should be `restart` so + an unexplained close is reconnectable. Source: + goodbye-and-operational-state.md §divergence. +- 🟠 **`PlayerState = "error"` defined but never emitted.** Underrun + paths never transition into it. Wire the scheduler's underrun + detection to send `client/state: 'error'` → mute → buffer → + `synchronized`. Source: stream-sync-correction.md §`sendspin-js`. 
+- 🟡 **PCM-first codec ordering.** `codec-support.ts:44-76` advertises + `[pcm, opus, flac]`, causing servers to choose uncompressed by + default and waste bandwidth. Reorder to put compressed codecs + first (subject to existing per-browser filtering). Source: + codec-format-negotiation.md §divergence. +- 🟡 **Bootstrap gate too loose.** Filter accepts after a single + measurement, combined with the most aggressive (2.0) adaptive + cutoff in the survey. Raise to `count ≥ 2 + finite covariance`. + Source: clock-synchronization.md §divergence. +- 🟡 **No last-played-server persistence / no multi-server + arbitration / no auto `another_server` on switch.** Source: + reconnection-and-multi-server.md §divergence. +- 🟡 **No `static_delay_ms` persistence across restarts.** Source: + static-delay.md §divergence. +- 🟡 **No `external_source` API.** Source: + goodbye-and-operational-state.md §divergence. +- ⚪ **24-bit PCM unsupported by Web Audio AudioContext.** Not a bug + per se, but the advertised formats should explicitly filter out + `bit_depth: 24` for PCM to prevent any 24-bit negotiation outcome. + Source: codec-format-negotiation.md §divergence. +- ⚪ **No delta `client/state` updates.** Source: + goodbye-and-operational-state.md §divergence. +- ⚪ **No `stream/request-format` ever emitted.** Source: + codec-format-negotiation.md §divergence. + +--- + +## `Sendspin/sendspin-jvm` + +- 🟠 **`PlayerStatePayload.state` is hardcoded `"synchronized"`.** + Underrun is exposed via a `StateFlow` for the embedder, but no + protocol-level error state is ever sent. Plumb the flow into the + outgoing `client/state` message. Source: + stream-sync-correction.md §`sendspin-jvm`. +- 🟠 **Audit `volume` conversion.** Currently delegated to Android + audio APIs; `AudioTrack.setVolume()` is documented linear, so + delegation is likely silently non-conformant. Apply + `(vol/100)^1.5` in the SDK. Source: volume-curve.md §divergence. 
+- 🟠 **`static_delay_ms` not applied in playback** (no pipeline + visible) and not persisted. Source: static-delay.md + §per-implementation summary. +- 🟡 **Drift correction missing in SDK** (Kalman timestamp math + only). Flagged like sendspin-go: design call. Source: + stream-sync-correction.md §`sendspin-jvm`. +- 🟡 **Late-chunk threshold is 1 s** — deliberately loose for Music + Assistant seek bursts. Worth documenting the rationale and + considering a configurable knob. Source: stream-sync-correction.md + §`sendspin-jvm`. +- 🟡 **Multi-server arbitration is only partial** (server-host + helper auto-sends `another_server` during negotiation). No + last-played persistence in the SDK proper. Source: + reconnection-and-multi-server.md §divergence. +- 🟡 **No `external_source` API.** Source: + goodbye-and-operational-state.md §divergence. +- ⚪ **FLAC `codec_header` decode path not visible.** Source: + codec-format-negotiation.md §divergence. +- ⚪ **No delta `client/state` updates.** Source: + goodbye-and-operational-state.md §divergence. +- ⚪ **No `stream/request-format` ever emitted.** Source: + codec-format-negotiation.md §divergence. + +--- + +## `Sendspin/sendspin-rs` + +- 🔴 **`static_delay_ms` parsed but never applied in playback.** + Flagged in static-delay.md as the **highest-severity bug found in + the audit**: the server factors the delay into when it sends + audio, but the client scheduler ignores it, so alignment is off + by exactly the delay amount. Thread the value from the protocol + handler through `SyncedPlayer` and subtract where the + server→local conversion happens (`src/sync/clock.rs`). Source: + static-delay.md §recommendations. +- 🟠 **`ClientSyncState::Error` defined but never sent.** + `src/protocol/messages.rs:285` defines the variant; underrun in + the audio callback emits zeros and freezes the cursor but never + transitions state. Source: stream-sync-correction.md + §`sendspin-rs`. 
+- 🟠 **Kalman drift applied unconditionally (no SNR gate).** When + the filter has just seen its first few measurements, noise- + dominated drift propagates wrong timestamps into the scheduler. + Add `drift² ≥ k² × drift_covariance` (k ≥ 2) gate; below + threshold, fall back to offset-only. Source: + clock-synchronization.md §divergence. +- 🟡 **No last-played-server persistence / no multi-server + arbitration / no auto `another_server` on switch / no reconnect + backoff.** Source: reconnection-and-multi-server.md §divergence. +- 🟡 **No `external_source` API.** Source: + goodbye-and-operational-state.md §divergence. +- ⚪ **No delta `client/state` updates.** Source: + goodbye-and-operational-state.md §divergence. +- ⚪ **No `stream/request-format` ever emitted.** Source: + codec-format-negotiation.md §divergence. + +--- + +## `Sendspin/SendspinKit` + +- 🔴 **`static_delay_ms` parsed but never applied in playback.** Same + shape as the sendspin-rs bug — server pre-compensates for the + delay, client ignores it, alignment is wrong by exactly that + amount. Subtract in `AudioScheduler` where server timestamps are + converted, alongside the existing parsing in `SendspinClient`. + Source: static-delay.md §recommendations. +- 🟠 **Default goodbye reason is `shutdown`.** Should be `restart`. + Source: goodbye-and-operational-state.md §divergence. +- 🟠 **Underrun does not send `client/state: 'error'`.** Error state + *is* sent on codec/format failures + (`SendspinClient+MessageHandling.swift:224-241`), but the audio + callback at `AudioPlayer.swift:558` only increments a counter on + underrun. Wire the underrun path to + `transitionOperationalState(to: .error)`. Source: + stream-sync-correction.md §`SendspinKit`. +- 🟡 **Bootstrap gate too loose (1 sample).** Raise to `count ≥ 2 + + finite covariance`. Source: clock-synchronization.md §divergence. 
+- 🟡 **No last-played-server persistence / no multi-server + arbitration / no auto `another_server` on switch / no reconnect + backoff visible.** Source: reconnection-and-multi-server.md + §divergence. +- ⚪ **No delta `client/state` updates.** Source: + goodbye-and-operational-state.md §divergence. +- ⚪ **No `stream/request-format` ever emitted.** Source: + codec-format-negotiation.md §divergence. + +--- + +## Cross-cutting items (not for filing as SDK issues) + +These are findings the audit raised that don't belong as per-SDK +issues. Listed here so they aren't forgotten: + +- **Spec text changes** — recommended threshold values, drift SNR + gate as MUST, default goodbye reason, perceptual volume curve, + late-chunk drop threshold, etc. File against `Sendspin/spec`. +- **Conformance harness scenarios** — drift injection, static-delay + measurement, volume calibration, multi-server arbitration, + `external_source` group swap, `stream/request-format` negotiation, + 24-bit PCM fixture. File against `Sendspin/conformance`. +- **Universal SDK gaps** — these missed-by-everyone items are + better tackled by a spec change first, then propagated: + - No SDK emits `stream/request-format`. + - No SDK sends delta `client/state` updates. + - No client SDK except `SendspinKit` exposes `external_source`. + - No SDK sends the spec's underrun `error → synchronized` + handshake (every per-SDK list above includes its own variant). 
diff --git a/src/conformance/implementations.py b/src/conformance/implementations.py index 0a988dc..82bc16f 100644 --- a/src/conformance/implementations.py +++ b/src/conformance/implementations.py @@ -65,6 +65,7 @@ "stream-sync-drift-correction", "client-time-burst-cadence", "stream-request-format-emit", + "pcm-24bit-decode", ), ), ), diff --git a/src/conformance/scenarios.py b/src/conformance/scenarios.py index 83fe957..728d7ca 100644 --- a/src/conformance/scenarios.py +++ b/src/conformance/scenarios.py @@ -177,6 +177,25 @@ ) +SERVER_INITIATED_PCM_24BIT = ScenarioSpec( + id="server-initiated-pcm-24bit", + display_name="Server initiates connection and client wants 24-bit PCM", + description=( + "Start the server first, then the client. The server loads the PCM audio " + "derived from `almost_silent.flac` and re-packs it as 24-bit (3-byte packed, " + "little-endian, two's complement). The client advertises a listener and 24-bit " + "PCM as its only supported audio format. The matrix compares canonical PCM " + "hashes after the client unpacks the 24-bit stream. A non-conformant SDK with " + "no 24-bit decode path misreads the bytes and produces a hash mismatch." 
+ ), + initiator_role="server", + preferred_codec="pcm", + required_role_families=("player",), + verification_mode="capability-only", + required_capability="pcm-24bit-decode", +) + + SERVER_INITIATED_REQUEST_FORMAT = ScenarioSpec( id="server-initiated-request-format", display_name="Client requests a mid-stream codec switch", @@ -207,6 +226,7 @@ SERVER_INITIATED_DRIFT_INJECTION, SERVER_INITIATED_BURST_CADENCE, SERVER_INITIATED_REQUEST_FORMAT, + SERVER_INITIATED_PCM_24BIT, ) SCENARIOS: dict[str, ScenarioSpec] = {scenario.id: scenario for scenario in SCENARIO_LIST} From aa51caf033078bd3e986a037f5693a9eaa512df1 Mon Sep 17 00:00:00 2001 From: Paulus Schoutsen Date: Wed, 27 May 2026 14:20:00 +0200 Subject: [PATCH 2/3] Drop unrelated docs/sdk-issues.md from the branch --- docs/sdk-issues.md | 323 --------------------------------------------- 1 file changed, 323 deletions(-) delete mode 100644 docs/sdk-issues.md diff --git a/docs/sdk-issues.md b/docs/sdk-issues.md deleted file mode 100644 index df0152c..0000000 --- a/docs/sdk-issues.md +++ /dev/null @@ -1,323 +0,0 @@ -# SDK Issues to File - -Per-SDK punch list extracted from the seven audit docs in this directory. -Each bullet here is a candidate GitHub issue against the named SDK's repo. -Spec-text recommendations and missing conformance scenarios are tracked -separately and are **not** included here — this list is only concrete, -SDK-side code work. - -Severity hint: -- **🔴 critical** — audible misbehavior or direct spec violation that - affects normal playback today -- **🟠 high** — wrong defaults, parse-but-ignore, or missing behavior the - rest of the SDK already assumes -- **🟡 medium** — missing API surface or divergence from spec where the - practical impact is bounded -- **⚪ latent** — code path that would break in a scenario not exercised - today (24-bit PCM, FLAC mid-stream, etc.) 
- -Source docs: -[stream-sync-correction.md](./stream-sync-correction.md), -[clock-synchronization.md](./clock-synchronization.md), -[codec-format-negotiation.md](./codec-format-negotiation.md), -[goodbye-and-operational-state.md](./goodbye-and-operational-state.md), -[reconnection-and-multi-server.md](./reconnection-and-multi-server.md), -[static-delay.md](./static-delay.md), -[volume-curve.md](./volume-curve.md). - ---- - -## `Sendspin/aiosendspin` - -Transport+clock library. Most "missing" behavior is by design (the -embedder owns playback). No code bugs identified — see `sendspin-cli` -for the Python embedder's followups. - -One potential SDK-shaped followup, not strictly a bug: - -- ⚪ **Expose a hook for `state: 'error'` / `synchronized` underrun - reporting.** The library is the only place that knows when the - Kalman filter / chunk pipeline can't keep up; today there is no - callback for the embedder to drive the spec's underrun handshake. - Source: stream-sync-correction.md §`aiosendspin`. - ---- - -## `Sendspin/sendspin-cli` - -- 🟠 **Web UI sends `shutdown` on tab close instead of `restart`.** - `destroyPlayer()` defaults to `shutdown`; only the explicit - user-stop button sends `user_request`. A user who closes the - browser tab is reported to the server as "shut down permanently" - and is not auto-reconnected. - Source: goodbye-and-operational-state.md §divergence. -- 🟡 **Underrun does not send `client/state: 'error'`.** The - sounddevice callback at `sendspin/audio.py:432-437` detects - underrun and triggers a queue clear, but `send_player_state()` - (`audio_connector.py:456-460`) always sends `SYNCHRONIZED`. - Source: stream-sync-correction.md §`sendspin-cli`. -- ⚪ **Range coercion is silent.** Negative `static_delay_ms` is - coerced positive without a warning; out-of-range from a misbehaving - server is accepted. Either reject explicitly or log on coercion. - Source: static-delay.md §divergence. 
- ---- - -## `Sendspin/sendspin-cpp` - -Reference SDK for most behaviors. One spec-rule gap: - -- 🟡 **Underrun does not send `client/state: 'error'`.** Detected at - `src/sync_task.cpp:683-687` and logged, but never converted to the - protocol message. `ERROR` exists in the enum but is unused. - Source: stream-sync-correction.md §`sendspin-cpp`. - ---- - -## `Sendspin/sendspin-dotnet` - -- 🟠 **Default goodbye reason is `user_request` (factory default).** - Should be `restart` so an unexplained graceful disconnect is - reconnectable; today the default disables server-side - auto-reconnect. Source: goodbye-and-operational-state.md - §divergence. -- 🟠 **`static_delay_ms` parsed but no playback wire-up visible.** - Field is parsed on `ClientStateMessage`; the audit could not find - the path that subtracts it before scheduling. Either it is missing - or it is hidden — needs a targeted audit and, if missing, a fix. - Source: static-delay.md §per-implementation summary. -- 🟠 **No `static_delay_ms` persistence across restarts.** Spec - requires the client to persist; today the embedder must re-supply - on every connect. Add an SDK-level persistence hook. - Source: static-delay.md §divergence. -- 🟠 **Internal `ErrorOccurred` / `ReanchorRequired` events not wired - to `client/state: 'error'`.** The plumbing exists; nothing - subscribes those events to emit the spec's underrun handshake. - Source: stream-sync-correction.md §`sendspin-dotnet`. -- 🟡 **Audit `volume` conversion.** Field is parsed in - `ClientStateMessage` but the audio pipeline is not reachable from - the public API in the audited version. If the curve isn't applied, - add `(vol/100)^1.5`. Source: volume-curve.md §divergence. -- 🟡 **No last-played-server persistence / no multi-server - arbitration / no auto `another_server` on switch.** Today the - decision collapses to "whichever connection happened first" and a - switch sends the wrong (or no) goodbye reason. 
Source: - reconnection-and-multi-server.md §divergence. -- 🟡 **No `external_source` API.** The state value parses but no - embedder-facing handle exists to enter/leave external-source. - Source: goodbye-and-operational-state.md §divergence. -- ⚪ **FLAC `codec_header` decode path not visible.** Field parsed on - the struct; audit could not find Base64 decode → micro-flac feed. - If absent, FLAC streams cannot start. Source: - codec-format-negotiation.md §divergence. -- ⚪ **No delta `client/state` updates.** Always sends full payload; - no test coverage for partial-update merge. Source: - goodbye-and-operational-state.md §divergence. -- ⚪ **No `stream/request-format` ever emitted.** Same gap across all - SDKs — listed here for completeness. Source: - codec-format-negotiation.md §divergence. - ---- - -## `Sendspin/sendspin-go` - -- 🔴 **Linear volume gain (`vol/100`).** `pkg/sendspin/scheduler.go` - multiplies the raw ratio into the buffer; at volume 50 outputs - 0.5 amplitude where a conformant player outputs ≈0.354. A - multi-room group containing one Go client will be audibly louder - than conformant peers. Fix: `math.Pow(v/100, 1.5)`. Source: - volume-curve.md §divergence. -- 🔴 **Server ignores `client/goodbye.reason`.** - `pkg/sendspin/server_dispatch.go:118-131` logs the reason but does - not set the retry flag. A client sending `restart` is not - auto-reconnected; a client sending `shutdown` is. One-line fix - mirroring aiosendspin's `connection.py:712`. Source: - reconnection-and-multi-server.md §divergence. -- 🔴 **Server does not handle `external_source` state.** No group - swap / `group/update` / `stream/end` on the incoming state change. - Match aiosendspin's `server/roles/controller/v1.py:93-119`. Source: - goodbye-and-operational-state.md §reference implementations. -- 🟠 **No 24-bit PCM sign-extension path.** Audio processing uses - `binary.LittleEndian` correctly for 16/32-bit only. A negotiated - 24-bit stream would be misread. 
Add a 24-bit-to-32-bit unpack with - sign-extension. Source: codec-format-negotiation.md §divergence. -- 🟡 **No drift correction in the SDK** (Kalman timestamp math only). - Whether to do this in the SDK or leave to the embedder is a design - call; flagging because peers like `sendspin-rs` do it in the SDK. - Source: stream-sync-correction.md §`sendspin-go`. -- 🟡 **No last-played-server persistence / no multi-server - arbitration / no auto `another_server` on switch / no reconnect - backoff.** Source: reconnection-and-multi-server.md §divergence. -- 🟡 **No `static_delay_ms` persistence across restarts.** Source: - static-delay.md §divergence. -- 🟡 **No `external_source` client API.** Source: - goodbye-and-operational-state.md §divergence. -- ⚪ **No `static_delay_ms` range validation (0–5000).** Source: - static-delay.md §summary. -- ⚪ **FLAC `codec_header` decode path not visible.** Source: - codec-format-negotiation.md §divergence. -- ⚪ **No delta `client/state` updates.** Source: - goodbye-and-operational-state.md §divergence. -- ⚪ **No `stream/request-format` ever emitted.** Source: - codec-format-negotiation.md §divergence. - ---- - -## `Sendspin/sendspin-js` - -- 🔴 **Linear volume gain on `GainNode`.** `src/audio/scheduler.ts:615` - sets `gainNode.gain = stateManager.volume / 100`. Should apply - `Math.pow(v/100, 1.5)` (ideally via a smoothed ramp) before - assignment. Source: volume-curve.md §divergence. -- 🟠 **Default goodbye reason is `shutdown`.** Should be `restart` so - an unexplained close is reconnectable. Source: - goodbye-and-operational-state.md §divergence. -- 🟠 **`PlayerState = "error"` defined but never emitted.** Underrun - paths never transition into it. Wire the scheduler's underrun - detection to send `client/state: 'error'` → mute → buffer → - `synchronized`. Source: stream-sync-correction.md §`sendspin-js`. 
-- 🟡 **PCM-first codec ordering.** `codec-support.ts:44-76` advertises - `[pcm, opus, flac]`, causing servers to choose uncompressed by - default and waste bandwidth. Reorder to put compressed codecs - first (subject to existing per-browser filtering). Source: - codec-format-negotiation.md §divergence. -- 🟡 **Bootstrap gate too loose.** Filter accepts after a single - measurement, combined with the most aggressive (2.0) adaptive - cutoff in the survey. Raise to `count ≥ 2 + finite covariance`. - Source: clock-synchronization.md §divergence. -- 🟡 **No last-played-server persistence / no multi-server - arbitration / no auto `another_server` on switch.** Source: - reconnection-and-multi-server.md §divergence. -- 🟡 **No `static_delay_ms` persistence across restarts.** Source: - static-delay.md §divergence. -- 🟡 **No `external_source` API.** Source: - goodbye-and-operational-state.md §divergence. -- ⚪ **24-bit PCM unsupported by Web Audio AudioContext.** Not a bug - per se, but the advertised formats should explicitly filter out - `bit_depth: 24` for PCM to prevent any 24-bit negotiation outcome. - Source: codec-format-negotiation.md §divergence. -- ⚪ **No delta `client/state` updates.** Source: - goodbye-and-operational-state.md §divergence. -- ⚪ **No `stream/request-format` ever emitted.** Source: - codec-format-negotiation.md §divergence. - ---- - -## `Sendspin/sendspin-jvm` - -- 🟠 **`PlayerStatePayload.state` is hardcoded `"synchronized"`.** - Underrun is exposed via a `StateFlow` for the embedder, but no - protocol-level error state is ever sent. Plumb the flow into the - outgoing `client/state` message. Source: - stream-sync-correction.md §`sendspin-jvm`. -- 🟠 **Audit `volume` conversion.** Currently delegated to Android - audio APIs; `AudioTrack.setVolume()` is documented linear, so - delegation is likely silently non-conformant. Apply - `(vol/100)^1.5` in the SDK. Source: volume-curve.md §divergence. 
-- 🟠 **`static_delay_ms` not applied in playback** (no pipeline - visible) and not persisted. Source: static-delay.md - §per-implementation summary. -- 🟡 **Drift correction missing in SDK** (Kalman timestamp math - only). Flagged like sendspin-go: design call. Source: - stream-sync-correction.md §`sendspin-jvm`. -- 🟡 **Late-chunk threshold is 1 s** — deliberately loose for Music - Assistant seek bursts. Worth documenting the rationale and - considering a configurable knob. Source: stream-sync-correction.md - §`sendspin-jvm`. -- 🟡 **Multi-server arbitration is only partial** (server-host - helper auto-sends `another_server` during negotiation). No - last-played persistence in the SDK proper. Source: - reconnection-and-multi-server.md §divergence. -- 🟡 **No `external_source` API.** Source: - goodbye-and-operational-state.md §divergence. -- ⚪ **FLAC `codec_header` decode path not visible.** Source: - codec-format-negotiation.md §divergence. -- ⚪ **No delta `client/state` updates.** Source: - goodbye-and-operational-state.md §divergence. -- ⚪ **No `stream/request-format` ever emitted.** Source: - codec-format-negotiation.md §divergence. - ---- - -## `Sendspin/sendspin-rs` - -- 🔴 **`static_delay_ms` parsed but never applied in playback.** - Flagged in static-delay.md as the **highest-severity bug found in - the audit**: the server factors the delay into when it sends - audio, but the client scheduler ignores it, so alignment is off - by exactly the delay amount. Thread the value from the protocol - handler through `SyncedPlayer` and subtract where the - server→local conversion happens (`src/sync/clock.rs`). Source: - static-delay.md §recommendations. -- 🟠 **`ClientSyncState::Error` defined but never sent.** - `src/protocol/messages.rs:285` defines the variant; underrun in - the audio callback emits zeros and freezes the cursor but never - transitions state. Source: stream-sync-correction.md - §`sendspin-rs`. 
-- 🟠 **Kalman drift applied unconditionally (no SNR gate).** When - the filter has just seen its first few measurements, noise- - dominated drift propagates wrong timestamps into the scheduler. - Add `drift² ≥ k² × drift_covariance` (k ≥ 2) gate; below - threshold, fall back to offset-only. Source: - clock-synchronization.md §divergence. -- 🟡 **No last-played-server persistence / no multi-server - arbitration / no auto `another_server` on switch / no reconnect - backoff.** Source: reconnection-and-multi-server.md §divergence. -- 🟡 **No `external_source` API.** Source: - goodbye-and-operational-state.md §divergence. -- ⚪ **No delta `client/state` updates.** Source: - goodbye-and-operational-state.md §divergence. -- ⚪ **No `stream/request-format` ever emitted.** Source: - codec-format-negotiation.md §divergence. - ---- - -## `Sendspin/SendspinKit` - -- 🔴 **`static_delay_ms` parsed but never applied in playback.** Same - shape as the sendspin-rs bug — server pre-compensates for the - delay, client ignores it, alignment is wrong by exactly that - amount. Subtract in `AudioScheduler` where server timestamps are - converted, alongside the existing parsing in `SendspinClient`. - Source: static-delay.md §recommendations. -- 🟠 **Default goodbye reason is `shutdown`.** Should be `restart`. - Source: goodbye-and-operational-state.md §divergence. -- 🟠 **Underrun does not send `client/state: 'error'`.** Error state - *is* sent on codec/format failures - (`SendspinClient+MessageHandling.swift:224-241`), but the audio - callback at `AudioPlayer.swift:558` only increments a counter on - underrun. Wire the underrun path to - `transitionOperationalState(to: .error)`. Source: - stream-sync-correction.md §`SendspinKit`. -- 🟡 **Bootstrap gate too loose (1 sample).** Raise to `count ≥ 2 + - finite covariance`. Source: clock-synchronization.md §divergence. 
-- 🟡 **No last-played-server persistence / no multi-server - arbitration / no auto `another_server` on switch / no reconnect - backoff visible.** Source: reconnection-and-multi-server.md - §divergence. -- ⚪ **No delta `client/state` updates.** Source: - goodbye-and-operational-state.md §divergence. -- ⚪ **No `stream/request-format` ever emitted.** Source: - codec-format-negotiation.md §divergence. - ---- - -## Cross-cutting items (not for filing as SDK issues) - -These are findings the audit raised that don't belong as per-SDK -issues. Listed here so they aren't forgotten: - -- **Spec text changes** — recommended threshold values, drift SNR - gate as MUST, default goodbye reason, perceptual volume curve, - late-chunk drop threshold, etc. File against `Sendspin/spec`. -- **Conformance harness scenarios** — drift injection, static-delay - measurement, volume calibration, multi-server arbitration, - `external_source` group swap, `stream/request-format` negotiation, - 24-bit PCM fixture. File against `Sendspin/conformance`. -- **Universal SDK gaps** — these missed-by-everyone items are - better tackled by a spec change first, then propagated: - - No SDK emits `stream/request-format`. - - No SDK sends delta `client/state` updates. - - No client SDK except `SendspinKit` exposes `external_source`. - - No SDK sends the spec's underrun `error → synchronized` - handshake (every per-SDK list above includes its own variant). From dc45995fda3aad36965f4bc8bcf38c5d480391af Mon Sep 17 00:00:00 2001 From: Paulus Schoutsen Date: Wed, 27 May 2026 14:48:17 +0200 Subject: [PATCH 3/3] Wire the 24-bit PCM test body through the aiosendspin adapters MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The server adapter feeds 32-bit source PCM into the SDK with AudioFormat(bit_depth=32). The aiosendspin PCM pipeline negotiates the 24-bit packed wire format from the client, resamples s32→s32 through PyAV, and then converts the output to s24 on the wire. 
The SDK's source-format path expects PyAV-compatible bytes (which means s32, since PyAV does not have a packed-s24 sample format) — feeding s24 directly would fail with "got N bytes; need 4N/3 bytes" inside PyAV. The 16→32-bit shift preserves the float-domain hash, so the existing audio-pcm verification continues to work end-to-end. The client adapter advertises bit_depth=24 at the fixture's native sample_rate/channels (8000 Hz mono) so the SDK does not resample during the round trip; only the bit depth changes. aiosendspin's client RoleSpec declares the new pcm-24bit-decode capability, and the scenario flips its verification_mode to audio-pcm so the float hashes are actually compared. --- .../adapters/aiosendspin_client.py | 21 +++++++++- .../adapters/aiosendspin_server.py | 39 +++++++++++++++++-- src/conformance/implementations.py | 1 + src/conformance/scenarios.py | 2 +- 4 files changed, 56 insertions(+), 7 deletions(-) diff --git a/src/conformance/adapters/aiosendspin_client.py b/src/conformance/adapters/aiosendspin_client.py index 5a322d2..4558eed 100644 --- a/src/conformance/adapters/aiosendspin_client.py +++ b/src/conformance/adapters/aiosendspin_client.py @@ -63,7 +63,7 @@ def build_parser() -> argparse.ArgumentParser: return parser -def _supported_formats(preferred_codec: str) -> list[Any]: +def _supported_formats(preferred_codec: str, *, scenario_id: str = "") -> list[Any]: from aiosendspin.models.player import SupportedAudioFormat from aiosendspin.models.types import AudioCodec @@ -90,6 +90,18 @@ def _supported_formats(preferred_codec: str) -> list[Any]: bit_depth=16, ), ] + if scenario_id == "server-initiated-pcm-24bit": + # Match the fixture's native rate/channels so the SDK doesn't resample + # — only the bit depth changes — making the float-domain PCM hash + # round-trip identically through the 16→32→24 conversion.
+ return [ + SupportedAudioFormat( + codec=codec, + channels=1, + sample_rate=8_000, + bit_depth=24, + ), + ] return [ SupportedAudioFormat( codec=codec, @@ -301,11 +313,14 @@ def on_artwork_chunk(channel: int, data: bytes) -> None: if args.scenario_id in { "client-initiated-pcm", "server-initiated-pcm", + "server-initiated-pcm-24bit", "server-initiated-flac", "server-initiated-opus", }: player_support = ClientHelloPlayerSupport( - supported_formats=_supported_formats(args.preferred_codec), + supported_formats=_supported_formats( + args.preferred_codec, scenario_id=args.scenario_id + ), buffer_capacity=2_000_000, supported_commands=[PlayerCommand.VOLUME, PlayerCommand.MUTE], ) @@ -351,6 +366,7 @@ def on_artwork_chunk(channel: int, data: bytes) -> None: if args.scenario_id in { "client-initiated-pcm", "server-initiated-pcm", + "server-initiated-pcm-24bit", "server-initiated-flac", "server-initiated-opus", }: @@ -468,6 +484,7 @@ async def handle_connection(ws: Any) -> None: if args.scenario_id in { "client-initiated-pcm", "server-initiated-pcm", + "server-initiated-pcm-24bit", "server-initiated-flac", "server-initiated-opus", }: diff --git a/src/conformance/adapters/aiosendspin_server.py b/src/conformance/adapters/aiosendspin_server.py index f8bff56..984202b 100644 --- a/src/conformance/adapters/aiosendspin_server.py +++ b/src/conformance/adapters/aiosendspin_server.py @@ -124,6 +124,25 @@ async def _wait_for_incoming_client( raise TimeoutError(f"Timed out waiting for client {client_name!r}") +def _expand_pcm_16_to_32(pcm_16_bytes: bytes) -> bytes: + """Re-pack signed 16-bit little-endian PCM as signed 32-bit little-endian. + + Each 16-bit sample becomes a 32-bit sample of equal float value by + left-shifting 16 bits — the original two bytes become the top half of the + 32-bit sample, and two zero bytes are inserted in the low half. + ``FloatPcmHasher`` produces identical float-domain hashes for the two + representations, so existing PCM verification still works. 
The aiosendspin + SDK uses 32-bit PCM as the PyAV-compatible source for any non-16-bit + target wire format, including the 24-bit packed wire format used by the + server-initiated-pcm-24bit scenario. + """ + sample_count = len(pcm_16_bytes) // 2 + out = bytearray(sample_count * 4) + out[2::4] = pcm_16_bytes[0::2] + out[3::4] = pcm_16_bytes[1::2] + return bytes(out) + + def _iter_pcm_blocks( pcm_bytes: bytes, *, @@ -305,6 +324,17 @@ async def _run_audio_scenario(args: argparse.Namespace, *, server: Any, client: from aiosendspin.server.audio import AudioFormat fixture = decode_fixture(Path(args.fixture), max_duration_seconds=args.clip_seconds) + wire_bit_depth = fixture.bit_depth + source_bit_depth = fixture.bit_depth + source_pcm_bytes = fixture.pcm_bytes + if args.scenario_id == "server-initiated-pcm-24bit": + # The SDK ingests source PCM as s32 (PyAV-compatible) and emits the + # negotiated 24-bit packed format on the wire. Provide s32 input here + # and declare bit_depth=32 as the source format; the target bit depth + # is reported by the player negotiation in stream/start. 
+ source_pcm_bytes = _expand_pcm_16_to_32(fixture.pcm_bytes) + source_bit_depth = 32 + wire_bit_depth = 24 frame_alignment_samples: int | None = None trimmed_source_frames = 0 if args.preferred_codec == "flac": @@ -388,16 +418,16 @@ def send_binary_wrapper( stream = client.group.start_stream() audio_format = AudioFormat( sample_rate=fixture.sample_rate, - bit_depth=fixture.bit_depth, + bit_depth=source_bit_depth, channels=fixture.channels, ) next_play_start_us = server.clock.now_us() + 250_000 total_duration_us = 0 for chunk, duration_us in _iter_pcm_blocks( - fixture.pcm_bytes, + source_pcm_bytes, sample_rate=fixture.sample_rate, channels=fixture.channels, - bit_depth=fixture.bit_depth, + bit_depth=source_bit_depth, ): stream.prepare_audio(chunk, audio_format) play_start_us = await stream.commit_audio(play_start_us=next_play_start_us) @@ -425,7 +455,7 @@ def send_binary_wrapper( "clip_seconds": args.clip_seconds, "sample_rate": fixture.sample_rate, "channels": fixture.channels, - "bit_depth": fixture.bit_depth, + "bit_depth": wire_bit_depth, "frame_count": fixture.frame_count, "duration_seconds": fixture.duration_seconds, "frame_alignment_samples": frame_alignment_samples, @@ -542,6 +572,7 @@ async def _scenario_payload( if args.scenario_id in { "client-initiated-pcm", "server-initiated-pcm", + "server-initiated-pcm-24bit", "server-initiated-flac", "server-initiated-opus", }: diff --git a/src/conformance/implementations.py b/src/conformance/implementations.py index 82bc16f..f335d7f 100644 --- a/src/conformance/implementations.py +++ b/src/conformance/implementations.py @@ -49,6 +49,7 @@ supports_opus=True, supports_discovery=True, supported_role_families=("player", "metadata", "controller", "artwork"), + supported_capabilities=("pcm-24bit-decode",), ), server=RoleSpec( supported=True, diff --git a/src/conformance/scenarios.py b/src/conformance/scenarios.py index 728d7ca..109ce14 100644 --- a/src/conformance/scenarios.py +++ b/src/conformance/scenarios.py @@ -191,7 
+191,7 @@ initiator_role="server", preferred_codec="pcm", required_role_families=("player",), - verification_mode="capability-only", + verification_mode="audio-pcm", required_capability="pcm-24bit-decode", )
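
Reviewer note: the 16→32→24 round trip this patch relies on can be checked in isolation. The following is a minimal sketch, not the aiosendspin implementation. `expand_16_to_32` mirrors the patch's `_expand_pcm_16_to_32` (with an explicit slice stop so a trailing odd byte is ignored); `pack_32_to_24` and `to_floats` are hypothetical helpers. It assumes the packed 24-bit wire format is obtained by dropping the low byte of each s32 sample and that float normalisation divides by 2^(bit_depth-1); how `FloatPcmHasher` actually normalises is not shown in the patch.

```python
import struct


def expand_16_to_32(pcm_16: bytes) -> bytes:
    """Left-shift each s16le sample by 16 bits into an s32le sample."""
    sample_count = len(pcm_16) // 2
    out = bytearray(sample_count * 4)
    # The two original bytes become the top half; the low half stays zero.
    out[2::4] = pcm_16[0 : sample_count * 2 : 2]
    out[3::4] = pcm_16[1 : sample_count * 2 : 2]
    return bytes(out)


def pack_32_to_24(pcm_32: bytes) -> bytes:
    """Hypothetical helper: keep the top three bytes of each s32le sample."""
    sample_count = len(pcm_32) // 4
    out = bytearray(sample_count * 3)
    out[0::3] = pcm_32[1 : sample_count * 4 : 4]
    out[1::3] = pcm_32[2 : sample_count * 4 : 4]
    out[2::3] = pcm_32[3 : sample_count * 4 : 4]
    return bytes(out)


def to_floats(pcm: bytes, bit_depth: int) -> list[float]:
    """Decode little-endian two's-complement PCM to floats in [-1.0, 1.0)."""
    width = bit_depth // 8
    scale = float(1 << (bit_depth - 1))
    return [
        int.from_bytes(pcm[i : i + width], "little", signed=True) / scale
        for i in range(0, len(pcm) - width + 1, width)
    ]


samples = [0, 1, -1, 12345, -12345, 32767, -32768]
pcm_16 = struct.pack(f"<{len(samples)}h", *samples)
pcm_32 = expand_16_to_32(pcm_16)
pcm_24 = pack_32_to_24(pcm_32)

# All three representations decode to the same float values.
floats_match = (
    to_floats(pcm_16, 16) == to_floats(pcm_32, 32) == to_floats(pcm_24, 24)
)
```

Because every step is a pure shift by a whole number of bytes, the division by a power of two is exact, which is consistent with the claim that the float-domain comparison survives the bit-depth change when no resampling occurs.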