
Add visualizer@v1 role #86

Open
maximmaxim345 wants to merge 1 commit into main from feat/visualizer-r2


@maximmaxim345
Member

@maximmaxim345 maximmaxim345 commented May 21, 2026

Adds the visualizer role, based on the previous visualizer@_draft_r1 proposal.

This alternative version aims to resolve a couple of gaps that were open in the previous PR:

One frame per binary message

Batching multiple frames of mixed types into one WebSocket message, as in the previous proposal, forces the server to either delay frames while waiting for siblings (hurting low-latency playback) or send tiny batches anyway. It also makes ordering awkward across batches and pushes sorting work onto the client. Per-type messages keep ordering trivial and are more consistent with how other roles structure their binary messages.

The only concern with this approach is the increased number of messages. We could alternatively batch multiple message types with the same timestamp together.
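As a rough illustration of why one frame per message keeps the client simple, here is a minimal dispatch sketch. The header layout (a 1-byte type followed by an 8-byte big-endian timestamp in microseconds) and the helper names `split_frame` and `dispatch` are assumptions made for this example; the actual layout is whatever the README in this PR defines.

```python
import struct
from typing import Callable

# Assumed header layout (hypothetical, for illustration only):
# 1 byte message type, then an 8-byte big-endian signed timestamp in microseconds.
HEADER = struct.Struct(">Bq")

PITCH_TYPE = 21  # the only type number visible in the README excerpt of this PR


def split_frame(message: bytes) -> tuple[int, int, bytes]:
    """Split one binary message into (type, timestamp_us, payload)."""
    if len(message) < HEADER.size:
        raise ValueError("message shorter than header")
    msg_type, timestamp_us = HEADER.unpack_from(message)
    return msg_type, timestamp_us, message[HEADER.size:]


def dispatch(message: bytes, handlers: dict[int, Callable[[int, bytes], None]]) -> bool:
    """Route one message to the handler for its type; unknown types are ignored."""
    msg_type, timestamp_us, payload = split_frame(message)
    handler = handlers.get(msg_type)
    if handler is None:
        return False
    handler(timestamp_us, payload)
    return True
```

Because each message carries exactly one frame of one type, the client needs no inner loop over a batch and no re-sorting step.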

Added peak and pitch

Both give clients more to work with for more reactive effects. They do not replace the existing types, since they measure different things: peak fires on any transient independent of the musical grid (whereas beat is rhythmic), and pitch tracks the perceived fundamental (whereas f_peak tracks the dominant FFT bin, which strong harmonics can hijack).

Downbeat flag on beat

Lets clients drive bar-aware effects. stream/start advertises tracks_downbeats so clients know whether to trust the bit. Accurate beat detection is hard and often relies on offline analysis, so servers without it omit the beat type entirely. Even when supported, it may be unavailable for some content (live streams, sparse non-percussive material).
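A client-side sketch of how the downbeat bit and `tracks_downbeats` might be combined. The bit position (`DOWNBEAT_BIT`) and the `BarCounter` helper are hypothetical; only the idea of ignoring the bit when the server does not advertise `tracks_downbeats` comes from the description above.

```python
DOWNBEAT_BIT = 0x01  # assumed bit position; the real layout is defined by the spec


def is_downbeat(flags: int, tracks_downbeats: bool) -> bool:
    """Return True only when the server advertises downbeat tracking and the bit is set."""
    return tracks_downbeats and bool(flags & DOWNBEAT_BIT)


class BarCounter:
    """Counts beats within the current bar, restarting on trusted downbeats."""

    def __init__(self, tracks_downbeats: bool) -> None:
        self.tracks_downbeats = tracks_downbeats
        self.beat_in_bar = 0  # 0 until the first beat arrives

    def on_beat(self, flags: int) -> int:
        """Advance the counter for one beat and return the position in the bar."""
        if is_downbeat(flags, self.tracks_downbeats):
            self.beat_in_bar = 1
        else:
            self.beat_in_bar += 1
        return self.beat_in_bar
```

When `tracks_downbeats` is false, the counter simply keeps incrementing, so a bar-aware effect would have to fall back to plain beat-driven behavior.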

Top-level rate_max

Now bounds all periodic types, not just spectrum. beat and peak are event-driven and unthrottled.
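One way a server could apply this is sketched below. It assumes `rate_max` is a per-type limit in frames per second and that frames are identified by a type name; both are assumptions for illustration, and the `RateLimiter` class is not part of any existing implementation.

```python
class RateLimiter:
    """Drops periodic frames that arrive faster than rate_max; events always pass."""

    def __init__(
        self,
        rate_max: float,
        event_types: frozenset[str] = frozenset({"beat", "peak"}),
    ) -> None:
        if rate_max <= 0:
            raise ValueError("rate_max must be positive")
        # Minimum spacing between two frames of the same periodic type.
        self._min_interval_us = int(1_000_000 / rate_max)
        self._event_types = event_types
        self._last_sent_us: dict[str, int] = {}

    def allow(self, frame_type: str, timestamp_us: int) -> bool:
        """Return True if a frame of this type may be sent at this timestamp."""
        if frame_type in self._event_types:
            return True  # event-driven types are unthrottled
        last = self._last_sent_us.get(frame_type)
        if last is not None and timestamp_us - last < self._min_interval_us:
            return False
        self._last_sent_us[frame_type] = timestamp_us
        return True
```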

Scaling

Pins down what was previously hand-waved as "perceptual weighting" so implementations agree on the numbers.

Version name

Called @v1 here, but we might rename it to @_draft_r2 if concerns come up after implementing a prototype. This is also why the PR is marked as draft for now.

@jhollowe
Contributor

@Aircoookie this looks to be the successor to #28. Does this include all the information you would like available?

maximmaxim345 added a commit to Sendspin/aiosendspin that referenced this pull request May 29, 2026
Implements the `visualizer@v1` role: the server computes audio features
(loudness, spectrum, dominant frequency, onset peaks, pitch, and beats)
and streams them to visualizer clients as per-frame binary messages,
replacing the batched `_draft_r1` blob. Legacy `visualizer@_draft_r1`
clients are still accepted.

Spec:

* Sendspin/spec#86

Beats come from server-fed offline analysis via `append_beat_schedule`
and ride the wire interleaved with periodic frames in timestamp order.
`beat` is deferred from `stream/start` until the first schedule lands.

### Late join and pacing
A visualizer grouped onto an active stream now receives buffered audio
immediately instead of waiting up to the producer-buffer depth (about
30s) for the first FFT frames. While a beat schedule is still computing,
periodic frames are not sent too far in advance (only 3s or so). This
ensures that all visualizer data keeps having non-decreasing timestamps
and still appears as fast as possible for clients.
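
The pacing idea can be sketched roughly as follows. This is not the aiosendspin code; the `Frame` tuple, the function names, and the exact 3-second lookahead constant are assumptions chosen to mirror the description above.

```python
import heapq
from typing import Iterable, Iterator

Frame = tuple[int, str]  # (timestamp_us, kind), hypothetical representation


def interleave(periodic: Iterable[Frame], beats: Iterable[Frame]) -> Iterator[Frame]:
    """Merge two timestamp-sorted streams into one non-decreasing stream."""
    return heapq.merge(periodic, beats)


def sendable(
    frames: Iterable[Frame],
    playhead_us: int,
    schedule_pending: bool,
    lookahead_us: int = 3_000_000,
) -> list[Frame]:
    """While the beat schedule is still computing, hold back frames too far ahead.

    Holding them back leaves room to interleave beats that arrive later
    without ever sending a timestamp older than one already sent.
    """
    if not schedule_pending:
        return list(frames)
    limit = playhead_us + lookahead_us
    return [frame for frame in frames if frame[0] <= limit]
```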

### Optional `pitch` computation
`SendspinServer.set_visualizer_pitch_enabled(enabled=False)` drops the
heaviest feature (YINFFT) server-wide in case it turns out to be too
computationally intensive. The quality of the `pitch` data and of the
algorithm itself also needs more testing.
All other data is rather simple to compute since FFT constants (Hanning
window, frequency grid, spectrum bin assignment) are cached so steady
per-frame cost stays low.
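
A minimal sketch of that caching pattern, using only the standard library. The function names and the use of `functools.lru_cache` are assumptions for illustration and do not describe how aiosendspin actually stores its constants.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=8)
def hann_window(size: int) -> tuple[float, ...]:
    """Symmetric Hann window, computed once per window size."""
    if size < 2:
        raise ValueError("size must be at least 2")
    return tuple(
        0.5 - 0.5 * math.cos(2 * math.pi * n / (size - 1)) for n in range(size)
    )


@lru_cache(maxsize=8)
def frequency_grid(size: int, sample_rate: int) -> tuple[float, ...]:
    """Center frequency in Hz of each real-FFT bin, computed once per configuration."""
    return tuple(k * sample_rate / size for k in range(size // 2 + 1))
```

After the first frame, each call returns the cached tuple, so the per-frame work reduces to the windowed multiply and the FFT itself.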

### Breaking changes

`Roles.VISUALIZER` switches to `"visualizer@v1"` and the exported
visualizer models (`VisualizerFrame`, `ClientHelloVisualizerSupport`,
`StreamStartVisualizer.from_support`) are updated to the newer spec
version. Connected clients stay backwards-compatible.
`visualizer@_draft_r1` remains registered and keeps working as before.
@maximmaxim345 maximmaxim345 marked this pull request as ready for review May 29, 2026 15:59
Comment thread README.md

When [`stream/clear`](#server--client-streamclear) includes the visualizer role, clients should clear all buffered visualization data and continue with data received after this message.

### Server → Client: Visualization Data (Binary)
Member Author


Another potential concern is the number of messages we are sending.
I don't think the per-message overhead of WebSocket is too large, though, so this just needs testing on lower-powered hardware.

There are two reasons why messages are completely split now:

  • Consistency with other roles: all other roles already use one message per datum.
  • Difficulty of defining batching behavior. Requiring batching of multiple messages is difficult since it's always a compromise between latency and message count. But leaving batching up to the server would cause most implementations to never use it, defeating the whole purpose.
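
To put a number on the framing cost, the sketch below computes the WebSocket header size per message according to the RFC 6455 framing rules. It deliberately ignores TCP/IP, TLS, and CPU cost per message, which is where low-powered devices may actually struggle; the function names are made up for this example.

```python
def ws_frame_overhead(payload_len: int, masked: bool = False) -> int:
    """Header bytes for a single unfragmented WebSocket frame (RFC 6455).

    Server-to-client frames are unmasked; client-to-server frames add a 4-byte mask.
    """
    if payload_len < 0:
        raise ValueError("payload_len must be non-negative")
    overhead = 2  # FIN/opcode byte plus mask bit and 7-bit length
    if payload_len > 65535:
        overhead += 8  # 64-bit extended payload length
    elif payload_len > 125:
        overhead += 2  # 16-bit extended payload length
    if masked:
        overhead += 4
    return overhead


def header_bytes_per_second(payload_len: int, messages_per_second: float) -> float:
    """Framing overhead only, for server-to-client messages of a fixed size."""
    return ws_frame_overhead(payload_len) * messages_per_second
```

For small visualizer frames the header is 2 bytes per message, so splitting a batch of N frames into N messages adds roughly 2 × (N − 1) bytes of WebSocket framing.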

Member Author


In case this is really a problem, we could also later release a visualizer@v2 if the overhead turns out to be bigger than expected.

This just needs to be tested (with encryption) on an ESP8266 or similar.

MarvinSchenkel pushed a commit to music-assistant/server that referenced this pull request Jun 1, 2026
#4042)

Implements the `visualizer@v1` role on the Sendspin server and updates
Music Assistant to aiosendspin 6.0.1. The older `visualizer@_draft_r1`
role is left unchanged and still carries no beats. Clients connecting
with the `visualizer@_draft_r1` role still remain fully functional.

## Visualizer

The Sendspin player derives a per-track beat schedule from `smart_fades`
analysis and streams it to visualizer clients over `visualizer@v1`. The
Hue Entertainment plugin is reworked to use the v1 role, including
beat-based colour cycling with selectable modes.
Older clients using `visualizer@_draft_r1` stay fully compatible after
this PR.

## Player timing

Reports each player's lead-time and live-source hints to the push stream
so it schedules the first chunk far enough ahead. This stops the AirPlay
bridge from cutting off the start of a track.
Older clients not implementing this stay fully compatible after this PR.

## Repeat and shuffle

Repeat and shuffle now ride on controller state, following the
aiosendspin 6.0 move off metadata. The server sets the controller state
and keeps the legacy metadata copy in sync so older and newer clients
both work.

## Known limitations

The Hue plugin's beat effects need the track's beat analysis. On the
first play of a track whose analysis has not been computed yet, the
lights use the peak and onset fallback for the first stretch (up to
~30s) until beats arrive.

During a smart-fades transition the beats of the crossfading tracks are
used as is and can drift slightly while the two tracks overlap.
Alignment is correct again once the transition completes.

The Hue bridge uses a small visualizer buffer for near-realtime
delivery, so after a track change beats take a few seconds to start
arriving (the first moments of the new schedule are kept near the
playhead and not delivered yet). The lights use the peak and onset
fallback until they do.
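
The fallback behavior described above could look roughly like this on the client side. This is an illustrative sketch, not the Hue plugin's code; the `TriggerSource` class and the 2-second timeout are assumptions.

```python
from typing import Optional


class TriggerSource:
    """Prefers beats for light triggers and falls back to peaks when beats are absent."""

    def __init__(self, beat_timeout_us: int = 2_000_000) -> None:
        self.beat_timeout_us = beat_timeout_us  # assumed value, tune per client
        self._last_beat_us: Optional[int] = None

    def on_beat(self, timestamp_us: int) -> bool:
        """Beats always trigger and mark beat data as currently available."""
        self._last_beat_us = timestamp_us
        return True

    def on_peak(self, timestamp_us: int) -> bool:
        """Peaks trigger only while no beat has been seen recently."""
        if self._last_beat_us is None:
            return True
        return timestamp_us - self._last_beat_us > self.beat_timeout_us
```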

Beat data is loaded by a lightweight retry poller rather than an
audio-analysis event subscription. This is intentional, to keep the
changes contained to the Sendspin provider and avoid touching core Music
Assistant code before the 2.9 release.

Visualizer pitch detection is disabled server-wide for now. It is the
heaviest visualizer DSP and its result quality is still mixed, so it
needs more testing before being enabled.

## Relevant Specification this PR implements
- Sendspin/spec#86
- Sendspin/spec#69
- Sendspin/spec#81

## Testing

I tested this locally, on a Raspberry Pi 4, and on a Home Assistant Green.
Lower-powered clients may not be powerful enough to compute `beats` in
time, but the Hue bridge falls back to `peaks` if that's the case.
Comment thread README.md

Energy onset event. Fires on any transient (drum hits, cymbal crashes, attacks), independent of musical timing. `strength` 0-255 lets clients scale flash intensity.

#### `pitch` — message type `21`
Member Author


I have a couple of concerns with pitch that came up while implementing this in sendspin-cli.

  • First, the pitch reported by aiosendspin is neither precise nor particularly useful (it is disabled in Music Assistant for this reason). But that's more of an implementation issue.
  • Second, how long is a pitch value supposed to stay valid? If the server stops emitting when nothing tonal is playing, the last value just sticks until the next track or so. One could interpret "ignore below your own threshold" as "clear when confidence is below your threshold or 0", but that's not defined in the specification.
  • Third, the confidence scale itself: with every client picking its own threshold and no defined meaning for the confidence values, behavior becomes inconsistent across server and client implementations.

But if there is no reliable way to get a single useful pitch value, we could also just consider removing pitch from the specification.
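
To make the validity question concrete, here is one possible client-side interpretation: clear the value when confidence drops below a threshold, and also expire it after a maximum age. Nothing here is defined by the specification; the 0-255 confidence range, the default threshold, the timeout, and the `PitchHold` class are all assumptions for discussion.

```python
from typing import Optional


class PitchHold:
    """Holds the last confident pitch and expires it after max_age_us."""

    def __init__(self, min_confidence: int = 128, max_age_us: int = 500_000) -> None:
        self.min_confidence = min_confidence  # assumed 0-255 scale
        self.max_age_us = max_age_us
        self._value: Optional[float] = None
        self._updated_us = 0

    def update(self, timestamp_us: int, frequency_hz: float, confidence: int) -> None:
        """Store a confident pitch; a low-confidence frame clears the held value."""
        if confidence >= self.min_confidence:
            self._value = frequency_hz
            self._updated_us = timestamp_us
        else:
            self._value = None

    def current(self, now_us: int) -> Optional[float]:
        """Return the held pitch, or None if it was cleared or has gone stale."""
        if self._value is None or now_us - self._updated_us > self.max_age_us:
            return None
        return self._value
```

If the spec fixed both the threshold semantics and a maximum hold time, clients would no longer need to guess either parameter.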
