Reconnection and multi-server arbitration: SDK persistence, mandatory another_server, reconnect backoff #92

@balloob

This issue comes out of a cross-SDK conformance audit comparing every Sendspin client/server implementation (aiosendspin, sendspin-cli, sendspin-cpp, sendspin-dotnet, sendspin-go, sendspin-js, sendspin-jvm, sendspin-rs, SendspinKit) against the spec. sendspin-cpp is treated as the reference implementation throughout — it most closely matches the spec's language and is the only SDK that runs on the constrained-embedded target the spec was originally written for.

The spec mandates client-side multi-server arbitration with
last-played persistence, but only sendspin-cpp and sendspin-cli
implement it. Six SDKs (dotnet, go, js, jvm, rs, SendspinKit) have
no concept of "last played server" at all. sendspin-cpp is the
reference: ConnectionManager stores an FNV-1 hash of the last-played
server_id via SendspinPersistenceProvider, runs the spec's
handshake-first decision table on a second-server connect, and
auto-sends 'another_server' on switch.
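
For readers unfamiliar with FNV-1, here is a minimal Python sketch of the hash step. The issue does not state which bit width sendspin-cpp uses, so the 32-bit variant and the function name `fnv1_32` are assumptions for illustration only. FNV-1 multiplies by the prime first and then XORs in the byte (FNV-1a does it the other way round).

```python
FNV32_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV32_PRIME = 0x01000193         # standard 32-bit FNV prime


def fnv1_32(data: bytes) -> int:
    """Return the 32-bit FNV-1 hash of data (width is an assumption here)."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        # FNV-1 order: multiply first, then XOR with the next byte.
        h = (h * FNV32_PRIME) & 0xFFFFFFFF
        h ^= byte
    return h


# Hypothetical server_id; a client would persist this integer, not the string.
last_played_hash = fnv1_32("living-room-server".encode("utf-8"))
```

Storing a fixed-size integer rather than the full server_id keeps the persisted record small, which matters on constrained embedded targets.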

1. Make SDK persistence the recommended shape

Today the spec text "Clients must persistently store" reads as a
requirement on the device, but in practice the SDK is where that
storage has to live — otherwise every embedder reinvents it
differently. Add:

Client SDKs SHOULD provide a persistence hook (file, key-value, or
callback) so that the application can plug in storage without
re-implementing the spec's multi-server arbitration logic.

The recommended reference shape is sendspin-cpp's
SendspinPersistenceProvider.
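
As a rough illustration of what such a hook could look like in another SDK, here is a Python sketch. All names (`PersistenceProvider`, `load_last_played`, `save_last_played`, the JSON file layout) are hypothetical and are not taken from sendspin-cpp's SendspinPersistenceProvider. The point is only that the SDK owns the arbitration logic while the application supplies the storage.

```python
import json
import os
import tempfile
from typing import Optional, Protocol


class PersistenceProvider(Protocol):
    """Hypothetical storage hook that an application plugs into the SDK."""

    def load_last_played(self) -> Optional[int]: ...
    def save_last_played(self, server_hash: int) -> None: ...


class MemoryPersistence:
    """Non-durable provider, useful for tests."""

    def __init__(self) -> None:
        self._value: Optional[int] = None

    def load_last_played(self) -> Optional[int]:
        return self._value

    def save_last_played(self, server_hash: int) -> None:
        self._value = server_hash


class JsonFilePersistence:
    """File-backed provider; the file layout is an assumption of this sketch."""

    def __init__(self, path: str) -> None:
        self._path = path

    def load_last_played(self) -> Optional[int]:
        try:
            with open(self._path, "r", encoding="utf-8") as fh:
                data = json.load(fh)
        except (FileNotFoundError, json.JSONDecodeError):
            return None  # missing or corrupt state means "no last-played server"
        value = data.get("last_played") if isinstance(data, dict) else None
        return value if isinstance(value, int) else None

    def save_last_played(self, server_hash: int) -> None:
        # Write to a temporary file, then rename over the target so a crash
        # mid-write does not leave a truncated state file behind.
        directory = os.path.dirname(os.path.abspath(self._path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump({"last_played": server_hash}, fh)
        os.replace(tmp_path, self._path)
```

An SDK built this way would accept any object with these two methods, so an embedded target could back the same interface with a key-value store instead of a file.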

2. Mandate auto-another_server on switch

Strengthen the current text to:

When a client switches between servers under the multi-server
arbitration rules, it MUST emit client/goodbye with reason
'another_server' to the server it is leaving. Defaulting to any
other reason is non-conformant.

This makes the server's "did the client switch, or shut down?"
decision deterministic.
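
A minimal Python sketch of the proposed behavior follows. The JSON envelope (`type` plus `payload`), the `ServerSwitcher` class, and the `send` callback are assumptions made for illustration; the decision of whether to switch at all belongs to the spec's decision table and is not modeled here.

```python
import json
from typing import Callable, Optional, Tuple

SendFn = Callable[[str], None]


def goodbye_message(reason: str) -> str:
    # Assumed wire shape; check the spec for the exact client/goodbye format.
    return json.dumps({"type": "client/goodbye", "payload": {"reason": reason}})


class ServerSwitcher:
    """Tracks the active server and says goodbye to the one being left."""

    def __init__(self) -> None:
        self._active: Optional[Tuple[str, SendFn]] = None

    def switch_to(self, server_id: str, send: SendFn) -> None:
        if self._active is not None:
            old_id, old_send = self._active
            if old_id != server_id:
                # Leaving a different server: the reason is always
                # 'another_server', never a default or shutdown reason.
                old_send(goodbye_message("another_server"))
        self._active = (server_id, send)
```

Because the SDK emits the goodbye inside `switch_to`, the embedding application cannot forget it or pick a different reason by accident.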

3. Specify a recommended client-side reconnect backoff

The spec today is silent on client-side reconnect. Among SDKs that
do retry, the parameters are uncoordinated (1 s → 15 s for js, 1 s →
30 s for jvm), and several SDKs have no built-in retry at all
(cpp, rs, SendspinKit). Add:

Client SDKs SHOULD implement exponential backoff for reconnection,
starting at 1 s and capping at 30 s, with unbounded retries.

The specific curve is not critical; the value of writing it down is
that SDKs converge on the same behavior.
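
One possible curve that satisfies the proposed wording, sketched in Python. The doubling factor and the name `reconnect_delays` are assumptions; the proposal only fixes the 1 s start, the 30 s cap, and unbounded retries.

```python
import itertools
from typing import Iterator


def reconnect_delays(
    initial: float = 1.0, cap: float = 30.0, factor: float = 2.0
) -> Iterator[float]:
    """Yield reconnect delays in seconds, forever (unbounded retries)."""
    delay = initial
    while True:
        yield min(delay, cap)
        # Grow the delay, but never past the cap.
        delay = min(delay * factor, cap)


# First seven delays: 1, 2, 4, 8, 16, then pinned at the 30 s cap.
first_delays = list(itertools.islice(reconnect_delays(), 7))
```

A client would sleep for each yielded delay between attempts and create a fresh generator after a successful connection so the next outage starts again at 1 s.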


Source audit doc: docs/reconnection-and-multi-server.md
Full audit branch: claude/stream-sync-correction-sdks-AWoNC
Per-SDK digest: sdk-issues digest
