Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 20 additions & 3 deletions docs/internals.md
Original file line number Diff line number Diff line change
Expand Up @@ -382,13 +382,14 @@ The filter is protected by `state_mutex_` so that `compute_client_time()` can be

### Connection Management (`src/connection_manager.cpp`)

The `ConnectionManager` maintains up to three connection slots:
The `ConnectionManager` maintains two observer slots for the connections it routes messages through:

| Slot | Purpose |
|------|---------|
| `current_connection_` | Active connection receiving messages |
| `pending_connection_` | Handoff candidate completing its handshake |
| `dying_connection_` | Gracefully disconnecting (kept alive as shared_ptr until goodbye is sent) |

Both are `std::shared_ptr<SendspinConnection>`, and on the ESP server path they are observers rather than authoritative owners — the authoritative owner of a `SendspinServerConnection` is the httpd session itself (see [Server connection ownership (ESP)](#server-connection-ownership-esp)). On the host (IXWebSocket) client path the `shared_ptr` in these slots is the only owner.

### Handshake and Handoff

Expand Down Expand Up @@ -419,7 +420,23 @@ When a connection is lost (`on_connection_lost`):

### Graceful Disconnect

`disconnect_and_release()` moves the connection into `dying_connection_` (as a `shared_ptr`) and sends a goodbye message with a completion callback. The callback sets a flag under the connection mutex, and the main loop's next tick destroys the connection. This two-phase approach prevents use-after-free when platform worker threads (e.g., ESP httpd) have pending work items referencing the connection.
`disconnect_and_release()` calls `conn->disconnect(reason, nullptr)` and lets the local `shared_ptr` go out of scope.

- **ESP server**: the goodbye text is queued as an httpd worker job; the worker looks the connection up via `httpd_sess_get_ctx`, sends the frame, then runs the completion lambda that calls `trigger_close()`. The session slot installed in `open_callback` keeps the connection alive across that whole sequence even after `ConnectionManager`'s observer `shared_ptr` is dropped. The session is finally freed when httpd invokes the slot's `free_fn` (see [Server connection ownership (ESP)](#server-connection-ownership-esp)). The completion lambda captures a `weak_ptr` to make this lifetime explicit — `trigger_close()` is skipped if the conn has already been freed.
- **Host client**: the IXWebSocket send is synchronous, so the goodbye and close have both completed by the time `disconnect()` returns and the `shared_ptr` drops the last reference.

### Server connection ownership (ESP)

On the ESP build, `SendspinServerConnection` lifetime is pinned to the httpd session rather than to `ConnectionManager`:

1. `SendspinWsServer::open_callback` (the httpd `open_fn`) creates the `shared_ptr<SendspinServerConnection>`, heap-allocates a `shared_ptr*` slot, and calls `httpd_sess_set_ctx(handle, sockfd, slot, free_fn)` with a deleter that `delete`s the slot. That slot is the authoritative reference.
2. The same shared_ptr is forwarded into `ConnectionManager::on_new_connection`, which stores it in `current_connection_` or `pending_connection_` as a *secondary observer*.
3. The httpd WebSocket handler (`websocket_handler`) and queued workers (`async_send_text`, `async_send_time_text`) look the connection up by `httpd_sess_get_ctx(handle, sockfd)` at run time, copying the slot's `shared_ptr` for the duration of their work. They never assume the manager's observer slot is alive.
4. When the socket closes, httpd calls the `close_fn` first (which fires `connection_closed_callback_` so `ConnectionManager` can drop its observer in the next `loop()`), then later calls the slot's `free_fn` to release the authoritative reference once no workers are queued for that session.

Queued workers carry only a `{httpd_handle_t, int sockfd}` pair (`AsyncRespArg` for text sends, `SessionLookup` for time sends) rather than capturing the connection directly. If the session has already been torn down by the time a worker runs, the lookup returns null and the worker no-ops cleanly. Both arg structs are allocated through `platform_malloc` / `platform_malloc_internal` and freed in the worker.

The host build does not need this scheme: `SendspinWsServer` (host) routes IXWebSocket messages by calling `find_connection_callback_` to resolve a synthetic sockfd back to the connection that `ConnectionManager` is holding. The ESP build keeps the `set_find_connection_callback()` setter as a no-op stub for symmetry; see the comment at the call site in `ConnectionManager::init_server`.

## Ordering Guarantees Summary

Expand Down
50 changes: 19 additions & 31 deletions src/connection_manager.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,6 @@ ConnectionManager::~ConnectionManager() {
this->pending_connection_.reset();
}
this->hello_retry_.conn.reset();
this->dying_connection_.reset();
}

// ============================================================================
Expand Down Expand Up @@ -101,20 +100,21 @@ void ConnectionManager::init_server(SendspinClient* client) {
this->ws_server_->set_ctrl_port(this->client_->config_.httpd_ctrl_port);

this->ws_server_->set_new_connection_callback(
[this](std::unique_ptr<SendspinServerConnection> conn) {
[this](std::shared_ptr<SendspinServerConnection> conn) {
this->on_new_connection(std::move(conn));
});

this->ws_server_->set_connection_closed_callback([this](int sockfd) {
SS_LOGD(TAG, "Connection closed callback for socket %d", sockfd);
// Defer the actual cleanup to loop() to avoid use-after-free.
// This callback runs in the httpd thread, but there may be pending httpd_queue_work items
// (e.g., async_send_text) with raw pointers to the connection object. If we destroy the
// connection here, those pending work items would dereference freed memory when processed.
// Defer the actual cleanup to loop() so on_connection_lost runs on the main thread
// alongside the rest of the connection state mutations.
std::lock_guard<std::mutex> lock(this->conn_mutex_);
this->pending_close_events_.push_back(sockfd);
});

// Connection lookup-by-sockfd. Used by the host build's ws_server to route IXWebSocket
// messages; the ESP build ignores this and looks the connection up directly via
// httpd_sess_get_ctx (set in open_callback), so its setter is a no-op stub.
this->ws_server_->set_find_connection_callback(
[this](int sockfd) -> std::shared_ptr<SendspinConnection> {
std::lock_guard<std::mutex> lock(this->conn_ptr_mutex_);
Expand All @@ -126,9 +126,6 @@ void ConnectionManager::init_server(SendspinClient* client) {
this->pending_connection_->get_sockfd() == sockfd) {
return this->pending_connection_;
}
// Deliberately excludes dying_connection_; it has already been through cleanup and its
// message dispatch is disabled. Returning it here would let httpd route stale messages
// from the old connection into freshly-reset role queues.
return nullptr;
});
}
Expand All @@ -149,15 +146,12 @@ void ConnectionManager::loop() {
std::vector<std::shared_ptr<SendspinConnection>> connected_events;
std::vector<std::shared_ptr<SendspinConnection>> disconnect_events;
std::vector<ServerHelloEvent> hello_events;
bool release_dying = false;
{
std::lock_guard<std::mutex> lock(this->conn_mutex_);
close_events.swap(this->pending_close_events_);
connected_events.swap(this->pending_connected_events_);
disconnect_events.swap(this->pending_disconnect_events_);
hello_events.swap(this->pending_hello_events_);
release_dying = this->dying_connection_ready_to_release_;
this->dying_connection_ready_to_release_ = false;
}

{
Expand Down Expand Up @@ -214,10 +208,6 @@ void ConnectionManager::loop() {
}
}
}

if (release_dying) {
this->dying_connection_.reset();
}
}

// Call loop on active connections using shared_ptr copies to avoid holding the lock
Expand Down Expand Up @@ -320,10 +310,10 @@ void ConnectionManager::setup_connection_callbacks(SendspinConnection* conn) {
};
}

void ConnectionManager::on_new_connection(std::unique_ptr<SendspinServerConnection> conn) {
// Called from httpd open_callback thread. Connection pointer assignment must happen inline
// because the httpd find_connection_callback needs to locate this connection immediately
// for subsequent message routing on the same thread.
void ConnectionManager::on_new_connection(std::shared_ptr<SendspinServerConnection> conn) {
// Called from the platform's new-connection callback (ESP: httpd open_callback). The
// authoritative owner is the platform's session/transport context; this observer shared_ptr
// can be reset at any time without freeing the conn out from under in-flight workers.
conn->init_time_filter();
conn->set_websocket_payload_location(this->client_->config_.websocket_payload_location);

Expand All @@ -340,8 +330,7 @@ void ConnectionManager::on_new_connection(std::unique_ptr<SendspinServerConnecti
SS_LOGD(TAG, "Existing connection present, setting as pending for handoff");
if (this->pending_connection_ != nullptr) {
SS_LOGW(TAG, "Already have pending connection, rejecting new connection");
this->disconnect_and_release(std::shared_ptr<SendspinConnection>(std::move(conn)),
SendspinGoodbyeReason::ANOTHER_SERVER);
this->disconnect_and_release(std::move(conn), SendspinGoodbyeReason::ANOTHER_SERVER);
return;
}
this->pending_connection_ = std::move(conn);
Expand Down Expand Up @@ -493,16 +482,15 @@ void ConnectionManager::complete_handoff(bool switch_to_new) {
}
}

void ConnectionManager::disconnect_and_release(std::shared_ptr<SendspinConnection> conn,
void ConnectionManager::disconnect_and_release(std::shared_ptr<SendspinConnection>&& conn,
SendspinGoodbyeReason reason) {
this->dying_connection_ = std::move(conn);
this->dying_connection_->disconnect(reason, [this]() {
// Defer the actual destruction to loop() to avoid use-after-free.
// This callback runs in the httpd worker thread (async_send_text), but the connection
// must stay alive until all pending httpd_queue_work items have been processed.
std::lock_guard<std::mutex> lock(this->conn_mutex_);
this->dying_connection_ready_to_release_ = true;
});
// Take ownership of the caller's shared_ptr into a local, send the goodbye, then let the
// local go out of scope. On ESP the httpd session slot keeps the conn alive until the
// goodbye worker runs, calls trigger_close, the session tears down, and the slot's free_fn
// fires. On host the IXWebSocket send is synchronous, so the goodbye + close have both
// completed by the time disconnect() returns.
auto local = std::move(conn);
local->disconnect(reason, nullptr);
}

} // namespace sendspin
15 changes: 8 additions & 7 deletions src/connection_manager.h
Original file line number Diff line number Diff line change
Expand Up @@ -147,8 +147,10 @@ class ConnectionManager {
/// @param conn The connection to configure.
void setup_connection_callbacks(SendspinConnection* conn);
/// @brief Accepts an incoming server connection as current or pending for handoff.
/// @param conn The newly accepted server connection.
void on_new_connection(std::unique_ptr<SendspinServerConnection> conn);
/// @param conn The newly accepted server connection. The session slot keeps a parallel
/// refcount, so this observer can be reset at any time without freeing the conn
/// out from under in-flight httpd workers.
void on_new_connection(std::shared_ptr<SendspinServerConnection> conn);

// ========================================
// Hello handshake
Expand Down Expand Up @@ -178,10 +180,11 @@ class ConnectionManager {
/// @brief Completes a server handoff by promoting or discarding the pending connection.
/// @param switch_to_new True to promote the pending connection, false to discard it.
void complete_handoff(bool switch_to_new);
/// @brief Sends a goodbye, then defers destruction of the connection until loop() runs.
/// @param conn The connection to disconnect and release.
/// @brief Sends a goodbye and takes ownership of the caller's shared_ptr so it drops at
/// function exit.
/// @param conn The connection to disconnect and release. Caller's shared_ptr is left empty.
/// @param reason The goodbye reason to send before closing.
void disconnect_and_release(std::shared_ptr<SendspinConnection> conn,
void disconnect_and_release(std::shared_ptr<SendspinConnection>&& conn,
SendspinGoodbyeReason reason);

// Struct fields
Expand All @@ -196,15 +199,13 @@ class ConnectionManager {
// Pointer fields
SendspinClient* client_;
std::shared_ptr<SendspinConnection> current_connection_;
std::shared_ptr<SendspinConnection> dying_connection_;
std::shared_ptr<SendspinConnection> pending_connection_;
std::unique_ptr<SendspinWsServer> ws_server_;

// 32-bit fields
uint32_t last_played_server_hash_{0};

// 8-bit fields
bool dying_connection_ready_to_release_{false};
bool has_last_played_server_{false};
};

Expand Down
Loading
Loading