fix(RDKEMW-16441): gateway connect fails on numeric loopback IP when no external network#83
fix(RDKEMW-16441): gateway connect fails on numeric loopback IP when no external network#83brendanobra wants to merge 32 commits intomainfrom
Conversation
websocketpp constructs tcp::resolver::query(host, port) using the Boost ASIO default flags, which include AI_ADDRCONFIG. That flag tells getaddrinfo(3) to return results only when a local interface for the address family is configured. On certain embedded devices (RDK STBs) the loopback interface is not brought up independently -- it comes up as part of the general networking init. When eth0/wifi are down, lo is also down, so AI_ADDRCONFIG causes getaddrinfo to return HOST_NOT_FOUND even for 127.0.0.1, producing: asio async_resolve error: asio.netdb:1 (Host not found (authoritative)) WebSocket Connection Unknown ... websocketpp.transport:2 Underlying Transport Error Fix: vendor websocketpp/transport/asio/endpoint.hpp (the one file that constructs the resolver query). For hosts that parse as a numeric IP address via asio::ip::address::from_string(), use the numeric_host flag (AI_NUMERICHOST) instead of address_configured (AI_ADDRCONFIG). This bypasses the DNS resolver and the AI_ADDRCONFIG interface-presence check entirely. For genuine hostnames the existing address_configured behaviour is preserved. The vendor/ directory is placed before system websocketpp headers in target_include_directories so the patched file takes priority.
|
b'## Copyright scan failure |
There was a problem hiding this comment.
Pull request overview
This PR fixes Firebolt gateway connection failures on numeric loopback IPs (e.g., ws://127.0.0.1) in “no external network” environments by patching websocketpp resolver behavior, and adds targeted network-isolation tests and CI coverage to prevent regressions.
Changes:
- Patch vendored websocketpp Asio transport to treat numeric IP hosts as
numeric_host(AI_NUMERICHOST), bypassingAI_ADDRCONFIG. - Add new unit tests and a CI job that runs them under
docker --network none, plus a helper integration script to reproduce locally. - Introduce initial gateway connection retry settings/config and implement a retry loop in
GatewayImpl::connect().
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
src/vendor/websocketpp/transport/asio/endpoint.hpp |
Resolver change to detect numeric IPs and force numeric_host during async resolve. |
src/CMakeLists.txt |
Ensure src/vendor headers take precedence over system websocketpp headers. |
test/unit/transportTest.cpp |
Add regression tests for numeric IPv4/IPv6 loopback and numeric-IP failure path. |
test/integration/test-no-network.sh |
Local reproduction script for loopback-only / no-network environment. |
.github/workflows/ci.yml |
Add network_isolation_tests job running selected tests with --network none. |
src/gateway.cpp |
Add initial connection retry loop + watchdog wakeups; adjust disconnect logging/behavior. |
include/firebolt/config.h.in |
Add reconnect_max_attempts and reconnect_delay_ms config knobs. |
cmake/version.cmake |
Prefer .version file over git describe when present. |
build.sh |
Add --docker mode and SYSROOT defaulting; attempt to wipe stale build dirs. |
|
b'## Copyright scan failure |
src/vendor/websocketpp is a patched copy of websocketpp 0.8.2 (BSD-3-Clause, Copyright Peter Thorson). It is documented in NOTICE and LICENSE. Exclude from BlackDuck scan to avoid component re-identification.
|
b'## Copyright scan failure |
|
@rdkcmf-jenkins re-scan |
- gateway.cpp: reset connectResultError to None at the start of each retry attempt so a stale error from attempt N is never reported as the final error of attempt N+1 (Copilot line 537) - gateway.cpp: add connectionListenerMtx to serialize reads/writes of connectionChangeListener between the websocketpp IO thread and the main thread (Copilot line 542) - gateway.cpp: on 10-second connect timeout (connectResultReady not set), call transport.disconnect() to abort the in-flight async attempt before retrying (Copilot line 529 - stale attempt race) - gatewayTest: add GatewayRetryUTest with three tests covering retry behaviour: server-delayed connect, no-retry fast-fail, and disconnect-aborts-retry-delay (Copilot line 529 - missing tests) - transportTest: catch std::exception in the echo server message handler and report via ADD_FAILURE() instead of swallowing silently (Copilot line 740) - build.sh: fix --docker incremental build wipe -- compare CMAKE_HOME_DIRECTORY against /workspace (container path) rather than $SCRIPT_DIR (host path) so Docker-built caches survive subsequent --docker invocations (Copilot build.sh:41)
|
b'## Copyright scan failure |
There was a problem hiding this comment.
Pull request overview
This PR addresses a websocketpp DNS resolution edge case where connecting to a numeric loopback IP (e.g., 127.0.0.1) can fail on embedded devices with no configured non-loopback interfaces due to AI_ADDRCONFIG. It vendors a patched websocketpp ASIO endpoint resolver behavior and adds coverage (including CI network-isolated runs) plus a new initial-connect retry policy in the Gateway.
Changes:
- Patch vendored websocketpp ASIO resolver to use
numeric_hostfor numeric IP literals (bypassingAI_ADDRCONFIG). - Add unit + integration/CI “no network” coverage for numeric IPv4/IPv6 loopback connections and numeric-IP failure behavior.
- Add Gateway initial connection retry policy (
reconnect_max_attempts/reconnect_delay_ms) and supporting tests; enhance shutdown/watchdog responsiveness.
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
src/vendor/websocketpp/transport/asio/endpoint.hpp |
Resolver flag change for numeric IP hosts to avoid AI_ADDRCONFIG failures. |
src/CMakeLists.txt |
Ensures vendored websocketpp headers take precedence over system headers. |
test/unit/transportTest.cpp |
Adds numeric IPv4/IPv6 loopback regression tests + numeric-IP failure test. |
test/unit/gatewayTest.cpp |
Adds retry-policy tests for Gateway connect behavior. |
include/firebolt/config.h.in |
Introduces public config knobs for initial connect retry policy. |
src/gateway.cpp |
Implements connect retry loop and adjusts watchdog/shutdown behavior. |
test/integration/test-no-network.sh |
Adds a reproducible “loopback-only” test runner (Docker/unshare). |
.github/workflows/ci.yml |
Adds a CI job to run selected tests under docker --network none. |
build.sh |
Adds --docker convenience and improves stale CMakeCache handling. |
cmake/version.cmake |
Prefers .version pin before git describe for deterministic versioning. |
.bdignore |
Excludes vendored third-party source from BlackDuck scans. |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
|
b'## Copyright scan failure |
There was a problem hiding this comment.
Pull request overview
This PR fixes Firebolt gateway connection failures when using numeric loopback IPs on devices without external network interfaces, and adds test/CI coverage for this scenario (including network-isolated execution). It also introduces an initial-connect retry policy and some developer/CI build improvements.
Changes:
- Patch vendored WebSocket++ ASIO resolver behavior to use
numeric_hostfor numeric IPs (bypassingAI_ADDRCONFIG). - Add unit + network-isolation tests for numeric IPv4/IPv6 loopback connections and resolver error paths; add gateway initial-connect retry tests.
- Update gateway connect logic to support retries and improve disconnect/watchdog behavior; add CI job and build tooling updates.
Reviewed changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
src/vendor/websocketpp/transport/asio/endpoint.hpp |
Detect numeric IPs and resolve with numeric_host to avoid AI_ADDRCONFIG failures. |
src/CMakeLists.txt |
Ensure src/vendor include path precedes system WebSocket++ headers so the patch takes effect. |
src/gateway.cpp |
Add initial-connect retry loop, callback synchronization, and watchdog/disconnect responsiveness changes. |
include/firebolt/config.h.in |
Add reconnect_max_attempts / reconnect_delay_ms config fields. |
include/firebolt/gateway.h |
Document callback threading semantics for connect() vs subsequent callbacks. |
test/unit/transportTest.cpp |
Add numeric IPv4/IPv6 loopback connection + round-trip + failure-path tests. |
test/unit/gatewayTest.cpp |
Add retry behavior unit tests for the gateway connect logic. |
test/integration/test-no-network.sh |
Add a script to run targeted tests under network isolation (Docker/unshare). |
.github/workflows/ci.yml |
Add network_isolation_tests CI job running the numeric-IP tests with --network none. |
build.sh |
Add --docker mode, default SYSROOT_PATH, and stale cache wiping between environments. |
cmake/version.cmake |
Prefer .version pin before git describe to avoid ambiguous tag resolution. |
.version |
Pin project version to 1.1.8. |
LICENSE |
Append BSD-3 license text (currently with placeholders). |
.bdignore |
Exclude vendored sources from BlackDuck scans. |
…ht attempts promptly Transport::disconnect() only called client_->close() when connectionStatus_ was Connected. During DNS/TCP/WebSocket handshake the status stays Disconnected, so stop_perpetual() alone did not cancel the pending async operation and connectionThread_->join() could block for the full OS-level TCP timeout rather than returning promptly. Introduce TransportState::Connecting (set in connect() after client_->connect() is queued, cleared by onOpen/onFail/onClose): * disconnect() detects Connecting and calls client_->stop(), which forces the asio io_service to exit run() immediately so the connection thread can be joined without delay. onFail will not fire after stop(), which is safe because the gateway condvar is already woken by disconnectRequested_. * All other state transitions are unchanged: onOpen -> Connected, onFail/onClose -> Disconnected.
There was a problem hiding this comment.
Pull request overview
This PR addresses Firebolt gateway connection failures on numeric loopback IPs (e.g. ws://127.0.0.1) in “no external network” environments by patching the vendored WebSocket++ resolver behavior, and adds CI coverage for the regression scenario.
Changes:
- Patch WebSocket++ Asio resolver to use
numeric_hostfor numeric IP hosts, bypassingAI_ADDRCONFIG. - Add unit + network-isolation integration coverage for numeric IPv4/IPv6 loopback connections and gateway initial-connect retry behavior.
- Add build/CI/versioning updates (Docker build option,
.versionpin, CI job for--network nonetests).
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
src/vendor/websocketpp/transport/asio/endpoint.hpp |
Uses numeric_host for numeric IP resolution to avoid AI_ADDRCONFIG failures. |
src/CMakeLists.txt |
Ensures vendored WebSocket++ headers take precedence over system headers. |
src/transport.cpp |
Adds Connecting state and aborts in-flight connects on disconnect() via client_->stop(). |
src/gateway.cpp |
Implements initial-connect retry loop, early-abort on disconnect, and watchdog thread wake-up. |
include/firebolt/config.h.in |
Adds reconnect_max_attempts / reconnect_delay_ms configuration fields. |
include/firebolt/gateway.h |
Documents connect callback threading behavior (connect thread vs IO thread). |
test/unit/transportTest.cpp |
Adds numeric-IP regression tests (IPv4 loopback, IPv6 loopback, failure path). |
test/unit/gatewayTest.cpp |
Adds retry-policy unit tests for delayed server, no-retry, disconnect abort, already-connected behavior. |
test/integration/test-no-network.sh |
Adds a network-isolation runner (Docker --network none, fallback to unshare --net). |
.github/workflows/ci.yml |
Adds a dedicated network_isolation_tests job to run the new tests under --network none. |
build.sh |
Adds --docker mode, SYSROOT defaulting, and stale CMake cache wiping when switching environments. |
cmake/version.cmake |
Prefers .version pin before attempting git describe. |
.version |
Pins project version to 1.1.8. |
LICENSE |
Appends a BSD-3 template block intended for third-party licensing. |
.bdignore |
Excludes src/vendor/ from BlackDuck scanning. |
gateway.cpp: clamp reconnect_max_attempts to 100 before 1+N to prevent unsigned overflow when caller passes UINT_MAX or similar large value. gateway.cpp + config.h.in: replace hard-coded 10s kConnectTimeout with cfg.connect_attempt_timeout_ms (default 10000 ms, backward compatible). Update reconnect_max_attempts docstring to describe the per-attempt timeout, the effective maximum blocking duration formula, and the 100- retry cap. Add connect_attempt_timeout_ms field with full docstring. build.sh: replace unsafe '-n "$@"' condition (undefined behaviour with multiple args inside [[ ]]) with '$# -gt 0'.
There was a problem hiding this comment.
Pull request overview
This PR addresses Firebolt gateway connection failures on RDK devices when connecting to numeric loopback IPs (e.g., 127.0.0.1) in “no external network” boot scenarios, and adds an initial-connect retry policy plus CI coverage for loopback-only environments.
Changes:
- Patch vendored WebSocket++ Asio resolver logic to treat numeric IP hosts as
numeric_host(AI_NUMERICHOST) to bypassAI_ADDRCONFIG. - Add initial connection retry/timeout behavior in the gateway + transport disconnect improvements for in-flight connects.
- Add unit/integration tests and a CI job that runs selected tests under
docker --network none.
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
src/vendor/websocketpp/transport/asio/endpoint.hpp |
Detect numeric IP hosts and force numeric_host resolver flags to avoid AI_ADDRCONFIG failures. |
src/CMakeLists.txt |
Ensures the vendored WebSocket++ header shadowing takes priority during library compilation. |
src/transport.cpp |
Adds a Connecting state and abort logic to stop an in-progress connect on disconnect. |
src/gateway.cpp |
Implements initial-connect retry loop with per-attempt timeout and early-abort on disconnect; adds synchronization and watchdog wake-up. |
include/firebolt/config.h.in |
Documents and introduces retry-related config knobs for initial connect. |
include/firebolt/gateway.h |
Documents callback threading behavior for connect() vs subsequent connection events. |
test/unit/transportTest.cpp |
Adds numeric-IP and IPv6 loopback regression/unit tests (including failure path). |
test/unit/gatewayTest.cpp |
Adds unit tests covering retry behavior, disconnect-abort, and AlreadyConnected callback semantics. |
test/integration/test-no-network.sh |
Adds a loopback-only runner using Docker --network none (or unshare --net). |
.github/workflows/ci.yml |
Adds a dedicated network_isolation_tests job running the loopback-only test filter. |
build.sh |
Adds --docker support, defaults SYSROOT_PATH, and wipes stale CMake caches when switching environments. |
cmake/version.cmake |
Prefers .version over git describe to make version selection deterministic. |
.version |
Pins version to 1.1.8. |
LICENSE |
Appends BSD-3 license text (but currently with placeholders). |
.bdignore |
Excludes src/vendor/ from BlackDuck scanning. |
The condvar wrapper is installed in connectionChangeListener before transport.connect() is called so it is in place when onOpen/onFail fires. If transport.connect() returns AlreadyConnected, the wrapper is restored to previousListener — but in the intervening window an IO-thread event (e.g. the live connection dropping) could hit the wrapper and be silently discarded instead of reaching the previously-registered callback. Fix: capture previousListener and a shared skipPrevForward flag inside the wrapper. Events arriving before skipPrevForward is set (i.e. during the AlreadyConnected window) are forwarded to previousListener so they are not dropped. skipPrevForward is set to true immediately after transport.connect() returns Firebolt::Error::None (a new attempt was actually queued), disabling the forward path so genuine new-attempt events are not spuriously delivered to the old callback.
There was a problem hiding this comment.
Pull request overview
This PR addresses connection failures when using numeric loopback IPs (e.g. ws://127.0.0.1) on devices with no external network configured at boot by forcing numeric-host resolution to bypass AI_ADDRCONFIG. It also adds an initial-connection retry policy to the gateway and introduces network-isolation coverage in CI.
Changes:
- Patch websocketpp’s ASIO resolver path to use
numeric_hostfor numeric IPs (IPv4/IPv6), avoidingAI_ADDRCONFIG-triggered failures. - Add initial connection retry/timeout behavior to
GatewayImpl::connect()plus watchdog shutdown improvements. - Add unit/integration tests (including
--network none) and CI job wiring; update build/version tooling.
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
src/vendor/websocketpp/transport/asio/endpoint.hpp |
Vendored override of websocketpp endpoint.hpp to apply numeric-host resolver flags for numeric IPs. |
src/CMakeLists.txt |
Ensures src/vendor include path is preferred so the patched websocketpp header takes effect. |
src/transport.cpp |
Adds Connecting state and aborts in-progress connects via client_->stop() during disconnect. |
src/gateway.cpp |
Implements initial connect retry loop with per-attempt timeout and disconnect-abort behavior; adds synchronization around connection listener and watchdog wakeups. |
include/firebolt/config.h.in |
Adds config fields for retry policy and per-attempt connect timeout (with documentation). |
include/firebolt/gateway.h |
Documents callback threading behavior for connect() vs subsequent connection events. |
test/unit/transportTest.cpp |
Adds regression/unit tests for numeric IPv4/IPv6 loopback behavior and numeric-IP failure path. |
test/unit/gatewayTest.cpp |
Adds unit tests validating retry policy behaviors (delayed server, no-retry fail fast, disconnect abort, AlreadyConnected behavior). |
test/integration/test-no-network.sh |
Adds a local runner for loopback-only / no-external-network reproduction via Docker or unshare. |
.github/workflows/ci.yml |
Adds a dedicated network_isolation_tests job running selected tests under --network none. |
build.sh |
Adds --docker helper mode, defaults SYSROOT_PATH to /, and wipes stale CMake caches when switching environments. |
cmake/version.cmake |
Prefers .version file over git describe when present. |
.version |
Pins project version string (1.1.8). |
LICENSE |
Appends a BSD-3 license template block (currently with placeholders). |
.bdignore |
Excludes vendored third-party source from BlackDuck scanning. |
connectionStatus_ is already std::atomic<TransportState> but all reads and writes used implicit conversion operators, obscuring the atomic semantics and leaving memory ordering unspecified. Replace with explicit .load()/.store() (default seq_cst) throughout connect(), disconnect(), send(), start(), and the three websocketpp callbacks.
There was a problem hiding this comment.
Pull request overview
This PR fixes Firebolt gateway connection failures in “no external network” environments (e.g., early boot on RDK STBs) by ensuring numeric loopback IPs (like 127.0.0.1 / ::1) resolve without AI_ADDRCONFIG interference, and adds an initial-connect retry policy plus CI coverage for network-isolated scenarios.
Changes:
- Patch websocketpp resolution to use
numeric_hostfor numeric IPs to bypassAI_ADDRCONFIG. - Add initial connection retry logic in
GatewayImpl::connect()(bounded attempts, delay slicing, per-attempt timeout) and Transport “Connecting” state handling. - Add numeric-IP + IPv6 regression tests and a dedicated CI job running tests under
docker --network none, plus supporting build/version script updates.
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
test/unit/transportTest.cpp |
Adds numeric-IP + IPv6 loopback regression unit tests and a numeric-IP failure-path test. |
test/unit/gatewayTest.cpp |
Adds unit tests validating initial-connect retry behavior and edge cases (disconnect abort, already-connected). |
test/integration/test-no-network.sh |
Adds a helper script to run selected tests under network isolation (Docker preferred; unshare --net fallback). |
src/vendor/websocketpp/transport/asio/endpoint.hpp |
Vendored websocketpp header with resolver-flag change for numeric hosts. |
src/transport.cpp |
Introduces Connecting state and aborts in-flight connect attempts during disconnect() via client_->stop(). |
src/gateway.cpp |
Implements initial-connect retry loop, connection-result synchronization, and watchdog wake-up via condition variable. |
src/CMakeLists.txt |
Ensures src/vendor/ precedes system websocketpp headers for the transport target. |
include/firebolt/gateway.h |
Documents callback threading behavior for connect() vs subsequent events. |
include/firebolt/config.h.in |
Adds public config fields for retry policy and per-attempt connect timeout with documentation. |
cmake/version.cmake |
Prefers .version file over git describe when present. |
build.sh |
Adds --docker convenience mode, improves stale-cache handling, and defaults SYSROOT_PATH to /. |
LICENSE |
Appends BSD-3 license text intended for vendored websocketpp attribution. |
.version |
Pins project version to 1.1.8. |
.github/workflows/ci.yml |
Adds network_isolation_tests job executing unit tests under docker --network none. |
.bdignore |
Excludes src/vendor/ from BlackDuck scanning. |
Comments suppressed due to low confidence (2)
src/gateway.cpp:431
- GatewayImpl::~GatewayImpl() stops/join watchdog but doesn’t call disconnect(). Because members like connectionListenerMtx/connectResultCv are destroyed before the Transport member (due to declaration order), any late Transport onClose/onFail callback during Transport’s destructor can re-enter GatewayImpl::onConnectionChange() and attempt to lock already-destroyed mutexes (UB/crash). Call disconnect() (or otherwise fully tear down the Transport/IO thread and ensure callbacks can’t fire) from ~GatewayImpl() while all synchronization primitives are still alive.
~GatewayImpl()
{
if (watchdogRunning)
{
watchdogRunning = false;
watchdogCv.notify_one();
if (watchdogThread.joinable())
{
watchdogThread.join();
}
}
}
src/CMakeLists.txt:94
- The patched websocketpp header is only added as a PRIVATE include directory. Unit tests (and any other targets) compile websocketpp headers via system includes and won’t see src/vendor/, which can lead to different definitions of the same websocketpp templates across translation units (ODR violation / UB) if some TUs include the system endpoint.hpp and others include the patched one. Ensure all targets that include websocketpp (especially test/utApp, which directly uses websocketpp::server) see the same patched header, e.g. by propagating the vendor include path to those targets or making it an INTERFACE include for internal builds.
target_include_directories(${TARGET}
PRIVATE
# vendor/ must come before the system websocketpp headers so that
# our patched endpoint.hpp (numeric-IP resolver fix) takes priority.
$<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/vendor>
$<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}>
$<BUILD_INTERFACE:${CMAKE_CURRENT_BINARY_DIR}>
PUBLIC
$<BUILD_INTERFACE:${CMAKE_SOURCE_DIR}/include>
$<BUILD_INTERFACE:${CMAKE_CURRENT_BINARY_DIR}>
$<INSTALL_INTERFACE:${CMAKE_INSTALL_INCLUDEDIR}>
)
The skipPrevForward flag was set after transport.connect() returned, but onOpen/onFail can fire on the IO thread between client_->connect() being queued inside transport.connect() and skipPrevForward->store(true) executing in gateway.cpp. In that window the wrapper incorrectly forwards the new-attempt callback to previousListener (the old session's callback), triggering a spurious user-visible event. Remove the forwarding mechanism entirely. The wrapper now only signals the condvar and never touches previousListener. For the AlreadyConnected window the theoretical one-event loss is negligible (synchronous return, listener restored before connect() returns) and correct behaviour is preferable to the data race.
There was a problem hiding this comment.
Pull request overview
This PR addresses connection failures to local numeric loopback WebSocket endpoints (e.g. ws://127.0.0.1) on devices with no configured external network interfaces at boot, and adds an initial-connect retry policy plus CI coverage for network-isolated scenarios.
Changes:
- Patch vendored WebSocket++ resolver behavior to treat numeric IP hosts as
AI_NUMERICHOSTto bypassAI_ADDRCONFIG. - Add initial connection retry/timeout behavior in
GatewayImpl::connect()and improve disconnect/watchdog responsiveness. - Add unit + network-isolated CI tests for numeric IP / IPv6 loopback connectivity and retry scenarios; add
.versionpin + build script enhancements.
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| test/unit/transportTest.cpp | Adds regression/unit tests for numeric IPv4/IPv6 loopback connect + send/receive + failure path. |
| test/unit/gatewayTest.cpp | Adds retry-policy unit tests for delayed server, no-retry fast-fail, disconnect abort, already-connected behavior. |
| test/integration/test-no-network.sh | Adds a helper script to run selected tests under Docker --network none (or unshare --net). |
| src/vendor/websocketpp/transport/asio/endpoint.hpp | Applies numeric-host detection and uses numeric_host resolver flag for numeric IPs. |
| src/transport.cpp | Adds Connecting state and adjusts disconnect behavior to abort in-flight connects; converts state ops to atomic loads/stores. |
| src/gateway.cpp | Implements initial-connect retry loop with per-attempt timeout, early-abort on disconnect, listener synchronization, watchdog wakeups. |
| src/CMakeLists.txt | Ensures vendored websocketpp headers take precedence over system headers. |
| include/firebolt/gateway.h | Documents callback threading behavior for connect() vs subsequent connection events. |
| include/firebolt/config.h.in | Introduces and documents reconnect_max_attempts, reconnect_delay_ms, connect_attempt_timeout_ms. |
| cmake/version.cmake | Prefers .version over git describe to avoid ambiguous tag resolution. |
| build.sh | Adds --docker mode, defaults SYSROOT_PATH to /, and wipes stale CMake caches when switching environments. |
| LICENSE | Appends BSD-3 license text (currently with placeholders). |
| .version | Pins project version to 1.1.8. |
| .github/workflows/ci.yml | Adds a network_isolation_tests job running selected tests under --network none. |
| .bdignore | Excludes src/vendor/ from BlackDuck scans. |
|
b'## WARNING: A Blackduck scan failure has been waived A prior failure has been upvoted
|
Problem
On RDK STBs where eth0/wifi are not active at boot (e.g. early app startup),
connecting to
ws://127.0.0.1:3474fails with "Underlying Transport Error".Root cause: websocketpp's
tcp::resolver::queryusesAI_ADDRCONFIG(the POSIXdefault) which causes
getaddrinfo("127.0.0.1")to returnHOST_NOT_FOUNDwhenno non-loopback interface is configured — even though the address is numeric and
loopback is always reachable.
Fix
Patched
src/vendor/websocketpp/transport/asio/endpoint.hpp:asio::ip::address::from_string()before callingasync_resolveresolver::query::numeric_host(AI_NUMERICHOST) for numeric IPs — bypassesAI_ADDRCONFIG, no DNS query is sent, resolution is in-kernel only
"localhost") continue to useaddress_configuredas beforeTests
TransportNumericIPUTest::ConnectViaNumericLoopbackIP— connect tows://127.0.0.1TransportNumericIPUTest::SendAndReceiveViaNumericIP— full data round-trip over numeric IPTransportNumericIPResolverTest::ConnectFailureViaNumericIP— error path with no server listeningTransportIPv6UTest::ConnectViaIPv6LoopbackIP— same fix path exercised for IPv6::1All four run under
--network none(Docker, loopback only) to precisely reproducethe STB environment. CI has a dedicated
network_isolation_testsjob that runsthese in
docker run --network none.Other changes
build.sh: defaultSYSROOT_PATHto/for dev-box builds;--dockerflagto build/run inside the CI image without local deps; stale CMakeCache wipe when
switching between host and Docker builds
cmake/version.cmake+.version: explicit1.1.8version pin (git describewas resolving ambiguously to
v1.1.7)