refactor: LND SubscribeState EOF-based shutdown and optimize Neutrino peer selection#3964
refactor: LND SubscribeState EOF-based shutdown and optimize Neutrino peer selection#3964ajaysehwal wants to merge 13 commits into
Conversation
There was a problem hiding this comment.
Code Review
This pull request transitions the LND lifecycle management to an event-driven approach by utilizing the SubscribeState gRPC stream for both Android and iOS. Key changes include replacing polling-based shutdown detection with EOF signals from the state stream, registering state listeners before starting LND to prevent missing initial events, and refactoring the Neutrino peer optimization logic. Additionally, several asynchronous calls across the stores and views were updated with proper error handling to avoid unhandled promise rejections. Feedback was provided regarding the robustness of error message parsing in both Java and Swift, suggesting the use of regular expressions over manual string manipulation to handle potential format changes.
a14f54a to
00f71e8
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request significantly improves the robustness of LND lifecycle management by transitioning from polling-based mechanisms to event-driven ones. Key changes include the native-side automatic initiation of the SubscribeState stream on both Android and iOS, ensuring that state events are captured immediately upon startup. The stopLnd function has been refactored to wait for a gRPC EOF signal rather than polling status, and error handling has been unified with regex-based parsing of gRPC status messages. Additionally, the PR optimizes Neutrino peer selection through concurrent pings and adds comprehensive promise rejection handling across the TypeScript stores and views. I have no feedback to provide as the implementation follows best practices for asynchronous coordination and native-to-JS communication.
3581eb2 to
e5ecd21
Compare
|
I tested this and wasn't able to start the embedded LND. I didn't look into the details myself, but tried a quick AI assisted analysis, this is the outcome, hope it helps: Observed regression: normal startup hangs indefinitely, restart loop Tested on Android (Samsung Galaxy S20+, Android 13). When trying to start the embedded LND, "ZEUS is starting your node." is displayed for ~60s, then From Likely cause In if (!streamsStarted.contains("SubscribeState")) {
streamsStarted.add("SubscribeState"); // added before the null-check
Method m = streamMethods.get("SubscribeState");
if (m != null) {
m.invoke(null, new byte[0], new LndStateStreamCallback(recipient));
}
// if m == null: stream is silently never started,
// but "SubscribeState" stays locked in streamsStarted
}If Note: this probably only manifests on the normal startup path. The LND_ALREADY_RUNNING retry path should work because Suggested fix Move Method m = streamMethods.get("SubscribeState");
if (m != null) {
streamsStarted.add("SubscribeState");
try {
m.invoke(null, new byte[0], new LndStateStreamCallback(recipient));
} catch (Exception e) {
Log.e(TAG, "Failed to start native SubscribeState stream", e);
streamsStarted.remove("SubscribeState");
}
} else {
Log.e(TAG, "SubscribeState method not found in streamMethods");
// JS subscribeState() call will handle subscription as fallback
} |
e5ecd21 to
b4c4ea1
Compare
696c5e8 to
64aab0f
Compare
64aab0f to
30a756f
Compare
2a9bee6 to
55f97a9
Compare
… and deduplication
701e592 to
2e5237a
Compare
Description
Please enter a description and screenshots, if appropriate, of the work covered in this PR
This PR makes embedded LND startup and shutdown event-driven instead of poll-and-sleep: we wait for the SubscribeState gRPC stream to end (EOF) after stopLnd, so we know the daemon is fully down before starting again. Native code on iOS and Android starts SubscribeState as soon as LND reports it has started, before the JS startLnd promise resolves, so the first state update is not lost.
It also tightens RPC error parsing on both platforms, hardens sync/recovery against transient RPC errors, and speeds up Neutrino peer selection with batched pings and single-pass tiered filtering
What changed
JS stopLnd registers a listener for SubscribeState, calls stop/kill, then awaits EOF (stream closed) as confirmation that LND has stopped.
Removes the previous retry loop around checkStatus() and related fixed post-stop sleeps on Android/iOS.
Android (LndMobileService): after Lndmobile.start succeeds, invoke SubscribeState via streamMethods and LndStateStreamCallback, then send MSG_START_LND_RESULT. On stream error/close, remove SubscribeState from streamsStarted so a later start can subscribe again.
iOS (Lnd / LndMobile): subscribeToStateChanges tracks activeStreams so duplicate subscriptions are skipped; cleanup on error/EOF matches Android behavior.
JS (LndMobileUtils): register the SubscribeState listener before calling startLnd; rely on native stream start and drop the extra subscribeState() + settle sleeps from the old path.
On LND_ALREADY_RUNNING, stopLnd and wait for EOF, then retry start without the long wallet-recovery polling loop and platform-specific cleanup delays that existed before.
Android / iOS: central regex parsing for gRPC-style messages (code = … desc = …) into error_code / error_desc bundles/events (aligned across platforms).
LndMobile.swift: use self.lndGrpcErrorCodeAndDesc inside escaping closures so the project builds with strict capture rules.
LndMobileErrors: treat WALLET_LOCKED as transient where appropriate; tests extended for RPC_NOT_READY / wallet-locked style messages.
SyncStore / NodeInfoStore: handle RPC_NOT_READY, RPC_CONNECTION_CLOSED, and avoid unhandled promise rejections during sync/recovery and reactive getNodeInfo.
Ping candidates in batches (limited concurrency=3).
Single-pass tiered selection (optimal → lax → threshold) with deduplication via a set, preserving the same latency thresholds and target peer count.
Only add SubscribeState to streamsStarted when the reflected Method is non-null and we are about to invoke, so a missing registration does not block a later JS streamOnlyOnce subscription.
How to test
This pull request is categorized as a:
Checklist
yarn run tscand made sure my code compiles correctlyyarn run lintand made sure my code didn’t contain any problematic patternsyarn run prettierand made sure my code is formatted correctlyyarn run testand made sure all of the tests passTesting
If you modified or added a utility file, did you add new unit tests?
I have tested this PR on the following platforms (please specify OS version and phone model/VM):
I have tested this PR with the following types of nodes (please specify node version and API version where appropriate):
On-device
Remote
Locales
Third Party Dependencies and Packages
yarnafter this PR is merged inpackage.jsonandyarn.lockhave been properly updatedOther: