Skip to content

fix: daemonize sequencer with setsid to survive shell closure#34

Open
ygd58 wants to merge 8 commits into
logos-co:masterfrom
ygd58:feat/localnet-daemon-mode-v2
Open

fix: daemonize sequencer with setsid to survive shell closure#34
ygd58 wants to merge 8 commits into
logos-co:masterfrom
ygd58:feat/localnet-daemon-mode-v2

Conversation

@ygd58
Copy link
Copy Markdown
Contributor

@ygd58 ygd58 commented Mar 27, 2026

Summary

Fixes a real DX issue: the sequencer process is a child of the shell running logos-scaffold localnet start. When that shell exits (tmux detach, SSH disconnect, terminal close), the sequencer dies silently and the localnet becomes unreachable.

Real User Need

Developers running localnet in a tmux or SSH session lose their localnet when the session detaches. This is a common workflow pain point.

Fix

Daemonize the sequencer with setsid() so it joins a new process group, detached from the controlling terminal. The setsid() return value is checked — if it fails, the error is propagated instead of silently continuing.

End-to-End Verification

logos-scaffold new test-app
cd test-app
logos-scaffold setup
logos-scaffold localnet start
# Close the terminal / detach tmux
# Sequencer continues running
logos-scaffold localnet status  # → ready: true

Local Checks

  • cargo build
  • cargo test: 62 passed, 0 failed ✅
  • cargo fmt --check

Previously the sequencer died when the parent shell/tmux session closed.
Fix: call setsid() in pre_exec hook on Unix to detach from controlling
terminal and parent process group.

Refs logos-co#33
@ygd58 ygd58 requested a review from a team March 27, 2026 07:39
@weboko weboko requested a review from Copilot April 10, 2026 21:40
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the localnet sequencer launch path so the spawned sequencer process is detached from the invoking shell’s session on Unix, allowing it to keep running after the terminal/tmux session closes.

Changes:

  • Add a Unix CommandExt::pre_exec hook that calls setsid() before exec when spawning the sequencer to a log file.
  • Add a direct libc dependency to support the setsid() call.
  • Update Cargo.lock accordingly.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/process.rs Adds a Unix pre_exec hook intended to detach the spawned process into a new session.
Cargo.toml Introduces libc as a dependency for setsid().
Cargo.lock Locks the libc version update and records the new direct dependency.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/process.rs Outdated
use std::os::unix::process::CommandExt;
cmd.pre_exec(|| {
// Create a new session — detaches from controlling terminal
libc::setsid();
Comment thread src/process.rs
Comment on lines +103 to +105
// Daemonize: detach from parent process group so the sequencer
// survives shell/tmux session closure (fixes logos-co/logos-scaffold#33)
#[cfg(unix)]
ygd58 added 2 commits April 24, 2026 21:25
If setsid() returns -1 (e.g. process is already a group leader),
propagate the OS error instead of silently continuing with the
child still attached to the original session.

Closes Copilot review feedback on PR logos-co#34.
@ygd58
Copy link
Copy Markdown
Contributor Author

ygd58 commented Apr 24, 2026

Copilot review feedback addressed ✅

setsid() error check: Now returns Err(std::io::Error::last_os_error()) when setsid() returns -1, so spawn() fails with a clear OS error instead of silently continuing.

Issue reference: The Refs #33 in the commit was a copy-paste error. This PR addresses the daemon mode issue (sequencer dying when shell closes), not the prebuilt binaries issue.

@ygd58
Copy link
Copy Markdown
Contributor Author

ygd58 commented May 1, 2026

End-to-end verified on Linux:

$ logos-scaffold localnet start
localnet ready (sequencer pid=579990)

$ ps -o pid,ppid,pgid,sid -p 579990
    PID    PPID    PGID     SID
 579990       1  579990  579990

PPID=1 confirms the sequencer is reparented to init (detached from shell).
PGID=SID=PID confirms it is its own session leader via setsid().

setsid() error check also verified — returns Err on failure instead of silently continuing (addressed Copilot review feedback).

@fryorcraken
Copy link
Copy Markdown
Collaborator

Thanks for the patch — the technical fix is sound (verified locally: rebases cleanly onto current master, all 100 tests pass, and a focused probe confirms the spawned child lands in its own session with no controlling terminal). Before this can move forward I'd like to step back from the diff and ask for a few things that fall out of CONTRIBUTING.md.

1. Wrong issue reference shipped in code. src/process.rs carries (fixes logos-co/logos-scaffold#33) in the inline comment. You already noted in a follow-up that the Refs #33 was a copy-paste error — but the source comment still ships the misdirecting reference. Either remove it or point it at a dedicated issue for "localnet sequencer dies on shell closure". A merged comment that misdirects future readers is worse than no reference at all.

2. Verification trail. The "Testing" block in the description reads as a manual procedure, not as evidence the author ran it. Per CONTRIBUTING.md, localnet changes call for D1 + D2 reruns — please list the scenario IDs you actually rebuilt against, with whatever short evidence you captured. Separately, this behavior is trivially testable in CI without LEZ: spawn any long-lived process via spawn_to_log, read the child's SID via libc::getsid, assert SID == child PID. A regression test here would be ~20 lines and would catch any future change that drops the pre_exec hook silently.

3. User-facing docs. The sequencer's lifecycle relative to the launching shell is changing in a way users will notice. DOGFOODING.md D2 and the localnet section of README.md should reflect that — short additions, but they belong in the same PR per the policy.

None of these are deal-breakers on the fix itself; the patch is well-scoped and the Copilot setsid-return-value feedback was addressed promptly. Once those pieces are in place, this should be ready to land — happy to give it another pass as soon as you've pushed the updates.

One out-of-scope observation surfaced while dogfooding, just FYI: patch_sequencer_port in src/commands/localnet.rs writes the JSON port field, but sequencer_service takes --port on the CLI (default 3040) which overrides JSON. So any project on a non-3040 port currently fails to start regardless of this PR. Worth a separate issue.

- Remove wrong issue reference from src/process.rs comment
- Add regression test: daemonized_process_is_own_session_leader
  verifies spawned child has SID == PID (own session leader)
- Update README.md: document sequencer survives shell/tmux closure
- Update DOGFOODING.md D2: add daemon survival check to expected signals
@ygd58
Copy link
Copy Markdown
Contributor Author

ygd58 commented May 1, 2026

Addressed all review points:

1. Wrong issue reference removed — stripped the misdirecting #33 from the inline comment in src/process.rs.

2. Regression test addeddaemonized_process_is_own_session_leader in process::logged_tests:

test process::logged_tests::daemonized_process_is_own_session_leader ... ok

Spawns a process via spawn_to_log, reads SID via libc::getsid, asserts SID == child PID.

3. Docs updated — README.md and DOGFOODING.md D2 both updated to reflect that the sequencer survives shell/tmux closure.

Ready for another pass @fryorcraken.

@ygd58
Copy link
Copy Markdown
Contributor Author

ygd58 commented May 1, 2026

Fixed two remaining issues:

  1. Wrong issue reference — removed the stale feat: support prebuilt binaries to skip compile-from-source on setup #33 reference from src/process.rs (was accidentally re-introduced in a rebase)

  2. README duplicate — removed the doubled sentence in localnet section

All three points from the review are now addressed:

  • ✅ Wrong issue reference removed
  • ✅ Regression test passes: daemonized_process_is_own_session_leader
  • ✅ README.md and DOGFOODING.md updated

@fryorcraken

@danisharora099
Copy link
Copy Markdown
Contributor

hmm - why would you want the sequencer to keep running? you could possible run only the sequencer as a separate daemon process or inside docker, not through localnet start.

am i missing something?

@ygd58
Copy link
Copy Markdown
Contributor Author

ygd58 commented May 5, 2026

Good question @danisharora099. The use case is developer ergonomics:

logos-scaffold manages the full localnet lifecycle (setup, start, stop, status, logs) as a single cohesive workflow. Developers run localnet start from their project directory and expect it to keep running while they work — switching tmux windows, closing terminals, reconnecting via SSH.

Without setsid, closing the terminal kills the sequencer silently, leaving the project in a broken state. The developer then has to diagnose why wallet topup or deploy fails, which is not obvious.

Docker is a valid alternative but adds a dependency and complexity. The scaffold is designed to work without Docker for the basic development flow.

@danisharora099
Copy link
Copy Markdown
Contributor

if the terminal window for sequencer daemon, that currently runs as a child process inside of the localnet start command, gets closed - why should we expect the sequencer to keep running?

@ygd58
Copy link
Copy Markdown
Contributor Author

ygd58 commented May 5, 2026

That's exactly what setsid() fixes. Without it:

  • localnet start spawns sequencer as a child process
  • The child inherits the parent's controlling terminal
  • When the terminal closes, SIGHUP is sent to the process group
  • Sequencer dies

With setsid():

  • The child creates a new session (verified: SID == PID in our regression test)
  • It has no controlling terminal
  • Terminal closure does not affect it
  • Sequencer keeps running

This is the standard Unix daemonization pattern. The regression test in this PR confirms: spawned child has SID == PID == its own process group leader.

@danisharora099
Copy link
Copy Markdown
Contributor

danisharora099 commented May 5, 2026

thanks for walking through setsid - i get how it works, the question is whether we should want it to.

couple things tho:

the tmux example doesn't really hold. tmux already persists sessions across ssh disconnects, so localnet start inside tmux survives ssh dropping just fine today. the bug only shows up if the dev closes the tmux window themselves, which is them ending the session on purpose. "i closed the terminal and the child died" is just how unix works, not a regression.

for dev tools the usual move is to run foreground and let the user wrap it in nohup or tmux new -d or a systemd user unit if they want persistence. vite dev, anvil, hardhat node, cargo run all do this. no new code, nothing to maintain, plays nice with whatever the user already has.

the bigger thing imo is setsid() only buys like 5% of being a daemon. once the sequencer outlives the shell we suddenly have to think about: two projects running localnet at once colliding on ports, scaffold getting killed mid spawn so the pidfile never gets written but the child is now orphaned to pid 1, stale pidfiles after reboot, sequencer crashing with nothing to restart it, scaffold getting upgraded while the old sequencer keeps running. none of that is in this pr and all of it is going to be our problem to deal with later.

honestly this feels like a feature being shipped as a fix. changing localnet start from "owns this child for the lifetime of the shell" to "kicks off a system-lifetime process" is a real behaviour change, and the fix: prefix + initially wrong #33 ref kinda hides that. so the actual question is just what is localnet start supposed to do.

if we want a system-lifetime daemon then we should do it properly with explicit up/down and handle the stuff above, not slip a setsid() into a generic spawn helper. if we don't, close this and put a one liner in the readme telling people to use nohup or tmux.

i lean foreground tbh but open to the other one. just not into shipping the daemon contract without actually picking it.

@ygd58
Copy link
Copy Markdown
Contributor Author

ygd58 commented May 5, 2026

That's a fair and sharp critique thank you for spelling it out.

You're right that this PR ships a daemon contract without explicitly choosing it. The setsid() was added to fix a concrete symptom (sequencer dying on terminal close) without fully thinking through the lifecycle implications: orphans on crash, port collisions across projects, stale pidfiles, no supervisor.

I'll defer to @fryorcraken on which contract logos-scaffold should adopt for localnet start. If the answer is 'foreground supervisor', I'm happy to close this PR and instead add a README note about nohup/tmux for users who want persistence.

@weboko
Copy link
Copy Markdown
Collaborator

weboko commented May 13, 2026

please, actualize the PR and re-test it

@ygd58
Copy link
Copy Markdown
Contributor Author

ygd58 commented May 14, 2026

Actualized against current master. Resolved README conflict keeping upstream's deploy/build idl/build client additions alongside our daemon note. All 298 lib tests pass. @weboko

@ygd58
Copy link
Copy Markdown
Contributor Author

ygd58 commented May 22, 2026

CI is failing with a Cargo.lock checksum error for r-efi v6.0.0 — this appears to be unrelated to the PR changes. @vpavlin

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants