Skip to content

fix(noq): do a best-effort transmit on connection close#625

Open
colinmarc wants to merge 1 commit into
n0-computer:mainfrom
colinmarc:main
Open

fix(noq): do a best-effort transmit on connection close#625
colinmarc wants to merge 1 commit into
n0-computer:mainfrom
colinmarc:main

Conversation

@colinmarc
Copy link
Copy Markdown

Description

If we're shutting down ungracefully, we can make a best-effort attempt to write the CONNECTION_CLOSE to the socket and notify the peer. If we're in the middle of graceful shutdown, it doesn't hurt, either.

This can be taken advantage of by callers to do a quicker shutdown if they know the drain period isn't necessary. Discussion here: n0-computer/iroh#4201

Copy link
Copy Markdown
Member

@matheus23 matheus23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for looking into this @colinmarc :)

Before we put in more work, I need to check back with @flub whether our ideas in n0-computer/iroh#4201 are sound & desirable.

Other than that:

  • This needs a test! Ideally one that sets up a Runtime similar to the one we're using in iroh: Most importantly one that uses a TaskTracker and fairly aggressively closes all tasks (in this case those will be the connection and endpoint drivers).
  • Rather than driving the transmit in close, I'd like to drive the transmit only when the ConnectionDriver is dropped. Otherwise I worry the drive_transmit from the connection driver and the close call can get in each others way. I think there's an invariant that the connection driver is the only thing that calls drive_transmit and it never misses the consecutive calls to that function. The right place for calling drive_transmit might be in the impl Drop for ConnectionRef right after the conn.implicit_close(..).

@colinmarc
Copy link
Copy Markdown
Author

Before we put in more work, I need to check back with @flub whether our ideas in n0-computer/iroh#4201 are sound & desirable.

No worries, I just wanted to have something concrete to discuss :) This is only half of a workaround for that issue anyway.

The right place for calling drive_transmit might be in the impl Drop for ConnectionRef right after the conn.implicit_close(..).

That makes sense! Moved it to the drop for ConnectionRef. I wasn't sure if you meant inside the conditional, but that would mean we don't get this optimization for an explicit close/drop.

If we're shutting down ungracefully, we can make a best-effort attempt
to write the CONNECTION_CLOSE to the socket and notify the peer. If
we're in the middle of graceful shutdown, it doesn't hurt, either.
@matheus23 matheus23 changed the title fix: do a best-effort transmit on connection close fix(noq): do a best-effort transmit on connection close May 18, 2026
@dignifiedquire dignifiedquire moved this from 🚑 Needs Triage to 🏗 In progress in iroh May 19, 2026
Comment thread noq/src/connection.rs
// ungracefully.
let waker = std::task::Waker::noop();
let mut cx = std::task::Context::from_waker(waker);
let _ = conn.drive_transmit(&mut cx);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is kind of tempting to do this. But also, there's very little guarantee. E.g. even if a CONNECTION_CLOSE was scheduled for transmit, it is supposed to respect congestion control and might well not be sent (that could be buggy currently, I recall spotting some CONNECTION_CLOSE packets that did not respect congestion control).

I think at the end of the day when ungraceful things happen the remote has to deal with things just timing out. I think the edge-cases for which this would help are not really worth this extra complexity. I'd rather people discover these problems faster in tests if their code is genuinely misbehaving.

So I'm tempted to close this without merging.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - I also ran some tests to see if this would actually help.
The idea was to make graceful close work when you just drop(endpoint), but keep the process alive a bit.

I didn't dive into the details of why it didn't quite work, but I didn't get the expected result. At least not when the AsyncUdpSocket implementation is the one it is in iroh (I haven't tested it with a normal UdpSocket in noq).

So yeah - I think this might need a bit more thorough work and looking into how this would actually work.

W.r.t. congestion control I think it is completely fine to not send the CONNECTION_CLOSE if CC doesn't allow it. I expect that in most cases CC will actually allow the close frame, though I admit relying on that behavior is a bad idea.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, I wouldn't really expect it to work with anything other than a normal udp socket. One thing I'm not clear on - with iroh, after migration off the relay, isn't that the case?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, once iroh selected the direct path it should behave the same as noq with a normal udp socket as far as this PR is concerned.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: 🏗 In progress

Development

Successfully merging this pull request may close these issues.

4 participants