Detail Bug Report
https://app.detail.dev/org_89d327b3-b883-4365-b6a3-46b6701342a9/bugs/bug_d9956c71-cb46-45c6-832d-cdf3a351da0c
Introduced in #1 by @quettabit on Apr 7, 2026
Summary
- Context: The append session retry logic uses
frame_signal to track whether HTTP/2 DATA frames were written to the socket, enabling safe retry decisions with the NO_SIDE_EFFECTS policy.
- Bug: In retry scenarios,
frame_signal.reset() is called when all resend acknowledgments are processed, but fresh data may have already been sent, causing the signal to be incorrectly cleared.
- Actual vs. expected: The signal should remain
True if any data (resend or fresh) was sent, but it gets reset to False after resend acks even when fresh data was already transmitted.
- Impact: With
NO_SIDE_EFFECTS policy, retries are incorrectly allowed for errors like 500/503/504/408/TransportError, leading to duplicate records in the stream.
Code with Bug
# _run_attempt in src/s2_sdk/_s2s/_append_session.py
# Lines 194-197 - The buggy reset logic
if resend_remaining > 0:
resend_remaining -= 1
if resend_remaining == 0 and frame_signal is not None:
frame_signal.reset() # <-- BUG 🔴 clears the signal even if fresh DATA has already been sent
Explanation
_run_attempt uses frame_signal to decide if it is safe to retry with AppendRetryPolicy.NO_SIDE_EFFECTS. The retry logic treats frame_signal being unset (False) as evidence that no side-effecting DATA was written.
- The ack loop decrements
resend_remaining as resend acknowledgments arrive and calls frame_signal.reset() when resend_remaining reaches zero.
- However, resend and fresh payloads can both be sent before all resend acks are processed (the send path calls
on_write()/signal() after releasing the write lock, while acks are processed concurrently). This allows the sequence:
- resend sent →
signal=True; 2) fresh sent → signal=True; 3) resend ack processed → reset() → signal=False; 4) an error occurs before the fresh ack.
- At that point there is still fresh data in flight, but
frame_signal is False. In _retrier.py, this makes not_signalled=True, which bypasses the has_inflight guard and allows a retry even for errors that are not considered “no side effects” (e.g., 500/503/504/408/TransportError). This can cause duplicate appends.
Recommended Fix
Track whether fresh data has been added to inflight_inputs:
# In _run_attempt, before the message loop
initial_inflight_count = len(inflight_inputs)
# In the ack loop (lines 194-197)
if resend_remaining > 0:
resend_remaining -= 1
if resend_remaining == 0 and frame_signal is not None:
# Only reset if no fresh data has been added
if len(inflight_inputs) == initial_inflight_count:
frame_signal.reset()
Safety: len(inflight_inputs) is atomic in Python (GIL). The check provides eventual consistency for the retry decision.
History
This bug was introduced in commit 3dc9795. The commit added the initial implementation of the S2 SDK including the retry mechanism for append sessions. The frame_signal.reset() logic was designed to clear the signal once all resend data was acknowledged, but it didn't account for the concurrent execution model where fresh data could already be sent via _body_gen while resend acknowledgments are being processed in the ack loop. The bug slipped in because the reset condition only checks resend_remaining == 0 without verifying that no fresh data has been added to inflight_inputs.
Detail Bug Report
https://app.detail.dev/org_89d327b3-b883-4365-b6a3-46b6701342a9/bugs/bug_d9956c71-cb46-45c6-832d-cdf3a351da0c
Introduced in #1 by @quettabit on Apr 7, 2026
Summary
frame_signalto track whether HTTP/2 DATA frames were written to the socket, enabling safe retry decisions with theNO_SIDE_EFFECTSpolicy.frame_signal.reset()is called when all resend acknowledgments are processed, but fresh data may have already been sent, causing the signal to be incorrectly cleared.Trueif any data (resend or fresh) was sent, but it gets reset toFalseafter resend acks even when fresh data was already transmitted.NO_SIDE_EFFECTSpolicy, retries are incorrectly allowed for errors like 500/503/504/408/TransportError, leading to duplicate records in the stream.Code with Bug
Explanation
_run_attemptusesframe_signalto decide if it is safe to retry withAppendRetryPolicy.NO_SIDE_EFFECTS. The retry logic treatsframe_signalbeing unset (False) as evidence that no side-effecting DATA was written.resend_remainingas resend acknowledgments arrive and callsframe_signal.reset()whenresend_remainingreaches zero.on_write()/signal()after releasing the write lock, while acks are processed concurrently). This allows the sequence:signal=True; 2) fresh sent →signal=True; 3) resend ack processed →reset()→signal=False; 4) an error occurs before the fresh ack.frame_signalisFalse. In_retrier.py, this makesnot_signalled=True, which bypasses thehas_inflightguard and allows a retry even for errors that are not considered “no side effects” (e.g., 500/503/504/408/TransportError). This can cause duplicate appends.Recommended Fix
Track whether fresh data has been added to
inflight_inputs:Safety:
len(inflight_inputs)is atomic in Python (GIL). The check provides eventual consistency for the retry decision.History
This bug was introduced in commit 3dc9795. The commit added the initial implementation of the S2 SDK including the retry mechanism for append sessions. The
frame_signal.reset()logic was designed to clear the signal once all resend data was acknowledged, but it didn't account for the concurrent execution model where fresh data could already be sent via_body_genwhile resend acknowledgments are being processed in the ack loop. The bug slipped in because the reset condition only checksresend_remaining == 0without verifying that no fresh data has been added toinflight_inputs.