
perf: pool FetchResponse objects to reduce consumer GC pressure #581

Closed
thomhurst wants to merge 5 commits into main from perf/pool-fetch-response-objects

Conversation

@thomhurst
Owner

Summary

  • Pool FetchResponse, FetchResponseTopic, and FetchResponsePartition using ConcurrentStack<T>-based pools (matching the existing RecordBatch pooling pattern), eliminating per-fetch-cycle allocations of these envelope objects
  • Cascade return-to-pool from FetchResponse.ReturnToPool() through topics and partitions, clearing all fields to prevent reference leaks while preserving Records ownership transfer to PendingFetchData
  • Add unit tests verifying pool reuse, field clearing, cascading return, overflow handling, and concurrent thread-safety

Pool sizes: 64 for FetchResponse, 256 for FetchResponseTopic, 1024 for FetchResponsePartition -- scaled to typical consumer cardinality per fetch cycle.
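The pooling pattern described above can be sketched as follows. This is a minimal illustration of the ConcurrentStack<T> + soft-limit-counter design, not the PR's actual code — the type name PooledResponse and the size constant are hypothetical stand-ins:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Illustrative sketch of the ConcurrentStack<T>-based pool pattern.
// Names (PooledResponse, MaxPoolSize) are placeholders, not the PR's types.
public sealed class PooledResponse
{
    private const int MaxPoolSize = 64;
    private static readonly ConcurrentStack<PooledResponse> s_pool = new();
    private static int s_poolCount; // O(1) counter; ConcurrentStack<T>.Count is O(n)

    public static PooledResponse RentFromPool()
    {
        if (s_pool.TryPop(out var pooled))
        {
            Interlocked.Decrement(ref s_poolCount);
            return pooled;
        }
        return new PooledResponse(); // pool empty: fall back to allocation
    }

    public void ReturnToPool()
    {
        // Clear instance fields here to avoid leaking references, then push back.
        // Soft limit: intentionally non-atomic check-then-act; a small overshoot
        // under contention is acceptable and cheaper than locking.
        if (Volatile.Read(ref s_poolCount) < MaxPoolSize)
        {
            s_pool.Push(this);
            Interlocked.Increment(ref s_poolCount);
        }
        // Over the cap: drop the object and let the GC collect it.
    }
}
```

On a single thread, a rent after a return pops the same instance back off the stack, which is what eliminates the per-fetch-cycle allocations.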

Test plan

  • Unit tests pass: FetchResponsePoolingTests (10 tests covering rent/return, field clearing, cascading, overflow, concurrency)
  • Integration tests pass with Docker (consumer fetch cycles reuse pooled objects correctly)
  • Verify no regression in existing protocol encoding tests

@claude bot left a comment

PR Review: Pool FetchResponse Objects to Reduce Consumer GC Pressure

Overall, the approach is sound and follows the established RecordBatch pooling pattern. The cascade return-to-pool design and the use of ConcurrentStack<T> + soft-limit counter are consistent with the existing codebase. However, there are several correctness and safety issues that should be addressed before merging.


Critical: No Exception Safety — Pool Objects Can Leak or Return Early

Both PrefetchFromBrokerAsync and FetchFromBrokerAsync call response.ReturnToPool() inline, with no try/finally guard. If any exception is thrown between the response = assignment and the ReturnToPool() call (e.g., from _prefetchChannel.Writer.WriteAsync, UpdateWatermarksFromFetchResponse, or channel backpressure OperationCanceledException), the pooled object is never returned to the pool. This is a resource leak on exception paths.

The fix is to wrap the body in try/finally:

// Correct pattern:
try
{
    // ... process response ...
}
finally
{
    response.ReturnToPool();
}

PrefetchFromBrokerWithErrorHandlingAsync (line 927) catches all exceptions except OperationCanceledException when wakeup-requested, so the leak is real and not just theoretical.


Correctness: Responses = [] in ReturnToPool() Allocates on Every Return

In FetchResponse.ReturnToPool() (line 78) and FetchResponseTopic.ReturnToPool() (line 175), the assignments Responses = [] and Partitions = [] are not guaranteed to lower to the cached Array.Empty<T>() singleton: a collection expression [] targeting IReadOnlyList<T> may emit a new empty array instance rather than the singleton, which would allocate on every return.

Check whether this is Array.Empty<T>() or new T[0] at the IL level. If it is not the singleton, replace with the explicit singleton:

Responses = Array.Empty<FetchResponseTopic>();
Partitions = Array.Empty<FetchResponsePartition>();

The field initializers (= []) on lines 41 and 144 have the same concern for the initial object construction path.


Concern: ReturnToPool() Called Before Channel Write Completes

In PrefetchFromBrokerAsync, response.ReturnToPool() is called at line 1095, but the channel writes happen after this (lines 1107–1109). Because Records ownership was transferred to PendingFetchData beforehand, this is safe for the records themselves. However, the AbortedTransactions list from partitionResponse.AbortedTransactions is also transferred directly to PendingFetchData (line 1063), and ReturnToPool() sets AbortedTransactions = null on the partition objects before those PendingFetchData items are written to the channel. This is safe only if PendingFetchData captures the reference at construction time and never reads back through the partition; verify that this is the case.

This appears to be correct based on the constructor call at line 1059-1063, but it is a subtle invariant that should be documented with a comment in FetchResponsePartition.ReturnToPool().
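The invariant at stake can be illustrated with a small sketch (PendingData and PartitionEnvelope are hypothetical stand-ins for PendingFetchData and FetchResponsePartition): the outliving object must capture the list reference before the pooled object is cleared.

```csharp
using System;
using System.Collections.Generic;

// Sketch of the ownership-transfer invariant discussed above.
// PendingData / PartitionEnvelope are illustrative, not the PR's types.
public sealed class PendingData
{
    public IReadOnlyList<long>? AbortedTransactions { get; }

    // Capture the reference at construction time — after this point the
    // source partition may be cleared and returned to the pool safely.
    public PendingData(IReadOnlyList<long>? aborted) => AbortedTransactions = aborted;
}

public sealed class PartitionEnvelope
{
    public List<long>? AbortedTransactions { get; set; }

    public void ReturnToPool()
    {
        // Safe only because any consumer (PendingData) already holds the list
        // reference directly; nulling the field here does not affect it.
        AbortedTransactions = null;
    }
}
```

If a future refactor moved ReturnToPool() before the PendingData construction, the captured reference would be null — which is exactly why the invariant deserves a comment.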


Design: Breaking init API Contract Without a Clear Replacement Guarantee

All init properties on FetchResponse, FetchResponseTopic, and FetchResponsePartition are changed to internal set. The init modifier previously provided a compiler-enforced guarantee that these objects were immutable after construction, making them safe to inspect at any point. The internal set replacement is weaker:

  • The objects are now mutable at any point from within the assembly (not just during Read()).
  • Any internal consumer code that holds a FetchResponseTopic or FetchResponsePartition reference for longer than the fetch cycle (e.g., caching a reference for logging) will observe silently cleared/mutated fields after ReturnToPool().

Consider documenting the expected lifetime contract clearly, or use a marker (e.g., a bool _returned guard in debug builds) to detect use-after-return.
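One way to realize the suggested debug-build guard is sketched below (the _returned flag, OnRented hook, and class name are hypothetical, not the PR's code):

```csharp
using System;
using System.Diagnostics;

// Sketch of a debug-mode use-after-return guard as suggested above.
// Debug.Assert compiles away in release builds; in debug builds a stale
// reference trips the assert at the first property access.
public sealed class PooledEnvelope
{
    private bool _returned;
    private string[] _items = Array.Empty<string>();

    public string[] Items
    {
        get
        {
            Debug.Assert(!_returned, "Use-after-return: object was given back to the pool.");
            return _items;
        }
        internal set => _items = value;
    }

    public void ReturnToPool()
    {
        _items = Array.Empty<string>(); // clear to avoid reference leaks
        _returned = true;
        // ... push to the shared pool here ...
    }

    internal void OnRented() => _returned = false; // reset when re-rented
}
```

This keeps the release-build hot path free of checks while making lifetime violations loud during development.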


Design: FetchResponsePartition Pool Max Size (1024) May Be Misleading

The comment says "typically 1-50" partitions, but the pool size is set to 1024. The existing RecordBatch pool uses 2048 but the pool holds batches, not partitions. A consumer with 50 partitions fetching from 2 brokers concurrently needs at most 100 pooled partitions. A pool of 1024 wastes ~80KB of GC roots for no benefit. Consider aligning the pool size with the realistic cardinality, or at minimum document the reasoning for the 1024 cap.


Minor: s_poolCount Tracking Is Cosmetic for ConcurrentStack<T>

The pool tracks its size in a separate s_poolCount field (read via Volatile.Read) instead of using ConcurrentStack<T>.Count, which is O(n). Tracking the count separately with Interlocked is the correct approach and matches the RecordBatch pattern — this is fine. The comment "Soft limit: intentionally non-atomic check-then-act" correctly documents why a TOCTOU race is acceptable here. No action needed.


Minor: [AggressiveInlining] on RentFromPool() Is Questionable

RentFromPool() contains a branch (TryPop success vs. new()). [AggressiveInlining] for a two-branch method that invokes ConcurrentStack.TryPop (which is not trivially inlinable itself) is unlikely to provide measurable benefit and inflates the caller's code size. This matches the RecordBatch pattern, so it's consistent, but both could be revisited.


Tests: Good Coverage, One Gap

The 10 unit tests cover the key scenarios well (rent/return, field clearing, cascading, overflow, concurrency). One gap: there is no test verifying behavior when ReturnToPool() is called twice on the same object (double-return). A re-rented object's fields look cleared but the object could appear twice in the pool stack if the caller has a bug. A debug-mode ObjectDisposedException or similar guard would make this detectable.


Summary

Issue | Severity
No try/finally around ReturnToPool() — pool objects leaked on exception | High
Responses = [] / Partitions = [] may allocate on every return | Medium
AbortedTransactions lifetime invariant not documented | Low
init → internal set contract weakening not documented | Low
Pool size 1024 for FetchResponsePartition oversized | Low
No double-return detection in tests | Low

The try/finally issue is the most important to fix before merging. The PR achieves meaningful GC pressure reduction on the consumer path and the core design is correct.

Co-Authored-By: Claude Sonnet 4.6 noreply@anthropic.com

@claude bot left a comment

Follow-up Review: All Previous Concerns Addressed

The author has responded to my earlier review with three follow-up commits. This is an assessment of the final state of the PR at HEAD (6c3bbcbf).


Resolved: All High/Medium Issues from Previous Review

try/finally around ReturnToPool() — Fixed. Both PrefetchFromBrokerAsync and FetchFromBrokerAsync now wrap the response-processing loop in try/finally, ensuring response.ReturnToPool() is called even when exceptions occur. This is the correct pattern.

Array.Empty() field initializers and ReturnToPool() cleanup — Fixed. The field initializers and all ReturnToPool() clearing assignments now use Array.Empty<FetchResponseTopic>() / Array.Empty<FetchResponsePartition>() explicitly, guaranteeing zero allocation via the singleton.

Double-return guard — Fixed. The _pooled flag prevents duplicate pool entries. ReturnToPool() silently no-ops on a second call, and the property guards detect use-after-return.

Use-after-return detection — Added. Responses, Partitions, and Records properties now throw ObjectDisposedException when _pooled == true. This turns silent data corruption into a loud, immediately detectable error — the right tradeoff for a pooled object.

Flaky IsSameReferenceAs test assertions — Fixed. The three ReturnAndRent_* tests no longer assert object identity; they only assert field clearing, which is the actual behavioral contract. This is correct per the CLAUDE.md guidance: pool identity is non-deterministic under parallel test execution.


Remaining Observations (Non-Blocking)

_pooled is a plain bool, not volatile

_pooled is read and written from ReturnToPool() and the property getters without any memory fence. In the normal consumer usage pattern this is single-threaded (one fetch loop per broker), so this is not a practical correctness issue. However, if a future consumer implementation ever shares a FetchResponsePartition reference across threads, the non-volatile flag could be missed. This is consistent with the RecordBatch pattern in the codebase (which also uses a plain bool for similar guards), so this is acceptable as-is.

ReturnToPool() iterates _responses (the backing field) rather than going through the guarded property

This is correct. If ReturnToPool() read through the Responses property it would throw ObjectDisposedException if called twice. Using _responses directly is the right choice.

FetchResponse_ReturnToPool_ClearsNestedTopicsAndPartitions test has a subtle pool-sharing assumption

The test rents a FetchResponse, FetchResponseTopic, and FetchResponsePartition, returns them via response.ReturnToPool(), then immediately re-rents three objects and asserts their fields are cleared. Under parallel test execution, the three re-rented objects are not guaranteed to be the same instances that were just returned — another test thread could drain one or more slots from the pool. The field-clearing assertions are correct regardless of object identity, but the test implicitly assumes it gets the same three objects back. If a different (freshly constructed) object is returned on re-rent, the field assertions still pass trivially (a new object always has cleared fields). The test does not actually prove that the cascading return clears fields on the pooled instances. To make it airtight, capture references to the inner objects before returning and assert on those specific references — but since the use-after-return ObjectDisposedException tests already cover the cascading behavior, this is low priority.


Summary

The implementation is correct, safe, and well-tested. The core design is sound: the ConcurrentStack<T> + soft-limit counter + _pooled guard + try/finally return pattern matches the established RecordBatch pooling approach. The ObjectDisposedException use-after-return guards are a good addition that makes lifecycle violations visible immediately.

The remaining items above are all non-blocking. This PR is ready to merge.

Co-Authored-By: Claude Sonnet 4.6 noreply@anthropic.com

@claude bot left a comment

Follow-Up Review: Pool FetchResponse Objects (Updated PR)

The three follow-up commits address all the critical and medium-severity issues from the previous review. The PR is substantially improved. Here is an updated assessment of what remains.


What Was Fixed (Good)

  • try/finally around ReturnToPool() — Both PrefetchFromBrokerAsync and FetchFromBrokerAsync now use try/finally to guarantee pool return on exception paths. This was the most important fix.
  • Array.Empty<T>() instead of [] — The field initializers and ReturnToPool() reset paths now use the explicit singleton, eliminating the allocation-per-return concern.
  • ObjectDisposedException guard on Responses, Partitions, and Records — Use-after-return is now detectable at runtime, not silent data corruption.
  • Double-return guard via _pooled flag — Idempotent ReturnToPool() prevents pool corruption from accidental double-return.
  • Tests for use-after-return and double-return — Coverage is now comprehensive.

Remaining Concern: _pooled Flag Has a Non-Atomic Read-Then-Use Race

The _pooled field is a plain bool, not volatile, not protected by any synchronization. The guard in the Responses / Partitions / Records getters reads _pooled without a memory barrier:

get
{
    if (_pooled) throw new ObjectDisposedException(nameof(FetchResponse));
    return _responses;
}

Meanwhile, ReturnToPool() sets _pooled = true and then clears _responses without any barrier either:

_pooled = true;

if (Volatile.Read(ref s_poolCount) < MaxPoolSize)
{
    s_pool.Push(this);
    ...
}

If two threads race — one reading Responses while another calls ReturnToPool() — the getter can observe _pooled == false (stale read due to lack of volatile), bypass the guard, and then read a partially cleared _responses. This is particularly relevant since PrefetchFromBrokerAsync runs on a background thread and ReturnToPool() is in the finally of the same method, but if a consumer thread also holds a reference (e.g., from logging or error handling), the race window exists.

The fix is to mark _pooled as volatile:

private volatile bool _pooled;

The RecordBatch pool uses the same pattern — if that field is also non-volatile, both should be fixed together.


Remaining Concern: Double-Return Test Is Now a No-Op

Commit 6c3bbcbf0d removed the IsSameReferenceAs assertions from the double-return tests with the stated reason "flaky". The resulting tests are:

response.ReturnToPool();
response.ReturnToPool();   // <-- idempotent guard prevents double-push

var first = FetchResponse.RentFromPool();
var second = FetchResponse.RentFromPool();

if (ReferenceEquals(first, response) && ReferenceEquals(second, response))
{
    Assert.Fail("...");
}

first.ReturnToPool();
second.ReturnToPool();

This test can never fail, even if the double-return guard is removed. The pool is a shared static ConcurrentStack. When first and second are rented after a double-return of response, there may be other objects in the pool from other tests (since this is a static pool with no reset between tests). The condition ReferenceEquals(first, response) && ReferenceEquals(second, response) requires the same instance to be popped twice consecutively, which is impossible unless the stack is corrupted — but the stack being corrupted by double-push would produce the same object at two different stack positions, not necessarily at the top two positions.

The original intent (verify idempotency) is sound, but the test as written does not actually verify what it claims. The flakiness in the original version stemmed from the shared pool state. A correct, deterministic test would isolate pool state or verify the _pooled flag directly:

var response = FetchResponse.RentFromPool();
response.ReturnToPool();

// Second return should be silently ignored — no exception, _pooled guard fires
response.ReturnToPool(); // Should not throw

// Verify the object is not rented out twice by checking pool count via a known drain

Consider either reverting to a meaningful test or documenting that the double-return guard is tested via the ObjectDisposedException use-after-return tests (which indirectly cover it).


Remaining Minor: AbortedTransactions Lifetime Invariant Still Undocumented

FetchResponsePartition.ReturnToPool() sets AbortedTransactions = null, but PendingFetchData was constructed with the reference already captured before ReturnToPool() is called. This is safe because PendingFetchData stores the reference directly — but this invariant is not documented anywhere. A future refactor that changes when ReturnToPool() is called (e.g., moving it before PendingFetchData construction) would silently introduce a null dereference.

A comment in ReturnToPool() like // Safe: AbortedTransactions reference already captured by PendingFetchData before this point would make the invariant explicit and reviewable.


Remaining Minor: Pool Size 1024 for FetchResponsePartition Is Unjustified

The comment still says "typically 1-50" partitions while the pool size is 1024. A consumer with 50 partitions across 2 brokers needs at most ~100 pooled partitions. 1024 means ~900 objects are perpetually allocated but never used. This is GC roots that work against the goal of reducing GC pressure. Either reduce the cap to ~256 (matching FetchResponseTopic's 256) or add a comment explaining the reasoning for the larger cap.


Summary

Issue | Severity | Status
try/finally around ReturnToPool() | High | Fixed
Array.Empty<T>() allocation on return | Medium | Fixed
_pooled flag not volatile — potential race | Medium | Open
Double-return test is a no-op after commit 6c3bbcbf | Medium | Open
AbortedTransactions lifetime invariant undocumented | Low | Open
FetchResponsePartition pool size 1024 unjustified | Low | Open

The core design is correct and the critical exception-safety issue is resolved. The volatile concern and the test no-op are the items worth addressing before merging.

Co-Authored-By: Claude Sonnet 4.6 noreply@anthropic.com

…ion objects

Reduce GC pressure in the consumer fetch cycle by pooling the response
envelope objects instead of allocating new ones each cycle.

…Array.Empty, double-return guard

- Wrap response processing in try/finally in both prefetch and non-prefetch
  paths in KafkaConsumer.cs to ensure ReturnToPool() is always called even
  if an exception occurs during response iteration.
- Replace [] with Array.Empty<T>() in FetchResponse.cs field initializers
  and ReturnToPool() methods to guarantee zero allocation.
- Add _pooled flag to FetchResponse, FetchResponseTopic, and
  FetchResponsePartition to guard against double-return corrupting the pool.
- Add double-return detection tests for all three pooled types.

Convert Responses, Partitions, and Records properties from auto-properties
to backing fields with a guard that throws ObjectDisposedException when
accessed after ReturnToPool(). This prevents silent use of zeroed/nulled
fields when code holds a stale reference to a pooled object.

The _pooled flag (already present for double-return detection) is reused
for the guard check. ReturnToPool() methods now use backing fields directly
to avoid triggering the guard during cleanup.

Pool identity assertions are non-deterministic under parallel test
execution — another test can drain the pool between ReturnToPool()
and RentFromPool(). Assert field clearing instead, which is the
actual contract being tested.
@thomhurst force-pushed the perf/pool-fetch-response-objects branch from 6c3bbcb to 10a264f on March 25, 2026 14:12

@claude bot left a comment

Follow-Up Review (Round 4): Status of Outstanding Issues

This review checks the current diff against the open items from the previous round.


Open: _pooled Flag Is Still Not volatile

The _pooled field across all three classes is still declared as private bool _pooled, without volatile. This was raised as a medium-severity issue in the previous review and has not been addressed.

The concern: ReturnToPool() sets _pooled = true and then clears backing fields without a memory fence. The Responses/Partitions/Records property getters read _pooled without a fence. In the normal single-threaded per-broker fetch loop this is fine in practice, but the absence of volatile is a correctness hazard that grows as the codebase evolves.

The fix is a one-line change per class:

private volatile bool _pooled;

This is the same fix needed in RecordBatch if that class uses the same non-volatile pattern. The cost is negligible (one extra memory fence on pool return, which is off the hot path), and the benefit is a documented, correct happens-before relationship between the writer and any reader on a different thread.


Open: Double-Return Test Is Still a No-Op

The FetchResponse_DoubleReturn_DoesNotDuplicateInPool test (and equivalents for Topic/Partition) has this structure:

// Drain the pool first to get a clean state
while (FetchResponse.RentFromPool() is not null)
{
    // RentFromPool always returns non-null (creates new if pool empty), so just rent a few
    break;  // ← immediately breaks, drains nothing
}

var response = FetchResponse.RentFromPool();
response.ReturnToPool();
response.ReturnToPool(); // second call is a no-op due to _pooled guard

var first = FetchResponse.RentFromPool();
var second = FetchResponse.RentFromPool();

if (ReferenceEquals(first, response) && ReferenceEquals(second, response))
{
    Assert.Fail("...");
}

Two problems:

  1. The "drain" loop immediately breaks and drains nothing, so the comment is misleading.
  2. The _pooled guard prevents the second ReturnToPool() from pushing to the stack at all. The Assert.Fail condition (first === response AND second === response) is therefore structurally impossible — response can appear at most once in the pool, so second can never be the same object as response if first already was. The test cannot fail regardless of whether the guard works.

The test as written does not verify the double-return guard. What it does demonstrate is that calling ReturnToPool() twice does not throw. That behavior is already covered by the use-after-return tests (the second ReturnToPool() will trigger the _pooled early-return path silently).

To actually test the guard, check the _pooled field state directly (requires making it accessible to tests, e.g., via internal + InternalsVisibleTo), or verify via the ObjectDisposedException behavior: after the first return, accessing a guarded property should throw; after the second return, accessing that same property should still throw (not reset to accessible). The latter is already covered by the existing use-after-return tests.

The cleanest fix is to either remove the duplicate tests (since the coverage they intended is already provided by the use-after-return tests) or replace them with a meaningful assertion:

var response = FetchResponse.RentFromPool();
response.ReturnToPool();

// Second return should not throw
response.ReturnToPool();

// The object must still be considered "returned" (use-after-return guard still active)
await Assert.That(() => response.Responses).Throws<ObjectDisposedException>();

Open: AbortedTransactions Lifetime Invariant Undocumented

FetchResponsePartition.ReturnToPool() sets AbortedTransactions = null while the caller has already transferred this reference to PendingFetchData. The invariant (safe because PendingFetchData captures the reference at construction, before ReturnToPool() is called) is still undocumented. A single comment in ReturnToPool() is sufficient:

// Safe: AbortedTransactions reference already captured by PendingFetchData before ReturnToPool() is called.
AbortedTransactions = null;

Open: Pool Size 1024 Comment Is Internally Inconsistent

The comment on FetchResponsePartition still reads "typically 1-50" with a MaxPoolSize = 1024. With "typically 1-50" partitions per broker and "typically 1-2" concurrent brokers, the steady-state working set is 50–100 objects. A pool of 1024 holds ~900 idle objects permanently on the GC heap. Either reduce the cap to 128 or 256 (matching FetchResponseTopic), or replace the comment with justification for the larger cap.


Summary

Issue | Severity | Status
try/finally around ReturnToPool() | High | Fixed
Array.Empty<T>() allocation on return | Medium | Fixed
ObjectDisposedException use-after-return guards | Medium | Fixed
_pooled flag not volatile | Medium | Still open
Double-return tests are no-ops | Medium | Still open (test logic flawed)
AbortedTransactions lifetime invariant undocumented | Low | Still open
FetchResponsePartition pool size 1024 unjustified | Low | Still open

The implementation is functionally correct and safe. The volatile issue and the test no-op are the items most worth fixing before merging, as the test currently provides false confidence in the double-return guard.

Co-Authored-By: Claude Sonnet 4.6 noreply@anthropic.com

Tests share a global ConcurrentStack pool, so parallel execution can
rent objects returned by other tests, resetting their _pooled flag
and causing use-after-return guard assertions to fail. Added
[NotInParallel] to serialize these tests.
@claude bot left a comment

Follow-Up Review (Round 5): Current State at HEAD

The two most recent commits (10a264f0, 5d5168c2) address test reliability only — removing flaky reference-identity assertions and adding [NotInParallel] to prevent pool state bleed between tests. These are good, necessary fixes. However, the four open issues from Round 4 are still unaddressed.


Still Open: _pooled Flag Is Not volatile

All three classes declare:

private bool _pooled;

The RecordBatch class in the same codebase uses private volatile bool _disposed for its equivalent guard. Without volatile, there is no guaranteed memory ordering between the thread that calls ReturnToPool() (which sets _pooled = true and then clears fields) and any thread that calls a property getter (which reads _pooled to check for use-after-return). In the current consumer loop, all accesses happen on the same thread, so this is not an active bug today — but it is a correctness hazard as the codebase evolves, and it deviates from the established pattern in RecordBatch.

Fix (one line per class):

private volatile bool _pooled;

Still Open: Double-Return Tests Are No-Ops

The FetchResponse_DoubleReturn_DoesNotDuplicateInPool test and its Topic/Partition equivalents do not exercise the guard they claim to test. The _pooled guard prevents the second ReturnToPool() call from pushing to the stack, so the Assert.Fail condition (first === response && second === response) is structurally impossible — response can appear in the stack at most once. The test cannot fail regardless of whether the guard is present or absent.

The simplest fix that actually validates the guard:

[Test]
public async Task FetchResponse_DoubleReturn_ObjectStillConsideredReturned()
{
    var response = FetchResponse.RentFromPool();
    response.ReturnToPool();

    // Second return should not throw
    response.ReturnToPool();

    // Object must still be in the "returned" state — the guard should not reset on the second call
    await Assert.That(() => response.Responses).Throws<ObjectDisposedException>();
}

This replaces the current meaningless reference-identity check with a meaningful behavioral assertion: calling ReturnToPool() twice leaves the object in the "disposed" state, not in a halfway-reset state.


Still Open: AbortedTransactions Lifetime Invariant Is Undocumented

FetchResponsePartition.ReturnToPool() nulls out AbortedTransactions after the caller has already transferred that reference to PendingFetchData. This is safe because the transfer happens before ReturnToPool() is called, but the ReturnToPool() method has no comment explaining why it is safe to null this field despite it being observable from PendingFetchData.

A future maintainer reading ReturnToPool() in isolation would see a live reference being zeroed and might reasonably worry about a use-after-free. A single comment resolves this:

// Safe: AbortedTransactions reference already captured by PendingFetchData before ReturnToPool() is called.
AbortedTransactions = null;

Still Open: Pool Size 1024 Is Inconsistent With "Typically 1–50" Comment

FetchResponsePartition has MaxPoolSize = 1024 with the comment "typically 1-50" partitions per fetch cycle. With 1–2 concurrent brokers that is a steady-state working set of 50–100 objects. The pool will permanently hold ~900 idle objects on the GC heap. Either:

  • Reduce to 128 or 256 (matching FetchResponseTopic), or
  • Update the comment with the justification for 1024 (e.g., burst tolerance, pre-warming intent).

As written, the comment argues against the cap value, which creates confusion.


Summary

Issue | Severity | Status
try/finally around ReturnToPool() | High | Fixed ✓
Array.Empty<T>() allocation on return | Medium | Fixed ✓
ObjectDisposedException use-after-return guards | Medium | Fixed ✓
Flaky reference-identity assertions in tests | Medium | Fixed ✓
Parallel test pool interference | Medium | Fixed ✓
_pooled flag not volatile | Medium | Still open
Double-return tests are no-ops | Medium | Still open
AbortedTransactions lifetime invariant undocumented | Low | Still open
FetchResponsePartition pool size 1024 unjustified | Low | Still open

The implementation is functionally correct. The volatile issue and the no-op double-return tests are the most important remaining items — the former for correctness consistency with RecordBatch, the latter because the tests currently provide false confidence in the double-return guard.

Co-Authored-By: Claude Sonnet 4.6 noreply@anthropic.com

@thomhurst
Owner Author

Closing: this PR's core pooling work was superseded by #591 (merged to main). The beneficial additions (use-after-return guards, pooling tests) will be recreated in a new PR on top of the current ObjectPool implementation.

@thomhurst closed this on Mar 26, 2026
thomhurst added a commit that referenced this pull request Mar 26, 2026
Add volatile _pooled flag and ObjectDisposedException guards to
FetchResponse, FetchResponseTopic, and FetchResponsePartition. This
detects use-after-return bugs at runtime instead of allowing silent
data corruption when a pooled object is accessed after being returned.

Also adds double-return idempotency (second ReturnToPool is a no-op)
and comprehensive pooling tests covering field clearing, cascading
return, thread safety, and use-after-return detection.

Salvaged from #581 which was superseded by #591.
thomhurst added a commit that referenced this pull request Mar 26, 2026
* fix: add use-after-return guards to pooled FetchResponse objects

Add volatile _pooled flag and ObjectDisposedException guards to
FetchResponse, FetchResponseTopic, and FetchResponsePartition. This
detects use-after-return bugs at runtime instead of allowing silent
data corruption when a pooled object is accessed after being returned.

Also adds double-return idempotency (second ReturnToPool is a no-op)
and comprehensive pooling tests covering field clearing, cascading
return, thread safety, and use-after-return detection.

Salvaged from #581 which was superseded by #591.

* fix: address review — atomic pool guard, guard AbortedTransactions

- Replace volatile bool with int + Interlocked.CompareExchange in
  ReturnToPool() on all three classes, eliminating the TOCTOU race
  where two concurrent callers could both pass the check and double-
  return the same object to the pool.

- Add use-after-return guard to AbortedTransactions on
  FetchResponsePartition (was inconsistently unguarded while Records
  was guarded). Uses same backing field + ObjectDisposedException
  pattern.

- Add test for AbortedTransactions guard.

* fix: address review feedback — cache Records reference, strengthen tests

- Cache partitionResponse.Records in a local variable in both consumer
  fetch paths to avoid repeated Volatile.Read from the pool guard
- Add IsSameReferenceAs assertions to ReturnAndRent tests to prove
  Reset() actually clears the dirty instance (not a fresh default)
- Use non-pooled FetchResponsePartition in topic test to avoid leaking
  a rented partition into the pool via cascading ReturnToPool
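The atomic double-return guard described in the commit message above (int + Interlocked.CompareExchange replacing the volatile bool) can be sketched like this — GuardedEnvelope and OnRented are illustrative names, not the merged code:

```csharp
using System;
using System.Threading;

// Sketch of the int + Interlocked.CompareExchange double-return guard
// described in the commit above. Names are placeholders.
public sealed class GuardedEnvelope
{
    private int _pooled; // 0 = in use, 1 = returned to pool

    public bool ReturnToPool()
    {
        // Only the single caller that wins the 0 -> 1 transition performs the
        // push; concurrent or repeated callers observe 1 and no-op, so the
        // object can never be pushed to the pool twice (no TOCTOU window).
        if (Interlocked.CompareExchange(ref _pooled, 1, 0) != 0)
            return false; // already returned — idempotent no-op

        // ... clear fields and push to the shared pool here ...
        return true;
    }

    internal void OnRented() => Volatile.Write(ref _pooled, 0); // reset on re-rent
}
```

Unlike a plain (or even volatile) bool check-then-set, the compare-exchange makes the guard itself atomic, which is what closes the race between two concurrent returners.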