
Pool memqueue producer response channels via sync.Pool#49337

Open
Copilot wants to merge 3 commits into main from copilot/reduce-memqueue-allocations

Conversation

Contributor

Copilot AI commented Mar 7, 2026

Summary

Pool and reuse chan queue.EntryID response channels in the memqueue producer with sync.Pool, matching the existing batchPool pattern in broker.go. This eliminates the per-publish channel allocation that dominated the allocation profile (~96% of alloc_space on BenchmarkProducerThroughput).

Changes

  • Add respChanPool (sync.Pool) producing make(chan queue.EntryID, 1)
  • makePushRequest acquires from pool via respChanPool.Get().(chan queue.EntryID)
  • publish and tryPublish return channels via defer respChanPool.Put(req.resp) — safe because handlePendingResponse fully drains the channel before returning

Benchmark

BenchmarkProducerThroughput  (before)  ~10,007 allocs/op  ~1,120,530 B/op
BenchmarkProducerThroughput  (after)        ~5 allocs/op       ~230 B/op

Test plan

  • All existing memqueue tests pass
  • All queue tests pass (memqueue + diskqueue)

🤖 Generated with Claude Code

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Mar 7, 2026
@mergify
Contributor

mergify bot commented Mar 7, 2026

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @Copilot? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fix up this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8.\d is the label to automatically backport to the 8.\d branch, where \d is the version digit.
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

Copilot AI changed the title [WIP] Reduce memqueue producer allocations by pooling response channels Pool memqueue producer response channels via sync.Pool Mar 7, 2026
@strawgate strawgate marked this pull request as ready for review March 7, 2026 00:50
@strawgate strawgate requested a review from a team as a code owner March 7, 2026 00:50
@coderabbitai
Contributor

coderabbitai bot commented Mar 7, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Introduces a package-level sync.Pool that provides reusable buffered chan queue.EntryID channels. Producers (forgetfulProducer and ackProducer) obtain response channels from the pool instead of allocating per request. openState.publish and openState.tryPublish defer returning non-nil req.resp to the pool, relying on existing draining behavior. Channel lifecycle and reuse are centralized around the open-state publish paths. No public APIs or exported signatures were changed.

Pre-merge checks (2 passed)

  • Linked Issues check — ✅ Passed. All coding requirements from issue #49192 are met: sync.Pool pooling implemented, response channels reused across producers, defer-based lifecycle management in publish/tryPublish, and benchmarks confirm a 99.95% allocation reduction.
  • Out of Scope Changes check — ✅ Passed. All changes are scoped to the memqueue producer channel pooling objective. No unrelated modifications detected in the affected file.



Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@libbeat/publisher/queue/memqueue/produce.go`:
- Around line 151-152: The publish function currently defers
putRespChan(req.resp) immediately, which can return the response channel to the
pool while the queue goroutine may still send to it (race between st.events <-
req and handlePendingResponse), causing a pooled channel to receive a stale
EntryID; remove the early defer in openState.publish and instead call
putRespChan(req.resp) only on the error/early-return paths where the request is
not enqueued (i.e., when you know the queue goroutine will not send), and let
the consumer of the request (the queue goroutine/handler) be responsible for
returning req.resp to the pool after it has sent or closed the response; update
paths that call handlePendingResponse and any return branches to avoid returning
the channel to the pool when st.events <- req succeeded.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8403320f-eba1-45f2-99e6-674c31af592e

📥 Commits

Reviewing files that changed from the base of the PR and between 13108e8 and e98267e.

📒 Files selected for processing (1)
  • libbeat/publisher/queue/memqueue/produce.go

Comment thread libbeat/publisher/queue/memqueue/produce.go Outdated
@leehinman leehinman requested review from faec and removed request for leehinman March 9, 2026 14:33
@strawgate
Contributor

/test

@strawgate
Contributor

strawgate commented Mar 26, 2026

Profiling and benchmarking confirm that the allocation is gone. In our Filebeat perf benchmark of about 10M events, it decreases total allocations by a little over 1%. https://buildkite.com/elastic/filebeat-benchmark/builds/2282

In the baseline makePushRequest shows up as 2,576,698 allocations (1.05%) — in the PR version it's completely gone (0 flat allocs).

@strawgate strawgate enabled auto-merge (squash) March 26, 2026 22:01
@strawgate
Contributor

/test

@pierrehilbert pierrehilbert added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Mar 28, 2026
@elasticmachine
Contributor

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Mar 28, 2026
Comment thread libbeat/publisher/queue/memqueue/produce.go Outdated
Comment thread libbeat/publisher/queue/memqueue/produce.go
Comment thread libbeat/publisher/queue/memqueue/produce.go
Comment thread libbeat/publisher/queue/memqueue/produce.go Outdated
@strawgate strawgate force-pushed the copilot/reduce-memqueue-allocations branch from e0bb981 to c7de87a Compare April 9, 2026 15:03
strawgate added a commit to strawgate/beats that referenced this pull request Apr 9, 2026
PR elastic#49337

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@mergify
Contributor

mergify bot commented Apr 10, 2026

This pull request is now in conflicts. Could you fix it? 🙏
To fix this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b copilot/reduce-memqueue-allocations upstream/copilot/reduce-memqueue-allocations
git merge upstream/main
git push upstream copilot/reduce-memqueue-allocations

Rebase onto current main (post queue-generics refactor #49954) and
adapt the respChanPool optimization to the new generic types.

No logic change: respChanPool.Get/Put wraps chan queue.EntryID exactly
as before; makePushRequest on both forgetfulProducer[T] and
ackProducer[T] now calls getRespChan() and publish/tryPublish defer
respChanPool.Put. Benchmarks confirm: 10009 → 7 allocs/op on
BenchmarkProducerThroughput (Apple M2 Pro, arm64).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@strawgate strawgate force-pushed the copilot/reduce-memqueue-allocations branch from c7de87a to c18400f Compare April 17, 2026 14:54
@strawgate
Copy link
Copy Markdown
Contributor

Rebased onto current main to resolve the merge conflict from the queue generics refactor (#49954). The pool optimization required adapting all type references to the new generic forms — forgetfulProducer[T], ackProducer[T], openState[T], pushRequest[T] — but the logic is identical. Re-ran BenchmarkProducerThroughput to confirm the win still holds on the rebased branch: 10,009 → 7 allocs/op (Apple M2 Pro, arm64).

@strawgate strawgate requested a review from faec April 17, 2026 15:00
Two tests and one benchmark addition:

- TestRespChanAlwaysEmptyOnAcquire: verifies the safety invariant that
  channels returned to the pool by a completed publish are always empty
  when subsequently acquired. A stale buffered EntryID would corrupt the
  return value of the next publish that reuses the channel.

- TestRespChanPoolNoAllocs: verifies 0 heap allocs per getRespChan/Put
  cycle after warmup, confirming channels are genuinely reused. GC is
  disabled during measurement (debug.SetGCPercent(-1)) to make the
  result deterministic.

- BenchmarkProducerThroughput: add b.ReportAllocs() to surface the
  alloc reduction in CI output. With pool: 6-7 allocs/op for 10,000
  events; without pool: ~10,009 allocs/op.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@strawgate
Contributor

Added tests for the pool optimization:

  • TestRespChanAlwaysEmptyOnAcquire — correctness invariant: a channel returned to the pool by a completed publish must always be empty on next acquisition. A stale buffered EntryID would silently corrupt the return value of the next publish reusing that channel.

  • TestRespChanPoolNoAllocs — directly proves pool reuse: getRespChan() + pool.Put produces 0 heap allocs after warmup. GC is disabled during measurement (debug.SetGCPercent(-1)) to make the result deterministic. Without the pool this would be 1 alloc per call.

  • BenchmarkProducerThroughput — added b.ReportAllocs(). CI will now surface the alloc numbers: 6–7 allocs/op for a full 10,000-event batch vs. ~10,009 allocs/op on main.

The previous two tests were testing pool mechanics in isolation
(raw Get/Put and empty-channel checks) rather than the actual
publish path. TestRespChanAlwaysEmptyOnAcquire in particular was
unreliable: getRespChan() after a publish might return a freshly
allocated channel via New rather than a recycled one, making it
trivially pass even with a broken pool.

Replace both with TestPublishPoolNoAllocsInSteadyState, which
runs testing.AllocsPerRun on the real p.Publish path (with a live
queue and background consumer). This directly validates the PR's
claim: with the pool warm and GC disabled, Publish allocates 0
channels per op. Without the pool, this would be 1 alloc/op.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Contributor

TL;DR

golangci-lint failed in all matrix jobs (ubuntu, macos, windows) on the same two testifylint findings in memqueue tests. Update those two assertions, then re-run CI.

Remediation

  • In libbeat/publisher/queue/memqueue/produce_pool_test.go:95, replace the float direct-compare assertion with assert.InEpsilon (or assert.InDelta) to satisfy testifylint float-compare.
  • In libbeat/publisher/queue/memqueue/queue_test.go:316, fix the require.Nilf format call: the message has %v but no arg. Add the missing index argument (or remove %v).
  • Push the fix commit and re-run the PR checks (golangci-lint workflow).
Investigation details

Root Cause

The golangci-lint step fails due to two deterministic testifylint violations in test code; this is not platform-specific, which is why all three OS jobs fail identically.

Evidence

  • Workflow: https://github.com/elastic/beats/actions/runs/24572208781
  • Job/step: lint (ubuntu-latest) → golangci-lint (the same errors also appear in lint (macos-latest) and lint (windows-latest))
  • Key log excerpt:
    • libbeat/publisher/queue/memqueue/produce_pool_test.go:95:2: float-compare: use assert.InEpsilon (or InDelta) (testifylint)
    • libbeat/publisher/queue/memqueue/queue_test.go:316:3: formatter: require.Nilf format %v reads arg #1, but call has 0 args (testifylint)

Validation

  • Not run locally in this workflow context (read-only detective run).

Follow-up

  • After the two assertion fixes, re-trigger CI; if lint still fails, capture the updated golangci-lint output and compare against this report.

From workflow: PR Actions Detective



Labels

skip-changelog · Team:Elastic-Agent-Data-Plane (Label for the Agent Data Plane team)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants