Skip to content

logpuller: extract generic priority queue#5487

Open
lidezhu wants to merge 7 commits into
masterfrom
ldz/improve-log-puller0523
Open

logpuller: extract generic priority queue#5487
lidezhu wants to merge 7 commits into
masterfrom
ldz/improve-log-puller0523

Conversation

@lidezhu

@lidezhu lidezhu commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator

What problem does this PR solve?

Issue Number: close #5488

What is changed and how it works?

This pull request refactors the priority queue used in the log service into a generic utility package. By moving this logic to utils/priorityqueue, the codebase reduces duplication and provides a more robust, thread-safe implementation that can be leveraged by other components. The changes include updating existing consumers to use the new generic queue and ensuring proper error handling during queue closure.

Highlights

  • Refactoring Priority Queue: Extracted the priority queue implementation from logservice/logpuller into a generic, reusable utils/priorityqueue package.
  • Improved Queue API: Updated the priority queue to support generic types and improved the blocking Pop mechanism with proper ErrClosed handling.
  • Code Cleanup: Removed the redundant local PriorityQueue implementation in logservice/logpuller and updated all references to use the new utility.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

Please refer to [Release Notes Language Style Guide](https://pingcap.github.io/tidb-dev-guide/contribute-to-tidb/release-notes-style-guide.html) to write a quality release note.

If you don't think this PR needs a release note then fill it with `None`.

Summary by CodeRabbit

  • Refactor
    • Consolidated custom priority queue into a reusable generic utility package.
    • Updated region task queue to use the new utility.
    • Added test coverage for priority ordering verification.

@ti-chi-bot

ti-chi-bot Bot commented Jun 23, 2026

Copy link
Copy Markdown

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot Bot added do-not-merge/needs-linked-issue release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Jun 23, 2026
@ti-chi-bot

ti-chi-bot Bot commented Jun 23, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign 3aceshowhand for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Warning

Review limit reached

@lidezhu, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 44 minutes and 13 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more credits in the billing tab to continue.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses rolling per-developer review limits. Reviews become available again as older review attempts age out of the rolling limit window.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 520549ea-94b4-49d9-8b20-f5ed4318f16c

📥 Commits

Reviewing files that changed from the base of the PR and between 7bc477c and c9c9aab.

📒 Files selected for processing (1)
  • utils/priorityqueue/priority_queue_test.go
📝 Walkthrough

Walkthrough

Removes the logpuller-local PriorityQueue type and its test file, and introduces a new generic utils/priorityqueue package providing a thread-safe, heap-backed, blocking priority queue with ErrClosed semantics. The subscription_client field type, constructor, and shutdown error handling are updated to use the new package.

Changes

Generic PriorityQueue extraction and migration

Layer / File(s) Summary
Generic utils/priorityqueue implementation and tests
utils/priorityqueue/priority_queue.go, utils/priorityqueue/priority_queue_test.go
Introduces ErrClosed, generic PriorityQueue[T] struct with mutex and buffered notify channel, New constructor, Push/AddOrUpdate (closed-guard + heap update + wakeup), blocking Pop loop with ctx/channel select, TryPop/Peek/Len helpers, Close with idempotent channel close, and notifyOneLocked. Tests cover ordering, blocking, AddOrUpdate semantics, concurrent producers/consumers, Close lifecycle, wake-on-close, and multi-waiter wakeup.
logpuller migration to utils/priorityqueue
logservice/logpuller/subscription_client.go, logservice/logpuller/subscription_client_test.go, logservice/logpuller/priority_task_test.go
Changes regionTaskQueue field type to *priorityqueue.PriorityQueue[PriorityTask], replaces NewPriorityQueue() with priorityqueue.New[PriorityTask](), and updates handleRegions to return nil on priorityqueue.ErrClosed. Test files mirror the constructor swap and add TestRegionPriorityTaskQueueOrder validating high-before-low dequeue ordering.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • wk989898
  • asddongmen

Poem

🐇 Hoppity-hop, the queue found a new home,
From logpuller's burrow to utils it roams.
Generic T where PriorityTask once sat,
ErrClosed now signals — imagine that!
The rabbit approves this tidy refactor,
Reusable queues are a joyful factor. 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 14.29% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'logpuller: extract generic priority queue' clearly and concisely summarizes the main change: extracting a priority queue from logpuller into a generic utility package.
Description check ✅ Passed The PR description includes the required Issue Number reference (close #5488), explains what changed and how it works, provides highlights of the refactoring, documents test coverage (Unit test), and addresses the template questions about performance and documentation.
Linked Issues check ✅ Passed The PR successfully meets all objectives from #5488: extracting priority queue into utils/priorityqueue, building on utils/heap infrastructure, separating concerns (logpuller-specific logic vs generic queue), reducing duplication risk, enabling component reuse, and clarifying semantics through separation.
Out of Scope Changes check ✅ Passed All changes are directly aligned with the extraction objective: removal of old implementation, addition of generic utils/priorityqueue package, updates to logpuller to use the new generic queue, and comprehensive test coverage for both packages.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ldz/improve-log-puller0523

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@ti-chi-bot ti-chi-bot Bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jun 23, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request replaces the specialized, non-generic PriorityQueue in logpuller with a generic, thread-safe blocking priority queue under utils/priorityqueue. The subscriptionClient and related tests have been updated to use this new generic implementation. Regarding the feedback, the reviewer correctly pointed out that calling notifyOneLocked() inside TryPop is unnecessary and can be removed to prevent spurious wakeups and redundant channel operations.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread utils/priorityqueue/priority_queue.go
@lidezhu lidezhu marked this pull request as ready for review June 23, 2026 18:47
@ti-chi-bot ti-chi-bot Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 23, 2026
@lidezhu

lidezhu commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@lidezhu

lidezhu commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator Author

/gemini summary

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@lidezhu

lidezhu commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator Author

/gemini summary

@gemini-code-assist

Copy link
Copy Markdown

Warning

Gemini encountered an error creating the summary. You can try again by commenting /gemini summary.

@gemini-code-assist

Copy link
Copy Markdown

Summary of Changes

This pull request refactors the priority queue used in the log service into a generic utility package. By moving this logic to utils/priorityqueue, the codebase reduces duplication and provides a more robust, thread-safe implementation that can be leveraged by other components. The changes include updating existing consumers to use the new generic queue and ensuring proper error handling during queue closure.

Highlights

  • Refactoring Priority Queue: Extracted the priority queue implementation from logservice/logpuller into a generic, reusable utils/priorityqueue package.
  • Improved Queue API: Updated the priority queue to support generic types and improved the blocking Pop mechanism with proper ErrClosed handling.
  • Code Cleanup: Removed the redundant local PriorityQueue implementation in logservice/logpuller and updated all references to use the new utility.
New Features

🧠 You can now enable Memory (public preview) to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Activity
  • Pull request created by lidezhu.
  • Bot commented that CI is skipped for draft PR.
  • Bot flagged that the PR is not approved.
  • CodeRabbit performed an initial review.
  • Author requested a review from CodeRabbit.
  • Gemini provided feedback on TryPop implementation, which was subsequently fixed by the author.
  • Author requested a summary via /gemini summary.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@utils/priorityqueue/priority_queue_test.go`:
- Around line 137-158: The issue is that require.NotNil and require.True
assertions are being called directly inside the spawned goroutines created by
wg.Go, which is unsafe because FailNow may be invoked from child goroutines
causing flaky tests. Remove the require assertions from inside both the consumer
goroutine (where require.NotNil is called on task) and the producer goroutine
(where require.True is called on q.Push result). Instead, collect any assertion
failures in shared state such as an error channel, then after wg.Wait()
completes, assert all collected errors from the main test goroutine to ensure
deterministic behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7e675952-1e67-4f8e-89de-ff5041981e3f

📥 Commits

Reviewing files that changed from the base of the PR and between 0eec971 and 7bc477c.

📒 Files selected for processing (7)
  • logservice/logpuller/priority_queue.go
  • logservice/logpuller/priority_queue_test.go
  • logservice/logpuller/priority_task_test.go
  • logservice/logpuller/subscription_client.go
  • logservice/logpuller/subscription_client_test.go
  • utils/priorityqueue/priority_queue.go
  • utils/priorityqueue/priority_queue_test.go
💤 Files with no reviewable changes (2)
  • logservice/logpuller/priority_queue.go
  • logservice/logpuller/priority_queue_test.go

Comment on lines +137 to +158
for range numConsumers {
wg.Go(func() {
for {
task, err := q.Pop(ctx)
if err != nil {
return
}
require.NotNil(t, task)
if atomic.AddInt64(&consumedCount, 1) >= int64(totalTasks) {
cancel()
return
}
}
})
}

for producerID := range numProducers {
wg.Go(func() {
for j := range tasksPerProducer {
priority := producerID*tasksPerProducer + j
require.True(t, q.Push(newMockItem(priority, "task")))
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether require/assertions are called inside goroutines in this test file.
ast-grep run --lang go \
  --pattern 'go func() { $$$ require.$F($$$) $$$ }()' \
  utils/priorityqueue/priority_queue_test.go

Repository: pingcap/ticdc

Length of output: 151


🏁 Script executed:

# First, verify the file exists and check its size
ls -la utils/priorityqueue/priority_queue_test.go

# Read the specific lines mentioned in the review (137-158, 174-175)
echo "=== Lines 137-158 ==="
sed -n '137,158p' utils/priorityqueue/priority_queue_test.go

echo ""
echo "=== Lines 174-175 ==="
sed -n '174,175p' utils/priorityqueue/priority_queue_test.go

# Get full test function context around these lines
echo ""
echo "=== Full TestQueueConcurrentOperations function ==="
ast-grep outline utils/priorityqueue/priority_queue_test.go --match TestQueueConcurrentOperations --view expanded

Repository: pingcap/ticdc

Length of output: 957


🏁 Script executed:

# Get more context around the function to verify line 174-175 situation
sed -n '120,180p' utils/priorityqueue/priority_queue_test.go | cat -n

# Also check if there's actual `require` usage in goroutines at the claimed lines
echo "=== Checking for require calls in goroutines ==="
grep -n "require\." utils/priorityqueue/priority_queue_test.go | head -20

Repository: pingcap/ticdc

Length of output: 2589


Avoid require.* inside spawned goroutines

require calls like NotNil and True may invoke FailNow, which is unsafe from child goroutines and can cause flaky/non-deterministic test failures (lines 144, 157). Record failures in shared state (or an error channel) and assert from the main test goroutine instead.

Proposed fix
 func TestQueueConcurrentOperations(t *testing.T) {
 	q := New[*mockItem]()
@@
 	var wg sync.WaitGroup
 	var consumedCount int64
+	var poppedNil atomic.Bool
+	var pushFailed atomic.Bool
 	ctx, cancel := context.WithCancel(context.Background())
 	defer cancel()
@@
 	for range numConsumers {
 		wg.Go(func() {
 			for {
 				task, err := q.Pop(ctx)
 				if err != nil {
 					return
 				}
-				require.NotNil(t, task)
+				if task == nil {
+					poppedNil.Store(true)
+					cancel()
+					return
+				}
 				if atomic.AddInt64(&consumedCount, 1) >= int64(totalTasks) {
 					cancel()
 					return
 				}
 			}
 		})
 	}
@@
 	for producerID := range numProducers {
 		wg.Go(func() {
 			for j := range tasksPerProducer {
 				priority := producerID*tasksPerProducer + j
-				require.True(t, q.Push(newMockItem(priority, "task")))
+				if !q.Push(newMockItem(priority, "task")) {
+					pushFailed.Store(true)
+					cancel()
+					return
+				}
 			}
 		})
 	}
@@
 	require.Equal(t, int64(totalTasks), consumedCount)
+	require.False(t, poppedNil.Load())
+	require.False(t, pushFailed.Load())
 }

Per coding guidelines: **/*_test.go: "Prefer focused deterministic tests; see docs/agents/testing.md before adding or changing tests."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@utils/priorityqueue/priority_queue_test.go` around lines 137 - 158, The issue
is that require.NotNil and require.True assertions are being called directly
inside the spawned goroutines created by wg.Go, which is unsafe because FailNow
may be invoked from child goroutines causing flaky tests. Remove the require
assertions from inside both the consumer goroutine (where require.NotNil is
called on task) and the producer goroutine (where require.True is called on
q.Push result). Instead, collect any assertion failures in shared state such as
an error channel, then after wg.Wait() completes, assert all collected errors
from the main test goroutine to ensure deterministic behavior.

Source: Coding guidelines

@lidezhu

lidezhu commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator Author

/retest

@lidezhu

lidezhu commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator Author

/test all

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Extract logpuller priority queue into a reusable utility

1 participant