[WIP] #5479

Draft
lidezhu wants to merge 3 commits into master from ldz/optimize-log-puller0618

@lidezhu lidezhu commented Jun 22, 2026

Collaborator

What problem does this PR solve?

Issue Number: close #xxx

What is changed and how it works?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

Please refer to [Release Notes Language Style Guide](https://pingcap.github.io/tidb-dev-guide/contribute-to-tidb/release-notes-style-guide.html) to write a quality release note.

If you don't think this PR needs a release note then fill it with `None`.

coderabbitai Bot commented Jun 22, 2026

Contributor

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 56024428-2922-43bd-8726-ddd8910ced28

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

lidezhu changed the title from "alpha version" to "[WIP]" on Jun 22, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a dedicated, configurable memory quota and a region scan throttling mechanism for the log puller. It also moves deregister requests into a separate control queue so that cleanup tasks are not blocked. The review feedback suggests three changes: guarding against division by zero when maxPendingSize is zero; passing the subscription's stoppedCh to popControlOrWait, so that a subscription stopped while the scan gate is closed does not cause a deadlock or delay; and adding a nil check for c.Puller to avoid a nil pointer dereference.

Comment thread logservice/logpuller/memory_quota.go Outdated
Comment on lines +49 to +53
func (q *pullerMemoryQuota) ShouldPausePath(
	_ bool, pathPendingSize int64, _ int64, maxPendingSize uint64, _ int64,
) (bool, bool, float64) {
	return false, false, float64(pathPendingSize) / float64(maxPendingSize)
}

medium

If maxPendingSize is 0, dividing pathPendingSize by maxPendingSize yields NaN (when pathPendingSize is also 0) or ±Inf rather than a usable ratio. A defensive check for maxPendingSize == 0 avoids these unexpected float values.

func (q *pullerMemoryQuota) ShouldPausePath(
	_ bool, pathPendingSize int64, _ int64, maxPendingSize uint64, _ int64,
) (bool, bool, float64) {
	if maxPendingSize == 0 {
		return false, false, 0.0
	}
	return false, false, float64(pathPendingSize) / float64(maxPendingSize)
}

Comment thread logservice/logpuller/memory_quota.go Outdated
Comment on lines +56 to +66
func (q *pullerMemoryQuota) ShouldPauseArea(
	paused bool, pendingSize int64, maxPendingSize uint64,
) (bool, bool, float64) {
	usageRatio := float64(pendingSize) / float64(maxPendingSize)
	q.updateRegionScanState(usageRatio, pendingSize, maxPendingSize)

	if paused {
		return false, usageRatio < 1, usageRatio
	}
	return usageRatio >= 1, false, usageRatio
}

medium

As with ShouldPausePath, if maxPendingSize is 0, usageRatio will be NaN or +Inf. Comparisons such as usageRatio < 1 and usageRatio >= 1 then behave unexpectedly: every ordered comparison against NaN is false, and +Inf is never below 1. Guarding the division with maxPendingSize > 0 keeps the behavior well-defined.

func (q *pullerMemoryQuota) ShouldPauseArea(
	paused bool, pendingSize int64, maxPendingSize uint64,
) (bool, bool, float64) {
	var usageRatio float64
	if maxPendingSize > 0 {
		usageRatio = float64(pendingSize) / float64(maxPendingSize)
	}
	q.updateRegionScanState(usageRatio, pendingSize, maxPendingSize)

	if paused {
		return false, usageRatio < 1, usageRatio
	}
	return usageRatio >= 1, false, usageRatio
}
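
To make the failure mode in the two comments above concrete, here is a minimal standalone sketch. The helper names usageRatio and guardedUsageRatio are hypothetical and not part of the PR; they only isolate the division so its results can be inspected.

```go
package main

import (
	"fmt"
	"math"
)

// usageRatio is a hypothetical helper that mirrors the unguarded
// expression in the diff: a float division with no denominator check.
func usageRatio(pendingSize int64, maxPendingSize uint64) float64 {
	return float64(pendingSize) / float64(maxPendingSize)
}

// guardedUsageRatio is a hypothetical helper that applies the suggested
// guard and reports 0 when maxPendingSize is zero.
func guardedUsageRatio(pendingSize int64, maxPendingSize uint64) float64 {
	if maxPendingSize == 0 {
		return 0
	}
	return float64(pendingSize) / float64(maxPendingSize)
}

func main() {
	nan := usageRatio(0, 0)   // 0/0 is NaN; Go does not panic on float division
	inf := usageRatio(128, 0) // positive/0 is +Inf

	fmt.Println("NaN:", math.IsNaN(nan), "+Inf:", math.IsInf(inf, 1))

	// Every ordered comparison with NaN is false, so with NaN neither
	// "usageRatio >= 1" (pause) nor "usageRatio < 1" (resume) can fire.
	fmt.Println("nan < 1:", nan < 1, "nan >= 1:", nan >= 1)

	// +Inf always satisfies ">= 1" and never "< 1", so an area that
	// pauses on +Inf would not resume through this check.
	fmt.Println("inf >= 1:", inf >= 1, "inf < 1:", inf < 1)

	fmt.Println("guarded:", guardedUsageRatio(0, 0), guardedUsageRatio(128, 0))
}
```

Whether 0 is the right ratio to report for a zero quota is a design choice for the PR author; the sketch only shows that the guard removes the NaN and +Inf cases.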

Comment on lines +169 to +180
func (c *requestCache) popControlOrWait(
	ctx context.Context, resume <-chan struct{},
) (regionReq, bool, error) {
	select {
	case req := <-c.controlQueue:
		return req, true, nil
	case <-resume:
		return regionReq{}, false, nil
	case <-ctx.Done():
		return regionReq{}, false, ctx.Err()
	}
}

medium

To prevent potential deadlocks or unnecessary delays when a subscription is stopped while the scan gate is closed, pass the subscription's stoppedCh to popControlOrWait and select on it. Additionally, check controlQueue with a non-blocking select before the blocking one: when several cases of a single select are ready, Go picks one at random, so the extra check ensures deregister requests are processed first when both controlQueue and resume are ready.

func (c *requestCache) popControlOrWait(
	ctx context.Context, resume <-chan struct{}, stoppedCh <-chan struct{},
) (regionReq, bool, error) {
	select {
	case req := <-c.controlQueue:
		return req, true, nil
	default:
	}
	select {
	case req := <-c.controlQueue:
		return req, true, nil
	case <-resume:
		return regionReq{}, false, nil
	case <-stoppedCh:
		return regionReq{}, false, nil
	case <-ctx.Done():
		return regionReq{}, false, ctx.Err()
	}
}

	if !paused {
		break
	}
	controlReq, ok, err := s.requestCache.popControlOrWait(ctx, resume)

medium

Pass region.subscribedSpan.stoppedCh to popControlOrWait to ensure the worker immediately unblocks and exits the loop if the subscription is stopped while waiting for the scan gate to resume.

Suggested change
- controlReq, ok, err := s.requestCache.popControlOrWait(ctx, resume)
+ controlReq, ok, err := s.requestCache.popControlOrWait(ctx, resume, region.subscribedSpan.stoppedCh)

Comment thread pkg/config/debug.go
Comment on lines +53 to +56
	if c.Puller.MemoryQuota == 0 {
		return cerror.ErrInvalidServerOption.GenWithStackByArgs(
			"debug.puller.memory-quota must be greater than 0")
	}

medium

If c.Puller is nil (e.g., due to an incomplete configuration file), accessing c.Puller.MemoryQuota will cause a nil pointer dereference panic during startup. Adding a nil check or initializing it with defaults prevents this.

Suggested change
- 	if c.Puller.MemoryQuota == 0 {
- 		return cerror.ErrInvalidServerOption.GenWithStackByArgs(
- 			"debug.puller.memory-quota must be greater than 0")
- 	}
+ 	if c.Puller == nil {
+ 		c.Puller = NewDefaultPullerConfig()
+ 	} else if c.Puller.MemoryQuota == 0 {
+ 		return cerror.ErrInvalidServerOption.GenWithStackByArgs(
+ 			"debug.puller.memory-quota must be greater than 0")
+ 	}
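
The nil-then-validate pattern from this suggestion can be exercised on its own. In the sketch below, every type, the validate method, and the default quota value are hypothetical stand-ins chosen for illustration; they are not the definitions in pkg/config, and the plain error replaces cerror.ErrInvalidServerOption.

```go
package main

import (
	"errors"
	"fmt"
)

// pullerConfig and debugConfig are hypothetical stand-ins for the
// configuration structs touched by the PR.
type pullerConfig struct {
	MemoryQuota uint64
}

type debugConfig struct {
	Puller *pullerConfig
}

// newDefaultPullerConfig returns an assumed default; the 1 GiB value is
// made up for this sketch.
func newDefaultPullerConfig() *pullerConfig {
	return &pullerConfig{MemoryQuota: 1 << 30}
}

// validate fills in defaults when the puller section is missing and
// rejects an explicit zero quota.
func (c *debugConfig) validate() error {
	if c.Puller == nil {
		// A missing section is not an error: fall back to defaults
		// instead of dereferencing a nil pointer.
		c.Puller = newDefaultPullerConfig()
		return nil
	}
	if c.Puller.MemoryQuota == 0 {
		return errors.New("debug.puller.memory-quota must be greater than 0")
	}
	return nil
}

func main() {
	missing := &debugConfig{}
	fmt.Println(missing.validate(), missing.Puller.MemoryQuota > 0)

	zero := &debugConfig{Puller: &pullerConfig{}}
	fmt.Println(zero.validate())
}
```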
