100w test #4942
Review skipped
Auto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the Run configuration. Configuration used: Organization UI Review profile: CHILL. Plan: Pro.
This cherry pick PR is for a release branch and has not yet been approved by triage owners. To merge this cherry pick:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Code Review
This pull request updates the project to Go 1.25 and introduces features such as dynamic log redaction, failpoint management via API, and table ID-based paths for cloud storage. It also includes architectural improvements to the coordinator's bootstrap and GC cleanup logic, along with fixes for DDL ordering and virtual column handling in sinks. Technical feedback identifies potential API hangs in the coordinator due to missing context checks in wait loops, a resource leak in the cloud storage writer when write operations fail, and a potential busy-wait performance issue in the block event executor.
coordinator/controller.go (656-664)
This loop waits for the operator to finish without any timeout or context cancellation check. If the operator gets stuck, the API request will hang indefinitely. It is recommended to use a select statement to check for ctx.Done().
ticker := time.NewTicker(time.Second)
defer ticker.Stop()
for {
if op.IsFinished() {
break
}
select {
case <-ctx.Done():
return 0, context.Cause(ctx)
case <-ticker.C:
count++
log.Info("wait for stop changefeed operator finished", zap.Int("count", count), zap.Any("id", id))
}
}
coordinator/controller.go (692-700)
This loop waits for the operator to finish without any timeout or context cancellation check. If the operator gets stuck, the API request will hang indefinitely. It is recommended to use a select statement to check for ctx.Done().
ticker := time.NewTicker(time.Second)
defer ticker.Stop()
for {
if op.IsFinished() {
break
}
select {
case <-ctx.Done():
return context.Cause(ctx)
case <-ticker.C:
count++
log.Info("wait for stop changefeed operator finished", zap.Int("count", count), zap.Any("id", id))
}
}
downstreamadapter/sink/cloudstorage/writer.go (254-256)
If writer.Write returns an error, the function returns immediately without calling writer.Close. This can lead to resource leaks (e.g., open file descriptors or incomplete multipart uploads in cloud storage). You should ensure writer.Close is called even if Write fails.
if _, inErr = writer.Write(ctx, buf.Bytes()); inErr != nil {
_ = writer.Close(ctx)
return 0, 0, inErr
}
downstreamadapter/dispatcher/block_event_executor.go (61-71)
This logic can lead to a busy-wait loop if the ready queue only contains dispatchers that are currently inUse. This will cause high CPU usage as workers repeatedly pop and re-push the same dispatcher IDs. Consider using a more efficient way to handle per-dispatcher task serialization, such as a set of per-worker queues or a more advanced scheduling mechanism.
What problem does this PR solve?
Issue Number: close #xxx
What is changed and how it works?
Check List
Tests
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note