feat(go): introduce context.Context to client methods#2964
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #2964 +/- ##
============================================
- Coverage 73.29% 73.16% -0.14%
Complexity 943 943
============================================
Files 1126 1126
Lines 98435 99069 +634
Branches 75608 75611 +3
============================================
+ Hits 72148 72479 +331
- Misses 23683 23984 +301
- Partials 2604 2606 +2
🚀 New features to boost your workflow:
|
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need a review, please ensure CI is green and the PR is rebased on the latest master. Don't hesitate to ping the maintainers - either @core on Discord or by mentioning them directly here on the PR. Thank you for your contribution! |
…data and verifyClientConnection
1c9e728 to
e0f7a2e
Compare
…hen I/O error happens.
b20ea38 to
8ba63f5
Compare
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need a review, please ensure CI is green and the PR is rebased on the latest master. Don't hesitate to ping the maintainers - either @core on Discord or by mentioning them directly here on the PR. Thank you for your contribution! |
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need a review, please ensure CI is green and the PR is rebased on the latest master. Don't hesitate to ping the maintainers - either @core on Discord or by mentioning them directly here on the PR. Thank you for your contribution! |
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need a review, please ensure CI is green and the PR is rebased on the latest master. Don't hesitate to ping the maintainers - either @core on Discord or by mentioning them directly here on the PR. Thank you for your contribution! |
|
/ready |
|
looks like CI is failing. /author |
hubcio
left a comment
There was a problem hiding this comment.
the PR doesn't compile. go build ./... fails:
client/tcp/tcp_segment_management.go:31:17: not enough arguments in call to c.do
DeleteSegments wasn't migrated - the impl in tcp_segment_management.go still calls c.do(&command.DeleteSegments{...}) without the new ctx arg, and the interface method in contracts/client.go still lacks ctx. that file isn't part of this PR's diff so it can't be commented on inline, but ctx needs threading through both the interface signature and the impl, same as every other client method. go vet flags the same.
the rest of the findings are inline on the Files changed tab.
| // clear the deadline after the operation is done. | ||
| var deadlineMu sync.Mutex | ||
|
|
||
| stop := context.AfterFunc(ctx, func() { |
There was a problem hiding this comment.
the context.AfterFunc callback reads c.conn without holding c.mtx, and stop() does not wait for an already-started callback - it can return false while the callback goroutine is still running. once sendAndFetchResponse returns and the mutex is released, connect() can reassign c.conn at line 459, so the callback's read races that write. worse, the callback then calls SetDeadline(time.Now()) on whatever c.conn points to, which can be a fresh healthy connection, poisoning it into an immediate timeout. fix: snapshot conn := c.conn under the mutex before registering the callback and have the callback use the snapshot. a ctx.Done() == nil fast-path also closes the window for context.Background() callers.
| return nil, err | ||
| } | ||
|
|
||
| // deadlineMu makes sure that the deadline won't be set to now by the |
There was a problem hiding this comment.
the comment claims deadlineMu stops the callback from setting the deadline to now right after clearDeadline clears it, but a mutex only gives mutual exclusion, not ordering. if clearDeadline acquires deadlineMu first it sets the deadline to zero, then the callback runs and sets it to time.Now(), leaving the connection with a past deadline. needs a cleared flag set under deadlineMu that the callback checks, not just the mutex.
| func (c *IggyTcpClient) sendAndFetchResponse(message []byte, command command.Code) ([]byte, error) { | ||
| func (c *IggyTcpClient) sendAndFetchResponse(ctx context.Context, message []byte, command command.Code) ([]byte, error) { | ||
| if ctx == nil { | ||
| return nil, errors.New("nil context") |
There was a problem hiding this comment.
errors.New("nil context") is an untyped error - callers can't errors.Is it. every other error path here returns typed values (context.Canceled, context.DeadlineExceeded, ierror.*). make this a sentinel like ierror.ErrNilContext for consistency at the SDK boundary.
| } | ||
|
|
||
| // invalidateConnLocked closes the connection and marks it as disconnected | ||
| func (c *IggyTcpClient) invalidateConnLocked() { |
There was a problem hiding this comment.
invalidateConnLocked closes the conn and sets StateDisconnected on every i/o error, but nothing on the operation path ever reconnects - connect() is only called from NewIggyTcpClient and the login-after-redirect path, and the heartbeat goroutine just logs ping failures. so after the first transient i/o error the client is permanently dead: every later call writes to the closed conn and fails forever, and the reconnection config (maxRetries/interval) becomes dead code on this path. pre-PR an i/o error left the conn open and state unchanged. either wire up a reconnect trigger here or document this as a hard behavior change.
There was a problem hiding this comment.
Yeah I almost forgot about this . I was planning to solve this in another PR. Maybe I can create a PR to implement the reconnect logic first.
| // deadlineMu makes sure that the deadline won't be set to now by the | ||
| // AfterFunc callback right after we call SetDeadline(time.Time{}) to | ||
| // clear the deadline after the operation is done. | ||
| var deadlineMu sync.Mutex |
There was a problem hiding this comment.
var deadlineMu sync.Mutex plus the context.AfterFunc registration and the clearDeadline/ioError closures are allocated on every sendAndFetchResponse call - including the hot SendMessages/PollMessages path and context.Background() callers whose ctx.Done() is nil and can never fire. a ctx.Done() == nil fast-path that skips the deadline machinery entirely avoids the waste, and as noted on the AfterFunc line also removes the c.conn race for non-cancellable contexts.
| return nil, err | ||
| } | ||
| return c.LoginUser(username, password) | ||
| return c.LoginUser(ctx, username, password) |
There was a problem hiding this comment.
also at line 71 in LoginWithPersonalAccessToken. the recursion re-passes the same ctx, and the first c.do's AfterFunc callback can still be live when connect() (line 46 / 68) reassigns c.conn - same c.conn race flagged in tcp_core.go, reachable single-threaded here. fixing the callback to use a conn snapshot covers this too.
| log.Println("Checking cluster metadata for leader detection") | ||
|
|
||
| meta, err := c.GetClusterMetadata() | ||
| meta, err := c.GetClusterMetadata(ctx) |
There was a problem hiding this comment.
if ctx is cancelled during this GetClusterMetadata call, sendAndFetchResponse closes the connection via invalidateConnLocked and returns ctx.Err() - but the if err != nil block just below logs it and returns ("", nil), swallowing the cancellation. HandleLeaderRedirection then returns (false, nil) and LoginUser returns identity, nil: a successful login on a dead connection. the swallow is intentional for non-clustered servers, so don't propagate all errors blindly - check ctx.Err() specifically and return it before the swallow.
| t.Errorf("got %v, want context.DeadlineExceeded", err) | ||
| } | ||
| // After a timeout, the connection should be invalidated. | ||
| if c.state != iggcon.StateDisconnected { |
There was a problem hiding this comment.
this asserts c.state == StateDisconnected after a timeout, which codifies the sticky-failure behavior (closed conn, no reconnect path) as expected - if that gets fixed, this test changes. separately, every test here is single-operation: none exercise connect() reassigning c.conn while an AfterFunc callback from a prior op is still live, which is exactly the race this PR introduces. go test -race stays green only because of that gap. worth adding a concurrent-reassignment test.
|
Ok it seems that something happend after merging(). Will start working on this on April 20 |
Which issue does this PR close?
N/A
Rationale
Standardizing the Go SDK to support context.Context across all client methods. This allows users to handle timeouts, cancellations
What changed?
The iggcon.Client interface and its TCP implementation have been updated to include context.Context as the first parameter for all API methods.
Correspondingly, all internal command-handling logic now respects context deadlines and cancellations.
Local Execution
AI Usage