CSHARP-5887: Simplify retryable read and writes #1882
papafe wants to merge 9 commits into mongodb:main
| serverResponse = commandException.Result; | ||
| } | ||
| catch (Exception exception) | ||
| catch (Exception exception) when (!context.ErrorDuringLastChannelAcquisition) |
This avoids an exception in EnsureCanProceedNextBatch and ToFinalResultsOrThrow: after a failed channel acquisition, the context no longer has a channel.
The error was not visible before because channel acquisition happened outside the try/catch, so the exception simply propagated out of the whole method.
The question is whether we want a non-retryable channel acquisition failure to make the whole method throw (as it did before), or whether we need another way to make EnsureCanProceedNextBatch and ToFinalResultsOrThrow work.
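The effect of the `when` filter can be sketched in isolation. This is a minimal, self-contained illustration and not the driver's code: `FakeContext`, `ExceptionFilterSketch` and `Run` are hypothetical names, and only the `ErrorDuringLastChannelAcquisition` flag name is taken from the diff above.

```csharp
using System;

// Hypothetical stand-in for the retryable context; only the flag name comes from the PR.
public sealed class FakeContext
{
    public bool ErrorDuringLastChannelAcquisition { get; set; }
}

public static class ExceptionFilterSketch
{
    // Returns "handled" when the catch block runs; otherwise the exception escapes to the caller.
    public static string Run(bool failDuringAcquisition)
    {
        var context = new FakeContext();
        try
        {
            if (failDuringAcquisition)
            {
                // Simulates a failure before any channel exists.
                context.ErrorDuringLastChannelAcquisition = true;
                throw new InvalidOperationException("channel acquisition failed");
            }

            // Simulates a failure after a channel was acquired.
            throw new InvalidOperationException("command failed");
        }
        catch (Exception) when (!context.ErrorDuringLastChannelAcquisition)
        {
            // Only reached for failures that happened with a channel in hand.
            return "handled";
        }
    }
}
```

With the filter in place, an acquisition failure skips the catch block entirely and surfaces to the caller, which matches the "whole method throws" behaviour described above.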
| { | ||
| _databaseNamespace = Ensure.IsNotNull(databaseNamespace, nameof(databaseNamespace)); | ||
| _command = Ensure.IsNotNull(command, nameof(command)); | ||
| _command = command; //can be null |
This allows the command to be modified after the ReadCommandOperation has been created.
It is needed because some operations require the operationContext to build the command correctly, but the operationContext is now not available until later.
I'm not a fan of this implementation, to be honest.
In my opinion we have two other options:
- Make `ReadCommandOperation` equivalent to `WriteCommandOperation`, so that it is not concerned with retryability. Then add a new class, `RetryableReadCommandOperation`, that handles retryability and uses `ReadCommandOperation` internally.
- Instead of allowing the command to be set, add a method to modify it once we have the operation context.
I think option 1 is more desirable, but it requires much more refactoring.
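The first option could look roughly like the sketch below. Everything here is hypothetical and heavily simplified: `IAttemptOperation`, `PlainReadCommand` and `RetryingRead` are illustrative names, the signatures are not the driver's API, and treating `TimeoutException` as the retryable error is an assumption made only to keep the example self-contained.

```csharp
using System;

// Hypothetical, simplified shape; not the driver's real interface.
public interface IAttemptOperation<TResult>
{
    TResult ExecuteAttempt(int attempt);
}

// Plain operation: runs a single attempt and knows nothing about retries.
public sealed class PlainReadCommand : IAttemptOperation<string>
{
    private readonly Func<int, string> _run;

    public PlainReadCommand(Func<int, string> run) => _run = run;

    public string ExecuteAttempt(int attempt) => _run(attempt);
}

// Wrapper: owns the retry policy and delegates each attempt to the inner operation.
public sealed class RetryingRead<TResult>
{
    private readonly IAttemptOperation<TResult> _inner;
    private readonly int _maxAttempts;

    public RetryingRead(IAttemptOperation<TResult> inner, int maxAttempts)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
        _maxAttempts = maxAttempts;
    }

    public TResult Execute()
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return _inner.ExecuteAttempt(attempt);
            }
            catch (TimeoutException) when (attempt < _maxAttempts)
            {
                // Assumed retryable for this sketch; on the last attempt the filter
                // is false and the exception propagates to the caller.
            }
        }
    }
}
```

The point of the split is that the inner operation stays as simple as the write-side equivalent, while the retry loop lives in one wrapper.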
| { | ||
| HashSet<ServerDescription> deprioritizedServers = null; | ||
| var attempt = 1; | ||
| var totalAttempts = 0; |
Renamed to align with the write version, and now starting from 0 to be more compatible with the future client backpressure integration.
Pull request overview
Simplifies retryable read/write execution by centralizing channel acquisition/replacement inside the retry executors/contexts, and introduces dynamic command creation for retryable read command operations.
Changes:
- Refactors `RetryableReadOperationExecutor`/`RetryableWriteOperationExecutor` to acquire/replace channels per attempt and track last acquired server.
- Removes `RetryableReadContext.Create*`/`RetryableWriteContext.Create*` factories; callers now construct contexts directly.
- Adds `ICommandCreator` + a `ReadCommandOperation` overload to build commands dynamically per attempt/connection (used by find/aggregate/count/distinct).
Reviewed changes
Copilot reviewed 27 out of 27 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/MongoDB.Driver.Tests/Core/Operations/RetryableWriteOperationExecutorTests.cs | Updates test helper to construct RetryableWriteContext and acquire a channel explicitly. |
| tests/MongoDB.Driver.Tests/Core/Operations/CommandOperationBaseTests.cs | Removes test asserting command must be non-null (aligning with dynamic-command support). |
| tests/MongoDB.Driver.Tests/Core/LoadBalancingIntergationTests.cs | Updates helper methods to construct contexts and explicitly acquire channels sync/async. |
| src/MongoDB.Driver/Core/Operations/RetryableWriteOperationExecutor.cs | Moves channel acquisition into executor loop; adds attempt counters and uses LastAcquiredServer. |
| src/MongoDB.Driver/Core/Operations/RetryableWriteContext.cs | Removes Create/CreateAsync; adds ErrorDuringLastChannelAcquisition and LastAcquiredServer. |
| src/MongoDB.Driver/Core/Operations/RetryableWriteCommandOperationBase.cs | Switches to direct context construction (executor now acquires channels). |
| src/MongoDB.Driver/Core/Operations/RetryableUpdateCommandOperation.cs | Adjusts retry payload construction and payload variable used in the message section. |
| src/MongoDB.Driver/Core/Operations/RetryableReadOperationExecutor.cs | Moves channel acquisition into executor loop; uses LastAcquiredServer for deprioritization. |
| src/MongoDB.Driver/Core/Operations/RetryableReadContext.cs | Removes Create/CreateAsync; tracks LastAcquiredServer; pins channel after acquisition. |
| src/MongoDB.Driver/Core/Operations/RetryableInsertCommandOperation.cs | Adjusts retry payload construction for insert retries. |
| src/MongoDB.Driver/Core/Operations/RetryableDeleteCommandOperation.cs | Adjusts retry payload construction for delete retries. |
| src/MongoDB.Driver/Core/Operations/ReadCommandOperation.cs | Adds dynamic-command constructor via ICommandCreator; sets command per attempt. |
| src/MongoDB.Driver/Core/Operations/ListIndexesUsingCommandOperation.cs | Switches to direct RetryableReadContext construction. |
| src/MongoDB.Driver/Core/Operations/ListIndexesOperation.cs | Switches to direct RetryableReadContext construction. |
| src/MongoDB.Driver/Core/Operations/ListCollectionsOperation.cs | Switches to direct RetryableReadContext construction. |
| src/MongoDB.Driver/Core/Operations/ICommandCreator.cs | Introduces interface for creating commands dynamically from session/connection info. |
| src/MongoDB.Driver/Core/Operations/FindOperation.cs | Implements ICommandCreator; passes creator into ReadCommandOperation. |
| src/MongoDB.Driver/Core/Operations/EstimatedDocumentCountOperation.cs | Switches to direct RetryableReadContext construction; updates BeginOperation overload usage. |
| src/MongoDB.Driver/Core/Operations/DistinctOperation.cs | Implements ICommandCreator; passes creator into ReadCommandOperation; updates BeginOperation. |
| src/MongoDB.Driver/Core/Operations/CountOperation.cs | Implements ICommandCreator; passes creator into ReadCommandOperation. |
| src/MongoDB.Driver/Core/Operations/CommandOperationBase.cs | Allows null command to support dynamic command building; adds SetCommand. |
| src/MongoDB.Driver/Core/Operations/ClientBulkWriteOperation.cs | Constructs RetryableWriteContext directly; filters exception handling for acquisition failures. |
| src/MongoDB.Driver/Core/Operations/ChangeStreamOperation.cs | Switches to direct RetryableReadContext construction (aggregate operation will acquire channels). |
| src/MongoDB.Driver/Core/Operations/BulkUnmixedWriteOperationBase.cs | Constructs RetryableWriteContext directly; makes final result creation tolerant of null channel. |
| src/MongoDB.Driver/Core/Operations/BulkMixedWriteOperation.cs | Constructs RetryableWriteContext directly. |
| src/MongoDB.Driver/Core/Operations/AggregateOperation.cs | Implements ICommandCreator and uses dynamic ReadCommandOperation construction. |
| src/MongoDB.Driver/Core/Misc/BatchableSource.cs | Adds internal ctor to preserve processedCount; adjusts public ctor delegation/validation. |
| } | ||
| private EventContext.OperationNameDisposer BeginOperation() => EventContext.BeginOperation(OperationName); | ||
| private EventContext.OperationIdDisposer BeginOperation() => EventContext.BeginOperation(null, OperationName); |
This is something strange. If we use this overload, the command name gets set properly.
Before this PR we connected to the server when creating the retryable context, and at that point the command name was already set. Now the connection happens inside the retryable operation executor, and by then (for example) ReadCommandOperation.Execute has been called; it contains BeginOperation(null, null), which sets the command name to null. As a result, the command name in ClusterSelectingServerEvent is null.
We need to decide whether this is the way we want to go and, if so, do the same for the other read operations.
| /// Sets the command to be executed. This is used by derived classes that build commands dynamically. | ||
| /// </summary> | ||
| /// <param name="command">The command.</param> | ||
| protected void SetCommand(BsonDocument command) |
As far as I understand, we need this method so that inherited classes can set the command just before executing it. Also, as far as I understand, we do not expose _command (other than for testing purposes). Instead of SetCommand, should we have an abstract CreateCommand method and call it from ExecuteProtocol, so that each inherited class can produce its own command?
Your comment made me realise we could actually just pass the command to ExecuteProtocol, so I removed the SetCommand method.
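A rough sketch of that idea, under stated assumptions: the command is built per attempt and handed to `ExecuteProtocol` as an argument instead of being stored in a field. `CommandOperationSketch` and `FindSketch` are hypothetical, a `Dictionary` stands in for `BsonDocument`, and the wire-version threshold is arbitrary and only there to show connection-dependent command building.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical base class: the command is an argument, not stored state.
public abstract class CommandOperationSketch
{
    protected static string ExecuteProtocol(IReadOnlyDictionary<string, object> command, int maxWireVersion)
        => $"sent {command.Count} field(s) at wire version {maxWireVersion}";
}

public sealed class FindSketch : CommandOperationSketch
{
    private readonly string _collection;

    public FindSketch(string collection) => _collection = collection;

    // Builds the command from per-attempt connection information, then executes it.
    public string ExecuteAttempt(int maxWireVersion)
    {
        var command = new Dictionary<string, object> { ["find"] = _collection };
        if (maxWireVersion >= 9) // arbitrary threshold, for illustration only
        {
            command["comment"] = "built per attempt";
        }

        return ExecuteProtocol(command, maxWireVersion);
    }
}
```

Because nothing is cached between attempts, a retry against a different server naturally rebuilds the command for that connection.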
| server = context.SelectServer(operationContext, deprioritizedServers); | ||
| context.AcquireChannel(operationContext); | ||
| return operation.ExecuteAttempt(operationContext, context, totalAttempts, transactionNumber: null); |
Should we track executionAttempts separately? In the non-CSOT scenario we have only 2 attempts, so if there is an error during server selection or channel acquisition we might exit after just 1 execution attempt.
Are we talking about the attempt variable that is passed to operation.ExecuteAttempt? If so, that variable is not actually used for read operations. For write operations, RetryableWriteOperationExecutor has both totalAttempts and operationExecutionAttempts. We could add it here too for the sake of symmetry, but it would not be used either.
If it's about counting how many attempts we have made, and therefore how many we have left, then as far as I understand from the retryable reads and writes specs those errors should be counted as well.
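The difference between the two counters can be sketched as follows. This is a simplified illustration, not the executor's actual loop: `AttemptCountingSketch`, `Run` and the `acquire`/`execute` delegates are hypothetical, and the only assumption carried over from the discussion is that a failed server selection or channel acquisition still consumes one of the total attempts.

```csharp
using System;

public static class AttemptCountingSketch
{
    // acquire(totalAttempt) returns false when server selection or channel acquisition fails.
    // execute(executionAttempt) returns true when the operation succeeds.
    public static (int Total, int Executed) Run(int maxAttempts, Func<int, bool> acquire, Func<int, bool> execute)
    {
        var totalAttempts = 0;
        var executionAttempts = 0;

        while (totalAttempts < maxAttempts)
        {
            totalAttempts++;

            if (!acquire(totalAttempts))
            {
                continue; // the failed acquisition still used up one total attempt
            }

            executionAttempts++;
            if (execute(executionAttempts))
            {
                break;
            }
        }

        return (totalAttempts, executionAttempts);
    }
}
```

With a budget of two attempts, one acquisition failure leaves room for only a single execution, which is the situation raised in the question above.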
| { | ||
| _databaseNamespace = Ensure.IsNotNull(databaseNamespace, nameof(databaseNamespace)); | ||
| _command = Ensure.IsNotNull(command, nameof(command)); | ||
| _command = command; //can be null |
I would prefer to get rid of _command altogether, especially now that it is passed as a parameter to ExecuteProtocol.
| var maxBatchCount = Math.Min(MaxBatchCount ?? int.MaxValue, channel.ConnectionDescription.MaxBatchCount); | ||
| var maxDocumentSize = channel.ConnectionDescription.MaxWireDocumentSize; | ||
| var payload = new Type1CommandMessageSection<UpdateRequest>("updates", _updates, UpdateRequestSerializer.Instance, NoOpElementNameValidator.Instance, maxBatchCount, maxDocumentSize); | ||
| var payload = new Type1CommandMessageSection<UpdateRequest>("updates", updates, UpdateRequestSerializer.Instance, NoOpElementNameValidator.Instance, maxBatchCount, maxDocumentSize); |
I think we should revert this change too. Again, it clearly looks like an issue, but it could be the reason we have not seen missing operations on retry. Let's do this in a separate PR/ticket after a deeper investigation.