CSHARP-5646: Implement vector similarity match expressions #1907
ajcvickers wants to merge 4 commits into mongodb:main
There was a problem hiding this comment.
Pull request overview
Adds LINQ3 support for MongoDB vector similarity aggregation expressions by introducing a new SimilarityFunctions public API and translating it into the corresponding $similarity* MQL operators, including AST + serializer support and an appropriate server feature gate.
Changes:
- Introduces `MongoDB.Driver.Linq.SimilarityFunctions` (DotProduct/Cosine/Euclidean) for use in LINQ queries.
- Implements LINQ3 translation + AST rendering for `$similarityDotProduct`, `$similarityCosine`, and `$similarityEuclidean`.
- Adds integration tests covering translation and execution for multiple vector container types (arrays/lists/collections/ReadOnlyMemory).
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/MongoDB.Driver.Tests/Linq/Linq3Implementation/Translators/ExpressionToAggregationExpressionTranslators/MethodTranslators/SimilarityMethodToAggregationExpressionTranslatorTests.cs | New integration tests validating translation/output for similarity functions across vector shapes and element types. |
| src/MongoDB.Driver/Linq/SimilarityFunctions.cs | New public LINQ API surface for similarity functions (throws on local evaluation). |
| src/MongoDB.Driver/Linq/Linq3Implementation/Translators/ExpressionToAggregationExpressionTranslators/MethodTranslators/SimilarityFunctionsMethodToAggregationExpressionTranslator.cs | New method translator mapping SimilarityFunctions calls to AST similarity expressions. |
| src/MongoDB.Driver/Linq/Linq3Implementation/Translators/ExpressionToAggregationExpressionTranslators/MethodCallExpressionToAggregationExpressionTranslator.cs | Routes DotProduct/Cosine/Euclidean method calls to the new translator. |
| src/MongoDB.Driver/Linq/Linq3Implementation/SerializerFinders/SerializerFinderVisitMethodCall.cs | Deduces numeric return serializer for SimilarityFunctions calls. |
| src/MongoDB.Driver/Linq/Linq3Implementation/Reflection/SimilarityFunctionsMethod.cs | Registers SimilarityFunctions overloads for method identification in translation/serializer deduction. |
| src/MongoDB.Driver/Linq/Linq3Implementation/Ast/Visitors/AstNodeVisitor.cs | Adds visitor support for the new similarity AST expression node. |
| src/MongoDB.Driver/Linq/Linq3Implementation/Ast/Expressions/AstSimilarityFunctionExpression.cs | New AST node that renders similarity operators with { vectors: [...], score: ... }. |
| src/MongoDB.Driver/Linq/Linq3Implementation/Ast/Expressions/AstNaryOperator.cs | Adds operator enum values + rendering for $similarity* operators. |
| src/MongoDB.Driver/Linq/Linq3Implementation/Ast/Expressions/AstExpression.cs | Adds factory method for creating similarity AST expressions. |
| src/MongoDB.Driver/Linq/Linq3Implementation/Ast/AstNodeType.cs | Adds SimilarityFunctionExpression node type. |
| src/MongoDB.Driver/Core/Misc/Feature.cs | Adds Feature.SimilarityFunctions gated at WireVersion.Server82. |
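
For readers unfamiliar with the rendered shape, the `{ vectors: [...], score: ... }` form described for `AstSimilarityFunctionExpression.cs` can be sketched roughly as follows. This is an illustrative sketch only: `SimilarityRenderSketch` and `Render` are hypothetical names, not the PR's code, and it assumes `score` carries the boolean `normalizeScore` flag.

```csharp
using MongoDB.Bson;

// Hypothetical helper for illustration; it is not the PR's AST node.
public static class SimilarityRenderSketch
{
    // Builds { <operatorName>: { vectors: [v1, v2], score: <bool> } }.
    // Assumption: "score" holds the normalizeScore flag as a boolean.
    public static BsonDocument Render(
        string operatorName,
        BsonValue vector1,
        BsonValue vector2,
        bool normalizeScore)
        => new BsonDocument(
            operatorName,
            new BsonDocument
            {
                { "vectors", new BsonArray { vector1, vector2 } },
                { "score", normalizeScore }
            });
}
```

For example, `SimilarityRenderSketch.Render("$similarityDotProduct", "$embedding", new BsonArray { 1.0, 2.0 }, false)` would produce a document whose single element is the operator name, with the field path `$embedding` (a made-up field) and the constant vector listed under `vectors`.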
    /// C# to Mongo Query Language (MQL) for execution in the MongoDB database. Calling the method directly is
    /// not supported.
    /// </summary>
    public static class SimilarityFunctions
There was a problem hiding this comment.
We usually put such methods into the Mql static class. Grouping methods by functionality might not be a bad idea, but they would be more discoverable under the Mql class. Maybe something like:
Mql.VectorSimilarity.DotProduct<>()
There was a problem hiding this comment.
I considered this option, but decided that functional grouping into a different class was fine because of the existence of MongoDBMath, which seems to have less justification for a new type than this does. Is there something different about MongoDBMath, or was that a mistake?
There was a problem hiding this comment.
I think MongoDBMath pre-dated Mql.
        => Throw<TElement>(nameof(Euclidean));

    private static double Throw<TElement>(string methodName)
        => throw new InvalidOperationException(
There was a problem hiding this comment.
Similar methods in the Mql class throw NotSupportedException. I think we should be consistent here and throw the same exception.
There was a problem hiding this comment.
Technically, we could choose to implement this. But I'm fine with changing it.
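
To make the convention under discussion concrete, here is a minimal, self-contained sketch of a translation-only method that throws `NotSupportedException` when actually invoked. `SimilarityStub` and `StubDemo` are hypothetical stand-ins, and the `DotProduct` signature shown is an assumption for illustration, not the PR's exact API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical stand-in for a translation-only API; names and signature are illustrative.
public static class SimilarityStub
{
    public static double DotProduct<TElement>(
        IEnumerable<TElement> vector1,
        IEnumerable<TElement> vector2,
        bool normalizeScore)
        => throw new NotSupportedException(
            $"{nameof(DotProduct)} can only be used in a LINQ expression that is translated to MQL.");
}

public static class StubDemo
{
    // Building the expression tree only records the method call; the stub body does not run.
    public static Expression<Func<float[], float[], double>> BuildExpression()
        => (a, b) => SimilarityStub.DotProduct(a, b, false);

    // Compiling and invoking the tree does run the stub body, which throws.
    public static bool ThrowsWhenExecuted()
    {
        var compiled = BuildExpression().Compile();
        try
        {
            compiled(new[] { 1f }, new[] { 2f });
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }
}
```

A LINQ provider would inspect the `MethodCallExpression` in the tree body and translate it, so the exception is only seen when the method is called directly or the expression is compiled and executed locally.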
    BitXor,
    Concat,
    ConcatArrays,
    Cosine,
There was a problem hiding this comment.
Do we really need these new enum values? I was under the impression that the AstNaryOperator enum works together with the AstNaryExpression class, but if we have a specialized class, we probably do not need these enum values.
There was a problem hiding this comment.
This makes sense. Have made updates.
src/MongoDB.Driver/Mql.cs
Outdated
    /// <param name="normalizeScore">Whether to normalize the result for use as a vector search score.</param>
    /// <typeparam name="TElement">The vector element type</typeparam>
    /// <returns>The dot-product measure between the two vectors.</returns>
    /// <exception cref="NotSupportedException">if executed.</exception>
There was a problem hiding this comment.
Minor: exception here seems to be a little misleading. Let's remove it or explain it a little more:
"NotSupportedException if executed outside of LINQ expression."
There was a problem hiding this comment.
What does, "outside the LINQ expression" mean? What kind of execution can happen that is not "outside the LINQ expression?" If the LINQ expression is translated, then it isn't executed. The only thing I can think of is compiling and then executing the LINQ expression, but that will still throw, and it's not really "outside" anyway.
There was a problem hiding this comment.
How about, "if used for anything other than translating to MQL?"
This commit introduces support for MongoDB similarity functions (`$similarityDotProduct`, `$similarityCosine`, and `$similarityEuclidean`) in LINQ3 queries.

#### Summary of Changes

- **New Public API**: Added `SimilarityFunctions` static class with `DotProduct`, `Cosine`, and `Euclidean` methods supporting `IEnumerable<T>` and `ReadOnlyMemory<T>`.
- **LINQ Translation**: Implemented translation of these methods to their respective MQL aggregation operators.
- **AST Support**: Added `AstSimilarityFunctionExpression` and associated `AstNaryOperator` values to represent these functions in the MongoDB Abstract Syntax Tree.
- **Serializer Deduction**: Updated `SerializerFinderVisitMethodCall` to handle return type (double) for these new functions.