CSHARP-5646: Implement vector similarity match expressions #1907

Open
ajcvickers wants to merge 4 commits into mongodb:main from ajcvickers:CSHARP-5646

Conversation

@ajcvickers
Contributor

This commit introduces support for MongoDB similarity functions ($similarityDotProduct, $similarityCosine, and $similarityEuclidean) in LINQ3 queries.

Summary of Changes

  • New Public API: Added SimilarityFunctions static class with DotProduct, Cosine, and Euclidean methods supporting IEnumerable<T> and ReadOnlyMemory<T>.
  • LINQ Translation: Implemented translation of these methods to their respective MQL aggregation operators.
  • AST Support: Added AstSimilarityFunctionExpression and associated AstNaryOperator values to represent these functions in the MongoDB Abstract Syntax Tree.
  • Serializer Deduction: Updated SerializerFinderVisitMethodCall to handle return type (double) for these new functions.
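
For readers unfamiliar with the three measures, the standard formulas can be sketched in plain C#. This is only a local illustration: the class and method names are hypothetical, it is not the driver's implementation, and the server operators may follow different conventions (for example score normalization, or how the Euclidean result is reported).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical local helpers for illustration; not part of MongoDB.Driver.
public static class LocalSimilaritySketch
{
    // Sum of element-wise products.
    public static double DotProduct(IReadOnlyList<double> a, IReadOnlyList<double> b)
    {
        EnsureSameLength(a, b);
        var sum = 0.0;
        for (var i = 0; i < a.Count; i++)
        {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Dot product divided by the product of the two vector lengths.
    public static double Cosine(IReadOnlyList<double> a, IReadOnlyList<double> b)
        => DotProduct(a, b) / (Math.Sqrt(DotProduct(a, a)) * Math.Sqrt(DotProduct(b, b)));

    // Straight-line (L2) distance between the two vectors.
    public static double EuclideanDistance(IReadOnlyList<double> a, IReadOnlyList<double> b)
    {
        EnsureSameLength(a, b);
        var sum = 0.0;
        for (var i = 0; i < a.Count; i++)
        {
            var diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.Sqrt(sum);
    }

    private static void EnsureSameLength(IReadOnlyList<double> a, IReadOnlyList<double> b)
    {
        if (a.Count != b.Count)
        {
            throw new ArgumentException("Vectors must have the same number of elements.");
        }
    }
}
```

With these helpers, the dot product of (1, 2, 3) and (4, 5, 6) is 32, and the Euclidean distance between (0, 0) and (3, 4) is 5.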

@ajcvickers ajcvickers requested a review from a team as a code owner March 10, 2026 13:28
@ajcvickers ajcvickers added the feature Adds new user-facing functionality. label Mar 10, 2026
codeowners-service-app bot commented Mar 10, 2026

Assigned jordan-smith721 for team dbx-csharp-dotnet because adelinowona is out of office.
Assigned damieng for team dbx-csharp-dotnet because adelinowona is out of office.

Contributor

Copilot AI left a comment

Pull request overview

Adds LINQ3 support for MongoDB vector similarity aggregation expressions by introducing a new SimilarityFunctions public API and translating it into the corresponding $similarity* MQL operators, including AST + serializer support and an appropriate server feature gate.

Changes:

  • Introduces MongoDB.Driver.Linq.SimilarityFunctions (DotProduct/Cosine/Euclidean) for use in LINQ queries.
  • Implements LINQ3 translation + AST rendering for $similarityDotProduct, $similarityCosine, and $similarityEuclidean.
  • Adds integration tests covering translation and execution for multiple vector container types (arrays/lists/collections/ReadOnlyMemory).

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `tests/MongoDB.Driver.Tests/Linq/Linq3Implementation/Translators/ExpressionToAggregationExpressionTranslators/MethodTranslators/SimilarityMethodToAggregationExpressionTranslatorTests.cs` | New integration tests validating translation/output for similarity functions across vector shapes and element types. |
| `src/MongoDB.Driver/Linq/SimilarityFunctions.cs` | New public LINQ API surface for similarity functions (throws on local evaluation). |
| `src/MongoDB.Driver/Linq/Linq3Implementation/Translators/ExpressionToAggregationExpressionTranslators/MethodTranslators/SimilarityFunctionsMethodToAggregationExpressionTranslator.cs` | New method translator mapping SimilarityFunctions calls to AST similarity expressions. |
| `src/MongoDB.Driver/Linq/Linq3Implementation/Translators/ExpressionToAggregationExpressionTranslators/MethodCallExpressionToAggregationExpressionTranslator.cs` | Routes DotProduct/Cosine/Euclidean method calls to the new translator. |
| `src/MongoDB.Driver/Linq/Linq3Implementation/SerializerFinders/SerializerFinderVisitMethodCall.cs` | Deduces numeric return serializer for SimilarityFunctions calls. |
| `src/MongoDB.Driver/Linq/Linq3Implementation/Reflection/SimilarityFunctionsMethod.cs` | Registers SimilarityFunctions overloads for method identification in translation/serializer deduction. |
| `src/MongoDB.Driver/Linq/Linq3Implementation/Ast/Visitors/AstNodeVisitor.cs` | Adds visitor support for the new similarity AST expression node. |
| `src/MongoDB.Driver/Linq/Linq3Implementation/Ast/Expressions/AstSimilarityFunctionExpression.cs` | New AST node that renders similarity operators with `{ vectors: [...], score: ... }`. |
| `src/MongoDB.Driver/Linq/Linq3Implementation/Ast/Expressions/AstNaryOperator.cs` | Adds operator enum values + rendering for `$similarity*` operators. |
| `src/MongoDB.Driver/Linq/Linq3Implementation/Ast/Expressions/AstExpression.cs` | Adds factory method for creating similarity AST expressions. |
| `src/MongoDB.Driver/Linq/Linq3Implementation/Ast/AstNodeType.cs` | Adds SimilarityFunctionExpression node type. |
| `src/MongoDB.Driver/Core/Misc/Feature.cs` | Adds Feature.SimilarityFunctions gated at WireVersion.Server82. |

@jordan-smith721 jordan-smith721 removed their request for review March 10, 2026 14:59
/// C# to Mongo Query Language (MQL) for execution in the MongoDB database. Calling the method directly is
/// not supported.
/// </summary>
public static class SimilarityFunctions
Member

We usually put such methods in the Mql static class. Grouping methods by functionality is not a bad idea, but they would be more discoverable under the Mql class. Maybe something like:

Mql.VectorSimilarity.DotProduct<>()

Contributor Author

I considered this option, but decided that functional grouping into a separate class was fine because of the existence of MongoDBMath, which seems to have less justification for a new type than this case does. Is there something different about MongoDBMath, or was that a mistake?

Member

I think MongoDBMath pre-dated Mql.

Contributor Author

Fixed.

=> Throw<TElement>(nameof(Euclidean));

private static double Throw<TElement>(string methodName)
=> throw new InvalidOperationException(
Member

Similar methods in the Mql class throw NotSupportedException. I think we should be consistent here and throw the same exception.

Contributor Author

Technically, we could choose to implement this. But I'm fine with changing it.

Contributor Author

Fixed.
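
The convention discussed in this thread, a translate-only method that throws NotSupportedException when invoked directly, can be sketched as follows. The class name, parameter names, and message text are illustrative assumptions, not the code in this PR.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a translate-only API; not the driver's actual class.
public static class TranslateOnlySketch
{
    // The body never computes anything: the method exists so that a LINQ
    // provider can recognize the call in an expression tree and translate it.
    public static double DotProduct<TElement>(
        IEnumerable<TElement> vector1,
        IEnumerable<TElement> vector2,
        bool normalizeScore)
        => Throw(nameof(DotProduct));

    // Shared helper so every stub reports the same kind of failure.
    private static double Throw(string methodName)
        => throw new NotSupportedException(
            $"{methodName} can only be used in a query that is translated; it cannot be called directly.");
}
```

Calling `TranslateOnlySketch.DotProduct(new[] { 1f }, new[] { 2f }, false)` directly raises NotSupportedException with a message that names the method.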

BitXor,
Concat,
ConcatArrays,
Cosine,
Member

Do we really need these new enum values? I was under the impression that the AstNaryOperator enum works together with the AstNaryExpression class; if we have a specialized class, we probably do not need these enum values.

Contributor Author

This makes sense. Have made updates.

Member

@sanych-sun sanych-sun left a comment

LGTM + minor comment

/// <param name="normalizeScore">Whether to normalize the result for use as a vector search score.</param>
/// <typeparam name="TElement">The vector element type</typeparam>
/// <returns>The dot-product measure between the two vectors.</returns>
/// <exception cref="NotSupportedException">if executed.</exception>
Member

Minor: the exception documentation here seems a little misleading. Let's remove it or explain it a little more:
"NotSupportedException if executed outside of a LINQ expression."

Contributor Author

What does "outside the LINQ expression" mean? What kind of execution can happen that is not "outside the LINQ expression"? If the LINQ expression is translated, then it isn't executed. The only thing I can think of is compiling and then executing the LINQ expression, but that will still throw, and it's not really "outside" anyway.

Contributor Author

@ajcvickers ajcvickers Mar 17, 2026

How about, "if used for anything other than translating to MQL?"
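
The distinction drawn in this thread can be illustrated with a small sketch using only System.Linq.Expressions. A call captured in an expression tree can be inspected without running it, while compiling and invoking the same tree does run the method body and therefore throws. The stub class and helper names below are hypothetical and stand in for any translate-only method.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical translate-only stub; not the driver's actual API.
public static class TranslateOnlyStub
{
    public static double Score(double[] a, double[] b)
        => throw new NotSupportedException("Score can only be translated, not executed.");
}

public static class ExpressionInspectionSketch
{
    // Reads the called method's name from the tree; the method body never runs.
    public static string DescribeCall(Expression<Func<double>> expression)
        => ((MethodCallExpression)expression.Body).Method.Name;

    // Compiles the tree to a delegate and invokes it, which does run the body.
    public static bool ThrowsWhenCompiled(Expression<Func<double>> expression)
    {
        try
        {
            expression.Compile()();
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }
}
```

Given `Expression<Func<double>> e = () => TranslateOnlyStub.Score(new[] { 1.0 }, new[] { 2.0 });`, inspecting `e` succeeds, whereas compiling and invoking it throws.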
