@dluc (Collaborator) commented Dec 18, 2025

Summary

  • Implement SQLite-based vector search index with math utilities and match models, plus factory wiring for embedding providers.
  • Centralize constants into src/Core/Constants.cs and update Core/Main code to consume the new structure.
  • Add doctor command and embedding generator factory integration.
  • Expand test suite: new vector index unit/integration tests, updated core/main tests, and new e2e harness/tests with per-test C# log files.
  • Update format.sh and add e2e-tests.sh runner; e2e tests default to Release build and use KM_BIN.
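
As an illustration of how a Python e2e helper might pick up the binary exported through KM_BIN, here is a minimal sketch. The function names `resolve_km_bin` and `run_cli` and the log layout are hypothetical and are not taken from the PR's tests/e2e/framework code. The log written here is a Python-side transcript, separate from the C# log the CLI itself produces.

```python
import os
import subprocess
from pathlib import Path


def resolve_km_bin() -> str:
    """Return the CLI binary path from KM_BIN, failing loudly if it is unset."""
    km_bin = os.environ.get("KM_BIN")
    if not km_bin:
        raise RuntimeError("KM_BIN is not set; the runner script is expected to export it")
    return km_bin


def run_cli(args: list[str], log_file: Path, timeout: float = 60.0) -> subprocess.CompletedProcess:
    """Run the binary with the given arguments and append the outcome to a per-test log.

    Hypothetical helper: the real framework may capture and log output differently.
    """
    cmd = [resolve_km_bin(), *args]
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    with log_file.open("a", encoding="utf-8") as log:
        log.write(f"$ {' '.join(cmd)}\n")
        log.write(f"exit code: {result.returncode}\n")
        log.write(f"stdout:\n{result.stdout}\n")
        log.write(f"stderr:\n{result.stderr}\n")
    return result
```

Reading the path from the environment rather than hard-coding it lets the same tests run against a Release or Debug build without edits.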

Details

  • New vector search components: IVectorIndex, SqliteVectorIndex, VectorMath, VectorMatch, plus tests covering persistence, error handling, and math.
  • Constants consolidation: removed the module-specific constant files and unified their contents in Constants.cs (search, embeddings, logging, config/app).
  • CLI/Services: added DoctorCommand, EmbeddingGeneratorFactory, enhanced SearchIndexFactory; CLI builder now emits bootstrap logs and respects --log-file.
  • E2E: added framework helpers (cli, db, logging), five e2e scenarios with per-test log files; runner script e2e-tests.sh builds if needed and sets KM_BIN.
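
For readers unfamiliar with the idea, the sketch below shows one way a SQLite-backed vector index with cosine-similarity matching can work. It is not the PR's SqliteVectorIndex or VectorMath: the class name `TinyVectorIndex`, the table schema, the float32 BLOB encoding, and the brute-force scan are all assumptions made for illustration.

```python
import math
import sqlite3
import struct


def pack_vector(vec):
    """Encode a list of floats as little-endian float32 bytes (assumed storage format)."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack_vector(blob):
    """Decode bytes produced by pack_vector back into a list of floats."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 0.0 if either vector has zero length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorIndex:
    """Hypothetical index: one row per item, embedding stored as a BLOB."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS vectors (id TEXT PRIMARY KEY, embedding BLOB NOT NULL)"
        )

    def upsert(self, item_id, vec):
        """Insert the vector, replacing any existing row with the same id."""
        self.conn.execute(
            "INSERT OR REPLACE INTO vectors (id, embedding) VALUES (?, ?)",
            (item_id, pack_vector(vec)),
        )
        self.conn.commit()

    def search(self, query, limit=5):
        """Score every stored vector against the query and return the best matches."""
        rows = self.conn.execute("SELECT id, embedding FROM vectors").fetchall()
        scored = [(item_id, cosine_similarity(query, unpack_vector(blob))) for item_id, blob in rows]
        scored.sort(key=lambda match: match[1], reverse=True)
        return scored[:limit]
```

A full scan is linear in the number of stored vectors, which is acceptable for small local indexes but would need an approximate-nearest-neighbor structure at larger scale.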

@dluc requested a review from Copilot December 18, 2025 13:23

Copilot AI left a comment

Pull request overview

This PR implements a comprehensive vector search feature with SQLite-based indexing, embeddings caching, and diagnostic tooling. The changes consolidate constants, add factory patterns for embedding providers, introduce a doctor command for system health checks, and establish an end-to-end testing framework with per-test C# logging.

Key changes:

  • Vector search implementation with SQLite storage, math utilities, and embedding provider factories
  • Constants consolidation from module-specific files into src/Core/Constants.cs
  • New doctor command for validating configuration and checking system dependencies
  • Comprehensive test coverage including new E2E test framework with Python-based test harness

Reviewed changes

Copilot reviewed 118 out of 119 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| tests/e2e/*.py | Five new E2E test scenarios covering CRUD, search, FTS stemming, vector search, and embeddings cache |
| tests/e2e/framework/*.py | E2E testing framework with CLI execution, database inspection, and logging helpers |
| tests/Main.Tests/Unit/Commands/DoctorCommandTests.cs | Unit tests for new doctor command validating config and dependencies |
| tests/Main.Tests/Services/*FactoryTests.cs | Tests for SearchIndexFactory and EmbeddingGeneratorFactory |
| tests/Core.Tests/Search/*VectorIndexTests.cs | Comprehensive unit/integration tests for vector index functionality |
| tests/Core.Tests/Embeddings/Cache/SqliteEmbeddingCacheTests.cs | Updated cache tests to include token_count parameter |
| src/Main/Services/EmbeddingGeneratorFactory.cs | New factory for creating embedding generators from config |
| src/Main/CLI/Commands/DoctorCommand.cs | New diagnostic command for system health checks |
| src/Core/Search/VectorMatch.cs | Model for vector search results |
| src/Directory.Packages.props | Package version updates to Microsoft.Extensions.* 10.0.0 |

Comments suppressed due to low confidence (1)

src/Directory.Packages.props:1

  • The Microsoft.Extensions packages are being updated to version 10.0.0. As of the knowledge cutoff (January 2025), .NET 10 has not been released yet. The latest stable version is .NET 9, with Microsoft.Extensions packages at version 9.x. Version 10.0.0 may not exist or may be a pre-release version. Verify that these package versions are valid and available in NuGet.
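
One way to carry out the verification suggested above is to query the public nuget.org version listing for a package. The sketch below assumes the nuget.org flat-container endpoint (`https://api.nuget.org/v3-flatcontainer/{lowercase-id}/index.json`, which returns a JSON object with a `versions` array). The helper names are hypothetical, and a robust client would discover the base address from the NuGet service index rather than hard-coding it.

```python
import json
import urllib.request

# Assumed base address for nuget.org; other feeds use different URLs.
NUGET_FLAT_CONTAINER = "https://api.nuget.org/v3-flatcontainer"


def versions_url(package_id: str) -> str:
    """Build the version-listing URL; the package id must be lowercased."""
    return f"{NUGET_FLAT_CONTAINER}/{package_id.lower()}/index.json"


def parse_versions(payload: str) -> list[str]:
    """Extract the version strings from the JSON listing."""
    return json.loads(payload)["versions"]


def http_get(url: str) -> str:
    """Fetch a URL and return the body as text."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def is_published(package_id: str, version: str, fetch=http_get) -> bool:
    """Return True if the exact version string appears in the package's listing."""
    listed = parse_versions(fetch(versions_url(package_id)))
    return version.lower() in (v.lower() for v in listed)
```

For example, `is_published("Microsoft.Extensions.Logging", "10.0.0")` would perform the check for one package. That package id is picked here only as an example, since the review names the Microsoft.Extensions.* family without listing individual packages.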

@dluc merged commit 9c6ba4f into microsoft:main Dec 18, 2025
3 checks passed
@dluc deleted the 08vecsearchindexes branch December 18, 2025 15:03