Skip to content

Conversation

@dluc
Copy link
Collaborator

@dluc dluc commented Dec 18, 2025

Complete Embedding Generators and Cache feature by implementing remaining Must-Have acceptance criteria.

Changes:

  • Add configurable embeddings batch sizing (default 10) via EmbeddingsConfig.batchSize and chunking for batch-capable providers.
  • Add shared transient HTTP retry/backoff (honors Retry-After) and use it in OpenAI/Azure/HF/Ollama providers.
  • Add Azure OpenAI managed identity auth (DefaultAzureCredential) alongside API key auth.
  • Add HuggingFace HF_TOKEN environment variable fallback.
  • Add/adjust unit + integration tests and stabilize env-var dependent tests.

- Add EmbeddingsConfig.batchSize + validate provider configs

- Implement shared HttpRetryPolicy (Retry-After + exponential backoff) and use it in OpenAI/Azure/HF/Ollama

- Support Azure OpenAI managed identity via DefaultAzureCredential

- Support HuggingFace HF_TOKEN fallback

- Update docs + expand unit/integration tests; stabilize env-var tests
@dluc dluc changed the title Complete feature 00007: embeddings batching, retries, managed identity Complete feature: embeddings batching, retries, managed identity Dec 18, 2025
Restore the original OLLAMA_AVAILABLE gating and remove the unrequested FTS-only config change.
@dluc
Copy link
Collaborator Author

dluc commented Dec 18, 2025

Restored to upstream behavior (reverted the FTS-only config + removed the accidental change to Ollama gating).

@dluc
Copy link
Collaborator Author

dluc commented Dec 18, 2025

Restored tests/Main.Tests/Integration/SearchProcessTests.cs to upstream behavior (reverted the FTS-only config change and restored the original Ollama/OLLAMA_AVAILABLE gating).

- Add per-attempt timeout support and Ollama-specific timeout

- Avoid retrying connection-refused/host-not-found to prevent long-running test hangs
@dluc
Copy link
Collaborator Author

dluc commented Dec 18, 2025

Pushed a fix for the retry regression: HttpRetryPolicy now fails fast on connection-refused/host-not-found/network-unreachable and enforces per-attempt timeouts (Ollama uses a low timeout).

This prevents coverage.sh/test runs from appearing hung when Ollama is stopped.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR completes the embeddings feature by implementing batching with configurable batch sizes, adding HTTP retry/backoff logic for transient failures, supporting Azure OpenAI managed identity authentication, and adding HuggingFace HF_TOKEN environment variable fallback support. The changes include comprehensive test coverage and validation.

Key Changes

  • Embeddings Batching: Adds configurable BatchSize property (default: 10) to all embeddings configs and implements chunking logic in batch-capable providers (OpenAI, HuggingFace, Azure OpenAI)
  • HTTP Retry Policy: Implements a shared retry/backoff mechanism with exponential backoff, Retry-After header support, and per-attempt timeouts for all embedding providers
  • Azure Managed Identity: Adds support for DefaultAzureCredential authentication in Azure OpenAI alongside existing API key auth

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
tests/Main.Tests/Services/EmbeddingGeneratorFactoryTests.cs Adds test for HF_TOKEN environment variable fallback
tests/Core.Tests/TestCollections.cs Defines non-parallel test collection for environment variable tests
tests/Core.Tests/Logging/SensitiveDataScrubbingPolicyTests.cs Adds collection attribute for thread-safe environment variable tests
tests/Core.Tests/Logging/EnvironmentDetectorTests.cs Adds collection attribute for thread-safe environment variable tests
tests/Core.Tests/Http/HttpRetryPolicyTests.cs Adds basic retry policy tests for 429 responses
tests/Core.Tests/Embeddings/Providers/OpenAIEmbeddingGeneratorTests.cs Updates tests with batchSize parameter and adds chunking test
tests/Core.Tests/Embeddings/Providers/OllamaEmbeddingGeneratorTests.cs Adds delayAsync parameter for fast unit tests
tests/Core.Tests/Embeddings/Providers/HuggingFaceEmbeddingGeneratorTests.cs Updates tests with batchSize parameter
tests/Core.Tests/Embeddings/Providers/AzureOpenAIEmbeddingGeneratorTests.cs Adds managed identity authentication tests with TestTokenCredential
src/Main/Services/EmbeddingGeneratorFactory.cs Updates factory to pass batchSize and handle HF_TOKEN fallback
src/Core/Http/HttpRetryPolicy.cs Implements new shared retry policy with exponential backoff and Retry-After support
src/Core/Embeddings/Providers/OpenAIEmbeddingGenerator.cs Adds batching with chunking and integrates retry policy
src/Core/Embeddings/Providers/OllamaEmbeddingGenerator.cs Integrates retry policy (no batching as Ollama doesn't support it)
src/Core/Embeddings/Providers/HuggingFaceEmbeddingGenerator.cs Adds batching with chunking and integrates retry policy
src/Core/Embeddings/Providers/AzureOpenAIEmbeddingGenerator.cs Adds managed identity support, batching, and retry policy integration
src/Core/Core.csproj Adds Azure.Identity package reference
src/Core/Constants.cs Adds HttpRetryDefaults constants for retry behavior
src/Core/Config/Embeddings/OpenAIEmbeddingsConfig.cs Adds BatchSize validation
src/Core/Config/Embeddings/OllamaEmbeddingsConfig.cs Adds BatchSize validation
src/Core/Config/Embeddings/HuggingFaceEmbeddingsConfig.cs Updates validation to allow HF_TOKEN env var and adds BatchSize validation
src/Core/Config/Embeddings/EmbeddingsConfig.cs Adds BatchSize property to base config
src/Core/Config/Embeddings/AzureOpenAIEmbeddingsConfig.cs Adds BatchSize validation
src/Core/Config/AppConfig.cs Adds pragma to suppress CA1724 warning from Azure.Identity namespace conflict

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

dluc added 2 commits December 18, 2025 18:47
Resolve HF_TOKEN into HuggingFaceEmbeddingsConfig.ApiKey during validation so factories/providers rely only on config.
Add comment explaining the final throw is unreachable in normal flow and kept as defensive code.
@dluc dluc merged commit 94b69d3 into microsoft:main Dec 18, 2025
3 checks passed
@dluc dluc deleted the feat/00007-complete branch December 18, 2025 17:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant