-
Notifications
You must be signed in to change notification settings - Fork 397
Complete feature: embeddings batching, retries, managed identity #1114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Add EmbeddingsConfig.batchSize + validate provider configs - Implement shared HttpRetryPolicy (Retry-After + exponential backoff) and use it in OpenAI/Azure/HF/Ollama - Support Azure OpenAI managed identity via DefaultAzureCredential - Support HuggingFace HF_TOKEN fallback - Update docs + expand unit/integration tests; stabilize env-var tests
Restore the original OLLAMA_AVAILABLE gating and remove the unrequested FTS-only config change.
|
Restored to upstream behavior (reverted the FTS-only config + removed the accidental change to Ollama gating). |
|
Restored |
- Add per-attempt timeout support and Ollama-specific timeout - Avoid retrying connection-refused/host-not-found to prevent long-running test hangs
|
Pushed a fix for the retry regression: This prevents |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR completes the embeddings feature by implementing batching with configurable batch sizes, adding HTTP retry/backoff logic for transient failures, supporting Azure OpenAI managed identity authentication, and adding HuggingFace HF_TOKEN environment variable fallback support. The changes include comprehensive test coverage and validation.
Key Changes
- Embeddings Batching: Adds configurable
BatchSizeproperty (default: 10) to all embeddings configs and implements chunking logic in batch-capable providers (OpenAI, HuggingFace, Azure OpenAI) - HTTP Retry Policy: Implements a shared retry/backoff mechanism with exponential backoff, Retry-After header support, and per-attempt timeouts for all embedding providers
- Azure Managed Identity: Adds support for DefaultAzureCredential authentication in Azure OpenAI alongside existing API key auth
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/Main.Tests/Services/EmbeddingGeneratorFactoryTests.cs | Adds test for HF_TOKEN environment variable fallback |
| tests/Core.Tests/TestCollections.cs | Defines non-parallel test collection for environment variable tests |
| tests/Core.Tests/Logging/SensitiveDataScrubbingPolicyTests.cs | Adds collection attribute for thread-safe environment variable tests |
| tests/Core.Tests/Logging/EnvironmentDetectorTests.cs | Adds collection attribute for thread-safe environment variable tests |
| tests/Core.Tests/Http/HttpRetryPolicyTests.cs | Adds basic retry policy tests for 429 responses |
| tests/Core.Tests/Embeddings/Providers/OpenAIEmbeddingGeneratorTests.cs | Updates tests with batchSize parameter and adds chunking test |
| tests/Core.Tests/Embeddings/Providers/OllamaEmbeddingGeneratorTests.cs | Adds delayAsync parameter for fast unit tests |
| tests/Core.Tests/Embeddings/Providers/HuggingFaceEmbeddingGeneratorTests.cs | Updates tests with batchSize parameter |
| tests/Core.Tests/Embeddings/Providers/AzureOpenAIEmbeddingGeneratorTests.cs | Adds managed identity authentication tests with TestTokenCredential |
| src/Main/Services/EmbeddingGeneratorFactory.cs | Updates factory to pass batchSize and handle HF_TOKEN fallback |
| src/Core/Http/HttpRetryPolicy.cs | Implements new shared retry policy with exponential backoff and Retry-After support |
| src/Core/Embeddings/Providers/OpenAIEmbeddingGenerator.cs | Adds batching with chunking and integrates retry policy |
| src/Core/Embeddings/Providers/OllamaEmbeddingGenerator.cs | Integrates retry policy (no batching as Ollama doesn't support it) |
| src/Core/Embeddings/Providers/HuggingFaceEmbeddingGenerator.cs | Adds batching with chunking and integrates retry policy |
| src/Core/Embeddings/Providers/AzureOpenAIEmbeddingGenerator.cs | Adds managed identity support, batching, and retry policy integration |
| src/Core/Core.csproj | Adds Azure.Identity package reference |
| src/Core/Constants.cs | Adds HttpRetryDefaults constants for retry behavior |
| src/Core/Config/Embeddings/OpenAIEmbeddingsConfig.cs | Adds BatchSize validation |
| src/Core/Config/Embeddings/OllamaEmbeddingsConfig.cs | Adds BatchSize validation |
| src/Core/Config/Embeddings/HuggingFaceEmbeddingsConfig.cs | Updates validation to allow HF_TOKEN env var and adds BatchSize validation |
| src/Core/Config/Embeddings/EmbeddingsConfig.cs | Adds BatchSize property to base config |
| src/Core/Config/Embeddings/AzureOpenAIEmbeddingsConfig.cs | Adds BatchSize validation |
| src/Core/Config/AppConfig.cs | Adds pragma to suppress CA1724 warning from Azure.Identity namespace conflict |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
tests/Core.Tests/Embeddings/Providers/AzureOpenAIEmbeddingGeneratorTests.cs
Show resolved
Hide resolved
Resolve HF_TOKEN into HuggingFaceEmbeddingsConfig.ApiKey during validation so factories/providers rely only on config.
Add comment explaining the final throw is unreachable in normal flow and kept as defensive code.
Complete Embedding Generators and Cache feature by implementing remaining Must-Have acceptance criteria.
Changes:
EmbeddingsConfig.batchSizeand chunking for batch-capable providers.Retry-After) and use it in OpenAI/Azure/HF/Ollama providers.DefaultAzureCredential) alongside API key auth.HF_TOKENenvironment variable fallback.