feat: implement /v1/embeddings endpoint for OpenAI-compatible embeddings support #5

Open
dittops wants to merge 6 commits into main from feature/4-embeddings-endpoint

dittops (Member) commented on Jun 18, 2025

This PR implements comprehensive embeddings support across all AIBrix components:

Core Implementation:

  • Add EmbeddingRequest, EmbeddingResponse, EmbeddingData, and EmbeddingUsage protocol models
  • Implement create_embeddings abstract method in InferenceEngine base class
  • Add vLLM engine implementation for embeddings with proper error handling
  • Create /v1/embeddings FastAPI endpoint following existing patterns
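
For reviewers who want a picture of the four protocol models, here is a minimal sketch. It assumes the field names follow the public OpenAI embeddings API; the fields and types actually used in this PR may differ, and plain dataclasses stand in for whatever model library the runtime uses.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional, Union

# Assumption: the accepted input shapes mirror the OpenAI embeddings API.
EmbeddingInput = Union[str, List[str], List[int], List[List[int]]]


@dataclass
class EmbeddingRequest:
    input: EmbeddingInput
    model: str
    encoding_format: str = "float"  # "float" or "base64"
    dimensions: Optional[int] = None
    user: Optional[str] = None


@dataclass
class EmbeddingData:
    index: int
    embedding: Union[List[float], str]  # a str when base64-encoded
    object: str = "embedding"


@dataclass
class EmbeddingUsage:
    prompt_tokens: int
    total_tokens: int


@dataclass
class EmbeddingResponse:
    data: List[EmbeddingData]
    model: str
    usage: EmbeddingUsage
    object: str = "list"


# Hypothetical model name, used only for illustration.
response = EmbeddingResponse(
    data=[EmbeddingData(index=0, embedding=[0.1, 0.2, 0.3])],
    model="example-embedding-model",
    usage=EmbeddingUsage(prompt_tokens=3, total_tokens=3),
)
payload = asdict(response)  # nested dict, ready for JSON serialization
```
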

Gateway Integration:

  • Update request validation in gateway util.go to handle embeddings requests
  • Support multiple input formats (string, string array, token arrays)
  • Extract model and message fields for routing decisions
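
The input-format handling above can be sketched as follows. The gateway code itself lives in util.go; this is a Python illustration of the idea only, and the function name and returned labels are hypothetical, not taken from the PR.

```python
from typing import Any


def _is_token(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def classify_embedding_input(value: Any) -> str:
    """Return a label for the shape of an embeddings `input` field.

    Raises ValueError for empty, mixed, or otherwise unsupported shapes.
    """
    if isinstance(value, str):
        return "string"
    if isinstance(value, list) and value:
        if all(isinstance(item, str) for item in value):
            return "string_array"
        if all(_is_token(item) for item in value):
            return "token_array"
        if all(
            isinstance(item, list) and item and all(_is_token(t) for t in item)
            for item in value
        ):
            return "token_array_batch"
    raise ValueError("unsupported embeddings input format")
```

For example, `classify_embedding_input(["a", "b"])` returns `"string_array"`, while a mixed list such as `["a", 1]` is rejected with a `ValueError`.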

Testing:

  • Comprehensive unit tests for all protocol models
  • Integration tests for vLLM engine and API endpoint
  • Test coverage for error scenarios and edge cases
  • Validation tests for different input formats

Documentation:

  • Complete embeddings API guide with usage examples
  • RAG integration examples and best practices
  • Configuration instructions and troubleshooting guide
  • Performance optimization recommendations

Features:

  • OpenAI-compatible API specification
  • Support for float and base64 encoding formats
  • Batch processing capabilities
  • Proper error handling with meaningful messages
  • Authentication support via API keys
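
To illustrate the two encoding formats, the sketch below converts a float vector to and from base64. It assumes the OpenAI convention of packing the vector as little-endian float32 bytes before base64-encoding; this PR does not state the byte layout, and the helper names are hypothetical.

```python
import base64
import struct
from typing import List


def encode_embedding_base64(vector: List[float]) -> str:
    # Pack as little-endian float32 (assumed layout), then base64-encode.
    raw = struct.pack(f"<{len(vector)}f", *vector)
    return base64.b64encode(raw).decode("ascii")


def decode_embedding_base64(encoded: str) -> List[float]:
    raw = base64.b64decode(encoded)
    count = len(raw) // 4  # 4 bytes per float32
    return list(struct.unpack(f"<{count}f", raw))


# Values exactly representable in float32 survive the round trip unchanged.
vector = [0.5, -1.25, 2.0]
encoded = encode_embedding_base64(vector)
decoded = decode_embedding_base64(encoded)
```

Values that are not exactly representable in float32 (0.1, for instance) come back with float32 rounding, so clients comparing the two formats should use a tolerance rather than strict equality.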

dittops force-pushed the feature/4-embeddings-endpoint branch from 952e2b8 to ecdfaf9 on June 23, 2025 06:49
dittops added 5 commits June 23, 2025 07:21
…ngs support

Signed-off-by: dittops <dittops@gmail.com>
Signed-off-by: dittops <dittops@gmail.com>
  - Add try-except for JSON parsing errors in vllm.py
  - Use specific httpx.RequestError instead of generic Exception
  - Improve token array handling in util.go for numeric inputs

Signed-off-by: dittops <dittops@gmail.com>
Signed-off-by: dittops <dittops@gmail.com>
Signed-off-by: dittops <dittops@gmail.com>
dittops force-pushed the feature/4-embeddings-endpoint branch from 58692c4 to a4ff6a6 on June 23, 2025 07:23