feat: Add Synthetic and OpenRouter embedding providers #106

Merged
yoanbernabeu merged 7 commits into yoanbernabeu:main from Revaz-Goguadze:feat/add-synthetic-and-openrouter-providers
Feb 19, 2026

@Revaz-Goguadze
Contributor

  • Added synthetic.go: Support for Synthetic API (https://api.synthetic.new) with nomic-ai/nomic-embed-text-v1.5 model
  • Added openrouter.go: Support for OpenRouter API gateway with multiple provider access
  • Updated config.yaml: Added 'synthetic' and 'openrouter' to provider options with default endpoints
  • Updated CLI handlers: Added cases in watch.go (1 location), search.go (3 locations), mcp/server.go (2 locations)
  • Updated .gitignore: Exclude example documentation files (OPENROUTER.md, openrouter-example.sh)
  • Environment variable support: Both providers accept their own API keys or fall back to OPENAI_API_KEY
  • Default dimensions: synthetic (768), openrouter (1536)
  • Connection testing: Ping() method for both providers

Description

This PR adds support for two new embedding providers: Synthetic API and
OpenRouter, expanding grepai's embedding options beyond Ollama, LM Studio, and
OpenAI.

Synthetic API (synthetic provider)

  • New provider embedder/synthetic.go implementing the Embedder interface
  • Supports the Synthetic API at https://api.synthetic.new/openai/v1
  • Default model: hf:nomic-ai/nomic-embed-text-v1.5 (768 dimensions)
  • Accepts SYNTHETIC_API_KEY environment variable or falls back to OPENAI_API_KEY
  • Includes Ping() method for connection testing
  • 90-second timeout for API requests (longer than default for production
    stability)
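
The key-resolution order described above (explicit option first, then the provider-specific variable, then OPENAI_API_KEY) can be sketched as follows. `resolveAPIKey` is a hypothetical helper written for illustration, not a function from this PR; the environment lookup is passed in as a parameter so the logic can be exercised without touching the real environment.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// resolveAPIKey is a hypothetical helper (not part of the PR). It returns the
// explicit key when one is given, otherwise the first non-empty value among
// the named environment variables, checked in order.
func resolveAPIKey(explicit string, lookup func(string) string, envNames ...string) (string, error) {
	if explicit != "" {
		return explicit, nil
	}
	for _, name := range envNames {
		if v := lookup(name); v != "" {
			return v, nil
		}
	}
	return "", errors.New("api key not set")
}

func main() {
	// Same order the PR describes for the synthetic provider.
	key, err := resolveAPIKey("", os.Getenv, "SYNTHETIC_API_KEY", "OPENAI_API_KEY")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("resolved a key of length", len(key))
}
```

One consequence of this order is that a user who only has OPENAI_API_KEY exported will silently send that key to the Synthetic endpoint, which may or may not be what they intended.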

OpenRouter (openrouter provider)

  • New provider embedder/openrouter.go implementing the Embedder interface
  • Supports the OpenRouter API gateway at https://openrouter.ai/api/v1
  • Acts as a multi-provider gateway giving access to OpenAI, Anthropic, Cohere,
    and more
  • Default model: openai/text-embedding-3-small (1536 dimensions)
  • Accepts OPENROUTER_API_KEY environment variable or falls back to
    OPENAI_API_KEY
  • Configurable parallelism for batch embedding
  • Sends proper OpenRouter-specific headers (HTTP-Referer, X-Title)
  • Includes Ping() method for connection testing

Configuration Updates

CLI Integration

Updated embedder initialization in:

  • cli/watch.go: Added both providers for grepai watch command
  • cli/search.go: Added both providers for grepai search command (3 locations for
    different contexts)
  • mcp/server.go: Added both providers for MCP server (2 locations)

Documentation

  • Updated .gitignore to exclude example documentation files (OPENROUTER.md,
    openrouter-example.sh)
  • Created example files locally (not committed) for user reference

Related Issue

None

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to
    change)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test update

How Has This Been Tested?

  • Unit tests
  • Integration tests
  • Manual testing

Test Configuration

  • OS: Linux amd64
  • Go version: 1.25.6 X:nodwarf5
  • Embedding providers: Existing tests pass (22 tests in embedder package)

All existing tests continue to pass:

  ok      github.com/yoanbernabeu/grepai/embedder    (cached)

Manual Testing

Created and tested locally with both providers:

  • Synthetic API: Successfully connected and generated embeddings with test
    queries using SYNTHETIC_API_KEY
  • OpenRouter: Successfully connected with OPENROUTER_API_KEY

Checklist

  • My code follows the project's code style
  • I have run golangci-lint run and fixed any issues
  • I have added tests that prove my fix/feature works
  • I have updated the documentation if needed
  • I have added an entry to CHANGELOG.md (if applicable)
  • My changes generate no new warnings
  • All new and existing tests pass

Copilot AI review requested due to automatic review settings February 5, 2026 16:52

Copilot AI left a comment

Pull request overview

This PR adds support for two new embedding providers to expand grepai's embedding options: Synthetic API (a cloud-based service) and OpenRouter (a multi-provider gateway).

Changes:

  • Added two new embedder implementations (synthetic.go and openrouter.go) following the established Embedder interface pattern
  • Updated configuration to include default endpoints and dimensions for the new providers
  • Integrated both providers into CLI commands (watch, search) and MCP server with appropriate API key handling and error messages

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 5 comments.

File                     Description
embedder/synthetic.go    New Synthetic API embedder with default nomic-ai/nomic-embed-text-v1.5 model (768 dimensions)
embedder/openrouter.go   New OpenRouter embedder with default openai/text-embedding-3-small model (1536 dimensions)
config/config.go         Added synthetic and openrouter to provider list, default endpoints, and dimension handling
cli/watch.go             Added synthetic and openrouter cases with Ping checks and parallelism support
cli/search.go            Added synthetic and openrouter cases across three initialization locations
mcp/server.go            Added synthetic and openrouter cases in both createEmbedder and createWorkspaceEmbedder functions
.gitignore               Excluded example documentation files (OPENROUTER.md, openrouter-example.sh)

Comment thread config/config.go Outdated
}

- // Only set default dimensions for local embedders (Ollama, LMStudio).
+ // Only set default dimensions for local embedders (Ollama, LMStudio, Synthetic).
Copilot AI Feb 5, 2026

Misleading comment: Synthetic is not a local embedder but a cloud API (https://api.synthetic.new). Unlike Ollama and LMStudio which run locally, Synthetic requires an API key and is hosted externally. The comment should distinguish between local embedders (Ollama, LMStudio) and API-based ones (OpenAI, Synthetic, OpenRouter), or simply remove "local" from the description.

Suggested change
- // Only set default dimensions for local embedders (Ollama, LMStudio, Synthetic).
+ // Only set default dimensions for specific embedders (Ollama, LMStudio, Synthetic).

Comment thread mcp/server.go Outdated
opts := []embedder.OpenRouterOption{
	embedder.WithOpenRouterModel(ws.Embedder.Model),
	embedder.WithOpenRouterKey(ws.Embedder.APIKey),
	embedder.WithOpenRouterEndpoint(ws.Embedder.Endpoint),
Copilot AI Feb 5, 2026

Missing parallelism parameter for OpenRouter embedder in workspace configuration. The OpenAI embedder passes ws.Embedder.Parallelism (line 431), but OpenRouter does not. This is inconsistent with the createEmbedder function (line 892) where OpenRouter does pass cfg.Embedder.Parallelism. For consistency, consider adding: embedder.WithOpenRouterParallelism(ws.Embedder.Parallelism)

Suggested change
- embedder.WithOpenRouterEndpoint(ws.Embedder.Endpoint),
+ embedder.WithOpenRouterEndpoint(ws.Embedder.Endpoint),
+ embedder.WithOpenRouterParallelism(ws.Embedder.Parallelism),
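
Passing the workspace value through unconditionally looks safe given how the option is written in this revision of openrouter.go: it ignores non-positive values, so an unset (zero) parallelism leaves the constructor default in place. A minimal sketch of that behavior, using a reduced stand-in type (`embedderSketch`, `withParallelism`, and `newEmbedderSketch` are illustrative names, not identifiers from the PR):

```go
package main

import "fmt"

// embedderSketch is a reduced stand-in for the PR's OpenRouterEmbedder.
type embedderSketch struct {
	parallelism int
}

type option func(*embedderSketch)

// withParallelism mirrors the guard in WithOpenRouterParallelism:
// non-positive values are ignored so the constructor default survives.
func withParallelism(n int) option {
	return func(e *embedderSketch) {
		if n > 0 {
			e.parallelism = n
		}
	}
}

func newEmbedderSketch(opts ...option) *embedderSketch {
	e := &embedderSketch{parallelism: 4} // default value used in the PR's constructor
	for _, opt := range opts {
		opt(e)
	}
	return e
}

func main() {
	fmt.Println(newEmbedderSketch(withParallelism(0)).parallelism) // zero value keeps the default
	fmt.Println(newEmbedderSketch(withParallelism(8)).parallelism) // explicit value overrides it
}
```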

Comment thread embedder/synthetic.go
Comment on lines +1 to +210
package embedder

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

const (
	defaultSyntheticEndpoint = "https://api.synthetic.new"
	defaultSyntheticPath     = "/openai/v1"
	defaultSyntheticModel    = "hf:nomic-ai/nomic-embed-text-v1.5"
	syntheticEmbedDimensions = 768
)

type SyntheticEmbedder struct {
	endpoint   string
	model      string
	apiKey     string
	dimensions int
	client     *http.Client
}

type syntheticEmbedRequest struct {
	Model      string   `json:"model"`
	Input      []string `json:"input"`
	Dimensions *int     `json:"dimensions,omitempty"`
}

type syntheticEmbedResponse struct {
	Data []struct {
		Embedding []float32 `json:"embedding"`
		Index     int       `json:"index"`
	} `json:"data"`
	Model string `json:"model,omitempty"`
	Usage struct {
		PromptTokens int `json:"prompt_tokens"`
		TotalTokens  int `json:"total_tokens"`
	} `json:"usage"`
}

type syntheticErrorResponse struct {
	Error struct {
		Message string `json:"message"`
		Type    string `json:"type"`
	} `json:"error"`
}

type SyntheticOption func(*SyntheticEmbedder)

func WithSyntheticEndpoint(endpoint string) SyntheticOption {
	return func(e *SyntheticEmbedder) {
		e.endpoint = endpoint
	}
}

func WithSyntheticModel(model string) SyntheticOption {
	return func(e *SyntheticEmbedder) {
		e.model = model
	}
}

func WithSyntheticKey(key string) SyntheticOption {
	return func(e *SyntheticEmbedder) {
		e.apiKey = key
	}
}

func WithSyntheticDimensions(dimensions int) SyntheticOption {
	return func(e *SyntheticEmbedder) {
		e.dimensions = dimensions
	}
}

func NewSyntheticEmbedder(opts ...SyntheticOption) (*SyntheticEmbedder, error) {
	e := &SyntheticEmbedder{
		endpoint:   defaultSyntheticEndpoint + defaultSyntheticPath,
		model:      defaultSyntheticModel,
		dimensions: syntheticEmbedDimensions,
		client: &http.Client{
			Timeout: 90 * time.Second, // Longer timeout for synthetic API
		},
	}

	for _, opt := range opts {
		opt(e)
	}

	// Try to get API key from environment if not set
	if e.apiKey == "" {
		e.apiKey = os.Getenv("SYNTHETIC_API_KEY")
	}

	if e.apiKey == "" {
		e.apiKey = os.Getenv("OPENAI_API_KEY")
	}

	if e.apiKey == "" {
		return nil, fmt.Errorf("Synthetic API key not set (use SYNTHETIC_API_KEY or OPENAI_API_KEY environment variable)")
	}

	return e, nil
}

func (e *SyntheticEmbedder) Embed(ctx context.Context, text string) ([]float32, error) {
	embeddings, err := e.EmbedBatch(ctx, []string{text})
	if err != nil {
		return nil, err
	}
	return embeddings[0], nil
}

func (e *SyntheticEmbedder) EmbedBatch(ctx context.Context, texts []string) ([][]float32, error) {
	if len(texts) == 0 {
		return nil, nil
	}

	reqBody := syntheticEmbedRequest{
		Model:      e.model,
		Input:      texts,
		Dimensions: &e.dimensions,
	}

	jsonData, err := json.Marshal(reqBody)
	if err != nil {
		return nil, fmt.Errorf("failed to marshal request: %w", err)
	}

	url := fmt.Sprintf("%s/embeddings", e.endpoint)
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(jsonData))
	if err != nil {
		return nil, fmt.Errorf("failed to create request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", fmt.Sprintf("Bearer %s", e.apiKey))

	resp, err := e.client.Do(req)
	if err != nil {
		return nil, fmt.Errorf("failed to send request to Synthetic: %w", err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, fmt.Errorf("failed to read response: %w", err)
	}

	if resp.StatusCode != http.StatusOK {
		var errResp syntheticErrorResponse
		msg := string(body)
		if json.Unmarshal(body, &errResp) == nil && errResp.Error.Message != "" {
			msg = errResp.Error.Message
		}
		return nil, fmt.Errorf("Synthetic API error (status %d): %s", resp.StatusCode, msg)
	}

	var result syntheticEmbedResponse
	if err := json.Unmarshal(body, &result); err != nil {
		return nil, fmt.Errorf("failed to decode response: %w", err)
	}

	if len(result.Data) != len(texts) {
		return nil, fmt.Errorf("expected %d embeddings, got %d", len(texts), len(result.Data))
	}

	// Sort by index to maintain order
	embeddings := make([][]float32, len(texts))
	for _, item := range result.Data {
		embeddings[item.Index] = item.Embedding
	}

	return embeddings, nil
}

func (e *SyntheticEmbedder) Dimensions() int {
	return e.dimensions
}

func (e *SyntheticEmbedder) Close() error {
	return nil
}

// Ping checks if Synthetic API is reachable
func (e *SyntheticEmbedder) Ping(ctx context.Context) error {
	url := fmt.Sprintf("%s/embeddings", e.endpoint)
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader([]byte(`{"model":"hf:nomic-ai/nomic-embed-text-v1.5","input":"test"}`)))
	if err != nil {
		return fmt.Errorf("failed to create request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", fmt.Sprintf("Bearer %s", e.apiKey))

	resp, err := e.client.Do(req)
	if err != nil {
		return fmt.Errorf("failed to reach Synthetic at %s: %w", e.endpoint, err)
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		body, _ := io.ReadAll(resp.Body)
		return fmt.Errorf("Synthetic returned status %d: %s", resp.StatusCode, string(body))
	}

	return nil
}
Copilot AI Feb 5, 2026

Missing test coverage for the SyntheticEmbedder. All other embedders (Ollama, LMStudio, OpenAI) have corresponding test cases in embedder_test.go that verify default values, options, and the Dimensions() method. Consider adding similar test coverage for SyntheticEmbedder following the established pattern.
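
Beyond checking defaults and options, the request path itself can be covered without network access by pointing the embedder at a local fake server. The sketch below is an assumption about how such a test could look, not the tests later added to this PR: `newFakeEmbeddingServer` and `embedBatch` are illustrative stand-ins, and the fake returns items in reverse order to confirm that results are put back in input order by `index`.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type embedRequest struct {
	Model string   `json:"model"`
	Input []string `json:"input"`
}

type embedItem struct {
	Embedding []float32 `json:"embedding"`
	Index     int       `json:"index"`
}

type embedResponse struct {
	Data []embedItem `json:"data"`
}

// newFakeEmbeddingServer answers every request with one vector of length dims
// (dims must be positive) per input, emitted in reverse order.
func newFakeEmbeddingServer(dims int) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var req embedRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		var resp embedResponse
		for i := len(req.Input) - 1; i >= 0; i-- {
			vec := make([]float32, dims)
			vec[0] = float32(i) // tag each vector with its input position
			resp.Data = append(resp.Data, embedItem{Embedding: vec, Index: i})
		}
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(resp)
	}))
}

// embedBatch posts texts to endpoint+"/embeddings" and restores input order,
// rejecting indexes outside the expected range instead of panicking.
func embedBatch(endpoint, model string, texts []string) ([][]float32, error) {
	body, err := json.Marshal(embedRequest{Model: model, Input: texts})
	if err != nil {
		return nil, err
	}
	resp, err := http.Post(endpoint+"/embeddings", "application/json", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	var out embedResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	if len(out.Data) != len(texts) {
		return nil, fmt.Errorf("expected %d embeddings, got %d", len(texts), len(out.Data))
	}
	result := make([][]float32, len(texts))
	for _, item := range out.Data {
		if item.Index < 0 || item.Index >= len(result) {
			return nil, fmt.Errorf("index %d out of range", item.Index)
		}
		result[item.Index] = item.Embedding
	}
	return result, nil
}

func main() {
	srv := newFakeEmbeddingServer(768)
	defer srv.Close()
	vecs, err := embedBatch(srv.URL, "hf:nomic-ai/nomic-embed-text-v1.5", []string{"a", "b"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(vecs), len(vecs[0]))
}
```

In a real test file the same server would be passed to the embedder through its endpoint option, so the production code path is the one being exercised.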

Comment thread embedder/openrouter.go
Comment on lines +1 to +224
package embedder

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

const (
	defaultOpenRouterEndpoint = "https://openrouter.ai/api/v1"
	defaultOpenRouterModel    = "openai/text-embedding-3-small"
	openRouterDimensions      = 1536
)

type OpenRouterEmbedder struct {
	endpoint    string
	model       string
	apiKey      string
	dimensions  *int
	parallelism int
	client      *http.Client
}

type openRouterEmbedRequest struct {
	Model      string   `json:"model"`
	Input      []string `json:"input"`
	Dimensions *int     `json:"dimensions,omitempty"`
}

type openRouterEmbedResponse struct {
	Data []struct {
		Embedding []float32 `json:"embedding"`
		Index     int       `json:"index"`
	} `json:"data"`
	Usage struct {
		PromptTokens int `json:"prompt_tokens"`
		TotalTokens  int `json:"total_tokens"`
	} `json:"usage"`
}

type openRouterErrorResponse struct {
	Error struct {
		Message string `json:"message"`
		Type    string `json:"type"`
	} `json:"error"`
}

type OpenRouterOption func(*OpenRouterEmbedder)

func WithOpenRouterEndpoint(endpoint string) OpenRouterOption {
	return func(e *OpenRouterEmbedder) {
		e.endpoint = endpoint
	}
}

func WithOpenRouterModel(model string) OpenRouterOption {
	return func(e *OpenRouterEmbedder) {
		e.model = model
	}
}

func WithOpenRouterKey(key string) OpenRouterOption {
	return func(e *OpenRouterEmbedder) {
		e.apiKey = key
	}
}

func WithOpenRouterDimensions(dimensions int) OpenRouterOption {
	return func(e *OpenRouterEmbedder) {
		e.dimensions = &dimensions
	}
}

func WithOpenRouterParallelism(parallelism int) OpenRouterOption {
	return func(e *OpenRouterEmbedder) {
		if parallelism > 0 {
			e.parallelism = parallelism
		}
	}
}

func NewOpenRouterEmbedder(opts ...OpenRouterOption) (*OpenRouterEmbedder, error) {
	e := &OpenRouterEmbedder{
		endpoint:    defaultOpenRouterEndpoint,
		model:       defaultOpenRouterModel,
		dimensions:  nil, // nil = let the model use its native dimensions
		parallelism: 4,   // default parallelism
		client: &http.Client{
			Timeout: 60 * time.Second,
		},
	}

	for _, opt := range opts {
		opt(e)
	}

	// Try to get API key from environment if not set
	if e.apiKey == "" {
		e.apiKey = os.Getenv("OPENROUTER_API_KEY")
	}

	if e.apiKey == "" {
		e.apiKey = os.Getenv("OPENAI_API_KEY")
	}

	if e.apiKey == "" {
		return nil, fmt.Errorf("OpenRouter API key not set (use OPENROUTER_API_KEY or OPENAI_API_KEY environment variable)")
	}

	return e, nil
}

func (e *OpenRouterEmbedder) Embed(ctx context.Context, text string) ([]float32, error) {
	embeddings, err := e.EmbedBatch(ctx, []string{text})
	if err != nil {
		return nil, err
	}
	return embeddings[0], nil
}

func (e *OpenRouterEmbedder) EmbedBatch(ctx context.Context, texts []string) ([][]float32, error) {
	if len(texts) == 0 {
		return nil, nil
	}

	reqBody := openRouterEmbedRequest{
		Model:      e.model,
		Input:      texts,
		Dimensions: e.dimensions,
	}

	jsonData, err := json.Marshal(reqBody)
	if err != nil {
		return nil, fmt.Errorf("failed to marshal request: %w", err)
	}

	url := fmt.Sprintf("%s/embeddings", e.endpoint)
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(jsonData))
	if err != nil {
		return nil, fmt.Errorf("failed to create request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", fmt.Sprintf("Bearer %s", e.apiKey))
	req.Header.Set("HTTP-Referer", "grepai")
	req.Header.Set("X-Title", "grepai")

	resp, err := e.client.Do(req)
	if err != nil {
		return nil, fmt.Errorf("failed to send request to OpenRouter: %w", err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, fmt.Errorf("failed to read response: %w", err)
	}

	if resp.StatusCode != http.StatusOK {
		var errResp openRouterErrorResponse
		msg := string(body)
		if json.Unmarshal(body, &errResp) == nil && errResp.Error.Message != "" {
			msg = errResp.Error.Message
		}
		return nil, fmt.Errorf("OpenRouter API error (status %d): %s", resp.StatusCode, msg)
	}

	var result openRouterEmbedResponse
	if err := json.Unmarshal(body, &result); err != nil {
		return nil, fmt.Errorf("failed to decode response: %w", err)
	}

	if len(result.Data) != len(texts) {
		return nil, fmt.Errorf("expected %d embeddings, got %d", len(texts), len(result.Data))
	}

	// Sort by index to maintain order
	embeddings := make([][]float32, len(texts))
	for _, item := range result.Data {
		embeddings[item.Index] = item.Embedding
	}

	return embeddings, nil
}

func (e *OpenRouterEmbedder) Dimensions() int {
	if e.dimensions == nil {
		return openRouterDimensions
	}
	return *e.dimensions
}

func (e *OpenRouterEmbedder) Close() error {
	return nil
}

// Ping checks if OpenRouter API is reachable
func (e *OpenRouterEmbedder) Ping(ctx context.Context) error {
	url := fmt.Sprintf("%s/embeddings", e.endpoint)
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader([]byte(`{"model":"openai/text-embedding-3-small","input":"test"}`)))
	if err != nil {
		return fmt.Errorf("failed to create request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", fmt.Sprintf("Bearer %s", e.apiKey))
	req.Header.Set("HTTP-Referer", "grepai")

	resp, err := e.client.Do(req)
	if err != nil {
		return fmt.Errorf("failed to reach OpenRouter at %s: %w", e.endpoint, err)
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		body, _ := io.ReadAll(resp.Body)
		return fmt.Errorf("OpenRouter returned status %d: %s", resp.StatusCode, string(body))
	}

	return nil
}
Copilot AI Feb 5, 2026

Missing test coverage for the OpenRouterEmbedder. All other embedders (Ollama, LMStudio, OpenAI) have corresponding test cases in embedder_test.go that verify default values, options, and the Dimensions() method. Consider adding similar test coverage for OpenRouterEmbedder following the established pattern.
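
One behavior worth pinning down in such tests is the optional `dimensions` pointer: when it is nil, `Dimensions()` reports the 1536 fallback and the request body omits the field entirely, so the model uses its native size. The sketch below reproduces that logic on a reduced stand-in (`embedderSketch` and `requestJSON` are illustrative names, not identifiers from the PR):

```go
package main

import (
	"encoding/json"
	"fmt"
)

const fallbackDimensions = 1536 // same value as openRouterDimensions in the PR

type embedRequest struct {
	Model      string   `json:"model"`
	Input      []string `json:"input"`
	Dimensions *int     `json:"dimensions,omitempty"`
}

// embedderSketch is a reduced stand-in for OpenRouterEmbedder.
type embedderSketch struct {
	model      string
	dimensions *int // nil means "let the model use its native dimensions"
}

// Dimensions reports the configured size, or the fallback when none is set.
func (e *embedderSketch) Dimensions() int {
	if e.dimensions == nil {
		return fallbackDimensions
	}
	return *e.dimensions
}

// requestJSON builds the request body; omitempty drops a nil pointer field.
func (e *embedderSketch) requestJSON(texts []string) (string, error) {
	b, err := json.Marshal(embedRequest{Model: e.model, Input: texts, Dimensions: e.dimensions})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	e := &embedderSketch{model: "openai/text-embedding-3-small"}
	body, _ := e.requestJSON([]string{"hello"})
	fmt.Println(e.Dimensions(), body)

	d := 256
	e.dimensions = &d
	body, _ = e.requestJSON([]string{"hello"})
	fmt.Println(e.Dimensions(), body)
}
```

Note that the fallback is only correct for models whose native size is 1536; with a different model and no explicit dimensions, the reported value and the returned vectors could disagree.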

Comment thread embedder/openrouter.go
}
req.Header.Set("Content-Type", "application/json")
req.Header.Set("Authorization", fmt.Sprintf("Bearer %s", e.apiKey))
req.Header.Set("HTTP-Referer", "grepai")
Copilot AI Feb 5, 2026

Inconsistent header usage in Ping method. The EmbedBatch method sets both "HTTP-Referer" and "X-Title" headers (lines 149-150), but the Ping method only sets "HTTP-Referer" (line 210) and omits "X-Title". For consistency and to ensure Ping accurately tests the same request pattern, consider adding the "X-Title" header here as well.

Suggested change
- req.Header.Set("HTTP-Referer", "grepai")
+ req.Header.Set("HTTP-Referer", "grepai")
+ req.Header.Set("X-Title", "grepai")

@codecov

codecov Bot commented Feb 6, 2026

Codecov Report

❌ Patch coverage is 37.57576% with 206 lines in your changes missing coverage. Please review.
✅ Project coverage is 44.80%. Comparing base (a322537) to head (f0741a9).
⚠️ Report is 74 commits behind head on main.

Files with missing lines   Patch %   Lines
embedder/openrouter.go     33.98%    68 Missing ⚠️
embedder/synthetic.go      34.02%    64 Missing ⚠️
cli/init.go                1.81%     54 Missing ⚠️
embedder/factory.go        82.97%    4 Missing and 4 partials ⚠️
cli/watch.go               53.84%    5 Missing and 1 partial ⚠️
cli/search.go              0.00%     5 Missing ⚠️
mcp/server.go              50.00%    1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #106       +/-   ##
===========================================
+ Coverage   27.16%   44.80%   +17.64%     
===========================================
  Files          32       54       +22     
  Lines        3711     9679     +5968     
===========================================
+ Hits         1008     4337     +3329     
- Misses       2620     4965     +2345     
- Partials       83      377      +294     

@yoanbernabeu
Owner

PR Review: Add Synthetic and OpenRouter embedding providers

Thanks for this contribution @Revaz-Goguadze! Adding more embedding providers is great for the project. Here are a few items that need to be addressed before we can merge:

CI Failures

  1. Lint failure - There's an indentation issue in cli/watch.go: the default: case lost its indentation and is at column 0 instead of being aligned within the switch block.

  2. Code coverage (codecov/patch) - No unit tests were added for the new embedder/synthetic.go and embedder/openrouter.go files. Please add tests similar to the existing ones in the embedder/ package.

Code Quality

  1. Code duplication - The embedder initialization pattern is copy-pasted across 5+ locations (search.go x3, watch.go x1, mcp/server.go x2). We'd like to take this opportunity to refactor this into a factory function (e.g., embedder.NewFromConfig(cfg)) that centralizes provider initialization. This would benefit all existing and future providers.

  2. Unused parallelism field - The OpenRouterEmbedder.parallelism field is configured but never used in EmbedBatch(). Either implement parallel batch processing or remove the field.

  3. Error message convention - In Go, error messages should start with a lowercase letter. Please update messages like "Synthetic API key not set..." → "synthetic API key not set...".
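
The factory idea in the first item could look roughly like the sketch below. Everything here is a stand-in: the `Embedder` interface is reduced to one method, `Config` and its `Provider` field are assumed names, and `stubEmbedder` replaces the real constructors. Only the provider names and default dimensions come from this PR. The error message also follows the lowercase convention from the third item.

```go
package main

import "fmt"

// Embedder is a reduced stand-in; the project's interface has more methods.
type Embedder interface {
	Dimensions() int
}

// Config is a hypothetical, trimmed-down embedder configuration.
type Config struct {
	Provider string
	Model    string
	Endpoint string
	APIKey   string
}

// stubEmbedder stands in for the real provider implementations.
type stubEmbedder struct {
	provider string
	dims     int
}

func (s stubEmbedder) Dimensions() int { return s.dims }

// NewFromConfig centralizes the provider switch that was repeated at each
// call site. A real version would call the provider constructors instead.
func NewFromConfig(cfg Config) (Embedder, error) {
	switch cfg.Provider {
	case "synthetic":
		return stubEmbedder{provider: cfg.Provider, dims: 768}, nil
	case "openrouter":
		return stubEmbedder{provider: cfg.Provider, dims: 1536}, nil
	default:
		return nil, fmt.Errorf("unknown embedding provider: %q", cfg.Provider)
	}
}

func main() {
	for _, p := range []string{"synthetic", "openrouter", "nope"} {
		emb, err := NewFromConfig(Config{Provider: p})
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(p, emb.Dimensions())
	}
}
```

With a single entry point, adding a provider means touching one switch instead of every command that builds an embedder.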

Minor

  1. GetDimensions() in config.go - Consider adding an explicit case for openrouter (default 1536) to keep the dimension defaults consistent.

Please fix these issues and the PR should be good to go. Happy to help if you have any questions!

Revaz-Goguadze and others added 3 commits February 7, 2026 17:07
- Fixed lint issue: corrected indentation of default case in cli/watch.go
- Fixed error message convention: changed to lowercase (synthetic/openrouter)
- Removed unused parallelism field from OpenRouterEmbedder
- Added documentation note explaining intentional omission of parallelism
- Removed WithOpenRouterParallelism option function
- All lints pass (go vet + golangci-lint)

Note: Parallelism was intentionally removed to keep the implementation simple.
OpenRouter processes batches efficiently as-is. Future enhancement could
implement BatchEmbedder interface with adaptive rate limiting.
…d tests

- Added synthetic and openrouter to grepai init interactive and non-interactive modes
- Added --model flag for OpenRouter model selection (text-embedding-3-small, text-embedding-3-large, qwen3-embedding-8b)
- Updated GetDimensions() to include openrouter with 1536 default
- Created embedder.NewFromConfig() factory function to reduce code duplication
- Added comprehensive unit tests for synthetic, openrouter, and factory
- Fixed lowercase error message convention
- All tests pass (go test ./...)
@Revaz-Goguadze
Contributor Author

@yoanbernabeu All review items fixed. Ready for review!

@yoanbernabeu
Owner

Review Summary

Thanks for this contribution! The two new providers follow the existing patterns well and the test coverage for options/configuration is solid. Here are some changes I'd like to see before merging:

Critical

1. Factory created but not used
You added embedder/factory.go with NewFromConfig() / NewFromWorkspaceConfig() which is a great idea to reduce duplication. However, it's not actually used anywhere — the same switch/case blocks are still duplicated in cli/search.go (3 places), cli/watch.go (1 place), and mcp/server.go (2 places). Please either:

  • Replace all 6 duplicated blocks with calls to the factory, or
  • Remove the factory file if you prefer to keep the current approach

2. Ping() uses hardcoded models instead of e.model

  • openrouter.go:207 hardcodes "openai/text-embedding-3-small"
  • synthetic.go:196 hardcodes "hf:nomic-ai/nomic-embed-text-v1.5"

If a user configures a different model, Ping() won't test the actual configuration. Please use e.model instead.
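
A minimal sketch of a Ping payload that follows the configured model, assuming the same request shape as the hardcoded strings (a `model` plus a single string `input`). `pingRequest` and `buildPingBody` are illustrative names, not identifiers from the PR. Building the body with `json.Marshal` also escapes the model name correctly if it ever contains quotes or other special characters.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pingRequest mirrors the shape of the hardcoded ping payloads.
type pingRequest struct {
	Model string `json:"model"`
	Input string `json:"input"`
}

// buildPingBody encodes a ping payload for whatever model is configured,
// instead of embedding a fixed model name in a string literal.
func buildPingBody(model string) ([]byte, error) {
	return json.Marshal(pingRequest{Model: model, Input: "test"})
}

func main() {
	body, err := buildPingBody("openai/text-embedding-3-large")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(body))
}
```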

Minor

3. Misleading comment in config/config.go:275
The comment says "local embedders (Ollama, LMStudio, Synthetic)" but Synthetic is a cloud API, not local. Suggestion: "Only set default dimensions for specific embedders (Ollama, LMStudio, Synthetic)."

4. Missing X-Title header in OpenRouter Ping()
EmbedBatch() sets both HTTP-Referer and X-Title, but Ping() only sets HTTP-Referer. Please add X-Title for consistency.

5. .gitignore cleanup

  • AGENT_CONTEXT.md doesn't seem related to this PR — is it needed?
  • There's a trailing double blank line at the end of the file

Revaz-Goguadze and others added 3 commits February 9, 2026 16:22
Replace hardcoded JSON strings with json.Marshal in both OpenRouter and Synthetic embedders to prevent potential injection attacks.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Replace duplicate switch statements with NewFromConfig and NewFromWorkspaceConfig factory methods in cli/search, cli/watch, and mcp/server.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Remove AGENT_CONTEXT.md from .gitignore and update comment in config.go for clarity.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
@Revaz-Goguadze
Contributor Author

Revaz-Goguadze commented Feb 17, 2026

@yoanbernabeu Thanks for the detailed review, and sorry for the delay. I addressed those issues and pushed again.

@yoanbernabeu
Owner

Thanks for the update @Revaz-Goguadze! All the review items from the previous round have been addressed:

  • ✅ Factory pattern (NewFromConfig/NewFromWorkspaceConfig) now used across all CLI commands and MCP server — great refactoring!
  • Ping() uses e.model instead of hardcoded values
  • ✅ Misleading "local" comment fixed
  • X-Title header added to OpenRouter Ping()
  • .gitignore cleaned up

Backward compatibility verified: full workflow (init → watch → search → trace) with Ollama + GOB produces identical results on both main and this branch. All unit tests pass.

LGTM — merging! 🎉

@yoanbernabeu yoanbernabeu merged commit 5e3b827 into yoanbernabeu:main Feb 19, 2026
10 checks passed