Merged
2 changes: 1 addition & 1 deletion README.md
```diff
@@ -719,7 +719,7 @@ npm install @xenova/transformers
 | Provider | Model | Cost | Notes |
 |---|---|---|---|
 | **Local (recommended)** | `all-MiniLM-L6-v2` | Free | Offline, +8pp recall over BM25-only |
-| Gemini | `text-embedding-004` | Free tier | 1500 RPM |
+| Gemini | `gemini-embedding-001` | Free tier | 100+ languages, 768/1536/3072 dims (MRL), 2048-token input. Replaces `text-embedding-004` ([deprecated, shutdown Jan 14, 2026](https://ai.google.dev/gemini-api/docs/deprecations)) |
 | OpenAI | `text-embedding-3-small` | $0.02/1M | Highest quality |
 | Voyage AI | `voyage-code-3` | Paid | Optimized for code |
 | Cohere | `embed-english-v3.0` | Free trial | General purpose |
```
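The MRL dimension options in the updated table row trade retrieval quality against index size; as a rough sizing aid (not from the README, and assuming plain float32 storage), each dimension costs 4 bytes:

```typescript
// Hypothetical helper: bytes needed to store one float32 embedding at each
// dimensionality gemini-embedding-001 supports via MRL truncation.
function embeddingBytes(dims: number, bytesPerValue = 4): number {
  return dims * bytesPerValue;
}

// The three recommended sizes from the table row above.
for (const d of [768, 1536, 3072]) {
  console.log(`${d} dims -> ${embeddingBytes(d)} bytes/vector`);
}
```

At 3072 dims a million vectors costs roughly 12 GB before any index overhead, which is why the PR pins a reduced dimensionality.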
2 changes: 1 addition & 1 deletion src/config.ts
```diff
@@ -76,7 +76,7 @@ function detectProvider(env: Record<string, string>): ProviderConfig {
   }
   return {
     provider: "gemini",
-    model: env["GEMINI_MODEL"] || "gemini-2.0-flash",
+    model: env["GEMINI_MODEL"] || "gemini-2.5-flash",
     maxTokens,
   };
 }
```
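The fallback pattern in this hunk can be exercised on its own; `pickModel` below is a hypothetical stand-in for the single changed line in `detectProvider`, showing that an explicit `GEMINI_MODEL` still wins over the new default:

```typescript
// Hypothetical mirror of the env fallback in detectProvider:
// an explicit GEMINI_MODEL pins the model, otherwise the new default applies.
function pickModel(env: Record<string, string>): string {
  return env["GEMINI_MODEL"] || "gemini-2.5-flash";
}

console.log(pickModel({}));                                   // new default
console.log(pickModel({ GEMINI_MODEL: "gemini-2.0-flash" })); // pinned old model
```

Because `||` treats the empty string as falsy, `GEMINI_MODEL=""` also falls through to the default, which is usually the desired behavior for env vars.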
29 changes: 26 additions & 3 deletions src/providers/embedding/gemini.ts
```diff
@@ -2,7 +2,8 @@ import type { EmbeddingProvider } from "../../types.js";
 import { getEnvVar } from "../../config.js";
 
 const BATCH_LIMIT = 100;
-const API_BASE = "https://generativelanguage.googleapis.com/v1beta/models/text-embedding-004:batchEmbedContent";
+const MODEL = "models/gemini-embedding-001";
+const API_BASE = `https://generativelanguage.googleapis.com/v1beta/${MODEL}:batchEmbedContents`;
 
 export class GeminiEmbeddingProvider implements EmbeddingProvider {
   readonly name = "gemini";
```
```diff
@@ -29,8 +30,9 @@ export class GeminiEmbeddingProvider implements EmbeddingProvider {
       headers: { "Content-Type": "application/json" },
       body: JSON.stringify({
         requests: chunk.map((t) => ({
-          model: "models/text-embedding-004",
+          model: MODEL,
           content: { parts: [{ text: t }] },
+          outputDimensionality: this.dimensions,
         })),
```
Comment on lines 32 to 36
⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

Gemini gemini-embedding-001 outputDimensionality parameter reduce dimensions

💡 Result:

For the Gemini embedding model `gemini-embedding-001`, the parameter that reduces embedding dimensionality is `output_dimensionality` / `outputDimensionality`.

1) What it does
- By default, `gemini-embedding-001` returns 3072-dimensional embeddings. [1]
- You can pass `output_dimensionality` (or `outputDimensionality`) to control the size of the output embedding vector; smaller values truncate the embedding to the requested size (saving storage and improving downstream efficiency). [1][2]

2) Parameter name(s) by API surface
- Gemini API (ai.google.dev docs): use `output_dimensionality` in the embed content request. [3]
- Vertex AI Generative AI API (cloud.google.com docs): use `output_dimensionality` in `EmbedContentConfig` (shown as `output_dimensionality=...` in examples). [1]
- The Vertex AI model reference also describes `outputDimensionality` as an optional int that sets the output embedding size, truncating if set. [2]

3) Practical example (what to change)
- Request without dimensionality: defaults to 3072 dims. [1]
- Request with dimensionality: set `output_dimensionality`/`outputDimensionality` to e.g. 768, 1536, or 3072 (Google recommends 768/1536/3072 for quality/storage tradeoffs). [3][4]

4) Common gotcha
- Ensure your client/library actually sends the parameter; otherwise you get the default 3072-dimension vectors, which can break downstream vector databases expecting the reduced dimension. [5]

So, to reduce dimensions for `gemini-embedding-001`, set `output_dimensionality` (Gemini API) or `outputDimensionality`/`output_dimensionality` (Vertex AI) in the `embedContent` request config; e.g., 768 instead of the default 3072. [1][2][3]

Citations:

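The "truncate" behavior described in the query result can be mimicked client-side; this is only an illustrative sketch (real MRL truncation happens server-side, before the model normalizes the vector), showing why truncated vectors must be re-scaled to unit length:

```typescript
// Client-side sketch of MRL-style truncation: keep the first k components,
// then re-scale to unit length so cosine similarity stays meaningful.
function truncateEmbedding(vec: number[], k: number): number[] {
  const head = vec.slice(0, k);
  const norm = Math.hypot(...head);
  // A zero-norm head cannot be normalized; pass it through unchanged.
  return norm === 0 ? head : head.map((x) => x / norm);
}

const full = [0.5, 0.5, 0.5, 0.5];        // stand-in for a 3072-dim vector
const small = truncateEmbedding(full, 2); // keep 2 dims, re-normalized
console.log(small);
```

Skipping the re-normalization step is the same class of bug as the "common gotcha" above: downstream similarity scores silently shift scale.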

🏁 Script executed:

```shell
cat -n src/providers/embedding/gemini.ts | head -50
```

Repository: rohitg00/agentmemory

Length of output: 2110


Change `outputDimensionality` to `output_dimensionality` (snake_case).

The code targets the Gemini API (ai.google.dev), which expects the parameter as `output_dimensionality` in snake_case, not `outputDimensionality` in camelCase (the Vertex AI convention). Using the wrong parameter name will cause the API to ignore it and return default 3072-dimensional vectors instead of the intended 768 dimensions, breaking compatibility with existing indexes.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/providers/embedding/gemini.ts` around lines 32-36, the requests payload uses the camelCase key `outputDimensionality`, which Gemini expects as snake_case `output_dimensionality`; update the `chunk.map(...)` object so the property is `output_dimensionality: this.dimensions` (instead of `outputDimensionality`) wherever requests for `MODEL` are built in the embedding/gemini provider, and rename any other occurrences of `outputDimensionality` to `output_dimensionality` so the API receives the intended 768-dimension vector setting.

```diff
       }),
     });
@@ -45,10 +47,31 @@ export class GeminiEmbeddingProvider implements EmbeddingProvider {
       };
 
       for (const emb of data.embeddings) {
-        results.push(new Float32Array(emb.values));
+        results.push(l2Normalize(new Float32Array(emb.values)));
       }
     }
 
     return results;
   }
 }
+
+let zeroNormWarned = false;
+
+function l2Normalize(vec: Float32Array): Float32Array {
+  let sum = 0;
+  for (let i = 0; i < vec.length; i++) sum += vec[i]! * vec[i]!;
+  const norm = Math.sqrt(sum);
+  if (norm === 0) {
+    if (!zeroNormWarned) {
+      zeroNormWarned = true;
+      process.stderr.write(
+        `[agentmemory] warn: gemini-embedding-001 returned a zero-norm ` +
+          `embedding (length=${vec.length}); leaving it un-normalized. ` +
+          `Subsequent zero-norm vectors will not be reported.\n`,
+      );
+    }
+    return vec;
+  }
+  for (let i = 0; i < vec.length; i++) vec[i] = vec[i]! / norm;
+  return vec;
+}
```
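The normalization added in this hunk can be checked in isolation; the sketch below reproduces the PR's `l2Normalize` without the one-time stderr warning path:

```typescript
// Copy of the PR's l2Normalize, minus the warning: scales a vector to unit
// length, passing zero-norm vectors through unchanged (in place, like the PR).
function l2Normalize(vec: Float32Array): Float32Array {
  let sum = 0;
  for (let i = 0; i < vec.length; i++) sum += vec[i]! * vec[i]!;
  const norm = Math.sqrt(sum);
  if (norm === 0) return vec; // zero-norm: nothing sensible to scale
  for (let i = 0; i < vec.length; i++) vec[i] = vec[i]! / norm;
  return vec;
}

const v = l2Normalize(new Float32Array([3, 4]));
console.log(v[0], v[1]); // components ≈ 0.6, 0.8 (float32 rounding applies)
```

Normalizing at ingest keeps dot-product and cosine similarity interchangeable downstream, which matters once vectors come from an MRL-truncated model.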