Fix deprecated default Gemini model to latest version #368
Conversation
📝 Walkthrough: The PR updates the default Gemini language model and documents the configuration option. The code changes the provider detection default (Gemini model configuration).
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~3 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
🔧 ESLint skipped: no ESLint configuration detected in the root package.json.
* **chore(providers): bump Gemini defaults to current GA models**

  Bundles two upstream PRs into one chore — both are blocking real users today and both are simple default-string bumps with no API contract change.

  LLM default (was PR #368, @yut304)
  - `gemini-2.0-flash` is deprecated in Google's Gemini API and returns 429 rate-limit errors under load. Replace the default with `gemini-flash-latest`. Users on a pinned `GEMINI_MODEL` in `~/.agentmemory/.env` are unaffected.

  Embedding default (was PR #246, @AmmarSaleh50)
  - `text-embedding-004` is deprecated (shutdown Jan 14 2026). Replace with `gemini-embedding-001` (GA): 100+ languages, MRL dims (768 / 1536 / 3072), 2048-token input.
  - URL path changes from `:batchEmbedContent` to `:batchEmbedContents` (plural — the new model's batch endpoint).
  - Each request now sends `outputDimensionality: 768` so the returned vectors match the existing index dim guard from #248 — no reindex needed.
  - L2-normalize each returned vector before pushing to the result array. `gemini-embedding-001` does not normalize by default, unlike `text-embedding-004`. Without this, the cosine-similarity math elsewhere in the search pipeline (which assumes unit-length vectors) collapses.

  Verified
  - `npm test` clean: 903 / 903.
  - `npm run build` clean.

  Closes #368, closes #246.

* **fix(gemini): pin LLM default to gemini-2.5-flash + warn-once on zero-norm**

  Addresses CodeRabbit findings on PR #370.

  1. Pin Gemini LLM default to `gemini-2.5-flash`. `gemini-flash-latest` is a moving alias that points to whatever Google promotes next. Production behaviour should be deterministic from a release perspective — users who upgrade agentmemory should not also get a Gemini model rotation in the same step. Switch the default to the current stable GA model `gemini-2.5-flash`. Users who want the moving alias keep getting it via `GEMINI_MODEL=gemini-flash-latest` in `~/.agentmemory/.env`.

  2. Warn once on a zero-norm embedding in `l2Normalize`. `gemini-embedding-001` can return a zero-norm vector for degenerate input. The previous code silently returned the zero vector — downstream cosine-similarity math then divides by zero and the call site sees `NaN` scores with no signal as to why. Emit a one-time stderr warning naming the model and vector length so operators can correlate index quality dips with upstream embedding regressions. Behaviour is otherwise unchanged: return the zero vector and let BM25 carry the search signal. Throwing was the other option — rejected because a single bad embedding in a 100-item batch would abort the whole batch and surface as an indexing pipeline halt. Soft-fail + warn matches the rest of the embedding provider error handling.

  Skipped finding:
  - `outputDimensionality` → `output_dimensionality` snake_case rename. CodeRabbit asserts the REST API expects snake_case. The Gemini REST API actually uses camelCase on the wire — confirmed against ai.google.dev/api/embeddings (field labelled `outputDimensionality` in the REST schema; the Python SDK alone uses snake_case and translates internally). Current code is correct as shipped; the snake_case rename would silently break the dim override.

  Verified: 903 / 903 tests pass; build clean.
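The warn-once zero-norm handling described in the second commit can be sketched as follows. This is a minimal illustration, not the project's actual source: `l2Normalize` matches the name used in the commit message, but the latch variable, warning text, and `MODEL` constant are assumptions.

```javascript
// Sketch of warn-once L2 normalization for embedding vectors.
// MODEL and the warning wording are illustrative assumptions.
const MODEL = "gemini-embedding-001";

let warnedZeroNorm = false; // one-time latch so a bad batch doesn't spam stderr

function l2Normalize(vec) {
  const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
  if (norm === 0) {
    if (!warnedZeroNorm) {
      warnedZeroNorm = true;
      console.error(
        `[agentmemory] ${MODEL} returned a zero-norm embedding ` +
          `(length ${vec.length}); cosine scores for this item will be ` +
          `unusable, falling back to the BM25 signal.`
      );
    }
    return vec; // soft-fail: keep the zero vector rather than abort the batch
  }
  return vec.map((x) => x / norm);
}
```

The soft-fail path mirrors the trade-off in the commit message: one degenerate input must not halt a 100-item indexing batch, so the zero vector passes through and only a single warning is emitted.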
Summary
`gemini-2.0-flash` is deprecated and consistently returns 429 rate-limit errors.
Changes
Replace `gemini-2.0-flash` with `gemini-flash-latest`.
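The change amounts to a one-line default bump; a minimal sketch of the resolution order is below. The `GEMINI_MODEL` variable and the `~/.agentmemory/.env` override come from this PR, while the `resolveGeminiModel` helper name is hypothetical.

```javascript
// Illustrative sketch of the default-model bump; not the project's actual code.
// A pinned GEMINI_MODEL (loaded from ~/.agentmemory/.env) always wins; the
// new default only applies when the user has not set one.
const DEFAULT_GEMINI_MODEL = "gemini-flash-latest"; // was gemini-2.0-flash (deprecated, 429s)

function resolveGeminiModel(env = process.env) {
  return env.GEMINI_MODEL || DEFAULT_GEMINI_MODEL;
}
```

Because the override is checked first, users with a pinned model in their env file see no behaviour change from the bump.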