Fix deprecated default Gemini model to latest version#368

Closed
yut304 wants to merge 1 commit into rohitg00:main from yut304:main
Conversation

@yut304 yut304 commented May 14, 2026

Summary

gemini-2.0-flash is deprecated, and requests to it always return a 429 rate-limit error.

Changes

Replace gemini-2.0-flash with gemini-flash-latest as the default model.

Summary by CodeRabbit

  • Documentation

    • Updated README with Gemini model configuration guidance and example environment variables.
  • Chores

    • Updated default Gemini model version used when credentials are configured, maintaining support for custom model overrides.

vercel Bot commented May 14, 2026

Someone is attempting to deploy a commit to the rohitg00's projects Team on Vercel.

A member of the Team first needs to authorize it.

coderabbitai Bot commented May 14, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4f906e9c-c276-4f5c-9687-0eff5a5d616b

📥 Commits

Reviewing files that changed from the base of the PR and between 8c3418c and 7b5a8d1.

📒 Files selected for processing (2)
  • README.md
  • src/config.ts

📝 Walkthrough

The PR updates the default Gemini language model and documents the configuration option. The code changes the provider detection default from gemini-2.0-flash to gemini-flash-latest, while the README now shows users how to optionally customize this via the GEMINI_MODEL environment variable.

Changes

Gemini Model Configuration

| Layer / File(s) | Summary |
|---|---|
| Default Gemini model and configuration documentation (src/config.ts, README.md) | Default Gemini model switched from gemini-2.0-flash to gemini-flash-latest in the provider detection logic, with corresponding README documentation added to explain the optional GEMINI_MODEL environment variable override. |
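
The override behaviour described in the summary can be sketched roughly as follows. This is a minimal illustration, not the actual code in src/config.ts: the names `DEFAULT_GEMINI_MODEL`, `Env`, and `resolveGeminiModel` are hypothetical, and treating a blank `GEMINI_MODEL` as unset is an assumption.

```typescript
// Hypothetical sketch: these names are illustrative and are not taken from src/config.ts.
const DEFAULT_GEMINI_MODEL = "gemini-flash-latest";

// A plain map of environment variables, so the function is easy to test.
type Env = Record<string, string | undefined>;

/**
 * Returns the Gemini model to use: the GEMINI_MODEL override when it is set
 * and non-blank (assumption), otherwise the default.
 */
function resolveGeminiModel(env: Env, fallback: string = DEFAULT_GEMINI_MODEL): string {
  const override = env.GEMINI_MODEL?.trim();
  return override ? override : fallback;
}
```

In a Node process this would typically be called as `resolveGeminiModel(process.env)`, so a user who pins `GEMINI_MODEL` keeps that model regardless of what the default changes to.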

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

  • rohitg00/agentmemory#154: Modifies Gemini provider path in src/config.ts's detectProvider via GOOGLE_API_KEY changes and related stderr warnings.

Poem

🐰 A flash of light, so bright and new,
From 2.0 to latest—a model refresh true,
With docs to guide each wandering soul,
The Gemini path plays its fleeting role! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: replacing a deprecated Gemini model with the latest version, which aligns with the core modification in src/config.ts. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.

rohitg00 added a commit that referenced this pull request May 14, 2026
* chore(providers): bump Gemini defaults to current GA models

Bundles two upstream PRs into one chore — both are blocking real users
today and both are simple default-string bumps with no API contract
change.

LLM default (was PR #368, @yut304)
- `gemini-2.0-flash` is deprecated in Google's Gemini API and returns
  429 rate-limit errors under load. Replace the default with
  `gemini-flash-latest`. Users on a pinned `GEMINI_MODEL` in
  `~/.agentmemory/.env` are unaffected.

Embedding default (was PR #246, @AmmarSaleh50)
- `text-embedding-004` is deprecated (shutdown Jan 14 2026). Replace
  with `gemini-embedding-001` (GA): 100+ languages, MRL dims
  (768 / 1536 / 3072), 2048-token input.
- URL path changes from `:batchEmbedContent` to `:batchEmbedContents`
  (plural — the new model's batch endpoint).
- Each request now sends `outputDimensionality: 768` so the returned
  vectors match the existing index dim guard from #248 — no
  reindex needed.
- L2-normalize each returned vector before pushing to the result
  array. `gemini-embedding-001` does not normalize by default,
  unlike `text-embedding-004`. Without this the cosine-similarity
  math elsewhere in the search pipeline (which assumes unit-length
  vectors) collapses.
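
A minimal sketch of the L2-normalization step described in the last bullet, and of why it matters for cosine similarity. The function names are hypothetical and the real implementation in the embedding provider may differ.

```typescript
/** Scales a vector to unit length. A zero vector is returned unchanged. */
function l2Normalize(vec: readonly number[]): number[] {
  const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? [...vec] : vec.map((x) => x / norm);
}

/** Dot product. It equals cosine similarity only when both inputs are unit-length. */
function dot(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}
```

If the search pipeline scores with a plain dot product, unnormalized vectors are ranked partly by magnitude rather than by direction, which is presumably the failure mode the commit message refers to.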

Verified
- `npm test` clean: 903 / 903.
- `npm run build` clean.

Closes #368, closes #246.

* fix(gemini): pin LLM default to gemini-2.5-flash + warn-once on zero-norm

Addresses CodeRabbit findings on PR #370.

1. Pin Gemini LLM default to gemini-2.5-flash.

   `gemini-flash-latest` is a moving alias that points to whatever
   Google promotes next. Production behaviour should be deterministic
   from a release perspective — users who upgrade agentmemory should
   not also get a Gemini model rotation in the same step. Switch the
   default to the current stable GA model `gemini-2.5-flash`.
   Users who want the moving alias keep getting it via
   `GEMINI_MODEL=gemini-flash-latest` in `~/.agentmemory/.env`.

2. Warn-once on zero-norm embedding in l2Normalize.

   `gemini-embedding-001` can return a zero-norm vector for
   degenerate input. The previous code silently returned the zero
   vector — downstream cosine-similarity math then divides by zero
   and the call site sees `NaN` scores with no signal as to why.

   Emit a one-time stderr warning naming the model + vector length
   so operators can correlate index quality dips with upstream
   embedding regressions. Behaviour otherwise unchanged: return the
   zero vector and let BM25 carry the search signal.

   Throwing was the other option — rejected because a single bad
   embedding in a 100-item batch would abort the whole batch and
   surface as an indexing pipeline halt. Soft-fail + warn matches
   the rest of the embedding provider error handling.
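
The warn-once, soft-fail behaviour from item 2 could look roughly like this. It is a sketch under assumptions: `makeL2Normalizer`, the injected `warn` callback, and the message wording are hypothetical and not taken from the repository.

```typescript
type Warn = (message: string) => void;

/**
 * Builds a normalizer that warns at most once when it sees a zero-norm vector.
 * The default warn callback writes to stderr via console.error.
 */
function makeL2Normalizer(
  model: string,
  warn: Warn = (m) => console.error(m),
): (vec: readonly number[]) => number[] {
  let warned = false;
  return (vec) => {
    const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
    if (norm === 0) {
      if (!warned) {
        warned = true;
        warn(`[embedding] ${model} returned a zero-norm vector (length ${vec.length})`);
      }
      // Soft-fail: return the zero vector instead of throwing, so one bad
      // embedding does not abort the rest of the batch.
      return [...vec];
    }
    return vec.map((x) => x / norm);
  };
}
```

Keeping the `warned` flag in a closure means each normalizer instance warns once; a module-level flag would give one warning per process instead.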

Skipped finding:

- `outputDimensionality` → `output_dimensionality` snake_case rename.
  CodeRabbit asserts the REST API expects snake_case. The Gemini
  REST API actually uses camelCase on the wire — confirmed against
  ai.google.dev/api/embeddings (field labelled
  `outputDimensionality` in the REST schema; the Python SDK alone
  uses snake_case and translates internally). Current code is
  correct as-shipped; the snake_case rename would silently break
  the dim override.
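
To illustrate the camelCase point, here is a sketch of a request-body builder for the `:batchEmbedContents` endpoint. The `requests[].model` / `content.parts[].text` shape and the `models/` prefix are assumptions based on the public Gemini REST documentation and should be checked against it; `buildBatchEmbedBody` is a hypothetical helper, not code from this repository.

```typescript
interface EmbedRequest {
  model: string;
  content: { parts: { text: string }[] };
  // camelCase on the wire, as the commit message above argues.
  outputDimensionality: number;
}

/** Builds a JSON body with one embed request per input text. */
function buildBatchEmbedBody(
  texts: readonly string[],
  model: string = "gemini-embedding-001",
  dims: number = 768,
): { requests: EmbedRequest[] } {
  return {
    requests: texts.map((text) => ({
      model: `models/${model}`, // assumed resource-name format
      content: { parts: [{ text }] },
      outputDimensionality: dims,
    })),
  };
}
```

Serializing the result with `JSON.stringify` keeps the field name as `outputDimensionality`, so no key translation happens between the code and the request.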

Verified: 903 / 903 tests pass; build clean.