From bd7c14f8540bb4d251ec5916379d8c66078d5f3c Mon Sep 17 00:00:00 2001
From: mason5052
Date: Wed, 3 Jun 2026 23:01:05 -0400
Subject: [PATCH 1/2] docs: add native Google Vertex AI provider RFC

Issue #321 requests a native Vertex AI provider with service-account /
ADC auth, opened after #310 clarified Vertex is not supported today. The
issue carries a full implementation outline and four open questions, and
the author asks for direction on adapter strategy and Gemini-vs-Claude
scope before implementing.

Add examples/proposals/vertex_ai_provider.md, a planning RFC that frames
the work before any code is written:

- Distinguishes the current options (AI Studio gemini, direct Anthropic,
  AWS Bedrock, and the custom OpenAI-compatible LiteLLM proxy workaround)
  from the proposed native vertex provider.
- Recommends a staged approach: v1 Gemini-on-Vertex with ADC /
  service-account auth; Claude-on-Vertex deferred to a maintainer
  decision, noting it likely belongs on the Anthropic adapter (Vertex
  auth/endpoint mode) rather than the Gemini-shaped path because the
  message schema differs.
- Models auth on the existing Bedrock multi-auth precedent (ADC default
  chain plus service-account file), and flags that service-account JSON
  is sensitive and must be file-mounted/secret-managed, never pasted into
  UI or logs.
- Captures config/migration touch points (provider checklist, REST
  Valid() whitelist, PROVIDER_TYPE enum-swap migration) so the eventual
  implementation size is clear, and restates the issue's open questions.

Docs/RFC only. No code, schema, migration, generated, frontend, or
provider runtime files are touched, and no new env vars or types are
added; every VERTEX_* key and type named is labeled as a candidate. No
overlap with the provider files in open PR #328.
---
 examples/proposals/vertex_ai_provider.md | 230 +++++++++++++++++++++++
 1 file changed, 230 insertions(+)
 create mode 100644 examples/proposals/vertex_ai_provider.md

diff --git a/examples/proposals/vertex_ai_provider.md b/examples/proposals/vertex_ai_provider.md
new file mode 100644
index 000000000..9fde08b06
--- /dev/null
+++ b/examples/proposals/vertex_ai_provider.md
@@ -0,0 +1,230 @@
+# RFC: Native Google Vertex AI Provider
+
+> Status: RFC / planning. This document proposes a future provider. Native
+> Vertex AI support does **not** exist in PentAGI today; every variable, type,
+> and migration named here is a candidate, not a shipped feature. The intent is
+> to agree on direction (especially the open questions) before any code is
+> written.
+
+## Summary
+
+This RFC proposes a native **Google Vertex AI** provider (`vertex`) so users can
+authenticate with GCP project credentials (Application Default Credentials or a
+service-account key) instead of an AI Studio API key. It is motivated by #310
+and #321: today the only Google option is the AI Studio `gemini` provider, which
+accepts an API key against `https://generativelanguage.googleapis.com` and
+cannot consume Vertex project/service-account credentials. Anthropic
+Claude-on-Vertex is likewise not reachable through any current provider.
+
+The RFC recommends a **staged** approach: a small v1 that adds Gemini-on-Vertex
+with ADC / service-account authentication, and a separately-decided follow-up
+for Claude-on-Vertex. It deliberately stops short of prescribing the final code
+because two design questions (adapter strategy and Gemini-vs-Claude scope) need
+maintainer direction first.
+
+## Goals
+
+- Let users authenticate to Vertex AI with GCP **Application Default
+  Credentials** or a **service-account JSON** key, with explicit project ID and
+  region/location, rather than an AI Studio API key.
+- Support Gemini models served through Vertex AI in a first iteration.
+- Keep the new path additive: existing providers and their configuration are
+  untouched.
+- Reuse existing request-shaping logic where it is safe to do so, to minimize
+  new surface area.
+
+## Non-Goals
+
+- Replacing or changing the existing AI Studio `gemini` provider. It stays as-is.
+- Per-request dynamic credentials, multi-project routing, or credential rotation
+  (possible future work, explicitly out of scope here).
+- Committing to Claude-on-Vertex in v1. Whether and how to add it is an open
+  question below, not a decision in this RFC.
+- Any change to flow lifecycle, queueing, or persisted state. This is a provider
+  proposal only and introduces no hidden background state.
+
+## Current Provider Landscape
+
+PentAGI currently registers ten provider types (`openai`, `anthropic`,
+`gemini`, `bedrock`, `ollama`, `custom`, `deepseek`, `glm`, `kimi`, `qwen`).
+The Google- and Anthropic-relevant options today are:
+
+- **Google AI Studio (`gemini`)**: API-key auth against
+  `https://generativelanguage.googleapis.com`. This is the consumer AI Studio
+  surface, not Vertex AI. It cannot accept a GCP project or service-account
+  credential.
+- **Direct Anthropic (`anthropic`)**: `ANTHROPIC_API_KEY` /
+  `ANTHROPIC_SERVER_URL` against Anthropic's own API. Not Vertex.
+- **AWS Bedrock (`bedrock`)**: Anthropic and other models via AWS, with a
+  multi-mode auth model (default AWS credential chain, bearer token, or static
+  access/secret keys). This is the closest existing precedent for
+  cloud-IAM-style provider auth.
+- **Custom OpenAI-compatible (`custom`, `LLM_SERVER_*`)**: the present
+  workaround for Vertex is to front it with a **LiteLLM** proxy that exposes an
+  OpenAI-compatible endpoint, then point the `custom` provider at it. This works
+  but requires running and securing extra infrastructure, and it relies on the
+  proxy to translate Vertex auth and message schemas correctly.
+
+A native `vertex` provider would remove the need for the LiteLLM workaround for
+the common Gemini-on-Vertex case.
+
+## Proposed v1 Scope
+
+Proposed for the first iteration:
+
+- A new `vertex` provider type that serves **Gemini models on Vertex AI**.
+- Authentication via **ADC** or a **service-account JSON file**, plus explicit
+  **project ID** and **location**.
+- Wiring through the same registration and validation path every other provider
+  uses, so the provider is selectable in flows and accepted by the REST API.
+
+Proposed to defer:
+
+- **Claude-on-Vertex** (Anthropic models through Vertex). See Open Questions Q2.
+- Bearer-token / workload-identity auth beyond ADC and service-account file.
+- Settings-UI configuration if the credential-file requirement makes env-only
+  configuration the safer starting point (Open Questions Q4).
+
+## Authentication Model
+
+Vertex AI uses GCP IAM (OAuth2 access tokens minted from ADC or a service
+account), not a static API key. The AWS Bedrock provider already demonstrates a
+multi-mode auth pattern in PentAGI, and a Vertex auth model could mirror its
+shape:
+
+- **ADC (default)**: use Application Default Credentials resolved from the
+  environment (for example `GOOGLE_APPLICATION_CREDENTIALS`, a mounted metadata
+  service, or `gcloud` login). Analogous to `BEDROCK_DEFAULT_AUTH` using the AWS
+  default credential chain.
+- **Explicit service-account file**: a candidate `VERTEX_CREDENTIALS_FILE`
+  pointing at a mounted JSON key, used when ADC is not available. Analogous to
+  Bedrock static credentials.
+- **Project and location**: candidate `VERTEX_PROJECT_ID` and `VERTEX_LOCATION`
+  (for example `us-central1`), which Vertex requires and which have no AI Studio
+  equivalent.
+- **Optional regional/private endpoint**: a candidate `VERTEX_SERVER_URL` for
+  regional or private Service endpoints, analogous to `BEDROCK_SERVER_URL`.
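The auth model listed above could be sketched in Go roughly as follows. This is an illustration only, not PentAGI code: `VertexConfig`, `AuthMode`, and `ResolveVertexConfig` are hypothetical names, the `VERTEX_*` keys are the candidate keys from the list, and the rule that an explicitly configured credentials file takes precedence over ADC is an assumption the RFC does not settle.

```go
package main

import (
	"errors"
	"fmt"
)

// AuthMode names how credentials would be obtained (hypothetical type).
type AuthMode string

const (
	AuthADC                AuthMode = "adc"
	AuthServiceAccountFile AuthMode = "service-account-file"
)

// VertexConfig is a hypothetical settings struct built from the candidate
// VERTEX_* keys; none of these fields exist in PentAGI today.
type VertexConfig struct {
	ProjectID       string
	Location        string
	CredentialsFile string
	Mode            AuthMode
}

// ResolveVertexConfig reads the candidate keys through getenv so the logic
// can be exercised without touching the real process environment.
// Assumption: a configured credentials file wins over ADC.
func ResolveVertexConfig(getenv func(string) string) (VertexConfig, error) {
	cfg := VertexConfig{
		ProjectID:       getenv("VERTEX_PROJECT_ID"),
		Location:        getenv("VERTEX_LOCATION"),
		CredentialsFile: getenv("VERTEX_CREDENTIALS_FILE"),
		Mode:            AuthADC, // ADC is the default mode
	}
	if cfg.ProjectID == "" {
		return VertexConfig{}, errors.New("VERTEX_PROJECT_ID is required")
	}
	if cfg.Location == "" {
		return VertexConfig{}, errors.New("VERTEX_LOCATION is required")
	}
	if cfg.CredentialsFile != "" {
		cfg.Mode = AuthServiceAccountFile
	}
	return cfg, nil
}

func main() {
	env := map[string]string{
		"VERTEX_PROJECT_ID": "example-project",
		"VERTEX_LOCATION":   "us-central1",
	}
	cfg, err := ResolveVertexConfig(func(k string) string { return env[k] })
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	// Print the mode only; the credential file path is deliberately not logged.
	fmt.Printf("mode=%s project=%s location=%s\n", cfg.Mode, cfg.ProjectID, cfg.Location)
}
```

Passing `getenv` as a function keeps the sketch testable. A real implementation would presumably read these values through the project's central config instead.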
+
+All of the above names are **candidate** keys for discussion, not shipped
+configuration.
+
+## Provider Architecture Options
+
+The existing `gemini` provider is built on the langchaingo `googleai` client
+configured for the AI Studio REST surface with an API key. Vertex AI changes
+both the transport (GCP IAM auth, `aiplatform.googleapis.com` regional
+endpoints) and, for Claude, the message schema. Two broad options:
+
+- **Option A - parameterize an existing adapter.** Add a Vertex transport/auth
+  mode to the Gemini path so request-shaping is shared and only auth + endpoint
+  differ. Lower duplication, but couples two surfaces that authenticate very
+  differently and risks regressing the stable AI Studio path.
+- **Option B - a separate `vertex` package.** A dedicated provider that owns its
+  auth and endpoint logic and reuses request-shaping helpers where practical.
+  More code, cleaner separation, no risk to the existing `gemini` provider.
+
+This RFC leans toward **Option B for v1** (separation first, extract shared
+helpers later if duplication proves real), but defers to maintainer preference
+(Open Questions Q1).
+
+A key architectural note: **Claude-on-Vertex likely does not fit the same
+adapter as Gemini-on-Vertex.** Gemini-on-Vertex uses the Gemini request/response
+schema, while Claude-on-Vertex uses the Anthropic message schema over a Vertex
+endpoint with GCP auth. That asymmetry argues for routing Claude-on-Vertex
+through the **Anthropic** adapter with a Vertex auth/endpoint mode, rather than
+bolting it onto a Gemini-shaped Vertex provider. Treating "Vertex" as one
+monolithic provider for both model families would mix two schemas behind one
+type.
+
+## Config and Migration Considerations
+
+A native provider would follow the repository's documented "Adding a New LLM
+Provider" checklist (`CLAUDE.md`). At a high level that means, when
+implementation is approved:
+
+- A `ProviderVertex` type constant and default provider name.
+- Registration in the provider factory functions.
+- Addition of `vertex` to the REST `Valid()` whitelist (without this the REST
+  API rejects the type with 422).
+- Candidate config keys in the central config (the `VERTEX_*` names above).
+- A goose migration adding `vertex` to the `PROVIDER_TYPE` enum, following the
+  enum-swap pattern already used by
+  `backend/migrations/sql/20260227_120000_add_cn_providers.sql` (Up: recreate
+  the enum including the new value; Down: remove rows of the new type, then
+  recreate the enum without it).
+
+The migration is the least reversible step and the Down path deletes any rows of
+the new provider type, so it warrants explicit review. None of these changes are
+part of this RFC; they are listed so the eventual implementation size is clear.
+
+## Frontend / Installer Considerations
+
+For parity with other providers, a future implementation would add a provider
+icon and register it, and decide whether Vertex appears in the Settings UI. The
+service-account-file requirement is the wrinkle: unlike an API key, a JSON key
+is a file and should not be pasted into a web form or stored as plain settings
+text. A reasonable starting point is **env/file-mounted configuration only**,
+with Settings-UI support considered later (Open Questions Q4). The interactive
+installer wizard could later grow a Vertex section that asks for project,
+location, and credential-file path.
+
+## Testing Strategy
+
+- **Unit / config**: a future `vertex` provider would get the same provider
+  unit tests other providers have, exercised through the existing config-loading
+  path.
+- **Provider validation**: the `ctester` utility (which tests LLM agent
+  capabilities and tool-calling agent types) would be the pre-merge smoke test
+  for the new provider once credentials are available.
+- **Credentials caveat**: end-to-end testing requires real GCP credentials and a
+  Vertex-enabled project, which maintainers would need to supply or stub. This
+  is called out as a practical gating factor on any implementation PR.
+
+## Open Questions
+
+1. **Adapter strategy** - parameterize the existing Gemini adapter with a Vertex
+   transport/auth mode (Option A), or ship a separate `vertex` package
+   (Option B)?
+2. **Scope** - Gemini-on-Vertex only in v1, or include Claude-on-Vertex? If
+   Claude-on-Vertex is in scope, should it route through the Anthropic adapter
+   (Vertex auth/endpoint mode) rather than a Gemini-shaped provider, given the
+   schema difference?
+3. **Auth surface** - are ADC and a service-account JSON file sufficient for v1,
+   or is bearer-token / workload-identity auth (mirroring the Bedrock multi-auth
+   approach) also wanted?
+4. **Web settings** - should Vertex be configurable from the Settings UI like
+   other providers, or env/file-mounted only at first, given the credential-file
+   requirement?
+
+## Security Considerations
+
+- **Service-account JSON is a sensitive secret.** It should be **file-mounted or
+  secret-managed**, never pasted into UI text, never committed, and never
+  written to logs. Provider initialization and any error surface must avoid
+  echoing credential contents or file paths beyond what is necessary to
+  diagnose a misconfiguration.
+- **Least privilege**: documentation for any implementation should recommend a
+  dedicated service account scoped to the minimum Vertex AI prediction roles.
+- **No hidden state**: this proposal adds a provider, not background lifecycle
+  state; credentials are supplied explicitly via env/mounted file and are not
+  cached or queued anywhere implicit.
+- **Endpoint trust**: regional/private endpoint overrides should be validated so
+  a misconfigured `VERTEX_SERVER_URL` cannot silently redirect traffic.
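One possible shape for the endpoint-trust check above is sketched below in Go. `ValidateEndpoint` and its `extraHosts` allowlist parameter are hypothetical, and the policy is an assumption for discussion rather than a decided rule: HTTPS only, no embedded credentials, and a host that is either `aiplatform.googleapis.com`, a `<location>-aiplatform.googleapis.com` regional host, or an explicitly allowed private host.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// ValidateEndpoint checks a candidate VERTEX_SERVER_URL value.
// Hypothetical helper; extraHosts is an operator-supplied allowlist for
// private endpoints that do not live under aiplatform.googleapis.com.
func ValidateEndpoint(raw string, extraHosts []string) (*url.URL, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return nil, fmt.Errorf("parse endpoint: %w", err)
	}
	if u.Scheme != "https" {
		return nil, errors.New("endpoint must use https")
	}
	if u.User != nil {
		return nil, errors.New("endpoint must not embed credentials")
	}
	host := strings.ToLower(u.Hostname())
	if host == "" {
		return nil, errors.New("endpoint has no host")
	}
	// Accept the global host and regional hosts such as
	// us-central1-aiplatform.googleapis.com.
	if host == "aiplatform.googleapis.com" ||
		strings.HasSuffix(host, "-aiplatform.googleapis.com") {
		return u, nil
	}
	// Otherwise require an exact match against the explicit allowlist.
	for _, h := range extraHosts {
		if host == strings.ToLower(h) {
			return u, nil
		}
	}
	return nil, fmt.Errorf("host %q is not an allowed Vertex endpoint", host)
}

func main() {
	candidates := []string{
		"https://us-central1-aiplatform.googleapis.com",
		"http://us-central1-aiplatform.googleapis.com",
		"https://aiplatform.googleapis.com.evil.example",
	}
	for _, raw := range candidates {
		if _, err := ValidateEndpoint(raw, nil); err != nil {
			fmt.Printf("rejected %s: %v\n", raw, err)
			continue
		}
		fmt.Printf("accepted %s\n", raw)
	}
}
```

Failing closed on unknown hosts means a typo in the override produces a startup error instead of silently sending prompts elsewhere. Whether private endpoints should be allowlisted this way is left to the implementation discussion.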
+
+## Suggested First Milestone
+
+If maintainers confirm Option B and a Gemini-on-Vertex-only v1, a minimal first
+PR could add the `vertex` provider package, the type constant and registration,
+the REST whitelist entry, the candidate `VERTEX_*` config keys, the enum
+migration, and `.env.example` plus docs - with Claude-on-Vertex, extra auth
+modes, and Settings-UI support tracked as explicit follow-ups. Confirmation on
+Open Questions Q1 and Q2 is the blocker before any of that work begins.
+
+## References
+
+- #310 - original Vertex AI configuration request (clarified that Vertex is not
+  natively supported today).
+- #321 - native Vertex AI provider request and implementation outline.
+- `CLAUDE.md` - "Adding a New LLM Provider" checklist.
+- `backend/migrations/sql/20260227_120000_add_cn_providers.sql` - the
+  `PROVIDER_TYPE` enum-swap migration pattern.

From 5c968919ec1989c35164a53875469d058b8aa835 Mon Sep 17 00:00:00 2001
From: mason5052
Date: Wed, 3 Jun 2026 23:04:50 -0400
Subject: [PATCH 2/2] docs: address review feedback on Vertex AI RFC

- Drop the hard-coded provider count to avoid doc drift as providers are
  added; list the current provider types instead.
- Use the ADC-specific command (gcloud auth application-default login)
  rather than the ambiguous 'gcloud login'.
- Reference the enum-swap migration pattern generically under
  backend/migrations/sql/ instead of a single timestamped filename that
  may be renamed or squashed.
---
 examples/proposals/vertex_ai_provider.md | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/examples/proposals/vertex_ai_provider.md b/examples/proposals/vertex_ai_provider.md
index 9fde08b06..60cfe2c8b 100644
--- a/examples/proposals/vertex_ai_provider.md
+++ b/examples/proposals/vertex_ai_provider.md
@@ -45,9 +45,9 @@ maintainer direction first.
 
 ## Current Provider Landscape
 
-PentAGI currently registers ten provider types (`openai`, `anthropic`,
-`gemini`, `bedrock`, `ollama`, `custom`, `deepseek`, `glm`, `kimi`, `qwen`).
-The Google- and Anthropic-relevant options today are:
+PentAGI currently registers the following provider types: `openai`,
+`anthropic`, `gemini`, `bedrock`, `ollama`, `custom`, `deepseek`, `glm`,
+`kimi`, and `qwen`. The Google- and Anthropic-relevant options today are:
 
 - **Google AI Studio (`gemini`)**: API-key auth against
   `https://generativelanguage.googleapis.com`. This is the consumer AI Studio
@@ -94,8 +94,8 @@ shape:
 
 - **ADC (default)**: use Application Default Credentials resolved from the
   environment (for example `GOOGLE_APPLICATION_CREDENTIALS`, a mounted metadata
-  service, or `gcloud` login). Analogous to `BEDROCK_DEFAULT_AUTH` using the AWS
-  default credential chain.
+  service, or `gcloud auth application-default login`). Analogous to
+  `BEDROCK_DEFAULT_AUTH` using the AWS default credential chain.
 - **Explicit service-account file**: a candidate `VERTEX_CREDENTIALS_FILE`
   pointing at a mounted JSON key, used when ADC is not available. Analogous to
   Bedrock static credentials.
@@ -148,10 +148,9 @@ implementation is approved:
   API rejects the type with 422).
 - Candidate config keys in the central config (the `VERTEX_*` names above).
 - A goose migration adding `vertex` to the `PROVIDER_TYPE` enum, following the
-  enum-swap pattern already used by
-  `backend/migrations/sql/20260227_120000_add_cn_providers.sql` (Up: recreate
-  the enum including the new value; Down: remove rows of the new type, then
-  recreate the enum without it).
+  enum-swap pattern used by the existing provider migrations under
+  `backend/migrations/sql/` (Up: recreate the enum including the new value;
+  Down: remove rows of the new type, then recreate the enum without it).
 
 The migration is the least reversible step and the Down path deletes any rows of
 the new provider type, so it warrants explicit review. None of these changes are
@@ -226,5 +225,5 @@ Open Questions Q1 and Q2 is the blocker before any of that work begins.
   natively supported today).
 - #321 - native Vertex AI provider request and implementation outline.
 - `CLAUDE.md` - "Adding a New LLM Provider" checklist.
-- `backend/migrations/sql/20260227_120000_add_cn_providers.sql` - the
-  `PROVIDER_TYPE` enum-swap migration pattern.
+- `backend/migrations/sql/` - existing provider enum-swap migrations that
+  demonstrate the `PROVIDER_TYPE` pattern.