fix(codex): restore prompt cache continuity for Codex requests #2374
luispater merged 8 commits into router-for-me:dev from
Conversation
Prompt caching on Codex was not reliably reusable through the proxy because repeated chat-completions requests could reach the upstream without the same continuity envelope. In practice this showed up most clearly with OpenCode, where cache reads worked in the reference client but not through CLIProxyAPI, although the root cause is broader than OpenCode itself.

The proxy was breaking continuity in several ways: executor-layer Codex request preparation stripped prompt_cache_retention, chat-completions translation did not preserve that field, continuity headers used a different shape than the working client behavior, and OpenAI-style Codex requests could be sent without a stable prompt_cache_key. When that happened, session_id fell back to a fresh random value per request, so upstream Codex treated repeated requests as unrelated turns instead of as part of the same cacheable context.

This change fixes that by preserving caller-provided prompt_cache_retention on Codex execution paths, preserving prompt_cache_retention when translating OpenAI chat-completions requests to Codex, aligning Codex continuity headers to session_id, and introducing an explicit Codex continuity policy that derives a stable continuity key from the best available signal. The resolution order prefers an explicit prompt_cache_key, then execution session metadata, then an explicit idempotency key, then stable request-affinity metadata, then a stable client-principal hash, and finally a stable auth-ID hash when no better continuity signal exists. The same continuity key is applied to both prompt_cache_key in the request body and session_id in the request headers so repeated requests reuse the same upstream cache/session identity.

The auth manager also keeps auth selection sticky for repeated request sequences, preventing otherwise-equivalent Codex requests from drifting across different upstream auth contexts and accidentally breaking cache reuse.
To keep the implementation maintainable, the continuity resolution and diagnostics are centralized in a dedicated Codex continuity helper instead of being scattered across executor flow code. Regression coverage now verifies retention preservation, continuity-key precedence, stable auth-ID fallback, websocket parity, translator preservation, and auth-affinity behavior. Manual validation confirmed prompt cache reads now occur through CLIProxyAPI when using Codex via OpenCode, and the fix should also benefit other clients that rely on stable repeated Codex request continuity.
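The resolution order described above can be sketched as a small Go helper. This is an illustrative sketch, not the actual CLIProxyAPI code: `continuitySignals`, `opaque`, and `resolveContinuityKey` are invented names, and the real implementation derives its hashed fallbacks with uuid.NewSHA1 rather than the SHA-256 used here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// continuitySignals is a hypothetical bag of the per-request values the
// resolver can draw from; the real code reads these from the request body,
// execution metadata, and headers.
type continuitySignals struct {
	PromptCacheKey     string // explicit prompt_cache_key from the body
	ExecutionSessionID string // execution session metadata
	IdempotencyKey     string // explicit Idempotency-Key header
	AffinityKey        string // stable request-affinity metadata
	ClientPrincipal    string // client principal (hashed before use)
	AuthID             string // auth ID (hashed before use)
}

// opaque derives a stable, non-reversible key so raw identity material
// is never forwarded upstream.
func opaque(s string) string {
	sum := sha256.Sum256([]byte("codex:prompt-cache:" + s))
	return hex.EncodeToString(sum[:16])
}

// resolveContinuityKey returns the first non-empty signal in precedence
// order, hashing the identity-derived fallbacks.
func resolveContinuityKey(sig continuitySignals) (key, source string) {
	if v := strings.TrimSpace(sig.PromptCacheKey); v != "" {
		return v, "prompt_cache_key"
	}
	if v := strings.TrimSpace(sig.ExecutionSessionID); v != "" {
		return v, "execution_session"
	}
	if v := strings.TrimSpace(sig.IdempotencyKey); v != "" {
		return v, "idempotency_key"
	}
	if v := strings.TrimSpace(sig.AffinityKey); v != "" {
		return v, "request_affinity"
	}
	if v := strings.TrimSpace(sig.ClientPrincipal); v != "" {
		return opaque(v), "client_principal"
	}
	if v := strings.TrimSpace(sig.AuthID); v != "" {
		return opaque(v), "auth_id"
	}
	return "", ""
}

func main() {
	key, source := resolveContinuityKey(continuitySignals{ClientPrincipal: "sk-example"})
	fmt.Println(source, key != "")
}
```

Because every fallback is deterministic for the same caller, two otherwise-identical requests resolve to the same key, which is what lets the upstream treat them as one cacheable context.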
Code Review
This pull request introduces 'Auth Affinity' and 'Codex Continuity' to improve request routing and prompt cache management. Auth Affinity allows the system to pin a specific authentication identity to a session or idempotency key, while Codex Continuity provides a unified way to resolve and apply prompt cache keys across HTTP and WebSocket executors. Feedback suggests aligning the WebSocket continuity logic with the HTTP implementation, ensuring stable string representations for API keys used in metadata, and refactoring repetitive logic for metadata extraction and affinity application to improve maintainability.
```go
if cache.ID != "" {
	rawJSON, _ = sjson.SetBytes(rawJSON, "prompt_cache_key", cache.ID)
	headers.Set("Conversation_id", cache.ID)
	headers.Set("Session_id", cache.ID)
	headers.Set("session_id", cache.ID)
}
```
The continuity logic for websockets in applyCodexPromptCacheHeaders doesn't seem to be fully aligned with the HTTP path in codex_executor.go. The cacheHelper function was updated to use the new resolveCodexContinuity helper for openai format requests, but applyCodexPromptCacheHeaders was not. To ensure consistent behavior for prompt cache continuity, consider updating this function to also use resolveCodexContinuity for openai requests. This would also allow you to correctly populate the Source field of the codexContinuity struct for logging.
sdk/api/handlers/handlers.go
Outdated
```go
if apiKey, exists := ginCtx.Get("apiKey"); exists && apiKey != nil {
	meta[authAffinityMetadataKey] = fmt.Sprintf("principal:%v", apiKey)
}
```
Using fmt.Sprintf("principal:%v", apiKey) to generate authAffinityMetadataKey can be risky if apiKey is a pointer or a struct, as the string representation might not be stable across requests, leading to auth affinity failures. The value of auth_affinity_key is used as a map key in conductor.go and needs to be consistent.
Consider handling the apiKey value with type switches to ensure a stable string representation, similar to how it's handled in resolveCodexContinuity for client_principal.
```go
if apiKey, exists := ginCtx.Get("apiKey"); exists && apiKey != nil {
	var keyStr string
	switch v := apiKey.(type) {
	case string:
		keyStr = v
	case fmt.Stringer:
		keyStr = v.String()
	default:
		// Using %v for other types can lead to unstable keys if they are pointers or maps.
		// This should be handled carefully.
		keyStr = fmt.Sprintf("%v", v)
	}
	if trimmedKey := strings.TrimSpace(keyStr); trimmedKey != "" {
		meta[authAffinityMetadataKey] = "principal:" + trimmedKey
	}
}
```

```go
if v, exists := ginCtx.Get("apiKey"); exists && v != nil {
	switch value := v.(type) {
	case string:
		if trimmed := strings.TrimSpace(value); trimmed != "" {
			return codexContinuity{Key: uuid.NewSHA1(uuid.NameSpaceOID, []byte("cli-proxy-api:codex:prompt-cache:"+trimmed)).String(), Source: "client_principal"}
		}
	case fmt.Stringer:
		if trimmed := strings.TrimSpace(value.String()); trimmed != "" {
			return codexContinuity{Key: uuid.NewSHA1(uuid.NameSpaceOID, []byte("cli-proxy-api:codex:prompt-cache:"+trimmed)).String(), Source: "client_principal"}
		}
	default:
		trimmed := strings.TrimSpace(fmt.Sprintf("%v", value))
		if trimmed != "" {
			return codexContinuity{Key: uuid.NewSHA1(uuid.NameSpaceOID, []byte("cli-proxy-api:codex:prompt-cache:"+trimmed)).String(), Source: "client_principal"}
		}
	}
}
```
The logic for handling apiKey from the gin context is repetitive across the string, fmt.Stringer, and default cases in the switch statement. This can be refactored to reduce code duplication and improve maintainability. You can extract the string value first and then perform the check and hashing logic once.
```go
if v, exists := ginCtx.Get("apiKey"); exists && v != nil {
	var keyStr string
	switch value := v.(type) {
	case string:
		keyStr = value
	case fmt.Stringer:
		keyStr = value.String()
	default:
		keyStr = fmt.Sprintf("%v", value)
	}
	if trimmed := strings.TrimSpace(keyStr); trimmed != "" {
		return codexContinuity{Key: uuid.NewSHA1(uuid.NameSpaceOID, []byte("cli-proxy-api:codex:prompt-cache:"+trimmed)).String(), Source: "client_principal"}
	}
}
```

```go
if opts.Metadata != nil {
	if raw, ok := opts.Metadata[cliproxyexecutor.SelectedAuthMetadataKey]; ok && raw != nil {
		switch v := raw.(type) {
		case string:
			selectedAuthID = strings.TrimSpace(v)
		case []byte:
			selectedAuthID = strings.TrimSpace(string(v))
		}
	}
	if raw, ok := opts.Metadata[cliproxyexecutor.ExecutionSessionMetadataKey]; ok && raw != nil {
		switch v := raw.(type) {
		case string:
			executionSessionID = strings.TrimSpace(v)
		case []byte:
			executionSessionID = strings.TrimSpace(string(v))
		}
	}
}
```
The logic for extracting selectedAuthID and executionSessionID from opts.Metadata is duplicated. You could create a helper function to extract a string value from the metadata map, which would handle the type switching and trimming in one place. This would make the code more concise and easier to maintain.
Example helper:
```go
func getStringFromMetadata(meta map[string]any, key string) string {
	if raw, ok := meta[key]; ok && raw != nil {
		switch v := raw.(type) {
		case string:
			return strings.TrimSpace(v)
		case []byte:
			return strings.TrimSpace(string(v))
		}
	}
	return ""
}
```

Then you could use it like this:

```go
selectedAuthID := getStringFromMetadata(opts.Metadata, cliproxyexecutor.SelectedAuthMetadataKey)
executionSessionID := getStringFromMetadata(opts.Metadata, cliproxyexecutor.ExecutionSessionMetadataKey)
```
sdk/cliproxy/auth/conductor.go
Outdated
```go
if pinnedAuthID := pinnedAuthIDFromMetadata(opts.Metadata); pinnedAuthID == "" {
	if affinityKey := authAffinityKeyFromMetadata(opts.Metadata); affinityKey != "" {
		if affinityAuthID := m.AuthAffinity(affinityKey); affinityAuthID != "" {
			meta := opts.Metadata
			if meta == nil {
				meta = make(map[string]any)
				opts.Metadata = meta
			}
			meta[cliproxyexecutor.PinnedAuthMetadataKey] = affinityAuthID
		}
	}
}
```
The logic for applying auth affinity by checking for a pinned auth ID and then resolving affinity from auth_affinity_key is duplicated in pickNext and pickNextMixed. To improve maintainability and reduce code duplication, you could extract this logic into a private helper method on the Manager.
Example helper:
```go
func (m *Manager) applyAuthAffinity(opts *cliproxyexecutor.Options) {
	if opts == nil || pinnedAuthIDFromMetadata(opts.Metadata) != "" {
		return
	}
	if affinityKey := authAffinityKeyFromMetadata(opts.Metadata); affinityKey != "" {
		if affinityAuthID := m.AuthAffinity(affinityKey); affinityAuthID != "" {
			if opts.Metadata == nil {
				opts.Metadata = make(map[string]any)
			}
			opts.Metadata[cliproxyexecutor.PinnedAuthMetadataKey] = affinityAuthID
		}
	}
}
```

Then you can call m.applyAuthAffinity(&opts) at the beginning of pickNext and pickNextMixed.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 511b8a992e
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
```go
if cache.ID != "" {
	rawJSON, _ = sjson.SetBytes(rawJSON, "prompt_cache_key", cache.ID)
}
rawJSON = applyCodexContinuityBody(rawJSON, continuity)
```
Preserve Claude cache ID when applying continuity
cacheHelper still computes a Claude-scoped cache.ID, but this refactor now applies body/header continuity only through continuity; because the Claude branch never assigns continuity, prompt_cache_key and session_id stop being forwarded for Claude-origin requests. That regresses previously working prompt-cache/session continuity for Claude→Codex flows and causes repeated turns to miss upstream cache reuse.
Useful? React with 👍 / 👎.
sdk/cliproxy/auth/conductor.go
Outdated
```go
if affinityKey := authAffinityKeyFromMetadata(opts.Metadata); affinityKey != "" {
	m.SetAuthAffinity(affinityKey, auth.ID)
	if log.IsLevelEnabled(log.DebugLevel) {
```
Pin auth affinity only after a successful execution
Affinity is written before the selected auth is proven healthy, so a transient failure on that first attempt can lock the request key to a bad auth ID; on the next pick, the pinned ID is forced while tried already contains it, which short-circuits fallback to other available credentials and returns failure. This makes retry behavior worse exactly when one credential is flaky and others could succeed.
Align websocket continuity resolution with the HTTP Codex path, make auth-affinity principal keys use a stable string representation, and extract small helpers that remove duplicated continuity and affinity logic without changing the validated cache-hit behavior.
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 62b17f40a1
```go
	continuity = resolveCodexContinuity(ctx, auth, req, opts)
	cache.ID = continuity.Key
}

if cache.ID != "" {
	rawJSON, _ = sjson.SetBytes(rawJSON, "prompt_cache_key", cache.ID)
}
rawJSON = applyCodexContinuityBody(rawJSON, continuity)
```
Carry Claude cache ID into continuity before applying headers
The Claude branch still computes cache.ID, but only continuity is applied to prompt_cache_key/session_id now. Because continuity is never set for from == "claude", this path silently drops continuity fields that were previously sent, so repeated Claude→Codex requests lose cache/session reuse. The same pattern is also present in websocket continuity handling (applyCodexPromptCacheHeaders).
```go
if affinityKey := metadataString(opts.Metadata, codexAuthAffinityMetadataKey); affinityKey != "" {
	return codexContinuity{Key: affinityKey, Source: "auth_affinity"}
```
Do not forward auth affinity key directly as continuity ID
Returning auth_affinity_key verbatim as the Codex continuity key can leak sensitive caller identity into upstream session_id/prompt_cache_key. In this commit, handlers populate auth_affinity_key from stablePrincipalMetadataKey(...) (e.g., principal:<apiKey>), so this branch can send raw principal/API-key material upstream instead of the previous SHA1-derived opaque ID.
sdk/cliproxy/auth/conductor.go
Outdated
```go
if affinityKey := authAffinityKeyFromMetadata(opts.Metadata); affinityKey != "" {
	m.SetAuthAffinity(affinityKey, auth.ID)
```
Pin auth affinity only after a successful execution
Affinity is persisted immediately after selection, before the request succeeds. If that auth then errors, the next retry sees a pinned auth via applyAuthAffinity and cannot rotate to other credentials, which defeats multi-auth fallback under failures/rate limits and can return errors even when another auth is healthy.
Restore Claude continuity after the continuity refactor, keep auth-affinity keys out of upstream Codex session identifiers, and only persist affinity after successful execution so retries can still rotate to healthy credentials when the first auth fails.
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 26eca8b6ba
sdk/cliproxy/auth/conductor.go
Outdated
```go
if affinityKey := authAffinityKeyFromMetadata(opts.Metadata); affinityKey != "" {
	m.SetAuthAffinity(affinityKey, authID)
	if entry != nil && log.IsLevelEnabled(log.DebugLevel) {
		entry.Debugf("auth affinity pinned key=%s auth_id=%s provider=%s model=%s", affinityKey, authID, provider, model)
```
Avoid logging raw auth affinity keys
persistAuthAffinity logs affinityKey verbatim, but that key can include principal:<api key> from requestExecutionMetadata (ginCtx.Get("apiKey")), and apiKey is populated from result.Principal in AuthMiddleware (which is the raw credential for config-based access). When debug logging is enabled, this emits client secrets into logs, creating a credential disclosure path.
sdk/api/handlers/handlers.go
Outdated
```go
} else if explicitIdempotencyKey != "" {
	meta[authAffinityMetadataKey] = explicitIdempotencyKey
```
Stop keying long-lived affinity by Idempotency-Key
This stores the request's explicit Idempotency-Key as auth_affinity_key, which is then persisted in Manager.affinity on successful executions. Because idempotency keys are commonly unique per operation and the new affinity map has no eviction path in production code, sustained traffic with unique keys will grow affinity state without bound and eventually increase memory usage.
Stop using one-shot idempotency keys as long-lived auth-affinity identifiers and remove raw affinity-key values from debug logs so sticky routing keeps its continuity benefits without creating avoidable memory growth or credential exposure risks.
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4c4cbd44da
sdk/cliproxy/auth/conductor.go
Outdated
```go
if affinityKey := authAffinityKeyFromMetadata(opts.Metadata); affinityKey != "" {
	m.SetAuthAffinity(affinityKey, authID)
```
Scope persisted auth affinity by provider
persistAuthAffinity saves a single global mapping from auth_affinity_key to authID, but the key built in request metadata is caller-scoped (e.g., principal:<apiKey>) rather than provider-scoped. After one successful request, applyAuthAffinity will pin that auth for later requests even when they target a different provider, and scheduler selection treats pinned IDs as strict (returning auth_not_found when provider/auth don’t match), so a client that switches providers can be forced onto an incompatible auth and fail.
sdk/api/handlers/handlers.go
Outdated
```go
meta[coreexecutor.ExecutionSessionMetadataKey] = executionSessionID
meta[authAffinityMetadataKey] = executionSessionID
```
Avoid leaking affinity entries for closed execution sessions
This adds execution_session_id as auth_affinity_key, which is unique per websocket passthrough session, but the manager’s affinity map has no lifecycle tie-in to session closure. Because successful requests persist these keys and CloseExecutionSession does not clear them, every completed websocket session leaves a stale map entry behind, causing unbounded in-memory growth over time under normal streaming traffic.
Keep sticky auth affinity limited to matching providers and stop persisting execution-session IDs as long-lived affinity keys so provider switching and normal streaming traffic do not create incorrect pins or stale affinity state.
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6962e09dd9
sdk/cliproxy/auth/conductor.go
Outdated
```go
key = strings.TrimSpace(key)
if provider == "" || key == "" {
	return ""
}
return provider + "|" + key
```
Scope auth affinity by model to avoid cross-model lockouts
The affinity map key is currently only provider|principal, so one successful request can pin a principal to an auth for that provider across all models. If that auth does not support a later requested model, the pickers still enforce the pinned ID and return auth_not_found instead of selecting another auth that supports the model. This turns normal multi-model routing into hard failures for affected principals; include model (or model family) in the affinity scope, or fall back when the pinned auth is incompatible.
sdk/cliproxy/auth/conductor.go
Outdated
```go
if affinityAuthID := m.AuthAffinity(provider, affinityKey); affinityAuthID != "" {
	if opts.Metadata == nil {
		opts.Metadata = make(map[string]any)
	}
	opts.Metadata[cliproxyexecutor.PinnedAuthMetadataKey] = affinityAuthID
```
Treat affinity as preference instead of hard pin
Writing the affinity result into PinnedAuthMetadataKey makes affinity strict, not sticky-preferred. In the execute*MixedOnce retry loops, once that pinned auth is attempted and added to tried, subsequent retries cannot pick any other credential, so transient failures on one auth (quota/cooldown/network) cause request failure even when other auths are healthy. This regresses credential failover behavior and should be changed to advisory pinning or automatic unpinning on retry.
Remove the experimental auth-affinity routing changes from this PR so it stays focused on the validated Codex continuity fix. This keeps the prompt-cache repair while avoiding unrelated routing-policy concerns such as provider/model affinity scope, lifecycle cleanup, and hard-pin fallback semantics.
@codex please re-review the current PR head only. This PR has been narrowed substantially since the earlier review comments. The auth-affinity routing changes that triggered the later scheduler/pinning concerns have been removed from the branch. The remaining diff is intentionally limited to the Codex continuity/cache fix:
Please evaluate the current head commit as a scoped Codex prompt-cache continuity fix, not the earlier affinity-routing iterations.
Codex Review: Didn't find any major issues. Keep it up!
Drop the chat-completions translator edits from this PR so the branch complies with the repository policy that forbids pull-request changes under internal/translator. The remaining PR stays focused on the executor-level Codex continuity fix that was validated to restore cache reuse.
@codex please re-review the current head commit only. The PR has been narrowed further to comply with repository policy: all internal/translator changes were removed. The remaining diff is only the executor-level Codex continuity/cache fix.

Codex Review: Didn't find any major issues. 🚀
@luispater this makes OpenCode work with OpenAI and the prompt cache hit. It works for me when compiling the .exe, while the originally released .exe does not...
xkonjin
left a comment
Automated review focusing on bugs, security, and tests.
Security:
- Potential secret in diff at line 219.
- Potential secret in diff at line 529.
Drop the last affinity-related executor artifacts so the PR stays focused on the minimal Codex continuity fix set: stable prompt cache identity, stable session_id, and the executor-only behavior that was validated to restore cache reads.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e5d3541b5a
```go
if v := strings.TrimSpace(ginCtx.GetHeader("Idempotency-Key")); v != "" {
	return codexContinuity{Key: v, Source: "idempotency_key"}
```
Avoid request-scoped Idempotency-Key as primary continuity key
In resolveCodexContinuity, Idempotency-Key is chosen before stable identities (client_principal / auth_id), so any client that sends a different idempotency key per request will get a different prompt_cache_key/session_id on every turn. That breaks the continuity this change is trying to restore (cache/session reuse across turns) and can regress performance for common retry-safe clients that auto-generate per-request idempotency keys. Consider demoting this source below stable caller/auth-derived keys (or only using it for explicit retry contexts).
@luispater I made it work for myself, but I wanted it to work for all the other people. I can't keep battling these agents forever, because they will ALWAYS find something to latch onto, as they are inherently non-deterministic.
luispater
left a comment
Summary:
This PR cleanly addresses Codex prompt-cache continuity by preserving prompt_cache_retention, centralizing continuity resolution, and consistently applying continuity to prompt_cache_key + session_id across HTTP and websocket paths.
Key findings:
- Blocking: none.
- Non-blocking: continuity behavior now depends on precedence logic (prompt_cache_key -> execution session -> idempotency key -> client principal -> auth id). This is reasonable and covered by targeted unit tests, but production traffic diversity remains the main validation surface.
Test plan:
- Reviewed changed files and continuity flow end-to-end in executor/websocket paths.
- Ran: go test ./internal/runtime/executor (pass).
Follow-ups (optional):
- Consider adding an integration-style test fixture that simulates repeated mixed-source requests (HTTP + websocket) with/without gin context to further harden precedence behavior.
This is an automated Codex review result and still requires manual verification by a human reviewer.
Summary
This PR fixes Codex prompt cache continuity through CLIProxyAPI.
While this PR fixes #2373 directly, it also addresses the same underlying class of continuity problems discussed in #1962 and #1998, and is adjacent to the Codex session/account pinning issues in #2230 and #2256.
In the broken state, repeated Codex requests could reach the upstream without the same continuity envelope, so upstream prompt caching did not reliably reuse prior prompt state. The most visible repro was with OpenCode using Codex through CLIProxyAPI, where cache reads worked in the reference client but not through the proxy. The fix is not specific to OpenCode, but that integration made the failure mode easy to validate.
At a high level, this PR:
- Preserves `prompt_cache_retention` on Codex request paths instead of stripping it
- Preserves `prompt_cache_retention` when translating OpenAI Chat Completions requests to Codex
- Aligns Codex continuity headers to `session_id`

Why this change is needed
Repeated Codex requests were missing the stable continuity signals needed for upstream cache reuse.
Observed failure modes included:
- Executor-layer Codex request preparation stripped `prompt_cache_retention`
- Chat-completions translation did not preserve `prompt_cache_retention`
- OpenAI-style Codex requests could be sent without a stable `prompt_cache_key`
- `session_id` fell back to a fresh random UUID per request

The combined effect was that repeated requests looked like unrelated turns to the upstream.
What this PR changes
- Preserves `prompt_cache_retention` on Codex HTTP, stream, and websocket request preparation paths
- Preserves `prompt_cache_retention` in the OpenAI Chat Completions -> Codex translator
- Aligns Codex continuity headers to `session_id`
- Derives a stable `prompt_cache_key` for OpenAI-style requests, including from an explicit `Idempotency-Key` when present
- Applies the same continuity key to `prompt_cache_key` in the body and `session_id` in the headers

Validation result
This change was validated in both tests and live runtime behavior.
Expected outcome after the fix:
- `session_id` is stable across repeated requests in the same continuity context

Tests run
go test ./sdk/cliproxy/auth ./internal/runtime/executor ./internal/translator/codex/openai/responses ./internal/translator/codex/openai/chat-completions

Result:
Built artifact
Detailed root cause analysis
The investigation showed that the problem was not just `prompt_cache_retention`. The more important issue was that repeated Codex requests did not share a stable continuity identity when forwarded by the proxy. In the failing path:

- `prompt_cache_key` could be absent
- `session_id` could change on every request

This meant that even identical prompts could be treated by the upstream as unrelated requests.
The final working behavior required preserving a continuity envelope that includes:
- a stable `session_id`
- a stable `prompt_cache_key`

The fix encoded that continuity policy explicitly instead of relying on incidental request context.
Implementation notes for gatekeeper review
This PR intentionally keeps the scope focused on Codex continuity and avoids broader redesign.
Architectural choices:
- Preserve caller-provided `prompt_cache_retention` instead of stripping it

The goal was to fix the real bug with the smallest behaviorally complete change set, not to redesign request routing or provider architecture.
Behavioral details of continuity resolution
For OpenAI-style requests forwarded to Codex, continuity resolution now prefers the strongest available stable signal in this order:
1. an explicit `prompt_cache_key`
2. execution session metadata
3. an explicit `Idempotency-Key`
4. stable request-affinity metadata
5. a stable client-principal hash
6. a stable auth-ID hash

The chosen continuity key is then applied to:

- `prompt_cache_key` in the request body
- `session_id` in the request headers

This ensures repeated requests reuse the same upstream continuity identity even when the original caller does not provide all continuity fields directly.
Runtime validation outcome
Manual runtime validation during debugging confirmed the pre-fix and post-fix behavior difference.
Before the fix, repeated Codex requests showed continuity drift such as:
- a missing or unstable `prompt_cache_key`
- a fresh random `session_id` per request

After the fix, repeated requests in the validated scenario reused stable continuity values and upstream cache reads started working through CLIProxyAPI.
This was specifically confirmed in the OpenCode -> CLIProxyAPI -> Codex flow, which previously missed prompt cache reuse.
Risk assessment
This PR does change request continuity behavior for Codex, so it is not zero-risk, but the risk is bounded and justified.
Main considerations:
Despite that, the changes are targeted, test-covered, and constrained to the exact area that was failing.
Out of scope
This PR does not attempt to: