OpenAI /responses failover can bypass account model_mapping due to cached body reuse #2897

@Shujakuinkuraudo

Summary

In the OpenAI /v1/responses path, failover or same-request account switching can forward the original raw request body even though account.model_mapping was applied on an earlier attempt. As a result, the next upstream account can receive an alias such as alias-model instead of the mapped base model.

What happens

  1. The handler enters the /v1/responses failover loop and passes the original raw body to OpenAIGatewayService.Forward.
  2. Forward parses/caches the request body map in the gin context via OpenAIParsedRequestBodyKey.
  3. Account model mapping mutates the cached map, e.g. alias-model -> base-model, and serializes the body for that first attempt.
  4. If the first upstream attempt fails and the handler loops to another account, the handler again passes the original raw body.
  5. Forward reuses the cached map from the previous attempt. Since the cached map already contains base-model, the mapping block finds nothing to change and does not set bodyModified=true.
  6. Because bodyModified is not set, the body is not reserialized, and the outgoing request is built from the original raw body, which still contains alias-model.

The result is that the next upstream account can receive the unmapped alias model even though the account has a valid model_mapping.
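
The steps above can be reproduced in isolation. The following is a minimal sketch under stated assumptions, not the project's actual code: reqContext, bodyMap, forwardBuggy, and modelOf are hypothetical stand-ins for the gin context, getOpenAIRequestBodyMap, and Forward, and the account mapping is modeled as a plain map.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// reqContext is a hypothetical stand-in for the per-request gin context.
// parsed plays the role of the value cached under OpenAIParsedRequestBodyKey.
type reqContext struct {
	parsed map[string]any
}

// bodyMap parses raw on the first call and returns the same cached map afterwards.
func (c *reqContext) bodyMap(raw []byte) (map[string]any, error) {
	if c.parsed != nil {
		return c.parsed, nil
	}
	var m map[string]any
	if err := json.Unmarshal(raw, &m); err != nil {
		return nil, err
	}
	c.parsed = m
	return m, nil
}

// forwardBuggy returns the body that one upstream attempt would send.
// It mutates the cached map and only reserializes when this call changed it.
func forwardBuggy(c *reqContext, raw []byte, mapping map[string]string) ([]byte, error) {
	m, err := c.bodyMap(raw)
	if err != nil {
		return nil, err
	}
	bodyModified := false
	if model, ok := m["model"].(string); ok {
		if mapped, found := mapping[model]; found && mapped != model {
			m["model"] = mapped // mutates the shared cached map
			bodyModified = true
		}
	}
	if !bodyModified {
		return raw, nil // falls back to the original raw body
	}
	return json.Marshal(m)
}

// modelOf extracts the "model" field from a JSON body.
func modelOf(body []byte) string {
	var v struct {
		Model string `json:"model"`
	}
	_ = json.Unmarshal(body, &v)
	return v.Model
}

func main() {
	raw := []byte(`{"model":"alias-model","input":"hi"}`)
	mapping := map[string]string{"alias-model": "base-model"}
	c := &reqContext{}

	first, _ := forwardBuggy(c, raw, mapping)  // attempt 1: mapped and reserialized
	second, _ := forwardBuggy(c, raw, mapping) // attempt 2: cached map reused
	fmt.Println(modelOf(first), modelOf(second))
}
```

Run as-is, this prints base-model alias-model: the second attempt sends the unmapped alias even though the mapping is still configured.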

Expected behavior

Every upstream attempt should be built from a body consistent with the current account mapping. A cached parsed body from a previous attempt should not cause the raw original body to be sent unchanged.

Likely affected code

  • backend/internal/handler/openai_gateway_handler.go: /responses failover loop reuses the raw request body for each attempt.
  • backend/internal/service/openai_gateway_service.go: getOpenAIRequestBodyMap reuses OpenAIParsedRequestBodyKey; Forward mutates that cached map but only reserializes when bodyModified is set during the current call.

Possible fixes

  • Clear the parsed request body cache before each new upstream attempt, or
  • make the cache body-specific / immutable, or
  • force serialization when Forward receives a cached parsed map whose model differs from the raw body model.
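
As an illustration of the second option (an immutable cache), here is a minimal sketch, assuming the cached map is never mutated and each attempt works on its own copy. All names (reqContext, bodyMap, forwardFixed, modelOf) are hypothetical and do not mirror the real service code, and other-model is an invented second mapping target used only to show two accounts with different mappings.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// reqContext is a hypothetical stand-in for the per-request gin context.
type reqContext struct {
	parsed map[string]any // pristine parse of the raw body, never mutated
}

// bodyMap parses raw once, then hands every caller a fresh shallow copy.
// A shallow copy is enough here because only the top-level "model" key changes.
func (c *reqContext) bodyMap(raw []byte) (map[string]any, error) {
	if c.parsed == nil {
		var m map[string]any
		if err := json.Unmarshal(raw, &m); err != nil {
			return nil, err
		}
		c.parsed = m
	}
	out := make(map[string]any, len(c.parsed))
	for k, v := range c.parsed {
		out[k] = v
	}
	return out, nil
}

// forwardFixed returns the body one upstream attempt would send.
// Each attempt starts from the original model, so the mapping is always re-applied.
func forwardFixed(c *reqContext, raw []byte, mapping map[string]string) ([]byte, error) {
	m, err := c.bodyMap(raw)
	if err != nil {
		return nil, err
	}
	model, _ := m["model"].(string)
	mapped, found := mapping[model]
	if !found || mapped == model {
		return raw, nil // nothing to map: the raw body is already consistent
	}
	m["model"] = mapped // only the per-attempt copy is changed
	return json.Marshal(m)
}

// modelOf extracts the "model" field from a JSON body.
func modelOf(body []byte) string {
	var v struct {
		Model string `json:"model"`
	}
	_ = json.Unmarshal(body, &v)
	return v.Model
}

func main() {
	raw := []byte(`{"model":"alias-model","input":"hi"}`)
	c := &reqContext{}

	// Two accounts with different mappings share one request context.
	first, _ := forwardFixed(c, raw, map[string]string{"alias-model": "base-model"})
	second, _ := forwardFixed(c, raw, map[string]string{"alias-model": "other-model"})
	fmt.Println(modelOf(first), modelOf(second))
}
```

Run as-is, this prints base-model other-model. Copying per attempt also keeps each account's mapping independent, whereas forcing serialization of an already-mutated cached map may carry the previous account's mapped model into the next attempt.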

A regression test should cover one gin context making two Forward calls: the first call maps alias-model -> base-model and fails over, then the second call must still send base-model upstream rather than the original alias-model.
