feat(balancer): bypass QUOTA_EXCEEDED filter for gated models with additional quota #855
zwd0313 wants to merge 8 commits into
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 551facccdc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 934ca0635e
934ca06 to 295705f
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 295705f392
295705f to 24971c2
@codex review
Codex Review: Didn't find any major issues. Keep them coming!
3f99a17 to c51efb7
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c51efb7fed
c51efb7 to 7266350
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 72663507ba
7266350 to 9752e3e
…ditional quota
When a model is routed through an additional (gated) quota, the primary-window QUOTA_EXCEEDED status on an account may be stale or irrelevant: the additional quota gate has already verified eligibility independently. Previously, select_account() unconditionally filtered out QUOTA_EXCEEDED accounts via `continue`, causing legitimate gated-model requests to fail with 'No available accounts' even though the additional quota was available.
Add a `bypass_quota_exceeded=False` parameter to select_account() that, when True, skips the QUOTA_EXCEEDED continue branch while still respecting all other status checks (PAUSED, DEACTIVATED, RATE_LIMITED, cooldown, error backoff).
At the LoadBalancer.select_account() entry point, compute `_effective_bypass_quota_exceeded = additional_limit_name is not None` so bypass is automatically enabled for any gated-model request. The parameter flows through _select_account_preferring_budget_safe() and _select_with_stickiness() to all internal select_account() call sites.
Two unit tests:
- bypass_quota_exceeded=True keeps QUOTA_EXCEEDED account in pool
- bypass_quota_exceeded=True does not affect PAUSED/DEACTIVATED filtering
9752e3e to 882ec9b
@codex review
Addressed in
Validation:
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 882ec9b556
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ae519914e0
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 58a8cbe4c2
b13a5e2 to ca9e470
@codex review
Codex Review: Didn't find any major issues. Chef's kiss.
# Conflicts:
#	app/core/balancer/logic.py
#	app/modules/proxy/load_balancer.py
Problem
When a model is routed through an additional (gated) quota (e.g. `GPT-5.3-Codex-Spark`), `select_account()` unconditionally filters out all accounts with `QUOTA_EXCEEDED` status — even when the additional (independent) quota is available and the account is otherwise healthy. This causes legitimate gated-model requests to fail with "No available accounts".
The root cause: the `QUOTA_EXCEEDED` branch in `select_account()` does `continue` without checking whether an additional quota gate has already determined the account is eligible.
Change
Add a `bypass_quota_exceeded=False` keyword parameter to `select_account()` that, when `True`, skips the `QUOTA_EXCEEDED continue` branch while still respecting all other status checks (PAUSED, DEACTIVATED, RATE_LIMITED, cooldown, error backoff).
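As a rough illustration of the filtering rule described above, here is a minimal sketch. The `Status` enum and `Account` dataclass are hypothetical stand-ins, not the project's real types, and cooldown and error-backoff checks are omitted for brevity; only the `bypass_quota_exceeded` keyword and the status names come from this PR.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Iterable, Optional


class Status(Enum):  # hypothetical stand-in for the project's status enum
    ACTIVE = auto()
    PAUSED = auto()
    DEACTIVATED = auto()
    RATE_LIMITED = auto()
    QUOTA_EXCEEDED = auto()


@dataclass
class Account:  # hypothetical, heavily simplified account record
    account_id: str
    status: Status


def select_account(
    accounts: Iterable[Account], *, bypass_quota_exceeded: bool = False
) -> Optional[Account]:
    """Return the first eligible account, or None if the pool is empty."""
    for account in accounts:
        # These statuses are always filtered out, regardless of the bypass flag.
        if account.status in (Status.PAUSED, Status.DEACTIVATED, Status.RATE_LIMITED):
            continue
        # QUOTA_EXCEEDED is skipped only when the caller has not asked to bypass it.
        if account.status is Status.QUOTA_EXCEEDED and not bypass_quota_exceeded:
            continue
        return account
    return None
```

With the default `bypass_quota_exceeded=False`, the sketch behaves like the old unconditional `continue`; the new behaviour is strictly opt-in.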
At the `LoadBalancer.select_account()` entry point, automatically compute:
```python
_effective_bypass_quota_exceeded = additional_limit_name is not None
```
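so bypass is enabled for any gated-model request. The parameter flows through `_select_account_preferring_budget_safe()` and `_select_with_stickiness()` to all internal `select_account()` call sites.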
Impact scope: Only affects requests routed through an additional quota gate. Standard (non-gated) requests are unaffected (`additional_limit_name=None` → bypass stays `False`).
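The propagation can be sketched as follows. This is a simplified assumption about the shape of the call chain, not the actual code: statuses are plain strings, the function names `select_with_stickiness` and `load_balancer_select` are hypothetical, and the gate name used in the usage note is made up. Only the expression `additional_limit_name is not None` is taken from the PR.

```python
from typing import Optional, Sequence


def select_account(
    statuses: Sequence[str], *, bypass_quota_exceeded: bool = False
) -> Optional[int]:
    """Hypothetical core selector: index of the first eligible account, else None."""
    for index, status in enumerate(statuses):
        if status == "QUOTA_EXCEEDED":
            if bypass_quota_exceeded:
                return index
            continue
        if status == "ACTIVE":
            return index
    return None


def select_with_stickiness(statuses, *, bypass_quota_exceeded=False):
    """Hypothetical intermediate helper: forwards the flag unchanged."""
    return select_account(statuses, bypass_quota_exceeded=bypass_quota_exceeded)


def load_balancer_select(statuses, additional_limit_name: Optional[str] = None):
    """Hypothetical entry point: derives the flag from the gate name."""
    effective_bypass_quota_exceeded = additional_limit_name is not None
    return select_with_stickiness(
        statuses, bypass_quota_exceeded=effective_bypass_quota_exceeded
    )
```

For example, `load_balancer_select(["QUOTA_EXCEEDED"])` finds no account, while `load_balancer_select(["QUOTA_EXCEEDED"], additional_limit_name="some_gate")` selects index 0. Callers never set the flag themselves, which keeps non-gated requests on the old code path.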
Testing
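84/84 pass (82 existing + 2 new). The two new unit tests check that:
- `bypass_quota_exceeded=True` keeps a QUOTA_EXCEEDED account in the pool
- `bypass_quota_exceeded=True` does not affect PAUSED/DEACTIVATED filtering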
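The new tests themselves are not reproduced on this page. As a hedged sketch of what tests of this kind might look like, the pytest-style example below runs against `filter_pool`, a hypothetical string-based stand-in for the real selector; the test names are illustrative as well.

```python
# Hypothetical stand-in for the real selector; statuses are plain strings here.
ALWAYS_BLOCKED = {"PAUSED", "DEACTIVATED"}


def filter_pool(statuses, *, bypass_quota_exceeded=False):
    """Return the statuses that survive filtering (simplified)."""
    kept = []
    for status in statuses:
        if status in ALWAYS_BLOCKED:
            continue
        if status == "QUOTA_EXCEEDED" and not bypass_quota_exceeded:
            continue
        kept.append(status)
    return kept


def test_bypass_keeps_quota_exceeded_account_in_pool():
    # Without the flag the account is dropped; with it, the account stays.
    assert filter_pool(["QUOTA_EXCEEDED"]) == []
    assert filter_pool(["QUOTA_EXCEEDED"], bypass_quota_exceeded=True) == ["QUOTA_EXCEEDED"]


def test_bypass_does_not_affect_paused_or_deactivated():
    # The flag must not resurrect accounts blocked for other reasons.
    assert filter_pool(["PAUSED", "DEACTIVATED"], bypass_quota_exceeded=True) == []
```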
Files changed (3 files, +65/-1)