feat: add reserved_concurrency for sticky session slot reservation #607

Open
wangxiaobo775 wants to merge 1 commit into Wei-Shaw:main from wangxiaobo775:feat/reserved-concurrency

Summary

Add a reserved_concurrency field to Account, allowing operators to reserve dedicated concurrency slots for bound (sticky) sessions. This prevents new session requests from crowding out existing sessions under high concurrency, preserving session continuity.

Core Mechanism

  • Bound session requests (sticky): can use all concurrency slots
  • New session requests: can only use concurrency - reserved_concurrency slots
  • Default value 0: fully backward compatible — no behavioral change unless explicitly configured
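The rule above can be sketched as a single method. This is a minimal illustration inferred from the description, not the code from the diff: the Account struct here is a pared-down stand-in with only the two relevant fields, and the handling of concurrency <= 0 (passed through unchanged) is an assumption based on the Edge Cases table.

```go
package main

import "fmt"

// Account is a pared-down stand-in for the service model; only the two
// fields discussed in this PR are included.
type Account struct {
	Concurrency         int
	ReservedConcurrency int
}

// EffectiveConcurrency returns the slot limit that applies to a request.
// The body is an assumption inferred from the PR description.
func (a Account) EffectiveConcurrency(isBound bool) int {
	// concurrency <= 0 keeps its original meaning, so pass it through untouched.
	if a.Concurrency <= 0 {
		return a.Concurrency
	}
	// Bound sessions, and accounts with a zero or negative reservation,
	// see every slot.
	if isBound || a.ReservedConcurrency <= 0 {
		return a.Concurrency
	}
	// A reservation that covers everything leaves nothing for new sessions.
	if a.ReservedConcurrency >= a.Concurrency {
		return 0
	}
	return a.Concurrency - a.ReservedConcurrency
}

func main() {
	a := Account{Concurrency: 5, ReservedConcurrency: 2}
	fmt.Println(a.EffectiveConcurrency(true))  // 5
	fmt.Println(a.EffectiveConcurrency(false)) // 3
}
```

Keeping the arithmetic in one method means call sites only have to decide whether the request is bound.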

Changes

Database

  • New migration: 058_add_account_reserved_concurrency.sql

Backend

  • Ent schema + generated code: new reserved_concurrency integer field (default 0)
  • Account.EffectiveConcurrency(isBound bool) int method on service model
  • All tryAcquireAccountSlot call sites in gateway_service.go and openai_gateway_service.go now use EffectiveConcurrency(isBound) instead of raw Concurrency
  • WaitPlan.MaxConcurrency and AccountWithConcurrency.MaxConcurrency updated accordingly
  • When EffectiveConcurrency(false) == 0, new session requests skip the account entirely
  • Added maxConcurrency == 0 short-circuit in tryAcquireAccountSlot to avoid unnecessary Redis calls
  • Full CRUD support: create / update / bulk-update through handler → service → repository layers
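For reviewers, here is a rough sketch of what the maxConcurrency == 0 short-circuit might look like. The function name comes from this PR, but the signature, the body, and the slotStore type (an in-memory stand-in for the Redis-backed counter) are all hypothetical. The sketch assumes a limit of 0 reaching this function means "no slots available"; how accounts with no concurrency limit are told apart is not covered here.

```go
package main

import "fmt"

// slotStore is a hypothetical in-memory stand-in for the Redis-backed counter.
type slotStore struct {
	inUse map[int64]int
	calls int // number of simulated round trips to the store
}

// acquire takes one slot for the account if fewer than max are in use.
func (s *slotStore) acquire(accountID int64, max int) bool {
	s.calls++
	if s.inUse[accountID] >= max {
		return false
	}
	s.inUse[accountID]++
	return true
}

// tryAcquireAccountSlot mirrors the name used in the PR; the body is an
// assumption, not the actual implementation.
func tryAcquireAccountSlot(s *slotStore, accountID int64, maxConcurrency int) bool {
	if maxConcurrency == 0 {
		return false // nothing to acquire, so skip the store entirely
	}
	return s.acquire(accountID, maxConcurrency)
}

func main() {
	s := &slotStore{inUse: map[int64]int{}}

	ok := tryAcquireAccountSlot(s, 1, 0)
	fmt.Println(ok, s.calls) // false 0

	ok = tryAcquireAccountSlot(s, 1, 1)
	fmt.Println(ok, s.calls) // true 1

	ok = tryAcquireAccountSlot(s, 1, 1)
	fmt.Println(ok, s.calls) // false 2
}
```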

Frontend

  • Account create / edit / bulk-edit modals: new "Reserved Concurrency" input field
  • i18n: zh + en translations

Edge Cases

| Scenario | Behavior |
|---|---|
| reserved_concurrency = 0 | Fully backward compatible |
| reserved_concurrency >= concurrency | New sessions get effective concurrency = 0 → skipped |
| reserved_concurrency < 0 | Treated as 0 (no reservation) |
| concurrency <= 0 | Original behavior: no concurrency limit |

Test Plan

  • Create account with concurrency=5, reserved_concurrency=2 → verify new sessions limited to 3 concurrent
  • Verify bound session requests can use all 5 slots
  • Verify reserved_concurrency=0 has no behavioral change
  • Verify reserved_concurrency >= concurrency blocks new sessions entirely
  • Verify frontend create/edit/bulk-edit modals display and submit the field correctly
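The first two items can be dry-run without Redis. The sketch below is only an illustration: limiter and admit are hypothetical helpers, and it assumes new and bound sessions draw from one shared in-use counter with different ceilings, as the Core Mechanism section describes.

```go
package main

import "fmt"

// limiter is a hypothetical single-account slot counter.
type limiter struct{ inUse int }

// tryAcquire takes a slot if fewer than max are currently in use.
func (l *limiter) tryAcquire(max int) bool {
	if l.inUse >= max {
		return false
	}
	l.inUse++
	return true
}

// admit counts how many of n back-to-back requests get a slot under max.
// No slot is released in between, which models fully concurrent requests.
func admit(l *limiter, n, max int) int {
	admitted := 0
	for i := 0; i < n; i++ {
		if l.tryAcquire(max) {
			admitted++
		}
	}
	return admitted
}

func main() {
	const concurrency, reserved = 5, 2
	l := &limiter{}

	newAdmitted := admit(l, 10, concurrency-reserved) // new sessions: ceiling 3
	boundAdmitted := admit(l, 10, concurrency)        // bound sessions: ceiling 5

	fmt.Println(newAdmitted, boundAdmitted, l.inUse) // 3 2 5
}
```

With three slots already held by new sessions, bound sessions can still take the two reserved slots, bringing the account to its full limit of 5.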

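Reserve dedicated concurrency slots for requests already bound to a sticky session, so that new session requests cannot crowd them out under high concurrency and session continuity is preserved.

Core mechanism:
- Bound session requests can use all concurrency slots
- New session requests can use only concurrency - reserved_concurrency slots
- Default value 0, fully backward compatible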