format match_rule.py and chat.py #19

Merged
paulyuk merged 3 commits into main from copilot/infra-bump-default-modelcapacity
Jun 16, 2026
Conversation

Copilot AI commented Jun 16, 2026

Contributor

Thanks for the feedback on #18. I've created this new PR, which merges into #18, to address your comment. I will work on the changes and keep this PR's description up to date as I make progress.

Original PR: #18
Triggering comment (#18 (comment)):

@copilot uv run ruff format --check and create a PR to merge in this branch

paulyuk and others added 2 commits June 16, 2026 14:59
The default GlobalStandard capacity of 50 (= 50K TPM) is too low for a
multi-step agent template. A single daily-briefing run pulls a chunk of
inbox, reasons over it, and renders -- easily 10-20K tokens. Two runs
within a minute trip the per-deployment rate limit and surface as:

  HTTP 500: 'rate_limit_exceeded' ... 'Too Many Requests'

...which is a terrible first-deploy UX (people assume their config or
code is broken).

150 (= 150K TPM):
  - Comfortably handles repeated single-user agent runs.
  - Stays well within the default per-region/per-model GlobalStandard
    quota for new subscriptions (typically 1000 units = 1M TPM/region
    for the gpt-5 family).
  - Leaves room in a sub to host several template deploys side-by-side
    before quota-bumping is needed.

The param remains tweakable for users who are quota-constrained or who
need more headroom for multi-user workloads. To check your headroom:

  az cognitiveservices usage list --location <region> \
    --query "[?contains(name.value, 'gpt-5.4-mini')]" -o table

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
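
The capacity arithmetic in the commit message above can be sketched roughly as follows. This is a minimal illustration, assuming 1 capacity unit = 1K TPM (as the "50 (= 50K TPM)" and "1000 units = 1M TPM" figures imply) and using the 10-20K tokens-per-run estimate from the message. The function names are hypothetical and not part of the template, and the service's actual rate limiter may count tokens differently, so real headroom can be lower than this naive budget.

```python
# Naive token-budget arithmetic for a GlobalStandard deployment.
# Assumption: 1 capacity unit == 1,000 tokens per minute (TPM),
# matching the "50 (= 50K TPM)" figure in the commit message.
TPM_PER_UNIT = 1_000


def capacity_to_tpm(units: int) -> int:
    """Convert deployment capacity units to tokens per minute."""
    return units * TPM_PER_UNIT


def runs_per_minute(units: int, tokens_per_run: int) -> int:
    """Upper bound on whole agent runs that fit in one minute's budget."""
    return capacity_to_tpm(units) // tokens_per_run


def deploys_within_quota(quota_units: int, units_per_deploy: int) -> int:
    """How many deployments of a given size fit in a regional quota."""
    return quota_units // units_per_deploy


if __name__ == "__main__":
    for units in (50, 150):
        # 10K and 20K bracket the per-run estimate from the commit message.
        print(
            f"capacity={units}: {capacity_to_tpm(units)} TPM, "
            f"{runs_per_minute(units, 20_000)}-{runs_per_minute(units, 10_000)} runs/min"
        )
    # 1000 units is the "typical" default quota quoted in the commit message.
    print("deploys per 1000-unit quota:", deploys_within_quota(1000, 150))
```

Treat the results as an optimistic ceiling: they ignore retries and any token estimation the service applies before a request completes.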
Copilot AI requested a review from manvkaur June 16, 2026 22:25
@manvkaur manvkaur marked this pull request as ready for review June 16, 2026 22:27
@manvkaur manvkaur changed the title from "[WIP] Infra: Bump default modelCapacity from 50 to 150" to "format match_rule.py and chat.py" Jun 16, 2026
@manvkaur manvkaur changed the base branch from paulyuk/model-capacity-default-150 to main June 16, 2026 22:40
@manvkaur manvkaur changed the base branch from main to paulyuk/model-capacity-default-150 June 16, 2026 22:40
@manvkaur manvkaur changed the base branch from paulyuk/model-capacity-default-150 to main June 16, 2026 22:41

@paulyuk paulyuk left a comment

Member

LGTM. I assume these are formatting changes rather than functional ones? It was hard to tell at first because the description and comments weren't there, and this is a stacked PR rather than changes made inline in my PR.

@paulyuk paulyuk merged commit 16abacf into main Jun 16, 2026
2 checks passed
3 participants