hosting-cli(gcp): add --max-instances, --allow-unauthenticated, --env, --envfile #6557
Kastier1 wants to merge 3 commits into
Two new flags for `reflex deploy --gcp`:
- --max-instances (IntRange(min=1), default 100): caps autoscaling so
cost-conscious deploys don't run open-ended against Cloud Run's 100-
instance default. CLI-level validation rejects max < min so users
get a clear error instead of an opaque gcloud one.
- --allow-unauthenticated / --no-allow-unauthenticated (default true):
today the deploy script unconditionally publishes the service to
allUsers. The negated form makes the service private — callers then
need a roles/run.invoker IAM binding to reach it (or front it with
IAP / a load balancer with IAM auth). Help text calls this out.
Forwarded as CLOUD_RUN_MAX_INSTANCES (string int) and
CLOUD_RUN_ALLOW_UNAUTHENTICATED ("true"/"false"). Requires the
matching backend change so the deploy script honors them.
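The validation and forwarding described in this commit message could look roughly like the sketch below. It is a plain-Python illustration, not the PR's code: the function name `build_scaling_env` and the `--min-instances` flag name in the error text are assumptions, and the real CLI does its range check through `click.IntRange(min=1)`.

```python
from __future__ import annotations


def build_scaling_env(
    min_instances: int, max_instances: int, allow_unauthenticated: bool
) -> dict[str, str]:
    """Validate the scaling flags and return the env vars for the deploy script.

    Hypothetical helper: only the two env var names come from the PR text.
    """
    # Mirrors the IntRange(min=1) floor on --max-instances.
    if max_instances < 1:
        raise ValueError("--max-instances must be at least 1")
    # Cross-check done in the CLI so the user sees a clear message
    # instead of an opaque gcloud error at deploy time.
    if max_instances < min_instances:
        raise ValueError(
            f"--max-instances ({max_instances}) cannot be lower than "
            f"--min-instances ({min_instances})"
        )
    return {
        # Forwarded as a string int.
        "CLOUD_RUN_MAX_INSTANCES": str(max_instances),
        # Forwarded as the literal strings "true" / "false".
        "CLOUD_RUN_ALLOW_UNAUTHENTICATED": "true" if allow_unauthenticated else "false",
    }
```

Calling `build_scaling_env(0, 100, True)` yields both variables as strings, which matches the forwarding format named above.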
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Greptile Summary
This PR adds four new flags to
Confidence Score: 5/5
Safe to merge; the new flags are additive, defaults preserve existing behavior, and the tempfile lifecycle is correctly scoped.
All new code is additive. Defaults match Cloud Run's own defaults and don't change any existing behavior. The env-vars YAML tempfile is created only when needed and cleaned up reliably via ExitStack. The one inconsistency (envfile keys not validated against the env-var name regex) would produce a gcloud error rather than silent bad state.
packages/reflex-hosting-cli/src/reflex_cli/v2/gcp.py: specifically the `_parse_envs` envfile branch, which lacks key-name validation.
Important Files Changed
Reviews (2): Last reviewed commit: "hosting-cli(gcp): add --env / --envfile ..."
type=click.IntRange(min=1),
help="Maximum number of Cloud Run instances during autoscaling (sets CLOUD_RUN_MAX_INSTANCES). Caps cost under traffic spikes.",
No upper bound is enforced on `--max-instances`. Cloud Run's documented maximum is 1000 instances; values above that will produce an opaque error from gcloud at deploy time rather than a clear CLI message, which is inconsistent with the min/max cross-validation added in this PR. Adding `max=1000` keeps the UX consistent.
Suggested change: replace `type=click.IntRange(min=1),` with `type=click.IntRange(min=1, max=1000),` and leave the `help` text unchanged.
Declining this one. 1000 is Cloud Run's default per-service cap, but it's a soft quota — customers can request it raised via Cloud Quotas. Hard-coding max=1000 in the CLI would lock anyone with a raised quota out of using the CLI for higher counts. IntRange(min=1) keeps the floor sane; let gcloud be the authority on the ceiling.
Add user-supplied env vars to `reflex deploy --gcp`, mirroring the existing `reflex deploy` and `reflex secrets update` flows:
- `--env KEY=VALUE` (multiple=True): repeatable; parsed by `hosting.process_envs` (validates key format).
- `--envfile PATH`: reads a .env file via `dotenv_values`; lazy import with the same install-hint as secrets.py.
- When both are passed, --envfile wins with a warning (same precedence as the existing flows).
Implementation: the parsed dict is written to a YAML tempfile (via json.dumps per value, so any string round-trips safely) and the path is forwarded to the deploy script as REFLEX_ENV_VARS_FILE. The script hands it to `gcloud run deploy --env-vars-file=...`. The tempfile's lifecycle is bound to a contextlib.ExitStack so it's only created when envs are present and always cleaned up afterward.
Dry-run output now shows the env-vars YAML body so users can preview what's about to ship to Cloud Run.
Help text calls out that these become plain Cloud Run env vars (visible to roles/run.viewer) and points at Secret Manager for sensitive values, matching existing Reflex/Fly deploy semantics.
Companion backend PR (the script-side --env-vars-file support): reflex-dev/flexgen#3748.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@greptileai please re-review; the PR has expanded to include
Per Greptile review on #6557 (P1 + security): if a user upgrades the CLI before the matching flexgen backend ships, passing --no-allow-unauthenticated would be silently no-op'd by the older deploy script (which still hard-codes --allow-unauthenticated), producing a PUBLIC service when the user explicitly asked for a private one. That's exactly the fail-silent privacy flip we defended against on the script side.
Add a CLI-side check: after fetching the manifest, if the user passed --no-allow-unauthenticated but the fetched deploy_script doesn't reference CLOUD_RUN_ALLOW_UNAUTHENTICATED, abort with a clear error naming the missing backend support.
Declining the companion P2 (IntRange max=1000 on --max-instances): 1000 is a soft default per-service cap that customers can raise via quota request; hard-coding it client-side would lock out users with raised quotas. Let gcloud be the authority.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
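The CLI-side check described in this commit can be sketched as a substring test on the fetched script. This is an illustration under assumptions: the function name `ensure_private_deploy_supported`, the exception type, and the error wording are hypothetical; only the `CLOUD_RUN_ALLOW_UNAUTHENTICATED` marker and the abort-on-missing behavior come from the commit message.

```python
MARKER = "CLOUD_RUN_ALLOW_UNAUTHENTICATED"


def ensure_private_deploy_supported(
    deploy_script: str, allow_unauthenticated: bool
) -> None:
    """Abort if a private deploy was requested but the script cannot honor it.

    Hypothetical helper; `deploy_script` stands for the script body taken
    from the fetched manifest.
    """
    # Public deploys behave the same on old and new scripts, so nothing to check.
    if allow_unauthenticated:
        return
    # An older script never reads the variable, so it would still publish
    # the service to allUsers. Fail loudly rather than deploy publicly.
    if MARKER not in deploy_script:
        raise RuntimeError(
            "--no-allow-unauthenticated was requested, but the deploy script "
            f"does not reference {MARKER}; the backend does not support "
            "private deploys yet."
        )
```

The check is deliberately conservative: it only proves the script mentions the variable, not that it handles it correctly, but it is enough to stop the silent public deploy on an outdated backend.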
Summary
Four new flags for `reflex deploy --gcp`:
- `--max-instances` (`IntRange(min=1)`, default `100`): caps autoscaling against Cloud Run's 100-instance default. CLI-level check rejects `max < min` so users get a clear error instead of an opaque gcloud one.
- `--allow-unauthenticated / --no-allow-unauthenticated` (default true): today the deploy script unconditionally publishes the service to `allUsers`. The negated form makes the service private; callers then need `roles/run.invoker` to reach it (or front with IAP / a load balancer).
- `--env KEY=VALUE` (`multiple=True`): set environment variables on the Cloud Run service. Parsed by the existing `hosting.process_envs` helper (validates key format). Repeat for multiple, matching the existing `reflex deploy` and `reflex secrets update` UX.
- `--envfile PATH`: reads a .env file via `dotenv_values` (lazy import + install hint, same pattern as `secrets.py`). When both `--envfile` and `--env` are passed, `--envfile` wins with a warning, the same precedence as the existing flows.
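The `--env` / `--envfile` precedence above can be sketched without the real helpers. Everything named here is an assumption for illustration: `ENV_KEY_RE` is a guess at a typical env-var name pattern (the regex used by `hosting.process_envs` is not shown in this PR text), `resolve_envs` is a hypothetical function, and `envfile_values` stands in for the mapping that `dotenv_values` would return. The sketch validates envfile keys too, which is the gap the review summary points out, not something the PR is stated to do.

```python
from __future__ import annotations

import re
import warnings
from typing import Mapping, Optional, Sequence

# Assumed key pattern; the actual regex in the CLI may differ.
ENV_KEY_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check_key(key: str) -> None:
    if not ENV_KEY_RE.match(key):
        raise ValueError(f"invalid environment variable name: {key!r}")


def resolve_envs(
    env_pairs: Sequence[str],
    envfile_values: Optional[Mapping[str, Optional[str]]] = None,
) -> dict[str, str]:
    """Return the env vars to deploy; an envfile, when given, wins over --env."""
    if envfile_values is not None:
        if env_pairs:
            warnings.warn("--envfile takes precedence; ignoring --env values")
        envs: dict[str, str] = {}
        for key, value in envfile_values.items():
            _check_key(key)
            # A key with no value in a .env file is treated as empty here
            # (an assumption made for this sketch).
            envs[key] = value if value is not None else ""
        return envs

    envs = {}
    for pair in env_pairs:
        # Split on the first "=" only, so values may contain "=".
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        _check_key(key)
        envs[key] = value
    return envs
```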
How env vars get to Cloud Run
The parsed dict is written to a YAML tempfile (`json.dumps` per value so quotes/backslashes/newlines round-trip), and the path is forwarded to the deploy script as `REFLEX_ENV_VARS_FILE`. The script hands it to `gcloud run deploy --env-vars-file=...`. Tempfile lifecycle is bound to `contextlib.ExitStack` — created only when envs are present, cleaned up afterward.
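A minimal sketch of the tempfile handoff described above, using only the standard library. The function names `render_env_vars_yaml` and `env_vars_tempfile` are hypothetical; the approach (one `json.dumps` per value, cleanup tied to a `contextlib.ExitStack`, no file when there are no envs) follows the description, but the PR's actual code may be organized differently. It relies on JSON string literals being valid YAML double-quoted scalars and assumes keys were validated earlier.

```python
from __future__ import annotations

import contextlib
import json
import os
import tempfile
from typing import Mapping, Optional


def render_env_vars_yaml(envs: Mapping[str, str]) -> str:
    """Render a flat KEY: "value" YAML mapping, one entry per line."""
    # json.dumps escapes quotes, backslashes and newlines in each value.
    return "".join(f"{key}: {json.dumps(value)}\n" for key, value in envs.items())


def env_vars_tempfile(
    stack: contextlib.ExitStack, envs: Mapping[str, str]
) -> Optional[str]:
    """Write envs to a YAML tempfile and return its path, or None if empty."""
    if not envs:
        return None  # nothing to forward, so no file is created
    with tempfile.NamedTemporaryFile(
        "w", suffix=".yaml", delete=False, encoding="utf-8"
    ) as tmp:
        tmp.write(render_env_vars_yaml(envs))
    # The file is closed at this point; deletion runs when the stack exits.
    stack.callback(os.unlink, tmp.name)
    return tmp.name


if __name__ == "__main__":
    with contextlib.ExitStack() as stack:
        path = env_vars_tempfile(stack, {"GREETING": 'say "hi"\nbye'})
        if path is not None:
            # This is the value that would be exported as REFLEX_ENV_VARS_FILE.
            print(path)
            with open(path, encoding="utf-8") as fh:
                print(fh.read(), end="")
```

Closing the file before registering the cleanup keeps the deletion safe on platforms that refuse to unlink an open file.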
Help text on `--env` calls out that these become plain Cloud Run env vars (visible to anyone with `roles/run.viewer`) and points at Secret Manager for sensitive values — matches the security posture of the existing Reflex/Fly deploy flow.
Dry-run output now shows the env-vars YAML body so users can preview what's about to ship.
Requires the matching backend change so the deploy script honors the new env vars: reflex-dev/flexgen#3748
Test plan
🤖 Generated with Claude Code