Tier 2: Single-state-per-solution + cloud-only status + gapp deploy --ref
Summary
Three coupled changes to the build/state/status path:
- Drop workspace-mode TF state partitioning. One TF state per solution, always.
- Status reads service URLs from a TF outputs map (service_urls), not a single service_url string.
- Add gapp deploy --ref=<ref> to override the local-repo build ref.
Background
Today's behavior
Workspace-mode TF state. When a solution declares paths: in gapp.yaml (multi-service workspace mode), each sub-service's TF state is partitioned at gs://gapp-{solution}-{project}/terraform/state/{svc-name}/. Single-service solutions write to gs://gapp-{solution}-{project}/terraform/state/. Promoting a single-service solution to multi-service requires manually moving state objects between prefixes — not a seamless transition.
Status output read. core.py status() reads service_url (string) from terraform output -json. For multi-service it iterates a list constructed from paths: in the local manifest and reads each sub-state separately. This forces status to load + validate the local manifest before any cloud read can complete.
Deploy ref. gapp deploy builds from git archive HEAD of the local repo. No CLI override exists for selecting a different local ref. A clean working tree is required.
Why this is wrong
The "solution" abstraction is meant to be a single deployable unit:
- One bucket: gapp-{solution}-{project} ✓
- One project label: gapp_<owner>_<solution>=v-N ✓
- One TF state — but workspace mode partitions this ✗
If state is per-service, "solution" is little more than a label: the bucket is the only resource that exists at the solution level. That is an inconsistent middle ground.
Best practice (Terraform community consensus): one state per "deployment unit" — the smallest thing typically applied/destroyed together. Split state only when you need:
- Independent lifecycles (apply A without touching B)
- Blast radius isolation (a bad apply on A doesn't risk B)
- Permission boundaries (different humans manage different services)
For the typical user (single developer, single repo, services that ship together), none of those apply. Splitting state imposes complexity (multi-init, multi-plan, drift across multiple states, migrations between modes) for zero benefit. Per-service SAs, IAM, and secrets are already isolated at the resource level inside one state.
Proposed Changes
1. Single state per solution, always
- TF state always lives at gs://gapp-{solution}-{project}/terraform/state/
- Multiple services in a solution are TF resources/modules within that one state
- The workspace-mode state-prefix branching in core.py deploy and status paths is deleted
- The paths: field in gapp.yaml retains its meaning (multiple services) but no longer changes state location
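As a rough illustration of the list above, state-location resolution could collapse to a single function with no branch on paths:. The helper name state_location and its return shape are hypothetical and not taken from core.py; only the bucket and prefix patterns come from this issue.

```python
def state_location(solution: str, project: str) -> tuple[str, str]:
    """Return (bucket, prefix) for a solution's Terraform state.

    Hypothetical helper: single- and multi-service solutions share the
    same location, so no service name ever appears in the prefix.
    """
    bucket = f"gapp-{solution}-{project}"
    prefix = "terraform/state"
    return bucket, prefix


# Example with made-up solution and project names.
bucket, prefix = state_location("shop", "my-project")
print(f"gs://{bucket}/{prefix}/")
```

Because the function takes no service argument, promoting a solution from one service to several cannot move its state.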
2. Status reads service URLs from outputs map
TF templates emit service_urls (map) instead of service_url (string):
    output "service_urls" {
      value = { for s in var.services : s.name => google_cloud_run_service.svc[s.name].status[0].url }
    }
Single-service solutions emit a 1-entry map keyed by the solution name. This produces one code path in status:
- Resolve solution → resolve project (cloud read)
- terraform output -json → get service_urls map
- Iterate, probe /health per service
Status no longer needs the local manifest. The bucket and TF state are the source of truth for what the solution actually has deployed.
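The iterate-and-probe step might look like the sketch below. It assumes the service_urls map has already been read from TF outputs. The names http_ok and probe_services are hypothetical, and treating any 2xx response as healthy is an assumption, not something this issue specifies.

```python
from typing import Callable
from urllib.request import urlopen


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET on url answers with a 2xx status."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, ValueError):
        # OSError covers URLError/HTTPError and timeouts; ValueError covers bad URLs.
        return False


def probe_services(
    service_urls: dict[str, str],
    fetch: Callable[[str], bool] = http_ok,
) -> dict[str, bool]:
    """Probe <url>/health for every service in the outputs map."""
    return {
        name: fetch(url.rstrip("/") + "/health")
        for name, url in sorted(service_urls.items())
    }
```

The fetch parameter exists so the loop can be exercised with a stub instead of real network calls. Nothing here reads gapp.yaml, which matches the cloud-only intent.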
Backward compat. During the transition, status reads service_urls (map) first, falls back to service_url (string) for older deployments. After all live deployments have been re-deployed under the new template, the fallback can be removed.
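A minimal sketch of the map-first, string-fallback read follows. It relies on the documented shape of terraform output -json, where each output is an object with a value field. The function name extract_service_urls is hypothetical; keying the legacy URL by the solution name follows the single-service convention described above.

```python
import json


def extract_service_urls(outputs_json: str, solution: str) -> dict[str, str]:
    """Read service URLs from `terraform output -json` text.

    Prefers the new `service_urls` map and falls back to the legacy
    `service_url` string, keyed by the solution name.
    """
    outputs = json.loads(outputs_json)

    urls = outputs.get("service_urls", {}).get("value")
    if isinstance(urls, dict):
        return {str(name): str(url) for name, url in urls.items()}

    legacy = outputs.get("service_url", {}).get("value")
    if isinstance(legacy, str) and legacy:
        return {solution: legacy}

    return {}  # nothing deployed yet, or neither output is present
```

Once every live deployment emits the map, the legacy branch can be deleted without touching callers.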
3. gapp deploy --ref=<ref> flag
Single coherent semantic: --ref selects which ref of the local repo to use as the build context for any service drawing from it.
| Service shape | What --ref=v1.0.0 does |
|---|---|
| Singular service: (no source: block) | Build local repo at v1.0.0 instead of HEAD |
| Singular service: with remote source: | Refuses: no local code is involved, so the flag has nothing to apply to |
| Plural services:, all local-code entries | Build local repo at v1.0.0 once; all entries share that build context |
| Plural services: mixed (some local, some remote) | Local entries built from v1.0.0; remote entries keep their pinned source.ref |
| Plural services:, all remote | Refuses: no local code is involved |
Remote-sourced services have their own pinned source.ref in gapp.yaml (see the composition issue). CLI --ref does not override them. To bump a remote pin, edit gapp.yaml and commit.
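The rules table could be sketched as one resolution function. The Service dataclass, its remote_ref field, and the error text are hypothetical stand-ins for whatever gapp.yaml parsing produces; only the behavior per row comes from the table above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Service:
    name: str
    remote_ref: Optional[str] = None  # pinned source.ref; None means local code


def resolve_build_refs(services: list[Service], cli_ref: Optional[str]) -> dict[str, str]:
    """Map each service name to the ref it is built from."""
    has_local = any(s.remote_ref is None for s in services)
    if cli_ref is not None and not has_local:
        # Rows "remote source:" and "all remote": the flag has nothing to apply to.
        raise ValueError("--ref given, but no service builds from local code")

    local_ref = cli_ref if cli_ref is not None else "HEAD"
    return {
        s.name: local_ref if s.remote_ref is None else s.remote_ref
        for s in services
    }
```

Remote entries never see cli_ref, which keeps the rule that a remote pin changes only through an edit to gapp.yaml.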
Dirty-tree gate.
| Scenario | Refuse on dirty tree? |
|---|---|
| gapp deploy (any local-code service) | Yes: accidental risk (uncommitted changes are not part of the HEAD build) |
| gapp deploy --ref=v1.0.0 | No: an explicit ref makes the local working state irrelevant |
| gapp deploy for a solution with all remote sources | N/A: the local tree doesn't feed any build |
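The gate reduces to a small predicate, sketched here under the assumption that dirtiness is detected with git status --porcelain. Both function names are hypothetical; the issue does not say how core.py checks the tree today.

```python
import subprocess
from typing import Optional


def is_tree_dirty(repo_dir: str) -> bool:
    """Return True if the working tree has uncommitted or untracked changes."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() != ""


def refuse_dirty_tree(has_local_code: bool, cli_ref: Optional[str], tree_dirty: bool) -> bool:
    """Return True when deploy should abort because of uncommitted changes."""
    if not has_local_code:
        return False  # the local tree does not feed any build
    if cli_ref is not None:
        return False  # explicit ref: working-tree state is irrelevant
    return tree_dirty
```

Keeping the decision separate from the git call makes the three table rows easy to unit-test without a repository.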
v-3 Contract Preservation
None of these changes intrinsically requires a contract bump:
- Label shape unchanged
- Bucket name shape unchanged
- gapp-env semantics unchanged
- Reserved env names unchanged
The TF outputs map is internal — outputs are not part of the contract. Workspace state collapse changes state-layout semantics, but the workspace-mode prefix layout is documented as deprecated rather than treated as a contract change.
If the new TF template emits different terraform resource addresses for an equivalent single-service deploy, existing TF state stops resolving and terraform apply would want to destroy + recreate. Mitigation, in priority order:
1. Preserve addresses. Engineer the template so single-service resource addresses are byte-identical to the v-3 template. CI runs terraform plan against a snapshotted v-3 state and asserts zero changes.
2. moved {} blocks (Terraform 1.1+). Declare address renames inside the template; TF auto-migrates state references on next apply, no manual state mv, no contract bump. This is the strongest escape hatch and almost always sufficient.
3. Targeted terraform state mv. For solutions already deployed, surgically rename resources in state to match new addresses. Tractable for small N.
4. Template carve-out. Keep the existing v-3 template path active for solutions already deployed under it; new composed solutions use the new template. Two paths, single contract.
5. Defer or scope-cut the part of the refactor that would change addresses.
v-4 is not on the table without a separate design discussion. Only after exhausting (1)–(5) does a contract bump enter the conversation, and only with explicit owner sign-off.
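The zero-change assertion from mitigation (1) could be sketched as follows. It assumes CI saves a plan with terraform plan -out=tfplan and renders it with terraform show -json tfplan. The function name changed_addresses is hypothetical; the resource_changes and change.actions fields are part of Terraform's JSON plan representation.

```python
import json


def changed_addresses(plan_json: str) -> list[str]:
    """List resource addresses whose planned actions are not a no-op.

    Expects the JSON text produced by `terraform show -json <planfile>`.
    """
    plan = json.loads(plan_json)
    return [
        rc["address"]
        for rc in plan.get("resource_changes", [])
        if rc.get("change", {}).get("actions", []) != ["no-op"]
    ]
```

A CI step would fail the build whenever this list is non-empty for the snapshotted v-3 state. The check is deliberately strict: any action other than no-op, including a data-source read, is reported.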
Out of Scope
- Per-service --ref syntax (e.g., --ref worker=v2.0.0). Multi-service builds with mixed refs require editing gapp.yaml.
- Composition (services: plural + remote source:). See the composition issue.
- Changing the contract major to v-4.
Work Breakdown
- Delete the workspace-mode state-prefix branching in core.py deploy and status paths
- TF template emits service_urls map output (single-entry for single-service solutions)
- Status reads service_urls map; falls back to service_url string for legacy state
- Add --ref flag to gapp deploy CLI with the rules table above
- Refuse --ref when no service uses local code; clear error message
- Relax the dirty-tree gate when --ref is given
- Add moved {} blocks in the new TF template if any resource addresses change
- CI: terraform plan against a v-3-shape state must show zero diff for an unchanged single-service solution
- Tests: --ref flag honored for local-code services, --ref refused for all-remote solutions, dirty-tree gate relaxed under --ref
- CONTRIBUTING.md: document one-state-per-solution as the contract, document workspace-mode prefixes as deprecated, capture the v-4 escalation rules above as the standing policy
- Document the --ref flag, dirty-tree relaxation, and the single-state-per-solution contract
- Document the terraform moved {} mitigation pattern in CONTRIBUTING.md as the standard approach when a template needs to rename resource addresses
- CONTRIBUTING.md: workspace-prefix-layout deployments cannot be applied under the new template without a one-time terraform state mv; document the exact commands