Single-state-per-solution + cloud-only status outputs map + gapp deploy --ref flag #32

# Tier 2: Single-state-per-solution + cloud-only status + `gapp deploy --ref`

## Summary

Three coupled changes to the build/state/status path:

1. Drop workspace-mode TF state partitioning. One TF state per solution, always.
2. Status reads service URLs from a TF outputs map (`service_urls`), not a single `service_url` string.
3. Add `gapp deploy --ref=<ref>` to override the local-repo build ref.

## Background

### Today's behavior

**Workspace-mode TF state.** When a solution declares `paths:` in gapp.yaml (multi-service workspace mode), each sub-service's TF state is partitioned at `gs://gapp-{solution}-{project}/terraform/state/{svc-name}/`. Single-service solutions write to `gs://gapp-{solution}-{project}/terraform/state/`. Promoting a single-service solution to multi-service therefore requires manually moving state objects between prefixes; the transition is not seamless.

**Status output read.** `status()` in `core.py` reads `service_url` (a string) from `terraform output -json`. For multi-service solutions it iterates a list constructed from `paths:` in the local manifest and reads each sub-state separately. This forces status to load and validate the local manifest before any cloud read can complete.

**Deploy ref.** `gapp deploy` builds from `git archive HEAD` of the local repo. No CLI override exists for selecting a different local ref. A clean working tree is required.

### Why this is wrong

The "solution" abstraction is meant to be a single deployable unit:

- One bucket: `gapp-{solution}-{project}` ✓
- One project label: `gapp_<owner>_<solution>=v-N` ✓
- One TF state — but workspace mode partitions this ✗

If state is per-service, "solution" becomes a docstring with no resources at the solution level, and the bucket is the only shared resource. That is an inconsistent middle ground: the bucket and label are per-solution while the state is per-service.

Common practice in the Terraform community is one state per "deployment unit": the smallest thing typically applied or destroyed together. Split state only when you need:

- Independent lifecycles (apply A without touching B)
- Blast radius isolation (a bad apply on A doesn't risk B)
- Permission boundaries (different humans manage different services)

For the typical user (single developer, single repo, services that ship together), none of those apply. Splitting state imposes complexity (multi-init, multi-plan, drift across multiple states, migrations between modes) for zero benefit. Per-service SAs, IAM, and secrets are already isolated at the resource level inside one state.

## Proposed Changes

### 1. Single state per solution, always

- TF state always lives at `gs://gapp-{solution}-{project}/terraform/state/`
- Multiple services in a solution are TF resources/modules within that one state
- The workspace-mode state-prefix branching in `core.py` deploy and status paths is deleted
- The `paths:` field in gapp.yaml retains its meaning (multiple services) but no longer changes state location
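
A minimal sketch of the state-location rule after this change. The helper names `state_location` and `legacy_state_location` are hypothetical, not existing `core.py` functions; the legacy variant is included only to contrast with the workspace-mode layout described in Background.

```python
from typing import Optional


def state_location(solution: str, project: str) -> str:
    """Return the single TF state prefix for a solution (proposed layout)."""
    return f"gs://gapp-{solution}-{project}/terraform/state/"


def legacy_state_location(
    solution: str, project: str, svc_name: Optional[str] = None
) -> str:
    """Return the pre-change prefix; workspace mode appended the service name."""
    base = f"gs://gapp-{solution}-{project}/terraform/state/"
    return f"{base}{svc_name}/" if svc_name else base
```

Under the proposal, the service name never enters the state path, so promoting a solution from one service to several leaves the prefix unchanged.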

### 2. Status reads service URLs from outputs map

TF templates emit `service_urls` (map) instead of `service_url` (string):

```hcl
output "service_urls" {
  value = { for s in var.services : s.name => google_cloud_run_service.svc[s.name].status[0].url }
}
```

Single-service solutions emit a 1-entry map keyed by the solution name. This produces one code path in status:

1. Resolve solution → resolve project (cloud read)
2. `terraform output -json` → get `service_urls` map
3. Iterate, probe `/health` per service

Status no longer needs the local manifest. The bucket and TF state are the source of truth for what the solution actually has deployed.
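
The three steps above could be sketched roughly as follows. This assumes the caller has already captured the text of `terraform output -json`, in which each output is wrapped in an object with a `value` key. The function names `parse_service_urls` and `check_health`, and the injected `probe` callable, are hypothetical and chosen for illustration only.

```python
import json
from typing import Callable, Dict


def parse_service_urls(tf_output_json: str) -> Dict[str, str]:
    """Extract the service_urls map from `terraform output -json` text."""
    outputs = json.loads(tf_output_json)
    # Each output is wrapped as {"value": ..., "type": ..., "sensitive": ...}.
    return dict(outputs["service_urls"]["value"])


def check_health(
    urls: Dict[str, str], probe: Callable[[str], bool]
) -> Dict[str, bool]:
    """Probe <url>/health for every service; `probe` performs the HTTP call."""
    return {
        name: probe(url.rstrip("/") + "/health") for name, url in urls.items()
    }
```

Passing the probe in as a callable keeps the sketch free of network calls and makes the iteration easy to test; a real implementation would supply an HTTP client here.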

**Backward compat.** During the transition, status reads `service_urls` (map) first and falls back to `service_url` (string) for older deployments. After all live deployments have been re-deployed under the new template, the fallback can be removed.
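
One possible shape for the fallback, assuming `outputs` is the already-parsed `terraform output -json` object. The function name is hypothetical, and keying the legacy URL by the solution name is an assumption made here to match the 1-entry map that new single-service templates emit.

```python
from typing import Any, Dict


def service_urls_from_outputs(
    outputs: Dict[str, Any], solution: str
) -> Dict[str, str]:
    """Prefer the service_urls map; fall back to the legacy service_url string."""
    if "service_urls" in outputs:
        return dict(outputs["service_urls"]["value"])
    if "service_url" in outputs:
        # Legacy single-service deployment: key the lone URL by solution name.
        return {solution: outputs["service_url"]["value"]}
    # Neither output present: nothing deployed that status can report on.
    return {}
```

Removing the fallback later means deleting the second branch only; callers keep receiving a map either way.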

### 3. `gapp deploy --ref=<ref>` flag

Single coherent semantic: **`--ref` selects which ref of the local repo to use as the build context for any service drawing from it.**

| Service shape | What `--ref=v1.0.0` does |
|---|---|
| Singular `service:` (no `source:` block) | Build local repo at v1.0.0 instead of HEAD |
| Singular `service:` with remote `source:` | **Refuses** — no local code involved, flag has nothing to apply to |
| Plural `services:`, all local-code entries | Build local repo at v1.0.0 once; all entries share that build context |
| Plural `services:` mixed (some local, some remote) | Local entries built from v1.0.0; remote entries keep their pinned `source.ref` |
| Plural `services:`, all remote | **Refuses** — no local code involved |

Remote-sourced services have their own pinned `source.ref` in gapp.yaml (see the composition issue). CLI `--ref` does not override them. To bump a remote pin, edit gapp.yaml and commit.
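
The rules table can be read as a single resolution function. The sketch below is illustrative only: the `Service` dataclass and `resolve_build_refs` are hypothetical names, and a service is treated as remote exactly when it carries a pinned `source_ref`.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Service:
    name: str
    source_ref: Optional[str] = None  # set only for remote `source:` entries


def resolve_build_refs(
    services: List[Service], cli_ref: Optional[str]
) -> Dict[str, str]:
    """Map each service name to the ref its build context comes from."""
    has_local = any(s.source_ref is None for s in services)
    if cli_ref is not None and not has_local:
        # All-remote (or singular remote) solutions: the flag has no target.
        raise ValueError("--ref refused: no service builds from the local repo")
    local_ref = cli_ref if cli_ref is not None else "HEAD"
    return {
        s.name: (s.source_ref if s.source_ref is not None else local_ref)
        for s in services
    }
```

Because every local entry resolves to the same `local_ref`, the local repo only needs to be archived once per deploy, which matches the "all entries share that build context" row.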

**Dirty-tree gate.**

| Scenario | Refuse on dirty tree? |
|---|---|
| `gapp deploy` (any local-code service) | Yes — accidental risk |
| `gapp deploy --ref=v1.0.0` | No — explicit ref means local working state irrelevant |
| `gapp deploy` for a solution with all remote sources | N/A — local tree doesn't feed any build |
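
The gate in the table reduces to a small predicate. This is a sketch with a hypothetical function name; it assumes the caller has already determined whether the working tree is dirty and whether any service builds from local code.

```python
from typing import Optional


def must_refuse_dirty_tree(
    tree_dirty: bool, cli_ref: Optional[str], has_local_service: bool
) -> bool:
    """Return True when the deploy should be refused because of a dirty tree."""
    if not has_local_service:
        return False  # local tree feeds no build
    if cli_ref is not None:
        return False  # explicit ref: working tree state is irrelevant
    return tree_dirty
```

The separate rule that refuses `--ref` for all-remote solutions is not part of this predicate; it would be checked before the gate runs.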

## v-3 Contract Preservation

None of these changes intrinsically requires a contract bump:

- Label shape unchanged
- Bucket name shape unchanged
- gapp-env semantics unchanged
- Reserved env names unchanged

The TF outputs map is internal — outputs are not part of the contract. Workspace state collapse changes state-layout semantics, but the workspace-mode prefix layout is documented as deprecated rather than treated as a contract change.

If the new TF template emits **different terraform resource addresses** for an equivalent single-service deploy, existing TF state no longer matches the configuration and `terraform apply` would plan to destroy and recreate those resources. Mitigations, in priority order:

1. **Preserve addresses.** Engineer the template so single-service resource addresses are byte-identical to the v-3 template. CI runs `terraform plan` against a snapshotted v-3 state and asserts zero changes.
2. **`moved {}` blocks** (Terraform 1.1+). Declare address renames inside the template; TF auto-migrates state references on next apply, no manual `state mv`, no contract bump. This is the strongest escape hatch and almost always sufficient.
3. **Targeted `terraform state mv`.** For solutions already deployed, surgically rename resources in state to match new addresses. Tractable for small N.
4. **Template carve-out.** Keep the existing v-3 template path active for solutions already deployed under it; new composed solutions use the new template. Two paths, single contract.
5. **Defer or scope-cut** the part of the refactor that would change addresses.

**v-4 is not on the table without a separate design discussion.** Only after exhausting (1)–(5) does a contract bump enter the conversation, and only with explicit owner sign-off.
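
For the zero-change assertion in mitigation (1), one option is to inspect the JSON plan representation. The sketch below assumes CI has already saved a plan file and rendered it with `terraform show -json`; the `resource_changes` list and per-resource `change.actions` field come from that JSON format, while the function name is hypothetical.

```python
import json
from typing import List


def changed_addresses(plan_json: str) -> List[str]:
    """List resource addresses whose planned actions are anything but no-op."""
    plan = json.loads(plan_json)
    return [
        rc["address"]
        for rc in plan.get("resource_changes", [])
        if rc["change"]["actions"] != ["no-op"]
    ]
```

A CI step could fail when the returned list is non-empty and print the addresses. Note that this treats every action other than `no-op` as a change, which may be stricter than needed if the plan includes data source reads, and it does not look at output changes.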

## Work Breakdown

- [ ] Drop workspace-mode state-prefix branching in `core.py` deploy and status paths
- [ ] Update TF templates to emit `service_urls` map output (single-entry for single-service solutions)
- [ ] Status reads `service_urls` map; falls back to `service_url` string for legacy state
- [ ] Add `--ref` flag to `gapp deploy` CLI with the rules table above
- [ ] Refuse `--ref` when no service uses local code; clear error message
- [ ] Loosen the dirty-tree gate when `--ref` is given
- [ ] Use `moved {}` blocks in the new TF template if any resource addresses change
- [ ] CI: `terraform plan` against a v-3-shape state must show zero diff for an unchanged single-service solution
- [ ] Tests: state-shape-collapse for single-service deploys, multi-service via TF resources within one state, status reads outputs map, `--ref` flag honored for local-code services, `--ref` refused for all-remote solutions, dirty-tree gate relaxed under `--ref`
- [ ] Update `CONTRIBUTING.md`: document one-state-per-solution as the contract, document workspace-mode prefixes as deprecated, capture the v-4 escalation rules above as the standing policy
- [ ] Update the deploy skill: describe `--ref` flag, dirty-tree relaxation, and the single-state-per-solution contract
- [ ] Capture the `terraform moved {}` mitigation pattern in `CONTRIBUTING.md` as the standard approach when a template needs to rename resource addresses
- [ ] Capture state-collapse migration policy in `CONTRIBUTING.md`: workspace-prefix-layout deployments cannot be applied under the new template without a one-time `terraform state mv`; document the exact commands

## Out of Scope

- Per-service `--ref` syntax (e.g., `--ref worker=v2.0.0`). Multi-service builds with mixed refs require editing gapp.yaml.
- Composition (`services:` plural + remote `source:`). See the composition issue.
- Changing the contract major to v-4.

