feat(iac): plan-time JIT resolver against state outputs#576
Merged
Conversation
Adds TryResolveSpec that substitutes resolvable references while leaving
unresolvable ones verbatim. Malformed refs (${}, ${.x}, ${x.}) remain
hard errors matching ResolveSpec's contract. Returns sorted unresolved
list for plan-time diagnostics.
Also adds lookupModuleField helper shared by strict and lenient paths.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds resolveSpecsAgainstState that applies lenient JIT substitution against existing state outputs at plan time. MODULE.field refs in state collapse to literals; infra_output-typed secrets are also pre-resolved. Unresolvable refs return ResolutionDiagnostic entries for plan output. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Inserts resolveSpecsAgainstState between loadCurrentState and computePlanForInfraSpecs in runInfraPlan. MODULE.field refs that exist in state collapse to literals before Diff, eliminating the TC2 spurious-replace class. Pending-JIT refs surface in plan output. Includes TC2 regression test with a mock DropletDriver that reproduces the field-compare spurious-replace behavior. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Inserts resolveSpecsAgainstState after infraSpecs filtering and before provider grouping in applyInfraModules. Mirrors the plan-path change so the apply path sees real values for MODULE.field refs when the source module is already in state. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds the Unreleased CHANGELOG entry describing plan-time JIT resolver and TryResolveSpec. Adds ADR 0013 (workflow-side copy of core-dump decisions/0009) documenting the design rationale and alternatives. Also cleans up unused countingStateStore from apply test. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
Adds a plan-time “lenient JIT” substitution pass for IaC specs so wfctl infra plan / wfctl infra apply can resolve ${MODULE.field} and infra_output-typed ${VAR} references from existing state before diffing, reducing spurious replace actions.
Changes:
- Introduces
jitsubst.TryResolveSpecto resolve what’s available while preserving unresolved references (with diagnostics) and still hard-failing malformed refs. - Wires plan-time resolution into
runInfraPlanandapplyInfraModulesvia a newresolveSpecsAgainstStatehelper. - Adds ADR 0013 + changelog entry and regression tests covering the TC2 spurious-replace scenario and determinism of hashing inputs.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
iac/jitsubst/jitsubst.go |
Adds lenient spec resolver (TryResolveSpec) plus shared lookup helper. |
iac/jitsubst/jitsubst_test.go |
Adds unit tests for lenient resolution behavior and malformed-ref rejection. |
cmd/wfctl/infra_resolve_state.go |
Implements plan-time state/env-based resolution + diagnostics collection. |
cmd/wfctl/infra.go |
Invokes plan-time resolution during wfctl infra plan prior to diff planning. |
cmd/wfctl/infra_apply.go |
Mirrors the same resolution step in the non---plan apply path. |
cmd/wfctl/infra_resolve_state_test.go |
Tests resolver behavior (module refs, infra_output secrets, determinism). |
cmd/wfctl/infra_plan_resolve_state_test.go |
Regression test ensuring TC2 scenario yields no spurious replace during planning. |
cmd/wfctl/infra_apply_resolve_state_test.go |
Regression test for apply path (no spurious replace before diff plan). |
decisions/0013-jit-plan-time-resolver-against-state.md |
ADR documenting the motivation, approach, and consequences. |
CHANGELOG.md |
Documents the new plan-time resolver behavior and new API surface. |
Comment on lines
+267
to
+275
| wfCfgForResolver, cfgLoadErr := config.LoadFromFile(cfgFile) | ||
| if cfgLoadErr != nil { | ||
| return fmt.Errorf("load config for plan-time resolver: %w", cfgLoadErr) | ||
| } | ||
| var resolutionDiags []ResolutionDiagnostic | ||
| desired, resolutionDiags, err = resolveSpecsAgainstState(desired, current, wfCfgForResolver, envName) | ||
| if err != nil { | ||
| return fmt.Errorf("resolve specs against state: %w", err) | ||
| } |
Comment on lines
+276
to
+285
| if len(resolutionDiags) > 0 { | ||
| // Deferred-resolution notice printed after plan table (see below). | ||
| // Store in closure so the print lands after the plan table. | ||
| defer func() { | ||
| fmt.Println() | ||
| fmt.Println("Pending JIT resolution (apply-time):") | ||
| for _, d := range resolutionDiags { | ||
| fmt.Printf(" %s: ${%s}\n", d.ResourceName, d.Ref) | ||
| } | ||
| }() |
Comment on lines
+157
to
+160
| // Used by cmd/wfctl/infra.go::resolveSpecsAgainstState (PR-1) to | ||
| // substitute state-output and env-var references at plan time so that | ||
| // driver.Diff sees real values instead of literal templates. Surviving | ||
| // unresolved refs flow through to apply-time strict ResolveSpec. |
Comment on lines
+284
to
+312
| // lookupModuleField factors the ${MODULE.id|field} resolution from | ||
| // resolveRef so both strict and lenient paths share it. Returns | ||
| // (value, true) when found; (_, false) when missing. | ||
| func lookupModuleField( | ||
| module, field string, | ||
| replaceIDMap map[string]string, | ||
| syncedOutputs map[string]map[string]any, | ||
| ) (string, bool) { | ||
| if field == "id" { | ||
| if id, ok := replaceIDMap[module]; ok { | ||
| return id, true | ||
| } | ||
| if outs, ok := syncedOutputs[module]; ok { | ||
| if v, ok := outs["id"]; ok { | ||
| return fmt.Sprintf("%v", v), true | ||
| } | ||
| } | ||
| return "", false | ||
| } | ||
| outs, ok := syncedOutputs[module] | ||
| if !ok { | ||
| return "", false | ||
| } | ||
| v, ok := outs[field] | ||
| if !ok { | ||
| return "", false | ||
| } | ||
| return fmt.Sprintf("%v", v), true | ||
| } |
Comment on lines
+53
to
+60
| const vpcID = "14badc41-1234-5678-abcd-ef0123456789" | ||
| t.Setenv("STAGING_VPC_UUID", vpcID) | ||
| writeApplyStateFile(t, stateDir, "core-dump-vpc", "infra.vpc", vpcID, | ||
| map[string]any{"id": vpcID, "cidr": "10.0.0.0/16"}, | ||
| map[string]any{"provider": "do-provider", "cidr": "10.0.0.0/16"}) | ||
| writeApplyStateFile(t, stateDir, "coredump-staging-pg", "infra.droplet", "droplet-99", | ||
| map[string]any{"vpc_uuid": vpcID}, | ||
| map[string]any{"provider": "do-provider", "vpc_uuid": vpcID, "size": "s-1vcpu-2gb"}) |
Comment on lines
+178
to
+182
| planFile := filepath.Join(dir, "plan.json") | ||
| if err := runInfraPlan([]string{"--config", cfgPath, "--output", planFile}); err != nil { | ||
| t.Fatalf("runInfraPlan: %v", err) | ||
| } | ||
|
|
⏱ Benchmark Results✅ No significant performance regressions detected. benchstat comparison (baseline → PR)
|
- resolveRef now delegates ${MODULE.field} lookup to lookupModuleField
so strict and lenient paths share one implementation (no drift)
- Remove defer for resolution diagnostics; print only after successful
plan output (not on error returns from computePlan)
- --plan apply path: mirror resolveSpecsAgainstState before hashing so
plan.DesiredHash (post-resolution) matches the stale-check recompute
- Fix TryResolveSpec doc comment: references infra_resolve_state.go, not infra.go
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment on lines
+53
to
+88
| const vpcID = "14badc41-1234-5678-abcd-ef0123456789" | ||
| t.Setenv("STAGING_VPC_UUID", vpcID) | ||
| writeApplyStateFile(t, stateDir, "core-dump-vpc", "infra.vpc", vpcID, | ||
| map[string]any{"id": vpcID, "cidr": "10.0.0.0/16"}, | ||
| map[string]any{"provider": "do-provider", "cidr": "10.0.0.0/16"}) | ||
| writeApplyStateFile(t, stateDir, "coredump-staging-pg", "infra.droplet", "droplet-99", | ||
| map[string]any{"vpc_uuid": vpcID}, | ||
| map[string]any{"provider": "do-provider", "vpc_uuid": vpcID, "size": "s-1vcpu-2gb"}) | ||
|
|
||
| cfgPath := filepath.Join(dir, "infra.yaml") | ||
| if err := os.WriteFile(cfgPath, []byte(` | ||
| modules: | ||
| - name: do-provider | ||
| type: iac.provider | ||
| config: | ||
| provider: test-cloud | ||
|
|
||
| - name: do-state | ||
| type: iac.state | ||
| config: | ||
| backend: filesystem | ||
| directory: `+stateDir+` | ||
|
|
||
| - name: core-dump-vpc | ||
| type: infra.vpc | ||
| config: | ||
| provider: do-provider | ||
| cidr: "10.0.0.0/16" | ||
|
|
||
| - name: coredump-staging-pg | ||
| type: infra.droplet | ||
| config: | ||
| provider: do-provider | ||
| vpc_uuid: "${STAGING_VPC_UUID}" | ||
| size: s-1vcpu-2gb | ||
| `), 0o600); err != nil { |
Comment on lines
+145
to
+150
| // We inject through resolveIaCProvider; computeInfraPlan is used by | ||
| // applyWithProviderAndStore which calls p.Plan — handled above. | ||
|
|
||
| if err := runInfraApply([]string{"--config", cfgPath, "--auto-approve"}); err != nil { | ||
| t.Fatalf("runInfraApply: %v", err) | ||
| } |
Comment on lines
+41
to
+103
| // tc2MockProvider is an IaCProvider that returns a Diff-based plan | ||
| // simulating the DO DropletDriver behavior: if the desired config's | ||
| // vpc_uuid field differs from the state's, it emits a replace action. | ||
| // This faithfully reproduces the TC2 spurious-replace root cause. | ||
| type tc2MockProvider struct { | ||
| applyCapture | ||
| } | ||
|
|
||
| func (p *tc2MockProvider) Plan(_ context.Context, desired []interfaces.ResourceSpec, current []interfaces.ResourceState) (*interfaces.IaCPlan, error) { | ||
| // Build a lookup of current state by name. | ||
| currentMap := make(map[string]interfaces.ResourceState, len(current)) | ||
| for _, s := range current { | ||
| currentMap[s.Name] = s | ||
| } | ||
| var actions []interfaces.PlanAction | ||
| for _, spec := range desired { | ||
| rs, exists := currentMap[spec.Name] | ||
| if !exists { | ||
| actions = append(actions, interfaces.PlanAction{Action: "create", Resource: spec}) | ||
| continue | ||
| } | ||
| // Simulate DropletDriver.Diff: for each field in desired config, | ||
| // compare against the stored state config. If any field differs, | ||
| // emit a replace action (ForceNew semantics for vpc_uuid). | ||
| // | ||
| // This faithfully reproduces the TC2 root cause: before plan-time | ||
| // resolution, spec.Config["vpc_uuid"] was "${STAGING_VPC_UUID}" | ||
| // but rs.Config["vpc_uuid"] was the real UUID — mismatch → replace. | ||
| // | ||
| // After plan-time resolution, spec.Config["vpc_uuid"] is the real | ||
| // UUID (matching rs.Config["vpc_uuid"]) → no action. | ||
| anyDiff := false | ||
| for k, desiredVal := range spec.Config { | ||
| if k == "provider" { | ||
| continue // provider ref is metadata, not a cloud field | ||
| } | ||
| stateVal, hasField := rs.AppliedConfig[k] | ||
| if !hasField { | ||
| // Check state Config (iacStateRecord.config maps to AppliedConfig | ||
| // but some backends persist as Outputs). Check outputs too. | ||
| stateVal, hasField = rs.Outputs[k] | ||
| } | ||
| if !hasField || fmt.Sprintf("%v", desiredVal) != fmt.Sprintf("%v", stateVal) { | ||
| anyDiff = true | ||
| break | ||
| } | ||
| } | ||
| if anyDiff { | ||
| actions = append(actions, interfaces.PlanAction{ | ||
| Action: "replace", | ||
| Resource: spec, | ||
| Current: &rs, | ||
| }) | ||
| } | ||
| } | ||
| plan := &interfaces.IaCPlan{Actions: actions} | ||
| return plan, nil | ||
| } | ||
|
|
||
| func (p *tc2MockProvider) ResourceDriver(_ string) (interfaces.ResourceDriver, error) { | ||
| return nil, nil | ||
| } | ||
|
|
Both tc2 regression tests previously injected a mock provider via resolveIaCProvider but computePlanForInfraSpecs calls computeInfraPlan (platform.ComputePlan) which falls back to ConfigHash comparison when ResourceDriver is nil — making the tests vacuously green. Fix: override computeInfraPlan with a TC2 field-compare function that simulates DropletDriver.Diff behavior. Also: - Plan test: remove tc2MockProvider.Plan (dead code, never called) - Apply test: use secrets.generate infra_output + provider:env instead of t.Setenv so the test validates the state-output resolution path, not the os.LookupEnv fallback; add infra.auto_bootstrap:false to bypass bootstrap Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment on lines
+43
to
+47
| out := make([]interfaces.ResourceSpec, len(specs)) | ||
| var diags []ResolutionDiagnostic | ||
| for i, spec := range specs { | ||
| resolved, unresolved, err := jitsubst.TryResolveSpec(spec, nil, syncedOutputs, envLookup) | ||
| if err != nil { |
Comment on lines
258
to
+265
| desired = filterSpecsByInclude(desired, planIncludeSet) | ||
| current = filterStatesByInclude(current, planIncludeSet) | ||
|
|
||
| // Plan-time JIT resolution (PR-1): substitute ${MODULE.field} and | ||
| // ${SECRET} refs against current state so driver.Diff sees real | ||
| // values instead of literal templates. Refs whose source isn't in | ||
| // state stay templated; planRequiresJITSubstitution detects them | ||
| // and SchemaVersion=2 stamping handles the apply-time path. |
Comment on lines
186
to
+206
| // Validate and apply the include filter to both specs and state before | ||
| // grouping by provider so that out-of-scope resources are never passed | ||
| // down to any provider. | ||
| if err := validateIncludeSet(includeSet, infraSpecs, current); err != nil { | ||
| return err | ||
| } | ||
| infraSpecs = filterSpecsByInclude(infraSpecs, includeSet) | ||
| current = filterStatesByInclude(current, includeSet) | ||
|
|
||
| // Load full config to resolve iac.provider module definitions. | ||
| cfg, err := config.LoadFromFile(cfgFile) | ||
| if err != nil { | ||
| return fmt.Errorf("load config: %w", err) | ||
| } | ||
|
|
||
| // Plan-time JIT resolution (PR-1): substitute ${MODULE.field} and | ||
| // ${SECRET} refs against current state so driver.Diff sees real | ||
| // values instead of literal templates. Apply does not print the | ||
| // diagnostics — they're plan-output sugar only. | ||
| infraSpecs, _, err = resolveSpecsAgainstState(infraSpecs, current, cfg, envName) | ||
| if err != nil { |
Clarifies why collapsing ${MODULE.id} at plan time is safe for the
TC2 fix scenario: when parent is being replaced, the dependent shows
no-change in the plan (both desired and state have the same old ProviderID),
so there is no plan action carrying a wrong literal ProviderID. The
replace-cascade path (dependent has an action AND parent's id changes)
still flows through apply-time JIT via ReplaceIDMap as before.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
jitsubst.TryResolveSpec— lenient sibling ofResolveSpecfor plan-time use. Substitutes resolvable${MODULE.field}and${VAR}refs; leaves unresolvable refs verbatim with diagnostics. Malformed refs remain hard errors.cmd/wfctl/infra_resolve_state.go::resolveSpecsAgainstState— wiresTryResolveSpecintorunInfraPlanandapplyInfraModules. Resolves MODULE.field refs against state outputs andinfra_output-typed secrets at plan time so driver Diff sees real values instead of literal templates.DropletDriver.Diff(and similar) comparing${STAGING_VPC_UUID}vs real UUID now sees the resolved UUID on both sides → no spurious replace.Refs whose source module is not yet in state stay templated for apply-time JIT (existing W-5 path unchanged). ADR 0013 added.
Test plan
go test ./iac/jitsubst/ ./cmd/wfctl/ -vpasses (all 5 tasks covered)golangci-lint run ./iac/jitsubst/ ./cmd/wfctl/clean (pre-existing gosec G115 only)TestRunInfraPlan_TC2RegressionScenario— mock DropletDriver-style Diff sees resolved UUID → 0 actionsTestRunInfraApply_ResolvesSpecsBeforeComputePlan— apply path mirrors plan, no spurious replaceTestResolveSpecsAgainstState_HashByteStable— 5 iterations produce identicaldesiredStateHashTestInfraPlan_SchemaVersion*tests all pass — resolver doesn't break JIT detectionReferences
decisions/0013-jit-plan-time-resolver-against-state.md🤖 Generated via writing-plans → executing-plans pipeline