Skip to content

feat: infra.dns_delegation resource type (v0.2.0)#5

Merged
intel352 merged 20 commits into
mainfrom
feat/dns-delegation
May 21, 2026
Merged

feat: infra.dns_delegation resource type (v0.2.0)#5
intel352 merged 20 commits into
mainfrom
feat/dns-delegation

Conversation

@intel352
Copy link
Copy Markdown
Contributor

Summary

Adds the infra.dns_delegation resource type so wfctl can set a domain's registrar-level nameservers via the Hover control-panel API. Field-test target: gocodealone.tech → ns1/2/3.digitalocean.com.

Design + plan: docs/plans/2026-05-20-hover-dns-delegation-design.md + docs/plans/2026-05-20-hover-dns-delegation.md. Passed 3 rounds of design-phase + 2 rounds of plan-phase adversarial review + alignment-check + scope-lock.

Endpoint (captured from Hover UI 2026-05-20)

PUT /api/control_panel/domains/domain-<name>
Content-Type: application/json
X-CSRF-Token: <rails-csrf>

{"field":"nameservers","value":["ns1.digitalocean.com",...]}

What changed

  • DomainDelegation type — distinct from Domain (avoids endpoint-shape ambiguity).
  • ErrEmptyNameservers sentinel — loud failure on empty Read instead of silent re-apply thrash.
  • SetNameservers holds c.mu across full auth → CSRF → PUT (no TOCTOU). ensureLogin split into public + Locked variants.
  • GetDomainDelegation via the same control-panel API family as the PUT — far more likely to surface nameservers reliably than the DNS-records endpoint.
  • DelegationDriver registered as infra.dns_delegation alongside existing infra.dns. Shared *hover.Client.
  • plugin.json declares the new resource type.
  • Outputs structpb-safe ([]any not []string), guarded by round-trip tests.
  • Delete = reset to [ns1.hover.com, ns2.hover.com] (Hover defaults). Restore-from-stash deferred to v0.3.0 (requires interfaces.ResourceRef to gain a state channel — currently {Name, Type, ProviderID} only).

Test plan

  • gofmt -l . clean
  • GOWORK=off go vet ./... clean
  • GOWORK=off go build ./... clean
  • GOWORK=off go test ./... -count=1 all PASS (3 packages: internal, internal/drivers, internal/hover)
  • Field test: wfctl apply against gocodealone.tech via workflow_dispatch in gocodealone-multisite; verify Hover UI shows DO nameservers post-apply (deferred to a separate session).

Follow-ups (deferred to separate sessions)

  1. workflow-registry manifest bump to v0.2.0 (cannot author until goreleaser publishes asset SHAs after this merges + the v0.2.0 tag fires).
  2. gocodealone-multisite: config/dns.wfctl.yaml + .github/workflows/dns-delegation.yml (manual workflow_dispatch); operator runs apply against gocodealone.tech.
  3. v0.3.0: stash previous_nameservers at Create / restore at Delete (requires workflow interface change to surface last-applied Outputs to Delete).

🤖 Generated with Claude Code

intel352 and others added 17 commits May 20, 2026 20:25
New resource type adds registrar-level nameserver management
alongside the existing infra.dns record management. Endpoint
captured from the Hover web UI: PUT /api/control_panel/domains/
domain-<name> with X-CSRF-Token header + {field, value} JSON.

Decisions locked via brainstorming:
- Resource type: infra.dns_delegation (matches AWS/GCP naming
  room for equivalents; semantically distinct from infra.dns).
- Delete: reset to Hover defaults [ns1.hover.com, ns2.hover.com].
- CSRF: fetch fresh per PUT.
- Field test: gocodealone.tech end-to-end via
  workflow_dispatch GHA inside gocodealone-multisite.

Assumptions, rollback, and top-3 doubts captured in the doc for
adversarial-review attack.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses 2 Critical + 3 Important + 3 Minor findings from
adversarial review:

CRITICAL:
- Read endpoint uncertainty (A6): switched primary Read from
  /api/domains/<name>/dns to /api/control_panel/domains/domain-<name>
  (same API family as the PUT — far more likely to surface the
  nameservers field reliably). First implementation task =
  curl-verify the response shape.
- Outputs["nameservers"] encoding: explicitly spec'd as []any
  (not []string) per the structpb boundary invariant. Helper
  + round-trip JSON test.

IMPORTANT:
- CSRF source ambiguity: documented the two distinct regexes
  (form token on /signin; meta tag on /control_panel/). Cited
  that both shapes coexist in the captured browser session.
- ensureLogin + fetchControlPanelCSRF concurrency: new *Locked
  helpers; SetNameservers holds c.mu across both the CSRF GET
  and the PUT. Eliminates the race window.
- Delete hardcoded defaults: primary path stashes
  previous_nameservers at Create; Delete restores from state.
  Hardcoded fallback only for state-less resources.

MINOR:
- Sequencing: deferred registry manifest + multisite YAML to a
  separate session post-goreleaser.
- Validation: relaxed to >=1 nameserver.
- Field test: success = Hover UI shows new NS; dig is separate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses 1 Critical + 2 Important from round 2:

CRITICAL:
- TOCTOU between ensureLogin + c.mu.Lock in SetNameservers:
  new ensureLoginLocked helper; SetNameservers holds c.mu for
  the entire auth -> CSRF -> PUT sequence. No interleaving
  window remains.

IMPORTANT:
- Domain struct dual-population ambiguity: introduced distinct
  DomainDelegation type returned by GetDomainDelegation.
  Existing Domain struct unchanged.
- Silent Apply thrash on empty nameservers: GetDomainDelegation
  returns typed ErrEmptyNameservers on zero entries. Loud
  failure at first plan instead of silent re-apply loop.

Minor findings tracked for plan-phase pickup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Round 3 found 0 Critical + 2 Important + 3 Minor; reviewer
explicitly recommended doc-level inline clarifications rather
than a round-4 dispatch. Three applied:

1. Data-flow diagram fixed: lock acquired BEFORE
   ensureLoginLocked (was misleading; implementer could re-
   introduce TOCTOU by following the diagram).
2. A6 tentative JSON envelope spec'd: flat object, not
   {"domains":[...]} wrapper. Curl verification gate remains
   the first implementation task.
3. I/O-under-lock trade-off documented: SetNameservers holds
   c.mu across two HTTP round-trips. Acknowledged as
   correctness-over-throughput; production hardening path
   (session-scoped CSRF cache) deferred to a follow-up if
   field-test reveals contention.

Status flipped to PASS (per reviewer guidance).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
13-task plan with TDD per-task, structpb-safe Outputs, lock-held
SetNameservers, Read via control-panel endpoint with
ErrEmptyNameservers sentinel, Delete restores stashed
previous_nameservers (fallback to Hover defaults).

Single PR (feat/dns-delegation). Out of scope: workflow-registry
manifest bump + gocodealone-multisite field-test artifact +
live field test (deferred to a separate session post-goreleaser).

Rollback notes per task. Scope Manifest authored.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses 1 Critical + 3 Important + 3 Minor:

CRITICAL: ResourceRef has no InputSnapshot field (verified
workflow/interfaces/iac_provider.go:183-187). Adopted reviewer
option #2: v0.2.0 ships Delete = always reset to Hover defaults
[ns1.hover.com, ns2.hover.com]. Stash-and-restore deferred to
v0.3.0 follow-up requiring interfaces change. Design + plan
updated; previous_nameservers dropped from Outputs entirely.

IMPORTANT: Task 3 newTestClient/rewritingTransport conflicted
with existing newStubClient/rewriteTransport helpers. Rewrote
to use the existing helpers.

IMPORTANT: Task 2 cited line numbers that shift after Task 1.
Switched to function-name references throughout.

IMPORTANT: Task 11 test used nonexistent newTestIaCServerInitialized
helper. Replaced with provider-level TestHoverProvider_Capabilities_
IncludesDelegation against HoverProvider.Capabilities() directly
(pure function; no gRPC harness needed).

MINOR: Added []string-vs-[]any clarifying comment to
putNameserversLocked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses 3 Important + 3 Minor from plan-phase adversarial
review round 2:

IMPORTANT:
- Existing TestHoverIaCServer_Capabilities asserts len==1;
  added explicit Step 4b in Task 11 to update it to len==2 +
  multi-type assertion. Task 11 commit list now includes
  iacserver_test.go.
- Task 3/4/5 Red-step test code rewritten to use existing
  newStubClient + rewriteTransport directly (was: newTestClient
  references that would compile-error before the "re-author"
  note was reached).
- DelegationDriver struct comment + PR body bullet rewritten
  to match v0.2.0 spec (no previous_nameservers stash;
  fallback-only Delete). Was contradicting Task 7's
  delegationOutput which already omits the field.

MINOR (design doc cleanup):
- Removed stale "stash current NS as previous_nameservers"
  line from data-flow diagram.
- Removed "Pre-change GetDomainDelegation fails during Create"
  row from error-handling table (dead-code path).
- Updated return line: dnsDelegationOutputFromDesired(...,
  previous_ns) -> delegationOutput(domain, ns).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… + HealthCheck ctx

Round-1 review findings:
- TestDelegationDriver_Update_HappyPath now asserts
  out.Outputs["nameservers"] is []any (structpb-safe).
  Previously discarded the output via `_ = out`.
- TestDelegationDriver_CtxCanceled_AllMethods extended to
  cover HealthCheck (which returns (result, nil) on
  cancellation; Healthy=false carries the signal).
- New TestDelegationDriver_Read_PropagatesErrEmptyNameservers
  verifies errors.Is(driverErr, hover.ErrEmptyNameservers)
  survives the driver's error-wrap chain. Callers using the
  sentinel to distinguish "Hover returned 0 nameservers"
  from other failures now have a regression harness.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new IaC resource type, infra.dns_delegation, enabling wfctl to manage registrar-level nameserver delegation for Hover-registered domains via Hover’s control-panel API, alongside the existing infra.dns record-management support.

Changes:

  • Registered infra.dns_delegation in plugin.json and provider capabilities/driver map.
  • Extended the Hover client with control-panel CSRF extraction plus GET/PUT support for domain delegation nameservers, including lock discipline to avoid TOCTOU.
  • Introduced DelegationDriver with full lifecycle (Create/Read/Update/Delete/Diff/HealthCheck) and comprehensive unit tests.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
plugin.json Declares infra.dns_delegation as a supported resource type.
internal/provider.go Registers the new driver and exposes delegation capability via provider metadata.
internal/provider_test.go Adds coverage ensuring provider capabilities include the new resource type.
internal/iacserver_test.go Updates capability expectations to include delegation in the gRPC surface.
internal/hover/client.go Adds control-panel CSRF parsing, delegation GET, and nameserver PUT with lock-held sequence.
internal/hover/client_test.go Adds httptest coverage for CSRF extraction, delegation GET, and nameserver PUT shapes/errors.
internal/drivers/delegation.go Implements the infra.dns_delegation driver lifecycle + diff logic with structpb-safe outputs.
internal/drivers/delegation_test.go Adds extensive unit tests for delegation driver behavior and invariants.
docs/plans/2026-05-20-hover-dns-delegation.md Adds the implementation plan documentation for the feature.
docs/plans/2026-05-20-hover-dns-delegation.md.scope-lock Adds the scope-lock hash for the plan document.
docs/plans/2026-05-20-hover-dns-delegation-design.md Adds the design document describing API shape, concurrency, and rollout.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/drivers/delegation.go Outdated
Comment on lines +90 to +93
if _, dup := seen[s]; dup {
return dnsDelegationSpec{}, fmt.Errorf("dns_delegation %q: nameservers[%d] = %q is a duplicate", domain, i, s)
}
seen[s] = struct{}{}
Copilot review finding: parseDelegationSpec used exact-string dedup
(seen[s]) while Update + Diff use strings.EqualFold. DNS hostnames
are case-insensitive, so ["NS1.example.com","ns1.example.com"]
should be rejected as duplicates but slipped through. Lowercase
the key when checking + storing in `seen`; preserve the original
casing in the parsed output for display.

Regression test: TestDelegationDriver_Create_CaseInsensitiveDuplicate_Rejected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Comment thread internal/drivers/delegation.go Outdated
Comment on lines +275 to +278
sort.Strings(sa)
sort.Strings(sb)
for i := range sa {
if !strings.EqualFold(sa[i], sb[i]) {
Comment thread internal/hover/client.go Outdated
Comment on lines +333 to +339
// between the two requests).
//
// Trade-off: any concurrent caller using the same *Client blocks for
// up to ~60s (two HTTP round-trips under the 30s default client timeout).
// Acceptable for the field-test scope (single goroutine, one delegation
// resource). Future: cache CSRF at session granularity if mixed-resource
// throughput becomes a concern.
…ld comment

PR #5 round-2 Copilot findings:

1. sameNameserverSet sorted case-sensitively then compared with
   strings.EqualFold. Mixed-case multisets could falsely diverge
   because sort positions disagreed with the case-insensitive
   compare. Lowercased all entries before sort; direct == compare
   suffices after normalize. Regression test:
   TestDelegationDriver_Diff_CaseInsensitiveMatch.

2. SetNameservers lock-hold comment understated worst-case
   duration: ensureLoginLocked may run GET /signin + POST /signin
   + GET /signin/totp + (optional) POST /signin/totp on stale
   sessions. Updated comment to enumerate the full request list
   and note ~180s worst-case vs ~60s warm-session.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Comment thread internal/hover/client.go Outdated
Comment on lines +149 to +156
// csrfMetaRe extracts the Rails CSRF meta token from a control-panel HTML
// page. Distinct from csrfRe (form-token regex used by the /signin flow)
// because the control-panel pages embed the token as a meta tag for the
// SPA layer to read, while /signin embeds it as a hidden input.
//
// Both shapes coexist in the Hover-served HTML; we match each from the
// page where it's authoritative.
var csrfMetaRe = regexp.MustCompile(`<meta\s+name="csrf-token"\s+content="([^"]+)"`)
Comment on lines +223 to +232
if current.ProviderID != "" && !strings.EqualFold(s.domain, current.ProviderID) {
return &interfaces.DiffResult{
NeedsReplace: true,
Changes: []interfaces.FieldChange{{
Path: "domain",
Old: current.ProviderID,
New: s.domain,
ForceNew: true,
}},
}, nil
PR #5 round-3 Copilot findings:

1. csrfMetaRe assumed name=before-content + double quotes. Hover/
   Rails may emit content=before-name or single quotes. Replaced
   with two patterns covering both attribute orders and both
   quote styles, wrapped in extractCSRFMeta() helper.
   Test: TestExtractCSRFMeta_AttributeOrders covers all 4
   shapes + the missing case.

2. DelegationDriver.Diff returned NeedsReplace=true with
   NeedsUpdate at zero. DNSDriver pattern sets both; some
   planner paths gate on NeedsUpdate, risking a skipped
   replace. Now sets both for domain-change.
   Test: TestDelegationDriver_Diff_DomainChange_SetsBothNeedsUpdateAndReplace.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@intel352 intel352 requested a review from Copilot May 21, 2026 02:01
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated no new comments.

@intel352 intel352 merged commit 1bb32c7 into main May 21, 2026
8 checks passed
@intel352 intel352 deleted the feat/dns-delegation branch May 21, 2026 02:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants