Skip to content

feat(tbtc): covenant signer final project branch#3882

Open
mswilkison wants to merge 134 commits into
mainfrom
feat/psbt-covenant-final-project-pr
Open

feat(tbtc): covenant signer final project branch#3882
mswilkison wants to merge 134 commits into
mainfrom
feat/psbt-covenant-final-project-pr

Conversation

@mswilkison
Copy link
Copy Markdown
Contributor

@mswilkison mswilkison commented Mar 9, 2026

Summary

This PR introduces the covenant signer — a Go HTTP service that signs Bitcoin PSBTs (Partially Signed Bitcoin Transactions) using each operator's existing tECDSA wallet share from DKG. It is consumed by the off-chain covenant tooling (in tbtc-v2-ac) when a covenant migration or recovery operation needs an operator-side signature on a constructed PSBT.

Repo footprint: ~9.4K LoC of Go in pkg/covenantsigner/ plus integration glue across pkg/tbtc/, pkg/chain/ethereum/, pkg/bitcoin/, cmd/, and config/. 48 files changed, +13,574 / −30.

Audit positioning

This PR is the secondary review target for the Trail of Bits PSBT covenant audit. The primary review covers the on-chain Solidity contracts in tlabs-xyz/tbtc-v2-ac. The current SOW miscalls this PR an "off-chain attester / watchdog service" — that wording is incorrect:

  • This PR is the covenant signer, not a watchdog. The signer has no ATTESTER_ROLE and no WATCHER_ROLE on-chain.
  • The on-chain WatchdogEnforcer.sol (the actual permissionless watchdog) is already merged on tbtc-v2-ac/dev. It lives in tbtc-v2-ac, not here.
  • The off-chain ac-watchdog-service (Node.js) lives in tlabs-xyz/ac-watchdog-service, not here.

Where to start reading

Recommended file order:

  1. pkg/covenantsigner/doc.go — package overview
  2. pkg/covenantsigner/DEPLOYMENT.md — deployment topology, dedup scope
  3. pkg/covenantsigner/server.go — HTTP, bearer-token auth, timeouts
  4. pkg/covenantsigner/service.go — Submit/Poll, idempotency, request digest
  5. pkg/covenantsigner/validation.go — request validation, fuzz target
  6. pkg/covenantsigner/types.go — request/response shapes, expiresAt
  7. pkg/covenantsigner/store.go — in-memory + on-disk persistence
  8. pkg/covenantsigner/engine.go — engine interface
  9. pkg/tbtc/covenant_signer.go — concrete engine: BTC confirmations, PSBT signing, tECDSA share lookup
  10. pkg/internal/canonicaljson/marshal.go — canonical JSON

Test vectors live at pkg/covenantsigner/testdata/ (covenant_recovery_approval_vectors_v1.json, migration_plan_quote_signing_vectors_v1.json).

Defensive properties (cheat-card)

Concern Defense
Authentication Bearer token; constant-time compare; mandatory if bound to non-loopback (service refuses to start without one)
Body size limit 2 MB
Request timeout 30 s
Replay / duplicate requests routeRequestID per request: same ID + same body returns the prior result; same ID + different body is rejected; expired quote (expiresAt) is rejected
Bitcoin reorg Funding tx must have ≥ 6 BTC confirmations before signing (≈ 1 hour, configurable per node)
Key custody No new key — reuses each node's existing tECDSA wallet share from DKG, encrypted on disk via NewEncryptedProtectedPersistence in pkg/storage/storage.go

Out-of-scope by design

This service signs PSBTs and returns them in HTTP responses. It does not submit Ethereum transactions: grep for nonce, SendTransaction, or TransactionManager in this PR returns nothing. Nonce safety, transaction replay, and RPC-outage handling are concerns of the off-chain ac-watchdog-service (a separate canonical repo) — not this PR.

Cross-repo coupling

This PR binds against contract ABIs from tlabs-xyz/tbtc-v2-ac via pkg/chain/ethereum/tbtc.go. After the on-chain Solidity stack lands (PR #263#280#281 / #282devmain), the Go bindings here will need regeneration against the merged ABIs. The current bindings track tbtc-v2-ac/feat/ac-031-psbt-covenant-import as of the most recent refresh.

CI surface

  • Go 1.24.
  • gosec enabled — staticcheck enabled — unit + integration tests run on every PR.
  • govulnchecknot yet wired in; planned addition.
  • Fuzz coverage — 1 fuzz target (pkg/covenantsigner/validation.go); light given the input surface (HTTP JSON, PSBT, witness scripts).

Dependency replacements (deliberate, pinned)

  • tss-lib → Threshold-Network fork.
  • keep-commontlabs-xyz fork.

Both forks are intentional; auditors are encouraged to review the forks commit-by-commit alongside this PR.

Known limitations / liveness risks

  • Single HTTP server per node — DoS is possible at the deployment layer; operators need monitoring + redundancy.
  • No automatic failover between nodes.
  • Request dedup is per-node only (no cluster-wide lock); the deployment doc instructs operators to use sticky sessions behind a load balancer to avoid cross-node duplication.
  • If the Bitcoin RPC is down, signing pauses (no internal retry / circuit breaker).

Status

  • Refreshed against current main on 2026-05-10 (clean rebase: 93 commits replayed, 15 cherry-pick-equivalents auto-skipped, 0 conflicts).
  • Promoted from Draft to Ready-for-Review on 2026-05-10.
  • Reviewer assigned: @piotr-roslaniec.

@lrsaturnino lrsaturnino marked this pull request as ready for review May 10, 2026 04:53
mswilkison added 29 commits May 10, 2026 02:01
piotr-roslaniec and others added 30 commits May 23, 2026 10:41
Adds the poisonedRoutes map field that was referenced in A34 fix
(commit f54d4e4) but never declared in the struct definition.

Without this field, the store fails to compile:
  s.poisonedRoutes undefined (type *Store has no field or method poisonedRoutes)

Fixes compile error reported in PR #3938 review comment.
P0-1: Add signer approval expiration check via EndBlock field.
     validationOptions gains currentBlock field; at policyIndependentDigest
     true, checks EndBlock expiration and falls through to re-verification.

P0-2: Persist poisoned route markers to disk for recovery across restarts.
     store.go adds poisonedDirectory, MarkPoisoned writes marker files,
     and load() restores state on startup.

P1: Add concurrency limit for in-flight submit/poll operations.
     Service gains maxInFlight field and semaphore channel; slot acquisition
     added in both Submit() and Poll() paths.

P2: Check EndBlock expiration in loadPollJob before digest comparison.
     Uses currentBlockProvider to verify approval has not expired.

P3: Add poll rate limiting alongside existing submit rate limit.
     pollLimiter uses golang.org/x/time/rate at 60/min; handlers updated
     to check Allow() before processing.

Engine: Add CurrentBlockHeightProvider as separate optional interface;
     passiveEngine stub added; WithCurrentBlockProvider refactored to
     auto-detect interface from Engine; nil guard added in Poll().
Prior to this change, WithCurrentBlockProvider silently returned 0 when the
underlying provider returned an error. This caused signer approval expiration
checks to pass incorrectly during RPC failures, treating expired approvals
as valid.

After this change, the provider returns (uint64, error). Both call sites
(loadPollJob and Poll) now propagate errors instead of suppressing them.
Remove redundant inner if block in loadPollJob's expiration check.
The outer && already guarantees job.Request.SignerApproval != nil,
making the inner nil check on EndBlock unreachable after the
short-circuit. Also adds a doc comment to WithMaxInFlight explaining
that n <= 0 disables the concurrency limit entirely.
Remove the MarkPoisoned function, poisonedRoutes map, and related
poisonedDirectory constant. The feature was added in a prior commit
but is never called from any code path, leaving the scaffolding
unused.
…BlockErr field

Extends scriptedEngine to support the SignerApprovalVerifier interface
and inject currentBlockHeight provider errors for comprehensive testing
of error propagation paths.

- Add signerApprovalVerifier field to scriptedEngine
- Add currentBlockErr field to scriptedEngine
- Add VerifySignerApproval method delegating to inner verifier
- VerifySignerApproval returns nil when inner verifier is nil (no-op)
When a signer approval certificate is expired and no verifier is present
in the validation options (Poll flow), return the expiration error
directly instead of falling through to the re-verification path that
requires a signerApprovalVerifier.

This ensures TestServicePollRejectsExpiredCertificate correctly
returns 'signer approval certificate has expired' instead of
'request.signerApproval cannot be verified by this signer deployment'.
Use explicit engine variable and WithCurrentBlockProvider in tests
that need to verify block-height-dependent validation, for
clearer block height control.
…fier interface

scriptedEngine implicitly implemented SignerApprovalVerifier via its
signerApprovalVerifier field and VerifySignerApproval method. This caused
NewService to auto-wire a signerApprovalVerifier on every test service,
triggering the validation guard that requires SignerApproval in the
request whenever a verifier is configured. Tests using baseRequest (no
SignerApproval) then failed before calling the engine, hanging goroutine
synchronization channels indefinitely.

Remove VerifySignerApproval and the signerApprovalVerifier field from
scriptedEngine. Tests that need a signer approval verifier now pass it
explicitly via WithSignerApprovalVerifier, making the intent clear.
golang.org/x/time was marked indirect in go.mod even though
pkg/covenantsigner/server.go imports golang.org/x/time/rate directly for
submit-endpoint rate limiting. Running go mod tidy removes the stale
// indirect annotation so the dependency graph matches the actual import
surface.
…-project-pr

Two pieces dropped when skipping conflicting intermediate commits:

- cancelService() on listener bind error in covenantsigner.NewServer:
  prevents service-context goroutines from leaking when the HTTP listener
  fails to bind during startup. Originally added in the now-skipped
  ef45c15 (which also added a signal handler later removed by 73c8863).

- TestVerifySignerApprovalCertificateRejectsHighSSignature in
  signer_approval_certificate_test.go: defense-in-depth coverage proving
  the verifier rejects mathematically-equivalent high-S signatures (S' =
  N - S) that would otherwise enable signature malleability. Originally
  added by 54dae9b.
## Summary

Implements 8 fixes from the deep audit of the covenant signer feature
(20 agent passes across 5 analysis areas + cross-provider verification
with Gemini and GPT models).

### P1 Fixes

- **A31**: Auto-derive `DataDir` from `storage.Dir` when unset and
signer port is non-zero. Without this, the exclusive file lock was
silently disabled, allowing two processes to corrupt the job store
concurrently.
- **A32**: Replace `context.Background()` with
`signal.NotifyContext(SIGINT, SIGTERM)` in `cmd/start.go`. The shutdown
goroutine (HTTP drain, cancel signing ops, release file lock) was dead
code -- it blocked on `ctx.Done()` which never fired.
- **A33**: Unconditionally evict stale `byRequestID` entries on
route-key replacement in `Put()`, and add route-key holder verification
in `loadPollJob`. Previously, a failed file delete left an orphan that a
stale Poll could use to seize the route key from a newer job.

### P2 Fixes

- **A34**: Clear poisoned route keys on successful `Put()`. Poison was
self-reinforcing at runtime because `Submit` hit `GetByRouteRequest`
(which returns `errPoisonedRouteKey`) before reaching `Put()`.
- **A36**: Normalize `migrationPlanQuoteTrustRoots` KeyID at startup,
matching the pattern for depositor/custodian roots. Config with
accidental whitespace in KeyID caused immediate request rejection.
- **A38**: Release file lock if trust root normalization fails after
`NewStore` succeeds, using a deferred cleanup on named error return.

### P3 Fixes

- **A35**: Trim whitespace from `destination.ReservationID` in
`normalizeMigrationDestination`, matching `normalizeMigrationPlanQuote`.
Asymmetric trim caused spurious rejection for whitespace-padded IDs.
- **A37**: Enforce low-S normalization on signer approval certificate
threshold signature. Without this, a certificate holder could submit the
high-S variant to create a duplicate job (doubled resource consumption,
no fund loss).

### Audit methodology

- 20 agent runs (4 passes x 5 areas: coverage, contract verification,
threat model, property testing, integration seams)
- Cross-provider verification: Gemini 3 Pro Preview (security), Gemini
2.5 Pro (integration), GPT-5.2 (state layer)
- Multi-agent consolidated review for final severity calibration
- All 7 Bitcoin P2WSH spending paths verified correct via script
execution traces
- No P0 (fund-loss) vulnerabilities found

## Test plan
- [x] `go build ./cmd/... ./pkg/covenantsigner/... ./pkg/tbtc/...`
compiles cleanly
- [x] `go test -race ./pkg/covenantsigner/...` passes
- [x] Relevant tbtc tests pass (GetWallet, signer approval, payload
hash)
- [ ] CI green
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants