fix(namecheap): follow ReplacedBy chain + parse SSLGetInfoResult Status #40
Merged
Two bugs in rota's namecheap CA polling, both surfaced when the in-flight oneiric.dev renewal hung indefinitely on aur0 with empty status logs:

* `get_info` looked for `<SSLStatus Status="...">` and `<Status>` text, neither of which matches Namecheap's actual response. The Status attribute lives on `<SSLGetInfoResult ...>`. Result: `unwrap_or_default()` returned an empty string, so the WARN log always read `status=` (blank), and `is_issued()` could never match "active" / "issued".
* Namecheap's `ssl.reissue` doesn't reissue the same SSL ID; it creates a NEW one under the same subscription line and marks the parent as `Status="replaced"` with a `ReplacedBy` pointer. rota was polling the parent forever, never seeing the new cert at the child ID.

Fix:

* `NamecheapCa` now holds two IDs: `initial_ssl_id` (immutable, from rota.yaml) and `active_ssl_id` (AtomicU64, mutable per renewal). `submit()` extracts `<SSLReissueResult ID="...">` from the reissue response and promotes it to active. `get_info` reads the active ID. Subsequent renewal cycles still call `ssl.reissue` against the initial subscription ID (the operator's purchase line) but poll whatever child Namecheap creates each time.
* `get_info` reads Status as an attribute on `SSLGetInfoResult` (with fallbacks to the legacy `SSLStatus` attr + `<Status>` text for older response shapes). Captures `ReplacedBy` as `Option<u64>` on `NamecheapCertInfo`.
* `await_issuance` adds a chain-following branch: when `status == "replaced"` and a `ReplacedBy` is present, swap the active ID and `continue` immediately (no 30s sleep) so polling resumes against the right cert in the next iteration. Defensive log if `replaced` is reported with no `ReplacedBy`.

Without this fix every Sectigo CSR-hash renewal hung for the full 30-min POLL_DEADLINE before erroring with `timed out waiting for namecheap issuance`. With it, rota detects issuance in real time once Sectigo finishes validating the DCV CNAME.
Tests: 108 daemon tests still pass; the chain-follow + status-attr fixes are exercised by the live deploy on aur0 (tests for the multi-step state machine would require a more involved mock NamecheapClient than the one-shot fixture tests use today).
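For reviewers who want to see the chain-follow branch in isolation, here is a minimal sketch of the loop shape described above. It is not rota's actual code: `CertInfo`, `PollOutcome`, `await_issuance_sketch`, the `max_polls` bound (standing in for POLL_DEADLINE), and the injected `get_info` / `wait` closures are all hypothetical names chosen for illustration, and the issued check looks only at the status string.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical stand-in for the fields of NamecheapCertInfo that matter here.
#[derive(Debug, Clone, PartialEq)]
struct CertInfo {
    status: String,
    replaced_by: Option<u64>,
}

#[derive(Debug, PartialEq)]
enum PollOutcome {
    Issued(u64),
    TimedOut,
}

// Polls `get_info` for the currently active SSL ID. `max_polls` is an
// assumption that replaces the wall-clock deadline so the sketch stays
// deterministic; `wait` stands in for the sleep between polls.
fn await_issuance_sketch(
    active_ssl_id: &AtomicU64,
    max_polls: usize,
    mut get_info: impl FnMut(u64) -> CertInfo,
    mut wait: impl FnMut(),
) -> PollOutcome {
    for _ in 0..max_polls {
        let id = active_ssl_id.load(Ordering::SeqCst);
        let info = get_info(id);
        match (info.status.as_str(), info.replaced_by) {
            ("active" | "issued", _) => return PollOutcome::Issued(id),
            ("replaced", Some(child)) => {
                // Follow the chain: promote the child and poll again right away.
                active_ssl_id.store(child, Ordering::SeqCst);
                continue;
            }
            ("replaced", None) => {
                // Defensive log: nothing to follow, so keep waiting on this ID.
                eprintln!("ssl id {id} reported replaced without ReplacedBy");
                wait();
            }
            _ => wait(),
        }
    }
    PollOutcome::TimedOut
}
```

Injecting `get_info` and `wait` as closures is one way the multi-step state machine could be exercised without a live Namecheap account; a mock that returns `replaced` for the parent ID and `active` for the child covers the branch added in this PR.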
albedosehen added a commit that referenced this pull request May 9, 2026
rota's `get_info` was looking for `<CertificateReturned>` element
text and `<CACertificate>` element text. Neither matches Namecheap's
actual `ssl.getInfo&Returncertificate=true` response, which carries
`CertificateReturned` as an ATTRIBUTE on `<Certificates>` and packs
PEMs in nested `<Certificate>` elements:
    <Certificates CertificateReturned="true" ReturnType="INDIVIDUAL">
      <Certificate><![CDATA[LEAF_PEM]]></Certificate>
      <CaCertificates>
        <Certificate Type="INTERMEDIATE">
          <Certificate><![CDATA[INTERMEDIATE_1_PEM]]></Certificate>
        </Certificate>
        ...
      </CaCertificates>
    </Certificates>
Result: cert_pem and chain_pem both empty, `is_issued()` false,
polling never terminates even when status==active. So PR #40's
chain-follow lands on the right SSL ID but `await_issuance` still
hangs at the extraction step. Found by extracting the cert manually
out of band when `getInfo` returned status=active for oneiric.dev's
in-flight order: rota's parser yielded empty strings even though
the PEMs were sitting right there in the response.
Fix: new `ApiResponse::pem_blocks(label)` method scans the raw
response for `-----BEGIN <label>-----`...`-----END <label>-----`
armor and returns each block in document order. `get_info` calls
`pem_blocks("CERTIFICATE")`; first block is the leaf, rest are the
chain (concatenated with newlines). The CSR present in the same
response is safely skipped because its label is "CERTIFICATE
REQUEST" and `BEGIN CERTIFICATE-----` doesn't substring-match
`BEGIN CERTIFICATE REQUEST-----`.
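
The armor scan described above can be sketched roughly as follows. This is a standalone illustration, not the `ApiResponse::pem_blocks` method itself: it is written as a free function over a `&str`, and `split_leaf_and_chain` is a hypothetical helper name for the "first block is the leaf, rest are the chain" step.

```rust
// Returns every `-----BEGIN <label>-----` ... `-----END <label>-----` block
// in document order. Written as a free function for illustration only.
fn pem_blocks(raw: &str, label: &str) -> Vec<String> {
    let begin = format!("-----BEGIN {label}-----");
    let end = format!("-----END {label}-----");
    let mut blocks = Vec::new();
    let mut rest = raw;
    while let Some(start) = rest.find(&begin) {
        let from_begin = &rest[start..];
        match from_begin.find(&end) {
            Some(stop) => {
                let block_len = stop + end.len();
                blocks.push(from_begin[..block_len].to_string());
                // Continue scanning after the block we just captured.
                rest = &from_begin[block_len..];
            }
            // Unterminated armor: stop rather than return a partial block.
            None => break,
        }
    }
    blocks
}

// Hypothetical helper: first block is the leaf, the remainder is the chain,
// joined with newlines. Empty input yields two empty strings.
fn split_leaf_and_chain(blocks: &[String]) -> (String, String) {
    match blocks.split_first() {
        Some((leaf, chain)) => (leaf.clone(), chain.join("\n")),
        None => (String::new(), String::new()),
    }
}
```

Because the search string includes the trailing dashes, a `CERTIFICATE REQUEST` block never matches the `CERTIFICATE` label, which is the CSR-skip rule the new tests cover.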
This is the 6th and (hopefully) final layer in the rota+Namecheap
end-to-end renewal pipeline, after PRs #36 (reverted), #37 (CDATA
unwrap), #38 (DnsCname variant), #39 (lowercase HostName), #40
(ReplacedBy chain + Status XML path). Tests: 3 new in xml::tests
covering the leaf+chain extraction, the CSR-skip rule, and the
empty-input edge case. Total daemon test count 111 (was 108).
Why

Renewal pipeline hung indefinitely on aur0 against oneiric.dev. `rota log` showed `in_progress` for 12+ minutes with WARN logs reading `status=` (blank), even though Namecheap had clearly accepted the reissue and was actively processing it.

Two bugs:

* `get_info` parsed the wrong XML location for Status. Namecheap returns Status as an attribute on `<SSLGetInfoResult ...>`. rota was looking for `<SSLStatus Status="...">` (an element that doesn't exist) and `<Status>` text (also doesn't exist). Result: `unwrap_or_default()` returned empty string, blocking `is_issued()`.
* rota polled the wrong SSL ID. `namecheap.ssl.reissue` creates a NEW SSL ID under the same subscription line: the parent flips to `Status="replaced"` with a `ReplacedBy` pointer to the child. rota kept polling the parent forever.

Fix

* `NamecheapCa` now holds `initial_ssl_id` (immutable, from config) + `active_ssl_id: AtomicU64` (mutable per renewal). `submit()` extracts `<SSLReissueResult ID="...">` and promotes it to active.
* `get_info` reads Status from `SSLGetInfoResult`'s attribute (with fallbacks to legacy shapes). Adds `replaced_by: Option<u64>` to `NamecheapCertInfo`.
* `await_issuance` adds a chain-follow branch on `status == "replaced"`: swap `active_ssl_id` to `ReplacedBy` and `continue` immediately (no sleep).

Without this every Sectigo CSR-hash renewal would hang the full 30-min `POLL_DEADLINE` before erroring with `timed out waiting for namecheap issuance`.

Verified

* `cargo fmt --all --check` clean
* `cargo clippy --workspace --all-targets -- -D warnings` clean
* `cargo test --workspace --locked` 108 daemon tests pass
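
To make the Status lookup order from the Fix section concrete, here is a rough, std-only sketch of reading the attribute with the two legacy fallbacks. It is an illustration under simplifying assumptions, not rota's parser: `attr_value`, `element_text`, `status_of`, and `replaced_by_of` are hypothetical names, attributes are assumed to be separated by a single space and double-quoted, and attribute values are assumed not to contain `>`. A real implementation would more likely go through a proper XML parser.

```rust
// Value of `attr` on the first `<element ...>` start tag, if present.
fn attr_value<'a>(xml: &'a str, element: &str, attr: &str) -> Option<&'a str> {
    let open = format!("<{element}");
    let needle = format!(" {attr}=\"");
    let mut search_from = 0;
    while let Some(pos) = xml[search_from..].find(&open) {
        let after_name = search_from + pos + open.len();
        // Require a delimiter after the name so `<Status` does not match
        // a longer element name that merely starts with `Status`.
        let next = xml[after_name..].chars().next()?;
        if next.is_whitespace() || next == '>' || next == '/' {
            let tag_end = after_name + xml[after_name..].find('>')?;
            let tag = &xml[after_name..tag_end];
            let start = tag.find(&needle)? + needle.len();
            let len = tag[start..].find('"')?;
            return Some(&tag[start..start + len]);
        }
        search_from = after_name;
    }
    None
}

// Text content of the first attribute-less `<element>...</element>` pair.
fn element_text<'a>(xml: &'a str, element: &str) -> Option<&'a str> {
    let open = format!("<{element}>");
    let close = format!("</{element}>");
    let start = xml.find(&open)? + open.len();
    let len = xml[start..].find(&close)?;
    Some(&xml[start..start + len])
}

// Lookup order: SSLGetInfoResult attribute first, then the legacy
// SSLStatus attribute, then legacy <Status> element text.
fn status_of(xml: &str) -> Option<String> {
    attr_value(xml, "SSLGetInfoResult", "Status")
        .or_else(|| attr_value(xml, "SSLStatus", "Status"))
        .or_else(|| element_text(xml, "Status"))
        .map(|s| s.trim().to_owned())
}

// ReplacedBy parsed as an optional numeric SSL ID.
fn replaced_by_of(xml: &str) -> Option<u64> {
    attr_value(xml, "SSLGetInfoResult", "ReplacedBy")?.parse().ok()
}
```

Returning `Option` rather than defaulting to an empty string keeps a missing Status distinguishable from a present one, which is the failure mode that produced the blank `status=` logs.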