fix: open-mandate consume-once locking + commit-uncertain raise#2
Conversation
P0-A — Open-mandate consume-once locking
=======================================
Earlier rounds keyed idempotency on transaction_id only, which breaks the
AP2 spec's normative defense (specification §6) against an autonomous
agent presenting subsequent open mandates without a rejection receipt:
two distinct transactions derived from the same open mandate produced
DIFFERENT idempotency keys, so both would be authorized.
Fix: when AP2Mandate.open_mandate_hash is present, key the lock on
open_mandate_hash instead of transaction_id. Scope namespace embedded in
the key so the two buckets never collide server-side:
- HNP: ap2:open_mandate:{sha256(open_mandate_hash)[:32]}:{phase}
- default: ap2:tx:{sha256(transaction_id)[:32]}:{phase}
New mapping.consume_once_input() picks the (scope, raw_value) pair.
idempotency_key() signature changed from (transaction_id, ...) to
(mandate, ...) for automatic scope selection. Wire-shape change, but
pre-PyPI so no migration cost.
New TestIdempotencyKeyOpenMandateScope class covers:
- open_mandate scope used when hash present
- different transactions / same open_mandate share key
- tx and open_mandate namespaces never collide
- open_mandate takes precedence over transaction_id
Plus new test_open_mandate_overuse_collapses_distinct_transactions_onto_one_key
in test_double_spend_same_transaction.
P0-B — Post-PSP commit failures now raise
==========================================
RESERVATION_FINALIZED, RESERVATION_EXPIRED, and IDEMPOTENCY_MISMATCH
previously returned silently after a warning log. Caller's only signal
was guard.committed == False — too easy to miss. RESERVATION_EXPIRED in
particular is dangerous: PSP may have charged while Cycles reclaimed
the budget on TTL.
New AP2GuardCommitUncertain exception (carries error_code, request_id,
reservation_id). Raised from _handle_commit for all three terminal-
status codes. Still no auto-release (a prior commit may already have
settled).
Three existing tests flipped from "asserts silent" to "asserts raise":
- test_commit_idempotency_mismatch_raises_uncertain
- test_reservation_finalized_raises_uncertain
- test_reservation_expired_raises_uncertain (new)
Polish
======
- README: "Independent project / not affiliated with Google" disclaimer.
- README: new "What this does NOT do" section (no signature verify,
no mandate creation, no PSP, etc.).
- README: tightened headline value-prop with bulleted "prevents these
failure modes" list.
- README: softened "any AP2-compatible SDK" → "AP2-style PaymentMandate
objects via a small adapter layer; tested against the current AP2
public example shapes".
- README: lifecycle + idempotency tables updated for both new behaviors.
- New tests/test_ap2_shape_adapter.py: exercises AP2Mandate.from_ap2()
against local dataclasses mirroring the upstream AP2 sample-types
layout, so adapter brittleness surfaces here first.
Test posture: 100 tests (up from 90), 98.35% coverage. ruff + mypy
strict clean. `python -m build` produces sdist + wheel cleanly.
Public API additions:
- AP2GuardCommitUncertain (exported)
Wire-shape change (acceptable because pre-PyPI):
- idempotency keys now include a scope namespace: ap2:tx:... or
ap2:open_mandate:... (was ap2:...)
AUDIT.md and CHANGELOG.md updated.
…rve test Per positioning-review round 2 on PR #2. 1) "Collapses onto one reservation" was an overclaim. The code makes the idempotency key collide; the payload may still differ (distinct transaction_id / checkout_hash on each AP2 checkout from the same open mandate). Cycles semantics: same key + same payload replays; same key + different payload → 409 IDEMPOTENCY_MISMATCH (surfaced as AP2GuardDenied). Both block a second valid reservation, but the mechanism differs. Reworded in README, AUDIT, and two test docstrings. 2) GuardedPayment.__doc__ and the README lifecycle table still said the retry/lock key is "transaction_id". Generalized to "consume-once key — open_mandate_hash when present, otherwise transaction_id". 3) New regression test: test_open_mandate_divergent_payload_reserve_mismatch_raises_denied exercises the second half of the consume-once contract — same open mandate hash, distinct transaction_id, first reserve allowed, second reserve gets 409 IDEMPOTENCY_MISMATCH from the server and the guard raises AP2GuardDenied(reason_code="IDEMPOTENCY_MISMATCH"). Documents what the spec really gives you for divergent payloads. 4) Reject empty open_mandate_hash / checkout_hash via pydantic min_length=1. Previously an empty string passed validation and consume_once_input's truthy check silently fell back to the tx scope — data corruption disguised as the default. Two new tests cover the rejection paths. Test posture: 103 tests (up from 100), 98.35% coverage. ruff + mypy strict clean. `python -m build` produces sdist + wheel cleanly. No public API additions. AUDIT.md and CHANGELOG.md updated.
Follow-up fixes pushed (1797db4)All three pre-merge requirements addressed plus the minor polish item.
Gates re-verified locally (all green):
No public API additions in this round. |
P0 — transport errors and 5xx on commit are uncertain, not failed
=================================================================
Previously a transport-level failure or 5xx response on commit fell
through to the "unrecognized rejection" branch: _handle_release was
called and AP2GuardCommitFailed raised with "reservation released" in
the message regardless of outcome. Wrong: the commit POST may have
reached Cycles and mutated state before the failure, so auto-release
risks undoing a successful settle.
New branch ordering in _handle_commit:
1. success
2. transport error / 5xx → AP2GuardCommitUncertain, NO release
3. RESERVATION_FINALIZED / → AP2GuardCommitUncertain, NO release
RESERVATION_EXPIRED / (existing behavior)
IDEMPOTENCY_MISMATCH
4. 4xx with unrecognized code → release + AP2GuardCommitFailed
(the server explicitly rejected,
release is safe)
Synthetic error_code values for the new branches:
- TRANSPORT_ERROR — CyclesResponse.transport_error / status=-1
- SERVER_ERROR — 5xx without a parseable error code in body
- specific code propagates when present (e.g. INTERNAL_ERROR)
P1 — bare exception during commit POST now wraps as uncertain
=============================================================
The `try / except Exception / raise` re-raised the raw exception,
losing the reservation_id and bypassing the reconciliation contract.
Wrapped as:
raise AP2GuardCommitUncertain(
error_code="COMMIT_RAISED",
reservation_id=...,
) from exc
The original exception is preserved via __cause__.
Four new regression tests cover:
- 5xx with body → uncertain, specific code, no release
- 5xx without body → uncertain, synthetic SERVER_ERROR, no release
- transport error → uncertain, TRANSPORT_ERROR, no release
- commit raises → uncertain wrapping original, COMMIT_RAISED,
chained __cause__, no release
Existing test_commit_failed_records_release_non_success still passes —
it's about the release POST returning 5xx (not the commit POST), which
remains a release-and-raise flow.
AP2GuardCommitUncertain.__doc__ updated to enumerate all the new
conditions and the error_code values callers can branch on.
README lifecycle + exception tables updated.
Test posture: 107 tests (up from 103), 99.18% coverage. ruff + mypy
strict clean. `python -m build` produces sdist + wheel cleanly. No
public API additions — only an expanded contract on the existing
AP2GuardCommitUncertain.
AUDIT.md and CHANGELOG.md updated.
Round-3 fixes pushed (4642a8d)Both findings addressed.
Four new regression tests:
Gates re-verified locally (all green):
No public API additions — only an expanded contract on the existing |
Three README spots still implied the consume-once boundary is always transaction_id and that distinct attempts "collapse onto one reservation". Both are wrong with the open-mandate scope and the real Cycles idempotency semantics (replay or IDEMPOTENCY_MISMATCH). 1. "How the guard responds" table — the HTTP/transport-error-on-reserve row said "same transaction_id ⇒ same reserve key". Generalized to "same consume-once scope (open_mandate_hash when present, otherwise transaction_id) ⇒ same reserve key". 2. AP2 mapping table — the transaction_id row said "also feeds idempotency keys" unconditionally. Now: "feeds the idempotency key only when open_mandate_hash is absent (otherwise the open mandate is the consume-once scope)". 3. Features bullet — "duplicate workers on the same mandate collapse onto one reservation server-side" replaced with the accurate "share one idempotency bucket server-side; identical replays return the original reservation, divergent attempts are rejected with IDEMPOTENCY_MISMATCH". Code unchanged. Gates still green: ruff + mypy strict + 107 tests.
README cleanup pushed (88f2524)All three stale lines fixed:
Code unchanged. Gates re-verified locally (ruff + ruff format clean, mypy strict 0 errors, 107 tests pass). |
…stale docstrings
P2 — Idempotency-key suffix accepted non-ASCII letters
======================================================
safe_suffix used `c.isalnum()`, but Python's str.isalnum is
Unicode-aware: "É".isalnum() is True. A non-ASCII suffix (e.g. a
localized exception class name like "Échec") would have reached the
Idempotency-Key HTTP header. RFC 7230 requires ASCII tokens, so httpx
would reject the request.
Tightened the predicate to `(c.isascii() and c.isalnum()) or c in
("_", "-", ".")`. Regression test asserts the key is isascii() even
for a non-ASCII suffix.
P3 — from_ap2() masked empty checkout_hash as None
==================================================
The `or` short-circuit at
getattr(cm, "hash", None) or getattr(cm, "checkout_hash", None)
silently dropped upstream `hash=""` to None, bypassing the model's
new `min_length=1` rejection. Replaced with explicit None-check on
the first attribute; fall back only when the first is genuinely
missing. Empty strings now propagate to the model and get rejected
at construction.
Two regression tests:
- hash="" → ValidationError at AP2Mandate(...)
- alternate naming (checkout_hash attr only, no hash attr) still
falls through correctly
Kept payee.website / .identifier fallback unchanged — that one is
intentional alternates-not-aliases behavior with a dedicated test.
P3 — Stale docstrings refreshed
===============================
1. mapping.build_commit_body.__doc__: said "key derived from
transaction_id" — generalized to the consume-once scope
(open_mandate_hash when present, otherwise transaction_id),
matching the actual implementation.
2. AP2GuardCommitFailed.__doc__: still described RESERVATION_FINALIZED
/ RESERVATION_EXPIRED / IDEMPOTENCY_MISMATCH as "benign replay"
that this exception is NOT raised for. Misleading — those raise
AP2GuardCommitUncertain (a different exception), not a quiet no-op.
Rewrote to scope this exception to 4xx-explicit-rejection only and
point readers at AP2GuardCommitUncertain for the unknown-outcome
conditions.
Test posture: 110 tests (up from 107), 99.19% coverage. ruff + mypy
strict clean. `python -m build` produces sdist + wheel cleanly.
No public API additions — only behavior tightening on existing paths
and documentation sync.
AUDIT.md and CHANGELOG.md updated.
Round-4 fixes pushed (00835b5)All three findings addressed.
Gates re-verified locally (all green):
No public API additions; only behavior tightening on existing paths and documentation sync. |
…ternals
Earlier passes fixed README and feature-bullet wording but missed two
internal docstrings/comments that still used the inaccurate phrasing:
- runcycles_ap2/_constants.py — the IDEMPOTENCY_SCOPE_OPEN_MANDATE
comment block said all checkouts "collapse onto one reservation"
- runcycles_ap2/mapping.py — consume_once_input.__doc__ said the
server "collapses them onto one reservation"
Replaced both with the accurate "share one (tenant, endpoint,
idempotency_key) bucket; identical payloads replay, divergent
payloads hit IDEMPOTENCY_MISMATCH; either way no second valid
reservation" framing.
Also:
- tests/test_mapping.py — TestIdempotencyKeyOpenMandateScope class
docstring still said "must collapse onto one reservation"; reworded
- AUDIT.md — stale test-name reference
(test_open_mandate_overuse_collapses_distinct_transactions_onto_one_key
was renamed in a follow-up to ...shares_idempotency_bucket...);
updated and annotated.
The two remaining "collapse" matches in AUDIT.md (lines 50 and 71) are
intentional — they quote the old phrasing inside historical entries that
document the wording fix itself, not current behavior claims.
No behavior change. Gates green: ruff + mypy strict + 110 tests pass.
Round-5 fix pushed (7f0d8db)Doc-only cleanup — the implementation already matched the accurate wording; only internal docstrings and one comment still carried the old "collapse onto one reservation" overclaim.
Two remaining "collapse" matches in Gates re-verified locally (all green):
No behavior change, no new tests, no public API change. |
Follow-up to PR #1 addressing the positioning-review findings. Stays on v0.1.0 (still pre-PyPI).
P0-A — Open-mandate consume-once locking
Earlier rounds keyed idempotency on
transaction_idonly, which breaks the AP2 spec's normative consume-once defense (§6) against an autonomous agent presenting subsequent open mandates without a rejection receipt: two distinct transactions derived from the same open mandate produced DIFFERENT idempotency keys, so Cycles would authorize both. The defense we claimed didn't actually exist for the human-not-present flow.Fix: when
AP2Mandate.open_mandate_hashis present, key the lock on it instead oftransaction_id. Scope namespace embedded in the key so the two buckets never collide server-side:open_mandate_hash(HNP)ap2:open_mandate:{sha256(open_mandate_hash)[:32]}:{phase}[:{suffix}]transaction_id(default)ap2:tx:{sha256(transaction_id)[:32]}:{phase}[:{suffix}]New
mapping.consume_once_input()returns the(scope, raw_value)pair.idempotency_key()signature changed from(transaction_id: str, ...)to(mandate: AP2Mandate, ...)for automatic scope selection.Wire-shape change (idempotency keys now include scope namespace). Pre-PyPI, no migration cost.
Regression coverage: new
TestIdempotencyKeyOpenMandateScopeclass + newtest_open_mandate_overuse_collapses_distinct_transactions_onto_one_keyin the double-spend file.P0-B — Post-PSP commit failures now raise
RESERVATION_FINALIZED,RESERVATION_EXPIRED, andIDEMPOTENCY_MISMATCHpreviously logged at warning level and returned silently. Caller's only signal wasguard.committed == False— too easy to miss.RESERVATION_EXPIREDin particular is dangerous: the PSP may have charged while Cycles reclaimed the budget on TTL.Fix: new
AP2GuardCommitUncertainexception (carrieserror_code+request_id+reservation_id). Raised from_handle_commitfor all three terminal-status codes. Still no auto-release (a prior commit may already have settled).Three existing tests flipped from "asserts silent" to "asserts raise"; new
RESERVATION_EXPIREDtest added.Polish (positioning review medium items)
tests/test_ap2_shape_adapter.py: exercisesAP2Mandate.from_ap2()against local dataclasses mirroring the upstream AP2 sample-type layout, so adapter brittleness surfaces here first instead of at integration time.Test plan
ruff check . && ruff format --check .— cleanmypy --strict runcycles_ap2— 0 errors (8 files)pytest --cov=runcycles_ap2 --cov-fail-under=95— 100 passed, 98.35% coverage (up from 90)python -m build— sdist + wheel build cleanlyexamples/ap2_human_not_present.pyagainst a local Cycles serverPublic API additions
AP2GuardCommitUncertainexception class (exported)Wire-shape change (acceptable: pre-PyPI)
Idempotency keys now include a scope namespace:
ap2:tx:...orap2:open_mandate:...(wasap2:...). No callers in the wild yet — v0.1.0 was merged but never published.