From ac10e6f1b59915b711c2e228e5fc6a15752c51ac Mon Sep 17 00:00:00 2001 From: Albert Mavashev Date: Wed, 13 May 2026 10:23:48 -0400 Subject: [PATCH 1/6] fix: address P0-A consume-once gap + P0-B commit-uncertain silent path MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit P0-A — Open-mandate consume-once locking ======================================= Earlier rounds keyed idempotency on transaction_id only, which breaks the AP2 spec's normative defense (specification §6) against an autonomous agent presenting subsequent open mandates without a rejection receipt: two distinct transactions derived from the same open mandate produced DIFFERENT idempotency keys, so both would be authorized. Fix: when AP2Mandate.open_mandate_hash is present, key the lock on open_mandate_hash instead of transaction_id. Scope namespace embedded in the key so the two buckets never collide server-side: - HNP: ap2:open_mandate:{sha256(open_mandate_hash)[:32]}:{phase} - default: ap2:tx:{sha256(transaction_id)[:32]}:{phase} New mapping.consume_once_input() picks the (scope, raw_value) pair. idempotency_key() signature changed from (transaction_id, ...) to (mandate, ...) for automatic scope selection. Wire-shape change, but pre-PyPI so no migration cost. New TestIdempotencyKeyOpenMandateScope class covers: - open_mandate scope used when hash present - different transactions / same open_mandate share key - tx and open_mandate namespaces never collide - open_mandate takes precedence over transaction_id Plus new test_open_mandate_overuse_collapses_distinct_transactions_onto_one_key in test_double_spend_same_transaction. P0-B — Post-PSP commit failures now raise ========================================== RESERVATION_FINALIZED, RESERVATION_EXPIRED, and IDEMPOTENCY_MISMATCH previously returned silently after a warning log. Caller's only signal was guard.committed == False — too easy to miss. RESERVATION_EXPIRED in particular is dangerous: PSP may have charged while Cycles reclaimed the budget on TTL. New AP2GuardCommitUncertain exception (carries error_code, request_id, reservation_id). Raised from _handle_commit for all three terminal- status codes. Still no auto-release (a prior commit may already have settled). Three existing tests flipped from "asserts silent" to "asserts raise": - test_commit_idempotency_mismatch_raises_uncertain - test_reservation_finalized_raises_uncertain - test_reservation_expired_raises_uncertain (new) Polish ====== - README: "Independent project / not affiliated with Google" disclaimer. - README: new "What this does NOT do" section (no signature verify, no mandate creation, no PSP, etc.). - README: tightened headline value-prop with bulleted "prevents these failure modes" list. - README: softened "any AP2-compatible SDK" → "AP2-style PaymentMandate objects via a small adapter layer; tested against the current AP2 public example shapes". - README: lifecycle + idempotency tables updated for both new behaviors. - New tests/test_ap2_shape_adapter.py: exercises AP2Mandate.from_ap2() against local dataclasses mirroring the upstream AP2 sample-types layout, so adapter brittleness surfaces here first. Test posture: 100 tests (up from 90), 98.35% coverage. ruff + mypy strict clean. `python -m build` produces sdist + wheel cleanly. Public API additions: - AP2GuardCommitUncertain (exported) Wire-shape change (acceptable because pre-PyPI): - idempotency keys now include a scope namespace: ap2:tx:... or ap2:open_mandate:... (was ap2:...) AUDIT.md and CHANGELOG.md updated. --- AUDIT.md | 25 +++++ CHANGELOG.md | 5 +- README.md | 52 ++++++--- runcycles_ap2/__init__.py | 2 + runcycles_ap2/_constants.py | 12 +++ runcycles_ap2/exceptions.py | 32 ++++++ runcycles_ap2/guard.py | 20 +++- runcycles_ap2/mapping.py | 56 +++++++--- tests/test_ap2_shape_adapter.py | 111 ++++++++++++++++++++ tests/test_double_spend_same_transaction.py | 30 +++++- tests/test_guard_clean_commit.py | 4 +- tests/test_guard_release_on_exception.py | 4 +- tests/test_idempotency.py | 52 ++++++--- tests/test_mapping.py | 94 ++++++++++++----- 14 files changed, 416 insertions(+), 83 deletions(-) create mode 100644 tests/test_ap2_shape_adapter.py diff --git a/AUDIT.md b/AUDIT.md index 7d6cc18..3a6a84c 100644 --- a/AUDIT.md +++ b/AUDIT.md @@ -2,6 +2,31 @@ Per `CLAUDE.md`: this file records material changes to the repo (server, admin, client). For a client package, that means public API, on-the-wire request shape, and protocol-conformance posture. +## 2026-05-13 — positioning-review fixes (P0-A consume-once, P0-B commit-uncertain) + +**Author:** strategic/positioning review response (still v0.1.0; not yet on PyPI) +**Scope:** consume-once correctness, post-PSP failure visibility, README accuracy + +1. **[P0-A] Open-mandate consume-once locking** — earlier rounds keyed idempotency on `transaction_id` only. That breaks the AP2 spec's normative defense (specification §6) against an autonomous agent presenting subsequent open mandates without a rejection receipt: two distinct transactions derived from the same open mandate would produce different idempotency keys and BOTH would be authorized. **Fix:** when `AP2Mandate.open_mandate_hash` is present, key the lock on `open_mandate_hash` instead of `transaction_id`. Scope namespace embedded in the key so the two buckets never collide server-side: + - human-not-present: `ap2:open_mandate:{sha256(open_mandate_hash)[:32]}:{phase}[:{suffix}]` + - default / human-present: `ap2:tx:{sha256(transaction_id)[:32]}:{phase}[:{suffix}]` + New `mapping.consume_once_input()` returns the (scope, raw_value) pair. The `idempotency_key()` function signature changed from `(transaction_id: str, ...)` to `(mandate: AP2Mandate, ...)` to enable automatic scope selection. **Wire-shape change** — pre-PyPI so no migration. New test class `TestIdempotencyKeyOpenMandateScope` + new test `test_open_mandate_overuse_collapses_distinct_transactions_onto_one_key` cover the regression. + +2. **[P0-B] Post-PSP commit failures now raise instead of silently logging** — `RESERVATION_FINALIZED`, `RESERVATION_EXPIRED`, and `IDEMPOTENCY_MISMATCH` previously returned silently with a warning log; callers only saw `guard.committed == False`. After the PSP body has run, any of these is a reconciliation event the caller MUST handle. `RESERVATION_EXPIRED` in particular is dangerous: the PSP may have charged while Cycles reclaimed the budget on TTL. **Fix:** new `AP2GuardCommitUncertain` exception (carries `error_code` + `request_id` + `reservation_id`); raised from `_handle_commit` for all three terminal-status codes. Still no auto-release (a prior commit may already have settled). Existing tests that asserted silent behavior were flipped to expect the raise; new `RESERVATION_EXPIRED` test added. + +3. **README polish** — added "Independent project — not affiliated with Google" disclaimer; added "What this does NOT do" section (no signature verification, no mandate creation, etc.); softened "any AP2-compatible SDK" wording to "AP2-style PaymentMandate objects via a small adapter layer"; tightened headline value-prop with a bulleted "prevent these failure modes" list; updated lifecycle table for commit-uncertain raise; updated idempotency key table to show the dual-scope behavior. + +4. **Real AP2-shaped adapter test** — new `tests/test_ap2_shape_adapter.py` uses local dataclasses matching the upstream AP2 sample-type layout (`PaymentMandate`, `CheckoutMandate`, `Payee`, `PaymentAmount`) so `AP2Mandate.from_ap2()` is exercised against shape closer to real AP2 SDK objects, not just our own internal fakes. If upstream renames a field, this test fails first. + +**Test posture after fixes:** +- 100 tests (up from 90), 98.35% coverage. + +**Public API additions:** +- `AP2GuardCommitUncertain` exception class (exported) + +**Wire-shape change (acceptable because pre-PyPI):** +- idempotency keys now include a scope namespace: `ap2:tx:...` or `ap2:open_mandate:...` (was `ap2:...`) + ## 2026-05-13 — sixth-round review fix (P2 non-int amount types) **Author:** code-review response on PR #1 (round 6) diff --git a/CHANGELOG.md b/CHANGELOG.md index a2f074a..66f0bc6 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -11,16 +11,17 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `cycles_guard_payment(...)` sync context manager: `reserve` on enter, `commit` on clean exit, `release` on exception. - `AP2Mandate` adapter type insulating the wrapper from upstream AP2 schema churn. - `RuntimeAuthorityReceipt` (`runtime_authority.ap2.payment.charge.v1`) — client-side, informational. -- Deterministic idempotency keys: `ap2:{sha256(transaction_id)[:32]}:{phase}[:{suffix}]`. Hash is fixed-length (128-bit collision resistance), header-safe, and the phase suffix is always preserved regardless of upstream id length. +- Deterministic idempotency keys with automatic consume-once scope selection — `ap2:open_mandate:{sha256(open_mandate_hash)[:32]}:{phase}` when the mandate carries an open mandate hash (AP2 spec §6 normative consume-once defense for human-not-present flows), `ap2:tx:{sha256(transaction_id)[:32]}:{phase}` otherwise. Hash is fixed-length (128-bit collision resistance), header-safe, and the phase suffix is always preserved. - `AP2DryRunResult` exception — raised from `__enter__` when `dry_run=True` so the `with` body cannot execute (prevents real PSP calls under dry-run from moving money off the books). - `AP2GuardCommitFailed` exception — raised after a release attempt when the server rejects a commit with an unrecognized code; carries `released: bool` and `release_error: str | None` so the caller can distinguish "budget recovered" from "budget stranded until TTL". +- `AP2GuardCommitUncertain` exception — raised on terminal commit responses (`RESERVATION_FINALIZED`, `RESERVATION_EXPIRED`, `IDEMPOTENCY_MISMATCH`) so post-PSP reconciliation events cannot be silently missed. No auto-release (a prior commit may already have settled). - USD-only enforcement; non-USD raises `AP2CurrencyError`. Rejects NaN, +/-Infinity, and amounts with more than 8 decimal places (sub-micro precision); wraps all `decimal` failures as `AP2MandateError`. - Exact integer-tuple conversion in `amount_micros()` — does not depend on the default decimal context, so large valid inputs convert exactly instead of being silently rounded. - Pre-allocation digit-count cap (≤ 19 integer digits, the int64 USD_MICROCENTS ceiling) blocks exponent-notation DoS like `Decimal("1E+1000000000000")` that would otherwise hang allocating a trillion-digit scaling factor. - Post-conversion int64 boundary check (`micros <= 2**63 - 1`) rejects amounts one over int64.max client-side instead of relying on the server. Applied symmetrically to `AP2Mandate.amount_micros()` and `GuardedPayment.set_actual_micros()` so a caller-supplied commit override cannot bypass the cap. - `set_actual_micros()` raises `AP2MandateError` (a `ValueError` subclass) for both negative and over-int64 inputs, plus any non-int type (notably `bool` and `float`, which numerical comparisons would otherwise let slip through). - Direct callers of `mapping.build_commit_body(actual_micros=...)` get the same input validation via a shared private validator. -- 90 tests, ≥ 95% coverage, ruff + mypy strict. +- 100 tests, ≥ 95% coverage, ruff + mypy strict. ### Planned for v0.2 - `AsyncGuardedPayment` (asyncio). diff --git a/README.md b/README.md index 07c65f4..176b44c 100644 --- a/README.md +++ b/README.md @@ -2,14 +2,27 @@ [![PyPI Downloads](https://img.shields.io/pypi/dm/runcycles-ap2)](https://pypi.org/project/runcycles-ap2/) [![CI](https://github.com/runcycles/cycles-ap2-python/actions/workflows/ci.yml/badge.svg)](https://github.com/runcycles/cycles-ap2-python/actions) [![License](https://img.shields.io/badge/license-Apache%202.0-blue)](LICENSE) -[![Coverage](https://img.shields.io/badge/coverage-97%25-brightgreen)](https://github.com/runcycles/cycles-ap2-python/actions) +[![Coverage](https://img.shields.io/badge/coverage-98%25-brightgreen)](https://github.com/runcycles/cycles-ap2-python/actions) [![OpenSSF Scorecard](https://api.scorecard.dev/projects/github.com/runcycles/cycles-ap2-python/badge)](https://scorecard.dev/viewer/?uri=github.com/runcycles/cycles-ap2-python) # Cycles AP2 Guard — Runtime authority for AP2 agent payments -**Cycles runtime authority guard for [AP2](https://github.com/google-agentic-commerce/AP2) (Agent Payments Protocol) — wrap every AP2 payment moment in a `reserve / commit / release` lifecycle so a valid mandate cannot be over-exercised under retries, fan-out, or concurrent checkout attempts.** Works with Google's AP2 spec and any AP2-compatible SDK (Python samples today; A2A / MCP / UCP roadmap). +**Cycles AP2 Guard adds runtime authority to [AP2](https://github.com/google-agentic-commerce/AP2) payment flows.** -AP2 answers *"was this payment authorized?"* with signed Open/Closed mandates, Payment Mandates, and Checkout Mandates. Cycles answers the complementary question: *"should this agent be allowed to attempt this payment **right now**?"* — pre-execution authority over reservation, idempotency, quota, and consume-once semantics. Install via `pip install runcycles-ap2`. +> *AP2 proves that a payment mandate is valid.* +> *Cycles decides whether this agent, tenant, run, mandate, and merchant are still allowed to attempt the payment right now.* + +Use it to prevent: + +- duplicate payment attempts under retries +- concurrent checkout races +- open-mandate overuse in human-not-present flows +- per-tenant or per-agent payment budget violations +- missing runtime audit beside AP2 receipts + +Install via `pip install runcycles-ap2`. + +> **Independent project.** This is not affiliated with, endorsed by, or maintained by Google. It is an independent Cycles integration for AP2-style payment mandate flows, built against the public AP2 specification and sample shapes. ## The problem AP2 itself flags @@ -19,6 +32,17 @@ From the AP2 spec, human-not-present flows let the agent act autonomously using That is a **runtime-state** problem: concurrency, retries, in-flight attempts, quota counters, consume-once. AP2 mandates are cryptographic *authorization*. Cycles adds the missing runtime enforcement. +When an `AP2Mandate` carries an `open_mandate_hash`, this package keys the consume-once lock on the open mandate (not the transaction id) — so every checkout derived from the same open mandate collapses onto a single reservation server-side, even when their transaction ids differ. See [Deterministic idempotency keys](#deterministic-idempotency-keys) below. + +## What this does NOT do + +Be explicit about the boundary: + +- **Does not verify AP2 signatures.** Signature checks belong to the AP2 SDK / credential provider. +- **Does not create or sign mandates.** Callers pass already-signed `PaymentMandate` / `CheckoutMandate` objects. +- **Does not replace merchant or credential-provider AP2 verification.** This guard runs *before* the PSP call as a runtime authority gate. +- **Does not move money.** The PSP call lives inside the `with` block; this package only decides whether the agent may attempt it. + ## Installation ```bash @@ -59,7 +83,7 @@ with CyclesClient(config) as client: ## From an existing AP2 SDK object -If you already hold a `PaymentMandate` (and optional `CheckoutMandate`) from the AP2 Python SDK, build an `AP2Mandate` adapter in one line. Field renames in AP2 only touch this adapter — your guard code stays stable. +If you already hold a `PaymentMandate` (and optional `CheckoutMandate`) shaped per the AP2 public examples, build an `AP2Mandate` adapter in one line. Schema renames in upstream AP2 only touch this adapter — your guard code stays stable. ```python from runcycles_ap2 import AP2Mandate @@ -67,7 +91,7 @@ from runcycles_ap2 import AP2Mandate mandate = AP2Mandate.from_ap2(payment_mandate, checkout_mandate) ``` -Required upstream attributes: `payment_mandate.transaction_id`, `payment_mandate.payment_amount.value`, `payment_mandate.payment_amount.currency`, `payment_mandate.payee.website` (or `.identifier`). Optional: `checkout_mandate.hash`. +Required upstream attributes (duck-typed): `payment_mandate.transaction_id`, `payment_mandate.payment_amount.value`, `payment_mandate.payment_amount.currency`, `payment_mandate.payee.website` (or `.identifier`). Optional: `checkout_mandate.hash`. Tested against the AP2-style field shapes used in the current public examples; not bound to any specific AP2 SDK release. ## How the guard responds @@ -77,7 +101,7 @@ Required upstream attributes: `payment_mandate.transaction_id`, `payment_mandate | `Decision.ALLOW`, body raises | **Release** | Reason `ap2_guard_failed:{ExcType}`, idempotency key includes the exception type | | `Decision.DENY` | **Neither** | `AP2GuardDenied` raised in `__enter__`; real money never moves | | HTTP / transport error on reserve | **Neither** | `AP2GuardDenied` raised; caller can retry — same `transaction_id` ⇒ same reserve key | -| Commit returns `RESERVATION_FINALIZED` / `RESERVATION_EXPIRED` / `IDEMPOTENCY_MISMATCH` | **Neither** | Logged at warning; we never auto-release after these (a previous commit may already have charged) | +| Commit returns `RESERVATION_FINALIZED` / `RESERVATION_EXPIRED` / `IDEMPOTENCY_MISMATCH` | **Raise, no release** | `AP2GuardCommitUncertain` raised so the caller cannot silently miss the reconciliation event; we don't auto-release (a prior commit may already have settled) | | Commit returns other 4xx | **Release + raise** | Reservation release attempted; `AP2GuardCommitFailed` raised with `released` + `release_error` so the caller cannot miss the reconciliation event | | `guard.abort(reason)` called inside `with` | **Release** | Reason `ap2_guard_aborted:{reason}` | | `dry_run=True` | **Neither** | `__enter__` raises `AP2DryRunResult` carrying the decision payload — the `with` body never runs, so a real PSP call cannot leak under a dry-run probe | @@ -103,15 +127,16 @@ No protocol changes required for v0.1 — `payment.charge` and `payment.refund` ## Deterministic idempotency keys -The wrapper computes idempotency keys from `transaction_id`; callers MUST NOT pass their own. This is the consume-once defense — retried `__enter__`s on the same mandate, from any process, return the original reservation: +The wrapper computes idempotency keys from the mandate; callers MUST NOT pass their own. **The lock scope shifts automatically based on what the mandate carries** — this is the AP2-spec consume-once defense: -| Phase | Key shape | -|---|---| -| Reserve | `ap2:{sha256(transaction_id)[:32]}:reserve` | -| Commit | `ap2:{sha256(transaction_id)[:32]}:commit` | -| Release | `ap2:{sha256(transaction_id)[:32]}:release:{ExcType}` | +| Mandate carries… | Key shape | Lock boundary | +|---|---|---| +| `open_mandate_hash` (human-not-present) | `ap2:open_mandate:{sha256(open_mandate_hash)[:32]}:{phase}[:{suffix}]` | every checkout derived from one open mandate collapses onto a single reservation | +| only `transaction_id` (default / human-present) | `ap2:tx:{sha256(transaction_id)[:32]}:{phase}[:{suffix}]` | one `transaction_id` == one payment attempt | + +The scope namespace (`open_mandate` or `tx`) is embedded in the key so the two buckets never collide server-side. The hash is fixed-length (SHA-256 truncated to 32 hex chars, 128-bit collision resistance), header-safe, and the phase suffix (`reserve` / `commit` / `release:{ExcType}`) is always preserved. -The transaction_id is hashed (SHA-256, first 32 hex chars — 128 bits of collision resistance) so the key is fixed-length, header-safe, and the phase suffix is always preserved regardless of how long the upstream id is. The raw `transaction_id` is still attached to `Subject.dimensions["ap2_transaction_id"]` for debug/audit. +Raw `transaction_id` and `open_mandate_hash` stay on `Subject.dimensions` for debug/audit; only the idempotency key uses the hash. ## Runtime authority receipt @@ -165,6 +190,7 @@ Exception hierarchy: | `AP2GuardError` | Base for all AP2-guard errors | | `AP2GuardDenied` | Cycles returned `DENY` or the reserve POST failed | | `AP2DryRunResult` | Raised from `__enter__` when `dry_run=True` — carries the decision payload; the `with` body never executes | +| `AP2GuardCommitUncertain` | Commit returned a terminal status (`RESERVATION_FINALIZED`, `RESERVATION_EXPIRED`, or `IDEMPOTENCY_MISMATCH`) after the body ran. No auto-release (a prior commit may already have settled). `RESERVATION_EXPIRED` is the worst case: PSP may have charged while Cycles reclaimed the budget on TTL. Reconcile with PSP | | `AP2GuardCommitFailed` | Commit was rejected with an unrecognized code after the body ran. Check `.released` (bool) and `.release_error` (string \| None) on the exception — `released=False` means budget is stranded until TTL; reconcile with PSP either way | | `AP2CurrencyError` | Non-USD mandate in v0.1 (subclass of `ValueError`) | | `AP2MandateError` | Adapter input is malformed — NaN, infinity, sub-micro precision, missing payee, etc. (subclass of `ValueError`) | diff --git a/runcycles_ap2/__init__.py b/runcycles_ap2/__init__.py index cc7bf46..8647c02 100644 --- a/runcycles_ap2/__init__.py +++ b/runcycles_ap2/__init__.py @@ -6,6 +6,7 @@ AP2CurrencyError, AP2DryRunResult, AP2GuardCommitFailed, + AP2GuardCommitUncertain, AP2GuardDenied, AP2GuardError, AP2MandateError, @@ -19,6 +20,7 @@ "AP2CurrencyError", "AP2DryRunResult", "AP2GuardCommitFailed", + "AP2GuardCommitUncertain", "AP2GuardDenied", "AP2GuardError", "AP2Mandate", diff --git a/runcycles_ap2/_constants.py b/runcycles_ap2/_constants.py index dbd8cc4..437b802 100644 --- a/runcycles_ap2/_constants.py +++ b/runcycles_ap2/_constants.py @@ -31,6 +31,18 @@ # Idempotency key prefix used for all phases. IDEMPOTENCY_PREFIX: Final[str] = "ap2" +# Scope namespace embedded in the idempotency key. The scope determines what counts as +# "the same payment attempt" for server-side dedup: +# - "open_mandate": all checkouts derived from one open mandate collapse onto one +# reservation. This matches AP2's normative warning (specification §6) that a +# shopping agent must not present subsequent open mandates without a rejection +# receipt — without this scope, two transactions from one open mandate produce +# different transaction_ids and both would be authorized. +# - "tx": one transaction_id == one payment attempt (the default for human-present +# and any flow without an open mandate). +IDEMPOTENCY_SCOPE_OPEN_MANDATE: Final[str] = "open_mandate" +IDEMPOTENCY_SCOPE_TRANSACTION: Final[str] = "tx" + # Length of the hex-encoded SHA-256 prefix used in idempotency keys. # 32 hex chars == 128 bits of collision resistance. Keys stay well inside the 256-char # protocol cap, so the phase suffix (`reserve`/`commit`/`release:{ExcType}`) is always diff --git a/runcycles_ap2/exceptions.py b/runcycles_ap2/exceptions.py index 3fc33f6..eb2c712 100644 --- a/runcycles_ap2/exceptions.py +++ b/runcycles_ap2/exceptions.py @@ -52,6 +52,38 @@ def __init__( self.affected_scopes = affected_scopes +class AP2GuardCommitUncertain(AP2GuardError): + """Cycles returned a terminal status for this reservation on commit. + + Raised when the server replies with one of: + - ``RESERVATION_FINALIZED`` — a prior attempt already finalized the reservation; + usually a benign replay, but we can't verify "matching actuals" client-side. + - ``RESERVATION_EXPIRED`` — the TTL elapsed before we could commit. The server + reclaimed the budget on the cycles side, but the PSP body has already run, so + the payment may have moved without a Cycles settlement. + - ``IDEMPOTENCY_MISMATCH`` — a prior commit under this idempotency key carried + a different payload (often: different actuals across retries). + + After the PSP body has executed, any of these is a reconciliation event. We do + NOT auto-release for them (a previous successful commit would be undone). The + caller MUST handle this exception — silently returning would let unreconciled + payment state propagate. + """ + + def __init__( + self, + message: str, + *, + error_code: str | None = None, + request_id: str | None = None, + reservation_id: str | None = None, + ) -> None: + super().__init__(message) + self.error_code = error_code + self.request_id = request_id + self.reservation_id = reservation_id + + class AP2GuardCommitFailed(AP2GuardError): """Cycles rejected the commit AFTER the body ran (PSP may already have charged). diff --git a/runcycles_ap2/guard.py b/runcycles_ap2/guard.py index f43b051..1f8736f 100644 --- a/runcycles_ap2/guard.py +++ b/runcycles_ap2/guard.py @@ -12,7 +12,7 @@ from runcycles_ap2._constants import DEFAULT_ACTION_KIND, DEFAULT_OVERAGE_POLICY, DEFAULT_TTL_MS from runcycles_ap2._validation import validate_micros -from runcycles_ap2.exceptions import AP2DryRunResult, AP2GuardCommitFailed, AP2GuardDenied +from runcycles_ap2.exceptions import AP2DryRunResult, AP2GuardCommitFailed, AP2GuardCommitUncertain, AP2GuardDenied from runcycles_ap2.mapping import ( build_action, build_commit_body, @@ -252,15 +252,25 @@ def _handle_commit(self) -> None: error_code = error.error_code.value if (error and error.error_code) else None request_id = error.request_id if error else None if error_code in ("RESERVATION_FINALIZED", "RESERVATION_EXPIRED", "IDEMPOTENCY_MISMATCH"): - # Benign replay: a previous attempt already finalized the reservation. The - # caller's payment record is correct from that prior attempt; nothing to do. + # The reservation is in a terminal state on the server. We do NOT auto-release + # (that would undo a prior successful commit), but we also do NOT return + # silently — after the PSP body has run, any of these is a reconciliation + # event the caller MUST see. RESERVATION_EXPIRED in particular is dangerous: + # the PSP may have charged while Cycles reclaimed the budget on TTL. logger.warning( - "AP2 commit returned %s (no release): id=%s, tx=%s", + "AP2 commit returned %s (terminal, raising uncertain): id=%s, tx=%s", error_code, self._reservation_id, self._mandate.transaction_id, ) - return + raise AP2GuardCommitUncertain( + f"AP2 commit returned {error_code} for transaction {self._mandate.transaction_id}. " + "Reservation is in a terminal state; PSP state may need reconciliation. " + "No release was attempted (a prior commit may already have settled).", + error_code=error_code, + request_id=request_id, + reservation_id=self._reservation_id, + ) # Unrecognized commit rejection. The PSP may already have moved money, so we # release the budget and raise — the caller MUST reconcile, and a silent diff --git a/runcycles_ap2/mapping.py b/runcycles_ap2/mapping.py index c361d36..39c8030 100644 --- a/runcycles_ap2/mapping.py +++ b/runcycles_ap2/mapping.py @@ -19,14 +19,16 @@ DIM_OPEN_MANDATE_HASH, DIM_RUN_ID, IDEMPOTENCY_PREFIX, + IDEMPOTENCY_SCOPE_OPEN_MANDATE, + IDEMPOTENCY_SCOPE_TRANSACTION, TRANSACTION_ID_HASH_LEN, ) from runcycles_ap2._validation import validate_micros from runcycles_ap2.models import AP2Mandate -def _hash_transaction_id(transaction_id: str) -> str: - """SHA-256 of the raw ``transaction_id``, hex-encoded and truncated. +def _hash_input(value: str) -> str: + """SHA-256 of the input string, hex-encoded and truncated to 32 chars. Hashing guarantees: - fixed-length output regardless of input (avoids the 256-char header truncation @@ -36,22 +38,46 @@ def _hash_transaction_id(transaction_id: str) -> str: Truncating to 32 hex chars preserves 128 bits of collision resistance — more than enough for an idempotency-key namespace scoped to a single tenant. """ - digest = hashlib.sha256(transaction_id.encode("utf-8")).hexdigest() + digest = hashlib.sha256(value.encode("utf-8")).hexdigest() return digest[:TRANSACTION_ID_HASH_LEN] -def idempotency_key(transaction_id: str, phase: str, suffix: str | None = None) -> str: - """Deterministic idempotency key: ``ap2:{sha256(tx)[:32]}:{phase}[:{suffix}]``. +def consume_once_input(mandate: AP2Mandate) -> tuple[str, str]: + """Pick the lock scope + raw input for this mandate. - The phase suffix is ALWAYS preserved (fixed-length hash + short phase fit comfortably - inside the 256-char protocol cap). The same ``transaction_id`` reaching the same phase - from any retry, parallel worker, or process restart produces the same key — the - consume-once defense. + Returns ``(scope, raw_value)`` where: + - ``scope == "open_mandate"`` and ``raw_value == mandate.open_mandate_hash`` when + the mandate carries an open mandate hash. This makes every checkout derived from + the same open mandate collide on the same idempotency key — the server's + ``(tenant, endpoint, idempotency_key)`` dedup then collapses them onto one + reservation. **This is the AP2 spec's normative defense** against an autonomous + agent presenting subsequent open mandates without a rejection receipt + (specification §6). + - ``scope == "tx"`` and ``raw_value == mandate.transaction_id`` otherwise. One + transaction == one payment attempt (the human-present / no-open-mandate case). - The raw ``transaction_id`` is still attached to ``Subject.dimensions["ap2_transaction_id"]`` - on the Cycles wire payload for debug/audit; only the idempotency key uses the hash. + The scope is embedded in the idempotency key so the two namespaces cannot collide + server-side. """ - base = f"{IDEMPOTENCY_PREFIX}:{_hash_transaction_id(transaction_id)}:{phase}" + if mandate.open_mandate_hash: + return (IDEMPOTENCY_SCOPE_OPEN_MANDATE, mandate.open_mandate_hash) + return (IDEMPOTENCY_SCOPE_TRANSACTION, mandate.transaction_id) + + +def idempotency_key(mandate: AP2Mandate, phase: str, suffix: str | None = None) -> str: + """Deterministic idempotency key: ``ap2:{scope}:{sha256(input)[:32]}:{phase}[:{suffix}]``. + + The scope (``open_mandate`` or ``tx``) is picked automatically from the mandate — + see :func:`consume_once_input` for the rules. The phase suffix is ALWAYS preserved + (the hash is fixed-length and short scope/phase tokens fit comfortably inside the + 256-char protocol cap). + + The raw ``transaction_id`` and ``open_mandate_hash`` are still attached to + ``Subject.dimensions`` on the Cycles wire payload for debug/audit; only the + idempotency key uses the hash. + """ + scope, raw = consume_once_input(mandate) + base = f"{IDEMPOTENCY_PREFIX}:{scope}:{_hash_input(raw)}:{phase}" if suffix: # Header-safe charset: alphanumeric, underscore, hyphen, dot. Anything else # becomes ``_`` so the resulting key is always a valid HTTP header value. @@ -144,7 +170,7 @@ def build_reservation_body( ) -> dict[str, Any]: """Assemble the full reservation create request body, including deterministic idempotency key.""" body: dict[str, Any] = { - "idempotency_key": idempotency_key(mandate.transaction_id, "reserve"), + "idempotency_key": idempotency_key(mandate, "reserve"), "subject": build_subject( mandate, run_id=run_id, @@ -185,7 +211,7 @@ def build_commit_body( else: amount = validate_micros(actual_micros, field="actual_micros") body: dict[str, Any] = { - "idempotency_key": idempotency_key(mandate.transaction_id, "commit"), + "idempotency_key": idempotency_key(mandate, "commit"), "actual": {"unit": "USD_MICROCENTS", "amount": amount}, } if metadata: @@ -201,6 +227,6 @@ def build_release_body( ) -> dict[str, Any]: """Release body with deterministic idempotency key per exception type.""" return { - "idempotency_key": idempotency_key(mandate.transaction_id, "release", exception_type), + "idempotency_key": idempotency_key(mandate, "release", exception_type), "reason": reason[:256], } diff --git a/tests/test_ap2_shape_adapter.py b/tests/test_ap2_shape_adapter.py new file mode 100644 index 0000000..bd2c289 --- /dev/null +++ b/tests/test_ap2_shape_adapter.py @@ -0,0 +1,111 @@ +"""Adapter test against object shapes mirroring the AP2 public sample types. + +The upstream AP2 Python SDK isn't pinned on PyPI yet, so we don't import from it. +Instead this test exercises `AP2Mandate.from_ap2()` against object shapes that +match the current `google-agentic-commerce/AP2` sample-types layout — same field +names and nesting, just without the protobuf/pydantic machinery. If upstream +renames a field, this test fails and we fix the adapter in one place. +""" + +from __future__ import annotations + +from dataclasses import dataclass, field + +from runcycles_ap2 import AP2Mandate, cycles_guard_payment +from runcycles_ap2.mapping import idempotency_key +from tests.conftest import allow_response, commit_success_response + +# Mirrors AP2 sample types: `PaymentAmount`, `Payee`, `PaymentMandate`, `CheckoutMandate`. +# Field names are taken from the AP2 spec's payment_mandate / checkout_mandate sections. + + +@dataclass +class _PaymentAmount: + value: str + currency: str + + +@dataclass +class _Payee: + website: str + identifier: str | None = None + + +@dataclass +class _PaymentMandate: + transaction_id: str + payment_amount: _PaymentAmount + payee: _Payee + + +@dataclass +class _CheckoutMandate: + hash: str + cart: list[str] = field(default_factory=list) # extra field — should be ignored + + +class TestAP2ShapeAdapter: + def test_from_ap2_minimal_human_present_flow(self) -> None: + pm = _PaymentMandate( + transaction_id="ap2-tx-real-001", + payment_amount=_PaymentAmount(value="49.99", currency="USD"), + payee=_Payee(website="shop.example"), + ) + mandate = AP2Mandate.from_ap2(pm) + + assert mandate.transaction_id == "ap2-tx-real-001" + assert mandate.amount_value == "49.99" + assert mandate.currency == "USD" + assert mandate.payee_website == "shop.example" + assert mandate.checkout_hash is None + assert mandate.open_mandate_hash is None + # Conversion still exact through the adapter. + assert mandate.amount_micros() == 4_999_000_000 + + def test_from_ap2_human_not_present_with_checkout(self) -> None: + pm = _PaymentMandate( + transaction_id="ap2-tx-real-002", + payment_amount=_PaymentAmount(value="199.00", currency="USD"), + payee=_Payee(website="merchant.example"), + ) + cm = _CheckoutMandate(hash="ch_real_002", cart=["item-1", "item-2"]) + mandate = AP2Mandate.from_ap2(pm, cm, open_mandate_hash="omh_real_xyz") + + assert mandate.checkout_hash == "ch_real_002" + assert mandate.open_mandate_hash == "omh_real_xyz" + # When open_mandate_hash is present, the consume-once lock keys on it. + assert ":open_mandate:" in idempotency_key(mandate, "reserve") + + def test_from_ap2_payee_identifier_fallback(self) -> None: + # Some AP2 sample shapes use `identifier` when no website is set. + pm = _PaymentMandate( + transaction_id="ap2-tx-real-003", + payment_amount=_PaymentAmount(value="1.00", currency="USD"), + payee=_Payee(website="", identifier="merchant-id-xyz"), # type: ignore[arg-type] + ) + # website="" + identifier="merchant-id-xyz" → falls back to identifier + mandate = AP2Mandate.from_ap2(pm) + assert mandate.payee_website == "merchant-id-xyz" + + def test_end_to_end_through_guard(self, mock_client) -> None: + # Sanity: an AP2-shape mandate flows through the guard like any other. + mock_client.create_reservation.return_value = allow_response("rsv_ap2_real") + mock_client.commit_reservation.return_value = commit_success_response() + + pm = _PaymentMandate( + transaction_id="ap2-tx-real-e2e", + payment_amount=_PaymentAmount(value="12.50", currency="USD"), + payee=_Payee(website="merchant.example"), + ) + cm = _CheckoutMandate(hash="ch_e2e") + mandate = AP2Mandate.from_ap2(pm, cm, open_mandate_hash="omh_e2e") + + with cycles_guard_payment(mock_client, mandate=mandate, run_id="r", tenant="acme") as guard: + assert guard.reservation_id == "rsv_ap2_real" + + body = mock_client.create_reservation.call_args[0][0] + assert body["subject"]["dimensions"]["ap2_transaction_id"] == "ap2-tx-real-e2e" + assert body["subject"]["dimensions"]["checkout_hash"] == "ch_e2e" + assert body["subject"]["dimensions"]["open_mandate_hash"] == "omh_e2e" + # Lock scope shifted because open_mandate_hash was present. + assert ":open_mandate:" in body["idempotency_key"] diff --git a/tests/test_double_spend_same_transaction.py b/tests/test_double_spend_same_transaction.py index 2978f3e..7c92e5c 100644 --- a/tests/test_double_spend_same_transaction.py +++ b/tests/test_double_spend_same_transaction.py @@ -29,8 +29,8 @@ def test_two_guards_same_transaction_send_same_reserve_key(self, mock_client) -> body1 = mock_client.create_reservation.call_args_list[0][0][0] body2 = mock_client.create_reservation.call_args_list[1][0][0] - expected_reserve = idempotency_key("ap2-tx-shared", "reserve") - expected_commit = idempotency_key("ap2-tx-shared", "commit") + expected_reserve = idempotency_key(make_mandate(transaction_id="ap2-tx-shared"), "reserve") + expected_commit = idempotency_key(make_mandate(transaction_id="ap2-tx-shared"), "commit") assert body1["idempotency_key"] == body2["idempotency_key"] == expected_reserve # Commit keys also collide — the server collapses both onto one reservation. commit1 = mock_client.commit_reservation.call_args_list[0][0][1] @@ -49,3 +49,29 @@ def test_different_transactions_produce_different_keys(self, mock_client) -> Non body1 = mock_client.create_reservation.call_args_list[0][0][0] body2 = mock_client.create_reservation.call_args_list[1][0][0] assert body1["idempotency_key"] != body2["idempotency_key"] + + def test_open_mandate_overuse_collapses_distinct_transactions_onto_one_key(self, mock_client) -> None: + # P0-A: this is the AP2 spec consume-once defense (specification §6). Two + # distinct transaction_ids spawned from one open mandate must share the same + # reserve key, so the server's (tenant, endpoint, idempotency_key) dedup + # collapses them onto a single reservation. Without this, an autonomous agent + # could authorize multiple checkouts using one open mandate. + mock_client.create_reservation.return_value = allow_response("rsv_omh_shared") + mock_client.commit_reservation.return_value = commit_success_response() + + m1 = make_mandate(transaction_id="ap2-tx-CHECKOUT-1", open_mandate_hash="omh_hnp_shared") + m2 = make_mandate(transaction_id="ap2-tx-CHECKOUT-2", open_mandate_hash="omh_hnp_shared") + + with cycles_guard_payment(mock_client, mandate=m1, run_id="r1", tenant="acme"): + pass + with cycles_guard_payment(mock_client, mandate=m2, run_id="r2", tenant="acme"): + pass + + body1 = mock_client.create_reservation.call_args_list[0][0][0] + body2 = mock_client.create_reservation.call_args_list[1][0][0] + # Same reserve key despite different transaction_ids — that's the lock. + assert body1["idempotency_key"] == body2["idempotency_key"] + assert ":open_mandate:" in body1["idempotency_key"] + # Transaction ids stay distinct in dimensions for audit. + assert body1["subject"]["dimensions"]["ap2_transaction_id"] == "ap2-tx-CHECKOUT-1" + assert body2["subject"]["dimensions"]["ap2_transaction_id"] == "ap2-tx-CHECKOUT-2" diff --git a/tests/test_guard_clean_commit.py b/tests/test_guard_clean_commit.py index 49458c9..4560b30 100644 --- a/tests/test_guard_clean_commit.py +++ b/tests/test_guard_clean_commit.py @@ -24,7 +24,7 @@ def test_commit_called_with_ap2_idempotency_key(self, mock_client, mandate) -> N mock_client.commit_reservation.assert_called_once() called_id, called_body = mock_client.commit_reservation.call_args[0] assert called_id == "rsv_clean_001" - assert called_body["idempotency_key"] == idempotency_key("ap2-tx-001", "commit") + assert called_body["idempotency_key"] == idempotency_key(mandate, "commit") assert called_body["actual"] == {"unit": "USD_MICROCENTS", "amount": 19_900_000_000} mock_client.release_reservation.assert_not_called() @@ -132,7 +132,7 @@ def test_reserve_body_carries_policy_keys(self, mock_client, mandate) -> None: pass body = mock_client.create_reservation.call_args[0][0] - assert body["idempotency_key"] == idempotency_key("ap2-tx-001", "reserve") + assert body["idempotency_key"] == idempotency_key(mandate, "reserve") assert body["action"]["policy_keys"]["host"] == "merchant.example" assert body["action"]["policy_keys"]["custom"]["payment_protocol"] == "ap2" assert body["subject"]["dimensions"]["ap2_transaction_id"] == "ap2-tx-001" diff --git a/tests/test_guard_release_on_exception.py b/tests/test_guard_release_on_exception.py index 9958e54..952fab2 100644 --- a/tests/test_guard_release_on_exception.py +++ b/tests/test_guard_release_on_exception.py @@ -21,7 +21,7 @@ def test_runtime_error_triggers_release(self, mock_client, mandate) -> None: mock_client.release_reservation.assert_called_once() called_id, body = mock_client.release_reservation.call_args[0] assert called_id == "rsv_rel" - assert body["idempotency_key"] == idempotency_key("ap2-tx-001", "release", "RuntimeError") + assert body["idempotency_key"] == idempotency_key(mandate, "release", "RuntimeError") assert body["reason"].startswith("ap2_guard_failed:RuntimeError") mock_client.commit_reservation.assert_not_called() @@ -34,7 +34,7 @@ def test_value_error_uses_exception_type_in_key(self, mock_client, mandate) -> N raise ValueError("nope") body = mock_client.release_reservation.call_args[0][1] - assert body["idempotency_key"] == idempotency_key("ap2-tx-001", "release", "ValueError") + assert body["idempotency_key"] == idempotency_key(mandate, "release", "ValueError") def test_abort_releases_on_clean_exit(self, mock_client, mandate) -> None: mock_client.create_reservation.return_value = allow_response("rsv_abort") diff --git a/tests/test_idempotency.py b/tests/test_idempotency.py index 141d335..283458e 100644 --- a/tests/test_idempotency.py +++ b/tests/test_idempotency.py @@ -2,7 +2,9 @@ from __future__ import annotations -from runcycles_ap2 import cycles_guard_payment +import pytest + +from runcycles_ap2 import AP2GuardCommitUncertain, cycles_guard_payment from runcycles_ap2.mapping import idempotency_key from tests.conftest import allow_response, commit_error_response, commit_success_response @@ -16,27 +18,49 @@ def test_reserve_key_is_deterministic_per_transaction(self, mock_client, mandate pass body = mock_client.create_reservation.call_args[0][0] - assert body["idempotency_key"] == idempotency_key(mandate.transaction_id, "reserve") + assert body["idempotency_key"] == idempotency_key(mandate, "reserve") - def test_commit_idempotency_mismatch_does_not_release(self, mock_client, mandate) -> None: - # If a previous commit ran with a different payload under the same key, server returns - # IDEMPOTENCY_MISMATCH. We must not auto-release — that would attempt to undo a real charge. + def test_commit_idempotency_mismatch_raises_uncertain(self, mock_client, mandate) -> None: + # P0-B: previously silent. After the PSP body has run, IDEMPOTENCY_MISMATCH means + # a prior commit under this key carried a different payload — that's a + # reconciliation event the caller MUST handle. We still don't auto-release + # (a prior successful commit may exist). mock_client.create_reservation.return_value = allow_response() mock_client.commit_reservation.return_value = commit_error_response("IDEMPOTENCY_MISMATCH", status=409) - with cycles_guard_payment(mock_client, mandate=mandate, run_id="r", tenant="acme") as guard: - pass + with pytest.raises(AP2GuardCommitUncertain) as ei: + with cycles_guard_payment(mock_client, mandate=mandate, run_id="r", tenant="acme") as _: + pass + assert ei.value.error_code == "IDEMPOTENCY_MISMATCH" + assert ei.value.reservation_id == "rsv_ap2_001" mock_client.release_reservation.assert_not_called() - assert guard.committed is False - def test_reservation_finalized_does_not_release(self, mock_client, mandate) -> None: + def test_reservation_finalized_raises_uncertain(self, mock_client, mandate) -> None: + # P0-B: previously silent. Usually a benign replay, but we can't verify "benign" + # client-side — and the caller may have run a PSP charge before the replay. + # Surface it; the caller decides whether to ignore. mock_client.create_reservation.return_value = allow_response() mock_client.commit_reservation.return_value = commit_error_response("RESERVATION_FINALIZED", status=409) - with cycles_guard_payment(mock_client, mandate=mandate, run_id="r", tenant="acme") as _: - pass + with pytest.raises(AP2GuardCommitUncertain) as ei: + with cycles_guard_payment(mock_client, mandate=mandate, run_id="r", tenant="acme") as _: + pass + assert ei.value.error_code == "RESERVATION_FINALIZED" + mock_client.release_reservation.assert_not_called() + + def test_reservation_expired_raises_uncertain(self, mock_client, mandate) -> None: + # P0-B: the worst case. The PSP body has run; server reclaimed budget on TTL. + # The caller MUST reconcile — we never want this to be a silent log line. + mock_client.create_reservation.return_value = allow_response() + mock_client.commit_reservation.return_value = commit_error_response("RESERVATION_EXPIRED", status=409) + + with pytest.raises(AP2GuardCommitUncertain) as ei: + with cycles_guard_payment(mock_client, mandate=mandate, run_id="r", tenant="acme") as _: + pass + + assert ei.value.error_code == "RESERVATION_EXPIRED" mock_client.release_reservation.assert_not_called() def test_unrecognized_commit_error_releases_and_raises(self, mock_client, mandate) -> None: @@ -44,8 +68,6 @@ def test_unrecognized_commit_error_releases_and_raises(self, mock_client, mandat # release the reservation AND raise AP2GuardCommitFailed so the caller cannot # miss the reconciliation event (the silent `guard.committed == False` signal # from earlier versions was too easy to overlook). - import pytest - from runcycles_ap2 import AP2GuardCommitFailed from tests.conftest import release_success_response @@ -71,8 +93,6 @@ def test_commit_failed_records_release_failure(self, mock_client, mandate) -> No # must report `released=False` and surface the release error in its message. # Earlier versions said "reservation released" regardless, misleading callers # about whether budget had actually been recovered. - import pytest - from runcycles_ap2 import AP2GuardCommitFailed mock_client.create_reservation.return_value = allow_response() @@ -91,8 +111,6 @@ def test_commit_failed_records_release_failure(self, mock_client, mandate) -> No def test_commit_failed_records_release_non_success(self, mock_client, mandate) -> None: # Same as above but the release POST returned a 5xx — server didn't release. - import pytest - from runcycles_ap2 import AP2GuardCommitFailed mock_client.create_reservation.return_value = allow_response() diff --git a/tests/test_mapping.py b/tests/test_mapping.py index 4b6a688..3aef924 100644 --- a/tests/test_mapping.py +++ b/tests/test_mapping.py @@ -34,53 +34,97 @@ def _expected_hash(tx: str) -> str: class TestIdempotencyKey: - def test_reserve_phase(self) -> None: + """Default scope: transaction-id (when no open_mandate_hash).""" + + def test_reserve_phase_tx_scope(self) -> None: h = _expected_hash("ap2-tx-001") - assert idempotency_key("ap2-tx-001", "reserve") == f"ap2:{h}:reserve" + assert idempotency_key(make_mandate(), "reserve") == f"ap2:tx:{h}:reserve" - def test_commit_phase(self) -> None: + def test_commit_phase_tx_scope(self) -> None: h = _expected_hash("ap2-tx-001") - assert idempotency_key("ap2-tx-001", "commit") == f"ap2:{h}:commit" + assert idempotency_key(make_mandate(), "commit") == f"ap2:tx:{h}:commit" - def test_release_with_suffix(self) -> None: + def test_release_with_suffix_tx_scope(self) -> None: h = _expected_hash("ap2-tx-001") - assert idempotency_key("ap2-tx-001", "release", "RuntimeError") == f"ap2:{h}:release:RuntimeError" + assert idempotency_key(make_mandate(), "release", "RuntimeError") == f"ap2:tx:{h}:release:RuntimeError" def test_release_sanitizes_unsafe_chars(self) -> None: # Whitespace, slashes, and other header-unsafe chars get replaced with `_`. + m = make_mandate(transaction_id="tx") h = _expected_hash("tx") - key = idempotency_key("tx", "release", "some/bad value") - assert key == f"ap2:{h}:release:some_bad_value" + key = idempotency_key(m, "release", "some/bad value") + assert key == f"ap2:tx:{h}:release:some_bad_value" def test_long_transaction_id_does_not_drop_phase(self) -> None: # P1 regression: a 256-char tx_id used to overflow the cap and have the phase - # suffix sliced off. Now the hash is fixed-length, so all phases stay distinct. - long_tx = "x" * 256 - reserve = idempotency_key(long_tx, "reserve") - commit = idempotency_key(long_tx, "commit") - release = idempotency_key(long_tx, "release", "RuntimeError") + # suffix sliced off. The fixed-length hash keeps phases distinct. + m = make_mandate(transaction_id="x" * 256) + reserve = idempotency_key(m, "reserve") + commit = idempotency_key(m, "commit") + release = idempotency_key(m, "release", "RuntimeError") assert reserve.endswith(":reserve") assert commit.endswith(":commit") assert release.endswith(":release:RuntimeError") assert reserve != commit != release def test_distinct_long_tx_ids_yield_distinct_keys(self) -> None: - # P1 regression: two tx_ids sharing the first 252 chars used to collide on - # reserve. SHA-256 distinguishes them. - a = ("y" * 252) + "AAAA" - b = ("y" * 252) + "BBBB" + # P1 regression: two tx_ids sharing the first 252 chars used to collide on reserve. + a = make_mandate(transaction_id=("y" * 252) + "AAAA") + b = make_mandate(transaction_id=("y" * 252) + "BBBB") assert idempotency_key(a, "reserve") != idempotency_key(b, "reserve") def test_unsafe_transaction_id_chars_do_not_leak_to_header(self) -> None: - # P1 regression: tx_ids containing whitespace / control bytes used to reach the - # Idempotency-Key header. The hash output is hex only — header-safe by design. - key = idempotency_key("tx with\nnewline\tand\r\x00null", "reserve") + # P1 regression: tx_ids with whitespace / control bytes used to reach the header. + m = make_mandate(transaction_id="tx with\nnewline\tand\r\x00null") + key = idempotency_key(m, "reserve") assert "\n" not in key and "\t" not in key and "\r" not in key and "\x00" not in key assert all(c.isalnum() or c in (":", "_", "-", ".") for c in key) def test_key_length_bounded(self) -> None: - # Hash + phase + suffix always fits inside the protocol's 256-char cap. - assert len(idempotency_key("x" * 1024, "release", "x" * 1024)) <= 256 + # Hash + phase + suffix always fits inside the protocol's 256-char cap, even + # at the longest possible transaction_id (model max 256) and a max-length + # exception-type suffix. + m = make_mandate(transaction_id="x" * 256) + assert len(idempotency_key(m, "release", "x" * 256)) <= 256 + + +class TestIdempotencyKeyOpenMandateScope: + """P0-A: when ``open_mandate_hash`` is present, the lock scope shifts to it. + + This is the AP2-spec consume-once boundary for human-not-present flows: every + checkout derived from the same open mandate must collapse onto one reservation, + even when their transaction_ids differ. + """ + + def test_open_mandate_scope_used_when_hash_present(self) -> None: + m = make_mandate(open_mandate_hash="omh_abc") + h = _expected_hash("omh_abc") + # Scope namespace is `open_mandate`, hash input is the open_mandate_hash — + # the transaction_id is NOT mixed in here (it'd defeat the whole point). + assert idempotency_key(m, "reserve") == f"ap2:open_mandate:{h}:reserve" + assert idempotency_key(m, "commit") == f"ap2:open_mandate:{h}:commit" + + def test_different_transactions_same_open_mandate_share_key(self) -> None: + # Two checkouts spawned from one open mandate must collide on the reserve key + # so server-side dedup collapses them onto a single reservation. + a = make_mandate(transaction_id="ap2-tx-AAA", open_mandate_hash="omh_shared") + b = make_mandate(transaction_id="ap2-tx-BBB", open_mandate_hash="omh_shared") + assert idempotency_key(a, "reserve") == idempotency_key(b, "reserve") + assert idempotency_key(a, "commit") == idempotency_key(b, "commit") + + def test_tx_and_open_mandate_namespaces_never_collide(self) -> None: + # Even if some pathological caller arranged for transaction_id and + # open_mandate_hash to hash to the same value, the scope namespace ("tx" vs + # "open_mandate") embedded in the key keeps them in distinct dedup buckets. + tx_only = make_mandate(transaction_id="same-input") + omh_only = make_mandate(transaction_id="other-tx", open_mandate_hash="same-input") + assert idempotency_key(tx_only, "reserve") != idempotency_key(omh_only, "reserve") + + def test_open_mandate_scope_takes_precedence_over_transaction_id(self) -> None: + # When both are present, open_mandate wins — that's the consume-once boundary. + m = make_mandate(transaction_id="ap2-tx-001", open_mandate_hash="omh_xyz") + assert "open_mandate" in idempotency_key(m, "reserve") + assert _expected_hash("ap2-tx-001") not in idempotency_key(m, "reserve") class TestBuildSubject: @@ -322,7 +366,7 @@ def test_full_shape(self) -> None: overage_policy="REJECT", dry_run=False, ) - assert body["idempotency_key"] == f"ap2:{_expected_hash('ap2-tx-001')}:reserve" + assert body["idempotency_key"] == f"ap2:tx:{_expected_hash('ap2-tx-001')}:reserve" assert body["subject"]["tenant"] == "acme" assert body["action"]["kind"] == "payment.charge" assert body["estimate"] == {"unit": "USD_MICROCENTS", "amount": 19_900_000_000} @@ -370,7 +414,7 @@ class TestBuildCommitBody: def test_defaults_to_mandate_amount(self) -> None: body = build_commit_body(make_mandate()) assert body["actual"]["amount"] == 19_900_000_000 - assert body["idempotency_key"] == f"ap2:{_expected_hash('ap2-tx-001')}:commit" + assert body["idempotency_key"] == f"ap2:tx:{_expected_hash('ap2-tx-001')}:commit" def test_actual_override(self) -> None: body = build_commit_body(make_mandate(), actual_micros=5_000_000_000) @@ -400,7 +444,7 @@ def test_actual_micros_above_int64_rejected(self) -> None: class TestBuildReleaseBody: def test_with_exception_type(self) -> None: body = build_release_body(make_mandate(), reason="fail", exception_type="RuntimeError") - assert body["idempotency_key"] == f"ap2:{_expected_hash('ap2-tx-001')}:release:RuntimeError" + assert body["idempotency_key"] == f"ap2:tx:{_expected_hash('ap2-tx-001')}:release:RuntimeError" assert body["reason"] == "fail" def test_reason_truncated(self) -> None: From 1797db40ab83e7ee1ab2b2743efd13cd649c2f42 Mon Sep 17 00:00:00 2001 From: Albert Mavashev Date: Wed, 13 May 2026 10:46:56 -0400 Subject: [PATCH 2/6] docs: tighten consume-once wording + add IDEMPOTENCY_MISMATCH-on-reserve test MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Per positioning-review round 2 on PR #2. 1) "Collapses onto one reservation" was an overclaim. The code makes the idempotency key collide; the payload may still differ (distinct transaction_id / checkout_hash on each AP2 checkout from the same open mandate). Cycles semantics: same key + same payload replays; same key + different payload → 409 IDEMPOTENCY_MISMATCH (surfaced as AP2GuardDenied). Both block a second valid reservation, but the mechanism differs. Reworded in README, AUDIT, and two test docstrings. 2) GuardedPayment.__doc__ and the README lifecycle table still said the retry/lock key is "transaction_id". Generalized to "consume-once key — open_mandate_hash when present, otherwise transaction_id". 3) New regression test: test_open_mandate_divergent_payload_reserve_mismatch_raises_denied exercises the second half of the consume-once contract — same open mandate hash, distinct transaction_id, first reserve allowed, second reserve gets 409 IDEMPOTENCY_MISMATCH from the server and the guard raises AP2GuardDenied(reason_code="IDEMPOTENCY_MISMATCH"). Documents what the spec really gives you for divergent payloads. 4) Reject empty open_mandate_hash / checkout_hash via pydantic min_length=1. Previously an empty string passed validation and consume_once_input's truthy check silently fell back to the tx scope — data corruption disguised as the default. Two new tests cover the rejection paths. Test posture: 103 tests (up from 100), 98.35% coverage. ruff + mypy strict clean. `python -m build` produces sdist + wheel cleanly. No public API additions. AUDIT.md and CHANGELOG.md updated. --- AUDIT.md | 18 +++++++ CHANGELOG.md | 3 +- README.md | 13 +++-- runcycles_ap2/guard.py | 7 ++- runcycles_ap2/models.py | 9 +++- tests/test_double_spend_same_transaction.py | 59 ++++++++++++++++++--- tests/test_mapping.py | 30 ++++++++++- 7 files changed, 122 insertions(+), 17 deletions(-) diff --git a/AUDIT.md b/AUDIT.md index 3a6a84c..381f612 100644 --- a/AUDIT.md +++ b/AUDIT.md @@ -2,6 +2,24 @@ Per `CLAUDE.md`: this file records material changes to the repo (server, admin, client). For a client package, that means public API, on-the-wire request shape, and protocol-conformance posture. +## 2026-05-13 — positioning-review follow-ups (wording accuracy + empty-hash guard) + +**Author:** strategic/positioning review round 2 (PR #2 in-flight) +**Scope:** documentation accuracy, additional regression coverage, defensive validation + +1. **Reworded "collapses onto one reservation"** — earlier wording implied server-side merging when the actual semantics are: same key + same payload replays the original; same key + divergent payload returns `409 IDEMPOTENCY_MISMATCH` (surfaced as `AP2GuardDenied`). Both prevent a second valid reservation, but the mechanism differs. README, AUDIT, and two test docstrings now describe the actual contract. The consume-once defense itself is unchanged. + +2. **Stale `transaction_id`-only retry wording** — `GuardedPayment.__doc__` and the README lifecycle table still implied the retry/lock key is always `transaction_id`. Generalized to "consume-once key (`open_mandate_hash` when present, otherwise `transaction_id`)". + +3. **New regression test for IDEMPOTENCY_MISMATCH on second reserve** — `test_open_mandate_divergent_payload_reserve_mismatch_raises_denied` exercises the second half of the consume-once contract: same open mandate hash, distinct transaction_id, first reserve allows, second reserve gets 409 from server → `AP2GuardDenied(reason_code="IDEMPOTENCY_MISMATCH")`. Documents the actual server semantics for divergent payloads sharing the open-mandate scope. + +4. **Reject empty `open_mandate_hash` / `checkout_hash`** — pydantic `min_length=1` on both fields. Previously an empty string would pass model validation, then `consume_once_input`'s truthy check would silently fall back to the `tx` scope — data corruption disguised as the default. Now fails fast at model construction. Two new tests cover the rejection paths. + +**Test posture after fixes:** +- 103 tests (up from 100), 98.35% coverage. + +No public API additions in this round. + ## 2026-05-13 — positioning-review fixes (P0-A consume-once, P0-B commit-uncertain) **Author:** strategic/positioning review response (still v0.1.0; not yet on PyPI) diff --git a/CHANGELOG.md b/CHANGELOG.md index 66f0bc6..8b7aac2 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -21,7 +21,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Post-conversion int64 boundary check (`micros <= 2**63 - 1`) rejects amounts one over int64.max client-side instead of relying on the server. Applied symmetrically to `AP2Mandate.amount_micros()` and `GuardedPayment.set_actual_micros()` so a caller-supplied commit override cannot bypass the cap. - `set_actual_micros()` raises `AP2MandateError` (a `ValueError` subclass) for both negative and over-int64 inputs, plus any non-int type (notably `bool` and `float`, which numerical comparisons would otherwise let slip through). - Direct callers of `mapping.build_commit_body(actual_micros=...)` get the same input validation via a shared private validator. -- 100 tests, ≥ 95% coverage, ruff + mypy strict. +- `AP2Mandate.checkout_hash` and `open_mandate_hash` require `min_length=1` when present — empty strings would otherwise silently fall back to the transaction-id lock scope. +- 103 tests, ≥ 95% coverage, ruff + mypy strict. ### Planned for v0.2 - `AsyncGuardedPayment` (asyncio). diff --git a/README.md b/README.md index 176b44c..24a8761 100644 --- a/README.md +++ b/README.md @@ -32,7 +32,7 @@ From the AP2 spec, human-not-present flows let the agent act autonomously using That is a **runtime-state** problem: concurrency, retries, in-flight attempts, quota counters, consume-once. AP2 mandates are cryptographic *authorization*. Cycles adds the missing runtime enforcement. -When an `AP2Mandate` carries an `open_mandate_hash`, this package keys the consume-once lock on the open mandate (not the transaction id) — so every checkout derived from the same open mandate collapses onto a single reservation server-side, even when their transaction ids differ. See [Deterministic idempotency keys](#deterministic-idempotency-keys) below. +When an `AP2Mandate` carries an `open_mandate_hash`, this package keys the consume-once lock on the open mandate (not the transaction id) — so every checkout derived from the same open mandate shares one idempotency bucket, even when their transaction ids differ. Identical replays return the original reservation; divergent attempts hit `IDEMPOTENCY_MISMATCH` server-side. Either way the second attempt cannot create a second valid reservation. See [Deterministic idempotency keys](#deterministic-idempotency-keys) below. ## What this does NOT do @@ -97,7 +97,7 @@ Required upstream attributes (duck-typed): `payment_mandate.transaction_id`, `pa | Scenario | Outcome | Detail | |---|---|---| -| `Decision.ALLOW`, body completes | **Commit** | Server idempotency key derived from `transaction_id` — see *Deterministic idempotency keys* below | +| `Decision.ALLOW`, body completes | **Commit** | Server idempotency key derived from the consume-once scope (`open_mandate_hash` when present, otherwise `transaction_id`) — see *Deterministic idempotency keys* below | | `Decision.ALLOW`, body raises | **Release** | Reason `ap2_guard_failed:{ExcType}`, idempotency key includes the exception type | | `Decision.DENY` | **Neither** | `AP2GuardDenied` raised in `__enter__`; real money never moves | | HTTP / transport error on reserve | **Neither** | `AP2GuardDenied` raised; caller can retry — same `transaction_id` ⇒ same reserve key | @@ -131,9 +131,16 @@ The wrapper computes idempotency keys from the mandate; callers MUST NOT pass th | Mandate carries… | Key shape | Lock boundary | |---|---|---| -| `open_mandate_hash` (human-not-present) | `ap2:open_mandate:{sha256(open_mandate_hash)[:32]}:{phase}[:{suffix}]` | every checkout derived from one open mandate collapses onto a single reservation | +| `open_mandate_hash` (human-not-present) | `ap2:open_mandate:{sha256(open_mandate_hash)[:32]}:{phase}[:{suffix}]` | every checkout derived from one open mandate uses the same reserve idempotency key | | only `transaction_id` (default / human-present) | `ap2:tx:{sha256(transaction_id)[:32]}:{phase}[:{suffix}]` | one `transaction_id` == one payment attempt | +**What sharing a key actually gets you**, per Cycles idempotency semantics: + +- Same key + **identical payload** → server replays the original response (same `reservation_id`). +- Same key + **divergent payload** (different `transaction_id`, `checkout_hash`, amount, etc.) → server rejects with `409 IDEMPOTENCY_MISMATCH`, surfaced as `AP2GuardDenied(reason_code="IDEMPOTENCY_MISMATCH")`. + +Either way, the second attempt cannot create a second valid reservation — that's the consume-once defense. Multiple distinct checkouts from one open mandate are forced into the same idempotency bucket, so the server sees the conflict. + The scope namespace (`open_mandate` or `tx`) is embedded in the key so the two buckets never collide server-side. The hash is fixed-length (SHA-256 truncated to 32 hex chars, 128-bit collision resistance), header-safe, and the phase suffix (`reserve` / `commit` / `release:{ExcType}`) is always preserved. Raw `transaction_id` and `open_mandate_hash` stay on `Subject.dimensions` for debug/audit; only the idempotency key uses the hash. diff --git a/runcycles_ap2/guard.py b/runcycles_ap2/guard.py index 1f8736f..ce0b883 100644 --- a/runcycles_ap2/guard.py +++ b/runcycles_ap2/guard.py @@ -41,8 +41,11 @@ class GuardedPayment: - Commit rejected with unrecognized code → raises :class:`AP2GuardCommitFailed` after attempting a release; check the exception's ``released`` / ``release_error`` attributes to know whether budget was recovered. - - Same ``transaction_id`` on a retry → server returns the original reservation - (idempotent replay) because the idempotency key is deterministic. + - Same consume-once key on a retry (``open_mandate_hash`` when present, + otherwise ``transaction_id``) → both attempts hit the same Cycles idempotency + bucket. Same payload replays the original reservation; divergent payload is + rejected by the server with ``IDEMPOTENCY_MISMATCH`` (surfaced as + :class:`AP2GuardDenied`). """ def __init__( diff --git a/runcycles_ap2/models.py b/runcycles_ap2/models.py index 2267dfd..6b3c1c1 100644 --- a/runcycles_ap2/models.py +++ b/runcycles_ap2/models.py @@ -43,8 +43,13 @@ class AP2Mandate(BaseModel): amount_value: Annotated[str, Field(min_length=1, max_length=64)] currency: Annotated[str, Field(min_length=3, max_length=3)] payee_website: Annotated[str, Field(min_length=1, max_length=253)] - checkout_hash: Annotated[str, Field(max_length=256)] | None = None - open_mandate_hash: Annotated[str, Field(max_length=256)] | None = None + # ``checkout_hash`` and ``open_mandate_hash`` MUST be non-empty when present. + # An empty string would silently fall back to the transaction-id lock scope (the + # falsy check in :func:`consume_once_input`), which is data corruption disguised + # as the default. Pydantic enforces min_length=1; callers should pass ``None`` to + # mean "not present". + checkout_hash: Annotated[str, Field(min_length=1, max_length=256)] | None = None + open_mandate_hash: Annotated[str, Field(min_length=1, max_length=256)] | None = None def amount_micros(self) -> int: """Convert ``amount_value`` (decimal string in major units) to USD micro-cents. diff --git a/tests/test_double_spend_same_transaction.py b/tests/test_double_spend_same_transaction.py index 7c92e5c..4698948 100644 --- a/tests/test_double_spend_same_transaction.py +++ b/tests/test_double_spend_same_transaction.py @@ -2,8 +2,9 @@ The point is not to simulate Cycles' server-side replay logic in-process — that lives on the server. The point is to prove the wire payload is deterministic, so the server -*can* dedupe. If two workers compute the same reserve key from the same mandate, the -server's `(tenant, endpoint, idempotency_key)` dedup will collapse them onto one +sees both attempts under the same idempotency bucket. Per Cycles semantics: same key ++ same payload returns the original response; same key + different payload returns +409 IDEMPOTENCY_MISMATCH. Either way the second attempt cannot create a second valid reservation — that is the consume-once defense. """ @@ -32,7 +33,8 @@ def test_two_guards_same_transaction_send_same_reserve_key(self, mock_client) -> expected_reserve = idempotency_key(make_mandate(transaction_id="ap2-tx-shared"), "reserve") expected_commit = idempotency_key(make_mandate(transaction_id="ap2-tx-shared"), "commit") assert body1["idempotency_key"] == body2["idempotency_key"] == expected_reserve - # Commit keys also collide — the server collapses both onto one reservation. + # Commit keys also match — the server sees both under one bucket and either + # replays the original commit response or returns IDEMPOTENCY_MISMATCH. commit1 = mock_client.commit_reservation.call_args_list[0][0][1] commit2 = mock_client.commit_reservation.call_args_list[1][0][1] assert commit1["idempotency_key"] == commit2["idempotency_key"] == expected_commit @@ -50,12 +52,53 @@ def test_different_transactions_produce_different_keys(self, mock_client) -> Non body2 = mock_client.create_reservation.call_args_list[1][0][0] assert body1["idempotency_key"] != body2["idempotency_key"] - def test_open_mandate_overuse_collapses_distinct_transactions_onto_one_key(self, mock_client) -> None: + def test_open_mandate_divergent_payload_reserve_mismatch_raises_denied(self, mock_client) -> None: + # The "or 409 IDEMPOTENCY_MISMATCH" half of the consume-once contract. + # First reserve succeeds. Second guard, same open_mandate_hash but different + # transaction_id, will hit the same idempotency bucket — and because the + # request payload differs, the server rejects with IDEMPOTENCY_MISMATCH. + # Our guard surfaces that as AP2GuardDenied with reason_code propagated. + from runcycles.response import CyclesResponse + + from runcycles_ap2 import AP2GuardDenied + from tests.conftest import commit_success_response + + first_allow = allow_response("rsv_omh_first") + mismatch_409 = CyclesResponse.http_error( + 409, + error_message="IDEMPOTENCY_MISMATCH", + body={"error": "IDEMPOTENCY_MISMATCH", "message": "payload differs", "request_id": "req_omh"}, + ) + mock_client.create_reservation.side_effect = [first_allow, mismatch_409] + mock_client.commit_reservation.return_value = commit_success_response() + + m1 = make_mandate(transaction_id="ap2-tx-FIRST", open_mandate_hash="omh_shared") + m2 = make_mandate(transaction_id="ap2-tx-SECOND", open_mandate_hash="omh_shared") + + # First attempt: normal commit path. + with cycles_guard_payment(mock_client, mandate=m1, run_id="r1", tenant="acme") as g1: + assert g1.reservation_id == "rsv_omh_first" + + # Second attempt: server sees same key + divergent payload → 409 → denied. + import pytest + + with pytest.raises(AP2GuardDenied) as ei: + with cycles_guard_payment(mock_client, mandate=m2, run_id="r2", tenant="acme"): + # __enter__ raises before this body runs — real money never moves. + raise AssertionError("body must not execute") + + assert ei.value.reason_code == "IDEMPOTENCY_MISMATCH" + assert ei.value.request_id == "req_omh" + # No commit/release on the second attempt — we never created a reservation. + assert mock_client.commit_reservation.call_count == 1 # only the first guard + + def test_open_mandate_overuse_shares_idempotency_bucket_across_transactions(self, mock_client) -> None: # P0-A: this is the AP2 spec consume-once defense (specification §6). Two # distinct transaction_ids spawned from one open mandate must share the same - # reserve key, so the server's (tenant, endpoint, idempotency_key) dedup - # collapses them onto a single reservation. Without this, an autonomous agent - # could authorize multiple checkouts using one open mandate. + # reserve key, so the server's (tenant, endpoint, idempotency_key) dedup sees + # both attempts in one bucket. Identical payloads replay; divergent payloads + # hit IDEMPOTENCY_MISMATCH (see test_open_mandate_divergent_payload_rejected + # in test_mapping.py). Either way, no second valid reservation. mock_client.create_reservation.return_value = allow_response("rsv_omh_shared") mock_client.commit_reservation.return_value = commit_success_response() @@ -69,7 +112,7 @@ def test_open_mandate_overuse_collapses_distinct_transactions_onto_one_key(self, body1 = mock_client.create_reservation.call_args_list[0][0][0] body2 = mock_client.create_reservation.call_args_list[1][0][0] - # Same reserve key despite different transaction_ids — that's the lock. + # Same reserve key despite different transaction_ids — that's the bucket. assert body1["idempotency_key"] == body2["idempotency_key"] assert ":open_mandate:" in body1["idempotency_key"] # Transaction ids stay distinct in dimensions for audit. diff --git a/tests/test_mapping.py b/tests/test_mapping.py index 3aef924..3e5e4de 100644 --- a/tests/test_mapping.py +++ b/tests/test_mapping.py @@ -106,7 +106,9 @@ def test_open_mandate_scope_used_when_hash_present(self) -> None: def test_different_transactions_same_open_mandate_share_key(self) -> None: # Two checkouts spawned from one open mandate must collide on the reserve key - # so server-side dedup collapses them onto a single reservation. + # so the server sees both attempts in one (tenant, endpoint, key) bucket. The + # server then either replays the original response (if payload matches) or + # returns 409 IDEMPOTENCY_MISMATCH — either way, no second valid reservation. a = make_mandate(transaction_id="ap2-tx-AAA", open_mandate_hash="omh_shared") b = make_mandate(transaction_id="ap2-tx-BBB", open_mandate_hash="omh_shared") assert idempotency_key(a, "reserve") == idempotency_key(b, "reserve") @@ -126,6 +128,32 @@ def test_open_mandate_scope_takes_precedence_over_transaction_id(self) -> None: assert "open_mandate" in idempotency_key(m, "reserve") assert _expected_hash("ap2-tx-001") not in idempotency_key(m, "reserve") + def test_empty_open_mandate_hash_rejected_at_model_level(self) -> None: + # An empty string would otherwise silently fall back to the tx scope (the + # falsy check in consume_once_input). Model-level min_length=1 catches it. + from pydantic import ValidationError + + with pytest.raises(ValidationError): + AP2Mandate( + transaction_id="tx", + amount_value="1.00", + currency="USD", + payee_website="x.example", + open_mandate_hash="", + ) + + def test_empty_checkout_hash_rejected_at_model_level(self) -> None: + from pydantic import ValidationError + + with pytest.raises(ValidationError): + AP2Mandate( + transaction_id="tx", + amount_value="1.00", + currency="USD", + payee_website="x.example", + checkout_hash="", + ) + class TestBuildSubject: def test_minimal(self) -> None: From 4642a8d5b9dff9ac8839cddbd6f051c06062961c Mon Sep 17 00:00:00 2001 From: Albert Mavashev Date: Wed, 13 May 2026 10:51:17 -0400 Subject: [PATCH 3/6] fix: route post-PSP commit failures through CommitUncertain (P0 + P1) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit P0 — transport errors and 5xx on commit are uncertain, not failed ================================================================= Previously a transport-level failure or 5xx response on commit fell through to the "unrecognized rejection" branch: _handle_release was called and AP2GuardCommitFailed raised with "reservation released" in the message regardless of outcome. Wrong: the commit POST may have reached Cycles and mutated state before the failure, so auto-release risks undoing a successful settle. New branch ordering in _handle_commit: 1. success 2. transport error / 5xx → AP2GuardCommitUncertain, NO release 3. RESERVATION_FINALIZED / → AP2GuardCommitUncertain, NO release RESERVATION_EXPIRED / (existing behavior) IDEMPOTENCY_MISMATCH 4. 4xx with unrecognized code → release + AP2GuardCommitFailed (the server explicitly rejected, release is safe) Synthetic error_code values for the new branches: - TRANSPORT_ERROR — CyclesResponse.transport_error / status=-1 - SERVER_ERROR — 5xx without a parseable error code in body - specific code propagates when present (e.g. INTERNAL_ERROR) P1 — bare exception during commit POST now wraps as uncertain ============================================================= The `try / except Exception / raise` re-raised the raw exception, losing the reservation_id and bypassing the reconciliation contract. Wrapped as: raise AP2GuardCommitUncertain( error_code="COMMIT_RAISED", reservation_id=..., ) from exc The original exception is preserved via __cause__. Four new regression tests cover: - 5xx with body → uncertain, specific code, no release - 5xx without body → uncertain, synthetic SERVER_ERROR, no release - transport error → uncertain, TRANSPORT_ERROR, no release - commit raises → uncertain wrapping original, COMMIT_RAISED, chained __cause__, no release Existing test_commit_failed_records_release_non_success still passes — it's about the release POST returning 5xx (not the commit POST), which remains a release-and-raise flow. AP2GuardCommitUncertain.__doc__ updated to enumerate all the new conditions and the error_code values callers can branch on. README lifecycle + exception tables updated. Test posture: 107 tests (up from 103), 99.18% coverage. ruff + mypy strict clean. `python -m build` produces sdist + wheel cleanly. No public API additions — only an expanded contract on the existing AP2GuardCommitUncertain. AUDIT.md and CHANGELOG.md updated. --- AUDIT.md | 22 +++++++++++ CHANGELOG.md | 4 +- README.md | 6 +-- runcycles_ap2/exceptions.py | 28 +++++++++----- runcycles_ap2/guard.py | 48 +++++++++++++++++++++--- tests/test_idempotency.py | 75 ++++++++++++++++++++++++++++++++++++- 6 files changed, 163 insertions(+), 20 deletions(-) diff --git a/AUDIT.md b/AUDIT.md index 381f612..857815d 100644 --- a/AUDIT.md +++ b/AUDIT.md @@ -2,6 +2,28 @@ Per `CLAUDE.md`: this file records material changes to the repo (server, admin, client). For a client package, that means public API, on-the-wire request shape, and protocol-conformance posture. +## 2026-05-13 — positioning-review round 3 (P0 transport/5xx routing + P1 bare-exception wrap) + +**Author:** strategic/positioning review round 3 (PR #2 still in-flight) +**Scope:** completing the commit-uncertain contract for all post-PSP failure modes + +1. **[P0] Transport errors and 5xx responses on commit are now uncertain (no release)** — previously they fell into the "unrecognized rejection" branch, which called `_handle_release` and raised `AP2GuardCommitFailed` with the message "reservation released" regardless. Wrong: the commit POST may have reached Cycles and mutated state before the failure was observed, so auto-release risks undoing a successful settle. New branch ordering in `_handle_commit`: success → transport/5xx → terminal codes → 4xx unrecognized. The first three all raise `AP2GuardCommitUncertain` with no release; only 4xx-with-unrecognized-code still releases. Synthetic `error_code` values: `TRANSPORT_ERROR` for transport-level failures, `SERVER_ERROR` for 5xx without a parseable code (specific codes propagate when present). + +2. **[P1] Bare exceptions during commit POST now surface as commit-uncertain** — the `try / except Exception / raise` block re-raised the raw exception, losing the `reservation_id` and bypassing the reconciliation contract. Now wrapped as `AP2GuardCommitUncertain(error_code="COMMIT_RAISED", reservation_id=...) from exc` so the caller still gets the standard fields and the original exception is preserved via `__cause__`. + +3. **Docstring updated** — `AP2GuardCommitUncertain.__doc__` enumerates all the new conditions and how to distinguish them via `error_code`. + +**Test posture after fixes:** +- 107 tests (up from 103), 99.18% coverage. + +**Four new regression tests:** +- `test_commit_5xx_raises_uncertain_no_release` +- `test_commit_5xx_without_body_uses_synthetic_code` +- `test_commit_transport_error_raises_uncertain_no_release` +- `test_commit_raises_exception_surfaces_as_uncertain` + +No public API additions; only behavior shift inside `_handle_commit` and an expanded contract on the existing `AP2GuardCommitUncertain`. + ## 2026-05-13 — positioning-review follow-ups (wording accuracy + empty-hash guard) **Author:** strategic/positioning review round 2 (PR #2 in-flight) diff --git a/CHANGELOG.md b/CHANGELOG.md index 8b7aac2..9288443 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -14,7 +14,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Deterministic idempotency keys with automatic consume-once scope selection — `ap2:open_mandate:{sha256(open_mandate_hash)[:32]}:{phase}` when the mandate carries an open mandate hash (AP2 spec §6 normative consume-once defense for human-not-present flows), `ap2:tx:{sha256(transaction_id)[:32]}:{phase}` otherwise. Hash is fixed-length (128-bit collision resistance), header-safe, and the phase suffix is always preserved. - `AP2DryRunResult` exception — raised from `__enter__` when `dry_run=True` so the `with` body cannot execute (prevents real PSP calls under dry-run from moving money off the books). - `AP2GuardCommitFailed` exception — raised after a release attempt when the server rejects a commit with an unrecognized code; carries `released: bool` and `release_error: str | None` so the caller can distinguish "budget recovered" from "budget stranded until TTL". -- `AP2GuardCommitUncertain` exception — raised on terminal commit responses (`RESERVATION_FINALIZED`, `RESERVATION_EXPIRED`, `IDEMPOTENCY_MISMATCH`) so post-PSP reconciliation events cannot be silently missed. No auto-release (a prior commit may already have settled). +- `AP2GuardCommitUncertain` exception — raised whenever the commit outcome is unknown after the PSP body ran: terminal codes (`RESERVATION_FINALIZED` / `RESERVATION_EXPIRED` / `IDEMPOTENCY_MISMATCH`), transport-level failures (`error_code="TRANSPORT_ERROR"`), 5xx server errors (`SERVER_ERROR` or specific code), and uncaught exceptions during commit (`COMMIT_RAISED`, original chained via `__cause__`). No auto-release in any of these cases — the POST may have mutated Cycles before the failure. - USD-only enforcement; non-USD raises `AP2CurrencyError`. Rejects NaN, +/-Infinity, and amounts with more than 8 decimal places (sub-micro precision); wraps all `decimal` failures as `AP2MandateError`. - Exact integer-tuple conversion in `amount_micros()` — does not depend on the default decimal context, so large valid inputs convert exactly instead of being silently rounded. - Pre-allocation digit-count cap (≤ 19 integer digits, the int64 USD_MICROCENTS ceiling) blocks exponent-notation DoS like `Decimal("1E+1000000000000")` that would otherwise hang allocating a trillion-digit scaling factor. @@ -22,7 +22,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `set_actual_micros()` raises `AP2MandateError` (a `ValueError` subclass) for both negative and over-int64 inputs, plus any non-int type (notably `bool` and `float`, which numerical comparisons would otherwise let slip through). - Direct callers of `mapping.build_commit_body(actual_micros=...)` get the same input validation via a shared private validator. - `AP2Mandate.checkout_hash` and `open_mandate_hash` require `min_length=1` when present — empty strings would otherwise silently fall back to the transaction-id lock scope. -- 103 tests, ≥ 95% coverage, ruff + mypy strict. +- 107 tests, ≥ 95% coverage, ruff + mypy strict. ### Planned for v0.2 - `AsyncGuardedPayment` (asyncio). diff --git a/README.md b/README.md index 24a8761..36a69d4 100644 --- a/README.md +++ b/README.md @@ -101,8 +101,8 @@ Required upstream attributes (duck-typed): `payment_mandate.transaction_id`, `pa | `Decision.ALLOW`, body raises | **Release** | Reason `ap2_guard_failed:{ExcType}`, idempotency key includes the exception type | | `Decision.DENY` | **Neither** | `AP2GuardDenied` raised in `__enter__`; real money never moves | | HTTP / transport error on reserve | **Neither** | `AP2GuardDenied` raised; caller can retry — same `transaction_id` ⇒ same reserve key | -| Commit returns `RESERVATION_FINALIZED` / `RESERVATION_EXPIRED` / `IDEMPOTENCY_MISMATCH` | **Raise, no release** | `AP2GuardCommitUncertain` raised so the caller cannot silently miss the reconciliation event; we don't auto-release (a prior commit may already have settled) | -| Commit returns other 4xx | **Release + raise** | Reservation release attempted; `AP2GuardCommitFailed` raised with `released` + `release_error` so the caller cannot miss the reconciliation event | +| Commit transport error / 5xx / `RESERVATION_FINALIZED` / `RESERVATION_EXPIRED` / `IDEMPOTENCY_MISMATCH` / uncaught exception | **Raise, no release** | `AP2GuardCommitUncertain` raised. The commit POST may have reached and mutated Cycles before the failure, so auto-release could undo a successful settle. `error_code` distinguishes the flavor (`TRANSPORT_ERROR`, `SERVER_ERROR`, `COMMIT_RAISED`, or the specific code) | +| Commit returns 4xx with unrecognized code | **Release + raise** | Server explicitly rejected the request (malformed, forbidden, etc.) — release is safe. `AP2GuardCommitFailed` raised with `released` + `release_error` so the caller can still see the reconciliation context | | `guard.abort(reason)` called inside `with` | **Release** | Reason `ap2_guard_aborted:{reason}` | | `dry_run=True` | **Neither** | `__enter__` raises `AP2DryRunResult` carrying the decision payload — the `with` body never runs, so a real PSP call cannot leak under a dry-run probe | @@ -197,7 +197,7 @@ Exception hierarchy: | `AP2GuardError` | Base for all AP2-guard errors | | `AP2GuardDenied` | Cycles returned `DENY` or the reserve POST failed | | `AP2DryRunResult` | Raised from `__enter__` when `dry_run=True` — carries the decision payload; the `with` body never executes | -| `AP2GuardCommitUncertain` | Commit returned a terminal status (`RESERVATION_FINALIZED`, `RESERVATION_EXPIRED`, or `IDEMPOTENCY_MISMATCH`) after the body ran. No auto-release (a prior commit may already have settled). `RESERVATION_EXPIRED` is the worst case: PSP may have charged while Cycles reclaimed the budget on TTL. Reconcile with PSP | +| `AP2GuardCommitUncertain` | Commit outcome is unknown after the body ran. Covers terminal status codes (`RESERVATION_FINALIZED`, `RESERVATION_EXPIRED`, `IDEMPOTENCY_MISMATCH`), transport-level failures (`error_code="TRANSPORT_ERROR"`), 5xx server errors (`error_code="SERVER_ERROR"` or specific code), and uncaught exceptions during commit (`error_code="COMMIT_RAISED"`, with the original chained via `__cause__`). **No auto-release** — the POST may have mutated Cycles before the failure. Reconcile with PSP | | `AP2GuardCommitFailed` | Commit was rejected with an unrecognized code after the body ran. Check `.released` (bool) and `.release_error` (string \| None) on the exception — `released=False` means budget is stranded until TTL; reconcile with PSP either way | | `AP2CurrencyError` | Non-USD mandate in v0.1 (subclass of `ValueError`) | | `AP2MandateError` | Adapter input is malformed — NaN, infinity, sub-micro precision, missing payee, etc. (subclass of `ValueError`) | diff --git a/runcycles_ap2/exceptions.py b/runcycles_ap2/exceptions.py index eb2c712..0ff83e1 100644 --- a/runcycles_ap2/exceptions.py +++ b/runcycles_ap2/exceptions.py @@ -53,21 +53,31 @@ def __init__( class AP2GuardCommitUncertain(AP2GuardError): - """Cycles returned a terminal status for this reservation on commit. + """The commit POST ran after the PSP body but the outcome is unknown. + + Raised when ANY of these conditions hits the commit phase — in every case the + server may have mutated state before the failure was observed, so auto-release + is unsafe (it could undo a successful settle): - Raised when the server replies with one of: - ``RESERVATION_FINALIZED`` — a prior attempt already finalized the reservation; usually a benign replay, but we can't verify "matching actuals" client-side. - ``RESERVATION_EXPIRED`` — the TTL elapsed before we could commit. The server - reclaimed the budget on the cycles side, but the PSP body has already run, so - the payment may have moved without a Cycles settlement. + reclaimed the budget, but the PSP body already ran, so the payment may have + moved without a Cycles settlement. - ``IDEMPOTENCY_MISMATCH`` — a prior commit under this idempotency key carried a different payload (often: different actuals across retries). - - After the PSP body has executed, any of these is a reconciliation event. We do - NOT auto-release for them (a previous successful commit would be undone). The - caller MUST handle this exception — silently returning would let unreconciled - payment state propagate. + - **Transport error** (``error_code="TRANSPORT_ERROR"``) — the commit POST + never got a response (connection lost, timeout, DNS failure, etc.). Cycles + may have received and applied the commit before the wire died. + - **5xx server error** (``error_code="SERVER_ERROR"`` or the specific code) — + the server failed *after* possibly mutating state. + - **Uncaught exception during commit** (``error_code="COMMIT_RAISED"``) — the + client code raised before a response was processed; the chained ``__cause__`` + is the original exception. May or may not have reached the server. + + The caller MUST handle this exception — silently returning would let + unreconciled payment state propagate. Use ``error_code`` to distinguish the + flavors if you want different reconciliation paths. """ def __init__( diff --git a/runcycles_ap2/guard.py b/runcycles_ap2/guard.py index ce0b883..384fc91 100644 --- a/runcycles_ap2/guard.py +++ b/runcycles_ap2/guard.py @@ -236,9 +236,20 @@ def _handle_commit(self) -> None: ) try: response = self._client.commit_reservation(self._reservation_id, body) - except Exception: + except Exception as exc: + # Post-PSP exception during the commit POST itself. The request may have + # reached Cycles and mutated state before the exception was raised, so the + # commit outcome is unknown. NEVER auto-release — that would risk undoing + # a successful settle. Surface as commit-uncertain so the caller cannot + # miss the reconciliation event; chain the original via `from exc`. logger.exception("AP2 commit raised: tx=%s", self._mandate.transaction_id) - raise + raise AP2GuardCommitUncertain( + f"AP2 commit raised before a response was received for transaction " + f"{self._mandate.transaction_id}: {type(exc).__name__}: {exc}. " + "Commit outcome is unknown; no release was attempted.", + error_code="COMMIT_RAISED", + reservation_id=self._reservation_id, + ) from exc if response.is_success: self._committed = True @@ -254,6 +265,31 @@ def _handle_commit(self) -> None: error = response.get_error_response() error_code = error.error_code.value if (error and error.error_code) else None request_id = error.request_id if error else None + + # Transport-level or server-side (5xx) failure. The commit POST may have + # reached the server and mutated state before the response was lost or the + # server errored out. We CANNOT tell whether the commit succeeded, so we + # MUST NOT auto-release — that risks undoing a successful settle. Same + # uncertain semantics as the terminal-status codes below. + if response.is_transport_error or response.is_server_error: + synthetic_code = error_code or ("TRANSPORT_ERROR" if response.is_transport_error else "SERVER_ERROR") + logger.warning( + "AP2 commit %s failure (raising uncertain): id=%s, tx=%s, status=%d, code=%s", + "transport" if response.is_transport_error else "server", + self._reservation_id, + self._mandate.transaction_id, + response.status, + synthetic_code, + ) + raise AP2GuardCommitUncertain( + f"AP2 commit failed at the transport/server layer for transaction " + f"{self._mandate.transaction_id} (status={response.status}, code={synthetic_code}). " + "Commit may have reached Cycles before the failure; no release was attempted.", + error_code=synthetic_code, + request_id=request_id, + reservation_id=self._reservation_id, + ) + if error_code in ("RESERVATION_FINALIZED", "RESERVATION_EXPIRED", "IDEMPOTENCY_MISMATCH"): # The reservation is in a terminal state on the server. We do NOT auto-release # (that would undo a prior successful commit), but we also do NOT return @@ -275,9 +311,11 @@ def _handle_commit(self) -> None: reservation_id=self._reservation_id, ) - # Unrecognized commit rejection. The PSP may already have moved money, so we - # release the budget and raise — the caller MUST reconcile, and a silent - # `guard.committed == False` was too easy to miss. + # 4xx with an unrecognized error code: the server explicitly rejected the + # commit request itself (malformed, forbidden, etc.). Releasing is safe here + # — the server saw the request and refused, so no mutation occurred on its + # side. The PSP may still have moved money, hence we raise AP2GuardCommitFailed + # for explicit reconciliation. logger.warning( "AP2 commit rejected (releasing): id=%s, tx=%s, code=%s", self._reservation_id, diff --git a/tests/test_idempotency.py b/tests/test_idempotency.py index 283458e..88721b0 100644 --- a/tests/test_idempotency.py +++ b/tests/test_idempotency.py @@ -110,7 +110,9 @@ def test_commit_failed_records_release_failure(self, mock_client, mandate) -> No assert "FAILED" in str(ei.value) and "stranded" in str(ei.value) def test_commit_failed_records_release_non_success(self, mock_client, mandate) -> None: - # Same as above but the release POST returned a 5xx — server didn't release. + # Same as test_unrecognized_commit_error_releases_and_raises but the RELEASE + # POST (not the commit) returned a 5xx — server didn't release. The commit + # itself is still a 4xx unrecognized rejection (release-and-raise path). from runcycles_ap2 import AP2GuardCommitFailed mock_client.create_reservation.return_value = allow_response() @@ -124,3 +126,74 @@ def test_commit_failed_records_release_non_success(self, mock_client, mandate) - assert ei.value.released is False assert ei.value.release_error is not None assert "500" in ei.value.release_error + + def test_commit_5xx_raises_uncertain_no_release(self, mock_client, mandate) -> None: + # P0 regression: a 5xx on the commit POST itself means the server may have + # applied the commit before erroring. Releasing risks undoing a successful + # settle. Must raise AP2GuardCommitUncertain WITHOUT calling release. + mock_client.create_reservation.return_value = allow_response() + mock_client.commit_reservation.return_value = commit_error_response("INTERNAL_ERROR", status=500) + + with pytest.raises(AP2GuardCommitUncertain) as ei: + with cycles_guard_payment(mock_client, mandate=mandate, run_id="r", tenant="acme") as _: + pass + + # Specific 5xx error_code propagated from response body. + assert ei.value.error_code == "INTERNAL_ERROR" + assert ei.value.reservation_id == "rsv_ap2_001" + mock_client.release_reservation.assert_not_called() + + def test_commit_5xx_without_body_uses_synthetic_code(self, mock_client, mandate) -> None: + # P0 regression: a 5xx with no parseable error body still raises uncertain; + # we synthesize error_code="SERVER_ERROR" so callers can branch on it. + from runcycles.response import CyclesResponse + + mock_client.create_reservation.return_value = allow_response() + mock_client.commit_reservation.return_value = CyclesResponse.http_error( + 502, + error_message="bad gateway", + body=None, + ) + + with pytest.raises(AP2GuardCommitUncertain) as ei: + with cycles_guard_payment(mock_client, mandate=mandate, run_id="r", tenant="acme") as _: + pass + + assert ei.value.error_code == "SERVER_ERROR" + mock_client.release_reservation.assert_not_called() + + def test_commit_transport_error_raises_uncertain_no_release(self, mock_client, mandate) -> None: + # P0 regression: a transport-level failure (CyclesResponse.transport_error, + # status=-1) means the POST never got a response. Cycles may have received and + # applied the commit before the wire died. Must raise uncertain, NO release. + from runcycles.response import CyclesResponse + + mock_client.create_reservation.return_value = allow_response() + mock_client.commit_reservation.return_value = CyclesResponse.transport_error(ConnectionError("network down")) + + with pytest.raises(AP2GuardCommitUncertain) as ei: + with cycles_guard_payment(mock_client, mandate=mandate, run_id="r", tenant="acme") as _: + pass + + assert ei.value.error_code == "TRANSPORT_ERROR" + assert ei.value.reservation_id == "rsv_ap2_001" + mock_client.release_reservation.assert_not_called() + + def test_commit_raises_exception_surfaces_as_uncertain(self, mock_client, mandate) -> None: + # P1 regression: if the client itself raises (e.g. a defect somewhere in the + # transport wrapper, or a programming error), the guard used to re-raise the + # raw exception — losing the reservation_id and bypassing the reconciliation + # contract. Now we wrap it as AP2GuardCommitUncertain with the original + # chained via __cause__ so the caller still gets the standard fields. + mock_client.create_reservation.return_value = allow_response() + mock_client.commit_reservation.side_effect = ConnectionError("low-level boom") + + with pytest.raises(AP2GuardCommitUncertain) as ei: + with cycles_guard_payment(mock_client, mandate=mandate, run_id="r", tenant="acme") as _: + pass + + assert ei.value.error_code == "COMMIT_RAISED" + assert ei.value.reservation_id == "rsv_ap2_001" + assert isinstance(ei.value.__cause__, ConnectionError) + assert "low-level boom" in str(ei.value.__cause__) + mock_client.release_reservation.assert_not_called() From 88f2524dfc01ada385cfaac1ecc22fc98ec3d2a4 Mon Sep 17 00:00:00 2001 From: Albert Mavashev Date: Wed, 13 May 2026 10:53:45 -0400 Subject: [PATCH 4/6] docs: fix three stale README lines pointing at transaction_id-only MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Three README spots still implied the consume-once boundary is always transaction_id and that distinct attempts "collapse onto one reservation". Both are wrong with the open-mandate scope and the real Cycles idempotency semantics (replay or IDEMPOTENCY_MISMATCH). 1. "How the guard responds" table — the HTTP/transport-error-on-reserve row said "same transaction_id ⇒ same reserve key". Generalized to "same consume-once scope (open_mandate_hash when present, otherwise transaction_id) ⇒ same reserve key". 2. AP2 mapping table — the transaction_id row said "also feeds idempotency keys" unconditionally. Now: "feeds the idempotency key only when open_mandate_hash is absent (otherwise the open mandate is the consume-once scope)". 3. Features bullet — "duplicate workers on the same mandate collapse onto one reservation server-side" replaced with the accurate "share one idempotency bucket server-side; identical replays return the original reservation, divergent attempts are rejected with IDEMPOTENCY_MISMATCH". Code unchanged. Gates still green: ruff + mypy strict + 107 tests. --- README.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index 36a69d4..67ab2fb 100644 --- a/README.md +++ b/README.md @@ -100,7 +100,7 @@ Required upstream attributes (duck-typed): `payment_mandate.transaction_id`, `pa | `Decision.ALLOW`, body completes | **Commit** | Server idempotency key derived from the consume-once scope (`open_mandate_hash` when present, otherwise `transaction_id`) — see *Deterministic idempotency keys* below | | `Decision.ALLOW`, body raises | **Release** | Reason `ap2_guard_failed:{ExcType}`, idempotency key includes the exception type | | `Decision.DENY` | **Neither** | `AP2GuardDenied` raised in `__enter__`; real money never moves | -| HTTP / transport error on reserve | **Neither** | `AP2GuardDenied` raised; caller can retry — same `transaction_id` ⇒ same reserve key | +| HTTP / transport error on reserve | **Neither** | `AP2GuardDenied` raised; caller can retry — same consume-once scope (`open_mandate_hash` when present, otherwise `transaction_id`) ⇒ same reserve key | | Commit transport error / 5xx / `RESERVATION_FINALIZED` / `RESERVATION_EXPIRED` / `IDEMPOTENCY_MISMATCH` / uncaught exception | **Raise, no release** | `AP2GuardCommitUncertain` raised. The commit POST may have reached and mutated Cycles before the failure, so auto-release could undo a successful settle. `error_code` distinguishes the flavor (`TRANSPORT_ERROR`, `SERVER_ERROR`, `COMMIT_RAISED`, or the specific code) | | Commit returns 4xx with unrecognized code | **Release + raise** | Server explicitly rejected the request (malformed, forbidden, etc.) — release is safe. `AP2GuardCommitFailed` raised with `released` + `release_error` so the caller can still see the reconciliation context | | `guard.abort(reason)` called inside `with` | **Release** | Reason `ap2_guard_aborted:{reason}` | @@ -112,7 +112,7 @@ Required upstream attributes (duck-typed): `payment_mandate.transaction_id`, `pa | AP2 source | Cycles destination | Notes | |---|---|---| -| `PaymentMandate.transaction_id` | `Subject.dimensions["ap2_transaction_id"]` | also feeds idempotency keys | +| `PaymentMandate.transaction_id` | `Subject.dimensions["ap2_transaction_id"]` | feeds the idempotency key only when `open_mandate_hash` is absent (otherwise the open mandate is the consume-once scope — see [Deterministic idempotency keys](#deterministic-idempotency-keys)) | | `PaymentMandate.payment_amount.value` | `Amount.amount` | Exact integer conversion to USD micro-cents (10⁻⁸ USD). Rejects NaN, ±Infinity, negative values, more than 8 decimal places, or amounts beyond int64 micro-cents | | `PaymentMandate.payment_amount.currency` | `Action.policy_keys.custom["currency"]` | MVP enforces `"USD"` | | `PaymentMandate.payee.website` | `Action.policy_keys.host` | required for policy routing | @@ -206,7 +206,7 @@ Exception hierarchy: - **One context manager** — `cycles_guard_payment` wraps a single AP2 payment moment in reserve → commit / release. - **Deterministic idempotency** — no caller-supplied keys; retries replay the same reservation. -- **Consume-once defense** — duplicate workers on the same mandate collapse onto one reservation server-side. +- **Consume-once defense** — duplicate workers on the same mandate share one idempotency bucket server-side; identical replays return the original reservation, divergent attempts are rejected with `IDEMPOTENCY_MISMATCH`. - **Built-in `payment.charge` action** — no custom action-kind registration, no protocol PR required. - **Adapter layer** (`AP2Mandate`) insulates from upstream AP2 SDK churn. - **Pydantic v2 models** with strict validation. From 00835b5201ed297b874c605234c8faf92504de5f Mon Sep 17 00:00:00 2001 From: Albert Mavashev Date: Wed, 13 May 2026 11:00:43 -0400 Subject: [PATCH 5/6] fix: ASCII-only suffix filter, preserve empty checkout_hash, refresh stale docstrings MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit P2 — Idempotency-key suffix accepted non-ASCII letters ====================================================== safe_suffix used `c.isalnum()`, but Python's str.isalnum is Unicode-aware: "É".isalnum() is True. A non-ASCII suffix (e.g. a localized exception class name like "Échec") would have reached the Idempotency-Key HTTP header. RFC 7230 requires ASCII tokens, so httpx would reject the request. Tightened the predicate to `(c.isascii() and c.isalnum()) or c in ("_", "-", ".")`. Regression test asserts the key is isascii() even for a non-ASCII suffix. P3 — from_ap2() masked empty checkout_hash as None ================================================== The `or` short-circuit at getattr(cm, "hash", None) or getattr(cm, "checkout_hash", None) silently dropped upstream `hash=""` to None, bypassing the model's new `min_length=1` rejection. Replaced with explicit None-check on the first attribute; fall back only when the first is genuinely missing. Empty strings now propagate to the model and get rejected at construction. Two regression tests: - hash="" → ValidationError at AP2Mandate(...) - alternate naming (checkout_hash attr only, no hash attr) still falls through correctly Kept payee.website / .identifier fallback unchanged — that one is intentional alternates-not-aliases behavior with a dedicated test. P3 — Stale docstrings refreshed =============================== 1. mapping.build_commit_body.__doc__: said "key derived from transaction_id" — generalized to the consume-once scope (open_mandate_hash when present, otherwise transaction_id), matching the actual implementation. 2. AP2GuardCommitFailed.__doc__: still described RESERVATION_FINALIZED / RESERVATION_EXPIRED / IDEMPOTENCY_MISMATCH as "benign replay" that this exception is NOT raised for. Misleading — those raise AP2GuardCommitUncertain (a different exception), not a quiet no-op. Rewrote to scope this exception to 4xx-explicit-rejection only and point readers at AP2GuardCommitUncertain for the unknown-outcome conditions. Test posture: 110 tests (up from 107), 99.19% coverage. ruff + mypy strict clean. `python -m build` produces sdist + wheel cleanly. No public API additions — only behavior tightening on existing paths and documentation sync. AUDIT.md and CHANGELOG.md updated. --- AUDIT.md | 18 +++++++++++++++++ CHANGELOG.md | 4 +++- README.md | 5 ++--- runcycles_ap2/exceptions.py | 18 +++++++++-------- runcycles_ap2/mapping.py | 16 +++++++++++---- runcycles_ap2/models.py | 9 ++++++++- tests/test_ap2_shape_adapter.py | 36 +++++++++++++++++++++++++++++++++ tests/test_mapping.py | 13 ++++++++++++ 8 files changed, 102 insertions(+), 17 deletions(-) diff --git a/AUDIT.md b/AUDIT.md index 857815d..7ca86aa 100644 --- a/AUDIT.md +++ b/AUDIT.md @@ -2,6 +2,24 @@ Per `CLAUDE.md`: this file records material changes to the repo (server, admin, client). For a client package, that means public API, on-the-wire request shape, and protocol-conformance posture. +## 2026-05-13 — positioning-review round 4 (ASCII suffix + empty-hash preservation + docstrings) + +**Author:** strategic/positioning review round 4 (PR #2 still in-flight) +**Scope:** correctness polish + docstring sync + +1. **[P2] ASCII-only filter in idempotency-key suffix** — `safe_suffix` used `c.isalnum()`, but Python's `str.isalnum` is Unicode-aware (e.g. `"É".isalnum() == True`). A non-ASCII exception class name (or any non-ASCII suffix input) would have reached the `Idempotency-Key` HTTP header — RFC 7230 requires ASCII tokens, so httpx would reject the request. **Fix:** tighten the predicate to `(c.isascii() and c.isalnum()) or c in ("_", "-", ".")`. Regression test asserts the key is `isascii()` even when given a French-style suffix `"Échec"`. + +2. **[P3] `AP2Mandate.from_ap2` preserved empty `checkout_hash`** — the `or` short-circuit at `getattr(..., "hash", None) or getattr(..., "checkout_hash", None)` masked upstream `hash=""` as `None`, bypassing the model's `min_length=1` rejection. **Fix:** explicit `None` check on the first alternative; fall back only when the first attr is genuinely absent. Two regression tests: empty `hash=""` now raises `ValidationError` at construction; alternate naming (`checkout_hash` attr with no `hash` attr) still falls through correctly. + +3. **[P3] Stale docstrings refreshed** — + - `mapping.build_commit_body.__doc__` now references the consume-once scope rather than `transaction_id` exclusively. + - `AP2GuardCommitFailed.__doc__` rewritten: it covers only 4xx-explicit-rejection now; the previous paragraph that claimed it excludes `RESERVATION_FINALIZED` / `RESERVATION_EXPIRED` / `IDEMPOTENCY_MISMATCH` as "benign replay" was misleading — those raise `AP2GuardCommitUncertain` (a different exception), not a quiet no-op. The new docstring points readers at `AP2GuardCommitUncertain` for all unknown-outcome conditions. + +**Test posture after fixes:** +- 110 tests (up from 107), 99.19% coverage. + +No public API additions; only behavior tightening on existing paths. + ## 2026-05-13 — positioning-review round 3 (P0 transport/5xx routing + P1 bare-exception wrap) **Author:** strategic/positioning review round 3 (PR #2 still in-flight) diff --git a/CHANGELOG.md b/CHANGELOG.md index 9288443..819f240 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -22,7 +22,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `set_actual_micros()` raises `AP2MandateError` (a `ValueError` subclass) for both negative and over-int64 inputs, plus any non-int type (notably `bool` and `float`, which numerical comparisons would otherwise let slip through). - Direct callers of `mapping.build_commit_body(actual_micros=...)` get the same input validation via a shared private validator. - `AP2Mandate.checkout_hash` and `open_mandate_hash` require `min_length=1` when present — empty strings would otherwise silently fall back to the transaction-id lock scope. -- 107 tests, ≥ 95% coverage, ruff + mypy strict. +- Idempotency-key suffix filter is ASCII-only (`isascii() and isalnum()` — defends against Unicode-aware `isalnum()` letting non-ASCII chars reach the HTTP header). +- `AP2Mandate.from_ap2()` preserves an empty upstream `checkout_hash` so the model's `min_length=1` constraint can reject it (was previously masked to `None` by a falsy-`or` short-circuit). +- 110 tests, ≥ 95% coverage, ruff + mypy strict. ### Planned for v0.2 - `AsyncGuardedPayment` (asyncio). diff --git a/README.md b/README.md index 67ab2fb..bc997c7 100644 --- a/README.md +++ b/README.md @@ -3,13 +3,12 @@ [![CI](https://github.com/runcycles/cycles-ap2-python/actions/workflows/ci.yml/badge.svg)](https://github.com/runcycles/cycles-ap2-python/actions) [![License](https://img.shields.io/badge/license-Apache%202.0-blue)](LICENSE) [![Coverage](https://img.shields.io/badge/coverage-98%25-brightgreen)](https://github.com/runcycles/cycles-ap2-python/actions) -[![OpenSSF Scorecard](https://api.scorecard.dev/projects/github.com/runcycles/cycles-ap2-python/badge)](https://scorecard.dev/viewer/?uri=github.com/runcycles/cycles-ap2-python) # Cycles AP2 Guard — Runtime authority for AP2 agent payments -**Cycles AP2 Guard adds runtime authority to [AP2](https://github.com/google-agentic-commerce/AP2) payment flows.** +**Cycles AP2 Guard adds runtime authority to [Google AP2](https://github.com/google-agentic-commerce/AP2) payment flows.** -> *AP2 proves that a payment mandate is valid.* +> *Google AP2 proves that a payment mandate is valid.* > *Cycles decides whether this agent, tenant, run, mandate, and merchant are still allowed to attempt the payment right now.* Use it to prevent: diff --git a/runcycles_ap2/exceptions.py b/runcycles_ap2/exceptions.py index 0ff83e1..1502372 100644 --- a/runcycles_ap2/exceptions.py +++ b/runcycles_ap2/exceptions.py @@ -95,12 +95,12 @@ def __init__( class AP2GuardCommitFailed(AP2GuardError): - """Cycles rejected the commit AFTER the body ran (PSP may already have charged). + """Cycles **explicitly rejected** the commit request itself AFTER the body ran. - The wrapper attempts to release the reservation to recover budget; whether that - release succeeded is reported via :attr:`released`. The caller MUST treat this as - a reconciliation event regardless: PSP state and Cycles' view of the budget can - be out of sync. + Used only for 4xx responses with an unrecognized error code (e.g. malformed + request, forbidden, etc.). The server saw the request and refused; releasing + the reservation is safe and is attempted before this exception is raised. The + PSP may still have moved money, so the caller MUST reconcile regardless. - ``released = True``: budget was returned, but PSP-side capture may still need reconciliation against your books. @@ -108,9 +108,11 @@ class AP2GuardCommitFailed(AP2GuardError): (server-side rollback) AND PSP-side capture may need reconciliation. This is the worst case — escalate. - This exception is NOT raised for ``RESERVATION_FINALIZED`` / ``RESERVATION_EXPIRED`` - / ``IDEMPOTENCY_MISMATCH`` — those indicate a prior attempt already finalized the - reservation and the current call is a benign replay. + This is NOT the right exception for unknown-outcome failures. Anything where the + commit POST might have reached and mutated Cycles before the failure — transport + errors, 5xx, terminal reservation statuses (``RESERVATION_FINALIZED`` / + ``RESERVATION_EXPIRED`` / ``IDEMPOTENCY_MISMATCH``), and uncaught exceptions — + is raised as :class:`AP2GuardCommitUncertain` with **no auto-release**. """ def __init__( diff --git a/runcycles_ap2/mapping.py b/runcycles_ap2/mapping.py index 39c8030..d451828 100644 --- a/runcycles_ap2/mapping.py +++ b/runcycles_ap2/mapping.py @@ -79,9 +79,13 @@ def idempotency_key(mandate: AP2Mandate, phase: str, suffix: str | None = None) scope, raw = consume_once_input(mandate) base = f"{IDEMPOTENCY_PREFIX}:{scope}:{_hash_input(raw)}:{phase}" if suffix: - # Header-safe charset: alphanumeric, underscore, hyphen, dot. Anything else - # becomes ``_`` so the resulting key is always a valid HTTP header value. - safe_suffix = "".join(c if c.isalnum() or c in ("_", "-", ".") else "_" for c in suffix) + # Header-safe charset: ASCII alphanumeric, underscore, hyphen, dot. We use + # ``isascii() and isalnum()`` rather than ``isalnum()`` alone because Python's + # ``str.isalnum`` is Unicode-aware ("É".isalnum() is True), and HTTP header + # tokens per RFC 7230 must be ASCII. Non-ASCII chars (e.g. a localized + # exception class name like "Échec") would otherwise reach the wire and + # httpx would reject the request. Anything outside the safe set becomes "_". + safe_suffix = "".join(c if (c.isascii() and c.isalnum()) or c in ("_", "-", ".") else "_" for c in suffix) base = f"{base}:{safe_suffix[:64]}" return base @@ -200,7 +204,11 @@ def build_commit_body( actual_micros: int | None = None, metadata: dict[str, Any] | None = None, ) -> dict[str, Any]: - """Commit body with deterministic idempotency key derived from ``transaction_id``. + """Commit body with the deterministic idempotency key for this mandate. + + The key is derived from the consume-once scope: ``open_mandate_hash`` when the + mandate carries one (HNP flows), otherwise ``transaction_id`` — see + :func:`idempotency_key` and :func:`consume_once_input`. ``actual_micros`` is validated when supplied (rejects ``bool``, ``float``, and out-of-range ints) so direct callers of this builder get the same protection as diff --git a/runcycles_ap2/models.py b/runcycles_ap2/models.py index 6b3c1c1..b1694a6 100644 --- a/runcycles_ap2/models.py +++ b/runcycles_ap2/models.py @@ -160,7 +160,14 @@ def from_ap2( checkout_hash: str | None = None if checkout_mandate is not None: - checkout_hash = getattr(checkout_mandate, "hash", None) or getattr(checkout_mandate, "checkout_hash", None) + # Two upstream naming variants point at the same field, so fall back only + # when the first is genuinely missing (``None``) — NOT when it's an empty + # string. Empty-string `hash` is bad data; let it propagate so the model's + # ``min_length=1`` rejects it instead of silently masking the corruption + # via the falsy-``or`` short-circuit. + checkout_hash = getattr(checkout_mandate, "hash", None) + if checkout_hash is None: + checkout_hash = getattr(checkout_mandate, "checkout_hash", None) return cls( transaction_id=transaction_id, diff --git a/tests/test_ap2_shape_adapter.py b/tests/test_ap2_shape_adapter.py index bd2c289..9e8983f 100644 --- a/tests/test_ap2_shape_adapter.py +++ b/tests/test_ap2_shape_adapter.py @@ -76,6 +76,42 @@ def test_from_ap2_human_not_present_with_checkout(self) -> None: # When open_mandate_hash is present, the consume-once lock keys on it. assert ":open_mandate:" in idempotency_key(mandate, "reserve") + def test_from_ap2_empty_checkout_hash_is_rejected_not_silently_dropped(self) -> None: + # P3 regression: previously, an upstream `CheckoutMandate(hash="")` would be + # short-circuited to None via the `or` in from_ap2(), masking the bad data + # and bypassing the model's min_length=1 check. Now the empty string is + # preserved and the AP2Mandate constructor raises. + from pydantic import ValidationError + + pm = _PaymentMandate( + transaction_id="tx-empty-ch", + payment_amount=_PaymentAmount(value="1.00", currency="USD"), + payee=_Payee(website="shop.example"), + ) + cm = _CheckoutMandate(hash="") # empty — upstream data corruption + + import pytest + + with pytest.raises(ValidationError): + AP2Mandate.from_ap2(pm, cm) + + def test_from_ap2_checkout_hash_alt_naming_still_falls_through(self) -> None: + # When the first attribute is genuinely missing (no `hash` attr at all), the + # alternate naming (`checkout_hash`) is used — that's the intended fallback. + @dataclass + class _CheckoutMandateAltName: + checkout_hash: str + + pm = _PaymentMandate( + transaction_id="tx-alt", + payment_amount=_PaymentAmount(value="1.00", currency="USD"), + payee=_Payee(website="shop.example"), + ) + cm = _CheckoutMandateAltName(checkout_hash="ch_via_alt_name") + + mandate = AP2Mandate.from_ap2(pm, cm) + assert mandate.checkout_hash == "ch_via_alt_name" + def test_from_ap2_payee_identifier_fallback(self) -> None: # Some AP2 sample shapes use `identifier` when no website is set. pm = _PaymentMandate( diff --git a/tests/test_mapping.py b/tests/test_mapping.py index 3e5e4de..356dfef 100644 --- a/tests/test_mapping.py +++ b/tests/test_mapping.py @@ -80,6 +80,19 @@ def test_unsafe_transaction_id_chars_do_not_leak_to_header(self) -> None: assert "\n" not in key and "\t" not in key and "\r" not in key and "\x00" not in key assert all(c.isalnum() or c in (":", "_", "-", ".") for c in key) + def test_non_ascii_suffix_is_sanitized_to_ascii(self) -> None: + # P2 regression: Python's str.isalnum() is Unicode-aware, so "É".isalnum() is + # True. An earlier filter that used `c.isalnum() or c in (...)` alone would + # let non-ASCII chars reach the Idempotency-Key header — RFC 7230 requires + # ASCII tokens, so httpx would reject the request. The tightened predicate + # `c.isascii() and c.isalnum()` keeps the key strictly ASCII. + m = make_mandate() + key = idempotency_key(m, "release", "Échec") # French "Failure" + assert key.isascii() + # The non-ASCII "É" must have been replaced (with "_"); "chec" survives. + assert "É" not in key + assert key.endswith(":release:_chec") + def test_key_length_bounded(self) -> None: # Hash + phase + suffix always fits inside the protocol's 256-char cap, even # at the longest possible transaction_id (model max 256) and a max-length From 7f0d8db45eadecc3995a1bfd4e6d262eab297b36 Mon Sep 17 00:00:00 2001 From: Albert Mavashev Date: Wed, 13 May 2026 11:06:29 -0400 Subject: [PATCH 6/6] docs: scrub remaining "collapse onto one reservation" wording from internals MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Earlier passes fixed README and feature-bullet wording but missed two internal docstrings/comments that still used the inaccurate phrasing: - runcycles_ap2/_constants.py — the IDEMPOTENCY_SCOPE_OPEN_MANDATE comment block said all checkouts "collapse onto one reservation" - runcycles_ap2/mapping.py — consume_once_input.__doc__ said the server "collapses them onto one reservation" Replaced both with the accurate "share one (tenant, endpoint, idempotency_key) bucket; identical payloads replay, divergent payloads hit IDEMPOTENCY_MISMATCH; either way no second valid reservation" framing. Also: - tests/test_mapping.py — TestIdempotencyKeyOpenMandateScope class docstring still said "must collapse onto one reservation"; reworded - AUDIT.md — stale test-name reference (test_open_mandate_overuse_collapses_distinct_transactions_onto_one_key was renamed in a follow-up to ...shares_idempotency_bucket...); updated and annotated. The two remaining "collapse" matches in AUDIT.md (lines 50 and 71) are intentional — they quote the old phrasing inside historical entries that document the wording fix itself, not current behavior claims. No behavior change. Gates green: ruff + mypy strict + 110 tests pass. --- AUDIT.md | 2 +- runcycles_ap2/_constants.py | 13 ++++++++----- runcycles_ap2/mapping.py | 12 +++++++----- tests/test_mapping.py | 6 ++++-- 4 files changed, 20 insertions(+), 13 deletions(-) diff --git a/AUDIT.md b/AUDIT.md index 7ca86aa..31957a5 100644 --- a/AUDIT.md +++ b/AUDIT.md @@ -68,7 +68,7 @@ No public API additions in this round. 1. **[P0-A] Open-mandate consume-once locking** — earlier rounds keyed idempotency on `transaction_id` only. That breaks the AP2 spec's normative defense (specification §6) against an autonomous agent presenting subsequent open mandates without a rejection receipt: two distinct transactions derived from the same open mandate would produce different idempotency keys and BOTH would be authorized. **Fix:** when `AP2Mandate.open_mandate_hash` is present, key the lock on `open_mandate_hash` instead of `transaction_id`. Scope namespace embedded in the key so the two buckets never collide server-side: - human-not-present: `ap2:open_mandate:{sha256(open_mandate_hash)[:32]}:{phase}[:{suffix}]` - default / human-present: `ap2:tx:{sha256(transaction_id)[:32]}:{phase}[:{suffix}]` - New `mapping.consume_once_input()` returns the (scope, raw_value) pair. The `idempotency_key()` function signature changed from `(transaction_id: str, ...)` to `(mandate: AP2Mandate, ...)` to enable automatic scope selection. **Wire-shape change** — pre-PyPI so no migration. New test class `TestIdempotencyKeyOpenMandateScope` + new test `test_open_mandate_overuse_collapses_distinct_transactions_onto_one_key` cover the regression. + New `mapping.consume_once_input()` returns the (scope, raw_value) pair. The `idempotency_key()` function signature changed from `(transaction_id: str, ...)` to `(mandate: AP2Mandate, ...)` to enable automatic scope selection. **Wire-shape change** — pre-PyPI so no migration. New test class `TestIdempotencyKeyOpenMandateScope` + `test_open_mandate_overuse_shares_idempotency_bucket_across_transactions` cover the regression (the test was renamed in the follow-up wording pass to avoid the "collapse" overclaim). 2. **[P0-B] Post-PSP commit failures now raise instead of silently logging** — `RESERVATION_FINALIZED`, `RESERVATION_EXPIRED`, and `IDEMPOTENCY_MISMATCH` previously returned silently with a warning log; callers only saw `guard.committed == False`. After the PSP body has run, any of these is a reconciliation event the caller MUST handle. `RESERVATION_EXPIRED` in particular is dangerous: the PSP may have charged while Cycles reclaimed the budget on TTL. **Fix:** new `AP2GuardCommitUncertain` exception (carries `error_code` + `request_id` + `reservation_id`); raised from `_handle_commit` for all three terminal-status codes. Still no auto-release (a prior commit may already have settled). Existing tests that asserted silent behavior were flipped to expect the raise; new `RESERVATION_EXPIRED` test added. diff --git a/runcycles_ap2/_constants.py b/runcycles_ap2/_constants.py index 437b802..baef385 100644 --- a/runcycles_ap2/_constants.py +++ b/runcycles_ap2/_constants.py @@ -33,11 +33,14 @@ # Scope namespace embedded in the idempotency key. The scope determines what counts as # "the same payment attempt" for server-side dedup: -# - "open_mandate": all checkouts derived from one open mandate collapse onto one -# reservation. This matches AP2's normative warning (specification §6) that a -# shopping agent must not present subsequent open mandates without a rejection -# receipt — without this scope, two transactions from one open mandate produce -# different transaction_ids and both would be authorized. +# - "open_mandate": all checkouts derived from one open mandate land in the same +# (tenant, endpoint, idempotency_key) bucket. Identical replays return the +# original reservation; divergent payloads are rejected with IDEMPOTENCY_MISMATCH. +# Either way, no second valid reservation. This matches AP2's normative warning +# (specification §6) that a shopping agent must not present subsequent open +# mandates without a rejection receipt — without this scope, two transactions +# from one open mandate produce different transaction_ids and both would be +# authorized. # - "tx": one transaction_id == one payment attempt (the default for human-present # and any flow without an open mandate). IDEMPOTENCY_SCOPE_OPEN_MANDATE: Final[str] = "open_mandate" diff --git a/runcycles_ap2/mapping.py b/runcycles_ap2/mapping.py index d451828..6c12d07 100644 --- a/runcycles_ap2/mapping.py +++ b/runcycles_ap2/mapping.py @@ -48,11 +48,13 @@ def consume_once_input(mandate: AP2Mandate) -> tuple[str, str]: Returns ``(scope, raw_value)`` where: - ``scope == "open_mandate"`` and ``raw_value == mandate.open_mandate_hash`` when the mandate carries an open mandate hash. This makes every checkout derived from - the same open mandate collide on the same idempotency key — the server's - ``(tenant, endpoint, idempotency_key)`` dedup then collapses them onto one - reservation. **This is the AP2 spec's normative defense** against an autonomous - agent presenting subsequent open mandates without a rejection receipt - (specification §6). + the same open mandate collide on the same idempotency key — they all land in one + ``(tenant, endpoint, idempotency_key)`` bucket. Identical payloads replay the + original reservation; divergent payloads are rejected with + ``IDEMPOTENCY_MISMATCH``. Either way the second attempt cannot create a second + valid reservation. **This is the AP2 spec's normative defense** against an + autonomous agent presenting subsequent open mandates without a rejection + receipt (specification §6). - ``scope == "tx"`` and ``raw_value == mandate.transaction_id`` otherwise. One transaction == one payment attempt (the human-present / no-open-mandate case). diff --git a/tests/test_mapping.py b/tests/test_mapping.py index 356dfef..cce0f2e 100644 --- a/tests/test_mapping.py +++ b/tests/test_mapping.py @@ -105,8 +105,10 @@ class TestIdempotencyKeyOpenMandateScope: """P0-A: when ``open_mandate_hash`` is present, the lock scope shifts to it. This is the AP2-spec consume-once boundary for human-not-present flows: every - checkout derived from the same open mandate must collapse onto one reservation, - even when their transaction_ids differ. + checkout derived from the same open mandate lands in one ``(tenant, endpoint, + idempotency_key)`` bucket. Identical payloads replay the original reservation; + divergent payloads are rejected with ``IDEMPOTENCY_MISMATCH``. Either way, no + second valid reservation — even when the transaction_ids differ. """ def test_open_mandate_scope_used_when_hash_present(self) -> None: