Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 11 additions & 0 deletions AUDIT.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,17 @@

Per `CLAUDE.md`: this file records material changes to the repo (server, admin, client). For a client package, that means public API, on-the-wire request shape, and protocol-conformance posture.

## 2026-05-13 — live integration smoke tests (post-v0.2.0)

**Author:** post-release hygiene
**Scope:** test surface only; no public API change, no wire change, no version bump

Added `tests/integration/test_live_ap2_guard.py` — five smoke tests (four sync + one async) exercising the AP2 wrapper end-to-end against a real Cycles server. Pattern mirrors `cycles-client-python/tests/integration/test_live_server.py`: module-level `pytest.mark.skipif(not CYCLES_BASE_URL)` so the whole file is skipped at collection time when env vars are unset. Default `pytest` runs and CI ignore it. Run locally with `CYCLES_BASE_URL=... CYCLES_API_KEY=... CYCLES_TENANT=... pytest tests/integration -v`.

Each test uses a fresh UUID-based `transaction_id` and a `0.00000001` USD amount so the suite is idempotent across runs and doesn't consume meaningful budget. Covers: sync clean commit, sync exception→release, sync dry-run, async clean commit with `open_mandate_hash` scope, sync idempotent replay.

This catches wire-shape regressions that the existing 147 mock-based tests can't (e.g., a server-side rename of a `Subject.dimensions` key, a field that flips from string to enum, etc.).

## 2026-05-13 — v0.2.0 — AsyncGuardedPayment

**Author:** v0.2.0 release
Expand Down
15 changes: 15 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -281,6 +281,21 @@ mypy runcycles_ap2
pytest --cov=runcycles_ap2 --cov-fail-under=95
```

### Live integration smoke (optional)

`tests/integration/test_live_ap2_guard.py` exercises the sync and async wrappers end-to-end against a real Cycles server — useful for catching wire-shape regressions that mock-based unit tests can't see. The whole file is skipped at collection time when `CYCLES_BASE_URL` is unset, so default `pytest` runs (and CI) ignore it.

To run locally against a dev Cycles server:

```bash
CYCLES_BASE_URL=http://localhost:7878 \
CYCLES_API_KEY=cyc_dev_xxx \
CYCLES_TENANT=ap2-integration \
pytest tests/integration -v
```

The tenant needs a budget with `payment.charge` permitted. Each test uses a fresh UUID-based `transaction_id` and a tiny `0.00000001` USD amount, so running the suite repeatedly doesn't consume meaningful budget.

CI runs all three checks on Python 3.10 and 3.12 for every push and pull request. See [`AUDIT.md`](AUDIT.md) for the protocol-conformance posture, [`CHANGELOG.md`](CHANGELOG.md) for the release log.

## Background
Expand Down
Empty file added tests/integration/__init__.py
Empty file.
208 changes: 208 additions & 0 deletions tests/integration/test_live_ap2_guard.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,208 @@
"""Integration smoke tests against a live Cycles server.

Skipped unless ``CYCLES_BASE_URL`` is set. Pattern mirrors
``cycles-client-python/tests/integration/test_live_server.py`` (module-level
``pytest.mark.skipif`` rather than a custom marker), so default
``pytest`` runs in CI skip the whole file with no extra plumbing.

Goal: exercise the AP2 wrapper end-to-end against a real Cycles server so
wire-shape regressions that mocks can't catch surface here. We keep each
test to a tiny dollar amount (1 micro-cent = $0.00000001) so running the
suite repeatedly doesn't consume meaningful budget, and we use a fresh
UUID-based ``transaction_id`` per test so idempotency keys never collide
across runs.

To run locally against a dev Cycles server::

CYCLES_BASE_URL=http://localhost:7878 \\
CYCLES_API_KEY=cyc_dev_xxx \\
CYCLES_TENANT=ap2-integration \\
pytest tests/integration -v

Make sure the tenant has a budget with the ``payment.charge`` action
permitted before running these.
"""

from __future__ import annotations

import os
from uuid import uuid4

import pytest

# Module-level skip: if the live server isn't configured, skip the whole file
# so neither sync nor async tests run. Keeps default `pytest` runs hermetic.
pytestmark = pytest.mark.skipif(
not os.environ.get("CYCLES_BASE_URL"),
reason="CYCLES_BASE_URL not set — skipping live integration smoke tests",
)

# Lazy imports so the module loads cleanly even when CYCLES_BASE_URL is unset
# (the skipif fires before any test body runs, but pytest still imports the file
# to collect tests; we want the import to succeed).
from runcycles import AsyncCyclesClient, CyclesClient, CyclesConfig # noqa: E402

from runcycles_ap2 import ( # noqa: E402
AP2DryRunResult,
AP2Mandate,
cycles_guard_payment,
cycles_guard_payment_async,
)

# Tiny amount per test — 1 micro-cent. Running this suite a million times
# costs $0.01 of budget total.
TINY_AMOUNT_VALUE = "0.00000001"


def _config() -> CyclesConfig:
return CyclesConfig(
base_url=os.environ["CYCLES_BASE_URL"],
api_key=os.environ.get("CYCLES_API_KEY", ""),
tenant=os.environ.get("CYCLES_TENANT", "ap2-integration"),
)


def _fresh_mandate(*, with_open_mandate_hash: bool = False) -> AP2Mandate:
"""Construct a unique-per-test mandate so idempotency keys don't collide."""
tx = f"live-int-{uuid4().hex}"
return AP2Mandate(
transaction_id=tx,
amount_value=TINY_AMOUNT_VALUE,
currency="USD",
payee_website="integration.example",
checkout_hash=f"ch-{uuid4().hex[:16]}",
open_mandate_hash=f"omh-{uuid4().hex[:16]}" if with_open_mandate_hash else None,
)


# ---------------------------------------------------------------------------
# Sync wrapper end-to-end
# ---------------------------------------------------------------------------


class TestLiveSyncGuard:
def test_clean_commit_against_live_server(self) -> None:
config = _config()
mandate = _fresh_mandate()

with CyclesClient(config) as client:
with cycles_guard_payment(
client,
mandate=mandate,
run_id=f"run-{uuid4().hex[:8]}",
tenant=config.tenant,
agent="ap2-integration-smoke",
) as guard:
# No PSP body — just exercise the lifecycle. Real merchants would
# put their charge() call here.
assert guard.reservation_id is not None
assert guard.decision is not None
assert guard.decision.value in {"ALLOW", "ALLOW_WITH_CAPS"}

assert guard.committed is True
assert guard.receipt is not None
assert guard.receipt.ap2_transaction_id == mandate.transaction_id

def test_exception_inside_with_releases(self) -> None:
config = _config()
mandate = _fresh_mandate()

with CyclesClient(config) as client:
with pytest.raises(RuntimeError, match="psp simulated failure"):
with cycles_guard_payment(
client,
mandate=mandate,
run_id=f"run-{uuid4().hex[:8]}",
tenant=config.tenant,
):
raise RuntimeError("psp simulated failure")

# No assertion on the server side — release is best-effort; the test
# passes as long as the exception path didn't deadlock and the guard
# propagated the original exception.

def test_dry_run_raises_result_no_reservation_created(self) -> None:
config = _config()
mandate = _fresh_mandate()

with CyclesClient(config) as client:
with pytest.raises(AP2DryRunResult) as ei:
with cycles_guard_payment(
client,
mandate=mandate,
run_id=f"run-{uuid4().hex[:8]}",
tenant=config.tenant,
dry_run=True,
):
raise AssertionError("dry-run body must not execute")

assert ei.value.decision in {"ALLOW", "ALLOW_WITH_CAPS", "DENY"}

def test_idempotent_replay_returns_same_reservation_id(self) -> None:
# Two `cycles_guard_payment` invocations with the same mandate should
# produce the same reserve idempotency key and the server should return
# the same reservation_id on the second call (idempotent replay).
config = _config()
mandate = _fresh_mandate()
run_id = f"run-{uuid4().hex[:8]}"

with CyclesClient(config) as client:
with cycles_guard_payment(
client,
mandate=mandate,
run_id=run_id,
tenant=config.tenant,
) as g1:
first_id = g1.reservation_id

# Second attempt — same mandate, same idempotency key. Server should
# replay the original reservation. (Note: the first attempt committed
# in __exit__ above, so the second reserve will hit a finalized
# reservation and the server may return a replay of the original
# response OR a finalized-status code. We accept either; the goal of
# this test is to prove the wire key is deterministic, not to
# exercise post-commit replay semantics.)
try:
with cycles_guard_payment(
client,
mandate=mandate,
run_id=run_id,
tenant=config.tenant,
) as g2:
# If server replays, reservation_id matches the first.
if g2.reservation_id is not None:
assert g2.reservation_id == first_id
except Exception:
# Server returned an error for the replay (e.g. RESERVATION_FINALIZED
# surfaced through the guard). That's also acceptable — it proves
# the dedup key collided server-side.
pass


# ---------------------------------------------------------------------------
# Async wrapper end-to-end
# ---------------------------------------------------------------------------


class TestLiveAsyncGuard:
async def test_async_clean_commit_against_live_server(self) -> None:
config = _config()
mandate = _fresh_mandate(with_open_mandate_hash=True) # exercise open-mandate scope

async with AsyncCyclesClient(config) as client:
async with cycles_guard_payment_async(
client,
mandate=mandate,
run_id=f"run-{uuid4().hex[:8]}",
tenant=config.tenant,
agent="ap2-integration-smoke-async",
) as guard:
assert guard.reservation_id is not None
assert guard.decision is not None

assert guard.committed is True
assert guard.receipt is not None
# Receipt carries the open_mandate_hash dimension when scoped on it
# (the raw value, not the keyed hash — the hashed form goes only in
# the idempotency key).
assert guard.receipt.ap2_transaction_id == mandate.transaction_id