Commit b351db1

Merge branch 'feature/demo-app-v0.3.0' into develop
MedAssist AI demo application — full-stack FastAPI app pairing a local LLM (vLLM + Gemma) with the AgentAuth v0.3.0 SDK. Demonstrates per-patient scoped agents, delegation, tool gating via scope_is_subset, and the complete agent lifecycle (create → renew → release → validate). Includes repo decoupling from agentauth-core, expanded 22-story acceptance tracker, first-principles design doc, and Scope Creation Tool backlog proposal. Milestone: SDK + demo + docs + acceptance tests all on one branch.
2 parents 410f02d + 00d7a80 commit b351db1

46 files changed

Lines changed: 5769 additions & 105 deletions

.agents/skills/broker/SKILL.md

Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
+---
+name: broker
+description: Use when needing to start, stop, or check the AgentAuth core broker for integration testing, live verification, or acceptance tests
+---
+
+# Broker Management
+
+Manage the AgentAuth core broker Docker stack for local SDK testing.
+
+## Usage
+
+- `/broker up` — Start the broker
+- `/broker down` — Stop the broker
+- `/broker status` — Check if broker is running and healthy
+
+## Instructions
+
+Parse the argument from the skill invocation. Default to `status` if no argument given.
+
+### Configuration
+
+| Variable | Default | Override |
+|----------|---------|----------|
+| `AA_ADMIN_SECRET` | `live-test-secret-32bytes-long-ok` | Pass as second arg: `/broker up mysecret` |
+| `AA_HOST_PORT` | `8080` | Set env var before invoking |
+| Broker path | `./broker` (vendored in-repo) ||
+
+### `up`
+
+```bash
+export AA_ADMIN_SECRET="${SECRET:-live-test-secret-32bytes-long-ok}"
+./broker/scripts/stack_up.sh
+```
+
+After stack_up completes, run a health check:
+
+```bash
+curl -sf http://127.0.0.1:${AA_HOST_PORT:-8080}/v1/health
+```
+
+Report success or failure clearly. If health check fails, wait 3 seconds and retry once — the broker may need a moment after `docker compose up -d`.
+
+### `down`
+
+```bash
+./broker/scripts/stack_down.sh
+```
+
+### `status`
+
+```bash
+curl -sf http://127.0.0.1:${AA_HOST_PORT:-8080}/v1/health
+```
+
+Report whether the broker is reachable. If not, suggest `/broker up`.
+
+## Output Format
+
+Always announce the action and result:
+
+```
+Broker: [action] — [result]
+```
+
+Examples:
+- `Broker: up — healthy at http://127.0.0.1:8080`
+- `Broker: down — stack removed`
+- `Broker: status — not reachable (run /broker up)`
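
Reviewer note: the skill describes the health check in prose ("wait 3 seconds and retry once"). The sketch below shows one way that rule could be expressed in Python. It is not part of this commit; `http_probe` and `broker_healthy` are hypothetical names, and only the `/v1/health` path, the 3-second wait and the single retry come from the skill text.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True when the URL answers with a 2xx status (roughly `curl -sf`)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def broker_healthy(
    probe: Callable[[], bool],
    retries: int = 1,
    delay: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Probe once; on failure wait `delay` seconds and retry, up to `retries` times."""
    if probe():
        return True
    for _ in range(retries):
        sleep(delay)
        if probe():
            return True
    return False


# Example call (assumes the default port from the skill's configuration table):
# broker_healthy(lambda: http_probe("http://127.0.0.1:8080/v1/health"))
```

Passing the probe and the sleep function as parameters keeps the retry rule testable without a running broker.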
Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
+---
+name: devflow-client
+description: >
+Use when starting any development work on AgentAuth Python SDK — loads the
+Development Flow, checks tracker state, and tells you which step to execute next.
+Trigger on: "start dev", "what's next", "resume work", "continue",
+"where are we", "pick up where we left off", any development request.
+No council steps, Python-specific gates.
+---
+
+# AgentAuth Python SDK — Development Flow
+
+Start here for any development work. This skill loads context and tells you
+what to do next.
+
+## Instructions
+
+1. Read these files in order:
+- `MEMORY.md` (repo root)
+- `FLOW.md` (repo root) — if it doesn't exist or has no current step, start at Step 1
+- `.plans/tracker.jsonl` (current state of all stories and tasks) — create if missing
+
+2. From FLOW.md + tracker, identify the current step:
+
+| Step | What | Skill | Model | Done when |
+|------|------|-------|-------|-----------|
+| 1 | Brainstorm | `superpowers:brainstorming` | **opus** | Design doc in `.plans/designs/` |
+| 2 | Write Spec | Follow `.plans/SPEC-TEMPLATE.md` | **opus** | Spec in `.plans/specs/` |
+| 3 | Impl Plan | `superpowers:writing-plans` | **opus** | Plan in `.plans/` with tasks |
+| 4 | Acceptance Tests | Write stories in `tests/sdk-core/` | **opus** | Stories with Who/What/Why/How/Expected |
+| 5 | Register Tracker | Update `.plans/tracker.jsonl` | any | All stories + tasks registered |
+| 6 | Code | `superpowers:executing-plans` | **sonnet** | All tasks PASS, gates green |
+| 7 | Review | `superpowers:requesting-code-review` + `writing-plans` | **sonnet** / **opus** | Findings documented + fix plan written |
+| 7.5 | Fix Findings | `superpowers:executing-plans` | **sonnet** | Fix plan complete, gates green |
+| 8 | Live Test | `superpowers:verification-before-completion` | **sonnet** | Integration tests PASS against live broker |
+| 9 | Merge | `superpowers:finishing-a-development-branch` | any | Human approved, merged to `main` |
+
+**No council steps.** This is a client SDK — faster iteration, fewer review gates.
+
+**Step 7:** Reviewer produces findings AND a fix plan. No ad-hoc fixes.
+
+**Step 6 + 7.5:** Use `executing-plans` for all coding — even small fixes.
+
+3. Announce: "Dev Flow (Python SDK): Step N — [step name]. [X/Y tasks done]. Next: [action]."
+
+4. Invoke the relevant superpowers skill if one is listed.
+
+## API Source of Truth
+
+The broker API contract lives in-repo (vendored, frozen):
+- **API contract:** `broker/docs/api.md` — see `broker/VENDOR.md` for provenance
+
+Read the API doc before writing or modifying any HTTP call in the SDK.
+
+## Gates (run after every commit)
+
+```bash
+uv run ruff check . # G1: lint
+uv run mypy --strict src/ # G2: type check
+uv run pytest tests/unit/ # G3: unit tests
+```
+
+All three must PASS before moving to the next task.
+
+## Contamination Check
+
+After any HITL removal work:
+```bash
+grep -ri "hitl\|approval\|oidc\|federation\|sidecar" src/ tests/
+```
+Must return nothing.
+
+## Live Broker Testing
+
+Integration and acceptance tests require a running broker. Use the in-repo vendored copy:
+```bash
+export AA_ADMIN_SECRET="live-test-secret-32bytes-long-ok"
+./broker/scripts/stack_up.sh
+```
+
+Then run SDK integration tests:
+```bash
+uv run pytest -m integration
+```
+
+## Rules
+
+- Branch from `main`. Feature branches: `feature/*`, fix branches: `fix/*`.
+- Plans save to `.plans/`, specs to `.plans/specs/`, designs to `.plans/designs/`.
+- Update tracker when story/task status changes.
+- **Run gates after each commit.** Fix failures before moving on.
+- **Update `CHANGELOG.md` with every user-facing change** — same commit as the code.
+- **Strict types everywhere** — no untyped variables, parameters, or returns.
+- **`uv` only** — never pip, poetry, or conda.
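
Reviewer note: the three gates are listed as separate shell commands, with the rule that all must pass before moving on. A minimal sketch of how that rule could be scripted is shown here, assuming a stop-at-first-failure policy (the skill only says all three must pass). It is not part of this commit; `GATES`, `run_command` and `first_failing_gate` are hypothetical names, while the commands themselves are copied from the skill.

```python
import subprocess
from typing import Callable, Optional, Sequence

# Gate labels and commands as listed in the skill file.
GATES: list[tuple[str, list[str]]] = [
    ("G1: lint", ["uv", "run", "ruff", "check", "."]),
    ("G2: type check", ["uv", "run", "mypy", "--strict", "src/"]),
    ("G3: unit tests", ["uv", "run", "pytest", "tests/unit/"]),
]


def run_command(cmd: Sequence[str]) -> int:
    """Run one gate command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def first_failing_gate(
    gates: Sequence[tuple[str, list[str]]] = GATES,
    run: Callable[[Sequence[str]], int] = run_command,
) -> Optional[str]:
    """Run gates in order; return the label of the first failure, or None if all pass."""
    for label, cmd in gates:
        if run(cmd) != 0:
            return label  # Assumed policy: stop early, later gates are skipped.
    return None
```

Calling `first_failing_gate()` from the repo root would run the real tools; passing a stub as `run` allows a dry check of the ordering.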

.claude/skills/broker/SKILL.md

Lines changed: 3 additions & 5 deletions
@@ -23,14 +23,13 @@ Parse the argument from the skill invocation. Default to `status` if no argument
 |----------|---------|----------|
 | `AA_ADMIN_SECRET` | `live-test-secret-32bytes-long-ok` | Pass as second arg: `/broker up mysecret` |
 | `AA_HOST_PORT` | `8080` | Set env var before invoking |
-| Core project path | `~/proj/agentauth-core` ||
+| Broker path | `./broker` (vendored in-repo) ||
 
 ### `up`
 
 ```bash
 export AA_ADMIN_SECRET="${SECRET:-live-test-secret-32bytes-long-ok}"
-cd ~/proj/agentauth-core
-./scripts/stack_up.sh
+./broker/scripts/stack_up.sh
 ```
 
 After stack_up completes, run a health check:
@@ -44,8 +43,7 @@ Report success or failure clearly. If health check fails, wait 3 seconds and ret
 ### `down`
 
 ```bash
-cd ~/proj/agentauth-core
-./scripts/stack_down.sh
+./broker/scripts/stack_down.sh
 ```
 
 ### `status`

.claude/skills/devflow-client/SKILL.md

Lines changed: 6 additions & 9 deletions
@@ -5,7 +5,7 @@ description: >
 Development Flow, checks tracker state, and tells you which step to execute next.
 Trigger on: "start dev", "what's next", "resume work", "continue",
 "where are we", "pick up where we left off", any development request.
-Adapted from agentauth-core's devflow — no council steps, Python-specific gates.
+No council steps, Python-specific gates.
 ---
 
 # AgentAuth Python SDK — Development Flow
@@ -45,12 +45,10 @@ what to do next.
 
 4. Invoke the relevant superpowers skill if one is listed.
 
-## Parent Project Context
+## API Source of Truth
 
-The API source of truth lives in the parent project:
-- **API contract:** `~/proj/agentauth-core/docs/api.md`
-- **Design doc:** `~/proj/agentauth-core/.plans/designs/2026-04-01-python-sdk-repo-design.md`
-- **Strategic decisions:** `~/proj/agentauth-core/FLOW.md`
+The broker API contract lives in-repo (vendored, frozen):
+- **API contract:** `broker/docs/api.md` — see `broker/VENDOR.md` for provenance
 
 Read the API doc before writing or modifying any HTTP call in the SDK.
 
@@ -74,11 +72,10 @@ Must return nothing.
 
 ## Live Broker Testing
 
-Integration and acceptance tests require a running core broker:
+Integration and acceptance tests require a running broker. Use the in-repo vendored copy:
 ```bash
-cd ~/proj/agentauth-core
 export AA_ADMIN_SECRET="live-test-secret-32bytes-long-ok"
-./scripts/stack_up.sh
+./broker/scripts/stack_up.sh
 ```
 
 Then run SDK integration tests:

.plans/2026-04-02-sdk-broker-gap-review.md

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 > **Date:** 2026-04-02
 > **Status:** Reviewed — Codex adversarial review added findings 12–15
 > **Scope:** Every field the broker returns vs what the Python SDK exposes, drops, or hides.
-> **Source of truth:** Broker handlers in `agentauth-core/internal/handler/` and `agentauth-core/internal/admin/`, `agentauth-core/internal/app/`. API spec: `agentauth-core/docs/api.md`.
+> **Source of truth:** Broker handlers in `broker/internal/handler/`, `broker/internal/admin/`, `broker/internal/app/` (vendored). API spec: `broker/docs/api.md`.
 
 ---

.plans/2026-04-05-v0.3.0-phase2-cache-correctness-plan.md

Lines changed: 1 addition & 3 deletions
@@ -863,10 +863,8 @@ Expected: all PASS.
 
 First ensure broker is up:
 ```bash
-cd ~/proj/agentauth-core
 export AA_ADMIN_SECRET="live-test-secret-32bytes-long-ok"
-./scripts/stack_up.sh
-cd -
+./broker/scripts/stack_up.sh
 ```
 
 Then:

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+{"type":"story","id":"DEMO-PC1","title":"Broker Is Running and Accessible","classification":"PRECONDITION","status":"NOT_VERIFIED","spec":".plans/specs/2026-04-01-demo-app-spec.md"}
+{"type":"story","id":"DEMO-PC2","title":"Anthropic API Key Is Valid","classification":"PRECONDITION","status":"NOT_VERIFIED","spec":".plans/specs/2026-04-01-demo-app-spec.md"}
+{"type":"story","id":"DEMO-PC3","title":"Demo App Starts Successfully","classification":"PRECONDITION","status":"NOT_VERIFIED","spec":".plans/specs/2026-04-01-demo-app-spec.md"}
+{"type":"story","id":"DEMO-S1","title":"Pipeline Processes All 12 Transactions","classification":"ACCEPTANCE","status":"NOT_STARTED","spec":".plans/specs/2026-04-01-demo-app-spec.md"}
+{"type":"story","id":"DEMO-S2","title":"Each Agent Gets Correctly Scoped Credential","classification":"ACCEPTANCE","status":"NOT_STARTED","spec":".plans/specs/2026-04-01-demo-app-spec.md"}
+{"type":"story","id":"DEMO-S3","title":"Prompt Injection Contained by Credential Layer","classification":"ACCEPTANCE","status":"NOT_STARTED","spec":".plans/specs/2026-04-01-demo-app-spec.md"}
+{"type":"story","id":"DEMO-S4","title":"Report Writer Never Sees Raw Transactions","classification":"ACCEPTANCE","status":"NOT_STARTED","spec":".plans/specs/2026-04-01-demo-app-spec.md"}
+{"type":"story","id":"DEMO-S5","title":"Delegation Chain Shows Scope Attenuation","classification":"ACCEPTANCE","status":"NOT_STARTED","spec":".plans/specs/2026-04-01-demo-app-spec.md"}
+{"type":"story","id":"DEMO-S6","title":"Audit Trail Has Verifiable Hash Chain","classification":"ACCEPTANCE","status":"NOT_STARTED","spec":".plans/specs/2026-04-01-demo-app-spec.md"}
+{"type":"story","id":"DEMO-S7","title":"All Tokens Revoked After Pipeline Completes","classification":"ACCEPTANCE","status":"NOT_STARTED","spec":".plans/specs/2026-04-01-demo-app-spec.md"}
+{"type":"story","id":"DEMO-S8","title":"Startup Fails Clearly When Dependencies Missing","classification":"ACCEPTANCE","status":"NOT_STARTED","spec":".plans/specs/2026-04-01-demo-app-spec.md"}
+{"type":"story","id":"DEMO-S9","title":"Dashboard Shows Real-Time Token Lifecycle","classification":"ACCEPTANCE","status":"NOT_STARTED","spec":".plans/specs/2026-04-01-demo-app-spec.md"}
+{"type":"step","id":"STEP-1","title":"Brainstorm","status":"DONE","note":"Design v2 approved - real LLM pipeline, not showcase booth"}
+{"type":"step","id":"STEP-2","title":"Write Spec","status":"DONE","note":"Rewritten against v2 design"}
+{"type":"step","id":"STEP-3","title":"Impl Plan","status":"DONE","note":"Plan saved to .plans/2026-04-01-demo-app-plan.md — 10 tasks"}
+{"type":"step","id":"STEP-4","title":"Acceptance Tests","status":"DONE","note":"12 stories (3 PC + 9 ACC) in tests/demo-app/user-stories.md"}
+{"type":"step","id":"STEP-5","title":"Register Tracker","status":"DONE","note":"This file"}
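
Reviewer note: the tracker is a JSON Lines file in which every entry carries a `type` and a `status`. A small sketch of how those entries could be tallied follows. It is not part of this commit; `summarize_tracker` and `SAMPLE` are hypothetical names, and the sample rows are trimmed copies of entries from the file above (the `spec` and `note` fields are dropped for brevity).

```python
import json
from collections import Counter
from typing import Iterable


def summarize_tracker(lines: Iterable[str]) -> dict[str, Counter]:
    """Count entries per status, grouped by entry type ("story" or "step")."""
    summary: dict[str, Counter] = {}
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # Tolerate blank lines between records.
        entry = json.loads(raw)
        summary.setdefault(entry["type"], Counter())[entry["status"]] += 1
    return summary


# Trimmed sample rows taken from the tracker in this commit.
SAMPLE = [
    '{"type":"story","id":"DEMO-PC1","classification":"PRECONDITION","status":"NOT_VERIFIED"}',
    '{"type":"story","id":"DEMO-S1","classification":"ACCEPTANCE","status":"NOT_STARTED"}',
    '{"type":"story","id":"DEMO-S2","classification":"ACCEPTANCE","status":"NOT_STARTED"}',
    '{"type":"step","id":"STEP-1","title":"Brainstorm","status":"DONE"}',
]

summary = summarize_tracker(SAMPLE)
print(summary["story"]["NOT_STARTED"], summary["step"]["DONE"])
```

Against the real file, the same function could be fed an open file handle, since iterating a text file yields one record per line.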

.plans/PROMPT.md

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+Gemma-4 "felt" like it was reading because it was actually trying to build a mental model, which is why it hit the context limit.
+
+### The Solution: The "Decomposition" Strategy
+
+Instead of asking the agent to "Read the code and write the SDK," you must break this into a multi-stage pipeline. You need to separate
+Discovery from Design from Implementation.
+
+Here is the exact workflow you should use with me (or any agent) to get that Python SDK skeleton:
+
+#### Stage 1: The "Knowledge Extraction" Phase (Building the Map)
+
+Do not ask for code yet. Ask for a "System Map."
+- Prompt: "I want to build a Python SDK for this Go application. First, I don't want any code. I want you to explore the Go codebase and
+create a high-level 'Logic Map.' Identify: 1. The primary entry points (APIs/CLI), 2. The core data structures (the 'Models'), and 3. The
+lifecycle of a request/command. Use ls, grep, and read to build this map. Output it as a Markdown summary."
+- Goal: This forces the model to use its tools to build a compressed version of the logic in its own "memory" without filling the context
+with raw code yet.
+
+#### Stage 2: The "Interface Definition" Phase (The Contract)
+
+Once you have the map, define the "Contract."
+- Prompt: "Based on the Logic Map we just created, let's define the Python Interface. I want a list of Python Classes and Methods that
+would allow a developer to do [X, Y, and Z] without knowing the Go implementation. Do not write the implementation logic; just write the
+Class signatures and Docstrings in Python."
+- Goal: This creates the "Skeleton" you mentioned. By focusing on signatures and docstrings, you aren't wasting tokens on implementation
+details.
+
+#### Stage 3: The "Incremental Implementation" Phase (The Heavy Lifting)
+
+Now, you implement the skeleton piece by piece. This is where most people fail by trying to do it all at once.
+- Prompt: "Now, let's implement the Client class from our skeleton. To do this accurately, please read internal/transport/client.go and
+pkg/models/user.go in the Go repo. Ensure the Python Client handles the authentication flow exactly as the Go version does. Use asyncio
+for the Python implementation."
+- Goal: You are providing the "source of truth" (the specific Go files) only when the model is ready to write the specific Python code.
+
+### Why this works for Gemma-4 specifically:
+
+1. Context Management: By breaking it into stages, you clear the "noise" of the Go implementation once the "Map" is built. You aren't
+carrying 50 Go files in the context while trying to write 10 Python files.
+2. Precision: You are directing the "Attention" of the model. Instead of "Look at everything," you are saying "Look at these 2 files to
+solve this 1 specific class."
+3. Verification: In Stage 1 and 2, you can check the model's work. If the "Logic Map" is wrong, you catch it before it writes a single
+line of broken Python.
+
+If you want to start right now, give me this command:
+
+│ "I need to build a Python SDK for a Go application. Let's start with Stage 1. Explore the current directory, identify the core Go
+│ modules, and give me a summary of how the application's main logic flows. Don't write any Python yet; just build the map."
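
Reviewer note: the three-stage strategy in `PROMPT.md` amounts to a pipeline in which each stage receives only the previous stage's output, not the full history. The sketch below illustrates that shape under stated assumptions. It is not part of this commit; `STAGES`, `run_stages` and the `ask` callable (a stand-in for whatever sends a prompt to a model) are hypothetical, and the prompt wording is abbreviated from the file.

```python
from typing import Callable, Sequence

# Stage names and abbreviated prompts; "{previous}" is filled with the prior stage's output.
STAGES: list[tuple[str, str]] = [
    ("map", "Explore the Go codebase and output a Markdown Logic Map. Do not write code."),
    ("interface", "Based on this Logic Map, write Python class signatures and docstrings only:\n{previous}"),
    ("implement", "Implement the Client class from this skeleton, reading only the Go files it needs:\n{previous}"),
]


def run_stages(
    ask: Callable[[str], str],
    stages: Sequence[tuple[str, str]] = STAGES,
) -> dict[str, str]:
    """Run stages in order, carrying forward only the previous stage's output."""
    outputs: dict[str, str] = {}
    previous = ""
    for name, template in stages:
        prompt = template.format(previous=previous)
        previous = ask(prompt)  # A human can inspect this before the next stage runs.
        outputs[name] = previous
    return outputs
```

Because the implementation prompt sees only the interface skeleton, the raw Logic Map (and any Go source read to build it) does not travel into the final stage, which is the context-management point the document makes.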
