From b71c5c966db6cbb57fe1f68391d09e7f2d412235 Mon Sep 17 00:00:00 2001 From: shreyas-lyzr Date: Mon, 25 May 2026 13:36:41 -0400 Subject: [PATCH] docs: refresh OpenGAP paper for v0.4 (mcp_servers, financial_governance, identity) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Updates the paper to the current repo state: - Rebrand complete: opengap CLI / @open-gitagent/opengap v0.4.0; repo URL → open-gitagent/opengap; honest note on the unscoped-name npm block - Capabilities §: add mcp_servers as a first-class portable concern - Compliance §: financial_governance is now a shipped, schema-validated block (was hypothetical) with the real YAML and validate-time warnings - New §: cryptographic-identity RFC (provenance vs runtime delegation, Ed25519, passport_uri, optional) - Reference impl §: opengap verbs, v0.4.0, provenance, gitagent alias - Evaluation §: 15 adapters + 11 runners, three community-originated spec features, CI on Node 18/20/22, accurate npm publish state - Future work §: financial_governance Phase 1 shipped → Phase 2/3 remain; identity-layer implementation added --- paper/open-gap.md | 64 +++++++++++++----- paper/open-gap.tex | 157 +++++++++++++++++++++++++++++++++------------ 2 files changed, 163 insertions(+), 58 deletions(-) diff --git a/paper/open-gap.md b/paper/open-gap.md index 3ca3076..d3cf8e7 100644 --- a/paper/open-gap.md +++ b/paper/open-gap.md @@ -3,7 +3,7 @@ **Shreyas Kapale** OpenGAP Working Group -[shreyas@lyzr.ai](mailto:shreyas@lyzr.ai) · [gitagent.sh](https://gitagent.sh) · [github.com/open-gitagent/gitagent](https://github.com/open-gitagent/gitagent) +[shreyas@lyzr.ai](mailto:shreyas@lyzr.ai) · [gitagent.sh](https://gitagent.sh) · [github.com/open-gitagent/opengap](https://github.com/open-gitagent/opengap) *Preprint. 
April 2026.* @@ -15,7 +15,9 @@ AI agents are being deployed into regulated, high-stakes environments faster tha Under OpenGAP, an agent is fully described by files in a directory: identity (`SOUL.md`), hard constraints (`RULES.md`), segregation-of-duties (`DUTIES.md`), skills, tools, knowledge, memory, hooks, sub-agents, and regulatory-compliance artifacts. A single canonical definition deterministically exports to **15 execution environments** (Claude Code, OpenAI Agents SDK, CrewAI, Cursor, Gemini CLI, Codex, OpenCode, Kiro, Lyzr, OpenClaw, Nanobot, GitHub Copilot, GitHub Models, GitClaw, system-prompt) with a documented **fidelity profile** per target. **Fourteen lifecycle patterns** — which we show decompose into four meta-patterns: *structural guarantees*, *lifecycle operations*, *collaboration primitives*, and *runtime hooks* — emerge naturally from the git substrate. Compliance is a first-class spec element, mapped onto FINRA 3110/4511/2210, SEC Reg BI/Reg S-P/17a-4, Federal Reserve SR 11-7, and CFPB Circular 2022-03, and enforced through `pre_tool_use` hooks with `fail_open: false` semantics. We prove a simple **structural SOD theorem**: any segregation-of-duties conflict declared in `DUTIES.md` is unbypassable by the agent itself, provided branch-protection rules are active and the agent has no force-push rights. -OpenGAP is MIT-licensed. The reference implementation has **2,700+ GitHub stars**, 21 adapters, 11 runtime runners, and ships as `@open-gitagent/opengap` on npm. This paper consolidates the specification, the adapter model, the patterns, and the compliance framing into a single citable artifact, and sets an agenda for conformance testing, formal SOD verification, and an event schema for regulated agent-to-agent interaction. 
+Spec v0.4 adds three concrete extensions to this surface: portable `mcp_servers` declarations that export to each runtime's native MCP configuration, a `financial_governance` block with spending caps and approval thresholds for payment-capable agents, and an accepted RFC for an optional Ed25519 cryptographic-identity layer. + +OpenGAP is MIT-licensed. The reference implementation has **2,700+ GitHub stars**, 15 export adapters, 11 runtime runners, and ships provenance-signed as `@open-gitagent/opengap` (v0.4.0) on npm. This paper consolidates the specification, the adapter model, the patterns, and the compliance framing into a single citable artifact, and sets an agenda for conformance testing, formal SOD verification, and an event schema for regulated agent-to-agent interaction. **Keywords:** AI agents · protocols · version control · governance · segregation of duties · FINRA · SEC · SR 11-7 · model risk management · interoperability · open standards. @@ -235,6 +237,8 @@ Three files carry the agent's normative content. They are read in this order by **Tools** ($T$) are MCP-compatible schemas annotated with cost class (`none|low|medium|high`) and side-effect class. Implementations may be local (`tools/name.py`) or remote (URL). +**MCP servers** (`mcp_servers` in `agent.yaml`) declare external Model Context Protocol servers once and export them to each runtime's native config. Each entry is either stdio-based (`command` + `args` + `env`) or HTTP-based (`url` + `headers`), with `${VAR}` interpolation for secrets. One declaration renders to `.mcp.json` (Claude Code), the `mcpServers` block (Codex), `.cursor/mcp.json` (Cursor), or a human-readable section for markdown-shaped targets (CrewAI, GitHub Copilot). This extends "define once, run anywhere" from the agent's own behavior to its infrastructure integrations. + **Knowledge** ($K$) is read-only reference material: Markdown, CSV, PDF, or any readable format, indexed by `knowledge/index.yaml`. Embeddings and retrieval indices are derived artifacts materialized in `.gitagent/cache/`. 
**Memory** ($Me$) is append-only by convention. `MEMORY.md` (≤200 lines) is always loaded at session start; `runtime/` holds live state; `archive/` holds aged-out snapshots. @@ -471,7 +475,27 @@ pre_tool_use: The registered pre-tool hook fires before any matching tool call, receives the tool name and arguments on `stdin`, and exits `0` to allow or non-zero to block. `fail_open: false` upgrades the hook from advisory to enforcement: hook error, timeout, and non-zero exit all block the call. -Concretely, a financial agent's `check-spending.sh` reads `compliance.financial_governance.spending.max_per_transaction_cents`, parses the pending payment from tool input, and blocks the call if the cap is exceeded. The hook is part of the agent repository, under version control, reviewed in the same PR as the policy it enforces. This locality — *policy and enforcement in one diff* — is the property that runtime-only policy engines cannot provide. +As of spec v0.4 the `compliance.financial_governance` block is a shipped, schema-validated feature for payment-capable agents — spending limits in absolute cents, an approval threshold, allowed/blocked categories, and a named `firewall` identifier (never an endpoint URL, keeping the spec vendor-neutral): + +```yaml +compliance: + risk_tier: high + financial_governance: + enabled: true + firewall: valkurai # named identifier, not an endpoint + spending: + max_per_transaction_cents: 5000 # $50.00 hard cap + max_monthly_cents: 100000 # $1,000.00 cumulative + currency: AUD # ISO 4217 + allowed_categories: [software, compute, api_services] + blocked_categories: [gambling, crypto, unknown] + approval: + require_above_cents: 2000 # human approval above $20.00 + timeout_minutes: 60 + auto_deny_on_timeout: true +``` + +The block is declarative; enforcement is a `pre_tool_use` hook (`check-spending.sh`) that reads `financial_governance.spending.max_per_transaction_cents`, parses the pending payment from tool input, and blocks the call if the 
cap is exceeded. `opengap validate` warns when a `risk_tier: high` agent declares financial tools but no `financial_governance` block, and rejects a `firewall` value that looks like a URL. The hook lives in the agent repository under version control, reviewed in the same PR as the policy it enforces. This locality — *policy and enforcement in one diff* — is the property that runtime-only policy engines cannot provide. The cross-runtime payment **event schema** is deliberately deferred (§13) until two or more enforcement implementations exist. ### 7.4 Audit trail is `git log` @@ -501,6 +525,10 @@ Honesty about limitations is more valuable than an enumeration of features. Open 4. **Model capability drift.** `human_in_the_loop: always` does not verify that the human reviewed the output; it only requires that the runtime offer a review gate. 5. **Extrinsic compliance obligations.** GAP maps structural controls onto git workflows. Jurisdiction-specific duties (MiCA, AI Act specifics, HIPAA, GDPR Article 22) require local legal review. +### 7.6 Cryptographic identity (RFC, optional) + +The insider item above motivates an optional cryptographic-identity layer, accepted as an RFC (`spec/rfcs/identity.md`) in spec v0.4 and slated as an optional `identity` block on `agent.yaml`. It separates two concerns the literature conflates: **provenance** (this manifest at this commit was authored by the holder of key *X* — already solvable with signed tags or sigstore, no spec change needed) and **runtime delegation** (the running agent producing this output acts on behalf of parent agent *Y*, with scope *Z*, signed by *Y*'s key, not yet revoked — the genuine gap). The RFC binds a manifest to an Ed25519 public key, with an optional `passport_uri` pointing at a richer identity document for scoped delegation and revocation, and reserves `signatures.` for manifest signatures. 
Verification semantics live in the spec; enforcement (refuse-to-load, sandbox, log-and-continue) is left to the runtime. The layer is fully optional; manifests without it are unchanged. Once implemented, this closes the inter-organization delegation case and the regulated-runtime check: *prove the agent that produced this output was the agent in the manifest, with authority not yet revoked.* + --- ## 8. A Structural SOD Theorem @@ -545,7 +573,7 @@ Let $w \in W$ be a trace that contains tool calls attributable to both $r_i$ and ## 9. Reference Implementation -The reference implementation is **`opengap`** (GAP manager), a TypeScript CLI published to npm as both `@open-gitagent/opengap` (scoped, provenance-signed) and `opengap` (unscoped alias). The repository is [github.com/open-gitagent/gitagent](https://github.com/open-gitagent/gitagent); the license is MIT. +The reference implementation is **`opengap`**, a TypeScript CLI published to npm as `@open-gitagent/opengap` (scoped, provenance-signed via GitHub Actions OIDC) at v0.4.0. The `gitagent` command is installed as a backward-compatibility alias for the same binary. The repository is [github.com/open-gitagent/opengap](https://github.com/open-gitagent/opengap); the license is MIT. (The project was originally named `gitagent`, then briefly `gapman`; the unscoped root name `opengap` is unavailable on npm under its package-similarity policy, so the scoped name is canonical.) ### 9.1 Verb surface @@ -628,15 +656,16 @@ Evaluation has three axes: adoption, fidelity, and qualitative comparison. 
### 11.1 Adoption -As of April 2026, the reference repository has: +As of May 2026, the reference repository `open-gitagent/opengap` has: - **2,700+ GitHub stars** -- **21 adapters**, five of which (Cursor, Kiro, Codex, Gemini, OpenCode) were contributed by pull request from *external authors* unaffiliated with the maintainer -- **15+ closed issues** and active RFC discussions (financial governance, conformance testing, registry trust model) -- **Provenance-signed** releases on npm under both `@open-gitagent/opengap` and `opengap` +- **15 export adapters + 11 runtime runners**; five adapters (Cursor, Kiro, Codex, Gemini, OpenCode) were contributed by pull request from *external authors* unaffiliated with the maintainer +- **Three spec features shipped from community RFCs/PRs** in v0.4: portable `mcp_servers`, the `financial_governance` block, and the accepted cryptographic-identity RFC +- **Provenance-signed** releases on npm as `@open-gitagent/opengap` v0.4.0 (the unscoped root `opengap` is blocked by npm's package-similarity policy, so the scoped name is canonical) +- **CI on Node 18 / 20 / 22** building and validating the bundled example agents on every push - **Cross-community pollination**: GAP agents appear in registries that predate it (the Lyzr registry) and in new registries that postdate it (GitClaw, the OpenClaw ecosystem) -External adapter contributions are the strongest signal that the protocol is perceived as a **neutral substrate**, not a vendor product. No single contributor has standing to force a breaking change in favor of their framework. +External adapter and spec contributions are the strongest signal that the protocol is perceived as a **neutral substrate**, not a vendor product. No single contributor has standing to force a breaking change in favor of their framework. ### 11.2 Export fidelity (empirical) @@ -697,12 +726,13 @@ Git is imperfect. Its UX is unloved; its model of content-addressed history is s 1. 
**Conformance test suite.** A portable, language-agnostic test suite that an adapter implementer can run to certify a fidelity claim. The single highest-leverage next artifact. 2. **Formal SOD verification.** Express `DUTIES.md` conflict rules as logical assertions and verify that an agent's tool-call trace satisfies them. TLA⁺ or Alloy is a natural fit; a more ambitious direction is synthesis — generate an agent's executable policy from its declarative `DUTIES.md`. -3. **Financial-governance event schema.** Once two or more external enforcement implementations exist, standardize `payment_required` / `payment_approval` / `payment_receipt` events as an RFC addendum. Doing this *prematurely* risks standardizing imaginary interop; waiting for real implementations grounds it in observed need [16]. -4. **A2A server adapter.** GAP currently declares A2A compatibility in `agent.yaml` but does not ship a reference A2A server. Closing this loop enables GAP agents to serve as first-class A2A peers. -5. **Empirical user study.** A controlled comparison of team velocity, change-review latency, and compliance-prep time when defining, porting, and auditing agents under GAP versus framework-native equivalents. This is the evaluation a full systems-venue submission would require. -6. **Cross-jurisdictional regulatory mappings.** Extend the mapping table (§7.2) to EU AI Act, UK FCA guidance, MAS Singapore, ISO 42001, and sectoral controls (HIPAA, HITRUST). The structural claims are jurisdiction-neutral, but the mapping is not. -7. **Knowledge provenance.** A spec extension for signed, cryptographically-verifiable provenance on documents in `knowledge/`, addressing the prompt-injection threat in §7.5. -8. **Agent-to-agent supply chain.** Transitively pinned `extends:` chains with signed provenance — `this agent extends A@v1.2.3 which extends B@v4.0.1, verified signatures, verified fidelity claims`. +3. 
**Financial-governance Phase 2/3.** The declarative `financial_governance` block shipped in v0.4 (§7.3); Phase 2 is a reference `pre_tool_use` enforcement hook, and Phase 3 is a `payment_required` / `payment_approval` / `payment_receipt` event schema for cross-runtime interop. Phase 3 waits until two or more enforcement implementations exist; standardizing earlier risks codifying imaginary interop [16]. +4. **Identity-layer implementation.** The cryptographic-identity RFC (§7.6) is accepted; next is the `identity` block in `agent-yaml.schema.json`, a reference verifier, and a delegation-chain example. +5. **A2A server adapter.** GAP currently declares A2A compatibility in `agent.yaml` but does not ship a reference A2A server. Closing this loop enables GAP agents to serve as first-class A2A peers. +6. **Empirical user study.** A controlled comparison of team velocity, change-review latency, and compliance-prep time when defining, porting, and auditing agents under GAP versus framework-native equivalents. This is the evaluation a full systems-venue submission would require. +7. **Cross-jurisdictional regulatory mappings.** Extend the mapping table (§7.2) to EU AI Act, UK FCA guidance, MAS Singapore, ISO 42001, and sectoral controls (HIPAA, HITRUST). The structural claims are jurisdiction-neutral, but the mapping is not. +8. **Knowledge provenance.** A spec extension for signed, cryptographically-verifiable provenance on documents in `knowledge/`, addressing the prompt-injection threat in §7.5. +9. **Agent-to-agent supply chain.** Transitively pinned `extends:` chains with signed provenance: `this agent extends A@v1.2.3 which extends B@v4.0.1, verified signatures, verified fidelity claims`. --- @@ -739,8 +769,8 @@ The OpenGAP working group thanks contributors to the reference implementation an 13. **CFPB.** *Circular 2022-03: Adverse Action Notification Requirements in Connection with Credit Decisions Based on Complex Algorithms.* 2022. 14. **A.
Karpathy.** *LLM Wiki — a note on a pattern for LLM-managed knowledge bases.* GitHub Gist, 2025. 15. **P. Priyam.** *GitClose: A git-native reference implementation of the monthly financial close using OpenGAP.* 2026. -16. **OpenGAP Working Group.** *RFC #38 — `compliance.financial_governance` block.* 2026. -17. **OpenGAP Working Group.** *OpenGAP Specification v0.1.0.* 2026. +16. **OpenGAP Working Group.** *RFC #38 — `compliance.financial_governance` block.* 2026. +17. **OpenGAP Working Group.** *OpenGAP Specification v0.1.0.* 2026. 18. **OpenGAP Working Group.** *opengap on npm.* 2026. 19. **L. Lamport.** *Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers.* Addison-Wesley, 2002. 20. **D. Jackson.** *Software Abstractions: Logic, Language, and Analysis (Alloy).* MIT Press, 2012. diff --git a/paper/open-gap.tex b/paper/open-gap.tex index adef00e..a1f45fd 100644 --- a/paper/open-gap.tex +++ b/paper/open-gap.tex @@ -54,7 +54,7 @@ \title{\textbf{\Huge OpenGAP} \\[4pt] \large\textbf{GitAgentProtocol} \\[4pt] \normalsize\textit{A Git-Native Protocol for the AI Agent Lifecycle}} \author[1]{Shreyas Kapale} -\affil[1]{OpenGAP Working Group \\ \texttt{shreyas@lyzr.ai} \\ \href{https://gitagent.sh}{gitagent.sh} \\ \href{https://github.com/open-gitagent/gitagent}{github.com/open-gitagent/gitagent}} +\affil[1]{OpenGAP Working Group \\ \texttt{shreyas@lyzr.ai} \\ \href{https://gitagent.sh}{gitagent.sh} \\ \href{https://github.com/open-gitagent/opengap}{github.com/open-gitagent/opengap}} \date{\today} @@ -74,7 +74,7 @@ \texttt{knowledge/}, \texttt{memory/}, \texttt{hooks/}), and regulatory-compliance artifacts (\texttt{compliance/})\,---\,all as files under version control. 
-From this single source of truth, the reference CLI (\texttt{gapman}) deterministically +From this single source of truth, the reference CLI (\texttt{opengap}) deterministically exports the same agent into fifteen different execution environments, including Claude Code, OpenAI Agents SDK, CrewAI, Cursor, Gemini CLI, Codex, OpenCode, Kiro, Lyzr, OpenClaw, Nanobot, GitHub Copilot, GitHub Models, GitClaw, and a plain system-prompt @@ -86,11 +86,15 @@ SEC~Reg\,BI/Reg\,S-P/17a-4, Federal Reserve~SR\,11-7, and CFPB~Circular~2022-03. Enforcement is achieved through \texttt{pre\_tool\_use} hooks with \texttt{fail\_open: false} semantics, making policy a first-class, testable artifact rather than a runtime -wrapper. +wrapper. Spec v0.4 adds three concrete extensions to this surface: portable +\texttt{mcp\_servers} declarations that export to each runtime's native MCP +configuration, a \texttt{financial\_governance} block with spending caps and approval +thresholds for payment-capable agents, and an accepted RFC for an optional +Ed25519 cryptographic-identity layer. OpenGAP is released under the MIT license. The reference implementation has 2{,}700+ -GitHub stars, 21 adapters, 11 runtime runners, and ships as -\texttt{@open-gitagent/gapman} on npm. This paper consolidates the specification, the +GitHub stars, 15 adapters, 11 runtime runners, and ships provenance-signed as +\texttt{@open-gitagent/opengap} (v0.4.0) on npm. 
This paper consolidates the specification, the architectural patterns, the portability model, and the compliance framing into a single citable artifact, and sets out an agenda for conformance testing, formal segregation-of-duties verification, and event-schema standardization for regulated @@ -138,7 +142,7 @@ \section{Introduction} \item The \textbf{adapter model} (\S\ref{sec:portability}): a systematic way to translate the canonical definition into fifteen different execution environments with a measurable fidelity profile per target, validated by the reference implementation - \texttt{gapman}. + \texttt{opengap}. \item A catalogue of \textbf{fourteen architectural patterns} (\S\ref{sec:patterns}) that emerge when agents are defined git-natively, including human-in-the-loop via branch~+~PR, agent versioning via git tags, segregation of @@ -202,7 +206,7 @@ \section{Related Work}\label{sec:related} semantic\,---\,\texttt{git log} shows diffs but not ``v2.1.0 changed the supervisory policy.'' OpenGAP sits above frameworks: the canonical definition is the repository, and the framework-native implementation is a derived artifact produced by -\texttt{gapman export}. +\texttt{opengap export}. \paragraph{Raw YAML/JSON configuration.} Many teams reach for a hand-rolled \texttt{config.yaml}. This offers flexibility but no @@ -309,9 +313,9 @@ \subsection{Identity, rules, and duties} segregation of duties for regulated deployments. \end{itemize} -\subsection{Capabilities: skills, tools, knowledge, memory, hooks} +\subsection{Capabilities: skills, tools, MCP servers, knowledge, memory, hooks} -A GAP agent's capabilities are factored into five concerns. +A GAP agent's capabilities are factored into six concerns. 
\textbf{Skills} (\texttt{skills//SKILL.md}) are reusable capability modules: each is a Markdown file with YAML frontmatter describing inputs, outputs, and @@ -324,6 +328,16 @@ \subsection{Capabilities: skills, tools, knowledge, memory, hooks} (\texttt{none|low|medium|high}) and side-effect class. Implementations may live in \texttt{tools/.\{py,sh,js\}} or at an external endpoint. +\textbf{MCP servers} (\texttt{mcp\_servers} in \texttt{agent.yaml}) declare external +Model Context Protocol servers once and export them to each runtime's native +configuration. Each entry is either stdio-based (\texttt{command} + \texttt{args} + +\texttt{env}) or HTTP-based (\texttt{url} + \texttt{headers}), with \texttt{\$\{VAR\}} +interpolation for secrets. A single declaration renders to \texttt{.mcp.json} for +Claude Code, to the \texttt{mcpServers} block for Codex, to \texttt{.cursor/mcp.json} +for Cursor, and to a human-readable section for markdown-shaped targets. This extends +the ``define once, run anywhere'' property from the agent's own behavior to its +infrastructure integrations. + \textbf{Knowledge} (\texttt{knowledge/}) is reference material the agent may consult: Markdown, CSV, PDF, or any readable format, plus an \texttt{index.yaml} that hints retrieval keys. Knowledge is read-only from the agent's perspective. @@ -352,7 +366,7 @@ \subsection{Composition: extends, dependencies, sub-agents} \subsection{Formal manifest} Appendix~\ref{app:schema} gives an excerpt of \texttt{agent-yaml.schema.json}, the -JSON Schema that \texttt{gapman validate} enforces on every \texttt{agent.yaml}. The +JSON Schema that \texttt{opengap validate} enforces on every \texttt{agent.yaml}. The schema currently has ten files covering the agent manifest, hooks, hook I/O, tools, skills, knowledge, memory, skillflows, configuration, and the marketplace descriptor. 
@@ -362,12 +376,12 @@ \section{Portability: The Adapter Model}\label{sec:portability} \subsection{Export-time versus runtime execution} -A GAP agent can be consumed in two ways. In \emph{export} mode, \texttt{gapman export +A GAP agent can be consumed in two ways. In \emph{export} mode, \texttt{opengap export --format } deterministically renders the canonical definition into a directory or file in the target framework's native format (for example, \texttt{.cursor/rules/} files for Cursor, \texttt{CLAUDE.md}~+~\texttt{.claude/} for Claude Code, \texttt{AGENTS.md}~+~\texttt{opencode.json} for OpenCode). In \emph{runtime} mode, -\texttt{gapman run --adapter } performs the equivalent export into a temporary +\texttt{opengap run --adapter } performs the equivalent export into a temporary workspace and launches the target runtime over it, one-shot or interactively. This separation matters. Export produces a reviewable artifact; runtime produces a @@ -416,7 +430,7 @@ \subsection{Fidelity profile} blob. We define the \emph{fidelity profile} of an adapter as the set of GAP elements it faithfully represents in its target format. Table~\ref{tab:fidelity} summarizes the profile at a coarse grain; the full matrix (Appendix~\ref{app:fidelity}) -is regenerated by running \texttt{gapman export} against the reference agents +is regenerated by running \texttt{opengap export} against the reference agents \texttt{examples/standard/} and \texttt{examples/full/} and mechanically diffing the output for the presence of each element. @@ -481,7 +495,7 @@ \section{Lifecycle Patterns}\label{sec:patterns} policy assertion. \paragraph{P4: CI/CD for Agents.} -\texttt{gapman validate --compliance} runs in continuous integration on every push. +\texttt{opengap validate --compliance} runs in continuous integration on every push. A failing agent manifest blocks the merge. Agent quality is treated as code quality. 
\paragraph{P5: Branch-Based Deployment.} @@ -625,12 +639,42 @@ \subsection{Hook-based enforcement}\label{sec:hook-enforcement} or non-zero exit. This is the mechanism by which spending caps, PII redaction, and segregation-of-duties conflicts are structurally enforced. -Concretely, a financial agent might register a \texttt{pre\_tool\_use} hook that reads -\texttt{compliance.financial\_governance.spending.max\_per\_transaction\_cents}, parses -the pending payment amount from tool input, and blocks the call if the cap is -exceeded. The hook is part of the agent repository, under version control, and -reviewable in the same PR as the policy it enforces\,---\,a property that runtime-only -policy engines do not provide. +\subsection{Financial governance}\label{sec:fingov} + +As of spec v0.4, the \texttt{compliance.financial\_governance} block is a shipped, +schema-validated feature for payment-capable agents. It declares spending limits in +absolute minor units (cents), an approval threshold, allowed and blocked categories, +and a named \emph{firewall} identifier\,---\,never an endpoint URL, keeping the spec +vendor-neutral: + +\begin{lstlisting}[language=yamlex] +compliance: + risk_tier: high + financial_governance: + enabled: true + firewall: valkurai # named identifier, not an endpoint + spending: + max_per_transaction_cents: 5000 # $50.00 hard cap + max_monthly_cents: 100000 # $1,000.00 cumulative + currency: AUD # ISO 4217 + allowed_categories: [software, compute, api_services] + blocked_categories: [gambling, crypto, unknown] + approval: + require_above_cents: 2000 # human approval above $20.00 + timeout_minutes: 60 + auto_deny_on_timeout: true +\end{lstlisting} + +The block is declarative; enforcement is a \texttt{pre\_tool\_use} hook that reads +\texttt{financial\_governance.spending.max\_per\_transaction\_cents}, parses the pending +payment from tool input, and blocks the call if the cap is exceeded. 
\texttt{opengap +validate} warns when a \texttt{risk\_tier: high} agent declares financial tools but no +\texttt{financial\_governance} block, and rejects a \texttt{firewall} value that looks +like a URL. The hook lives in the agent repository under version control, reviewable in +the same pull request as the policy it enforces\,---\,a locality that runtime-only +policy engines cannot provide. The event schema for cross-runtime payment interop is +deliberately deferred (\S\ref{sec:future}) until two or more enforcement implementations +exist; standardizing it earlier would standardize imaginary interop. \subsection{Audit trail is \texttt{git log}} @@ -666,16 +710,37 @@ \subsection{Threat model: what OpenGAP does not prevent}\label{sec:threat} local legal review; GAP provides the substrate, not the opinion. \end{enumerate} +\subsection{Cryptographic identity (RFC, optional)}\label{sec:identity} + +The threat-model item on insiders motivates an optional cryptographic identity layer, +accepted as an RFC (\texttt{spec/rfcs/identity.md}) in spec v0.4 and slated as an +optional \texttt{identity} block on \texttt{agent.yaml}. It distinguishes two concerns +the literature often conflates: \emph{provenance} (this manifest at this commit was +authored by the holder of key $X$\,---\,already solvable with signed tags or sigstore, +no spec change needed) and \emph{runtime delegation} (the running agent producing this +output acts on behalf of parent agent $Y$, with scope $Z$, signed by $Y$'s key, not yet +revoked\,---\,the genuine gap). The RFC binds a manifest to an Ed25519 public key, with +an optional \texttt{passport\_uri} pointing at a richer identity document for scoped +delegation and revocation, and reserves \texttt{signatures.} for manifest +signatures. Verification semantics live in the spec; enforcement (refuse to load, +sandbox, log-and-continue) is left to the runtime. The layer is fully optional; +manifests without it are unchanged. 
This closes the loop on inter-organization +delegation and the regulated-runtime check ``prove the agent that produced this output +was the agent in the manifest, with authority that had not been revoked.'' + % =============================================================== \section{Reference Implementation}\label{sec:impl} % =============================================================== -The reference implementation is \texttt{gapman} (``GAP manager''), a TypeScript CLI -published to npm as both \texttt{@open-gitagent/gapman} (scoped, provenance-signed) -and \texttt{gapman} (unscoped alias). The repository is -\texttt{github.com/open-gitagent/gitagent}; the license is MIT. +The reference implementation is \texttt{opengap}, a TypeScript CLI published to npm as +\texttt{@open-gitagent/opengap} (scoped, provenance-signed via GitHub Actions OIDC) at +v0.4.0. The \texttt{gitagent} command is installed as a backward-compatibility alias +for the same binary. The repository is \texttt{github.com/open-gitagent/opengap}; the +license is MIT. (The project was originally named \texttt{gitagent}, briefly +\texttt{gapman}; the unscoped root name \texttt{opengap} is unavailable on npm under its +package-similarity policy, so the scoped name is canonical.) 
-\texttt{gapman} exposes the following verbs: \texttt{init} (scaffold from templates +\texttt{opengap} exposes the following verbs: \texttt{init} (scaffold from templates \texttt{minimal}, \texttt{standard}, \texttt{full}, \texttt{llm-wiki}), \texttt{validate} (JSON-schema validation and optional compliance audit), \texttt{info} (summarize an agent), \texttt{export} (render to a target framework), @@ -739,21 +804,26 @@ \section{Evaluation}\label{sec:eval} \subsection{Adoption signals} -As of \today, the reference repository \texttt{open-gitagent/gitagent} has -\textbf{2{,}700+} GitHub stars, \textbf{21 community adapters} (Cursor, Kiro, Codex, -Gemini, OpenCode contributed via pull request by external authors), \textbf{15+ closed -issues} and active RFC discussions, and has been published on npm under both the -scoped and unscoped names. Adapter contributions from external developers are the -strongest signal that the protocol is perceived as a neutral substrate rather than a -vendor product\,---\,no contributor has standing to force a breaking change in favor -of their framework. These numbers are to be regenerated from the GitHub API at the -time of arXiv submission; we report them here as a floor. +As of \today, the reference repository \texttt{open-gitagent/opengap} has +\textbf{2{,}700+} GitHub stars and \textbf{15 export adapters} plus \textbf{11 runtime +runners}. Five adapters (Cursor, Kiro, Codex, Gemini, OpenCode) were contributed via +pull request by external authors, and three spec features now in \texttt{main}\,---\, +portable \texttt{mcp\_servers} (\S\ref{sec:protocol}), \texttt{financial\_governance} +(\S\ref{sec:fingov}), and the cryptographic-identity RFC (\S\ref{sec:identity})\,---\, +originated as community RFCs and pull requests. The package ships provenance-signed as +\texttt{@open-gitagent/opengap} v0.4.0; continuous integration builds on Node 18, 20, +and 22 and validates the bundled example agents on every push. 
Adapter and spec +contributions from external developers are the strongest signal that the protocol is +perceived as a neutral substrate rather than a vendor product\,---\,no contributor has +standing to force a breaking change in favor of their framework. These numbers are to be +regenerated from the GitHub API at the time of arXiv submission; we report them here as +a floor. \subsection{Export fidelity} The fidelity profile (Table~\ref{tab:fidelity}, full matrix in Appendix~\ref{app:fidelity}) is derived by running -\texttt{gapman export --format } over the reference agents +\texttt{opengap export --format } over the reference agents \texttt{examples/standard/} and \texttt{examples/full/} and diffing the output for the presence of each GAP element. Two headline observations: @@ -831,7 +901,7 @@ \section{Discussion}\label{sec:discuss} \paragraph{Why a protocol and not a framework.} A framework solves problems for the framework's users. A protocol creates a substrate that multiple competing products can build on. OpenGAP chooses the protocol -path\,---\,the reference implementation (\texttt{gapman}) is one of potentially many, +path\,---\,the reference implementation (\texttt{opengap}) is one of potentially many, and the spec has deliberate latitude so that vendors can add value within target adapters without needing to modify the canonical definition. @@ -846,11 +916,16 @@ \section{Future Work}\label{sec:future} \item \textbf{Formal SOD semantics.} Express \texttt{DUTIES.md} conflict rules as a set of logical assertions, and verify that an agent's tool-call trace satisfies them. A model checker such as TLA$^+$ or Alloy is a natural fit. - \item \textbf{Event schema for financial governance.} Once two or more external - enforcement implementations exist, standardize a \texttt{payment\_required}~/ - \texttt{payment\_approval}~/ \texttt{payment\_receipt} event schema as an RFC - addendum. 
Doing this prematurely risks standardizing imaginary interop; waiting - for real implementations grounds it in observed need~\cite{rfc_38}. + \item \textbf{Financial-governance Phase 2/3.} The declarative + \texttt{financial\_governance} block shipped in v0.4 (\S\ref{sec:fingov}); Phase 2 is + a reference \texttt{pre\_tool\_use} enforcement hook, and Phase 3 is a + \texttt{payment\_required}~/ \texttt{payment\_approval}~/ \texttt{payment\_receipt} + event schema for cross-runtime interop. Phase 3 waits until two or more enforcement + implementations exist; standardizing earlier risks codifying imaginary + interop~\cite{rfc_38}. + \item \textbf{Identity-layer implementation.} The cryptographic-identity RFC + (\S\ref{sec:identity}) is accepted; the next step is the \texttt{identity} block in + \texttt{agent-yaml.schema.json}, a reference verifier, and a delegation-chain example. \item \textbf{Deeper A2A integration.} GAP currently declares A2A compatibility in \texttt{agent.yaml} but does not ship an A2A server adapter. A reference A2A adapter would close the loop between agent-to-agent interfaces and GAP definitions.