Concepts
Vocabulary used throughout the rest of the wiki. Every term has a stable meaning in the JSON report and CLI output.
A manifest is a shipgate.yaml file, the single source of truth for one agent release. Every scan reads exactly one manifest and produces exactly one report. The schema is versioned (version: "0.1") and validated strictly: typos fail the scan with a suggested fix. The full grammar is in the Manifest Reference.
A manifest declares:
- project / agent identity: names used in run IDs, fingerprints, finding evidence
- declared purpose: short prose used to detect scope contradictions (e.g. read-only purpose + DELETE tool → finding)
- environment.target: local | staging | production_like | production. Production targets fire stricter inventory checks
- tool_sources: pointers to the supported tool surfaces: MCP exports, OpenAPI specs, OpenAI Agents SDK, Anthropic Messages API, Google ADK, LangChain/LangGraph, CrewAI, OpenAI API, Codex plugin packages, and n8n workflows
- permissions / policies / risk_overrides: declared expectations against which checks fire
- ci: advisory/strict mode and which severities fail
- checks.ignore: explicit suppressions with required reasons
The tool inventory is the complete set of tools an agent can call, as resolved by a scan. It is built by:
- Loading every tool_sources[] entry and any openai_api artifacts
- Flattening into a single list keyed by tool name
- Resolving duplicates by source priority (openai_api > openapi > mcp > sdk_function); losers emit a "Duplicate tool name" warning
- Enriching each tool with risk hints
The flattened list is what the check catalog operates on. It's also surfaced verbatim in report.json under tool_inventory.
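The flatten-and-resolve steps above can be sketched as follows. This is a minimal illustration, not the project's implementation: the Tool class, the flatten_inventory function, and the exact warning wording are hypothetical, and only the priority order comes from the text.

```python
from dataclasses import dataclass

# Source priority as stated above: earlier entries win on a name clash.
SOURCE_PRIORITY = ["openai_api", "openapi", "mcp", "sdk_function"]


@dataclass(frozen=True)
class Tool:
    """Hypothetical stand-in for a loaded tool record."""
    name: str
    source: str  # one of SOURCE_PRIORITY


def flatten_inventory(tools: list[Tool]) -> tuple[dict[str, Tool], list[str]]:
    """Return (inventory keyed by tool name, warnings for losing duplicates)."""
    inventory: dict[str, Tool] = {}
    warnings: list[str] = []
    for tool in tools:
        current = inventory.get(tool.name)
        if current is None:
            inventory[tool.name] = tool
            continue
        # A lower index in SOURCE_PRIORITY means a higher priority.
        if SOURCE_PRIORITY.index(tool.source) < SOURCE_PRIORITY.index(current.source):
            winner, loser = tool, current
        else:
            winner, loser = current, tool
        inventory[tool.name] = winner
        # Assumed message format; the real warning text may differ.
        warnings.append(f"Duplicate tool name: {loser.name!r} from {loser.source} ignored")
    return inventory, warnings
```

In this sketch, a tool named refund declared by both an MCP export and an OpenAPI spec resolves to the OpenAPI entry, and the MCP entry produces one warning.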
A risk hint is a (tag, source, confidence, evidence) record attached to a tool. Tags include read_only, write, destructive, external_write, financial_action, customer_communication, sensitive_data_access, infrastructure_change, code_execution. Sources include openapi_method, mcp_annotation, sdk_keyword, manual (from risk_overrides). Confidence is low | medium | high.
Hints are inputs to checks, not findings themselves. Most checks demand min_confidence="medium" to fire. You can promote, demote, or remove hints per tool via risk_overrides.tools.
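To make the confidence threshold concrete, here is a small sketch of how a check might ask whether a tool carries a given tag at or above a minimum confidence. The RiskHint class and the has_hint helper are hypothetical names for illustration; only the four record fields and the low | medium | high scale come from the text.

```python
from dataclasses import dataclass

# Ordering of the confidence scale described above.
CONFIDENCE_ORDER = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class RiskHint:
    """Hypothetical model of the (tag, source, confidence, evidence) record."""
    tag: str
    source: str
    confidence: str
    evidence: str


def has_hint(hints: list[RiskHint], tag: str, min_confidence: str = "medium") -> bool:
    """True if any hint carries `tag` at or above `min_confidence`."""
    threshold = CONFIDENCE_ORDER[min_confidence]
    return any(
        h.tag == tag and CONFIDENCE_ORDER[h.confidence] >= threshold
        for h in hints
    )
```

Under this sketch, a low-confidence destructive hint from a keyword match would not trigger a check at the default threshold, while a manual override at high confidence would.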
Internally hints are produced by core/risk_hints.py:_add_automatic_hints plus your manual overrides. The keyword classifier is fully tokenized — "deploy" matches the standalone token deploy but not the substring inside deployments.
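The difference between token matching and substring matching can be shown with a short sketch. The tokenization rules used here (splitting on camelCase boundaries and non-alphanumeric characters) are an assumption for illustration and may not match what core/risk_hints.py does; tokenize and matches_keyword are hypothetical names.

```python
import re


def tokenize(name: str) -> list[str]:
    """Split a tool name into lowercase tokens (assumed rules, see lead-in)."""
    # Insert a space at camelCase boundaries: deployService -> deploy Service.
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    # Split on anything that is not a letter or digit, dropping empty pieces.
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t]


def matches_keyword(name: str, keyword: str) -> bool:
    """Match only when `keyword` is a whole token, never a substring."""
    return keyword in tokenize(name)
```

With these rules, deploy_service and deployService both match the keyword deploy, while list_deployments does not.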
A check is a pure function (ScanContext) -> list[Finding]. The 80+ built-in checks are listed in the Check Catalog and live under src/agents_shipgate/checks/. Each has:
- a stable check ID (e.g. SHIP-POLICY-APPROVAL-MISSING)
- a default severity (critical | high | medium | low | info)
- a category: one of ~19, including inventory, schema, auth, scope, policy, side_effects, evidence, security, manifest, baseline, documentation, action_surface, verify, plus per-source families (api, adk, langchain, crewai, codex_plugin, n8n)
You can override the severity via checks.severity_overrides and add custom checks via Plugin Authoring.
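The shape of a check, a pure function from scan context to findings, might look like the sketch below. Everything here is hypothetical: the dataclass fields, the EXAMPLE-DESTRUCTIVE-NO-APPROVAL ID, and the rule itself are invented for illustration and do not describe a built-in check or the real ScanContext type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """Hypothetical tool record with risk tags already attached."""
    name: str
    tags: frozenset = frozenset()
    requires_approval: bool = False


@dataclass(frozen=True)
class ScanContext:
    """Hypothetical, heavily simplified scan context."""
    tools: tuple


@dataclass(frozen=True)
class Finding:
    check_id: str
    severity: str
    category: str
    tool_name: str
    title: str


def check_destructive_without_approval(ctx: ScanContext) -> list[Finding]:
    """Illustrative rule: flag destructive tools that declare no approval."""
    return [
        Finding(
            check_id="EXAMPLE-DESTRUCTIVE-NO-APPROVAL",  # made-up ID
            severity="high",
            category="policy",
            tool_name=tool.name,
            title=f"{tool.name} is destructive but requires no approval",
        )
        for tool in ctx.tools
        if "destructive" in tool.tags and not tool.requires_approval
    ]
```

Because the function reads only its argument and returns a new list, running it twice on the same context yields the same findings, which is what makes stable fingerprints possible.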
A finding is a single scan output. Every finding has:
- id: the fingerprint plus a content-derived discriminator on collision
- fingerprint: a stable sha256(check_id | tool_name | canonical evidence)[:16], prefixed fp_
- check_id, severity, category, title
- tool_name / tool_id (or agent_id for agent-level findings)
- evidence: structured payload describing why the check fired
- recommendation: actionable next step
- suppressed, suppression_reason
- baseline_status: new, matched, or resolved when a baseline is applied
Fingerprints are content-addressed and stable across runs. They are the identity primitive used by suppressions and baselines.
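A content-addressed fingerprint of this shape can be sketched with the standard library. The details are assumptions: the literal "|" separator, JSON with sorted keys as the canonical evidence form, and truncation to 16 hex characters are one plausible reading of the formula above, not the confirmed v1 algorithm.

```python
import hashlib
import json


def fingerprint(check_id: str, tool_name: str, evidence: dict) -> str:
    """Sketch of sha256(check_id | tool_name | canonical evidence)[:16]."""
    # Assumed canonical form: compact JSON with sorted keys, so key order
    # in the evidence payload does not change the result.
    canonical = json.dumps(evidence, sort_keys=True, separators=(",", ":"))
    payload = f"{check_id}|{tool_name}|{canonical}".encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    return "fp_" + digest[:16]
```

The property that matters is that the same check, tool, and evidence always hash to the same value, regardless of when or where the scan runs.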
The release decision is the release gate. report.json.release_decision.decision is one of blocked,
review_required, insufficient_evidence, or passed, derived from the active
findings, declared policies, and any baseline. It is baseline-aware — a
baseline-matched critical lands in review_items (accepted debt), not
blockers. Read this field for gating, not the legacy summary.status (kept
baseline-blind for v0.7 callers). Treat unknown future enum values as
review_required.
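A caller that gates on this field could read it as in the sketch below. The field path and the four enum values come from the text; the function names are hypothetical, and treating a missing field the same as an unknown value is an assumption made here for safety.

```python
import json
from pathlib import Path

# The four decision values documented above.
KNOWN_DECISIONS = {"blocked", "review_required", "insufficient_evidence", "passed"}


def gate_decision(report: dict) -> str:
    """Return the release decision, mapping unknown values to review_required."""
    decision = report.get("release_decision", {}).get("decision")
    # Unknown future enum values (and, by assumption, a missing field)
    # fall back to the conservative review_required.
    return decision if decision in KNOWN_DECISIONS else "review_required"


def load_decision(path: str) -> str:
    """Read a report.json file from disk and return its gate decision."""
    report = json.loads(Path(path).read_text(encoding="utf-8"))
    return gate_decision(report)
```

A CI script would then allow a release only when the result is passed, and route every other value to a human or a failure path.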
The diff-derived delta in what an agent can do — tools, scopes, schemas, or policies added, modified, or removed between a base and a head ref. The verifier computes it so a reviewer sees exactly how a PR changed the agent's reach, not just the absolute surface.
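As a rough illustration of an added/modified/removed delta, the sketch below compares two tool surfaces keyed by tool name. The capability_delta function and the dictionary shapes are hypothetical; the real verifier also covers scopes, schemas, and policies and works from git refs, which this example does not attempt.

```python
def capability_delta(base: dict, head: dict) -> dict:
    """Compare two name -> definition mappings and report what changed."""
    added = sorted(head.keys() - base.keys())
    removed = sorted(base.keys() - head.keys())
    # A tool present on both sides counts as modified if its definition differs.
    modified = sorted(name for name in base.keys() & head.keys() if base[name] != head[name])
    return {"added": added, "modified": modified, "removed": removed}
```

Given a base with get_user and list_users and a head that changes get_user, drops list_users, and adds delete_user, the delta names each of the three under its own key.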
agents-shipgate verify --base <ref> --head <ref> runs the gate on a PR diff and
returns a merge verdict — a deterministic projection of
release_decision.decision for the ongoing-PR flow. It writes
agents-shipgate-reports/verifier.json with the trigger and base-scan
orchestration status; that file is not a second verdict — the gate remains
report.json.release_decision.decision. Trust-root edits (weakening policies,
baselines, waivers, CI, or agent instructions) surface as SHIP-VERIFY-*
findings routed to human review, so a change can't quietly disable its own gate.
A suppression is a manifest entry under checks.ignore that marks matching findings with suppressed: true and a required reason. Suppressed findings still appear in the JSON report (audit trail) but do not count toward severity totals or trigger CI failure. Stale suppressions (referencing missing checks or tools) emit SHIP-MANIFEST-STALE-SUPPRESSION.
A baseline is a snapshot of currently-active findings stored at .agents-shipgate/baseline.json. After a baseline is saved, future scans tag each finding as matched (already in baseline), new, or resolved. Strict CI with --baseline fails only on new findings. See Baseline Workflow for the full pattern.
The fingerprint algorithm is v1 and intentionally excludes severity overrides, baseline status, source paths, timestamps, and the default_severity audit-evidence key — so a baseline survives manifest tweaks that don't change actual finding identity.
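The matched / new / resolved tagging reduces to set comparisons on fingerprints. The sketch below assumes both sides are available as plain sets of fingerprint strings; apply_baseline is a hypothetical name and the on-disk layout of baseline.json is not modeled.

```python
def apply_baseline(current: set, baseline: set) -> dict:
    """Map each fingerprint to its baseline status."""
    status = {}
    for fp in current:
        # Present now and in the baseline: accepted debt. Otherwise: new.
        status[fp] = "matched" if fp in baseline else "new"
    for fp in baseline - current:
        # In the baseline but no longer produced by the scan.
        status[fp] = "resolved"
    return status
```

Because fingerprints exclude severity overrides, source paths, and timestamps, edits to those do not move a finding from matched to new in a comparison like this one.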
The CI mode is advisory (default) or strict. Strict mode exits with code 20 on any unsuppressed finding whose severity is in ci.fail_on (default [critical]; configurable). Advisory mode never fails. See CI Recipes for usage patterns.
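The strict-mode rule can be expressed as a short function. This is a simplified sketch: findings are modeled as plain dictionaries, gate_exit_code is a hypothetical name, and baseline handling (where only new findings fail) is deliberately left out.

```python
def gate_exit_code(findings: list, mode: str = "advisory", fail_on: tuple = ("critical",)) -> int:
    """Return 20 when strict mode should fail the gate, otherwise 0."""
    if mode != "strict":
        # Advisory mode reports findings but never fails on them.
        return 0
    for finding in findings:
        # Suppressed findings stay in the report but do not trigger failure.
        if not finding.get("suppressed", False) and finding["severity"] in fail_on:
            return 20
    return 0
```

Widening fail_on to include high is how a team would tighten the gate without touching individual checks.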
Shipgate runs as a static analyzer. By default it does not import user code, run agents, call tools, invoke LLMs, connect to MCP servers, make network calls, or collect telemetry. The only exception to this guarantee is third-party check plugins, which are opt-in via AGENTS_SHIPGATE_ENABLE_PLUGINS=1 and can be disabled per scan with --no-plugins. See Trust Model for details.
| Code | Meaning |
|---|---|
| 0 | Scan completed; advisory mode or strict-mode pass |
| 2 | Manifest config error (typo, missing field) |
| 3 | Input parse error (malformed YAML/JSON, path traversal blocked, file too large) |
| 4 | Other Agents Shipgate error |
| 6 | Baseline integrity failure |
| 20 | Strict-mode gate failure (findings exist at ci.fail_on severity) |
A nonzero exit is always either a real finding (20) or a real error (2/3/4/6). Check the stderr message to disambiguate. See Troubleshooting.
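A wrapper script might encode the table above as an enum so that pipeline logic does not depend on bare numbers. The numeric values come from the table; the member names, the classify_exit function, and the unknown fallback are hypothetical choices for this sketch.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Exit codes from the table above, with illustrative member names."""
    OK = 0
    MANIFEST_CONFIG_ERROR = 2
    INPUT_PARSE_ERROR = 3
    OTHER_ERROR = 4
    BASELINE_INTEGRITY_FAILURE = 6
    GATE_FAILURE = 20


def classify_exit(code: int) -> str:
    """Bucket a process exit code into ok, gate_failure, error, or unknown."""
    if code == ExitCode.OK:
        return "ok"
    if code == ExitCode.GATE_FAILURE:
        return "gate_failure"
    if code in (
        ExitCode.MANIFEST_CONFIG_ERROR,
        ExitCode.INPUT_PARSE_ERROR,
        ExitCode.OTHER_ERROR,
        ExitCode.BASELINE_INTEGRITY_FAILURE,
    ):
        return "error"
    # Codes not listed in the table are left for the caller to handle.
    return "unknown"
```

Separating gate_failure from error lets a pipeline block a release on findings while alerting maintainers when the scan itself is misconfigured.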
Agents Shipgate · Apache-2.0 · maintained by Three Moons Lab · Report a false positive