8 changes: 7 additions & 1 deletion AGENTS.md
@@ -215,7 +215,8 @@ Every compiled pipeline runs as three sequential jobs:
├── prompts/ # AI agent prompt files for workflow authoring tasks
│ ├── create-ado-agentic-workflow.md # Step-by-step guide for creating a new agentic pipeline
│ ├── update-ado-agentic-workflow.md # Guide for modifying an existing agentic pipeline
│ └── debug-ado-agentic-workflow.md # Guide for troubleshooting a failing agentic pipeline
│ ├── debug-ado-agentic-workflow.md # Guide for troubleshooting a failing agentic pipeline
│ └── audit-ado-agentic-workflow.md # Guide for auditing pipeline runs (cost, efficiency, security)
├── scripts/ # Supporting scripts shipped as release artifacts
│ └── ado-script/ # TypeScript workspace for bundled gate.js, import.js, exec-context-pr.js, exec-context-pr-synth.js
│ └── src/
@@ -254,6 +255,9 @@ index to jump to the right page.
modifying an existing agentic pipeline (read-then-update workflow with validation).
- [`prompts/debug-ado-agentic-workflow.md`](prompts/debug-ado-agentic-workflow.md) — guide for
troubleshooting a failing agentic pipeline and filing a diagnostic report.
- [`prompts/audit-ado-agentic-workflow.md`](prompts/audit-ado-agentic-workflow.md) — guide for
auditing pipeline runs: cost/token analysis, hoist-candidate detection, reliability patterns,
safe-output quality, and security posture review.

### Authoring agent files

@@ -280,6 +284,8 @@ index to jump to the right page.
fragment with pre-filled ADO MCP identifiers, auto-extension of the
agent's bash allow-list with read-only git commands; configured via
the `execution-context:` front-matter block.
- [`docs/self-optimization.md`](docs/self-optimization.md) — self-optimization:
opt-in runtime step-proposal feature (staged preview, live PR, security model).
- [`docs/safe-outputs.md`](docs/safe-outputs.md) — full reference for every
safe-output tool agents can use to propose actions (PRs, work items, wiki
pages, comments, etc.) plus their per-agent configuration.
14 changes: 14 additions & 0 deletions docs/extending.md
@@ -232,6 +232,20 @@ Typical steps:

> **Type path/identifier `Params` fields with validated newtypes.** If your tool's input holds a file path, git ref, commit SHA, artifact name, or similar identifier, use a newtype from [`src/secure.rs`](../src/secure.rs) (`RelativeSafePath`, `StrictRelativePath`, `PathSegment`, `GitRefName`, `BranchName`, `CommitSha`, `ArtifactName`, `Identifier`, `HostName`, `Version`) instead of a raw `String`. These wrap the canonical primitives in [`src/validate.rs`](../src/validate.rs) and run them at deserialization time, so the path-traversal / injection / format checks are applied automatically and cannot be silently omitted. Reserve the manual `validate()` method for cross-field and semantic rules (e.g. positive IDs, length minimums).
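
The newtype pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `src/secure.rs`: the type name `CommitShaSketch`, the error type `InvalidSha`, and the rule "exactly 40 lowercase hex characters" are all hypothetical. In a serde-based codebase, a conversion like this could be wired into deserialization with serde's `#[serde(try_from = "String")]` container attribute; whether the project does it that way is not stated in the source.

```rust
use std::convert::TryFrom;
use std::fmt;

/// Hypothetical stand-in for a validated identifier newtype.
/// It is NOT the real `CommitSha` from `src/secure.rs`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CommitShaSketch(String);

/// Hypothetical error carrying the rejected input.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidSha(pub String);

impl fmt::Display for InvalidSha {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "not a valid commit SHA: {:?}", self.0)
    }
}

impl TryFrom<String> for CommitShaSketch {
    type Error = InvalidSha;

    /// The only way to construct the newtype, so the check cannot be skipped.
    fn try_from(raw: String) -> Result<Self, Self::Error> {
        // Assumed rule for this sketch: exactly 40 lowercase hex characters.
        let is_hex = raw
            .bytes()
            .all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
        if raw.len() == 40 && is_hex {
            Ok(Self(raw))
        } else {
            Err(InvalidSha(raw))
        }
    }
}

impl CommitShaSketch {
    /// Read-only access to the validated value.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

Because the inner `String` is private and `try_from` is the only constructor, any `Params` field typed this way has already passed the format check by the time a tool's own `validate()` runs.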

### Validating untrusted step blocks

Use `compile::ir::validate_step_block` as the shared structural validator
whenever a safe-output tool or other component accepts a YAML step block from an
untrusted source. It returns all validation errors at once rather than
short-circuiting on the first failure; collect those errors into the safe-output's
structured rejection so that both the agent and the audit pipeline see every problem.

For agent-proposed blocks, always pass `StepKindAllow::Curated`. Only pass
`StepKindAllow::Full` when the caller is human-supervised, such as the
`validate_steps` author MCP tool. References: `src/compile/ir/step_validation.rs`
and the typed-factory task allow-list in
`src/compile/ir/tasks.rs::CURATED_TASK_IDS`.
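
The collect-all-errors behaviour and the Curated/Full split can be illustrated with a simplified sketch. Everything here is hypothetical: `AllowSketch`, `StepSketch`, `StepError`, `CURATED_SKETCH`, and `validate_block_sketch` are illustrative names, the allow-list entry is a placeholder, and the real signature of `validate_step_block` is not shown in this document.

```rust
/// Hypothetical mirror of the two allow-list modes named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AllowSketch {
    Curated,
    Full,
}

/// Heavily simplified step model; the real IR covers more step kinds.
#[derive(Debug, Clone)]
pub enum StepSketch {
    Bash(String),
    Task(String),
}

/// One validation failure, addressed by step index and path.
#[derive(Debug, PartialEq, Eq)]
pub struct StepError {
    pub step_index: usize,
    pub path: String,
    pub message: String,
}

/// Placeholder allow-list. The real set lives in `CURATED_TASK_IDS`;
/// its contents are not shown in this document.
const CURATED_SKETCH: &[&str] = &["ExampleTask@1"];

/// Visits every step and accumulates errors instead of returning on the first one.
pub fn validate_block_sketch(steps: &[StepSketch], allow: AllowSketch) -> Vec<StepError> {
    let mut errors = Vec::new();
    for (i, step) in steps.iter().enumerate() {
        match step {
            StepSketch::Bash(body) if body.trim().is_empty() => errors.push(StepError {
                step_index: i,
                path: format!("steps[{i}].bash"),
                message: "bash: script body must not be empty".to_string(),
            }),
            StepSketch::Task(id)
                if allow == AllowSketch::Curated && !CURATED_SKETCH.contains(&id.as_str()) =>
            {
                errors.push(StepError {
                    step_index: i,
                    path: format!("steps[{i}].task"),
                    message: format!("task {id:?} is not in the curated allow-list"),
                })
            }
            _ => {}
        }
    }
    errors
}
```

Returning a `Vec` rather than a `Result` that stops at the first failure is what lets a caller attach the complete list to a structured rejection in one pass.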

Safe-output tools are not `CompilerExtension`s. If a safe output also needs compile-time MCP configuration, add that through the always-on `SafeOutputsExtension` declarations.

## Adding a runtime
32 changes: 31 additions & 1 deletion docs/front-matter.md
@@ -148,6 +148,15 @@ execution-context: # optional execution-context plugin (see docs/exe
enabled: true # defaults to true when on.pr is configured. Set false to opt out
# (also suppresses auto-adding the read-only git commands to the
# agent's bash allow-list).
self-optimization: # optional, opt-in self-optimization preview (see "Self-optimization")
enabled: true # default: false. Set true to let the agent propose step optimizations.
staged: true # default: true. Preview proposals in the build summary only.
# Set false only after reviewing preview quality and choosing to
# allow source-file mutation proposals to land.
max-proposals-per-run: 3 # default: 3. Sanitized/clamped to a maximum of 50.
allowed-sections: # default: [steps, post-steps] (same job/security context as agent).
- steps # allowed values: steps, post-steps, setup, teardown.
- post-steps # setup/teardown are separate jobs and require explicit opt-in.
steps: # inline steps before agent runs (same job, generate context)
- bash: echo "Preparing context for agent"
displayName: "Prepare context"
@@ -185,6 +194,28 @@ parameters: # optional ADO runtime parameters (surfaced in UI
Build the project and run all tests...
```

## Self-optimization (opt-in)

`self-optimization:` is an opt-in feature that gives the Stage-1 agent a
structured safe-output, `propose-step-optimization`, which lets it propose that
deterministic bash commands it ran successfully be lifted into front-matter
`steps:` or `post-steps:`. The runtime pieces that consume this configuration
(the proposing tool and the Stage 3 executor) are part of the same feature
build-out and may land in follow-up layers.

The default is OFF (`enabled: false`). When you first set `enabled: true`,
`staged: true` remains the default: accepted proposals are rendered as PREVIEW
diffs in the build summary and no source-file mutations land. Flip
`staged: false` only after reviewing preview quality and deciding the pipeline
should allow accepted optimizations to update the source `.md`.

`allowed-sections` defaults to `[steps, post-steps]`, because both sections run
in the same job and security context as the agent. Opting in to `setup` or
`teardown` is explicit because those sections run in separate jobs that may have
different identities. See the self-optimization reference
([`docs/self-optimization.md`](self-optimization.md)) for the full feature
design. <!-- TODO: lands with Layer 5 -->
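
The defaults described in this section can be summarised in a small sketch. The struct and method names (`SelfOptSketch`, `sanitized`, `may_mutate_source`, `section_allowed`) are hypothetical and do not come from the compiler source; only the default values and the cap of 50 are taken from the text above. The source does not state a lower bound, so this sketch clamps the upper bound only.

```rust
/// Hypothetical in-memory form of the `self-optimization:` block.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SelfOptSketch {
    pub enabled: bool,
    pub staged: bool,
    pub max_proposals_per_run: u32,
    pub allowed_sections: Vec<String>,
}

/// Documented upper bound for `max-proposals-per-run`.
const MAX_PROPOSALS_CAP: u32 = 50;

impl Default for SelfOptSketch {
    /// Documented defaults: feature off, staged preview on, 3 proposals,
    /// and only the sections that share the agent's job.
    fn default() -> Self {
        Self {
            enabled: false,
            staged: true,
            max_proposals_per_run: 3,
            allowed_sections: vec!["steps".to_string(), "post-steps".to_string()],
        }
    }
}

impl SelfOptSketch {
    /// Clamps the proposal count to the documented maximum.
    pub fn sanitized(mut self) -> Self {
        self.max_proposals_per_run = self.max_proposals_per_run.min(MAX_PROPOSALS_CAP);
        self
    }

    /// Source-file mutations are possible only when enabled AND not staged.
    pub fn may_mutate_source(&self) -> bool {
        self.enabled && !self.staged
    }

    /// `setup` and `teardown` are rejected unless explicitly listed.
    pub fn section_allowed(&self, section: &str) -> bool {
        self.allowed_sections.iter().any(|s| s == section)
    }
}
```

Under these defaults, setting only `enabled: true` leaves `may_mutate_source()` false, which matches the staged-preview behaviour described above.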

## Workspace Defaults

The `workspace:` field controls which directory the agent runs in. When it is
@@ -420,4 +451,3 @@ pipeline. In this mode the compiler:

Result: every PR update fires exactly one PR-typed build (`Build.Reason
== PullRequest`); commit-driven CI is fully silenced.

59 changes: 59 additions & 0 deletions docs/mcp-author.md
@@ -23,9 +23,68 @@ workflows.
| `trace_failure` | Trace a build's failed-job chain using audit data plus any local IR graph. | `{ "build_id_or_url": "123", "step": null, "org": null, "project": null, "pat": null }` |
| `whatif` | Classify downstream jobs if a step or job fails. | `{ "source_path": "...", "failing_id": "Agent" }` |
| `lint_workflow` | Run structural lint checks. | `{ "source_path": "agents/example.md" }` |
| `validate_steps` | IR-validate a proposed front-matter step block. | `{ "steps": [...], "allow_list": "full" \| "curated" }` |
| `catalog` | List safe-outputs, runtimes, tools, engines, and models. | `{ "kind": "safe-outputs" }` |
| `audit_build` | Download and analyze a build; same shape as `ado-aw audit --json`. | `{ "build_id_or_url": "123", "org": null, "project": null, "pat": null, "artifacts": null, "no_cache": false }` |

## `validate_steps`

`validate_steps` lets an authoring agent run the shared IR step-block validator
against a proposed `steps:` or `post-steps:` block before writing it into the
source `.md` file.

Input schema:

```jsonc
{
"steps": [/* JSON array of ADO step entries */],
"allow_list": "full" | "curated" // optional, default "full"
}
```

Modes:

- `"full"` accepts every step kind the IR models (`bash`, `task`, `checkout`,
`download`, `publish`) and any valid task identifier. Use it when an author is
writing their own steps.
- `"curated"` restricts `task:` steps to the typed-factory set in
`src/compile/ir/tasks.rs` (`CURATED_TASK_IDS`). Use it when tooling
double-checks an untrusted agent-proposed block.

Success response:

```json
{
"ok": "true",
"kinds": [
{ "kind": "bash" }
]
}
```

Error response:

```json
{
"ok": "false",
"errors": [
{
"step_index": 0,
"path": "steps[0].task",
"message": "task \"AzureCLI@2\" is not in the curated allow-list (Curated mode only permits tasks with a typed factory in src/compile/ir/tasks.rs)"
},
{
"step_index": 1,
"path": "steps[1].bash",
"message": "bash: value must be a string (the script body)"
}
]
}
```

Validation collects errors instead of short-circuiting, so one call returns the
full picture for the proposed block.

## Trust model

`mcp-author` runs as the invoking local user. It has no bounding directory,
32 changes: 32 additions & 0 deletions docs/safe-outputs.md
@@ -672,3 +672,35 @@ safe-outputs:
Note: `wiki-name` is required. If it is not set, execution fails with an explicit error message.

**Code wikis vs project wikis:** The executor automatically detects code wikis (type 1) and resolves the published branch from the wiki metadata. You only need to set `branch` explicitly to override the auto-detected value (e.g. targeting a non-default branch). Project wikis (type 0) need no branch configuration.

## Self-modification

### `propose-step-optimization` (opt-in)

A structured safe-output for runtime self-optimization. When
`self-optimization.enabled: true` is set in the front matter, the
Stage-1 agent gets access to this tool to propose lifting deterministic
bash work into front-matter `steps:` / `post-steps:`.

Unlike regular safe-output tools (configured via `safe-outputs.<name>:`),
`propose-step-optimization` is activated by the top-level
`self-optimization:` front-matter section and is NOT accepted under
`safe-outputs:` (the compiler rejects it with a helpful message).

**Stage 3 behaviour:**
- When `staged: true` (default): IR-validates the proposed step block
(Curated allow-list — bash + typed-factory tasks only) and renders a
`🎭`-marked preview to the Stage 3 build log showing section,
rationale, estimated token savings, and the proposed YAML.
- When `staged: false`: opens a PR against the source `.md` adding the
new steps (not yet implemented; lands in a follow-up release).

**Stage 2 cross-check:** The detection agent verifies that every bash
command in the proposal's `steps` appears in
`source_command_evidence` (the bash the agent actually ran). Proposals
containing commands the agent didn't demonstrably execute are flagged as
prompt-injection candidates.
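
A deterministic analogue of this cross-check might look like the sketch below. The document describes the check as performed by the detection agent, so this is only an illustration of the rule, not the implementation. The function name `unevidenced_commands` is hypothetical, and matching commands by exact equality after trimming whitespace is an assumption.

```rust
use std::collections::HashSet;

/// Returns every proposed command that does not appear in the evidence list.
/// Assumption for this sketch: commands match by exact equality after trimming.
pub fn unevidenced_commands<'a>(proposed: &'a [String], evidence: &[String]) -> Vec<&'a str> {
    // Index the commands the agent demonstrably ran.
    let seen: HashSet<&str> = evidence.iter().map(|c| c.trim()).collect();
    proposed
        .iter()
        .map(|c| c.trim())
        .filter(|c| !seen.contains(c))
        .collect()
}
```

A non-empty result would correspond to a proposal that should be flagged, since it contains at least one command missing from `source_command_evidence`.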

See [Self-optimization (opt-in)](front-matter.md#self-optimization-opt-in)
for configuration and [docs/self-optimization.md](self-optimization.md)
for the full feature reference (lands with the live-PR path).