You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
ado-aw no longer compiles pipelines by substituting strings into YAML template files. Every production target builds a typed Azure DevOps pipeline IR, resolves graph-level facts, lowers that IR to serde_yaml::Value, and serializes once with serde_yaml::to_string.
The implementation lives under src/compile/ir/. The canonical 5-job agentic-pipeline shape (Setup → Agent → Detection → SafeOutputs → Teardown) lives in src/compile/agentic_pipeline.rs and is shared by every target. Per-target wrappers handle only the envelope:
src/compile/standalone_ir.rs
src/compile/onees_ir.rs
src/compile/job_ir.rs
src/compile/stage_ir.rs
Those wrappers are the only place per-target shape (top-level PipelineShape, template parameters, 1ES extends:) should be assembled. Shared canonical-shape logic belongs in agentic_pipeline.rs. Shared target logic should be typed IR construction helpers, not string fragments.
Module layout
src/compile/ir/ is split by responsibility:
ids.rs — typed StageId, JobId, and StepId newtypes. Constructors validate the ADO identifier grammar (^[A-Za-z_][A-Za-z0-9_]*$) so invalid names fail at compile time.
step.rs — Step and concrete step structs: BashStep, TaskStep, CheckoutStep, DownloadStep, and PublishStep.
tasks/ — typed builder structs for built-in ADO tasks (one submodule per task). Each builder exposes new(<required>), typed chained setters for optional inputs (enums for constrained values, bool for bool-string inputs), and into_step() -> TaskStep; only set fields are emitted. Command/mode-dispatch tasks (Docker@2, DotNetCoreCLI@2, NuGetCommand@2, PowerShell@2) use a command enum with per-variant data so invalid input/command combos are unrepresentable (tasks/docker.rs is the canonical template). Prefer these builders for compiler-generated tasks rather than open-coding TaskStep::new(...) with raw string keys at each call site.
job.rs — Job, Pool, job variables, 1ES templateContext support, and target-job external dependsOn / condition wrapping.
stage.rs — Stage plus target-stage external dependsOn / condition wrapping.
env.rs — typed environment values (EnvValue) including ADO macros, pipeline variables, secrets, OutputRefs, Coalesce, macro-form Concat, and RuntimeExpression (a $[ ... ] ADO runtime expression that the lowering pass auto-hoists to a job-level variables: entry — ADO does not evaluate $[ ... ] inside step env:). RuntimeExpression is only valid at the top level of a step env: value: nesting it inside Concat or Coalesce is rejected at lower time (the hoist pass walks only the top level, so a nested occurrence would emit a dangling $(AwRtExpr_…) macro). A Literal or RawYamlScalar(String) smuggling a raw $[ ... ] into a step env: value (whether at the top level or nested inside a Concat) is likewise rejected at lower time — ADO passes such scalars verbatim, so the typed RuntimeExpression variant must be used instead.
condition.rs — the Condition / Expr AST and code generation to ADO condition syntax.
output.rs — OutputDecl, OutputRef, and the output-reference lowering rules.
validate pass — there is no separate validate.rs module in the current tree; graph invariants live in graph.rs, shape checks live near the relevant lowering code in lower.rs, and target-specific validation stays in the target builder.
lower.rs — converts typed IR to a serde_yaml::Value tree.
emit.rs — calls lower::lower() and serde_yaml::to_string() for canonical YAML output.
Top-level pipeline types
The root type is Pipeline in src/compile/ir/mod.rs:
Shape is intentionally separate from body. For example, the 1ES target still builds the canonical job graph as PipelineBody::Jobs; the lowering pass wraps those jobs under the 1ES extends.parameters.stages[0].jobs shape.
Steps
All generated pipeline steps should use typed variants from src/compile/ir/step.rs:
Use the typed structs whenever the compiler owns the step:
Step::Bash for inline bash (BashStep::script is the raw body, not a YAML block).
Step::Task for ADO task invocations such as UseNode@1, UsePythonVersion@0, or UseDotNet@2. For compiler-generated built-in tasks, prefer the typed builder structs in src/compile/ir/tasks/ (e.g. tasks::copy_files::CopyFiles::new(...).into_step()) over ad-hoc TaskStep::new(...) calls.
Step::Checkout for checkout: steps.
Step::Download for pipeline-artifact downloads.
Step::Publish for pipeline-artifact publishes. Under 1ES, lowering moves publish steps into templateContext.outputs so artifacts are published by the 1ES template machinery exactly once.
Step::RawYaml is reserved for user-authored setup/teardown YAML that the IR does not model. Do not use it for compiler-generated steps that need output refs, conditions, env rewriting, or graph-derived dependencies.
BashStep and TaskStep carry common compiler-owned fields:
id: Option<StepId> — emitted as ADO step name:; required when another step consumes an output from this step.
let r = OutputRef::new(StepId::new("synthPr")?,"AW_SYNTHETIC_PR_ID");EnvValue::step_output(r)
The consumer does not choose the ADO expression syntax. output.rs::lower_outputref() chooses the correct syntax from the consumer and producer locations:
Consumer vs. producer
Lowered syntax
Same job
$(stepName.X)
Sibling job in the same stage, or both jobs are stage-less
This rule exists because Azure DevOps output variables are context-sensitive. The historical synthPr failures came from hand-written code using the wrong reference form for the consumer location. The IR centralizes that choice so new compiler code declares what it needs (OutputRef) rather than guessing how ADO will expose it.
graph.rs also sets OutputDecl::auto_is_output = true when any consumer reads the declaration. The producer can then emit ##vso[task.setvariable ...;isOutput=true] only when cross-step visibility is actually needed.
Graph pass
graph.rs::resolve() is the all-in-one pass for dependency derivation:
Index every named step and its declared outputs.
Walk every EnvValue::StepOutput, every output nested inside EnvValue::Coalesce / EnvValue::Concat, and every Expr::StepOutput inside conditions.
Validate that each reference names an existing step with a matching OutputDecl.
Lift step-output edges into job-level and stage-level dependencies.
Detect cycles in the derived job and stage graphs.
Merge the derived edges into Job::depends_on and Stage::depends_on while preserving any explicit values a target builder supplied.
Mark producer outputs that need isOutput=true.
Same-job refs do not produce dependsOn entries because ADO orders steps by position. Cross-job refs add Job::depends_on; cross-stage refs add Stage::depends_on. The lowering pass reads those fields and emits canonical dependsOn: blocks.
Conditions
condition.rs defines a small AST for ADO conditions:
Use constructors such as Condition::and([...]), Condition::or([...]), and Condition::not(...) when composing nested expressions. Codegen flattens nested And / Or nodes and quotes string literals for ADO expression syntax:
Expr::StepOutput uses the same location-aware output-ref lowering as EnvValue::StepOutput. Condition::Custom is an escape hatch for expressions not yet modeled by the AST; codegen rejects embedded newlines and ADO pipeline-command markers (##vso[, ##[) before emitting it.
Extension declarations
The extension trait lives in src/compile/extensions/mod.rs and now has exactly three surface methods:
Declarations is the typed aggregate for every signal an extension contributes:
agent_prepare_steps: Vec<Step>
setup_steps: Vec<Step>
agent_finalize_steps: Vec<Step>
detection_prepare_steps: Vec<Step>
safe_outputs_steps: Vec<Step>
network_hosts: Vec<String>
bash_commands: Vec<String>
prompt_supplement: Option<String>
mcpg_servers: Vec<(String, McpgServerConfig)>
copilot_allow_tools: Vec<String>
pipeline_env: Vec<PipelineEnvMapping>
awf_mounts: Vec<AwfMount>
awf_path_prepends: Vec<String>
agent_env_vars: Vec<(String, String)>
warnings: Vec<String>
Extension phases are System, Runtime, and Tool. The compiler sorts extensions by phase before merging declarations, so internal system plumbing lands first, runtime installs land before user tools, and tool extensions can assume requested runtimes are available.
Always-on extensions are collected in collect_extensions() before user-configured runtimes/tools:
AdoAwMarkerExtension
GitHubExtension
SafeOutputsExtension
AdoScriptExtension
ExecContextExtension
AzureCliExtension
Lowering and emission
lower.rs::lower() builds and validates a Graph, then converts the typed Pipeline into a serde_yaml::Value tree. The lowerer owns ADO wire shapes and canonical ordering: top-level identity and configuration keys first, then jobs: / stages:, with target-specific wrapping based on PipelineShape.
emit.rs::emit() is intentionally thin:
pubfnemit(pipeline:&Pipeline) -> Result<String>{let value = super::lower::lower(pipeline)?;
serde_yaml::to_string(&value)}
This gives all targets one serialization path and one canonical YAML style. Target compilers should return a complete typed Pipeline; they should not format YAML directly.
Per-target compilers
The production target wrappers are:
standalone_ir.rs — wraps the canonical shape in a top-level standalone pipeline.
onees_ir.rs — wraps the same canonical shape with PipelineShape::OneEs, causing the lowerer to emit the 1ES extends: wrapper and templateContext outputs.
job_ir.rs — wraps the canonical shape as a target-job template with external dependsOn / condition template parameters.
stage_ir.rs — wraps the canonical shape as a target-stage template with the stage-level external-parameter wrapper.
The canonical 5-job Setup → Agent → Detection → SafeOutputs → Teardown shape itself lives in agentic_pipeline.rs and is reused unchanged by every wrapper above; extensions plug into it via Declarations (steps, env, hosts, MCPG entries, and Agent-job condition clauses — see Declarations::agent_conditions).
When adding a target, follow the same pattern: parse and validate front matter, collect extension Declarations, build typed jobs/stages/steps, set the correct PipelineShape, and call the shared emit path.
Public JSON summary (ir::summary)
The internal IR types (Pipeline, Job, Step, Graph, …) are
intentionally tied to the compiler's lowering needs and are not
public API. To give agent-facing tooling a stable view of a compiled
pipeline, src/compile/ir/summary.rs defines a parallel
summary tree with #[derive(Serialize)] that is consumed by:
ado-aw graph dump <source> [--format text|json|dot] — resolved
dependency graph (subset of the summary).
ado-aw graph deps <source> <step-id> and `ado-aw graph outputs
` — focused graph queries over step dependencies and output
declaration/reference edges.
ado-aw whatif <source> --fail <step-id-or-job-id> — static
downstream skip classification from graph reachability and rendered
conditions.
The ado-aw audit JSON (AuditData.pipeline_graph) and the
author-MCP server.
Stability contract
PipelineSummary::schema_version (currently 1) is the public schema
version. Bump it when the JSON shape changes in a way a downstream
consumer would notice (renamed field, removed variant, changed
semantics). Additive changes like new optional fields do not require a
bump. New enum variants currently do require a schema-version bump
because the serialized enums do not have catch-all Unknown variants.
The summary is the public schema. Internal IR types may change freely
without bumping the summary version, as long as the summary lowering
keeps the existing field set populated correctly.
condition? is the lowered ADO condition string (e.g.
"eq(dependencies.Detection.outputs['threatAnalysis.SafeToProcess'], 'true')"),
not the typed AST — consumers don't need the AST to reason about
"would this run if X failed?".
build_pipeline_ir is the public read-only entry point: it parses
and sanitises front matter, runs the same target dispatch as
compile_pipeline, and returns the typed Pipeline without writing
any YAML. PipelineSummary::from_pipeline runs the graph pass
(reusing graph::build_graph for validation + edge derivation) and
populates auto_is_output for any output that has at least one
cross-step consumer — without mutating the input pipeline.
{ "schema_version": 1, "name": "<pipeline name>", "shape": "standalone" | "1es" | "job-template" | "stage-template", "body": { "kind": "jobs", "jobs": [...] } // OR { "kind": "stages", "stages": [...] }, "graph": { "step_locations": [{ "step", "stage?", "job", "outputs": [...] }], "job_edges": [{ "consumer", "producer" }], // consumer dependsOn producer "stage_edges": [{ "consumer", "producer" }], "outputs_needing_is_output": [{ "step", "outputs": [...] }] } }