Skip to content

feat(ir): validate front-matter task steps against typed builders#1096

Open
jamesadevine wants to merge 2 commits into
mainfrom
feat/ir-task-frontmatter-validation
Open

feat(ir): validate front-matter task steps against typed builders#1096
jamesadevine wants to merge 2 commits into
mainfrom
feat/ir-task-frontmatter-validation

Conversation

@jamesadevine

Copy link
Copy Markdown
Collaborator

Summary

Adds the parse / validate direction to the typed task builders introduced in #1082 — making them usable as a schema for front-matter task steps, not just one-way constructors.

parse_task_step(&serde_yaml::Value) -> Result<Option<TaskStep>> (in src/compile/ir/tasks/parse.rs) is the inverse of into_step(): it deserializes an authored ADO task-step mapping and validates it by reusing the builder structs.

  • Derived Deserialize (keyed on real ADO input names, deny_unknown_fields) on CopyFiles and the five Docker@2 command-variant structs.
  • Added a flexible bool deserializer in common.rs so both CleanTargetFolder: true and CleanTargetFolder: "true" validate (matching how ADO inputs are authored).
  • Docker@2 dispatches on its command input into the right variant, so each command only accepts its own inputs.

Partial coverage is safe and additive

This is designed to sit in front of the existing front-matter steps: passthrough, so it has three outcomes:

Outcome Meaning
Ok(Some(step)) recognized task, inputs valid → normalized TaskStep
Ok(None) not modeled (a task with no typed builder yet, or a non-task step like bash:/script:) → caller keeps the original YAML unchanged (today: Step::RawYaml)
Err(e) recognized task with invalid inputs (missing required, unknown key, bad value, or an input for the wrong command)

"Not mapped" is deliberately not an error, so mapping a new task only ever adds validation and never rejects a workflow that compiled before. Strict mode (reject any task: not in the IR) would be a separate opt-in policy on top.

Wired up for CopyFiles@2 and Docker@2 as a proof of concept; extending coverage = derive Deserialize on the remaining builders + one match arm.

Stacked on #1082 (refactor/ado-builtin-tasks). This PR targets that branch; retarget to main once #1082 merges.

Test plan

  • cargo build — clean
  • cargo clippy --all-targets --all-features — clean
  • cargo test — full suite green; tasks::parse adds 11 tests covering: valid CopyFiles/Docker roundtrip, missing required input, unknown input key, bad bool value, input supplied for the wrong Docker command, missing/unknown command, unmapped-task passthrough, and non-task-step passthrough

jamesadevine and others added 2 commits June 18, 2026 11:48
Replace the free-function ADO task helpers in src/compile/ir/tasks.rs
(positional required args + untyped .with_input("camelKey","str") for
optionals) with a tasks/ directory module of typed builder structs, one
file per task. Each builder exposes new(<required>) + typed chained
setters + into_step() -> TaskStep; only set fields emit inputs, with
enums for constrained values and Option<bool> for bool-string inputs.

Command/mode-dispatch tasks (Docker@2, DotNetCoreCLI@2, NuGetCommand@2,
PowerShell@2) use a command enum with per-variant data so invalid
input/command combinations are unrepresentable; tasks/docker.rs is the
canonical template. Migrate the docker_installer call sites, update the
ado-task-ir-contributor workflow + docs to the new convention.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add a parse direction to the typed task builders: parse_task_step()
deserializes an authored ADO task-step mapping and validates it by
reusing the builder structs as the schema. Derive Deserialize (keyed on
ADO input names, deny_unknown_fields) on CopyFiles and the Docker@2
command variants, plus a flexible bool deserializer accepting both 	rue
and "true".

Partial coverage is safe and additive: parse_task_step returns
Ok(Some(TaskStep)) for a recognized+valid step, Ok(None) for anything not
modeled (unmapped task or non-task step) so the caller keeps the original
YAML as today's opaque passthrough, and Err only for a recognized task
with invalid inputs (missing required, unknown key, bad value, or an
input for the wrong command). Wired up for CopyFiles@2 and Docker@2 as a
proof of concept.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Good design and clean implementation overall, but there's one real bug with integer-typed inputs and a silent data-loss risk in the public API surface that should be addressed before callers wire this up.

Findings

🐛 Bugs / Logic Issues

  • copy_files.rsretryCount / delayBetweenRetries reject integer YAML values: These fields are typed Option<String>, but users naturally write retryCount: 3 (an integer literal in YAML). serde_yaml won't coerce an integer to a String, so this produces a false-positive validation error for a perfectly valid step. The bool inputs got de_opt_bool_flex exactly for this reason — the same treatment is needed here. A de_opt_str_or_int helper that accepts String | u64 | i64 and stringifies would fix it, parallel to de_opt_bool_flex.

  • parse.rs line 52–65 — condition:, env:, continueOnError:, timeoutInMinutes: are silently dropped: parse_task_step only extracts displayName and inputs; all other ADO step-level keys are discarded. The returned TaskStep will be missing those fields. Since parse_task_step is pub and the doc comment promises a "normalized TaskStep", a future caller wiring this into the compilation path could silently drop user-authored conditions and env vars. Either document the limitation clearly ("does not preserve step-level fields other than displayName") or extract and apply them here.

  • parse.rs line 77–80 — misleading error when command: is present but not a string: map.remove("command").and_then(|v| v.as_str()...) returns None for command: 123, so the error fires with "Docker@2 requires a command input" — implying the key is absent, not malformed. A separate bail! for the non-string case would give a clearer message.

✅ What Looks Good

  • The three-outcome contract (Ok(Some) / Ok(None) / Err) is the right design: strictly additive, existing workflows never break when new coverage is added.
  • Removing command from the map before routing to the per-variant deserializer with deny_unknown_fields is elegant — makes cross-command input contamination structurally impossible.
  • de_opt_bool_flex handles the ADO bool-string duality correctly; #[serde(skip)] on display_name is right (it's not an ADO input).
  • Error propagation via anyhow .context() throughout, no unwrap()/expect() on user-facing paths.
  • Test coverage is solid — roundtrip, missing required, unknown key, bad bool, wrong-command input, missing/unknown command, and passthrough paths are all exercised.

Generated by Rust PR Reviewer for issue #1096 · 85.5 AIC · ⌖ 12.2 AIC · ⊞ 35.3K ·

Base automatically changed from refactor/ado-builtin-tasks to main June 18, 2026 15:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant