Problem
The conductor workflow YAML contract today lives only inside the Pydantic models in src/conductor/config/schema.py. Downstream consumers (IDEs, CI in repos that author workflow YAMLs, custom linters) have no portable way to validate a YAML file against that contract without running conductor itself. Today the feedback loop on a typo'd field name is:
- Edit YAML in editor — no feedback.
- Push.
- conductor validate runs in CI (or worse, conductor run at workflow launch) and fails with a parse error.
In a repo with ~15 workflow YAMLs and ~50 active authors of nested step bodies (see PolyphonyRequiem/polyphony), this is a meaningful friction tax. Hand-rolled Pester lints in polyphony partially compensate by re-encoding structural rules, but they drift from conductor's own type definitions and can never be authoritative.
Proposal
Expose conductor's existing typed schema as a public JSON Schema artifact:
CLI verb
conductor schema [--component workflow|agent|route|all] [--output json|yaml] [--out <path>]
Defaults to --component workflow --output json → stdout. Body is the output of WorkflowConfig.model_json_schema() (Pydantic gives this for free; I verified locally that it produces a valid schema today).
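A minimal sketch of how the verb's argument handling could look, using only the standard library. The `run` function and the `sources` registry (component name mapped to a callable returning a schema dict) are hypothetical names introduced for illustration; in conductor the workflow entry would presumably be `WorkflowConfig.model_json_schema`. How `--component all` should be shaped is not specified above, so the name-to-schema mapping here is an assumption, and YAML output is left out because it needs a YAML serializer.

```python
import argparse
import json
import sys
from typing import Callable, Dict, Optional, Sequence

# Hypothetical type: a zero-argument callable that returns a JSON Schema dict,
# e.g. WorkflowConfig.model_json_schema in the real code base.
SchemaSource = Callable[[], dict]


def build_parser() -> argparse.ArgumentParser:
    """Mirror the flags proposed in the issue, with the proposed defaults."""
    parser = argparse.ArgumentParser(prog="conductor schema")
    parser.add_argument("--component", default="workflow",
                        choices=["workflow", "agent", "route", "all"])
    parser.add_argument("--output", default="json", choices=["json", "yaml"])
    parser.add_argument("--out", metavar="PATH", default=None)
    return parser


def run(argv: Optional[Sequence[str]], sources: Dict[str, SchemaSource]) -> str:
    """Render the requested schema and write it to --out or stdout."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.output == "yaml":
        # Assumption: YAML needs a third-party serializer, so this sketch stops here.
        parser.error("yaml output is not implemented in this sketch")
    if args.component == "all":
        # Assumption: "all" emits a mapping of component name -> schema.
        schema = {name: source() for name, source in sources.items()}
    else:
        schema = sources[args.component]()
    text = json.dumps(schema, indent=2, sort_keys=True) + "\n"
    if args.out:
        with open(args.out, "w", encoding="utf-8") as handle:
            handle.write(text)
    else:
        sys.stdout.write(text)
    return text
```

Sorting keys keeps the emitted file byte-stable across runs, which matters if the same output is committed and diffed.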
Release artifact
Publish the generated schema as a GitHub release asset alongside the conductor binary:
https://github.com/microsoft/conductor/releases/download/v<X.Y.Z>/workflow.schema.json
So consumers can reference it via a stable, versioned URL without parsing release notes.
Drift protection
Embed src/conductor/schemas/workflow.schema.json as a regenerated build artifact; a unit test asserts WorkflowConfig.model_json_schema() == file_contents so the runtime types and the published schema can never drift.
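The drift check could be as small as the sketch below. `assert_schema_in_sync` and `normalize` are hypothetical helper names; the real test would pass `WorkflowConfig.model_json_schema()` as `generated` and the committed `workflow.schema.json` as `committed_path`. Comparing parsed JSON rather than raw text is a design choice made here so that whitespace and key order in the committed file do not cause false failures.

```python
import json
from pathlib import Path


def normalize(schema: dict) -> dict:
    """Round-trip through JSON so Python-only types (e.g. tuples) become plain JSON types."""
    return json.loads(json.dumps(schema))


def assert_schema_in_sync(generated: dict, committed_path: Path) -> None:
    """Raise AssertionError if the committed schema no longer matches the generated one."""
    committed = json.loads(committed_path.read_text(encoding="utf-8"))
    if normalize(generated) != committed:
        raise AssertionError(
            f"{committed_path} is stale; regenerate it from the current models"
        )
```

In a pytest suite this would be called from a single test function, so a model change without a regenerated artifact fails the build.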
Downstream value
- IDE feedback. Workflow YAMLs add a header # yaml-language-server: $schema=https://.../workflow.schema.json and get red squiggles in VS Code for unknown fields, wrong types, missing required keys, and illegal enum values, plus autocomplete on Ctrl+Space. Today these errors wait until parse time at workflow launch.
- CI step. Repos that author workflow YAMLs can run check-jsonschema --schemafile <url> workflows/*.yaml as one deterministic step. No conductor install required; the only dependency is check-jsonschema itself, and it runs in seconds.
- Replaces structural lint clauses in downstream repos. Polyphony today carries ~28 lint files under .conductor/registry/tests/lint-*.ps1 to validate workflow shape; ~30-40% of the clauses inside those files are structural checks (required field, enum value, type) that a schema would absorb. The semantic checks (M4 routing rules, M10 loop safety, vocab) stay because they need runtime reachability analysis that JSON Schema can't express, but the structural baseline becomes authoritative rather than hand-maintained.
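For repos that drive CI from a script, the check-jsonschema step could be wrapped roughly as follows. This is a sketch under assumptions: `schema_url`, `check_command`, and `validate` are hypothetical helpers, the URL follows the release-asset pattern proposed above (the asset does not exist until the proposal ships), and the version string in the usage note is made up.

```python
import subprocess
from typing import List, Sequence


def schema_url(version: str) -> str:
    """Build the versioned release-asset URL proposed in this issue."""
    return (
        "https://github.com/microsoft/conductor/releases/download/"
        f"v{version}/workflow.schema.json"
    )


def check_command(version: str, files: Sequence[str]) -> List[str]:
    """Assemble the check-jsonschema invocation for the given workflow files."""
    return ["check-jsonschema", "--schemafile", schema_url(version), *files]


def validate(version: str, files: Sequence[str]) -> int:
    """Run the check and return its exit code (non-zero means a file failed)."""
    return subprocess.run(check_command(version, files), check=False).returncode
```

Calling `validate("1.2.3", ["workflows/build.yaml"])` would pin validation to one conductor release, so a schema change only affects a downstream repo when it bumps the version.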
Scope estimate
~6-10 hours in this repo:
- CLI verb (~50 LoC).
- Generated artifact + drift test (~30 LoC).
- Release pipeline update to publish the asset.
- Docs in references/yaml-schema.md pointing to the schema URL.
Non-goals
- This does not replace conductor validate. The validate verb does semantic checks (route reachability, agent name references, etc.) that aren't in scope for the schema. The schema is the structural baseline; validate stays the semantic layer.
- This does not propose freezing the YAML contract. The schema is versioned with the conductor release; breaking changes follow whatever versioning policy conductor already has.
Open questions
- Is the conductor team comfortable owning a public structural contract for the workflow YAML, or would you rather downstream repos vendor their own derived schema?
- Are sub-components (agent, route, parallel/for_each nodes, MCP nodes) worth separate --component flags, or just emit the whole tree from workflow?
- Any preference on the artifact path inside the source tree (src/conductor/schemas/ vs schemas/ vs artifacts/)?
Happy to draft the PR if there's appetite — wanted to scope it as an issue first.