Skip to content

feat: add type: set step for binding computed values into context (#221)#226

Open
jrob5756 wants to merge 4 commits into
mainfrom
feature/issue-221-set-step
Open

feat: add type: set step for binding computed values into context (#221)#226
jrob5756 wants to merge 4 commits into
mainfrom
feature/issue-221-set-step

Conversation

@jrob5756
Copy link
Copy Markdown
Collaborator

Closes #221.

Summary

Adds a new agent type that evaluates one or more Jinja2 expressions and binds the results into the workflow context — no LLM call, no subprocess, no I/O.

Two surface forms (mutually exclusive):

# Single binding — output is the typed scalar / list / dict.
- name: compute_target
  type: set
  value: "{{ workflow.input.org }}/{{ workflow.input.repo }}"
  # accessible as: compute_target.output

# Multi-binding — output is a dict; each key accessible as step.output.<key>.
- name: derive_flags
  type: set
  values:
    is_breaking: "{{ research.output.severity in ['high', 'critical'] }}"
    target_branch: "{{ workflow.input.branch or 'main' }}"

See examples/set-step.yaml for a runnable demo with a route that branches on a derived boolean.

Design highlights

  • Type detection defaults to ruamel safe YAML loading with a JSON-safety pass: datetime/date/time become ISO 8601 strings; non-string dict keys and any other non-JSON-safe value raise ExecutionError so checkpoint round-trips never silently change a stored type. Empty / whitespace renders become "", not None. Explicit output_type: (single-value only) supports string, number, integer, boolean, list, dict.
  • Multi-values: renders every binding against the original pre-step context — no left-to-right ordering — per the issue's resolution. Users who want ordered dependencies chain multiple set steps.
  • Composition with parallel groups and for-each: both are supported. The validator rejects same-group sibling references inside value / values.* templates for parallel groups so a set step can never silently miss a sibling's output via the pre-group snapshot.
  • Routes attached to a set step evaluate against the raw bound value: dict outputs expose {{ output.<key> }} and (via simpleeval) bare <key> access; scalars expose {{ output }}.
  • Output schema validation is permitted only when the rendered output is a dict; a scalar with a declared output: schema raises ValidationError with a suggestion to switch to values:. The same validation runs in linear, parallel-group, and for-each positions (shared WorkflowEngine._run_set_step helper).
  • Events: dedicated set_started / set_completed / set_failed events mirror the script-step pattern and are emitted in all three positions (main loop, parallel group, for-each iteration). The CLI verbose log, the web dashboard's synthetic replay path, and the frontend event-type union / store reducers / log + activity builders all gain coverage. A new node type set is added to NodeType and NodeData carries set_output_type, set_output_keys, set_value_repr for the detail panel. SetCompletedData.output_type and SetOutput.output_type are typed with a shared SetOutputType literal union.
  • Context relaxation: WorkflowContext.store now accepts any value; _add_agent_input handles non-dict outputs (deep-copy for dicts, shallow for scalars/lists; KeyError with a "is a <type>, not a dict" message when shorthand field access targets a scalar). The two trim strategies (truncate, summarize) skip / JSON-summarise non-dict outputs and log at debug level so a user debugging "context still too big" can see why.
  • Shared helper: render_set_value_repr + SET_VALUE_REPR_MAX live in set_step.py and are reused by WorkflowEngine and the web server's synthetic replay path so the dashboard preview is consistent across live and resumed runs.

Files

  • Schema/validator: src/conductor/config/schema.py, src/conductor/config/validator.py
  • Executor: src/conductor/executor/set_step.py (new), src/conductor/engine/context.py, src/conductor/engine/workflow.py
  • Event parity: src/conductor/cli/run.py, src/conductor/web/server.py, src/conductor/web/frontend/src/types/events.ts, src/conductor/web/frontend/src/stores/workflow-store.ts, src/conductor/web/frontend/src/lib/constants.ts
  • Example: examples/set-step.yaml
  • Tests: tests/test_config/test_set_schema.py, tests/test_executor/test_set_step.py, tests/test_engine/test_set_workflow.py, tests/test_web/test_server.py, plus regression cases in tests/test_engine/test_context.py
  • Docs: AGENTS.md mentions the new executor module

Review pass

Ran the full pr-review skill (code, tests, comments, error handling, type design) on the initial commit. The follow-up commit (fix: address PR review findings on set step) lands the substantive findings:

  • Parallel/for-each set branches now emit set_* events + run schema validation (was silently skipped).
  • _to_json_safe now raises on non-string dict keys (was silently str()-coerced).
  • Extracted shared render_set_value_repr + SET_VALUE_REPR_MAX (was duplicated 512-char literal in workflow.py + server.py).
  • Synthetic replay now propagates declared output_type (was hard-coded "auto").
  • SetOutput.output_type + frontend SetCompletedData.output_type typed with SetOutputType literal (was str).
  • Several misleading comments rewritten; YAML/log/error paths now observable at debug level.
  • Test additions: schema-path set_failed events, parallel/for-each event parity, web synthetic replay branch, render_set_value_repr truncation boundary, end-to-end date→ISO normalisation (single + multi).

Validation

  • make lint clean
  • make typecheck — single pre-existing warning in dialog_evaluator.py, no new ones
  • Full unit suite: 2798 passed, 3 skipped
  • Targeted pytest -k "set or synthetic": 184 passed
  • make validate-examples: clean
  • Frontend npm run build: clean
  • Smoke runs of conductor run examples/set-step.yaml with both severity branches produce the expected outputs

Acceptance criteria

  • type: set accepted by the YAML schema with mutually exclusive value: / values:
  • Schema rejects both being present or both being absent
  • Single-value: step binds to step.output as the scalar/object
  • Multi-values: step binds each key to step.output.<key>
  • Standard output: schema validation applies if specified (dict outputs; enforced in main loop, parallel groups, and for-each)
  • Routes work as on any other step
  • Example under examples/ showing derived flags + a route that branches on one
  • Tests for: scalar value, multi-value, type detection, explicit output_type, schema rejection of invalid combinations, routing on a set output

jrob5756 and others added 4 commits May 21, 2026 15:06
…221)

Adds a new agent type that evaluates one or more Jinja2 expressions and
binds the results into the workflow context — no LLM call, no subprocess,
no I/O. Closes #221.

Two surface forms:

- `value:` — single expression bound as `<step>.output` (scalar / list / dict
  depending on auto-detection or explicit `output_type:`).
- `values:` — named bindings rendered in a single pass against the original
  pre-step context and bound as `<step>.output.<key>`.

Type detection defaults to YAML auto-parsing with a JSON-safety pass that
converts `datetime`/`date`/`time` to ISO 8601 strings and rejects any other
non-JSON-safe value so checkpoint round-trips stay stable. Empty / whitespace
renders become `""`, not `None`. Explicit `output_type:` (single-`value`
only) supports `string`, `number`, `integer`, `boolean`, `list`, `dict`.

Schema, validator, engine, and event-parity surfaces are all updated:

- `AgentDef` accepts `"set"` plus `value` / `values` / `output_type` fields,
  with a model validator that rejects every irrelevant field and the
  not-both/not-neither combination. Other agent types now reject set-only
  fields too.
- Cross-validator walks `value` / `values.*` for template-ref checks and
  rejects same-group sibling references from set templates inside parallel
  groups.
- `WorkflowContext.store` now accepts any value; `_add_agent_input`
  handles non-dict outputs (deep-copy for dicts, shallow for scalars/lists,
  KeyError with a "is a <type>, not a dict" message when shorthand field
  access targets a scalar). Trim strategies skip non-dict outputs instead
  of crashing on `.items()`.
- Engine adds a dispatch branch in the main loop, parallel group, and
  for-each group; emits `set_started` / `set_completed` / `set_failed`.
  Output schema validation is enforced for dict outputs and rejected for
  scalar outputs with a friendly suggestion.
- CLI verbose log, web `_synth_agent_or_script` synthetic replay, frontend
  event-type union, `NodeType`, `NodeData`, store reducers, log builder,
  and activity builder all gain set-step coverage.
- New example `examples/set-step.yaml` demonstrating single + multi
  binding plus a boolean route on the derived flag.

Tests added: 161 set-related cases across schema, validator, executor,
context regression, and engine integration files. Full unit suite stays
green (2783 passed). Frontend build clean. Make lint / typecheck clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Addresses the comprehensive PR review on `feature/issue-221-set-step`.

Behavioural fixes:
- Parallel and for-each `set` branches now emit `set_started` /
  `set_completed` / `set_failed` and apply `output:` schema validation,
  matching the linear main-loop dispatch. Previously the dashboard
  silently missed set lifecycle events inside groups and the schema
  contract was only enforced in linear position. Extracted the shared
  logic into `WorkflowEngine._run_set_step`.
- `_to_json_safe` now raises `ExecutionError` for non-string dict keys
  instead of silently `str()`-coercing them. Silent coercion risked
  collisions like `{1: "a", "1": "b"}` losing an entry, and contradicted
  the rest of `_to_json_safe`'s "JSON-safe or error" contract.
- `_coerce`'s `auto` branch now logs at debug level when YAML parsing
  fails, so a malformed render that demotes to a string is observable.
- `_trim_truncate` and `_trim_summarize` log at debug level when they
  skip / render non-dict outputs so a user debugging "context still too
  big" can see why.
- `_trim_summarize` now uses `json.dumps` instead of `repr` for non-dict
  summaries, matching the rest of the set pipeline.
- Boolean coercion error message now includes a "rendered empty"
  hint when the template produced whitespace-only output.
- Removed unreachable `TypeError` from `_coerce`'s `int()`/`float()`
  except clauses (the input is always a `str`).

API tightening:
- `SetOutput.output_type` and `SetCompletedData.output_type` (frontend)
  are now narrowed from `str` to the `SetOutputType` literal union
  `auto | string | number | integer | boolean | list | dict`.
- Extracted `render_set_value_repr` + `SET_VALUE_REPR_MAX` into
  `set_step.py` and reused them from `WorkflowEngine` and the web
  server's synthetic-replay branch. Removes a duplicated 512-char
  truncation literal and a duplicated try/except `json.dumps`/`repr`
  fallback, and adds an `logger.error` on the fallback path so corrupt
  checkpoints are visible rather than silently downgraded.
- Synthetic replay now propagates the declared `output_type` from the
  agent def instead of hard-coding `"auto"` so a resumed dashboard
  matches the live emitter.
- Moved `import copy` to module top in `context.py` (was deferred
  inside `_add_agent_input`).
- Added `logger` to `context.py` for the new debug paths.

Documentation:
- Set-step module docstring now accurately describes the type-detection
  contract: `_to_json_safe` raises on unknown Python objects (not
  "falls back to string") and converts `datetime`/`date`/`time` to
  ISO-8601 strings.
- Fixed misleading "URL-like string" rationale in the auto-detect
  None-fallback to correctly describe pure-comment renders.
- Fixed self-contradictory "sentinel" framing in `_add_agent_input` —
  the seed value is just the natural default shape for each output type.
- Dropped trailing-edge `(issue #221)` parenthetical from the dispatch
  comment and `(per-key typing may be added in a future enhancement)`
  speculation from the schema docstring.
- Trimmed stale "cheap but reusable" rationale on `_YAML_LOADER`.
- Pointed the "guaranteed by the schema validator" comment at the
  actual function name (`AgentDef.validate_agent_type`).

Tests added / tightened:
- Direct `_coerce` auto branch: tightened over-loose YAML parse failure
  test (now asserts the raw string is returned verbatim); added explicit
  pure-comment-render and null-marker coverage.
- End-to-end execute() tests for date-like renders normalising to ISO
  strings (single + multi).
- Dedicated `TestRenderSetValueRepr` covering the 512-char truncation
  boundary with the shared helper.
- `set_failed` event assertion on both schema-validation failure paths
  (scalar-with-schema, multi-with-mismatch), in addition to the existing
  template-error case.
- New `TestSetInParallelGroup.test_set_in_parallel_emits_set_events`
  and `test_set_in_parallel_output_schema_enforced` lock in the
  group-parity fixes.
- New `TestSetInForEach.test_set_in_for_each_emits_set_events_per_item`
  asserts per-iteration `set_started`.
- New `TestSyntheticReplaySetStep` covers the web `_synth_agent_or_script`
  set branch for scalar, dict, and declared-output-type cases, plus
  the shared-helper invariant.
- Renamed `TestSetCheckpointRoundtrip` →
  `TestSetContextSerialization` (the test exercises
  `WorkflowContext.to_dict`/`from_dict` directly; engine resume is
  covered separately by the CLI suite).

All 2798 unit tests pass; lint clean; frontend builds clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ctor skill (#221)

User-facing documentation now covers the new set step type added by
the engine in this PR. Each surface follows the conventions already
established by the script-step and sub-workflow docs.

- **README.md**: add "Set steps" to the Features list and a row for
  `set-step.yaml` in the Examples table.
- **examples/README.md**: new "Set Step Examples" section with both
  severity branches shown as runnable commands.
- **docs/workflow-syntax.md**: full Set Steps section after Script
  Steps — schema, type-detection table, multi-binding ordering rule,
  routing semantics for dict vs scalar outputs, output-schema rules,
  composition with parallel / for-each, restrictions, and event
  contract.
- **plugins/conductor/skills/conductor/SKILL.md**: new `type: set`
  row in the Key Concepts table.
- **plugins/conductor/skills/conductor/references/yaml-schema.md**:
  add set-only fields (`value`, `values`, `output_type`) to the
  AgentDef block, a Set agent restrictions paragraph, and a new
  "Set Agent Schema" section covering type detection, routing,
  composition, and event payload shape.
- **plugins/conductor/skills/conductor/references/authoring.md**:
  full "Set Steps" how-to after Script Steps mirroring the
  workflow-syntax depth.
- **AGENTS.md**: extend the Workflow Execution Flow with a step
  for set dispatch (including the shared `_run_set_step` helper),
  and add a Key Patterns bullet covering set typing, `_to_json_safe`,
  `WorkflowContext.store` widening, and `_add_agent_input`
  non-dict semantics.

No code changes. `make lint` clean; `make validate-examples` clean;
targeted `pytest -k 'set or synthetic'` still 184 passing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…#221)

The earlier commits in this PR plumbed set_started/set_completed/
set_failed events end-to-end (engine → web server → frontend types →
store reducers) and added set_output_type / set_output_keys /
set_value_repr fields to NodeData. However, two visual surfaces still
fell through to the default LLM-agent rendering:

- graph-layout.ts only mapped script/human_gate/workflow to specific
  flow node types — set steps rendered as AgentNode (chat-bubble icon,
  empty model area, no value preview).
- DetailPanel's switch dispatched to ScriptDetail/GateDetail/etc with
  AgentDetail as default — set steps opened AgentDetail, which expects
  output/model/tokens/cost_usd/iterationHistory and ignored the set-
  specific fields the reducer carefully populated.

This commit adds the missing visual treatment:

- SetNode.tsx: graph node modelled on ScriptNode but with a Variable
  lucide icon and a set-specific stats line — key count for multi-
  binding outputs, value preview for single-binding scalars, plus
  elapsed time. Uses the same status colours, transition animations,
  and live-elapsed timer as the other node types.
- SetDetail.tsx: detail panel with a "Set" status badge, metadata
  grid (Elapsed, Output Type, Bindings), and an OutputViewer that
  surfaces value_repr (already JSON-truncated server-side at ~512
  chars) with the standard expand/copy controls.
- WorkflowGraph: register setNode in the nodeTypes map.
- graph-layout: add `nodeType === 'set'` → `setNode` mapping.
- DetailPanel: add `case 'set': return SetDetail` to the dispatch
  switch.

Verified via `npm run build` (2089 modules, +2), `make lint` clean,
`pytest -k 'set or synthetic'` still 184 passing, and `conductor run
examples/set-step.yaml` produces the expected output.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add a type: set step for binding computed values into the context

1 participant