feat: add type: set step for binding computed values into context (#221)#226
Open
jrob5756 wants to merge 4 commits into
Open
feat: add type: set step for binding computed values into context (#221)#226jrob5756 wants to merge 4 commits into
type: set step for binding computed values into context (#221)#226jrob5756 wants to merge 4 commits into
Conversation
…221) Adds a new agent type that evaluates one or more Jinja2 expressions and binds the results into the workflow context — no LLM call, no subprocess, no I/O. Closes #221. Two surface forms: - `value:` — single expression bound as `<step>.output` (scalar / list / dict depending on auto-detection or explicit `output_type:`). - `values:` — named bindings rendered in a single pass against the original pre-step context and bound as `<step>.output.<key>`. Type detection defaults to YAML auto-parsing with a JSON-safety pass that converts `datetime`/`date`/`time` to ISO 8601 strings and rejects any other non-JSON-safe value so checkpoint round-trips stay stable. Empty / whitespace renders become `""`, not `None`. Explicit `output_type:` (single-`value` only) supports `string`, `number`, `integer`, `boolean`, `list`, `dict`. Schema, validator, engine, and event-parity surfaces are all updated: - `AgentDef` accepts `"set"` plus `value` / `values` / `output_type` fields, with a model validator that rejects every irrelevant field and the not-both/not-neither combination. Other agent types now reject set-only fields too. - Cross-validator walks `value` / `values.*` for template-ref checks and rejects same-group sibling references from set templates inside parallel groups. - `WorkflowContext.store` now accepts any value; `_add_agent_input` handles non-dict outputs (deep-copy for dicts, shallow for scalars/lists, KeyError with a "is a <type>, not a dict" message when shorthand field access targets a scalar). Trim strategies skip non-dict outputs instead of crashing on `.items()`. - Engine adds a dispatch branch in the main loop, parallel group, and for-each group; emits `set_started` / `set_completed` / `set_failed`. Output schema validation is enforced for dict outputs and rejected for scalar outputs with a friendly suggestion. - CLI verbose log, web `_synth_agent_or_script` synthetic replay, frontend event-type union, `NodeType`, `NodeData`, store reducers, log builder, and activity builder all gain set-step coverage. - New example `examples/set-step.yaml` demonstrating single + multi binding plus a boolean route on the derived flag. Tests added: 161 set-related cases across schema, validator, executor, context regression, and engine integration files. Full unit suite stays green (2783 passed). Frontend build clean. Make lint / typecheck clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Addresses the comprehensive PR review on `feature/issue-221-set-step`.
Behavioural fixes:
- Parallel and for-each `set` branches now emit `set_started` /
`set_completed` / `set_failed` and apply `output:` schema validation,
matching the linear main-loop dispatch. Previously the dashboard
silently missed set lifecycle events inside groups and the schema
contract was only enforced in linear position. Extracted the shared
logic into `WorkflowEngine._run_set_step`.
- `_to_json_safe` now raises `ExecutionError` for non-string dict keys
instead of silently `str()`-coercing them. Silent coercion risked
collisions like `{1: "a", "1": "b"}` losing an entry, and contradicted
the rest of `_to_json_safe`'s "JSON-safe or error" contract.
- `_coerce`'s `auto` branch now logs at debug level when YAML parsing
fails, so a malformed render that demotes to a string is observable.
- `_trim_truncate` and `_trim_summarize` log at debug level when they
skip / render non-dict outputs so a user debugging "context still too
big" can see why.
- `_trim_summarize` now uses `json.dumps` instead of `repr` for non-dict
summaries, matching the rest of the set pipeline.
- Boolean coercion error message now includes a "rendered empty"
hint when the template produced whitespace-only output.
- Removed unreachable `TypeError` from `_coerce`'s `int()`/`float()`
except clauses (the input is always a `str`).
API tightening:
- `SetOutput.output_type` and `SetCompletedData.output_type` (frontend)
are now narrowed from `str` to the `SetOutputType` literal union
`auto | string | number | integer | boolean | list | dict`.
- Extracted `render_set_value_repr` + `SET_VALUE_REPR_MAX` into
`set_step.py` and reused them from `WorkflowEngine` and the web
server's synthetic-replay branch. Removes a duplicated 512-char
truncation literal and a duplicated try/except `json.dumps`/`repr`
fallback, and adds an `logger.error` on the fallback path so corrupt
checkpoints are visible rather than silently downgraded.
- Synthetic replay now propagates the declared `output_type` from the
agent def instead of hard-coding `"auto"` so a resumed dashboard
matches the live emitter.
- Moved `import copy` to module top in `context.py` (was deferred
inside `_add_agent_input`).
- Added `logger` to `context.py` for the new debug paths.
Documentation:
- Set-step module docstring now accurately describes the type-detection
contract: `_to_json_safe` raises on unknown Python objects (not
"falls back to string") and converts `datetime`/`date`/`time` to
ISO-8601 strings.
- Fixed misleading "URL-like string" rationale in the auto-detect
None-fallback to correctly describe pure-comment renders.
- Fixed self-contradictory "sentinel" framing in `_add_agent_input` —
the seed value is just the natural default shape for each output type.
- Dropped trailing-edge `(issue #221)` parenthetical from the dispatch
comment and `(per-key typing may be added in a future enhancement)`
speculation from the schema docstring.
- Trimmed stale "cheap but reusable" rationale on `_YAML_LOADER`.
- Pointed the "guaranteed by the schema validator" comment at the
actual function name (`AgentDef.validate_agent_type`).
Tests added / tightened:
- Direct `_coerce` auto branch: tightened over-loose YAML parse failure
test (now asserts the raw string is returned verbatim); added explicit
pure-comment-render and null-marker coverage.
- End-to-end execute() tests for date-like renders normalising to ISO
strings (single + multi).
- Dedicated `TestRenderSetValueRepr` covering the 512-char truncation
boundary with the shared helper.
- `set_failed` event assertion on both schema-validation failure paths
(scalar-with-schema, multi-with-mismatch), in addition to the existing
template-error case.
- New `TestSetInParallelGroup.test_set_in_parallel_emits_set_events`
and `test_set_in_parallel_output_schema_enforced` lock in the
group-parity fixes.
- New `TestSetInForEach.test_set_in_for_each_emits_set_events_per_item`
asserts per-iteration `set_started`.
- New `TestSyntheticReplaySetStep` covers the web `_synth_agent_or_script`
set branch for scalar, dict, and declared-output-type cases, plus
the shared-helper invariant.
- Renamed `TestSetCheckpointRoundtrip` →
`TestSetContextSerialization` (the test exercises
`WorkflowContext.to_dict`/`from_dict` directly; engine resume is
covered separately by the CLI suite).
All 2798 unit tests pass; lint clean; frontend builds clean.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ctor skill (#221) User-facing documentation now covers the new set step type added by the engine in this PR. Each surface follows the conventions already established by the script-step and sub-workflow docs. - **README.md**: add "Set steps" to the Features list and a row for `set-step.yaml` in the Examples table. - **examples/README.md**: new "Set Step Examples" section with both severity branches shown as runnable commands. - **docs/workflow-syntax.md**: full Set Steps section after Script Steps — schema, type-detection table, multi-binding ordering rule, routing semantics for dict vs scalar outputs, output-schema rules, composition with parallel / for-each, restrictions, and event contract. - **plugins/conductor/skills/conductor/SKILL.md**: new `type: set` row in the Key Concepts table. - **plugins/conductor/skills/conductor/references/yaml-schema.md**: add set-only fields (`value`, `values`, `output_type`) to the AgentDef block, a Set agent restrictions paragraph, and a new "Set Agent Schema" section covering type detection, routing, composition, and event payload shape. - **plugins/conductor/skills/conductor/references/authoring.md**: full "Set Steps" how-to after Script Steps mirroring the workflow-syntax depth. - **AGENTS.md**: extend the Workflow Execution Flow with a step for set dispatch (including the shared `_run_set_step` helper), and add a Key Patterns bullet covering set typing, `_to_json_safe`, `WorkflowContext.store` widening, and `_add_agent_input` non-dict semantics. No code changes. `make lint` clean; `make validate-examples` clean; targeted `pytest -k 'set or synthetic'` still 184 passing. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…#221) The earlier commits in this PR plumbed set_started/set_completed/ set_failed events end-to-end (engine → web server → frontend types → store reducers) and added set_output_type / set_output_keys / set_value_repr fields to NodeData. However, two visual surfaces still fell through to the default LLM-agent rendering: - graph-layout.ts only mapped script/human_gate/workflow to specific flow node types — set steps rendered as AgentNode (chat-bubble icon, empty model area, no value preview). - DetailPanel's switch dispatched to ScriptDetail/GateDetail/etc with AgentDetail as default — set steps opened AgentDetail, which expects output/model/tokens/cost_usd/iterationHistory and ignored the set- specific fields the reducer carefully populated. This commit adds the missing visual treatment: - SetNode.tsx: graph node modelled on ScriptNode but with a Variable lucide icon and a set-specific stats line — key count for multi- binding outputs, value preview for single-binding scalars, plus elapsed time. Uses the same status colours, transition animations, and live-elapsed timer as the other node types. - SetDetail.tsx: detail panel with a "Set" status badge, metadata grid (Elapsed, Output Type, Bindings), and an OutputViewer that surfaces value_repr (already JSON-truncated server-side at ~512 chars) with the standard expand/copy controls. - WorkflowGraph: register setNode in the nodeTypes map. - graph-layout: add `nodeType === 'set'` → `setNode` mapping. - DetailPanel: add `case 'set': return SetDetail` to the dispatch switch. Verified via `npm run build` (2089 modules, +2), `make lint` clean, `pytest -k 'set or synthetic'` still 184 passing, and `conductor run examples/set-step.yaml` produces the expected output. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #221.
Summary
Adds a new agent type that evaluates one or more Jinja2 expressions and binds the results into the workflow context — no LLM call, no subprocess, no I/O.
Two surface forms (mutually exclusive):
See
examples/set-step.yamlfor a runnable demo with a route that branches on a derived boolean.Design highlights
datetime/date/timebecome ISO 8601 strings; non-string dict keys and any other non-JSON-safe value raiseExecutionErrorso checkpoint round-trips never silently change a stored type. Empty / whitespace renders become"", notNone. Explicitoutput_type:(single-valueonly) supportsstring,number,integer,boolean,list,dict.values:renders every binding against the original pre-step context — no left-to-right ordering — per the issue's resolution. Users who want ordered dependencies chain multiplesetsteps.value/values.*templates for parallel groups so asetstep can never silently miss a sibling's output via the pre-group snapshot.{{ output.<key> }}and (via simpleeval) bare<key>access; scalars expose{{ output }}.output:schema raisesValidationErrorwith a suggestion to switch tovalues:. The same validation runs in linear, parallel-group, and for-each positions (sharedWorkflowEngine._run_set_stephelper).set_started/set_completed/set_failedevents mirror the script-step pattern and are emitted in all three positions (main loop, parallel group, for-each iteration). The CLI verbose log, the web dashboard's synthetic replay path, and the frontend event-type union / store reducers / log + activity builders all gain coverage. A new node typesetis added toNodeTypeandNodeDatacarriesset_output_type,set_output_keys,set_value_reprfor the detail panel.SetCompletedData.output_typeandSetOutput.output_typeare typed with a sharedSetOutputTypeliteral union.WorkflowContext.storenow accepts any value;_add_agent_inputhandles non-dict outputs (deep-copy for dicts, shallow for scalars/lists;KeyErrorwith a "is a<type>, not a dict" message when shorthand field access targets a scalar). The two trim strategies (truncate,summarize) skip / JSON-summarise non-dict outputs and log at debug level so a user debugging "context still too big" can see why.render_set_value_repr+SET_VALUE_REPR_MAXlive inset_step.pyand are reused byWorkflowEngineand the web server's synthetic replay path so the dashboard preview is consistent across live and resumed runs.Files
src/conductor/config/schema.py,src/conductor/config/validator.pysrc/conductor/executor/set_step.py(new),src/conductor/engine/context.py,src/conductor/engine/workflow.pysrc/conductor/cli/run.py,src/conductor/web/server.py,src/conductor/web/frontend/src/types/events.ts,src/conductor/web/frontend/src/stores/workflow-store.ts,src/conductor/web/frontend/src/lib/constants.tsexamples/set-step.yamltests/test_config/test_set_schema.py,tests/test_executor/test_set_step.py,tests/test_engine/test_set_workflow.py,tests/test_web/test_server.py, plus regression cases intests/test_engine/test_context.pyAGENTS.mdmentions the new executor moduleReview pass
Ran the full pr-review skill (code, tests, comments, error handling, type design) on the initial commit. The follow-up commit (
fix: address PR review findings on set step) lands the substantive findings:set_*events + run schema validation (was silently skipped)._to_json_safenow raises on non-string dict keys (was silentlystr()-coerced).render_set_value_repr+SET_VALUE_REPR_MAX(was duplicated 512-char literal in workflow.py + server.py).output_type(was hard-coded"auto").SetOutput.output_type+ frontendSetCompletedData.output_typetyped withSetOutputTypeliteral (wasstr).set_failedevents, parallel/for-each event parity, web synthetic replay branch,render_set_value_reprtruncation boundary, end-to-end date→ISO normalisation (single + multi).Validation
make lintcleanmake typecheck— single pre-existing warning indialog_evaluator.py, no new onespytest -k "set or synthetic": 184 passedmake validate-examples: cleannpm run build: cleanconductor run examples/set-step.yamlwith both severity branches produce the expected outputsAcceptance criteria
type: setaccepted by the YAML schema with mutually exclusivevalue:/values:value:step binds tostep.outputas the scalar/objectvalues:step binds each key tostep.output.<key>output:schema validation applies if specified (dict outputs; enforced in main loop, parallel groups, and for-each)examples/showing derived flags + a route that branches on oneoutput_type, schema rejection of invalid combinations, routing on a set output