Hi! I'm building a spec decomposition pipeline (SDLC → epics/tasks) using ChatDev (DevAll) and a local model (qwopus 3.5 via OpenAI-compatible API).
I've implemented the Python tools and subgraph YAMLs, but I'm struggling with the correct wiring — the workflow either fails validation or the agents don't seem to invoke tools properly. I'd appreciate a design review rather than treating this as a bug.
What I'm trying to achieve
START → Spec Scanner (subgraph) → Task Extractor (agent + tool) → Decomposer (subgraph) → Quality Gate (agent) → [REVISE ↻|APPROVED →] Summarizer (agent + tool) → SnippetWriter (agent + tool) → [Loop Gate → loop or END]
What I've done
Custom Python tools in functions/:
- extract_tasks_from_markdown(folder_path, output_json)
- generate_context_summary(tasks_json, summary_file, project_root)
- write_snippets(folder_path, snippets_file)
Subgraphs:
- subgraphs/spec_scanner.yaml — scans markdown files
- subgraphs/reflexion_loop.yaml — iterative decomposition
Main workflow (see below)
Where I need help
-
Subgraph syntax — I used type: subgraph with nested config.type: file + config.path. I suspect the correct field is config.graph_path at the top level of config. Can someone confirm the exact schema?
-
Tooling / function calling — My tools are in functions/ and seem to load, but the agent either:
hallucinates the arguments, or the tool result is not fed back into the agent context.
Is the tooling block above the right shape? Should I use type: function with auto_load: true or explicitly reference the module path? Any gotchas with local / non-OpenAI models and tool schemas?
-
Passing variables into prompts — I hardcoded <PROJECT_ROOT> inside the role prompt, expecting it to be substituted from the input payload. What is the correct interpolation syntax? {{project_root}}? ${inputs.project_root}? Or should I use a literal / variable node to inject it into the context?
Loop timer semantics — Loop Gate has two outgoing edges with condition: 'true' (one to Spec Scanner, one to END). My intent is timer not expired
markdown_task_extractor.py
version: 0.0.0
vars:
MODEL_NAME: qwopus
BASE_URL: http://192.168.9.2:1113/v1
API_KEY: dsa
SNIPPETS_FILE: docs/snippets.md
graph:
id: SDLCSpecs_v2
description: SDLCSpecs v2 — scans project specs, extracts tasks, decomposes with LLM reflexion loop, generates summaries and snippets. Supports single, batch, and nightly continuous modes.
log_level: DEBUG
is_majority_voting: false
nodes:
- id: START
type: passthrough
config:
only_last_message: true
description: ''
context_window: 0
- id: END
type: passthrough
config:
only_last_message: true
description: ''
context_window: 0
- id: Spec Scanner
type: subgraph
config:
type: file
config:
path: subgraphs/spec_scanner.yaml
description: Scan project files and aggregate spec content
context_window: 0
- id: Decomposer
type: subgraph
config:
type: file
config:
path: subgraphs/reflexion_loop.yaml
description: Reflexion loop for iterative task decomposition
context_window: 0
- id: Loop Gate
type: loop_timer
config:
max_duration: 3600
duration_unit: seconds
reset_on_emit: true
message: Nightly run complete — starting next cycle
passthrough: false
description: Timer gate for nightly continuous mode
context_window: 0
- id: Task Extractor
type: agent
config:
name: ${MODEL_NAME}
provider: openai
base_url: ${BASE_URL}
api_key: ${API_KEY}
role: |-
You are a task extractor in the SDLCSpecs pipeline.
You receive aggregated spec scan results (files with their content). Your job is to extract all actionable tasks, TODOs, and specifications.
Use the `extract_tasks_from_markdown` tool to extract tasks from markdown files in the project at the provided folder path.
The project root path is in the "project_root" field of the input.
Steps: 1. Call extract_tasks_from_markdown(folder_path="<PROJECT_ROOT>",
output_json="docs/tasks/extracted_tasks.json")
2. Review the extracted tasks 3. Identify missing tasks that weren't caught by regex 4. Output: the full extracted tasks JSON plus your additional findings
Return ONLY the result. No extra text.
tooling:
- type: function
config:
auto_load: true
tools:
- name: extract_tasks_from_markdown
thinking: null
memories: []
skills: null
retry:
enabled: true
max_attempts: 2
min_wait_seconds: 1
max_wait_seconds: 5
description: Extract tasks from scanned spec files
context_window: 0
log_output: true
- id: Quality Gate
type: agent
config:
name: ${MODEL_NAME}
provider: openai
base_url: ${BASE_URL}
api_key: ${API_KEY}
role: |-
### Role: You are a "Quality Inspector" in the SDLCSpecs pipeline.
### Context:
You receive the decomposition output from the Reflexion loop.
Your job is to evaluate the quality of the decomposition.
### Evaluation Criteria:
1. Completeness: Are all extracted tasks covered?
2. Granularity: Are tasks decomposed to actionable size?
3. Clarity: Are task descriptions clear and unambiguous?
4. Consistency: No conflicting or overlapping tasks?
### Decision:
- If quality is acceptable (score >= 7/10):
Output: VERDICT: APPROVED
- If quality needs improvement (score < 7/10):
Output: VERDICT: REVISE
Followed by specific improvement suggestions.
### Output format:
Score: <0-10>
Issues: <list of issues if any>
VERDICT: APPROVED|REVISE
params:
temperature: 0.1
max_tokens: 500
tooling: []
thinking: null
memories: []
skills: null
retry: null
description: Evaluate decomposition quality
context_window: 0
log_output: true
- id: SnippetWriter
type: agent
config:
name: ${MODEL_NAME}
provider: openai
base_url: ${BASE_URL}
api_key: ${API_KEY}
role: |-
You are a snippet writer in the SDLCSpecs pipeline.
The project root path is available from the input in the "project_root" field.
Call write_snippets(
folder_path="<PROJECT_ROOT>",
snippets_file="docs/snippets.md"
) where <PROJECT_ROOT> is the value from the input's project_root field.
Return ONLY the function result. No extra text.
tooling:
- type: function
config:
auto_load: true
tools:
- name: write_snippets
thinking: null
memories: []
skills: null
retry: null
description: Write code snippets based on analysis
context_window: 0
log_output: true
- id: Summarizer
type: agent
config:
name: ${MODEL_NAME}
provider: openai
base_url: ${BASE_URL}
api_key: ${API_KEY}
role: |-
You are a summary generator in the SDLCSpecs pipeline.
The project root path is available from the input in the "project_root" field.
Call generate_context_summary(
tasks_json="docs/tasks/decomposed_tasks.json",
summary_file="docs/tasks/context_summary.md",
project_root="<PROJECT_ROOT>"
) where <PROJECT_ROOT> is the value from the input's project_root field.
Return ONLY the function result. No extra text.
tooling:
- type: function
config:
auto_load: true
tools:
- name: generate_context_summary
thinking: null
memories: []
skills: null
retry: null
description: Generate context summary from decomposed tasks
context_window: 0
log_output: true
edges:
- from: START
to: Spec Scanner
trigger: true
condition: 'true'
carry_data: true
keep_message: true
clear_context: false
clear_kept_context: false
- from: Spec Scanner
to: Task Extractor
trigger: true
condition: 'true'
carry_data: true
keep_message: true
clear_context: false
clear_kept_context: false
- from: Task Extractor
to: Decomposer
trigger: true
condition: 'true'
carry_data: true
keep_message: true
clear_context: false
clear_kept_context: false
- from: Decomposer
to: Quality Gate
trigger: true
condition: 'true'
carry_data: true
keep_message: true
clear_context: false
clear_kept_context: false
- from: Quality Gate
to: Decomposer
trigger: true
condition:
type: keyword
config:
any:
- REVISE
none: []
regex: []
carry_data: true
keep_message: true
clear_context: false
clear_kept_context: false
- from: Quality Gate
to: Summarizer
trigger: true
condition:
type: keyword
config:
any:
- APPROVED
none: []
regex: []
carry_data: true
keep_message: true
clear_context: false
clear_kept_context: false
- from: Summarizer
to: SnippetWriter
trigger: true
condition: 'true'
carry_data: true
keep_message: true
clear_context: false
clear_kept_context: false
- from: SnippetWriter
to: Loop Gate
trigger: true
condition: 'true'
carry_data: true
keep_message: true
clear_context: false
clear_kept_context: false
- from: Loop Gate
to: Spec Scanner
trigger: true
condition: 'true'
carry_data: true
keep_message: false
clear_context: false
clear_kept_context: false
- from: Loop Gate
to: END
trigger: true
condition: 'true'
carry_data: true
keep_message: false
clear_context: false
clear_kept_context: false
memory: []
initial_instruction: 'Run SDLCSpecs v2 pipeline: scan project specs, extract tasks, decompose with LLM reflexion loop, validate quality, generate summary and snippets. Supports single-run (no timer) or nightly (timer-based loop).'
start:
- START
end:
- END
graph:
id: spec_scanner
description: Scans project files for specifications, reads them in parallel, and aggregates results.
log_level: DEBUG
is_majority_voting: false
nodes:
- id: START
type: passthrough
config:
only_last_message: true
description: ''
context_window: 0
- id: END
type: passthrough
config:
only_last_message: true
description: ''
context_window: 0
- id: File Finder
type: agent
config:
name: ${MODEL_NAME}
provider: openai
role: |-
You are a file discovery agent. The user provides a project root path.
Use the `code_executor` tool to run Python code to find files. Example code:
```python
import os
root = "/path/to/project"
skip_dirs = {"node_modules", ".git", "__pycache__", "dist", "build", ".venv"}
for dirpath, dirnames, filenames in os.walk(root):
dirnames[:] = [d for d in dirnames if d not in skip_dirs]
for f in filenames:
if f.endswith(('.md', '.yaml', '.yml')):
print(os.path.join(dirpath, f))
```
After running the code, collect the results and output:
<file>: /path/to/file1.md
<file>: /path/to/file2.yaml
If no files found, output exactly: <no files found>
**TERMINATION CONDITION: Stop after finding up to 50 files. If more than 50 files are found, output exactly: "Terminated: Found more than 50 files, showing first 50 only." Do not continue scanning further.**
base_url: ${BASE_URL}
api_key: ${API_KEY}
params:
max_rounds: 10
max_tool_calls: 20
tooling:
- type: function
config:
tools:
- name: file:All
- name: code_executor:All
timeout: null
prefix: ''
thinking: null
memories: []
retry:
enabled: true
max_attempts: 2
min_wait_seconds: 1
max_wait_seconds: 5
description: Finds all spec-related files in the project
context_window: 0
- id: File Reader
type: agent
config:
name: ${MODEL_NAME}
provider: openai
role: |-
### Role: You are a "File Content Reader" in the SDLCSpecs system.
### Context:
You receive a single file path. Read its content and extract structured info.
### Task:
1. Read the file using available tools
2. Identify spec type:
- .md: extract sections, headings, task descriptions, TODO items
- .yaml/.yml: identify workflow structure, nodes, edges
- .py: identify functions, classes, docstrings, TODO comments
### Output format:
File: <path>
Type: <md|yaml|py|other>
Summary: <1-2 sentence summary>
Key Items:
- <key finding 1>
Content:
```
<full file content>
```
base_url: ${BASE_URL}
api_key: ${API_KEY}
params: {}
tooling:
- type: function
config:
tools:
- name: file:All
- name: code_executor:All
timeout: null
prefix: ''
thinking: null
memories: []
retry: null
description: Reads and summarizes a single spec file
context_window: 0
- id: Content Aggregator
type: agent
config:
name: ${MODEL_NAME}
provider: openai
role: |-
### Role: You are a "Content Aggregation Specialist" in the SDLCSpecs system.
### Context:
You receive multiple file reading reports. Combine them into one overview.
### Task:
1. Collect all incoming file reports
2. Group files by type (md, yaml, py)
3. Produce a final aggregated document
### Output format:
# Scan Results
## Summary
Total files scanned: <count>
Files by type:
- Markdown (.md): <count>
- YAML (.yaml): <count>
- Python (.py): <count>
## Files
### <path>
- Type: <type>
- Summary: <summary>
- Key Items:
- <item 1>
...
base_url: ${BASE_URL}
api_key: ${API_KEY}
params: {}
tooling: []
thinking: null
memories: []
retry: null
description: Aggregates all file reading results
context_window: 0
edges:
- from: START
to: File Finder
trigger: true
condition: 'true'
carry_data: true
keep_message: true
- from: File Finder
to: File Reader
trigger: true
condition:
type: keyword
config:
any:
- '<file>:'
none: []
regex: []
carry_data: true
keep_message: false
dynamic:
type: map
split:
type: regex
config:
pattern: <file>:\s*(.*)
config:
max_parallel: 10
- from: File Finder
to: END
trigger: true
condition:
type: keyword
config:
any: []
none:
- '<file>:'
regex: []
carry_data: true
keep_message: true
- from: File Reader
to: Content Aggregator
trigger: true
condition: 'true'
carry_data: true
keep_message: true
- from: Content Aggregator
to: END
trigger: true
condition: 'true'
carry_data: true
keep_message: true
memory: []
initial_instruction: ''
start:
- START
end:
- END
version: 0.0.0
vars:
MODEL_NAME: qwopus
ersion: 0.4.0
graph:
id: reflexion_loop
description: Reflexion loop subgraph with actor/evaluator and memory storage.
log_level: INFO
is_majority_voting: false
start:
- Task
end:
- Final Synthesizer
memory:
- name: reflexion_blackboard
type: blackboard
config:
max_items: 500
nodes:
- id: Task
type: passthrough
config: {}
- id: Reflexion Actor
type: agent
description: Actor (πθ) generates a strategy draft based on blackboard experience and short-term context.
config:
provider: openai
base_url: ${BASE_URL}
api_key: ${API_KEY}
name: qwopus
input_mode: messages
role: |
You are the Actor. If there are relevant memories, refer to that experience and output the latest action draft; if there are no relevant memories, provide an action draft to the best of your ability.
- Structure:
Thought: ...
Draft: ...
memories:
- name: reflexion_blackboard
retrieve_stage:
- gen
top_k: 5
read: true
write: false
params:
temperature: 0.2
max_tokens: 1200
- id: Reflexion Evaluator
type: agent
description: Evaluator (Me) provides scores and improvement directions for the Actor's draft.
config:
provider: openai
base_url: ${BASE_URL}
api_key: ${API_KEY}
name: qwopus
input_mode: messages
role: |
You are the Evaluator. Receive and read the Actor's latest output and task objectives, and evaluate whether they meet the goals.
Append `Verdict: CONTINUE` or `Verdict: STOP` at the end of the output.
When you think the current plan is good enough, you should give `Verdict: STOP`. Other fields can be skipped.
Output:
- Score: <0-1>
- Reason: <Failure reasons or highlights>
- Next Focus: <Key points to focus on in the next round>
- Verdict: CONTINUE|STOP
params:
temperature: 0.1
max_tokens: 800
- id: Self Reflection Writer
type: agent
description: Self-Reflection (Msr) converts Evaluator results into reusable experience.
config:
provider: openai
base_url: ${BASE_URL}
api_key: ${API_KEY}
name: qwopus
input_mode: messages
role: |
You are responsible for refining the Evaluator output and Actor Draft into JSON experience:
{
"issues": [..],
"fix_plan": [..],
"memory_cue": "A short reminder"
}
- JSON must not contain extra text.
memories:
- name: reflexion_blackboard
read: false
write: true
params:
temperature: 0.1
max_tokens: 500
- id: Final Synthesizer
type: agent
description: Converge the final answer, absorbing the latest Draft and Evaluator tips.
config:
provider: openai
base_url: ${BASE_URL}
api_key: ${API_KEY}
name: qwopus
input_mode: messages
role: |
Please synthesize all inputs and provide a final answer. Be comprehensive. Do not include any extra text other than the final answer.
params:
temperature: 0.1
max_tokens: 1000
edges:
- from: Task
to: Reflexion Actor
keep_message: True
- from: Task
to: Reflexion Evaluator
keep_message: True
trigger: false
- from: Reflexion Actor
to: Reflexion Actor
trigger: false
- from: Reflexion Actor
to: Reflexion Evaluator
- from: Reflexion Evaluator
to: Self Reflection Writer
condition: need_reflection_loop
- from: Self Reflection Writer
to: Reflexion Actor
carry_data: true
- from: Reflexion Actor
to: Final Synthesizer
trigger: false
- from: Reflexion Evaluator
to: Final Synthesizer
condition: should_stop_loop
carry_data: false
loop, timer expired → exit. Will both edges fire? What is the canonical pattern for a conditional loop break?
Context window — I set context_window: 0 everywhere. Does this strip all previous messages, making agents "forget" prior step outputs? What is a sensible default for a multi-step agent pipeline?
Hi! I'm building a spec decomposition pipeline (SDLC → epics/tasks) using ChatDev (DevAll) and a local model (qwopus 3.5 via OpenAI-compatible API).
I've implemented the Python tools and subgraph YAMLs, but I'm struggling with the correct wiring — the workflow either fails validation or the agents don't seem to invoke tools properly. I'd appreciate a design review rather than treating this as a bug.
What I'm trying to achieve
What I've done
Custom Python tools in functions/:
Subgraphs:
Main workflow (see below)
Where I need help
Subgraph syntax — I used type: subgraph with nested config.type: file + config.path. I suspect the correct field is config.graph_path at the top level of config. Can someone confirm the exact schema?
Tooling / function calling — My tools are in functions/ and seem to load, but the agent either:
hallucinates the arguments, or the tool result is not fed back into the agent context.
Is the tooling block above the right shape? Should I use type: function with auto_load: true or explicitly reference the module path? Any gotchas with local / non-OpenAI models and tool schemas?
Passing variables into prompts — I hardcoded <PROJECT_ROOT> inside the role prompt, expecting it to be substituted from the input payload. What is the correct interpolation syntax? {{project_root}}? ${inputs.project_root}? Or should I use a literal / variable node to inject it into the context?
Loop timer semantics — Loop Gate has two outgoing edges with condition: 'true' (one to Spec Scanner, one to END). My intent is timer not expired
markdown_task_extractor.py
loop, timer expired → exit. Will both edges fire? What is the canonical pattern for a conditional loop break?
Context window — I set context_window: 0 everywhere. Does this strip all previous messages, making agents "forget" prior step outputs? What is a sensible default for a multi-step agent pipeline?