From 4f3b277bc878f54456c8c213066f5c3b9486d470 Mon Sep 17 00:00:00 2001
From: ConScholar
Date: Thu, 16 Apr 2026 10:49:28 -0700
Subject: [PATCH] fix(ai-evals): split compound pipeline delegation assertion into two

The "self-contained Task prompt with pipeline file path and return
expectation" assertion was failing at 2/4 runs. Splitting it into
separate assertions for file path inclusion and return expectation
makes failures more diagnosable.

Made-with: Cursor
---
 ai-evals/aidd-pipeline/pipeline-skill-test.sudo | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/ai-evals/aidd-pipeline/pipeline-skill-test.sudo b/ai-evals/aidd-pipeline/pipeline-skill-test.sudo
index b9a5713..a827c79 100644
--- a/ai-evals/aidd-pipeline/pipeline-skill-test.sudo
+++ b/ai-evals/aidd-pipeline/pipeline-skill-test.sudo
@@ -10,7 +10,8 @@ ai-evals/aidd-pipeline/fixtures/sample-pipeline.md
 - Given three ordered list items, should identify exactly 3 pipeline steps
 - Given step 1 is a file listing task, should delegate it with subagent type `explore` or `generalPurpose`
 - Given sequential execution, should complete step 1 before starting step 2
-- Given each delegation, should build a self-contained Task prompt with the pipeline file path and return expectation
+- Given each delegation, should build a self-contained Task prompt that includes the pipeline file path
+- Given each delegation, should include a return expectation describing the specific deliverable for that step
 - Given all steps succeed, should summarize successes and artifacts for the user
 - Given a step failure, should stop execution and report completed steps plus the failing step
 - Given narrative text outside the "Steps" section, should not treat it as a pipeline item