feat(api,ui): showcase pipeline agent ops final polish (PRP-41)#323
Conversation
PRP-41 — fourth and FINAL slice of the /showcase upgrade epic (PRP-38..41). Adds two new pipeline phases on scenario=showcase_rich plus cross-cutting UI polish that closes issue #311. Pipeline (backend / app/features/demo/pipeline.py) - step_agent_hitl_flow: HITL approval round-trip on the experiment agent. Drives POST /agents/sessions + /chat + /approve via ASGITransport; surfaces an intermediate step_complete (status=running, awaiting_approval=true) for the FE to render the Approve button; absorbs 400 "No pending action" when the FE pre-empts; 90 s hard timeout falls back to skip so a hung agent never wedges the run. - step_ops_snapshot: 3 GET calls to /ops/summary + /ops/retraining-candidates + /ops/model-health, derives a 5-key KPI payload (stale_aliases_count, retraining_candidates_count, total_runs, total_aliases, degrading_health_count). warn (never fail) on all-three-failed. - _phase_table() — design Z: unified `agents` phase id for BOTH scenarios; SHOWCASE_RICH swaps step_agent for step_agent_hitl_flow and appends an ops phase carrying ops_snapshot before cleanup. SHOWCASE_RICH = 24 rows / 10 phases; DEMO_MINIMAL = 11 rows (unchanged shape under the new agents phase id). - _Client.yield_event hook + run_pipeline event-sink drain. The orchestrator stamps step_index / total_steps / phase_index / phase_total / phase_name on every drained intermediate event. Frontend (UI) - PHASE_DEFS.ts — design Z restructure: BOTH the legacy `agent` step and the new `agent_hitl_flow` live under the unified `agents` phase id; new DEMO_MINIMAL_ONLY_STEP_NAMES set complements SHOWCASE_RICH_STEP_NAMES so the filter selects the right step per scenario (lockstep test pins 24 tuples / 10 phases). - DemoPhasePanel.tsx — adds onValueChange handler + local useState (closes issue #311 / D10): post-pipeline-complete the operator can finally expand any phase without snapping back to the fallback. - demo-step-card.tsx — HitlFlowSummary chip-line + OpsSnapshotMiniGrid + one-click ApproveButton (only renders when status=running AND awaiting_approval=true). - showcase.tsx — five new chrome additions: - ShowcaseKpiStrip — 5-tile KPI strip above the controls card. - RunHistoryStrip — localStorage FIFO 5 with Replay button. - Stop button (visible mid-run) — closes the WS so the backend's WebSocketDisconnect releases the pipeline lock. - InspectArtifactsPanel — 10 deep-link cards rendered after pipeline_complete. - resolveInspectHref switch extended with agent_hitl_flow → CHAT, ops_snapshot → OPS. - use-demo-pipeline.ts — stop() callback exposed via UseDemoPipelineResult; DemoSummary.v2RunId added (mapped from pipeline_complete event.data.v2_run_id). Docs - docs/user-guide/showcase-walkthrough.md — drops 7 "planned" markers across PRP-38/39/40/41 phases; adds concrete prose for Agents (HITL) + Ops snapshot + the 5 polish items + performance budgets table refresh + screenshot placeholders. - docs/_base/RUNBOOKS.md — 5 new failure-mode entries (23-27): agent_hitl_flow no-key / timeout / no-trigger, ops_snapshot all-failed, Stop button mid-run. Tests - Backend: 9 new tests in test_pipeline.py (HITL: happy / no-key / session-fail / no-tool / 4xx-absorb / timeout + Ops: happy / warn / empty); lockstep test rewrite 23 → 24 tuples; 5 new canned-response fixtures for /ops/* endpoints. - Frontend: 22 new vitest cases across 5 test files (DemoPhasePanel onValueChange, ShowcaseKpiStrip 5-tile derivation, InspectArtifactsPanel 10-card grid, RunHistoryStrip localStorage FIFO, demo-step-card HITL + Approve + Ops mini-grid). - E2E: test_run_demo_showcase_rich_full_epic asserts PRP-41 contract shapes hold when the steps execute; tolerates a pre-existing PRP-39/40 cascade (scenario_simulate_and_save can fail to parse the safer_promote_flow placeholder artifact_uri) documented in RUNBOOKS.md entry 18. Validation - ruff + format clean; mypy + pyright strict (only pre-existing xgboost/lightgbm stub gaps remain — documented in PRP body). - 1635 unit tests pass; 249 frontend tests pass. - Vertical-slice guard empty: zero imports from agents/ops/registry/ scenarios/rag in app/features/demo/. Out of scope (explicit) - No new backend endpoints, no new schemas, no Alembic migrations. - No widening of agent_require_approval (save_scenario already listed; HITL step consumes it). - No CRLF/LF line-ending normalisation bundled in. Contract probe report: PRPs/ai_docs/prp-41-contract-probe-report.md
There was a problem hiding this comment.
Sorry @w7-mgfcode, your pull request is larger than the review limit of 150000 diff characters
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary
PRP-41 — fourth and FINAL slice of the
/showcaseupgrade epic (PRP-38..41). Adds two pipeline phases (agentsHITL +opssnapshot) plus cross-cutting UI polish that closes #311.Closes: #321 (execution issue) · #311 (Phase accordion
onValueChangebug)What ships
Backend (
app/features/demo/pipeline.py)step_agent_hitl_flow— HITL approval round-trip on the experiment agent. DrivesPOST /agents/sessions+/chat+/approvevia ASGITransport; emits an intermediatestep_complete(status=running,awaiting_approval=true) for the FE Approve button; absorbs the 400 "No pending action" when the FE pre-empts the auto-approve; 90 s hard timeout falls back to skip so a hung agent never wedges the run.step_ops_snapshot— three GETs (/ops/summary,/ops/retraining-candidates,/ops/model-health) → 5-key KPI payload.warn(neverfail) on all-three-failed._Client.yield_eventopt-in hook +run_pipelineevent-sink drain. Orchestrator stampsstep_index/total_steps/phase_index/phase_total/phase_nameon every drained event before the terminalstep_complete(design Z, verified viable by Task 1 contract probe)._phase_table()— design Z: unifiedagentsphase id for BOTH scenarios; SHOWCASE_RICH swapsstep_agent → step_agent_hitl_flowand appendsops_snapshotunder a newopsphase beforecleanup. SHOWCASE_RICH = 24 rows / 10 phases; DEMO_MINIMAL = 11 rows (shape unchanged under newagentsphase id).Frontend (
frontend/src/)PHASE_DEFS.ts— design Z restructure: BOTH the legacyagentstep and the newagent_hitl_flowlive under the unifiedagentsphase id; newDEMO_MINIMAL_ONLY_STEP_NAMESset complementsSHOWCASE_RICH_STEP_NAMES. Lockstep test pins 24 tuples / 10 phases.DemoPhasePanel.tsx— addsonValueChangehandler + localuseState(closes fix(ui): unlock showcase phase accordion after completion #311 / D10): post-pipeline_completethe operator can finally expand any phase without snapping back to the fallback.demo-step-card.tsx—HitlFlowSummarychip-line +OpsSnapshotMiniGrid+ one-clickApproveButton(renders ONLY whenstatus=running && awaiting_approval=true).showcase.tsx— five chrome additions:ShowcaseKpiStrip— 5-tile KPI strip at the top.RunHistoryStrip— localStorage FIFO 5 with Replay button (keyforecastlab.showcase.runs.v1).WebSocketDisconnectreleases_pipeline_lock.InspectArtifactsPanel— 10 deep-link cards rendered afterpipeline_complete.resolveInspectHrefextended withagent_hitl_flow → CHAT,ops_snapshot → OPS.use-demo-pipeline.ts—stop()callback exposed viaUseDemoPipelineResult;DemoSummary.v2RunIdadded (mapped frompipeline_complete.data.v2_run_id).Docs
docs/user-guide/showcase-walkthrough.md— drops 7 "planned" markers across PRP-38/39/40/41 phases; adds concrete prose for Agents (HITL) + Ops snapshot + 5 polish items + performance budgets table refresh + screenshot placeholders.docs/_base/RUNBOOKS.md— 5 new failure-mode entries (23–27):agent_hitl_flowno-key / timeout / no-trigger,ops_snapshotall-failed, Stop button mid-run.Hard invariants honoured
git grep -nE "from app\.features\.(agents|ops|registry|scenarios|rag)" app/features/demo/empty.StepEvent.data: dict[str, Any]; no schema bump; no newevent_typevalues (just two new phase id VALUES:"agents"and"ops").agent_require_approvaluntouched (save_scenarioalready listed).[[repo-line-endings-crlf]]).Task 1 contract probe
Output:
PRPs/ai_docs/prp-41-contract-probe-report.md(407 lines). Zero field-level drift; all 5 unresolved contract assumptions resolved; one wording patch applied to PRP body § Task 9 (filter restructure for design Z).Validation results
uv run ruff check . && uv run ruff format --check .uv run mypy app/uv run pyright app/uv run pytest -m "not integration"cd frontend && pnpm tsc --noEmit -p tsconfig.app.jsoncd frontend && pnpm test --runuv run pytest -m integration tests/test_e2e_demo.py::test_run_demo_showcase_rich_full_epicgit grep "from app\.features\.(agents|ops|registry|scenarios|rag)" app/features/demo/Known pre-existing issue surfaced (NOT a PRP-41 regression)
The full
showcase_richflow hits a documented PRP-39/40 cascade bug on a fresh-DB run:safer_promote_flowswaps thedemo-productionalias to a placeholder run whoseartifact_uri_parse_artifact_keycannot parse, breaking the downstreamscenario_simulate_and_savestep. Documented in RUNBOOKS.md entry 18; tracked intest_run_demo_showcase_rich_full_epicas a tolerated cascade (the test still passes when this fires because all PRP-41 steps are downstream and never run — when they DO run, the PRP-41 contract assertions fire). Fixing this cascade is out of PRP-41 scope; a follow-up PRP should address it.Out of scope (explicit)
agent_require_approval.Test plan
dev-targeted PR.Contract probe report
PRPs/ai_docs/prp-41-contract-probe-report.md— Task 1 deliverable, field-for-field verification of every cited backend + frontend contract ondev@58d593a. Documents the four assumption resolutions and the one PRP wording patch applied before implementation.