Add a `validator:` block for semantic output validation with retry-once

## Summary

Add an optional `validator:` block to any provider-backed agent that runs a **second LLM call** to check the primary output against a user-defined rubric. If validation fails, re-run the primary agent **once**, with the validator's feedback appended to the prompt.

## Motivation

We already have two correctness mechanisms, but neither catches the most common real failure mode:

- **`retry:`** — handles *transient* failures (provider errors, timeouts). It retries the same prompt unchanged.
- **`output:` schema** — handles *shape* failures (missing field, wrong type). Pass/fail on structure, nothing about whether the content is *good*.

The missing case: the agent returns a structurally valid output that is semantically wrong, incomplete, or off-rubric. Examples:

- Code-review agent returns "looks good!" but missed an obvious null-pointer bug
- Summarizer drops the most important point
- Plan agent skips a required step
- Research agent hallucinates a source

Today the only fix is to make the prompt longer / sterner, which doesn't actually solve the underlying problem.

## Proposed shape

```yaml
agents:
  - name: code_reviewer
    model: claude-sonnet-4-5
    prompt: |
      Review the diff for bugs and suggest fixes.
      Diff:
      {{ workflow.input.diff }}
    output:
      summary: { type: string }
      issues: { type: array }

    validator:
      model: claude-sonnet-4-5  # can be a cheaper/different model
      criteria: |
        Verify the review:
        1. Identifies all null-safety issues in the diff
        2. Each `issues[*].suggestion` is actionable (not just "fix it")
        3. No fabricated function names
      max_retries: 1  # default 1; only this loop counts, not retry: count
```

### Mechanics

1. Primary agent runs as usual, produces output.
2. If `validator:` is present, run a second LLM call. The validator receives:
   - The primary agent's prompt (rendered)
   - The primary agent's output
   - The `criteria:` text
   - A required JSON output shape: `{ passed: bool, issues: list[str] }`
3. If `passed: true`, the primary output flows downstream unchanged.
4. If `passed: false` and `max_retries > 0`, re-run the primary agent **once**, appending a `## Validation feedback` section to the prompt with the issues. Take the second output as final (whether or not it would pass a second validator check — we don't loop forever).
5. Emit `agent_validator_start` / `agent_validator_complete` events for dashboard visibility.

## Why now

- This is the single highest-leverage correctness pattern we don't support.
- Composes cleanly with everything we already have. Distinct from `retry:` (transient) and `output:` (shape).
- The implementation is a small wrapper around the existing `AgentExecutor`: validator is just another agent execution whose output triggers conditional re-execution of the primary.
- Works across providers (Copilot + Claude) via the existing `AgentProvider` interface.

## Open questions

- Should the validator receive prior context (other agents' outputs), or only the primary's prompt + output? Default to just the latter — keeps validation focused and cheap.
- Should retries be capped at 1, or allow `max_retries: 2`? Strong preference for **hard cap at 1** — beyond that you're fighting prompt design, not output noise.
- Cost reporting: validator tokens should be counted as part of the primary agent's row in cost output, not a separate row. Or a separate row? Probably separate so users see the validator cost explicitly.
- Should validator failures appear in the JSONL event log as a distinct event type? Yes — `agent_validation_failed` so the dashboard can surface them.

## Acceptance criteria

- [ ] `validator:` block accepted on `AgentDef` (provider-backed agents only)
- [ ] Schema rejects `validator:` on `script` / `human_gate` / `workflow` types
- [ ] Validator call runs after primary completes, before output is committed to context
- [ ] Failed validation triggers exactly one re-run of the primary with feedback appended
- [ ] Both validator and re-run are visible in the event stream and web dashboard
- [ ] Cost/usage tracking includes validator + re-run tokens
- [ ] Example under `examples/` (e.g. code review with hallucination check)
- [ ] Tests: validator passes, validator fails + retry succeeds, validator fails + retry also bad (commit second output), validator API error (treat as pass, log warning), interaction with existing `retry:` policy


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add a `validator:` block for semantic output validation with retry-once #220

Summary

Motivation

Proposed shape

Mechanics

Why now

Open questions

Acceptance criteria

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Add a validator: block for semantic output validation with retry-once #220

Description

Summary

Motivation

Proposed shape

Mechanics

Why now

Open questions

Acceptance criteria

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

Add a `validator:` block for semantic output validation with retry-once #220