| 1 | +AG-UI Decision Logic Spec (Tight Loop) v1 |
| 2 | +======================================== |
| 3 | + |
| 4 | +Status: Draft |
| 5 | +Owner: Clara Design Assistant |
| 6 | +Scope: Decision logic for switching from chat to rich UI (tables, process maps) |
| 7 | + |
| 8 | +This spec is standalone. Do not rely on other design artifacts. |
| 9 | + |
| 10 | +--- |
| 11 | + |
| 12 | +1. Goals |
| 13 | +-------- |
| 14 | +- Decide when to switch from chat to a rich UI based on user input and session state. |
| 15 | +- Minimize user typing for structured data capture. |
| 16 | +- Prevent tool thrash and ambiguous loops. |
| 17 | +- Provide deterministic, testable decision rules. |
| 18 | +- Use small models for routing and larger models for content. |
| 19 | + |
| 20 | +2. Non-Goals |
| 21 | +------------ |
| 22 | +- UI styling, layout, or frontend implementation details. |
| 23 | +- Domain-specific ontology or schema design beyond table/map scaffolds. |
| 24 | +- File upload and ingestion pipelines. |
| 25 | + |
| 26 | +3. Key Patterns Adopted (from claude-code analysis) |
| 27 | +--------------------------------------------------- |
| 28 | +- Self-refinement loop with completion promise and max iterations (Ralph loop). |
| 29 | +- Multi-phase gating with explicit approvals (feature-dev flow). |
| 30 | +- Confidence thresholding to reduce noise (code-review plugin). |
| 31 | +- Stateful checkpointing for resumption (advanced workflows). |
| 32 | +- User-defined guardrails derived from friction (hookify patterns). |
| 33 | +- Decision-point user contribution requests (learning output style). |
| 34 | + |
| 35 | +These patterns are incorporated in Sections 5 and 7-11.
| 36 | + |
| 37 | +4. Architecture Overview |
| 38 | +------------------------ |
| 39 | +Components: |
| 40 | +- Router (small model): chooses between tool, chat, and clarify, and generates tool parameters.
| 41 | +- Orchestrator (Sonnet): handles natural language responses, tool explanations, and follow-up. |
| 42 | +- Tool Menu: enumerated UI tools with strict schema. |
| 43 | +- UI State Store: tracks tool status, completion criteria, and checkpoints. |
| 44 | + |
| 45 | +Data flow: |
| 46 | +User Input -> Router -> (Tool Call or Chat) -> UI -> Tool Result -> Orchestrator |
| 47 | + |
| 48 | +5. Router Inputs and Outputs |
| 49 | +---------------------------- |
| 50 | +Inputs: |
| 51 | +- user_message (string) |
| 52 | +- session_state (see Section 7) |
| 53 | +- last_tool (name or null) |
| 54 | +- last_tool_status (open, completed, canceled) |
| 55 | +- user_preferences (opt-out rules, preferred UI) |
| 56 | + |
| 57 | +Output JSON: |
| 58 | +{ |
| 59 | + "action": "tool" | "chat" | "clarify", |
| 60 | + "tool_name": "request_data_table" | "request_process_map" | null, |
| 61 | + "confidence": 0.0-1.0, |
| 62 | + "params": { ...tool schema... } | null, |
| 63 | + "rationale": "short string for logs" |
| 64 | +} |
| 65 | + |
| 66 | +Routing thresholds: |
| 67 | +- confidence >= 0.75: execute tool call immediately |
| 68 | +- 0.45 <= confidence < 0.75: ask one clarifying question, then re-route |
| 69 | +- confidence < 0.45: forward to Orchestrator for standard chat |
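
The three thresholds above can be sketched as a pure function, which keeps the rule deterministic and easy to unit-test. This is a minimal sketch, not the Router implementation: the function name `route_by_confidence` and the constant names are hypothetical, and only the numeric cut-offs come from this spec.

```python
# Hypothetical constant names; the values are the ones listed in Section 5.
TOOL_THRESHOLD = 0.75     # at or above: execute the tool call immediately
CLARIFY_THRESHOLD = 0.45  # at or above, but below TOOL_THRESHOLD: ask once


def route_by_confidence(confidence: float) -> str:
    """Map a Router confidence score to "tool", "clarify", or "chat"."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    if confidence >= TOOL_THRESHOLD:
        return "tool"
    if confidence >= CLARIFY_THRESHOLD:
        return "clarify"
    return "chat"  # forwarded to the Orchestrator for standard chat
```

Both boundaries are inclusive on the upper band, so 0.75 routes to a tool and 0.45 routes to a clarifying question.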
| 70 | + |
| 71 | +6. Tool Menu (v1) |
| 72 | +----------------- |
| 73 | +6.1 request_data_table |
| 74 | +Use when the user needs to provide a list of items, structured data, or bulk entries. |
| 75 | +Do NOT use for single items. |
| 76 | + |
| 77 | +Parameters: |
| 78 | +{ |
| 79 | + "title": "string", |
| 80 | + "columns": [ |
| 81 | + { "name": "string", "type": "text|number|enum|date|url", "required": true|false } |
| 82 | + ], |
| 83 | + "min_rows": number, |
| 84 | + "starter_rows": number, |
| 85 | + "input_modes": ["paste", "inline", "import"], |
| 86 | + "summary_prompt": "string" |
| 87 | +} |
| 88 | + |
| 89 | +6.2 request_process_map |
| 90 | +Use when the user describes a workflow, sequence of steps, timeline, or migration path. |
| 91 | +Capture "Step A -> Step B" relationships. |
| 92 | + |
| 93 | +Parameters: |
| 94 | +{ |
| 95 | + "title": "string", |
| 96 | + "required_fields": ["step_name", "owner", "outcome"], |
| 97 | + "edge_types": ["sequence", "approval", "parallel"], |
| 98 | + "min_steps": number, |
| 99 | + "seed_nodes": ["string"] |
| 100 | +} |
| 101 | + |
| 102 | +7. Session State (UI Checkpoints) |
| 103 | +--------------------------------- |
| 104 | +Persisted fields: |
| 105 | +{ |
| 106 | + "last_tool": "string|null", |
| 107 | + "last_tool_status": "open|completed|canceled", |
| 108 | + "ui_checkpoint": { |
| 109 | + "tool": "string", |
| 110 | + "payload": { ...tool params... }, |
| 111 | + "opened_at": "timestamp", |
| 112 | + "completion_criteria": { ... }, |
| 113 | + "iteration_count": number, |
| 114 | + "max_iterations": number |
| 115 | + }, |
| 116 | + "clarifying_question_pending": boolean, |
| 117 | + "user_opt_out": { |
| 118 | + "all_tools": boolean, |
| 119 | + "tools": ["request_data_table", "request_process_map"], |
| 120 | + "expires_at": "timestamp|null" |
| 121 | + } |
| 122 | +} |
| 123 | + |
| 124 | +Checkpointing rule: |
| 125 | +- When a tool is opened, create a ui_checkpoint. |
| 126 | +- If the session resumes while last_tool_status is "open", re-open the UI stored in ui_checkpoint without re-asking the user.
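
One way to read the checkpointing rule is sketched below. The helper names `open_checkpoint` and `resume_target` are hypothetical, the session state is modeled as a plain dict with the persisted fields from this section, and leaving `completion_criteria` empty is a simplification of this sketch.

```python
from datetime import datetime, timezone
from typing import Optional


def open_checkpoint(state: dict, tool: str, params: dict,
                    max_iterations: int = 2) -> dict:
    """Record that a tool UI was opened (hypothetical helper)."""
    state["last_tool"] = tool
    state["last_tool_status"] = "open"
    state["ui_checkpoint"] = {
        "tool": tool,
        "payload": params,
        "opened_at": datetime.now(timezone.utc).isoformat(),
        "completion_criteria": {},  # left empty in this sketch
        "iteration_count": 0,
        "max_iterations": max_iterations,
    }
    return state


def resume_target(state: dict) -> Optional[dict]:
    """Return the UI to re-open on resume, or None if nothing is open."""
    checkpoint = state.get("ui_checkpoint")
    if checkpoint and state.get("last_tool_status") == "open":
        return {"tool_name": checkpoint["tool"], "params": checkpoint["payload"]}
    return None
```

On resume, a non-None result means the stored UI can be shown again directly, with no new Router call and no repeated question.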
| 127 | + |
| 128 | +8. Decision Rules (Tight Loop) |
| 129 | +------------------------------ |
| 130 | +Hard triggers (no clarifying question): |
| 131 | +- List size >= 3 (explicit or inferred: "we have 12 stakeholders"). |
| 132 | +- User mentions bulk entry, spreadsheet, or "paste a list". |
| 133 | +- Workflow markers: "first/then/after/before/next", "approval process", "pipeline", "migration steps". |
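
For test fixtures, the hard triggers can be approximated with a keyword pre-check. This is a rough sketch under stated assumptions, not the Router itself (which this spec defines as a small model): the function name `hard_trigger` is hypothetical, any standalone integer of 3 or more is treated as a list size, and workflow markers take precedence over list size. A message such as "in 2024" would produce a false positive.

```python
import re
from typing import Optional

# Marker lists copied from the hard triggers; matching is case-insensitive.
WORKFLOW_MARKERS = re.compile(
    r"\b(first|then|after|before|next|approval process|pipeline|migration steps)\b",
    re.IGNORECASE,
)
BULK_MARKERS = re.compile(r"\b(bulk entry|spreadsheet|paste a list)\b", re.IGNORECASE)
COUNT = re.compile(r"\b(\d+)\b")


def hard_trigger(message: str) -> Optional[str]:
    """Return the tool a hard trigger points to, or None if none fires."""
    if WORKFLOW_MARKERS.search(message):
        return "request_process_map"
    # Assumption: any standalone integer >= 3 is read as a list size.
    counts = [int(n) for n in COUNT.findall(message)]
    if BULK_MARKERS.search(message) or any(n >= 3 for n in counts):
        return "request_data_table"
    return None
```

A check like this is mainly useful for asserting that the Router agrees with the written rules on obvious inputs.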
| 134 | + |
| 135 | +Soft triggers (ask once, then tool): |
| 136 | +- Mentions "process" without steps -> ask: "Want to map the steps now?" |
| 137 | +- Mentions "stakeholders/risks/issues" without quantity -> ask for count. |
| 138 | +- Mentions multiple roles + any process language -> prefer process map. |
| 139 | + |
| 140 | +Anti-thrash rules: |
| 141 | +- If last_tool_status == "open", do not call another tool. |
| 142 | +- If user_opt_out.all_tools is true, do not call tools until the user re-enables them.
| 143 | +- If user cancels a tool twice in a row, switch to chat for the next turn. |
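
The anti-thrash rules can be sketched as a single guard evaluated before any tool call. The function name `tool_call_allowed` is hypothetical, and so is the `consecutive_cancels` counter: the persisted state in Section 7 has no field for it, so this sketch assumes the caller tracks it.

```python
def tool_call_allowed(state: dict, consecutive_cancels: int) -> bool:
    """Return False when any anti-thrash rule blocks a tool call."""
    # Rule 1: never stack a second tool on top of an open one.
    if state.get("last_tool_status") == "open":
        return False
    # Rule 2: a global opt-out blocks all tools until the user re-enables them.
    opt_out = state.get("user_opt_out") or {}
    if opt_out.get("all_tools"):
        return False
    # Rule 3: two cancels in a row force chat for the next turn.
    # consecutive_cancels is an assumed counter, not a field from this spec.
    if consecutive_cancels >= 2:
        return False
    return True
```

When the guard returns False, the turn falls through to chat regardless of the Router's confidence.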
| 144 | + |
| 145 | +Chat-only cases: |
| 146 | +- Greetings, explanations, clarifications, or coaching. |
| 147 | +- Single-item edits or short statements. |
| 148 | + |
| 149 | +9. Completion Logic and Self-Refinement Loop |
| 150 | +-------------------------------------------- |
| 151 | +Completion criteria: |
| 152 | +- Data table: min_rows satisfied AND required columns filled. |
| 153 | +- Process map: min_steps satisfied AND required_fields filled. |
| 154 | + |
| 155 | +Self-refinement loop (Ralph pattern): |
| 156 | +- After tool submission, run validation checks. |
| 157 | +- If criteria not met, ask a targeted fix question and re-open tool. |
| 158 | +- Max iterations: 2 (configurable per tool). |
| 159 | +- Never loop without explicit user feedback. |
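
A minimal sketch of the data-table completion check and the bounded loop follows. The names `table_complete` and `refinement_step` are hypothetical. Two readings are assumptions of this sketch: "required columns filled" is taken to mean every submitted row has a non-blank value in each required column, and the "stop" result once iterations are exhausted is a placeholder, since the spec does not say what happens at that point.

```python
def _filled(value) -> bool:
    """A cell counts as filled if it is not None and not blank."""
    return value is not None and str(value).strip() != ""


def table_complete(rows: list, columns: list, min_rows: int) -> bool:
    """min_rows satisfied AND required columns filled in every row."""
    required = [c["name"] for c in columns if c.get("required")]
    return len(rows) >= min_rows and all(
        _filled(row.get(name)) for row in rows for name in required
    )


def refinement_step(complete: bool, iteration_count: int,
                    max_iterations: int = 2) -> str:
    """Decide what follows a tool submission."""
    if complete:
        return "done"
    if iteration_count < max_iterations:
        return "ask_fix_and_reopen"  # targeted fix question, then re-open tool
    return "stop"  # assumption: behavior after the last iteration is unspecified
```

Because each step is triggered by a submission, the loop only advances on explicit user feedback.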
| 160 | + |
| 161 | +10. Validation and Confidence Gating |
| 162 | +------------------------------------ |
| 163 | +Run these validations in parallel (code-review pattern):
| 164 | +- missing_required_fields |
| 165 | +- duplicate_entries |
| 166 | +- contradictory_sequences (for process map) |
| 167 | +- low_coverage (too few rows/steps vs stated scope) |
| 168 | + |
| 169 | +Each validation returns a confidence score. Only surface issues with |
| 170 | +confidence >= 0.80 to avoid noisy warnings. |
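
The gating step can be sketched as follows, assuming each validator is a function that returns a list of scored issues. The `Issue` type, the `check_duplicates` validator, and its fixed 0.95 score are all hypothetical; only the 0.80 cut-off comes from this section. A thread pool stands in for "parallel" here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, NamedTuple

SURFACE_THRESHOLD = 0.80  # from Section 10


class Issue(NamedTuple):
    kind: str
    confidence: float
    detail: str


def check_duplicates(rows: List[dict]) -> List[Issue]:
    """Flag rows identical to an earlier row (hypothetical validator)."""
    seen, issues = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))  # assumes hashable cell values
        if key in seen:
            # 0.95 is an illustrative score, not a value from the spec.
            issues.append(Issue("duplicate_entries", 0.95, f"duplicate row: {row}"))
        seen.add(key)
    return issues


def run_validations(rows: List[dict],
                    validators: List[Callable[[List[dict]], List[Issue]]]) -> List[Issue]:
    """Run validators concurrently and keep only high-confidence issues."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda validate: validate(rows), validators))
    return [issue for issues in results for issue in issues
            if issue.confidence >= SURFACE_THRESHOLD]
```

Issues below the threshold are dropped before they reach the user, though they could still be logged.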
| 171 | + |
| 172 | +11. User Guardrails (Hookify pattern) |
| 173 | +------------------------------------- |
| 174 | +Allow users to set local rules: |
| 175 | +- "Never show tables for stakeholder lists" |
| 176 | +- "Always map approval processes" |
| 177 | + |
| 178 | +Rule format: |
| 179 | +{ |
| 180 | + "intent_pattern": "regex", |
| 181 | + "action": "force_tool|suppress_tool", |
| 182 | + "tool": "request_data_table|request_process_map" |
| 183 | +} |
| 184 | + |
| 185 | +Rules are applied before Router decisions. |
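
Applying rules before the Router could look like the sketch below. The function name `match_guardrail` is hypothetical, and two behaviors are assumptions of this sketch rather than part of the rule format: the first matching rule wins, and patterns are matched case-insensitively.

```python
import re
from typing import List, Optional, Tuple


def match_guardrail(message: str, rules: List[dict]) -> Optional[Tuple[str, str]]:
    """Return (action, tool) for the first rule whose pattern matches."""
    for rule in rules:
        # Assumption: case-insensitive search anywhere in the message.
        if re.search(rule["intent_pattern"], message, re.IGNORECASE):
            return rule["action"], rule["tool"]
    return None  # no rule applies; fall through to the Router
```

For instance, "Never show tables for stakeholder lists" might be stored as a rule with intent_pattern "stakeholder", action "suppress_tool", and tool "request_data_table".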
| 186 | + |
| 187 | +12. Examples |
| 188 | +------------ |
| 189 | +Example A: |
| 190 | +User: "We have 20 stakeholders across finance, ops, and IT." |
| 191 | +Router: action=tool, tool=request_data_table, confidence=0.86 |
| 192 | +Params: columns=[Name, Role, Influence], min_rows=20, input_modes=["paste","inline"] |
| 193 | + |
| 194 | +Example B: |
| 195 | +User: "First finance reviews the invoice, then IT signs off, then CFO approves." |
| 196 | +Router: action=tool, tool=request_process_map, confidence=0.91 |
| 197 | +Params: required_fields=[step_name, owner, outcome], min_steps=3 |
| 198 | + |
| 199 | +Example C: |
| 200 | +User: "We have some risks." |
| 201 | +Router: action=clarify, confidence=0.58 |
| 202 | +Clarify: "How many risks are we capturing?" |
| 203 | +Then: route to request_data_table if count >= 3, else stay in chat.
| 204 | + |
| 205 | +13. Model Routing |
| 206 | +----------------- |
| 207 | +Router model options: |
| 208 | +- Claude 3.5 Haiku (default): fast, strong tool selection. |
| 209 | +- Llama 3.1 8B (fine-tuned on 500+ tool-call examples): edge router.
| 210 | + |
| 211 | +Fallback: |
| 212 | +- If Router output is invalid or confidence < 0.45, hand off to the Orchestrator (Sonnet).
| 213 | +- Orchestrator always validates tool parameters before execution. |
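
The fallback check can be sketched as a parse-then-validate step over the raw Router output. The function names `parse_router_output` and `needs_fallback` are hypothetical, and what counts as "invalid" beyond malformed JSON (unknown action, unknown tool name, out-of-range confidence) is an assumption of this sketch based on the Output JSON in Section 5.

```python
import json
from typing import Optional

VALID_ACTIONS = {"tool", "chat", "clarify"}
VALID_TOOLS = {"request_data_table", "request_process_map"}
FALLBACK_THRESHOLD = 0.45  # from the routing thresholds


def parse_router_output(raw: str) -> Optional[dict]:
    """Return the decoded Router output, or None if it is invalid."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or data.get("action") not in VALID_ACTIONS:
        return None
    confidence = data.get("confidence")
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return None
    if not 0.0 <= confidence <= 1.0:
        return None
    if data["action"] == "tool" and data.get("tool_name") not in VALID_TOOLS:
        return None
    return data


def needs_fallback(raw: str) -> bool:
    """True when the turn should go to the Orchestrator instead."""
    data = parse_router_output(raw)
    return data is None or data["confidence"] < FALLBACK_THRESHOLD
```

Validation of the tool parameters themselves is left to the Orchestrator, as stated above.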
| 214 | + |
| 215 | +14. Telemetry |
| 216 | +------------- |
| 217 | +Log events (privacy-safe): |
| 218 | +- router_decision (action, tool, confidence) |
| 219 | +- tool_opened, tool_submitted, tool_canceled |
| 220 | +- validation_warning_shown (type, confidence) |
| 221 | +- user_opt_out_changed |
| 222 | + |
| 223 | +--- |
| 224 | + |
| 225 | +End of spec. |