feat: Register SDK hooks for tool monitoring and error handling #493

@PureWeen

Problem

PolyPilot registers none of the six available SDK hooks (PreToolUse, PostToolUse, UserPromptSubmitted, SessionStart, SessionEnd, ErrorOccurred). Instead, tool monitoring works by counting ToolExecutionStartEvent/ToolExecutionCompleteEvent events, and error handling is scattered across multiple catch blocks.

SDK Hooks Available

| Hook | Input | Key Outputs | Potential Use |
|------|-------|-------------|---------------|
| OnPreToolUse | ToolName, ToolArgs, Cwd | PermissionDecision (approve/deny), ModifiedArgs, AdditionalContext | Accurate tool call tracking, per-worker tool restrictions, inject context before tools |
| OnPostToolUse | ToolName, ToolArgs, ToolResult | ModifiedResult, AdditionalContext, SuppressOutput | Post-process tool results, accurate completion tracking |
| OnUserPromptSubmitted | Prompt, Cwd | ModifiedPrompt, AdditionalContext | Prompt rewriting, context injection |
| OnSessionStart | Cwd, Source, InitialPrompt | AdditionalContext, ModifiedConfig | Custom initialization |
| OnSessionEnd | Reason, FinalMessage, Error | CleanupActions, SessionSummary | Custom cleanup, session summaries |
| OnErrorOccurred | Error, ErrorContext, Recoverable | ErrorHandling, RetryCount, UserNotification | Centralized error recovery instead of scattered try/catch |

JS-Only Hooks (not in .NET SDK yet):

  • AgentStop — fires when agent stops; can return decision: "block" to force continuation
  • SubagentStop — same for subagents; could prevent workers from stopping too early
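To make the "block" decision concrete, here is a minimal TypeScript sketch of a stop hook that refuses to let a worker stop while it still has work queued. The input shape (`workerId`, `remainingTasks`) and the `reason` field are assumptions made for illustration; the issue only states that the hook can return decision: "block", so check the actual JS SDK types before relying on any of these names.

```typescript
// Hypothetical input: the issue does not say what data a stop hook receives.
interface SubagentStopInput {
  workerId: string;
  remainingTasks: number;
}

// Only `decision: "block"` comes from the issue; `reason` is an assumed extra.
interface StopHookOutput {
  decision?: "block";
  reason?: string;
}

// Block the stop while tasks remain; otherwise return an empty result
// so the subagent is allowed to stop normally.
function onSubagentStop(input: SubagentStopInput): StopHookOutput {
  if (input.remainingTasks > 0) {
    return {
      decision: "block",
      reason: `Worker ${input.workerId} still has ${input.remainingTasks} task(s) pending.`,
    };
  }
  return {};
}
```

A hook like this would only cover the "stopped too early" case; a worker that hangs without ever stopping would still need a timeout.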

What to Change

Phase 1 (low risk):

  1. Register OnPreToolUse and OnPostToolUse for accurate tool call counting, replacing the event-based ActiveToolCallCount tracking
  2. Register OnErrorOccurred for centralized error handling with retry support
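The two Phase 1 items could look roughly like the sketch below, written in TypeScript for illustration only. The interfaces borrow field names from the hook table above, but their exact shapes, the `ToolCallTracker` class, the `makeErrorHandler` helper, and the "retry"/"abort" values are all assumptions, not the SDK's real API. The PolyPilot change itself would be made against the .NET SDK.

```typescript
// Hypothetical hook payloads; field names follow the table in this issue.
interface PreToolUseInput { toolName: string; toolArgs: unknown; cwd: string; }
interface PostToolUseInput { toolName: string; toolArgs: unknown; toolResult: unknown; }
interface ErrorInput { error: string; errorContext: string; recoverable: boolean; }

// Assumed output shape: the allowed values of ErrorHandling are a guess.
interface ErrorDecision {
  errorHandling: "retry" | "abort";
  retryCount?: number;
  userNotification?: string;
}

// Counts in-flight and finished tool calls from the pre/post hook pair.
class ToolCallTracker {
  private active = 0;
  private completed = 0;

  onPreToolUse = (_input: PreToolUseInput): void => {
    this.active += 1;
  };

  onPostToolUse = (_input: PostToolUseInput): void => {
    // Guard against a post hook that arrives without a matching pre hook.
    if (this.active > 0) this.active -= 1;
    this.completed += 1;
  };

  activeCount(): number { return this.active; }
  completedCount(): number { return this.completed; }
}

// One place to decide between retry and abort, keyed by error context.
function makeErrorHandler(maxRetries: number) {
  const attempts: Record<string, number> = {};
  return (input: ErrorInput): ErrorDecision => {
    if (!input.recoverable) {
      return { errorHandling: "abort", userNotification: `Unrecoverable error: ${input.error}` };
    }
    const seen = (attempts[input.errorContext] || 0) + 1;
    attempts[input.errorContext] = seen;
    if (seen > maxRetries) {
      return { errorHandling: "abort", userNotification: `Giving up after ${maxRetries} retries: ${input.error}` };
    }
    return { errorHandling: "retry", retryCount: seen };
  };
}
```

Keeping the retry budget per `errorContext` is one possible design choice; a per-session budget would be simpler but would let one flaky tool exhaust retries for everything else.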

Phase 2 (medium risk):

  1. Use OnPreToolUse with SessionConfig.AvailableTools/ExcludedTools to enforce tool restrictions per worker role
  2. Use OnUserPromptSubmitted to inject dynamic context (e.g., squad decisions.md) at prompt time instead of in BuildWorkerPrompt
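As a rough illustration of the hook side of Phase 2, the TypeScript sketch below denies tools that are not on a per-role allow list and injects extra context when a prompt is submitted. The role names, tool names, the `allowedToolsByRole` map, and the `loadDecisions` callback are all hypothetical, and the output shapes are assumed from the table above rather than taken from the SDK. The SessionConfig.AvailableTools/ExcludedTools side is not shown.

```typescript
// Hypothetical payloads; field names follow the table in this issue.
interface PreToolUseInput { toolName: string; toolArgs: unknown; cwd: string; }
interface PreToolUseOutput { permissionDecision: "approve" | "deny"; additionalContext?: string; }
interface UserPromptInput { prompt: string; cwd: string; }
interface UserPromptOutput { additionalContext?: string; }

// Illustrative roles and tool names; PolyPilot's real ones may differ.
const allowedToolsByRole: Record<string, string[]> = {
  reviewer: ["read_file", "grep"],
  implementer: ["read_file", "grep", "edit_file", "bash"],
};

// Build a pre-tool hook that approves only tools allowed for the given role.
function makePreToolUseHook(role: string) {
  const allowed = allowedToolsByRole[role] || [];
  return (input: PreToolUseInput): PreToolUseOutput => {
    if (allowed.indexOf(input.toolName) !== -1) {
      return { permissionDecision: "approve" };
    }
    return {
      permissionDecision: "deny",
      additionalContext: `Tool "${input.toolName}" is not permitted for role "${role}".`,
    };
  };
}

// Build a prompt hook that reads the decisions text at prompt time,
// so the injected context is current for every submitted prompt.
function makePromptHook(loadDecisions: () => string) {
  return (_input: UserPromptInput): UserPromptOutput => {
    const decisions = loadDecisions().trim();
    return decisions.length > 0
      ? { additionalContext: `Squad decisions:\n${decisions}` }
      : {};
  };
}
```

Passing `loadDecisions` as a callback keeps file access out of the hook itself, which makes the hook easy to test and lets the caller decide how decisions.md is read or cached.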

Phase 3 (when .NET SDK adds them):

  1. Use AgentStop/SubagentStop to supplement the watchdog, which could allow some timeout tiers to be reduced

Note

Hooks supplement but do NOT replace the custom watchdog (see the SDK migration matrix in the processing-state-safety skill for why).

Labels: enhancement (New feature or request)