From badba704c880f131a3ce873f88ebf9b5696ac3e1 Mon Sep 17 00:00:00 2001 From: jdnichollsc Date: Sun, 24 Aug 2025 23:47:56 -0500 Subject: [PATCH 1/4] feat(docs): add docs about openai agents with temporal --- docs/openai_agents/AGENT_PATTERNS.md | 1069 +++++++++++++++++ docs/openai_agents/BASIC.md | 741 ++++++++++++ docs/openai_agents/CUSTOMER_SERVICE.md | 503 ++++++++ .../openai_agents/FINANCIAL_RESEARCH_AGENT.md | 547 +++++++++ docs/openai_agents/HANDOFFS.md | 586 +++++++++ docs/openai_agents/HOSTED_MCP.md | 562 +++++++++ docs/openai_agents/MODEL_PROVIDERS.md | 550 +++++++++ docs/openai_agents/README.md | 139 +++ docs/openai_agents/REASONING_CONTENT.md | 541 +++++++++ docs/openai_agents/RESEARCH_BOT.md | 324 +++++ docs/openai_agents/TOOLS.md | 792 ++++++++++++ 11 files changed, 6354 insertions(+) create mode 100644 docs/openai_agents/AGENT_PATTERNS.md create mode 100644 docs/openai_agents/BASIC.md create mode 100644 docs/openai_agents/CUSTOMER_SERVICE.md create mode 100644 docs/openai_agents/FINANCIAL_RESEARCH_AGENT.md create mode 100644 docs/openai_agents/HANDOFFS.md create mode 100644 docs/openai_agents/HOSTED_MCP.md create mode 100644 docs/openai_agents/MODEL_PROVIDERS.md create mode 100644 docs/openai_agents/README.md create mode 100644 docs/openai_agents/REASONING_CONTENT.md create mode 100644 docs/openai_agents/RESEARCH_BOT.md create mode 100644 docs/openai_agents/TOOLS.md diff --git a/docs/openai_agents/AGENT_PATTERNS.md b/docs/openai_agents/AGENT_PATTERNS.md new file mode 100644 index 000000000..f21e8eeb7 --- /dev/null +++ b/docs/openai_agents/AGENT_PATTERNS.md @@ -0,0 +1,1069 @@ +# Agent Patterns + +## 📑 Table of Contents + +- [Introduction](#introduction) +- [Philosophy & Challenges](#philosophy--challenges) +- [System Constraints & Features](#system-constraints--features) +- [System Overview](#system-overview) +- [System Flow](#system-flow) +- [Core Architecture](#core-architecture) +- [Interaction Flow](#interaction-flow) +- [Development 
Guidelines](#development-guidelines) +- [Code Examples & Best Practices](#code-examples--best-practices) + +## 🎯 Introduction + +The Agent Patterns service demonstrates advanced agentic patterns extended with Temporal's durable execution capabilities. This service showcases sophisticated patterns like deterministic flows, parallelization, LLM-as-a-judge, agents-as-tools, and guardrails, all implemented within Temporal's reliable workflow framework. + +The system is designed for developers and engineering teams who want to: +- Learn advanced agent orchestration patterns with Temporal +- Implement complex multi-agent workflows with validation gates +- Build parallel execution systems for improved quality and efficiency +- Create safety mechanisms through input and output guardrails +- Compose agents as tools within other agents for specialized task delegation + +## 🧠 Philosophy & Challenges + +### What We're Solving +- **Complex Task Decomposition**: Breaking down complex tasks into manageable, validated steps +- **Quality Improvement**: Using feedback loops and multiple agents to enhance output quality +- **Parallel Processing**: Running multiple agents concurrently for efficiency and redundancy +- **Safety and Validation**: Implementing guardrails for input validation and output safety +- **Agent Composition**: Building complex systems from simpler, specialized agents +- **Non-Streaming Adaptation**: Adapting streaming patterns to Temporal's non-streaming workflow model + +### Our Approach +- **Pattern-Based Design**: Establish reusable patterns for common agent interactions +- **Validation Gates**: Use agents to validate and improve outputs from other agents +- **Parallel Execution**: Leverage Temporal's capabilities for concurrent agent execution +- **Safety First**: Implement comprehensive guardrails for production safety +- **Composability**: Enable agents to work as tools within other agents +- **Temporal Integration**: Leverage workflow durability and state 
management + +## ⚡ System Constraints & Features + +### Key Features +- **Deterministic Flows**: Sequential agent execution with validation gates and Pydantic models +- **Parallelization**: Multiple agents running concurrently with result selection +- **LLM-as-a-Judge**: Iterative improvement using feedback loops with structured evaluation +- **Agents as Tools**: Use agents as callable tools within other agents via `as_tool()` +- **Agent Routing**: Route requests to specialized agents based on content analysis +- **Input Guardrails**: Pre-execution validation using `@input_guardrail` decorator +- **Output Guardrails**: Post-execution validation using `@output_guardrail` decorator +- **Forcing Tool Use**: Control tool execution strategies with `ModelSettings(tool_choice="required")` + +### System Constraints +- **No Streaming**: Temporal workflows don't support streaming responses +- **Deterministic Execution**: All workflow code must be deterministic +- **Activity-Based I/O**: External calls must be wrapped in activities +- **State Persistence**: Automatic state management through Temporal +- **Task Queue**: Uses `"openai-agents-patterns-task-queue"` for all workflows +- **Pydantic Models**: Output validation requires structured Pydantic models + +## 🏗️ System Overview + +```mermaid +graph TB + A[Client Request] --> B[Temporal Workflow] + B --> C[Pattern Orchestrator] + C --> D[Agent 1] + C --> E[Agent 2] + C --> F[Agent 3] + C --> G[Validation Agent] + + D --> H[Result 1] + E --> I[Result 2] + F --> J[Result 3] + + H --> G + I --> G + J --> G + + G --> K[Final Output] + K --> B + B --> L[Client Response] + + M[Temporal Server] --> B + N[Worker Process] --> B + O[OpenAI API] --> D + O --> E + O --> F + O --> G + + P[Guardrails] --> C + Q[Tool Integration] --> C + R[Pydantic Validation] --> G +``` + +## 🔄 System Flow + +```mermaid +sequenceDiagram + participant C as Client + participant T as Temporal Workflow + participant O as Orchestrator + participant A1 as Agent 
1 + participant A2 as Agent 2 + participant V as Validator + participant OAI as OpenAI API + participant G as Guardrails + + C->>T: Start Pattern Workflow + T->>O: Initialize Pattern + G->>O: Input Validation + + alt Parallel Execution + O->>A1: Execute Task + O->>A2: Execute Task (Parallel) + + A1->>OAI: Generate Response + OAI->>A1: Response 1 + A2->>OAI: Generate Response + OAI->>A2: Response 2 + + A1->>O: Return Result 1 + A2->>O: Return Result 2 + else Sequential Execution + O->>A1: Execute Task + A1->>OAI: Generate Response + OAI->>A1: Response + A1->>O: Return Result + O->>V: Validate Result + V->>OAI: Evaluate Quality + OAI->>V: Validation Result + V->>O: Validation Complete + end + + O->>T: Return Final Result + T->>C: Workflow Complete +``` + +## 🏛️ Core Architecture + +### Component Layers +1. **Workflow Layer**: Temporal workflows for pattern orchestration with `@workflow.defn` +2. **Pattern Layer**: Specific pattern implementations (deterministic, parallel, etc.) +3. **Agent Layer**: Specialized agents with Pydantic output types and guardrails +4. **Validation Layer**: Quality assurance using structured evaluation and feedback +5. 
**Tool Layer**: Function tools and agent-as-tool integration via `as_tool()` + +### Key Components +- **Pattern Workflows**: Implement specific agent patterns with Temporal integration +- **Agent Orchestrator**: Coordinates multiple agents and their interactions +- **Validation Agents**: Ensure quality and safety using structured feedback +- **Tool Integration**: Function tools and agent tools via `as_tool()` method +- **Result Selector**: Choose best results from parallel execution +- **Guardrail System**: Input and output validation with tripwire mechanisms + +## 🔗 Interaction Flow + +### Internal Communication +- Workflows orchestrate pattern execution using `Runner.run()` with `RunConfig` +- Agents communicate through structured data exchange with Pydantic models +- Validation agents provide feedback using structured evaluation classes +- Results are synthesized through `ItemHelpers.text_message_outputs()` +- Guardrails intercept execution using `@input_guardrail` and `@output_guardrail` + +### External Dependencies +- **OpenAI API**: For agent responses and validation via `OpenAIAgentsPlugin` +- **Temporal Server**: For workflow orchestration and state management +- **Pydantic**: For output validation and structured data models +- **Function Tools**: Custom tools defined with `@function_tool` decorator + +## 💻 Development Guidelines + +### Code Organization +- **Pattern Workflows**: One file per pattern type in `workflows/` directory +- **Agent Definitions**: Specialized agents with specific instructions and output types +- **Runner Scripts**: Individual execution scripts for each pattern in root directory +- **Worker**: Central worker supporting all patterns in `run_worker.py` + +### Design Patterns +- **Deterministic Pattern**: Sequential execution with validation gates using Pydantic models +- **Parallel Pattern**: Concurrent execution using `asyncio.gather()` with result selection +- **Feedback Pattern**: Iterative improvement using structured evaluation 
classes +- **Composition Pattern**: Agents working as tools via `as_tool()` method +- **Guardrail Pattern**: Input and output validation with tripwire mechanisms + +### Error Handling +- **Validation Failures**: Handle cases where agents don't meet quality standards +- **Parallel Failures**: Continue execution when some agents fail +- **Guardrail Triggers**: Handle safety violations via `InputGuardrailTripwireTriggered` and `OutputGuardrailTripwireTriggered` exceptions +- **Activity Retries**: Model calls run as Temporal activities, which are retried automatically with exponential backoff +- **Pydantic Validation**: Structured output validation with clear error messages + +## 📝 Code Examples & Best Practices + +### Deterministic Flow Pattern +**File**: `openai_agents/agent_patterns/workflows/deterministic_workflow.py` + +This pattern demonstrates sequential agent execution with validation gates. Each step must pass validation before proceeding to the next, ensuring quality and consistency. + +```python +from agents import Agent, RunConfig, Runner, trace +from pydantic import BaseModel +from temporalio import workflow + +# Define structured output for validation - this ensures type safety and clear validation logic +class OutlineCheckerOutput(BaseModel): + good_quality: bool # Quality gate: must be True to proceed + is_scifi: bool # Genre gate: must be True for this example + +# Agent 1: Generates the initial story outline +def story_outline_agent() -> Agent: + return Agent( + name="story_outline_agent", + instructions="Generate a very short story outline based on the user's input.", + ) + +# Agent 2: Validates the outline quality and genre - this is the validation gate +def outline_checker_agent() -> Agent: + return Agent( + name="outline_checker_agent", + instructions="Read the given story outline, and judge the quality. 
Also, determine if it is a scifi story.", + output_type=OutlineCheckerOutput, # Enforces structured output for validation + ) + +# Agent 3: Writes the final story (only called if validation passes) +def story_agent() -> Agent: + return Agent( + name="story_agent", + instructions="Write a short story based on the given outline.", + output_type=str, + ) + +@workflow.defn +class DeterministicWorkflow: + @workflow.run + async def run(self, input_prompt: str) -> str: + # RunConfig ensures consistent execution settings across all agent calls + config = RunConfig() + + # Use trace() for observability - this creates a single trace span for the entire workflow + with trace("Deterministic story flow"): + # Step 1: Generate an outline using the first agent + outline_result = await Runner.run( + story_outline_agent(), + input_prompt, + run_config=config, + ) + workflow.logger.info("Outline generated") + + # Step 2: Validate the outline using the second agent with structured output + outline_checker_result = await Runner.run( + outline_checker_agent(), + outline_result.final_output, + run_config=config, + ) + + # Step 3: Validation gate using Pydantic model - this is where we enforce quality + assert isinstance(outline_checker_result.final_output, OutlineCheckerOutput) + + # Quality gate: stop if outline quality is insufficient + if not outline_checker_result.final_output.good_quality: + return "Story generation stopped: Outline quality insufficient." + + # Genre gate: stop if not science fiction (business rule enforcement) + if not outline_checker_result.final_output.is_scifi: + return "Story generation stopped: Outline is not science fiction." 
+ + # Step 4: Continue only if validation passes - this ensures quality output + workflow.logger.info("Outline validation passed, generating story") + story_result = await Runner.run( + story_agent(), + outline_result.final_output, + run_config=config, + ) + + return f"Final story: {story_result.final_output}" +``` + +**Key Benefits**: +- **Quality Assurance**: Multiple validation gates ensure output quality +- **Business Rule Enforcement**: Genre and quality requirements are enforced programmatically +- **Observability**: Single trace span makes debugging easier +- **Type Safety**: Pydantic models prevent runtime errors + +### Parallelization Pattern +**File**: `openai_agents/agent_patterns/workflows/parallelization_workflow.py` + +This pattern runs multiple agents concurrently and selects the best result, improving both quality and efficiency through redundancy. + +```python +import asyncio +from agents import Agent, ItemHelpers, RunConfig, Runner, trace +from temporalio import workflow + +# Agent that performs the actual work (translation in this case) +def spanish_agent() -> Agent: + return Agent( + name="spanish_agent", + instructions="You translate the user's message to Spanish", + ) + +# Judge agent that selects the best result from multiple candidates +def translation_picker() -> Agent: + return Agent( + name="translation_picker", + instructions="You pick the best Spanish translation from the given options.", + ) + +@workflow.defn +class ParallelizationWorkflow: + @workflow.run + async def run(self, msg: str) -> str: + config = RunConfig() + + with trace("Parallel translation"): + # Run three translation agents in parallel using asyncio.gather() + # This improves efficiency and provides redundancy for better quality + res_1, res_2, res_3 = await asyncio.gather( + Runner.run(spanish_agent(), msg, run_config=config), + Runner.run(spanish_agent(), msg, run_config=config), + Runner.run(spanish_agent(), msg, run_config=config), + ) + + # Extract text outputs 
using ItemHelpers - this handles the message structure properly + # ItemHelpers.text_message_outputs() extracts all text content from the agent's response + outputs = [ + ItemHelpers.text_message_outputs(res_1.new_items), + ItemHelpers.text_message_outputs(res_2.new_items), + ItemHelpers.text_message_outputs(res_3.new_items), + ] + + # Combine all translations for the judge agent to evaluate + translations = "\n\n".join(outputs) + workflow.logger.info(f"Generated translations:\n{translations}") + + # Use a judge agent to select the best result from multiple candidates + # This provides quality improvement through competition + best_translation = await Runner.run( + translation_picker(), + f"Input: {msg}\n\nTranslations:\n{translations}", + run_config=config, + ) + + return f"Best translation: {best_translation.final_output}" +``` + +**Key Benefits**: +- **Efficiency**: Parallel execution reduces total latency +- **Quality Improvement**: Multiple candidates allow selection of the best result +- **Redundancy**: Independent candidates guard against a single poor translation (as written, `asyncio.gather()` still raises if any run fails) +- **Competition**: Independent runs produce varied candidates for the judge to compare + +### LLM-as-a-Judge Pattern +**File**: `openai_agents/agent_patterns/workflows/llm_as_a_judge_workflow.py` + +This pattern implements iterative improvement using feedback loops, where one agent generates content and another evaluates and provides feedback for improvement. 
+ +```python +from dataclasses import dataclass +from typing import Literal +from agents import Agent, ItemHelpers, RunConfig, Runner, TResponseInputItem, trace +from temporalio import workflow + +# Structured feedback for evaluation - ensures consistent evaluation criteria +@dataclass +class EvaluationFeedback: + feedback: str # Specific improvement suggestions + score: Literal["pass", "needs_improvement", "fail"] # Clear evaluation outcome + +# Agent that generates content and incorporates feedback for improvement +def story_outline_generator() -> Agent: + return Agent( + name="story_outline_generator", + instructions=( + "You generate a very short story outline based on the user's input. " + "If there is any feedback provided, use it to improve the outline." + ), + ) + +# Judge agent that evaluates quality and provides structured feedback +def evaluator() -> Agent: + return Agent( + name="evaluator", + instructions=( + "You evaluate a story outline and decide if it's good enough. " + "If it's not good enough, you provide feedback on what needs to be improved. " + "Never give it a pass on the first try. 
After 5 attempts, you can give it a pass if story outline is good enough - do not go for perfection" + ), + output_type=EvaluationFeedback, # Enforces structured feedback + ) + +@workflow.defn +class LLMAsAJudgeWorkflow: + @workflow.run + async def run(self, msg: str) -> str: + config = RunConfig() + + # Initialize conversation context with the user's message + input_items: list[TResponseInputItem] = [{"content": msg, "role": "user"}] + latest_outline: str | None = None + + with trace("LLM as a judge"): + while True: # Iterative improvement loop + # Generate or improve the story outline + story_outline_result = await Runner.run( + story_outline_generator(), + input_items, # Pass conversation history for context + run_config=config, + ) + + # Update conversation context with the new outline + input_items = story_outline_result.to_input_list() + latest_outline = ItemHelpers.text_message_outputs( + story_outline_result.new_items + ) + workflow.logger.info("Story outline generated") + + # Evaluate the outline quality using the judge agent + evaluator_result = await Runner.run( + evaluator(), + input_items, + run_config=config, + ) + result: EvaluationFeedback = evaluator_result.final_output + + # Check if quality threshold is met + if result.score == "pass": + workflow.logger.info("Story outline is good enough, exiting.") + break + + # Add feedback to conversation context for the next iteration + # This creates a continuous improvement loop + input_items.append( + {"content": f"Feedback: {result.feedback}", "role": "user"} + ) + + return f"Final story outline: {latest_outline}" +``` + +**Key Benefits**: +- **Continuous Improvement**: Feedback loops lead to better quality output +- **Structured Evaluation**: Clear pass/fail criteria with specific feedback +- **Context Preservation**: Conversation history maintains improvement context +- **Quality Control**: Prevents poor quality output from being accepted + +### Agents as Tools Pattern +**File**: 
`openai_agents/agent_patterns/workflows/agents_as_tools_workflow.py` + +This pattern allows agents to use other agents as callable tools, enabling composition and specialized task delegation within a single workflow. + +```python +from agents import Agent, ItemHelpers, MessageOutputItem, RunConfig, Runner, trace +from temporalio import workflow + +def orchestrator_agent() -> Agent: + # Create specialized translation agents with clear handoff descriptions + spanish_agent = Agent( + name="spanish_agent", + instructions="You translate the user's message to Spanish", + handoff_description="An English to Spanish translator", # Helps with agent selection + ) + + french_agent = Agent( + name="french_agent", + instructions="You translate the user's message to French", + handoff_description="An English to French translator", + ) + + # Main orchestrator agent that coordinates other agents as tools + orchestrator_agent = Agent( + name="orchestrator_agent", + instructions=( + "You are a translation agent. You use the tools given to you to translate. " + "If asked for multiple translations, you call the relevant tools in order. " + "You never translate on your own, you always use the provided tools." 
+ ), + tools=[ + # Convert agents to tools using as_tool() - this enables composition + spanish_agent.as_tool( + tool_name="translate_to_spanish", + tool_description="Translate the user's message to Spanish", + ), + french_agent.as_tool( + tool_name="translate_to_french", + tool_description="Translate the user's message to French", + ), + ], + ) + return orchestrator_agent + +# Agent that synthesizes and validates the final output +def synthesizer_agent() -> Agent: + return Agent( + name="synthesizer_agent", + instructions="You inspect translations, correct them if needed, and produce a final concatenated response.", + ) + +@workflow.defn +class AgentsAsToolsWorkflow: + @workflow.run + async def run(self, msg: str) -> str: + config = RunConfig() + + with trace("Orchestrator evaluator"): + orchestrator = orchestrator_agent() + synthesizer = synthesizer_agent() + + # The orchestrator agent uses other agents as tools to perform translations + orchestrator_result = await Runner.run(orchestrator, msg, run_config=config) + + # Log each translation step for observability and debugging + for item in orchestrator_result.new_items: + if isinstance(item, MessageOutputItem): + text = ItemHelpers.text_message_output(item) + if text: + workflow.logger.info(f" - Translation step: {text}") + + # Use a synthesizer agent to combine and validate the final output + # This ensures quality and consistency of the final result + synthesizer_result = await Runner.run( + synthesizer, orchestrator_result.to_input_list(), run_config=config + ) + + return synthesizer_result.final_output +``` + +**Key Benefits**: +- **Agent Composition**: Build complex systems from simpler, specialized agents +- **Tool Reusability**: Agents can be used as tools in multiple contexts +- **Specialized Delegation**: Each agent focuses on its specific expertise +- **Coordination**: Orchestrator agent manages the overall workflow + +### Agent Routing Pattern +**File**: 
`openai_agents/agent_patterns/workflows/routing_workflow.py` + +This pattern demonstrates intelligent routing based on content analysis, where a triage agent determines which specialized agent should handle the request. + +```python +from agents import Agent, RunConfig, Runner, TResponseInputItem, trace +from temporalio import workflow + +# Specialized agents for different languages - each has a specific domain expertise +def french_agent() -> Agent: + return Agent( + name="french_agent", + instructions="You only speak French", # Clear specialization + ) + +def spanish_agent() -> Agent: + return Agent( + name="spanish_agent", + instructions="You only speak Spanish", # Clear specialization + ) + +def english_agent() -> Agent: + return Agent( + name="english_agent", + instructions="You only speak English", # Clear specialization + ) + +# Triage agent that analyzes the request and routes to appropriate specialized agent +def triage_agent() -> Agent: + return Agent( + name="triage_agent", + instructions="Handoff to the appropriate agent based on the language of the request.", + handoffs=[french_agent(), spanish_agent(), english_agent()], # Available routing options + ) + +@workflow.defn +class RoutingWorkflow: + @workflow.run + async def run(self, msg: str) -> str: + config = RunConfig() + + with trace("Routing example"): + # Initialize conversation context with the user's message + inputs: list[TResponseInputItem] = [{"content": msg, "role": "user"}] + + # Run the triage agent to determine which language agent to handoff to + # The triage agent analyzes the message content and selects the appropriate specialist + result = await Runner.run( + triage_agent(), + input=inputs, + run_config=config, + ) + + # Log successful handoff for observability + workflow.logger.info("Handoff completed") + + # Convert result to proper input format for the next agent + # This maintains conversation context across the handoff + inputs = result.to_input_list() + + # Return the result from 
the handoff (either the handoff agent's response or triage response) + return f"Response: {result.final_output}" +``` + +**Key Benefits**: +- **Intelligent Routing**: Automatic selection of the most appropriate agent +- **Specialization**: Each agent focuses on its specific domain +- **Scalability**: Easy to add new specialized agents +- **Context Preservation**: Conversation context maintained across handoffs + +### Input Guardrails Pattern +**File**: `openai_agents/agent_patterns/workflows/input_guardrails_workflow.py` + +This pattern implements pre-execution validation to prevent unwanted or unsafe requests from being processed by the main agent. + +```python +from agents import ( + Agent, GuardrailFunctionOutput, InputGuardrailTripwireTriggered, + RunConfig, RunContextWrapper, Runner, TResponseInputItem, input_guardrail +) +from pydantic import BaseModel +from temporalio import workflow + +# Structured output for guardrail validation - ensures clear decision criteria +class MathHomeworkOutput(BaseModel): + reasoning: str # Explanation of why the input was flagged + is_math_homework: bool # Boolean flag for tripwire decision + +# Dedicated agent for input validation - specialized for safety checks +guardrail_agent = Agent( + name="Guardrail check", + instructions="Check if the user is asking you to do their math homework.", + output_type=MathHomeworkOutput, # Enforces structured validation output +) + +# Input guardrail decorator - this intercepts all input before it reaches the main agent +@input_guardrail +async def math_guardrail( + context: RunContextWrapper[None], # Runtime context for the guardrail + agent: Agent, # The agent being protected + input: str | list[TResponseInputItem], # The input to validate +) -> GuardrailFunctionOutput: + """Input guardrail function that calls an agent to check if input is math homework.""" + + # Use a specialized agent to perform the validation check + result = await Runner.run(guardrail_agent, input, 
context=context.context) + final_output = result.final_output_as(MathHomeworkOutput) + + # Return guardrail result with tripwire decision + return GuardrailFunctionOutput( + output_info=final_output, # Detailed validation information + tripwire_triggered=final_output.is_math_homework, # Decision to block execution + ) + +@workflow.defn +class InputGuardrailsWorkflow: + @workflow.run + async def run(self, user_input: str) -> str: + config = RunConfig() + + # Main agent with input guardrails attached + agent = Agent( + name="Customer support agent", + instructions="You are a customer support agent. You help customers with their questions.", + input_guardrails=[math_guardrail], # Attach the guardrail for protection + ) + + # Format input for the agent + input_data: list[TResponseInputItem] = [ + {"role": "user", "content": user_input} + ] + + try: + # Attempt to run the agent - guardrail will intercept if needed + result = await Runner.run(agent, input_data, run_config=config) + return str(result.final_output) + except InputGuardrailTripwireTriggered: + # Handle guardrail tripwire - provide safe fallback response + workflow.logger.info( + "Input guardrail triggered - refusing to help with math homework" + ) + return "Sorry, I can't help you with your math homework." +``` + +**Key Benefits**: +- **Pre-Execution Safety**: Blocks unsafe requests before processing +- **Specialized Validation**: Dedicated agents for specific safety checks +- **Graceful Degradation**: Provides safe fallback responses +- **Audit Trail**: Logs all guardrail activations for compliance + +### Output Guardrails Pattern +**File**: `openai_agents/agent_patterns/workflows/output_guardrails_workflow.py` + +This pattern implements post-execution validation to detect sensitive or inappropriate content in agent responses before they are returned to users. 
+ +```python +from agents import ( + Agent, GuardrailFunctionOutput, OutputGuardrailTripwireTriggered, + RunConfig, RunContextWrapper, Runner, output_guardrail +) +from pydantic import BaseModel, Field +from temporalio import workflow + +# Structured output for the main agent - includes reasoning for transparency +class MessageOutput(BaseModel): + reasoning: str = Field(description="Thoughts on how to respond to the user's message") + response: str = Field(description="The response to the user's message") + user_name: str | None = Field(description="The name of the user who sent the message, if known") + +# Output guardrail decorator - this intercepts all output before it's returned +@output_guardrail +async def sensitive_data_check( + context: RunContextWrapper, # Runtime context for the guardrail + agent: Agent, # The agent being monitored + output: MessageOutput # The output to validate +) -> GuardrailFunctionOutput: + """Output guardrail that checks for sensitive data like phone numbers.""" + + # Check for sensitive data in both reasoning and response + phone_number_in_response = "650" in output.response + phone_number_in_reasoning = "650" in output.reasoning + + # Return guardrail result with detailed information about what was detected + return GuardrailFunctionOutput( + output_info={ + "phone_number_in_response": phone_number_in_response, + "phone_number_in_reasoning": phone_number_in_reasoning, + }, + tripwire_triggered=phone_number_in_response or phone_number_in_reasoning, + ) + +@workflow.defn +class OutputGuardrailsWorkflow: + @workflow.run + async def run(self, user_input: str) -> str: + config = RunConfig() + + # Main agent with output guardrails attached + agent = Agent( + name="Assistant", + instructions="You are a helpful assistant.", + output_type=MessageOutput, # Enforces structured output + output_guardrails=[sensitive_data_check], # Attach output validation + ) + + try: + # Attempt to get response from agent + result = await Runner.run(agent, 
user_input, run_config=config) + output = result.final_output_as(MessageOutput) + return f"Response: {output.response}" + except OutputGuardrailTripwireTriggered as e: + # Handle output guardrail tripwire - provide safe response with detection info + workflow.logger.info( + f"Output guardrail triggered. Info: {e.guardrail_result.output.output_info}" + ) + return f"Output guardrail triggered due to sensitive data detection. Info: {e.guardrail_result.output.output_info}" +``` + +**Key Benefits**: +- **Post-Execution Safety**: Catches sensitive content after generation +- **Structured Validation**: Clear criteria for what constitutes sensitive data +- **Transparency**: Provides detailed information about what was detected +- **Compliance**: Ensures responses meet safety and privacy requirements + +### Forcing Tool Use Pattern +**File**: `openai_agents/agent_patterns/workflows/forcing_tool_use_workflow.py` + +This pattern demonstrates different strategies for controlling how agents use tools, from optional usage to forced tool execution with custom behavior. 
+ +```python +from typing import Any, Literal +from agents import ( + Agent, FunctionToolResult, ModelSettings, RunConfig, RunContextWrapper, + Runner, ToolsToFinalOutputFunction, ToolsToFinalOutputResult, function_tool +) +from pydantic import BaseModel +from temporalio import workflow + +# Pydantic model for tool output - ensures structured data +class Weather(BaseModel): + city: str + temperature_range: str + conditions: str + +# Function tool that provides weather information +@function_tool +def get_weather(city: str) -> Weather: + workflow.logger.info("[debug] get_weather called") + return Weather(city=city, temperature_range="14-20C", conditions="Sunny with wind") + +# Custom tool use behavior function - provides fine-grained control over tool execution +async def custom_tool_use_behavior( + context: RunContextWrapper[Any], # Runtime context + results: list[FunctionToolResult] # Results from all tool executions +) -> ToolsToFinalOutputResult: + """Custom behavior that processes tool results and generates final output.""" + + # Extract weather data from the first tool result + weather: Weather = results[0].output + + # Generate custom formatted output + return ToolsToFinalOutputResult( + is_final_output=True, # Mark this as the final output + final_output=f"{weather.city} is {weather.conditions}." 
+ ) + +@workflow.defn +class ForcingToolUseWorkflow: + @workflow.run + async def run(self, tool_use_behavior: str = "default") -> str: + config = RunConfig() + + # Configure different tool use behaviors based on the parameter + if tool_use_behavior == "default": + # Default behavior: send tool output back to LLM for final processing + behavior: Literal["run_llm_again", "stop_on_first_tool"] | ToolsToFinalOutputFunction = "run_llm_again" + elif tool_use_behavior == "first_tool": + # First tool behavior: use first tool result as final output + behavior = "stop_on_first_tool" + elif tool_use_behavior == "custom": + # Custom behavior: use custom function to process tool results + behavior = custom_tool_use_behavior + + # Create agent with configured tool use behavior + agent = Agent( + name="Weather agent", + instructions="You are a helpful agent.", + tools=[get_weather], # Available tools + tool_use_behavior=behavior, # How to handle tool usage + model_settings=ModelSettings( + # Force tool usage when not using default behavior + tool_choice="required" if tool_use_behavior != "default" else None + ), + ) + + # Execute the agent with a weather-related query + result = await Runner.run( + agent, input="What's the weather in Tokyo?", run_config=config + ) + return str(result.final_output) +``` + +**Key Benefits**: +- **Flexible Tool Control**: Different strategies for different use cases +- **Custom Processing**: Fine-grained control over how tool results are handled +- **Forced Execution**: Ensures tools are used when required +- **Behavior Customization**: Easy to implement custom tool handling logic + +### Worker Configuration +**File**: `openai_agents/agent_patterns/run_worker.py` + +This is the central worker that supports all agent pattern workflows, providing a single execution environment for the entire system. 
+ +```python +from __future__ import annotations + +import asyncio +from datetime import timedelta + +from temporalio.client import Client +from temporalio.contrib.openai_agents import ModelActivityParameters, OpenAIAgentsPlugin +from temporalio.worker import Worker + +# Import all workflow classes for registration +from openai_agents.agent_patterns.workflows.agents_as_tools_workflow import ( + AgentsAsToolsWorkflow, +) +from openai_agents.agent_patterns.workflows.deterministic_workflow import ( + DeterministicWorkflow, +) +from openai_agents.agent_patterns.workflows.forcing_tool_use_workflow import ( + ForcingToolUseWorkflow, +) +from openai_agents.agent_patterns.workflows.input_guardrails_workflow import ( + InputGuardrailsWorkflow, +) +from openai_agents.agent_patterns.workflows.llm_as_a_judge_workflow import ( + LLMAsAJudgeWorkflow, +) +from openai_agents.agent_patterns.workflows.output_guardrails_workflow import ( + OutputGuardrailsWorkflow, +) +from openai_agents.agent_patterns.workflows.parallelization_workflow import ( + ParallelizationWorkflow, +) +from openai_agents.agent_patterns.workflows.routing_workflow import RoutingWorkflow + +async def main(): + # Create client connected to Temporal server + client = await Client.connect( + "localhost:7233", + plugins=[ + OpenAIAgentsPlugin( + model_params=ModelActivityParameters( + start_to_close_timeout=timedelta(seconds=30) # Timeout for OpenAI API calls + ) + ), + ], + ) + + # Create worker that supports all agent pattern workflows + worker = Worker( + client, + task_queue="openai-agents-patterns-task-queue", # Single task queue for all patterns + workflows=[ + # Register all workflow classes for execution + AgentsAsToolsWorkflow, + DeterministicWorkflow, + ParallelizationWorkflow, + LLMAsAJudgeWorkflow, + ForcingToolUseWorkflow, + InputGuardrailsWorkflow, + OutputGuardrailsWorkflow, + RoutingWorkflow, + ], + ) + await worker.run() + +if __name__ == "__main__": + asyncio.run(main()) +``` + +**Key Benefits**: +- 
**Centralized Execution**: Single worker for all patterns +- **Consistent Configuration**: Unified settings for all workflows +- **Easy Deployment**: Single process to manage and monitor +- **Resource Efficiency**: Shared resources across all patterns + +### Runner Script Pattern +**File**: `openai_agents/agent_patterns/run_deterministic_workflow.py` (example) + +Each runner script executes a single pattern against the shared worker, for testing and demonstration purposes. + +```python +import asyncio +from temporalio.client import Client +from temporalio.contrib.openai_agents import OpenAIAgentsPlugin + +from openai_agents.agent_patterns.workflows.deterministic_workflow import ( + DeterministicWorkflow, +) + +async def main(): + # Create client connected to Temporal server + client = await Client.connect( + "localhost:7233", + plugins=[OpenAIAgentsPlugin()], + ) + + # Execute specific workflow with test input + result = await client.execute_workflow( + DeterministicWorkflow.run, # Workflow to execute + "Write a science fiction story about time travel", # Test input + id="deterministic-workflow-example", # Unique workflow ID + task_queue="openai-agents-patterns-task-queue", # Task queue for execution + ) + print(f"Result: {result}") + +if __name__ == "__main__": + asyncio.run(main()) +``` + +**Key Benefits**: +- **Individual Testing**: Test specific patterns in isolation +- **Easy Demonstration**: Simple execution for demos and presentations +- **Development Workflow**: Quick iteration during development +- **Clear Examples**: Shows exactly how to execute each pattern + +### Testing Approach +This section shows how to test agent pattern workflows using mocking and assertions. 
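The test in this section calls a `MockResult` helper that the document never defines. A minimal, stdlib-only sketch of one possible shape follows, together with the `side_effect` sequencing the test relies on. The fields of `MockResult`, the `run_three_steps` driver, and `fake_run` are assumptions made for illustration, not part of the sample repository.

```python
import asyncio
from dataclasses import dataclass
from typing import Any
from unittest import mock


@dataclass
class MockResult:
    """Hypothetical stand-in for a run result; only `final_output` is modeled."""

    final_output: Any

    def final_output_as(self, cls: type) -> Any:
        # Mirrors the accessor name used in the workflows; no validation is done here.
        return self.final_output


async def run_three_steps(run) -> list[Any]:
    """Await the (mocked) runner three times, as a sequential flow would."""
    outputs = []
    for prompt in ("outline", "check", "write"):
        result = await run(prompt)
        outputs.append(result.final_output)
    return outputs


# AsyncMock hands out the next `side_effect` item on each awaited call.
fake_run = mock.AsyncMock(
    side_effect=[
        MockResult("Detailed outline here"),
        MockResult({"good_quality": True, "is_scifi": True}),
        MockResult("Final content based on outline"),
    ]
)

outputs = asyncio.run(run_three_steps(fake_run))
```

Because the list is consumed in call order, the mocked responses must be listed in the same order as the agent calls inside the workflow under test.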
+ +```python +from unittest import mock + +# `MockResult`, `OutlineCheckerOutput`, `DeterministicWorkflow`, and `client` are assumed to be defined by the surrounding test module +async def test_agent_patterns(): + """Test agent pattern workflows.""" + with mock.patch('agents.Runner.run') as mock_run: + # Mock deterministic flow with realistic responses + mock_run.side_effect = [ + MockResult("Detailed outline here"), + MockResult(OutlineCheckerOutput(good_quality=True, is_scifi=True)), + MockResult("Final content based on outline") + ] + + # Execute workflow with mocked dependencies + result = await client.execute_workflow( + DeterministicWorkflow.run, + "Write a story about AI", + id="test-deterministic", + task_queue="openai-agents-patterns-task-queue" + ) + + # Assert expected behavior + assert "Final content" in result +``` + +**Key Benefits**: +- **Isolated Testing**: Test workflows without external dependencies +- **Predictable Results**: Mocked responses ensure consistent test behavior +- **Fast Execution**: No actual API calls during testing +- **Clear Assertions**: Verify expected workflow behavior + +## 🎯 Key Benefits of This Structure + +1. **Advanced Patterns**: Demonstrates sophisticated agent interaction patterns with Temporal +2. **Production Ready**: Shows how to build reliable, scalable agent systems +3. **Quality Assurance**: Implements validation and improvement mechanisms using Pydantic +4. **Safety First**: Comprehensive guardrails for production use with tripwire mechanisms +5. **Composability**: Enables building complex systems from simple components via `as_tool()` +6. **Structured Validation**: Pydantic models ensure type safety and validation +7. **Parallel Execution**: Leverages Temporal's capabilities for concurrent processing +8. 
**Iterative Improvement**: Feedback loops for continuous quality enhancement + +## ⚠️ Important Implementation Notes + +### Task Queue Configuration +- **Worker**: Uses task queue `"openai-agents-patterns-task-queue"` +- **All Runner Scripts**: Use the same task queue for consistency +- **Note**: Unlike the basic examples, all patterns use the same task queue + +### Specific Examples Implemented +- **Deterministic Flow**: Science fiction story generation with outline validation +- **Parallelization**: Spanish translation with multiple agents and quality selection +- **LLM-as-a-Judge**: Story outline improvement with structured feedback +- **Agents as Tools**: Translation orchestration using agents as callable tools +- **Routing**: Language-based agent handoffs for specialized responses +- **Input Guardrails**: Math homework detection with tripwire mechanism +- **Output Guardrails**: Phone number detection in responses +- **Forcing Tool Use**: Weather tool usage with different behavior strategies + +### Architecture Patterns +- **Pydantic-First Design**: All outputs use structured Pydantic models for validation +- **Trace-Based Monitoring**: Comprehensive tracing using `trace()` context managers +- **ItemHelpers Integration**: Proper handling of message items and outputs +- **Guardrail System**: Input and output validation with configurable tripwires +- **Tool Integration**: Both function tools and agent tools via `as_tool()` + +### File Organization +``` +openai_agents/agent_patterns/ +├── workflows/ # Core pattern implementations +│ ├── deterministic_workflow.py # Sequential execution with validation +│ ├── parallelization_workflow.py # Concurrent execution with selection +│ ├── llm_as_a_judge_workflow.py # Iterative improvement with feedback +│ ├── agents_as_tools_workflow.py # Agent composition via tools +│ ├── routing_workflow.py # Intelligent agent routing +│ ├── input_guardrails_workflow.py # Pre-execution validation +│ ├── output_guardrails_workflow.py # 
Post-execution validation +│ └── forcing_tool_use_workflow.py # Tool usage control strategies +├── run_worker.py # Central worker for all patterns +├── run_*.py # Individual pattern runners +└── README.md # Pattern overview and usage +``` + +### Common Development Patterns +- **Always use `RunConfig()`** for consistent execution settings +- **Wrap workflows in `trace()`** for observability and debugging +- **Use Pydantic models** for structured output validation +- **Implement proper error handling** for guardrail tripwires +- **Log key workflow steps** for monitoring and debugging +- **Use `ItemHelpers`** for proper message handling + +### Troubleshooting Common Issues +- **Guardrail Tripwires**: Always handle `InputGuardrailTripwireTriggered` and `OutputGuardrailTripwireTriggered` exceptions gracefully +- **Pydantic Validation**: Ensure all agent outputs use proper `output_type` for validation +- **Message Handling**: Use `ItemHelpers.text_message_outputs()` for extracting text content +- **Context Preservation**: Use `result.to_input_list()` to maintain conversation context across handoffs +- **Tool Integration**: Remember to use `as_tool()` for agent composition, not direct calls +- **Parallel Execution**: Use `asyncio.gather()` for concurrent agent execution, not sequential loops + +This structure helps developers understand: +- **Advanced agent orchestration** patterns with Temporal integration +- **Quality improvement** through structured feedback and validation +- **Parallel execution** strategies using `asyncio.gather()` +- **Safety and validation** mechanisms with guardrails and tripwires +- **Agent composition** techniques via the `as_tool()` method +- **Structured data handling** with Pydantic models and `ItemHelpers` + +The patterns serve as building blocks for production-ready AI agent systems while maintaining the reliability, observability, and type safety that Temporal and Pydantic provide. 
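The troubleshooting note on parallel execution can be made concrete with a stdlib-only sketch. `fake_translate` is a hypothetical stand-in for an agent run (the real workflows would await `Runner.run`); the point is only the difference between awaiting calls one by one and passing them to `asyncio.gather()`.

```python
import asyncio


async def fake_translate(agent_name: str, text: str, delay: float) -> str:
    """Hypothetical stand-in for an agent run; sleeps to simulate model latency."""
    await asyncio.sleep(delay)
    return f"{agent_name}: {text}"


async def run_sequential(text: str) -> list[str]:
    # Each call starts only after the previous one has finished.
    return [await fake_translate(f"agent_{i}", text, 0.05) for i in range(3)]


async def run_parallel(text: str) -> list[str]:
    # All three calls are in flight at once; results keep argument order.
    results = await asyncio.gather(
        *(fake_translate(f"agent_{i}", text, 0.05) for i in range(3))
    )
    return list(results)


parallel_results = asyncio.run(run_parallel("hola"))
sequential_results = asyncio.run(run_sequential("hola"))
```

`asyncio.gather()` returns results in the order the awaitables were passed, so a later selection step can rely on positional correspondence with the agents that produced them.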
diff --git a/docs/openai_agents/BASIC.md b/docs/openai_agents/BASIC.md new file mode 100644 index 000000000..df5bca460 --- /dev/null +++ b/docs/openai_agents/BASIC.md @@ -0,0 +1,741 @@ +# Basic Agent Examples + +## 📑 Table of Contents + +- [Introduction](#introduction) +- [Philosophy & Challenges](#philosophy--challenges) +- [System Constraints & Features](#system-constraints--features) +- [System Overview](#system-overview) +- [System Flow](#system-flow) +- [Core Architecture](#core-architecture) +- [Interaction Flow](#interaction-flow) +- [Development Guidelines](#development-guidelines) +- [Code Examples & Best Practices](#code-examples--best-practices) + +## 🎯 Introduction + +The Basic Agent Examples service provides foundational examples for getting started with OpenAI Agents SDK integrated with Temporal workflows. This service demonstrates core agent capabilities including simple responses, tool usage, lifecycle management, dynamic prompts, image processing, and conversation continuity, all wrapped in Temporal's durable execution framework. 
+ +## 🧠 Philosophy & Challenges + +### What We're Solving +- **Learning Curve**: Developers need simple examples to understand the integration +- **Foundation Building**: Establish basic patterns for more complex agent workflows +- **Tool Integration**: Demonstrate how agents can use external tools and APIs +- **State Management**: Show how Temporal maintains agent state across executions +- **Agent Lifecycle**: Understand agent events, hooks, and handoffs +- **Dynamic Behavior**: Implement context-aware agent instructions +- **Image Processing**: Handle both local and remote image analysis +- **Output Validation**: Manage different output schema requirements + +### Our Approach +- **Progressive Complexity**: Start with simple agents, build up to complex workflows +- **Real-World Tools**: Use practical examples like weather APIs and image processing +- **Durability First**: Ensure all examples work reliably with Temporal's execution model +- **Pattern Consistency**: Establish consistent patterns across all basic examples +- **Event-Driven Design**: Use hooks and events for observability and control +- **Modular Architecture**: Separate activities, workflows, and execution logic + +## ⚡ System Constraints & Features + +### Key Features +- **Hello World Agent**: Simple agent that responds in haikus +- **Tools Integration**: Weather API, math operations, and image processing +- **Lifecycle Management**: Agent lifecycle events and handoffs with custom hooks +- **Dynamic Prompts**: Context-aware instruction generation (haiku/pirate/robot) +- **Image Processing**: Local file and remote URL analysis +- **Conversation Continuity**: Response ID tracking for context +- **Usage Tracking**: Detailed monitoring of requests, tokens, and API usage +- **Agent Handoffs**: Seamless transition between specialized agents +- **Output Schema Management**: Strict and non-strict JSON validation + +### System Constraints +- **No Streaming**: Temporal workflows don't support streaming 
responses +- **Activity-Based I/O**: All external calls must be wrapped in activities +- **Deterministic Execution**: Workflow code must be deterministic +- **State Persistence**: Automatic state management through Temporal +- **Task Queue Mismatch**: The worker uses a different task queue than the runner scripts + +## 🏗️ System Overview + +```mermaid +graph TB + A[Client Request] --> B[Temporal Workflow] + B --> C[OpenAI Agent] + C --> D[Tools & Activities] + D --> E[Weather API] + D --> F[Math Operations] + D --> G[Image Processing] + C --> H[Response Generation] + H --> B + + I[Agent Hooks] --> C + I --> J[Lifecycle Events] + I --> K[Usage Tracking] + + L[Agent Handoffs] --> C + L --> M[Specialized Agents] + + B --> N[Client Response] + + O[Temporal Server] --> B + P[Worker Process] --> B + Q[OpenAI API] --> C + R[Local Files] --> G + S[Remote URLs] --> G +``` + +## 🔄 System Flow + +```mermaid +sequenceDiagram + participant C as Client + participant T as Temporal Workflow + participant A as OpenAI Agent + participant O as OpenAI API + participant TL as Tools + participant H as Hooks + + C->>T: Start Basic Workflow + T->>A: Initialize Agent + H->>A: on_start event + + A->>O: Generate Response + O->>A: AI Response + + alt Tool Usage Required + H->>A: on_tool_start event + A->>TL: Execute Tool + TL->>A: Tool Result + H->>A: on_tool_end event + A->>O: Generate Final Response + end + + H->>A: on_end event + A->>T: Return Result + T->>C: Workflow Complete +``` + +## 🏛️ Core Architecture + +### Component Layers +1. **Activity Layer**: Temporal activities wrapping external API calls and file operations +2. **Workflow Layer**: Temporal workflows for agent orchestration and state management +3. **Agent Layer**: OpenAI agents with specific instructions, tools, and hooks +4. **Hook Layer**: Custom event handlers for lifecycle monitoring and usage tracking +5. 
**Execution Layer**: Runner scripts and worker processes for deployment + +### Key Components +- **Temporal Activities**: Weather, math, and image processing functions +- **Agent Workflows**: Simple to complex orchestration patterns +- **Custom Hooks**: Event handlers for agent lifecycle monitoring +- **Function Tools**: Workflow-level tools for agent usage +- **Worker Process**: Central execution engine for all workflows +- **Runner Scripts**: Individual execution scripts for testing + +## 🔗 Interaction Flow + +### Internal Communication +- Activities handle external I/O operations (file reading, API calls) +- Workflows orchestrate agent execution and state transitions +- Agents communicate through tool calls and response generation +- Hooks monitor and log agent lifecycle events +- State is maintained across workflow executions + +### External Dependencies +- **OpenAI API**: For agent responses and reasoning +- **Temporal Server**: For workflow orchestration and state management +- **Local File System**: For image processing and data access +- **Remote URLs**: For remote image analysis +- **Random Number Generation**: For workflow-level tools + +## 💻 Development Guidelines + +### Code Organization +- **Activities**: Grouped by functionality (weather, math, images) in `activities/` directory +- **Workflows**: One file per workflow type in `workflows/` directory +- **Runner Scripts**: Individual execution scripts in root directory +- **Worker**: Central worker supporting all examples in `run_worker.py` +- **Hooks**: Custom event handlers embedded in workflow files + +### Design Patterns +- **Activity Pattern**: Wrap external operations in Temporal activities +- **Hook Pattern**: Use event handlers for monitoring and control +- **Tool Pattern**: Provide agents with function-based tools +- **Handoff Pattern**: Seamless agent transitions with context preservation +- **Context Pattern**: Dynamic instruction generation based on runtime context +- **Schema Pattern**: 
Flexible output validation with strict/non-strict options + +### Error Handling +- **Activity Timeouts**: Configurable timeouts for external operations +- **Schema Validation**: Handle output type mismatches gracefully +- **Activity Retries**: Failed activities are retried automatically with exponential backoff +- **Graceful Degradation**: Continue operation when possible +- **Hook Error Handling**: Graceful handling of hook failures + +## 📝 Code Examples & Best Practices + +### Activity Definition Pattern +**File**: `openai_agents/basic/activities/get_weather_activity.py` + +This pattern demonstrates how to wrap external API calls in Temporal activities for reliable execution. + +```python +from dataclasses import dataclass +from temporalio import activity + +@dataclass +class Weather: + city: str + temperature_range: str + conditions: str + +@activity.defn +async def get_weather(city: str) -> Weather: + """Get the weather for a given city.""" + # Stubbed response; a real implementation would call a weather API here + return Weather( + city=city, + temperature_range="14-20C", + conditions="Sunny with wind." + ) +``` + +**Key Benefits**: +- **Reliable Execution**: Activities provide automatic retries and error handling +- **External I/O Wrapping**: Safely handle API calls and file operations +- **Type Safety**: The `Weather` dataclass gives the activity a typed, structured return value +- **Temporal Integration**: Seamlessly integrate with workflow orchestration +- **Timeout Control**: Configurable timeouts for external operations + +### Basic Agent Workflow +**File**: `openai_agents/basic/workflows/hello_world_workflow.py` + +This pattern demonstrates the simplest form of agent creation and execution within a Temporal workflow. 
+ +```python +from temporalio import workflow +from agents import Agent, Runner + +@workflow.defn +class HelloWorldAgent: + @workflow.run + async def run(self, prompt: str) -> str: + agent = Agent( + name="Assistant", + instructions="You only respond in haikus.", + ) + + result = await Runner.run(agent, input=prompt) + return result.final_output +``` + +**Key Benefits**: +- **Simple Agent Creation**: Minimal configuration for basic agent functionality +- **Workflow Integration**: Seamless integration with Temporal's execution model +- **Clear Instructions**: Simple, focused agent instructions for specific tasks +- **Easy Testing**: Straightforward workflow for development and testing +- **Foundation Pattern**: Base pattern for more complex agent workflows + +### Agent with Lifecycle Hooks +**File**: `openai_agents/basic/workflows/agent_lifecycle_workflow.py` + +This pattern demonstrates comprehensive agent lifecycle monitoring using custom hooks for observability and control. + +```python +class CustomAgentHooks(AgentHooks): + def __init__(self, display_name: str): + self.event_counter = 0 + self.display_name = display_name + + async def on_start(self, context: RunContextWrapper, agent: Agent) -> None: + self.event_counter += 1 + print(f"### ({self.display_name}) {self.event_counter}: Agent {agent.name} started") + + async def on_end(self, context: RunContextWrapper, agent: Agent, output: Any) -> None: + self.event_counter += 1 + print(f"### ({self.display_name}) {self.event_counter}: Agent {agent.name} ended with output {output}") + + async def on_handoff(self, context: RunContextWrapper, agent: Agent, source: Agent) -> None: + self.event_counter += 1 + print(f"### ({self.display_name}) {self.event_counter}: Agent {source.name} handed off to {agent.name}") + + async def on_tool_start(self, context: RunContextWrapper, agent: Agent, tool) -> None: + self.event_counter += 1 + print(f"### ({self.display_name}) {self.event_counter}: Agent {agent.name} started tool 
{tool.name}") + + async def on_tool_end(self, context: RunContextWrapper, agent: Agent, tool, result: str) -> None: + self.event_counter += 1 + print(f"### ({self.display_name}) {self.event_counter}: Agent {agent.name} ended tool {tool.name} with result {result}") +``` + +**Key Benefits**: +- **Complete Observability**: Monitor every stage of agent execution +- **Event Tracking**: Sequential event counting for debugging and monitoring +- **Custom Naming**: Clear identification of different agent instances +- **Lifecycle Control**: Intercept and control agent behavior at key points +- **Debugging Support**: Detailed logging for troubleshooting agent issues + +### Agent with Handoffs +**File**: `openai_agents/basic/workflows/agent_lifecycle_workflow.py` + +This pattern demonstrates seamless agent transitions with context preservation and specialized agent delegation. + +```python +@workflow.defn +class AgentLifecycleWorkflow: + @workflow.run + async def run(self, max_number: int) -> FinalResult: + multiply_agent = Agent( + name="Multiply Agent", + instructions="Multiply the number by 2 and then return the final result.", + tools=[multiply_by_two_tool], + output_type=FinalResult, + hooks=CustomAgentHooks(display_name="Multiply Agent"), + ) + + start_agent = Agent( + name="Start Agent", + instructions="Generate a random number. If it's even, stop. 
If it's odd, hand off to the multiply agent.", + tools=[random_number_tool], + output_type=FinalResult, + handoffs=[multiply_agent], + hooks=CustomAgentHooks(display_name="Start Agent"), + ) + + result = await Runner.run( + start_agent, + input=f"Generate a random number between 0 and {max_number}.", + ) + + return result.final_output +``` + +**Key Benefits**: +- **Agent Specialization**: Each agent has focused responsibilities and tools +- **Seamless Transitions**: Automatic handoffs based on business logic +- **Context Preservation**: Maintain conversation context across agent boundaries +- **Tool Distribution**: Distribute tools across specialized agents +- **Workflow Orchestration**: Complex logic through simple agent coordination + +### Dynamic System Prompts +**File**: `openai_agents/basic/workflows/dynamic_system_prompt_workflow.py` + +This pattern demonstrates context-aware instruction generation that adapts agent behavior based on runtime conditions. + +```python +class CustomContext: + def __init__(self, style: Literal["haiku", "pirate", "robot"]): + self.style = style + +def custom_instructions( + run_context: RunContextWrapper[CustomContext], agent: Agent[CustomContext] +) -> str: + context = run_context.context + if context.style == "haiku": + return "Only respond in haikus." + elif context.style == "pirate": + return "Respond as a pirate." + else: + return "Respond as a robot and say 'beep boop' a lot." + +@workflow.defn +class DynamicSystemPromptWorkflow: + @workflow.run + async def run(self, user_message: str, style: Optional[str] = None) -> str: + if style is None: + selected_style: Literal["haiku", "pirate", "robot"] = workflow.random().choice(["haiku", "pirate", "robot"]) + else: + if style not in ["haiku", "pirate", "robot"]: + raise ValueError(f"Invalid style: {style}. 
Must be one of: haiku, pirate, robot") + selected_style = style + + context = CustomContext(style=selected_style) + agent = Agent( + name="Chat agent", + instructions=custom_instructions, + ) + + result = await Runner.run(agent, user_message, context=context) + return f"Style: {selected_style}\nResponse: {result.final_output}" +``` + +**Key Benefits**: +- **Runtime Adaptation**: Agent behavior changes based on context +- **Random Selection**: Automatic style selection when none specified +- **Type Safety**: Literal types ensure valid style values +- **Context Injection**: Runtime context passed to instruction functions +- **Flexible Behavior**: Single agent with multiple personality modes + +### Usage Tracking with RunHooks +**File**: `openai_agents/basic/workflows/agent_lifecycle_workflow.py` + +This pattern demonstrates comprehensive usage monitoring for API consumption, token tracking, and cost optimization. + +```python +class ExampleHooks(RunHooks): + def __init__(self): + self.event_counter = 0 + + def _usage_to_str(self, usage: Usage) -> str: + return f"{usage.requests} requests, {usage.input_tokens} input tokens, {usage.output_tokens} output tokens, {usage.total_tokens} total tokens" + + async def on_agent_start(self, context: RunContextWrapper, agent: Agent) -> None: + self.event_counter += 1 + print(f"### {self.event_counter}: Agent {agent.name} started. Usage: {self._usage_to_str(context.usage)}") + + async def on_agent_end(self, context: RunContextWrapper, agent: Agent, output: Any) -> None: + self.event_counter += 1 + print(f"### {self.event_counter}: Agent {agent.name} ended with output {output}. Usage: {self._usage_to_str(context.usage)}") + + async def on_tool_start(self, context: RunContextWrapper, agent: Agent, tool: Tool) -> None: + self.event_counter += 1 + print(f"### {self.event_counter}: Tool {tool.name} started. 
Usage: {self._usage_to_str(context.usage)}") + + async def on_tool_end(self, context: RunContextWrapper, agent: Agent, tool: Tool, result: str) -> None: + self.event_counter += 1 + print(f"### {self.event_counter}: Tool {tool.name} ended with result {result}. Usage: {self._usage_to_str(context.usage)}") + + async def on_handoff(self, context: RunContextWrapper, from_agent: Agent, to_agent: Agent) -> None: + self.event_counter += 1 + print(f"### {self.event_counter}: Handoff from {from_agent.name} to {to_agent.name}. Usage: {self._usage_to_str(context.usage)}") +``` + +**Key Benefits**: +- **Cost Monitoring**: Track token usage and API requests for budgeting +- **Performance Analysis**: Monitor execution time and resource consumption +- **Event Sequencing**: Sequential event tracking for debugging workflows +- **Usage Optimization**: Identify expensive operations for optimization +- **Comprehensive Coverage**: Monitor all agent lifecycle events and tool usage + +### Function Tools in Workflows +**File**: `openai_agents/basic/workflows/agent_lifecycle_workflow.py` + +This pattern demonstrates how to create simple, workflow-level tools that agents can use for basic operations. 
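The tools in this section draw numbers from `workflow.random()` instead of the global `random` module because workflow code must produce the same values when it is replayed. The stdlib-only sketch below illustrates the idea with a seeded `random.Random`; `SEED` and `pick_numbers` are hypothetical names, and inside a real workflow the seed is managed by Temporal rather than by user code.

```python
import random


def pick_numbers(rng: random.Random, max_value: int, count: int = 3) -> list[int]:
    """Draw `count` integers in [0, max_value] from the supplied generator."""
    return [rng.randint(0, max_value) for _ in range(count)]


# Hypothetical seed, chosen only for this illustration.
SEED = 1234

# Two generators built from the same seed yield the same sequence,
# which is the property a replayed workflow depends on.
first_run = pick_numbers(random.Random(SEED), 100)
replayed_run = pick_numbers(random.Random(SEED), 100)
```

Calling the module-level `random.randint` instead would give different values on each execution, so a replay could diverge from the recorded history.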
+ +```python +@function_tool +def random_number_tool(max: int) -> int: + """Generate a random number up to the provided maximum.""" + return workflow.random().randint(0, max) + +@function_tool +def multiply_by_two_tool(x: int) -> int: + """Simple multiplication by two.""" + return x * 2 +``` + +**Key Benefits**: +- **Workflow Integration**: Tools directly available within workflow context +- **Deterministic Execution**: Use Temporal's random number generation +- **Simple Operations**: Lightweight tools for basic computational tasks +- **Agent Access**: Agents can use these tools without external dependencies +- **Easy Testing**: Simple tools that are easy to test and validate + +### Image Processing Workflow +**File**: `openai_agents/basic/workflows/local_image_workflow.py` + +This pattern demonstrates how to process local images by converting them to base64 and analyzing them with AI agents. + +```python +from datetime import timedelta + +@workflow.defn +class LocalImageWorkflow: + @workflow.run + async def run(self, image_path: str, question: str = "What do you see in this image?") -> str: + # Convert image to base64 using activity + b64_image = await workflow.execute_activity( + read_image_as_base64, + image_path, + start_to_close_timeout=timedelta(seconds=30), + ) + + agent = Agent( + name="Assistant", + instructions="You are a helpful assistant.", + ) + + result = await Runner.run( + agent, + [ + { + "role": "user", + "content": [ + { + "type": "input_image", + "detail": "auto", + "image_url": f"data:image/jpeg;base64,{b64_image}", + } + ], + }, + { + "role": "user", + "content": question, + }, + ], + ) + return result.final_output +``` + +**Key Benefits**: +- **Local File Processing**: Handle images stored on the local filesystem +- **Base64 Conversion**: Convert images to format suitable for AI analysis +- **Activity Integration**: Use Temporal activities for file I/O operations +- **Flexible Questions**: Customizable questions for image analysis +- **Timeout Control**: Configurable 
timeouts for image processing operations + +### Non-Strict Output Schema +**File**: `openai_agents/basic/workflows/non_strict_output_workflow.py` + +This pattern demonstrates flexible output validation with both strict and non-strict schema enforcement options. + +```python +@dataclass +class OutputType: + jokes: dict[int, str] + """A list of jokes, indexed by joke number.""" + +@workflow.defn +class NonStrictOutputWorkflow: + @workflow.run + async def run(self, input_text: str) -> dict[str, Any]: + results = {} + + agent = Agent( + name="Assistant", + instructions="You are a helpful assistant.", + output_type=OutputType, + ) + + # Try with strict output type (this should fail) + try: + result = await Runner.run(agent, input_text) + results["strict_result"] = "Unexpected success" + except Exception as e: + results["strict_error"] = str(e) + + # Try with non-strict output type + try: + agent.output_type = AgentOutputSchema(OutputType, strict_json_schema=False) + result = await Runner.run(agent, input_text) + results["non_strict_result"] = result.final_output + except Exception as e: + results["non_strict_error"] = str(e) + + return results +``` + +**Key Benefits**: +- **Schema Flexibility**: Choose between strict and relaxed validation +- **Error Handling**: Gracefully handle validation failures +- **Testing Support**: Compare strict vs. non-strict behavior +- **Production Safety**: Use strict validation in production, relaxed in development +- **Schema Evolution**: Adapt to changing output requirements + +### Conversation Continuity +**File**: `openai_agents/basic/workflows/previous_response_id_workflow.py` + +This pattern demonstrates how to maintain conversation context across multiple agent interactions using response IDs. 
+ +```python +@workflow.defn +class PreviousResponseIdWorkflow: + @workflow.run + async def run(self, first_question: str, follow_up_question: str) -> Tuple[str, str]: + agent = Agent( + name="Assistant", + instructions="You are a helpful assistant. be VERY concise.", + ) + + # First question + result1 = await Runner.run(agent, first_question) + first_response = result1.final_output + + # Follow-up question using previous response ID + result2 = await Runner.run( + agent, + follow_up_question, + previous_response_id=result1.last_response_id, + ) + second_response = result2.final_output + + return first_response, second_response +``` + +**Key Benefits**: +- **Context Preservation**: Maintain conversation history across interactions +- **Follow-up Support**: Enable natural conversation flow with context +- **Memory Management**: Use OpenAI's built-in conversation memory +- **Stateful Interactions**: Build multi-turn conversations with agents +- **Concise Instructions**: Control response length for better user experience + +### Tools Integration with Activities +**File**: `openai_agents/basic/workflows/tools_workflow.py` + +This pattern demonstrates how to integrate Temporal activities as tools for agents, enabling external I/O operations. 
+ +```python +@workflow.defn +class ToolsWorkflow: + @workflow.run + async def run(self, question: str) -> str: + agent = Agent( + name="Hello world", + instructions="You are a helpful agent.", + tools=[ + temporal_agents.workflow.activity_as_tool( + get_weather, start_to_close_timeout=timedelta(seconds=10) + ) + ], + ) + + result = await Runner.run(agent, input=question) + return result.final_output +``` + +**Key Benefits**: +- **Activity Integration**: Convert Temporal activities into agent tools +- **External I/O**: Enable agents to access external APIs and services +- **Timeout Control**: Configurable timeouts for tool execution +- **Error Handling**: Leverage Temporal's built-in error handling +- **Reliable Execution**: Activities provide automatic retries and durability + +### Worker Configuration +**File**: `openai_agents/basic/run_worker.py` + +This is the central worker that supports all basic agent workflows, providing a single execution environment. + +```python +async def main(): + client = await Client.connect( + "localhost:7233", + plugins=[ + OpenAIAgentsPlugin( + model_params=ModelActivityParameters( + start_to_close_timeout=timedelta(seconds=30) + ) + ), + ], + ) + + worker = Worker( + client, + task_queue="openai-agents-task-queue", + workflows=[ + HelloWorldAgent, + ToolsWorkflow, + AgentLifecycleWorkflow, + DynamicSystemPromptWorkflow, + NonStrictOutputWorkflow, + LocalImageWorkflow, + RemoteImageWorkflow, + LifecycleWorkflow, + PreviousResponseIdWorkflow, + ], + activities=[ + get_weather, + multiply_by_two, + random_number, + read_image_as_base64, + ], + ) + await worker.run() +``` + +**Key Benefits**: +- **Centralized Execution**: Single worker for all basic agent workflows +- **Workflow Registration**: Register all workflow classes in one place +- **Activity Registration**: Register all activities for tool integration +- **Plugin Configuration**: Configure OpenAI integration with proper timeouts +- **Easy Deployment**: Single process to 
manage and monitor all workflows + +### Testing Approach +**File**: `tests/hello/` (example test structure) + +This pattern demonstrates how to test agent workflows using mocking and validation techniques. + +```python +async def test_basic_agent(): + """Test basic agent workflows.""" + with mock.patch('agents.Runner.run') as mock_run: + mock_run.return_value.final_output = "Expected haiku response" + + result = await client.execute_workflow( + HelloWorldAgent.run, + "Tell me about the weather", + id="test-hello-world", + task_queue="test-queue" + ) + + assert "Expected haiku response" in result + +async def test_agent_lifecycle(): + """Test agent lifecycle with hooks.""" + with mock.patch('agents.Runner.run') as mock_run: + mock_run.return_value.final_output = FinalResult(number=42) + + result = await client.execute_workflow( + AgentLifecycleWorkflow.run, + 100, + id="test-lifecycle", + task_queue="test-queue" + ) + + assert result.number == 42 +``` + +**Key Benefits**: +- **Mocked Dependencies**: Test workflows without external API calls +- **Isolated Testing**: Test individual workflow components +- **Validation**: Ensure workflows produce expected outputs +- **Development Workflow**: Quick iteration during development +- **Quality Assurance**: Catch issues before production deployment + +## 🎯 Key Benefits of This Structure + +1. **Learning Foundation**: Provides essential patterns for new developers +2. **Progressive Complexity**: Builds from simple to advanced concepts +3. **Real-World Examples**: Uses practical tools and APIs +4. **Pattern Consistency**: Establishes reusable patterns across examples +5. **Easy Onboarding**: Simple examples that demonstrate core concepts +6. **Lifecycle Management**: Comprehensive event handling and monitoring +7. **Dynamic Behavior**: Context-aware agent instructions +8. **Usage Tracking**: Detailed monitoring of API consumption +9. **Image Processing**: Both local and remote image analysis +10. 
**Output Validation**: Flexible schema handling for different requirements + +## ⚠️ Important Implementation Notes + +### Task Queue Configuration +- **Worker**: Uses task queue `"openai-agents-task-queue"` +- **Runner Scripts**: Use task queue `"openai-agents-basic-task-queue"` +- **Note**: Because the names differ, workflows started by the runner scripts are not picked up by this worker unless both sides are changed to use the same task queue + +### Specific Examples Implemented +- **Hello World**: Asks about recursion in programming +- **Dynamic Prompts**: Tests with "Tell me a joke" and a specific "pirate" style +- **Image Processing**: + - Local: Uses `media/image_bison.jpg` with "What do you see in this image?" + - Remote: Uses a Golden Gate Bridge image URL with the same question +- **Non-Strict Output**: Asks for "3 short jokes" +- **Previous Response ID**: Uses South America geography questions +- **Tools**: Asks about the weather in "Tokio" (sic; the sample code misspells "Tokyo") + +### Architecture Patterns +- **Activity-First Design**: All external operations wrapped in Temporal activities +- **Hook-Based Monitoring**: Comprehensive event tracking for observability +- **Tool Integration**: Both function tools and activity-based tools +- **Context-Aware Instructions**: Dynamic prompt generation based on runtime state +- **Schema Flexibility**: Support for strict and non-strict output validation + +This structure ensures developers can quickly understand: +- **Basic agent creation** and configuration +- **Tool integration** patterns with activities +- **Workflow orchestration** with Temporal +- **Activity wrapping** for external calls +- **State management** and durability +- **Agent lifecycle** events and hooks +- **Dynamic prompt** generation +- **Usage monitoring** and optimization +- **Image processing** workflows +- **Output validation** strategies + +The examples serve as building blocks for more complex agent workflows while maintaining the reliability and observability that Temporal provides.
The comprehensive coverage of activities, workflows, hooks, and execution patterns demonstrates how to build production-ready agent systems with proper monitoring, error handling, and state management. diff --git a/docs/openai_agents/CUSTOMER_SERVICE.md b/docs/openai_agents/CUSTOMER_SERVICE.md new file mode 100644 index 000000000..ae9906f78 --- /dev/null +++ b/docs/openai_agents/CUSTOMER_SERVICE.md @@ -0,0 +1,503 @@ +# Customer Service + +## 📑 Table of Contents +- [Introduction](#introduction) +- [Philosophy & Challenges](#philosophy--challenges) +- [System Constraints & Features](#system-constraints--features) +- [System Overview](#system-overview) +- [System Flow](#system-flow) +- [Core Architecture](#core-architecture) +- [Interaction Flow](#interaction-flow) +- [Development Guidelines](#development-guidelines) +- [Code Examples & Best Practices](#code-examples--best-practices) +- [Key Benefits of This Structure](#key-benefits-of-this-structure) +- [Important Implementation Notes](#important-implementation-notes) +- [Architecture Patterns](#architecture-patterns) +- [File Organization](#file-organization) +- [Common Development Patterns](#common-development-patterns) + +## 🎯 Introduction +The Customer Service system demonstrates how to build persistent, stateful conversations using the OpenAI Agents SDK with durable Temporal workflows. It provides an interactive customer service experience with intelligent agent handoffs, maintaining conversation state across multiple interactions and surviving system restarts and failures.
+ +## 🧠 Philosophy & Challenges + +### What We're Solving +- **Persistent Conversations**: Maintaining conversation context across multiple interactions +- **Intelligent Routing**: Automatically directing customer queries to specialized agents +- **Stateful Workflows**: Preserving conversation state through system restarts +- **Multi-Agent Coordination**: Seamless handoffs between specialized service agents + +### Our Approach +- **Durable Execution**: Using Temporal workflows to ensure conversation persistence +- **Agent Specialization**: Creating focused agents for specific service domains +- **Context Preservation**: Maintaining customer context throughout the conversation flow +- **Graceful Degradation**: Fallback mechanisms for handling edge cases + +## ⚡ System Constraints & Features + +### Key Features +- **Multi-Agent Handoffs**: Intelligent routing between FAQ, seat booking, and triage agents +- **Persistent State**: Conversation history and context preserved in Temporal workflows +- **Real-time Updates**: Live conversation updates through workflow queries and updates +- **Context-Aware Tools**: Tools that maintain and update conversation context +- **Validation & Error Handling**: Input validation and stale conversation detection + +### System Constraints +- **Input Length**: User messages limited to 1000 characters +- **Conversation Freshness**: Stale chat history detection prevents out-of-order updates +- **Workflow Continuation**: Uses continue-as-new pattern for long-running conversations +- **Timeout Management**: 30-second timeout for model activities + +## 🏗️ System Overview + +```mermaid +graph TB + A[Customer Input] --> B[Triage Agent] + B --> C{Query Type} + C -->|FAQ| D[FAQ Agent] + C -->|Seat Booking| E[Seat Booking Agent] + D --> F[FAQ Lookup Tool] + E --> G[Update Seat Tool] + F --> H[Response Generation] + G --> I[Context Update] + H --> J[Conversation History] + I --> J + J --> K[Temporal Workflow State] + K --> L[Persistent Storage] +``` 
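The input-length and conversation-freshness constraints listed under System Constraints can be sketched in plain Python. This is a minimal illustration, not the workflow's actual code: `validate_user_message` and `MAX_INPUT_CHARS` are hypothetical names, and the `ProcessUserMessageInput` dataclass here is a stand-in that only borrows the `user_input` and `chat_length` field names used by the client. In the real workflow such checks would presumably run inside a Temporal update validator, which is an assumption since this document does not show one.

```python
from dataclasses import dataclass

# Limit taken from the "Input Length" constraint; the constant name is hypothetical.
MAX_INPUT_CHARS = 1000


@dataclass
class ProcessUserMessageInput:
    """Stand-in for the update payload; field names mirror the client code."""

    user_input: str
    chat_length: int


def validate_user_message(
    message: ProcessUserMessageInput, history_length: int
) -> None:
    """Raise ValueError when a message should be rejected (hypothetical helper)."""
    # Reject messages longer than the documented limit.
    if len(message.user_input) > MAX_INPUT_CHARS:
        raise ValueError(f"User input exceeds {MAX_INPUT_CHARS} characters.")
    # The client reports how much history it has seen; a mismatch means the
    # conversation advanced elsewhere and the client's view is stale.
    if message.chat_length != history_length:
        raise ValueError("Stale chat history; refresh and retry.")
```

Rejecting a stale message before it is appended to the history keeps two clients that share a conversation ID from interleaving turns; the rejected client can re-query the history and resend.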
+ +## 🔄 System Flow + +```mermaid +sequenceDiagram + participant C as Customer Client + participant W as Workflow + participant T as Triage Agent + participant F as FAQ Agent + participant S as Seat Agent + participant DB as Temporal State + + C->>W: Start Conversation + W->>DB: Initialize State + C->>W: Send Message + W->>T: Route Query + T->>F: Handoff to FAQ + F->>W: Generate Response + W->>DB: Update History + W->>C: Return Response + C->>W: Send Follow-up + W->>DB: Retrieve State + W->>S: Handoff to Seat Agent + S->>W: Update Seat + W->>DB: Persist Changes + W->>C: Confirm Update +``` + +## 🏛️ Core Architecture + +### Component Layers +1. **Input Layer**: Customer client interface with conversation management +2. **Routing Layer**: Triage agent for intelligent query distribution +3. **Specialized Layer**: Domain-specific agents (FAQ, seat booking) +4. **Tool Layer**: Function tools for data lookup and updates +5. **State Layer**: Temporal workflow for persistent conversation state + +### Key Components +- **[CustomerServiceWorkflow](./workflows/customer_service_workflow.py)**: Main workflow orchestrating conversation flow +- **[AirlineAgentContext](./customer_service.py)**: Pydantic model for conversation context +- **[Triage Agent](./customer_service.py)**: Intelligent routing agent for query distribution +- **[FAQ Agent](./customer_service.py)**: Specialized agent for answering common questions +- **[Seat Booking Agent](./customer_service.py)**: Specialized agent for seat management +- **[Function Tools](./customer_service.py)**: Domain-specific tools for data operations + +## 🔗 Interaction Flow + +### Internal Communication +- **Agent Handoffs**: Seamless transitions between specialized agents +- **Context Propagation**: Shared context maintained across agent transitions +- **Tool Integration**: Agents use function tools for external data access +- **State Synchronization**: Workflow state synchronized with conversation progress + +### External Dependencies 
+- **OpenAI API**: For agent responses and reasoning +- **Temporal Server**: For workflow orchestration and state persistence +- **Customer Client**: For user interaction and conversation management + +## 💻 Development Guidelines + +### Code Organization +- **Workflow-Centric**: Main business logic in Temporal workflows +- **Agent Separation**: Clear separation of concerns between agent types +- **Tool Encapsulation**: Function tools for external data operations +- **Context Management**: Centralized context handling through Pydantic models + +### Design Patterns +- **Multi-Agent Pattern**: Specialized agents with handoff capabilities +- **State Machine Pattern**: Workflow state management for conversation flow +- **Tool Pattern**: Function tools for external system integration +- **Validation Pattern**: Input validation and error handling + +### Error Handling +- **Input Validation**: User input validation with clear error messages +- **Stale Detection**: Conversation freshness validation to prevent conflicts +- **Graceful Fallbacks**: Fallback mechanisms for handling edge cases +- **Context Recovery**: State recovery for interrupted conversations + +## 📝 Code Examples & Best Practices + +### Multi-Agent Handoff Pattern +**File**: `openai_agents/customer_service/customer_service.py` + +```python +def init_agents() -> Tuple[ + Agent[AirlineAgentContext], Dict[str, Agent[AirlineAgentContext]] +]: + """Initialize the agents for the airline customer service workflow.""" + + # FAQ Agent with specialized knowledge + faq_agent = Agent[AirlineAgentContext]( + name="FAQ Agent", + handoff_description="A helpful agent that can answer questions about the airline.", + instructions=f"""{RECOMMENDED_PROMPT_PREFIX} + You are an FAQ agent. If you are speaking to a customer, you probably were transferred to from the triage agent. + Use the following routine to support the customer. + # Routine + 1. Identify the last question asked by the customer. + 2. 
Use the faq lookup tool to answer the question. Do not rely on your own knowledge. + 3. If you cannot answer the question, transfer back to the triage agent.""", + tools=[faq_lookup_tool], + ) + + # Seat Booking Agent with context-aware tools + seat_booking_agent = Agent[AirlineAgentContext]( + name="Seat Booking Agent", + handoff_description="A helpful agent that can update a seat on a flight.", + instructions=f"""{RECOMMENDED_PROMPT_PREFIX} + You are a seat booking agent. If you are speaking to a customer, you probably were transferred to from the triage agent. + Use the following routine to support the customer. + # Routine + 1. Ask for their confirmation number. + 2. Ask the customer what their desired seat number is. + 3. Use the update seat tool to update the seat on the flight. + If the customer asks a question that is not related to the routine, transfer back to the triage agent.""", + tools=[update_seat], + ) + + # Triage Agent with intelligent routing + triage_agent = Agent[AirlineAgentContext]( + name="Triage Agent", + handoff_description="A triage agent that can delegate a customer's request to the appropriate agent.", + instructions=( + f"{RECOMMENDED_PROMPT_PREFIX} " + "You are a helpful triaging agent. You can use your tools to delegate questions to other appropriate agents." 
+ ), + handoffs=[ + faq_agent, + handoff(agent=seat_booking_agent, on_handoff=on_seat_booking_handoff), + ], + ) + + # Bidirectional handoffs for seamless navigation + faq_agent.handoffs.append(triage_agent) + seat_booking_agent.handoffs.append(triage_agent) + return triage_agent, { + agent.name: agent for agent in [faq_agent, seat_booking_agent, triage_agent] + } +``` + +**Key Benefits**: +- **Intelligent Routing**: Triage agent automatically directs queries to appropriate specialists +- **Bidirectional Handoffs**: Agents can return to triage for re-routing +- **Context Preservation**: Shared context maintained across all agent transitions +- **Specialized Expertise**: Each agent focuses on specific domain knowledge + +### Context-Aware Function Tools +**File**: `openai_agents/customer_service/customer_service.py` + +```python +@function_tool +async def update_seat( + context: RunContextWrapper[AirlineAgentContext], + confirmation_number: str, + new_seat: str, +) -> str: + """Update the seat for a given confirmation number.""" + + # Update the context based on the customer's input + context.context.confirmation_number = confirmation_number + context.context.seat_number = new_seat + + # Ensure that the flight number has been set by the incoming handoff + assert context.context.flight_number is not None, "Flight number is required" + + return f"Updated seat to {new_seat} for confirmation number {confirmation_number}" + +@function_tool( + name_override="faq_lookup_tool", + description_override="Lookup frequently asked questions.", +) +async def faq_lookup_tool(question: str) -> str: + """Lookup frequently asked questions with intelligent keyword matching.""" + + question_lower = question.lower() + if "bag" in question_lower or "baggage" in question_lower: + return ( + "You are allowed to bring one bag on the plane. " + "It must be under 50 pounds and 22 inches x 14 inches x 9 inches." 
+ ) + elif "seats" in question_lower or "plane" in question_lower: + return ( + "There are 120 seats on the plane. " + "There are 22 business class seats and 98 economy seats. " + "Exit rows are rows 4 and 16. " + "Rows 5-8 are Economy Plus, with extra legroom." + ) + elif "wifi" in question_lower: + return "We have free wifi on the plane, join Airline-Wifi" + return "I'm sorry, I don't know the answer to that question." +``` + +**Key Benefits**: +- **Context Integration**: Tools can read and update conversation context +- **Intelligent Matching**: Keyword-based FAQ lookup for relevant responses +- **Data Validation**: Assertions ensure required context is available +- **Clear Responses**: Structured, informative responses for customer queries + +### Persistent Workflow State Management +**File**: `openai_agents/customer_service/workflows/customer_service_workflow.py` + +```python +@workflow.defn +class CustomerServiceWorkflow: + @workflow.init + def __init__( + self, customer_service_state: CustomerServiceWorkflowState | None = None + ): + """Initialize workflow with optional state restoration.""" + + self.run_config = RunConfig() + starting_agent, self.agent_map = init_agents() + + # Restore state or start fresh + self.current_agent = ( + self.agent_map[customer_service_state.current_agent_name] + if customer_service_state + else starting_agent + ) + self.context = ( + customer_service_state.context + if customer_service_state + else AirlineAgentContext() + ) + self.printed_history: list[str] = ( + customer_service_state.printed_history if customer_service_state else [] + ) + self.input_items = ( + customer_service_state.input_items if customer_service_state else [] + ) + + @workflow.update + async def process_user_message(self, input: ProcessUserMessageInput) -> list[str]: + """Process user message and maintain conversation state.""" + + length = len(self.printed_history) + self.printed_history.append(f"User: {input.user_input}") + + with trace("Customer 
service", group_id=workflow.info().workflow_id): + # Add user input to conversation context + self.input_items.append({"content": input.user_input, "role": "user"}) + + # Execute agent with current context + result = await Runner.run( + self.current_agent, + self.input_items, + context=self.context, + run_config=self.run_config, + ) + + # Process all response items for history + for new_item in result.new_items: + agent_name = new_item.agent.name + if isinstance(new_item, MessageOutputItem): + self.printed_history.append( + f"{agent_name}: {ItemHelpers.text_message_output(new_item)}" + ) + elif isinstance(new_item, HandoffOutputItem): + self.printed_history.append( + f"Handed off from {new_item.source_agent.name} to {new_item.target_agent.name}" + ) + elif isinstance(new_item, ToolCallItem): + self.printed_history.append(f"{agent_name}: Calling a tool") + elif isinstance(new_item, ToolCallOutputItem): + self.printed_history.append( + f"{agent_name}: Tool call output: {new_item.output}" + ) + + # Update conversation state + self.input_items = result.to_input_list() + self.current_agent = result.last_agent + + # Update workflow details for monitoring + workflow.set_current_details("\n\n".join(self.printed_history)) + return self.printed_history[length:] +``` + +**Key Benefits**: +- **State Persistence**: Conversation state maintained across system restarts +- **Context Restoration**: Seamless resumption of interrupted conversations +- **Comprehensive History**: All interaction types logged for debugging +- **Workflow Monitoring**: Real-time conversation state visible in Temporal UI + +### Conversation Client with State Management +**File**: `openai_agents/customer_service/run_customer_service_client.py` + +```python +async def main(): + parser = argparse.ArgumentParser() + parser.add_argument("--conversation-id", type=str, required=True) + args = parser.parse_args() + + # Create client connected to Temporal server + client = await Client.connect( + "localhost:7233", 
+ plugins=[OpenAIAgentsPlugin()], + ) + + handle = client.get_workflow_handle(args.conversation_id) + + # Query existing workflow or start new conversation + start = False + history = [] + try: + history = await handle.query( + CustomerServiceWorkflow.get_chat_history, + reject_condition=QueryRejectCondition.NOT_OPEN, + ) + except WorkflowQueryRejectedError: + start = True + except RPCError as e: + if e.status == RPCStatusCode.NOT_FOUND: + start = True + else: + raise e + + # Start new workflow if needed + if start: + await client.start_workflow( + CustomerServiceWorkflow.run, + id=args.conversation_id, + task_queue="openai-agents-task-queue", + ) + history = [] + + print(*history, sep="\n") + + # Interactive conversation loop + while True: + user_input = input("Enter your message: ") + message_input = ProcessUserMessageInput( + user_input=user_input, chat_length=len(history) + ) + + try: + new_history = await handle.execute_update( + CustomerServiceWorkflow.process_user_message, message_input + ) + history.extend(new_history) + print(*new_history[1:], sep="\n") + except WorkflowUpdateFailedError: + print("** Stale conversation. 
Reloading...") + # Refresh conversation state + length = len(history) + history = await handle.query( + CustomerServiceWorkflow.get_chat_history, + reject_condition=QueryRejectCondition.NOT_OPEN, + ) + print(*history[length:], sep="\n") +``` + +**Key Benefits**: +- **Conversation Persistence**: Unique conversation IDs for long-running chats +- **State Recovery**: Automatic detection and recovery of stale conversations +- **Interactive Interface**: Real-time conversation updates and responses +- **Error Handling**: Graceful handling of workflow failures and state conflicts + +### Worker Configuration with Extended Timeouts +**File**: `openai_agents/customer_service/run_worker.py` + +```python +async def main(): + # Create client connected to Temporal server + client = await Client.connect( + "localhost:7233", + plugins=[ + OpenAIAgentsPlugin( + model_params=ModelActivityParameters( + start_to_close_timeout=timedelta(seconds=30) + ) + ), + ], + ) + + # Create worker for customer service workflows + worker = Worker( + client, + task_queue="openai-agents-task-queue", + workflows=[CustomerServiceWorkflow], + ) + await worker.run() +``` + +**Key Benefits**: +- **Extended Timeouts**: 30-second timeout for complex agent interactions +- **Dedicated Task Queue**: Isolated processing for customer service workflows +- **Plugin Integration**: OpenAI Agents plugin for seamless agent execution +- **Scalable Processing**: Worker can handle multiple concurrent conversations + +## 🎯 **Key Benefits of This Structure:** + +1. **Persistent Conversations**: Conversations survive system restarts and failures +2. **Intelligent Routing**: Automatic query distribution to specialized agents +3. **Context Preservation**: Shared context maintained across agent transitions +4. **Stateful Workflows**: Temporal workflows provide durable conversation state +5. **Scalable Architecture**: Multiple workers can handle concurrent conversations +6. 
**Developer Experience**: Clear separation of concerns and comprehensive error handling + +## ⚠️ **Important Implementation Notes:** + +- **Task Queue**: Uses `"openai-agents-task-queue"` for all customer service workflows +- **State Persistence**: Implements continue-as-new pattern for long-running conversations +- **Input Validation**: Strict validation prevents stale conversations and invalid inputs +- **Agent Handoffs**: Bidirectional handoffs enable seamless navigation between specialists +- **Context Management**: Pydantic models ensure type safety and data validation + +## 🏗️ **Architecture Patterns:** + +- **Multi-Agent Pattern**: Specialized agents with handoff capabilities +- **State Machine Pattern**: Workflow state management for conversation flow +- **Tool Pattern**: Function tools for external system integration +- **Validation Pattern**: Input validation and error handling +- **Persistence Pattern**: Temporal workflows for durable state management + +## 📁 **File Organization:** + +``` +openai_agents/customer_service/ +├── README.md # Usage instructions and examples +├── customer_service.py # Agent definitions and function tools +├── run_customer_service_client.py # Interactive conversation client +├── run_worker.py # Temporal worker configuration +└── workflows/ + └── customer_service_workflow.py # Main workflow orchestration +``` + +## 🔧 **Common Development Patterns:** + +- **Agent Initialization**: Centralized agent creation with handoff configuration +- **Context Propagation**: Shared context objects for state management +- **Tool Integration**: Function tools for external system operations +- **State Restoration**: Workflow state recovery for interrupted conversations +- **Error Handling**: Comprehensive validation and graceful degradation +- **Monitoring**: Workflow tracing and conversation state visibility + +This structure ensures new developers can quickly understand how to build persistent, multi-agent customer service systems with 
intelligent routing and state management using Temporal workflows and the OpenAI Agents SDK. diff --git a/docs/openai_agents/FINANCIAL_RESEARCH_AGENT.md b/docs/openai_agents/FINANCIAL_RESEARCH_AGENT.md new file mode 100644 index 000000000..87bb77fe4 --- /dev/null +++ b/docs/openai_agents/FINANCIAL_RESEARCH_AGENT.md @@ -0,0 +1,547 @@ +# Financial Research Agent + +## 📑 Table of Contents +- [Introduction](#introduction) +- [Philosophy & Challenges](#philosophy--challenges) +- [System Constraints & Features](#system-constraints--features) +- [System Overview](#system-overview) +- [System Flow](#system-flow) +- [Core Architecture](#core-architecture) +- [Interaction Flow](#interaction-flow) +- [Development Guidelines](#development-guidelines) +- [Code Examples & Best Practices](#code-examples--best-practices) +- [Key Benefits of This Structure](#key-benefits-of-this-structure) +- [Important Implementation Notes](#important-implementation-notes) +- [Architecture Patterns](#architecture-patterns) +- [File Organization](#file-organization) +- [Common Development Patterns](#common-development-patterns) + +## 🎯 Introduction +The Financial Research Agent system demonstrates how to build a multi-agent financial research pipeline using the OpenAI Agents SDK with Temporal's durable execution. It orchestrates specialized agents for planning, searching, analysis, writing, and verification to produce sourced financial research reports that pass a final verification step.
+ +## 🧠 Philosophy & Challenges + +### What We're Solving +- **Complex Research Workflows**: Breaking down financial analysis into specialized, manageable tasks +- **Multi-Agent Coordination**: Orchestrating diverse agents with different expertise areas +- **Quality Assurance**: Ensuring research reports are accurate, consistent, and well-sourced +- **Parallel Processing**: Optimizing research speed through concurrent web searches +- **Structured Output**: Producing standardized financial reports with executive summaries + +### Our Approach +- **Agent Specialization**: Creating focused agents for specific research domains +- **Pipeline Architecture**: Sequential workflow with parallel execution where possible +- **Tool Integration**: Exposing specialist agents as tools for inline analysis +- **Verification Layer**: Final quality check to ensure report consistency and accuracy +- **Durable Execution**: Using Temporal workflows to handle long-running research tasks + +## ⚡ System Constraints & Features + +### Key Features +- **Multi-Agent Orchestration**: 6 specialized agents working in coordinated sequence +- **Parallel Web Searches**: Concurrent execution of multiple search queries +- **Specialist Analysis Tools**: Financial and risk analysis agents exposed as tools +- **Structured Output**: Pydantic models for consistent data structures +- **Quality Verification**: Final audit step for report consistency and sourcing +- **Temporal Integration**: Durable execution with proper workflow management + +### System Constraints +- **Search Limit**: 5-15 search terms per research request +- **Summary Length**: Search results limited to 300 words maximum +- **Analysis Length**: Financial and risk analysis limited to 2 paragraphs +- **Model Requirements**: Specific model assignments for different agent types +- **Tool Choice**: Web search agent requires tool usage for all queries + +## 🏗️ System Overview + +```mermaid +graph TB + A[User Query] --> B[Planner Agent] + B --> 
C[Financial Search Plan] + C --> D[Search Agent] + D --> E[Web Search Tool] + E --> F[Search Results] + F --> G[Writer Agent] + G --> H[Financials Tool] + G --> I[Risk Tool] + H --> J[Financial Analysis] + I --> K[Risk Analysis] + J --> L[Report Synthesis] + K --> L + L --> M[Verifier Agent] + M --> N[Final Report] + N --> O[Executive Summary] + N --> P[Markdown Report] + N --> Q[Follow-up Questions] +``` + +## 🔄 System Flow + +```mermaid +sequenceDiagram + participant U as User + participant W as Workflow + participant P as Planner Agent + participant S as Search Agent + participant F as Financials Agent + participant R as Risk Agent + participant Wr as Writer Agent + participant V as Verifier Agent + + U->>W: Financial Research Query + W->>P: Plan Search Strategy + P->>W: Return Search Plan + W->>S: Execute Parallel Searches + S->>W: Return Search Results + W->>Wr: Synthesize Report + Wr->>F: Request Financial Analysis + F->>Wr: Return Analysis Summary + Wr->>R: Request Risk Analysis + R->>Wr: Return Risk Summary + Wr->>W: Return Complete Report + W->>V: Verify Report Quality + V->>W: Return Verification Result + W->>U: Final Research Report +``` + +## 🏛️ Core Architecture + +### Component Layers +1. **Orchestration Layer**: FinancialResearchManager coordinating all agents +2. **Planning Layer**: Planner agent creating search strategies +3. **Research Layer**: Search agent with web search capabilities +4. **Analysis Layer**: Specialist agents for financial and risk analysis +5. **Synthesis Layer**: Writer agent combining all research into reports +6. 
**Quality Layer**: Verifier agent ensuring report accuracy + +### Key Components +- **[FinancialResearchManager](./financial_research_manager.py)**: Main orchestrator managing the entire research workflow +- **[Planner Agent](./agents/planner_agent.py)**: Creates strategic search plans with 5-15 search terms +- **[Search Agent](./agents/search_agent.py)**: Executes web searches with WebSearchTool integration +- **[Financials Agent](./agents/financials_agent.py)**: Analyzes company fundamentals and financial metrics +- **[Risk Agent](./agents/risk_agent.py)**: Identifies potential red flags and risk factors +- **[Writer Agent](./agents/writer_agent.py)**: Synthesizes research into comprehensive reports +- **[Verifier Agent](./agents/verifier_agent.py)**: Audits final reports for consistency and accuracy + +## 🔗 Interaction Flow + +### Internal Communication +- **Sequential Planning**: Planner → Search → Analysis → Writing → Verification +- **Parallel Execution**: Multiple web searches executed concurrently +- **Tool Integration**: Specialist agents exposed as tools for inline analysis +- **Data Flow**: Structured Pydantic models ensuring type safety throughout pipeline + +### External Dependencies +- **Web Search API**: For retrieving financial information and news +- **OpenAI API**: For agent reasoning and content generation +- **Temporal Server**: For workflow orchestration and durable execution + +## 💻 Development Guidelines + +### Code Organization +- **Agent-Centric**: Each agent in separate file with focused responsibilities +- **Manager Pattern**: Central orchestrator coordinating all agent interactions +- **Tool Integration**: Specialist agents exposed as tools for complex workflows +- **Model-Driven**: Pydantic models for consistent data structures and validation + +### Design Patterns +- **Pipeline Pattern**: Sequential workflow with parallel execution optimization +- **Tool Pattern**: Agents exposed as tools for inline analysis +- **Factory Pattern**: 
Agent creation through factory functions +- **Orchestrator Pattern**: Central manager coordinating complex workflows + +### Error Handling +- **Graceful Degradation**: Failed searches don't stop entire workflow +- **Exception Handling**: Proper error handling in search operations +- **Validation**: Pydantic models ensure data integrity +- **Fallback Mechanisms**: Default queries and error recovery + +## 📝 Code Examples & Best Practices + +### Multi-Agent Orchestration Pattern +**File**: `openai_agents/financial_research_agent/financial_research_manager.py` + +```python +class FinancialResearchManager: + """Orchestrates the full flow: planning, searching, sub-analysis, writing, and verification.""" + + def __init__(self) -> None: + self.run_config = RunConfig() + self.planner_agent = new_planner_agent() + self.search_agent = new_search_agent() + self.financials_agent = new_financials_agent() + self.risk_agent = new_risk_agent() + self.writer_agent = new_writer_agent() + self.verifier_agent = new_verifier_agent() + + async def run(self, query: str) -> str: + with trace("Financial research trace"): + # Execute research pipeline sequentially + search_plan = await self._plan_searches(query) + search_results = await self._perform_searches(search_plan) + report = await self._write_report(query, search_results) + verification = await self._verify_report(report) + + # Return formatted output with all components + result = f"""=====REPORT===== + +{report.markdown_report} + +=====FOLLOW UP QUESTIONS===== + +{chr(10).join(report.follow_up_questions)} + +=====VERIFICATION===== + +Verified: {verification.verified} +Issues: {verification.issues}""" + + return result +``` + +**Key Benefits**: +- **Centralized Coordination**: Single manager orchestrates entire research workflow +- **Sequential Logic**: Clear pipeline from planning to verification +- **Comprehensive Output**: Structured report with all research components +- **Traceability**: Temporal tracing for monitoring and 
debugging + +### Parallel Search Execution Pattern +**File**: `openai_agents/financial_research_agent/financial_research_manager.py` + +```python +async def _perform_searches( + self, search_plan: FinancialSearchPlan +) -> Sequence[str]: + with custom_span("Search the web"): + # Create concurrent search tasks for all search terms + tasks = [ + asyncio.create_task(self._search(item)) for item in search_plan.searches + ] + results: list[str] = [] + + # Process completed tasks as they finish + for task in workflow.as_completed(tasks): + result = await task + if result is not None: + results.append(result) + return results + +async def _search(self, item: FinancialSearchItem) -> str | None: + input_data = f"Search term: {item.query}\nReason: {item.reason}" + try: + result = await Runner.run( + self.search_agent, + input_data, + run_config=self.run_config, + ) + return str(result.final_output) + except Exception: + return None +``` + +**Key Benefits**: +- **Parallel Execution**: Multiple searches run concurrently for speed +- **Fault Tolerance**: Failed searches don't block successful ones +- **Efficient Processing**: as_completed handles tasks as they finish +- **Error Isolation**: Individual search failures don't crash workflow + +### Specialist Agent Tool Integration Pattern +**File**: `openai_agents/financial_research_agent/financial_research_manager.py` + +```python +async def _write_report( + self, query: str, search_results: Sequence[str] +) -> FinancialReportData: + # Expose specialist analysts as tools for inline analysis + fundamentals_tool = self.financials_agent.as_tool( + tool_name="fundamentals_analysis", + tool_description="Use to get a short write-up of key financial metrics", + custom_output_extractor=_summary_extractor, + ) + risk_tool = self.risk_agent.as_tool( + tool_name="risk_analysis", + tool_description="Use to get a short write-up of potential red flags", + custom_output_extractor=_summary_extractor, + ) + + # Clone writer agent with integrated 
specialist tools + writer_with_tools = self.writer_agent.clone( + tools=[fundamentals_tool, risk_tool] + ) + + input_data = ( + f"Original query: {query}\nSummarized search results: {search_results}" + ) + result = await Runner.run( + writer_with_tools, + input_data, + run_config=self.run_config, + ) + return result.final_output_as(FinancialReportData) +``` + +**Key Benefits**: +- **Tool Integration**: Specialist agents accessible as inline tools +- **Custom Extractors**: _summary_extractor provides clean output formatting +- **Agent Cloning**: Writer agent enhanced with specialist capabilities +- **Seamless Workflow**: Complex analysis integrated into writing process + +### Structured Data Models Pattern +**File**: `openai_agents/financial_research_agent/agents/planner_agent.py` + +```python +class FinancialSearchItem(BaseModel): + reason: str + """Your reasoning for why this search is relevant.""" + query: str + """The search term to feed into a web (or file) search.""" + +class FinancialSearchPlan(BaseModel): + searches: list[FinancialSearchItem] + """A list of searches to perform.""" + +def new_planner_agent() -> Agent: + return Agent( + name="FinancialPlannerAgent", + instructions=PROMPT, + model="o3-mini", + output_type=FinancialSearchPlan, + ) +``` + +**Key Benefits**: +- **Type Safety**: Pydantic models ensure data validation +- **Clear Structure**: Well-defined data models for search planning +- **Model Assignment**: Specific model (o3-mini) for cost-effective planning +- **Structured Output**: Guaranteed output format for downstream processing + +### Web Search Integration Pattern +**File**: `openai_agents/financial_research_agent/agents/search_agent.py` + +```python +def new_search_agent() -> Agent: + return Agent( + name="FinancialSearchAgent", + instructions=INSTRUCTIONS, + tools=[WebSearchTool()], + model_settings=ModelSettings(tool_choice="required"), + ) +``` + +**Key Benefits**: +- **Web Search Integration**: Built-in WebSearchTool for real-time 
information +- **Forced Tool Usage**: tool_choice="required" ensures web search execution +- **Specialized Instructions**: Financial-focused search result processing +- **Concise Output**: 300-word limit for focused, relevant summaries + +### Specialist Analysis Pattern +**File**: `openai_agents/financial_research_agent/agents/financials_agent.py` + +```python +class AnalysisSummary(BaseModel): + summary: str + """Short text summary for this aspect of the analysis.""" + +def new_financials_agent() -> Agent: + return Agent( + name="FundamentalsAnalystAgent", + instructions=FINANCIALS_PROMPT, + output_type=AnalysisSummary, + ) +``` + +**Key Benefits**: +- **Focused Analysis**: Specialized in financial fundamentals and metrics +- **Structured Output**: AnalysisSummary model for consistent formatting +- **Concise Results**: 2-paragraph limit for focused insights +- **Tool Integration**: Can be exposed as tool for inline analysis + +### Report Synthesis Pattern +**File**: `openai_agents/financial_research_agent/agents/writer_agent.py` + +```python +class FinancialReportData(BaseModel): + short_summary: str + """A short 2-3 sentence executive summary.""" + markdown_report: str + """The full markdown report.""" + follow_up_questions: list[str] + """Suggested follow-up questions for further research.""" + +def new_writer_agent() -> Agent: + return Agent( + name="FinancialWriterAgent", + instructions=WRITER_PROMPT, + model="gpt-4.1-2025-04-14", + output_type=FinancialReportData, + ) +``` + +**Key Benefits**: +- **Comprehensive Output**: Multiple report formats (summary, full, questions) +- **High-Quality Model**: GPT-4.1 for sophisticated report writing +- **Markdown Format**: Structured, readable report output +- **Follow-up Generation**: Automatic generation of research questions + +### Quality Verification Pattern +**File**: `openai_agents/financial_research_agent/agents/verifier_agent.py` + +```python +class VerificationResult(BaseModel): + verified: bool + """Whether 
the report seems coherent and plausible.""" + issues: str + """If not verified, describe the main issues or concerns.""" + +def new_verifier_agent() -> Agent: + return Agent( + name="VerificationAgent", + instructions=VERIFIER_PROMPT, + model="gpt-4o", + output_type=VerificationResult, + ) +``` + +**Key Benefits**: +- **Quality Assurance**: Final verification step for report accuracy +- **Issue Identification**: Clear reporting of problems or inconsistencies +- **High-Quality Model**: GPT-4o for thorough verification +- **Structured Results**: Boolean verification with detailed issue descriptions + +### Temporal Workflow Integration Pattern +**File**: `openai_agents/financial_research_agent/workflows/financial_research_workflow.py` + +```python +@workflow.defn +class FinancialResearchWorkflow: + @workflow.run + async def run(self, query: str) -> str: + manager = FinancialResearchManager() + return await manager.run(query) +``` + +**Key Benefits**: +- **Simple Integration**: Clean workflow interface to complex research system +- **Durable Execution**: Temporal handles long-running research tasks +- **State Management**: Workflow state preserved across system restarts +- **Scalable Processing**: Multiple research workflows can run concurrently + +### Client Execution Pattern +**File**: `openai_agents/financial_research_agent/run_financial_research_workflow.py` + +```python +async def main(): + # Get the query from user input + query = input("Enter a financial research query: ") + if not query.strip(): + query = "Write up an analysis of Apple Inc.'s most recent quarter." 
+ print(f"Using default query: {query}") + + client = await Client.connect( + "localhost:7233", + plugins=[OpenAIAgentsPlugin()], + ) + + print(f"Starting financial research for: {query}") + print("This may take several minutes to complete...\n") + + result = await client.execute_workflow( + FinancialResearchWorkflow.run, + query, + id=f"financial-research-{hash(query)}", + task_queue="financial-research-task-queue", + ) + + print(result) +``` + +**Key Benefits**: +- **User Interaction**: Interactive query input with default fallback +- **Progress Feedback**: Clear indication of expected processing time +- **Hash-Based Workflow IDs**: IDs derived from `hash(query)`; because Python salts string hashes per process by default, the same query yields a different ID on each run +- **Dedicated Task Queue**: Isolated processing for financial research workflows + +### Worker Configuration Pattern +**File**: `openai_agents/financial_research_agent/run_worker.py` + +```python +async def main(): + client = await Client.connect( + "localhost:7233", + plugins=[OpenAIAgentsPlugin()], + ) + + worker = Worker( + client, + task_queue="financial-research-task-queue", + workflows=[FinancialResearchWorkflow], + ) + + print("Starting financial research worker...") + await worker.run() +``` + +**Key Benefits**: +- **Dedicated Processing**: Isolated task queue for financial research +- **Plugin Integration**: OpenAI Agents plugin for seamless agent execution +- **Workflow Registration**: Clear workflow registration for worker +- **Scalable Architecture**: Multiple workers can handle concurrent research requests + +## 🎯 **Key Benefits of This Structure:** + +1. **Specialized Expertise**: Each agent focuses on specific research domains +2. **Parallel Processing**: Concurrent web searches optimize research speed +3. **Quality Assurance**: Verification step ensures report accuracy and consistency +4. **Tool Integration**: Specialist agents accessible as inline tools +5. **Structured Output**: Consistent data models throughout the pipeline +6. 
**Scalable Architecture**: Multiple workers can handle concurrent research requests +7. **Durable Execution**: Temporal workflows ensure research completion + +## ⚠️ **Important Implementation Notes:** + +- **Task Queue**: Uses `"financial-research-task-queue"` for all financial research workflows +- **Model Assignment**: Different agents use specific models (o3-mini, gpt-4.1, gpt-4o) +- **Parallel Execution**: Web searches use asyncio.create_task with workflow.as_completed +- **Tool Requirements**: Search agent requires tool usage with ModelSettings(tool_choice="required") +- **Output Extraction**: Custom _summary_extractor for clean specialist agent output +- **Error Handling**: Failed searches return None without stopping the workflow + +## 🏗️ **Architecture Patterns:** + +- **Pipeline Pattern**: Sequential workflow with parallel execution optimization +- **Tool Pattern**: Specialist agents exposed as tools for inline analysis +- **Factory Pattern**: Agent creation through factory functions +- **Orchestrator Pattern**: Central manager coordinating complex workflows +- **Parallel Processing Pattern**: Concurrent execution of independent tasks +- **Quality Gate Pattern**: Final verification step for output validation + +## 📁 **File Organization:** + +``` +openai_agents/financial_research_agent/ +├── README.md # Usage instructions and architecture overview +├── financial_research_manager.py # Main orchestrator and workflow coordinator +├── run_financial_research_workflow.py # Client for executing research workflows +├── run_worker.py # Temporal worker configuration +├── agents/ # Specialized agent implementations +│ ├── planner_agent.py # Search strategy planning agent +│ ├── search_agent.py # Web search execution agent +│ ├── financials_agent.py # Financial analysis specialist +│ ├── risk_agent.py # Risk assessment specialist +│ ├── writer_agent.py # Report synthesis agent +│ └── verifier_agent.py # Quality verification agent +└── workflows/ # Temporal workflow 
definitions + └── financial_research_workflow.py # Main research workflow +``` + +## 🔧 **Common Development Patterns:** + +- **Agent Factory Functions**: Centralized agent creation with consistent configuration +- **Tool Integration**: Specialist agents exposed as tools for complex workflows +- **Parallel Task Execution**: asyncio.create_task with workflow.as_completed +- **Structured Data Models**: Pydantic models for type safety and validation +- **Custom Output Extractors**: Specialized formatting for tool integration +- **Error Isolation**: Individual component failures don't crash entire workflow +- **Quality Gates**: Verification steps ensure output quality and consistency + +This structure ensures new developers can quickly understand how to build sophisticated multi-agent research systems with parallel processing, quality assurance, and seamless tool integration using Temporal workflows and OpenAI Agents SDK. diff --git a/docs/openai_agents/HANDOFFS.md b/docs/openai_agents/HANDOFFS.md new file mode 100644 index 000000000..7ebb3c505 --- /dev/null +++ b/docs/openai_agents/HANDOFFS.md @@ -0,0 +1,586 @@ +# Agent Handoffs + +## 📑 Table of Contents + +- [Introduction](#introduction) +- [Philosophy & Challenges](#philosophy--challenges) +- [System Constraints & Features](#system-constraints--features) +- [System Overview](#system-overview) +- [System Flow](#system-flow) +- [Core Architecture](#core-architecture) +- [Interaction Flow](#interaction-flow) +- [Development Guidelines](#development-guidelines) +- [Code Examples & Best Practices](#code-examples--best-practices) + +## 🎯 Introduction + +The Agent Handoffs service demonstrates sophisticated agent transition patterns with intelligent message filtering and context management. This service showcases how agents can seamlessly hand off conversations to specialized agents while maintaining conversation continuity and selectively filtering message history. 
+ +The system is designed for developers and engineering teams who want to: +- Learn how to implement intelligent agent handoffs in Temporal workflows +- Understand message filtering and context management during transitions +- Build multi-agent conversation systems with specialized capabilities +- Implement context-aware handoff triggers and message processing +- Maintain conversation flow across different agent specializations +- Control what information is preserved during agent transitions + +## 🧠 Philosophy & Challenges + +### What We're Solving +- **Agent Specialization**: Different agents have different expertise and capabilities +- **Conversation Continuity**: Maintaining context across agent transitions +- **Message Filtering**: Selective preservation of conversation history +- **Context Management**: Controlling what information each agent receives +- **Handoff Triggers**: Intelligent detection of when handoffs should occur +- **Tool Cleanup**: Removing tool-related messages during transitions +- **Selective Context**: Demonstrating different filtering strategies + +### Our Approach +- **Intelligent Handoffs**: Trigger handoffs based on content analysis +- **Message Filtering**: Use specialized filters to clean conversation history +- **Context Preservation**: Maintain essential conversation context across transitions +- **Specialized Agents**: Create agents with focused capabilities and handoff descriptions +- **Tool Integration**: Demonstrate handoffs with function tools and message processing +- **Workflow Orchestration**: Use Temporal workflows to manage complex conversation flows + +## ⚡ System Constraints & Features + +### Key Features +- **Multi-Agent Conversations**: Seamless transitions between specialized agents +- **Message Filtering**: Intelligent cleanup of conversation history during handoffs +- **Context Preservation**: Maintain conversation flow across agent boundaries +- **Tool Integration**: Function tools that work across agent 
transitions +- **Language Detection**: Automatic handoff to language-specialized agents +- **Selective Context**: Demonstrate different filtering strategies +- **Complete Message History**: Return full conversation flow for inspection + +### System Constraints +- **No Streaming**: Temporal workflows don't support streaming responses +- **Message History Limits**: OpenAI API has conversation length constraints +- **Tool Message Cleanup**: Tool-related messages must be filtered during handoffs +- **Context Window**: Each agent has limited context window for processing +- **Task Queue**: Uses `"openai-agents-handoffs-task-queue"` for all workflows +- **Extended Timeouts**: 60-second timeouts for complex conversation flows + +## 🏗️ System Overview + +```mermaid +graph TB + A[User Input] --> B[First Agent] + B --> C[Tool Usage] + C --> D[Second Agent] + D --> E[Content Analysis] + E --> F{Spanish Detected?} + + F -->|Yes| G[Message Filtering] + F -->|No| H[Continue Conversation] + + G --> I[Spanish Agent] + I --> J[Filtered Context] + J --> K[Final Response] + + H --> L[General Response] + L --> K + + B --> M[Conversation History] + C --> M + D --> M + I --> M + M --> N[Message Filtering] + N --> O[Filtered History] +``` + +## 🔄 System Flow + +```mermaid +sequenceDiagram + participant U as User + participant A1 as First Agent + participant A2 as Second Agent + participant AS as Spanish Agent + participant F as Message Filter + + U->>A1: Greeting with name + A1->>A1: Generate random number + A1->>A2: Handoff conversation + U->>A2: Ask about NYC population + U->>A2: Speak in Spanish + A2->>F: Trigger handoff filter + F->>F: Remove tool messages + F->>F: Drop first 2 messages + F->>AS: Filtered conversation + AS->>U: Spanish response +``` + +## 🏛️ Core Architecture + +### Component Layers +1. **Workflow Layer**: Temporal workflow orchestrating the 4-step conversation +2. **Agent Layer**: Specialized agents with handoff capabilities and descriptions +3. 
**Filter Layer**: Message filtering functions for context cleanup +4. **Tool Layer**: Function tools that work across agent boundaries +5. **Execution Layer**: Runner scripts and worker processes for deployment + +### Key Components +- **MessageFilterWorkflow**: Main workflow orchestrating the conversation flow +- **Specialized Agents**: First agent, second agent, and Spanish specialist +- **Message Filters**: Custom filtering functions for context management +- **Function Tools**: Random number generation tool +- **Handoff Management**: Intelligent handoff triggers and transitions +- **Context Preservation**: Maintaining conversation flow across transitions + +## 🔗 Interaction Flow + +### Internal Communication +- Workflows orchestrate agent transitions using `Runner.run()` with context +- Agents communicate through handoff descriptions and triggers +- Message filters process conversation history during transitions +- Context is preserved using `result.to_input_list()` for continuity +- Tool usage is tracked and can be filtered during handoffs + +### External Dependencies +- **OpenAI API**: For agent responses and conversation management +- **Temporal Server**: For workflow orchestration and state management +- **Message Processing**: For filtering and context management +- **Tool Execution**: For function tool integration across agents + +## 💻 Development Guidelines + +### Code Organization +- **Workflow Files**: One file per handoff pattern in `workflows/` directory +- **Runner Scripts**: Individual execution scripts in root directory +- **Worker**: Central worker supporting all handoff workflows in `run_worker.py` +- **Message Filters**: Custom filtering functions embedded in workflow files + +### Design Patterns +- **Handoff Pattern**: Seamless agent transitions with context preservation +- **Message Filtering Pattern**: Intelligent cleanup of conversation history +- **Context Continuity Pattern**: Maintaining conversation flow across agents +- **Specialized 
Agent Pattern**: Agents with focused capabilities and handoff descriptions +- **Tool Integration Pattern**: Function tools that work across agent boundaries + +### Error Handling +- **Handoff Failures**: Handle cases where handoffs fail or are rejected +- **Message Filter Errors**: Gracefully handle filtering failures +- **Context Loss**: Prevent loss of essential conversation context +- **Tool Integration**: Handle tool-related message cleanup failures +- **Timeout Management**: Extended timeouts for complex conversation flows + +## 📝 Code Examples & Best Practices + +### Message Filter Workflow Pattern +**File**: `openai_agents/handoffs/workflows/message_filter_workflow.py` + +This pattern demonstrates sophisticated agent handoffs with intelligent message filtering and context management. + +```python +from __future__ import annotations + +from dataclasses import dataclass +from typing import List + +from agents import Agent, HandoffInputData, Runner, function_tool, handoff +from agents.extensions import handoff_filters +from agents.items import TResponseInputItem +from temporalio import workflow + +# Structured output for handoff results +@dataclass +class MessageFilterResult: + final_output: str # Final agent response + final_messages: List[TResponseInputItem] # Complete conversation history + +# Function tool for random number generation +@function_tool +def random_number_tool(max: int) -> int: + """Return a random integer between 0 and the given maximum.""" + return workflow.random().randint(0, max) + +# Custom message filter for Spanish handoffs +def spanish_handoff_message_filter( + handoff_message_data: HandoffInputData, +) -> HandoffInputData: + # First, remove any tool-related messages from the message history + handoff_message_data = handoff_filters.remove_all_tools(handoff_message_data) + + # Second, remove the first two items from the history for demonstration + history = ( + tuple(handoff_message_data.input_history[2:]) + if 
isinstance(handoff_message_data.input_history, tuple) + else handoff_message_data.input_history + ) + + # Return filtered handoff data with cleaned history + return HandoffInputData( + input_history=history, + pre_handoff_items=tuple(handoff_message_data.pre_handoff_items), + new_items=tuple(handoff_message_data.new_items), + ) + +@workflow.defn +class MessageFilterWorkflow: + @workflow.run + async def run(self, user_name: str = "Sora") -> MessageFilterResult: + # Agent 1: Concise assistant with tool capabilities + first_agent = Agent( + name="Assistant", + instructions="Be extremely concise.", + tools=[random_number_tool], + ) + + # Agent 3: Spanish specialist for language-specific queries + spanish_agent = Agent( + name="Spanish Assistant", + instructions="You only speak Spanish and are extremely concise.", + handoff_description="A Spanish-speaking assistant.", + ) + + # Agent 2: General assistant with handoff capabilities + second_agent = Agent( + name="Assistant", + instructions=( + "Be a helpful assistant. If the user speaks Spanish, handoff to the Spanish assistant." + ), + handoffs=[ + handoff(spanish_agent, input_filter=spanish_handoff_message_filter) + ], + ) + + # Step 1: Initial greeting with the first agent + result = await Runner.run(first_agent, input=f"Hi, my name is {user_name}.") + + # Step 2: Tool usage demonstration with random number generation + result = await Runner.run( + first_agent, + input=result.to_input_list() # Preserve conversation context + + [ + { + "content": "Can you generate a random number between 0 and 100?", + "role": "user", + } + ], + ) + + # Step 3: Handoff to second agent for general questions + result = await Runner.run( + second_agent, + input=result.to_input_list() # Maintain conversation flow + + [ + { + "content": "I live in New York City. 
What's the population of the city?", + "role": "user", + } + ], + ) + + # Step 4: Trigger Spanish handoff with message filtering + result = await Runner.run( + second_agent, + input=result.to_input_list() # Continue conversation context + + [ + { + "content": "Por favor habla en español. ¿Cuál es mi nombre y dónde vivo?", + "role": "user", + } + ], + ) + + # Return both final response and complete message history for inspection + return MessageFilterResult( + final_output=result.final_output, + final_messages=result.to_input_list() + ) +``` + +**Key Benefits**: +- **Intelligent Handoffs**: The second agent's instructions trigger a handoff when the user writes in Spanish +- **Message Filtering**: The custom filter removes tool messages and drops the first two history items +- **Context Preservation**: Maintains conversation flow across agent boundaries +- **Tool Integration**: Function tools work seamlessly across agent transitions +- **Complete History**: Returns full conversation flow for debugging and inspection + +### Message Filtering Pattern +**File**: `openai_agents/handoffs/workflows/message_filter_workflow.py` + +This pattern demonstrates how to create custom message filters for intelligent context management during handoffs. 
+ +```python +def spanish_handoff_message_filter( + handoff_message_data: HandoffInputData, +) -> HandoffInputData: + # Remove tool-related messages to clean up conversation history + handoff_message_data = handoff_filters.remove_all_tools(handoff_message_data) + + # Demonstrate selective context removal by dropping first two messages + history = ( + tuple(handoff_message_data.input_history[2:]) + if isinstance(handoff_message_data.input_history, tuple) + else handoff_message_data.input_history + ) + + # Return filtered data with cleaned history and preserved structure + return HandoffInputData( + input_history=history, + pre_handoff_items=tuple(handoff_message_data.pre_handoff_items), + new_items=tuple(handoff_message_data.new_items), + ) +``` + +**Key Benefits**: +- **Tool Cleanup**: Removes tool-related messages that aren't relevant to the next agent +- **Selective Context**: Drops the first two history items to demonstrate selective context removal +- **Type Safety**: Slices the history only when it is a tuple; any non-tuple history is passed through unchanged +- **Structure Preservation**: Maintains handoff data structure while filtering content +- **Customizable Filtering**: Easy to adapt for different handoff scenarios + +### Specialized Agent Configuration +**File**: `openai_agents/handoffs/workflows/message_filter_workflow.py` + +This pattern demonstrates how to configure agents with handoff capabilities and specialized instructions. 
+ +```python +# Agent with tool capabilities for initial interactions +first_agent = Agent( + name="Assistant", + instructions="Be extremely concise.", + tools=[random_number_tool], +) + +# Specialized agent for language-specific queries +spanish_agent = Agent( + name="Spanish Assistant", + instructions="You only speak Spanish and are extremely concise.", + handoff_description="A Spanish-speaking assistant.", # Description for handoff detection +) + +# Agent with handoff capabilities and trigger logic +second_agent = Agent( + name="Assistant", + instructions=( + "Be a helpful assistant. If the user speaks Spanish, handoff to the Spanish assistant." + ), + handoffs=[ + handoff(spanish_agent, input_filter=spanish_handoff_message_filter) + ], +) +``` + +**Key Benefits**: +- **Specialized Instructions**: Each agent has focused capabilities and behavior +- **Handoff Descriptions**: Clear descriptions help with handoff detection +- **Trigger Logic**: Instructions include handoff detection criteria +- **Filter Integration**: Custom filters can be attached to handoffs +- **Tool Distribution**: Tools are distributed based on agent capabilities + +### Context Continuity Pattern +**File**: `openai_agents/handoffs/workflows/message_filter_workflow.py` + +This pattern demonstrates how to maintain conversation context across agent transitions. + +```python +# Step 1: Initial interaction +result = await Runner.run(first_agent, input=f"Hi, my name is {user_name}.") + +# Step 2: Continue conversation with context preservation +result = await Runner.run( + first_agent, + input=result.to_input_list() # Preserve conversation context + + [ + { + "content": "Can you generate a random number between 0 and 100?", + "role": "user", + } + ], +) + +# Step 3: Handoff to second agent with full context +result = await Runner.run( + second_agent, + input=result.to_input_list() # Maintain conversation flow + + [ + { + "content": "I live in New York City. 
What's the population of the city?", + "role": "user", + } + ], +) +``` + +**Key Benefits**: +- **Context Preservation**: `result.to_input_list()` maintains conversation history +- **Seamless Transitions**: Agents can continue conversations naturally +- **State Management**: Temporal workflows maintain conversation state +- **History Accumulation**: Each interaction builds on previous context +- **Natural Flow**: Users experience continuous conversation across agents + +### Worker Configuration +**File**: `openai_agents/handoffs/run_worker.py` + +This is the central worker that supports the handoff workflow, providing extended timeouts for complex conversation flows. + +```python +from __future__ import annotations + +import asyncio +from datetime import timedelta + +from temporalio.client import Client +from temporalio.contrib.openai_agents import ModelActivityParameters, OpenAIAgentsPlugin +from temporalio.worker import Worker + +from openai_agents.handoffs.workflows.message_filter_workflow import ( + MessageFilterWorkflow, +) + +async def main(): + # Create client connected to Temporal server + client = await Client.connect( + "localhost:7233", + plugins=[ + OpenAIAgentsPlugin( + model_params=ModelActivityParameters( + start_to_close_timeout=timedelta(seconds=60) # Extended timeout for conversations + ) + ), + ], + ) + + # Create worker supporting the handoff workflow + worker = Worker( + client, + task_queue="openai-agents-handoffs-task-queue", # Dedicated task queue for handoffs + workflows=[ + MessageFilterWorkflow, # Register the handoff workflow + ], + activities=[ + # No custom activities needed for this workflow + ], + ) + await worker.run() + +if __name__ == "__main__": + asyncio.run(main()) +``` + +**Key Benefits**: +- **Extended Timeouts**: 60-second timeouts for complex conversation flows +- **Dedicated Task Queue**: Separate queue for handoff-specific workflows +- **Workflow Registration**: Single workflow registration for the handoff pattern +- 
**Plugin Configuration**: OpenAI integration with appropriate timeout settings +- **Easy Deployment**: Single process to manage and monitor handoff workflows + +### Runner Script Pattern +**File**: `openai_agents/handoffs/run_message_filter_workflow.py` + +This pattern demonstrates how to execute the handoff workflow and inspect the complete conversation history. + +```python +import asyncio +import json + +from temporalio.client import Client +from temporalio.contrib.openai_agents import OpenAIAgentsPlugin + +from openai_agents.handoffs.workflows.message_filter_workflow import ( + MessageFilterWorkflow, +) + +async def main(): + # Create client connected to Temporal server + client = await Client.connect( + "localhost:7233", + plugins=[OpenAIAgentsPlugin()], + ) + + # Execute the handoff workflow with user name + result = await client.execute_workflow( + MessageFilterWorkflow.run, + "Sora", # User name parameter + id="message-filter-workflow", + task_queue="openai-agents-handoffs-task-queue", + ) + + # Display final response + print(f"Final output: {result.final_output}") + print("\n===Final messages===\n") + + # Print complete message history to inspect filtering effects + for message in result.final_messages: + print(json.dumps(message, indent=2)) + +if __name__ == "__main__": + asyncio.run(main()) +``` + +**Key Benefits**: +- **Workflow Execution**: Simple execution of complex handoff workflows +- **Parameter Passing**: User name can be customized for different test scenarios +- **Result Inspection**: Access to both final response and complete message history +- **Debugging Support**: JSON formatting for easy message history analysis +- **Filtering Verification**: See exactly how message filters affected conversation + +## 🎯 Key Benefits of This Structure + +1. **Intelligent Handoffs**: Automatic detection and execution of agent transitions +2. **Message Filtering**: Intelligent cleanup of conversation history during handoffs +3. 
**Context Preservation**: Maintains conversation flow across agent boundaries +4. **Specialized Agents**: Agents with focused capabilities and clear handoff descriptions +5. **Tool Integration**: Function tools that work seamlessly across agent transitions +6. **Complete History**: Full conversation flow for debugging and inspection +7. **Customizable Filters**: Easy to adapt filtering strategies for different scenarios +8. **Workflow Orchestration**: Temporal workflows manage complex conversation flows + +## ⚠️ Important Implementation Notes + +### Task Queue Configuration +- **Worker**: Uses task queue `"openai-agents-handoffs-task-queue"` +- **Runner Scripts**: Use the same task queue for consistency +- **Note**: Dedicated task queue for handoff-specific workflows + +### Handoff Dependencies and Setup +- **OpenAI API Key**: Required for agent conversations and handoffs +- **Message Filtering**: Custom filters for context cleanup during transitions +- **Context Management**: Careful handling of conversation history across agents +- **Tool Integration**: Function tools must be accessible to relevant agents + +### Specific Examples Implemented +- **4-Step Conversation**: Greeting → Tool Usage → General Question → Spanish Handoff +- **Message Filtering**: Tool removal + selective context dropping +- **Context Preservation**: `result.to_input_list()` for conversation continuity +- **Language Detection**: Automatic handoff to Spanish specialist +- **Complete History**: Return full conversation flow for inspection + +### Architecture Patterns +- **Handoff-First Design**: Handoffs are primary components, not afterthoughts +- **Message Filtering**: Intelligent cleanup of conversation history during transitions +- **Context Continuity**: Maintaining conversation flow across agent boundaries +- **Specialized Agents**: Agents with focused capabilities and clear handoff descriptions +- **Tool Integration**: Function tools that work across agent transitions + +### File 
Organization +``` +openai_agents/handoffs/ +├── workflows/ # Core handoff implementations +│ └── message_filter_workflow.py # Message filtering with handoffs +├── run_worker.py # Central worker for handoff workflows +├── run_message_filter_workflow.py # Individual workflow runner +└── README.md # Handoff overview and usage +``` + +### Common Development Patterns +- **Always use `result.to_input_list()`** to preserve conversation context +- **Implement custom message filters** for context cleanup during handoffs +- **Provide clear handoff descriptions** for better agent coordination +- **Test message filtering effects** by inspecting complete conversation history +- **Use extended timeouts** for complex conversation flows +- **Handle both tuple and list** history formats in message filters + +This structure ensures developers can understand: +- **Handoff implementation patterns** with intelligent message filtering +- **Context management** across agent transitions +- **Message filtering strategies** for different handoff scenarios +- **Tool integration** across agent boundaries +- **Conversation continuity** maintenance +- **Complete conversation flow** inspection and debugging + +The handoffs serve as building blocks for complex multi-agent conversation systems while maintaining the reliability, observability, and error handling that Temporal provides. Each pattern demonstrates specific handoff strategies that can be adapted for custom conversation flows and agent coordination. 
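The guidance above on handling different history formats in a message filter can be sketched without the SDK. This is a minimal, hypothetical helper: `trim_history` is not part of the OpenAI Agents SDK or of the sample code, and plain dicts stand in for the SDK's typed conversation items. It assumes the history may arrive as a string, a tuple, or a list, and drops leading items only when it is a sequence.

```python
from typing import Union

# Stand-in for a conversation item; the real SDK uses richer typed items.
Item = dict
History = Union[str, tuple, list]


def trim_history(history: History, drop: int = 2) -> History:
    """Drop the first `drop` items from a sequence history.

    A plain-string history is returned unchanged, because there are
    no individual items to remove.
    """
    if isinstance(history, str):
        return history
    # Normalize lists and tuples to a tuple so the result is immutable.
    return tuple(history[drop:])


# Illustrative data only; not taken from a real workflow run.
history_as_list = [
    {"role": "user", "content": "Hi, my name is Sora."},
    {"role": "assistant", "content": "Hello Sora!"},
    {"role": "user", "content": "Por favor habla en español."},
]
trimmed = trim_history(history_as_list)
```

Normalizing to a tuple is a design choice in this sketch: it keeps the filtered history immutable regardless of the input type, so later steps cannot modify it by accident.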
diff --git a/docs/openai_agents/HOSTED_MCP.md b/docs/openai_agents/HOSTED_MCP.md new file mode 100644 index 000000000..e7d8ad828 --- /dev/null +++ b/docs/openai_agents/HOSTED_MCP.md @@ -0,0 +1,562 @@ +# Hosted MCP Integration + +## 📑 Table of Contents + +- [Introduction](#introduction) +- [Philosophy & Challenges](#philosophy--challenges) +- [System Constraints & Features](#system-constraints--features) +- [System Overview](#system-overview) +- [System Flow](#system-flow) +- [Core Architecture](#core-architecture) +- [Interaction Flow](#interaction-flow) +- [Development Guidelines](#development-guidelines) +- [Code Examples & Best Practices](#code-examples--best-practices) + +## 🎯 Introduction + +The Hosted MCP Integration service demonstrates how to integrate OpenAI agents with external Model Context Protocol (MCP) servers in Temporal workflows. This service showcases two key patterns: simple MCP connections for trusted servers and approval-based MCP connections with callback workflows for enhanced security and control. 
+ +The system is designed for developers and engineering teams who want to: +- Learn how to integrate external MCP servers with OpenAI agents in Temporal +- Understand approval workflows for MCP tool execution +- Build secure integrations with external data sources and APIs +- Implement callback-based approval systems within Temporal workflows +- Connect to hosted MCP servers for enhanced agent capabilities +- Manage tool execution permissions and approval workflows +- Extend agent capabilities through external MCP server integrations + +## 🧠 Philosophy & Challenges + +### What We're Solving +- **External Tool Integration**: Connecting agents to external MCP servers for enhanced capabilities +- **Security and Control**: Managing tool execution permissions through approval workflows +- **Server Trust Management**: Differentiating between trusted and approval-required MCP servers +- **Workflow Integration**: Seamlessly integrating MCP tools within Temporal workflows +- **Approval Callbacks**: Implementing approval workflows for sensitive tool operations +- **Server Configuration**: Flexible configuration for different MCP server endpoints +- **Tool Execution Control**: Granular control over when and how MCP tools are executed + +### Our Approach +- **Dual Pattern Support**: Simple connections for trusted servers, approval workflows for others +- **Callback-Based Approvals**: Approval callbacks execute within Temporal workflows +- **Flexible Configuration**: Easy server URL configuration for different MCP endpoints +- **Workflow Integration**: MCP tools work seamlessly within Temporal workflow contexts +- **Approval Workflow Integration**: Use Temporal signals and updates for user approval +- **Server Labeling**: Clear identification of MCP servers for debugging and monitoring + +## ⚡ System Constraints & Features + +### Key Features +- **Hosted MCP Integration**: Connect to external MCP servers from Temporal workflows +- **Approval Workflows**: Callback-based 
approval system for tool execution +- **Flexible Server Configuration**: Easy switching between different MCP server endpoints +- **Trusted Server Support**: Simple connections for pre-approved MCP servers +- **Workflow Integration**: MCP tools work seamlessly within Temporal contexts +- **Approval Callbacks**: Approval functions execute within workflow context +- **Server Labeling**: Clear identification and labeling of MCP server connections + +### System Constraints +- **MCP Server Availability**: External MCP servers must be accessible and operational +- **Approval Workflow Complexity**: Approval callbacks add complexity to tool execution +- **Network Dependencies**: External server connectivity affects workflow reliability +- **Approval Response Time**: User approval delays can impact workflow performance +- **Task Queue**: Uses `"openai-agents-hosted-mcp-task-queue"` for all workflows +- **Extended Timeouts**: 60-second timeouts for MCP server interactions +- **External API Limits**: MCP server rate limits and availability constraints + +## 🏗️ System Overview + +```mermaid +graph TB + A[Temporal Workflow] --> B[OpenAI Agent] + B --> C[HostedMCPTool] + C --> D{MCP Server Type} + + D -->|Trusted| E[Simple Connection] + D -->|Approval Required| F[Approval Workflow] + + E --> G[External MCP Server] + F --> H[Approval Callback] + H --> I[User Approval] + I --> G + + G --> J[Tool Execution] + J --> K[Result Processing] + K --> L[Workflow Response] + + subgraph "MCP Server Options" + M[GitMCP Server] + N[Custom MCP Server] + O[Other MCP Servers] + end + + C --> M + C --> N + C --> O +``` + +## 🔄 System Flow + +```mermaid +sequenceDiagram + participant W as Workflow + participant A as Agent + participant M as MCP Tool + participant S as MCP Server + participant C as Approval Callback + + W->>A: Execute with question + A->>M: Use MCP tool + M->>S: Connect to server + + alt Simple Connection (Trusted) + S->>M: Execute tool + M->>A: Return result + else Approval 
Required + M->>C: Request approval + C->>C: Process approval + C->>M: Approval decision + M->>S: Execute if approved + S->>M: Return result + M->>A: Return result + end + + A->>W: Final output +``` + +## 🏛️ Core Architecture + +### Component Layers +1. **Workflow Layer**: Temporal workflows orchestrating MCP tool usage +2. **Agent Layer**: OpenAI agents with MCP tool integration +3. **Tool Layer**: HostedMCPTool with approval workflow support +4. **Approval Layer**: Callback functions for tool execution approval +5. **Server Layer**: External MCP servers providing enhanced capabilities +6. **Execution Layer**: Runner scripts and worker processes for deployment + +### Key Components +- **SimpleMCPWorkflow**: Workflow for trusted MCP server connections +- **ApprovalMCPWorkflow**: Workflow with approval callbacks for MCP tools +- **HostedMCPTool**: Tool wrapper for MCP server integration +- **Approval Callbacks**: Functions for managing tool execution permissions +- **MCP Server Integration**: Connection to external MCP servers +- **Workflow Orchestration**: Temporal workflow management of MCP interactions + +## 🔗 Interaction Flow + +### Internal Communication +- Workflows orchestrate MCP tool usage through OpenAI agents +- Agents communicate with MCP tools using standard tool interfaces +- Approval callbacks execute within Temporal workflow context +- MCP tools handle server communication and response processing +- Workflows manage the complete MCP tool execution lifecycle + +### External Dependencies +- **MCP Servers**: External servers providing enhanced tool capabilities +- **OpenAI API**: For agent responses and tool integration +- **Temporal Server**: For workflow orchestration and state management +- **Network Infrastructure**: For MCP server connectivity and communication +- **Approval Systems**: For user approval workflows and decision making + +## 💻 Development Guidelines + +### Code Organization +- **Workflow Files**: One file per MCP pattern in 
`workflows/` directory +- **Runner Scripts**: Individual execution scripts in root directory +- **Worker**: Central worker supporting all MCP workflows in `run_worker.py` +- **Approval Callbacks**: Approval functions embedded in workflow files + +### Design Patterns +- **Simple MCP Pattern**: Direct connections to trusted MCP servers +- **Approval MCP Pattern**: Callback-based approval workflows for tool execution +- **Server Configuration Pattern**: Flexible server URL configuration +- **Tool Integration Pattern**: Seamless MCP tool integration within agents +- **Workflow Orchestration Pattern**: Temporal workflows managing MCP interactions + +### Error Handling +- **MCP Server Failures**: Handle cases where external servers are unavailable +- **Approval Rejections**: Manage cases where tool execution is denied +- **Network Timeouts**: Handle connectivity issues with external servers +- **Tool Execution Errors**: Gracefully handle MCP tool failures +- **Approval Callback Errors**: Handle approval workflow failures + +## 📝 Code Examples & Best Practices + +### Simple MCP Connection Pattern +**File**: `openai_agents/hosted_mcp/workflows/simple_mcp_workflow.py` + +This pattern demonstrates direct connection to trusted MCP servers without approval requirements. 
+ +```python +from __future__ import annotations + +from agents import Agent, HostedMCPTool, Runner +from temporalio import workflow + +@workflow.defn +class SimpleMCPWorkflow: + @workflow.run + async def run( + self, question: str, server_url: str = "https://gitmcp.io/openai/codex" + ) -> str: + # Create agent with MCP tool integration + agent = Agent( + name="Assistant", + tools=[ + HostedMCPTool( + tool_config={ + "type": "mcp", # Specify MCP tool type + "server_label": "gitmcp", # Label for server identification + "server_url": server_url, # MCP server endpoint + "require_approval": "never", # No approval required for trusted servers + } + ) + ], + ) + + # Execute agent with MCP tool capabilities + result = await Runner.run(agent, question) + return result.final_output +``` + +**Key Benefits**: +- **Simple Integration**: Direct connection to trusted MCP servers +- **No Approval Overhead**: Immediate tool execution without approval delays +- **Flexible Configuration**: Easy server URL customization for different endpoints +- **Server Labeling**: Clear identification for debugging and monitoring +- **Seamless Workflow Integration**: MCP tools work naturally within Temporal workflows + +### MCP with Approval Workflow Pattern +**File**: `openai_agents/hosted_mcp/workflows/approval_mcp_workflow.py` + +This pattern demonstrates approval-based MCP tool execution with callback workflows for enhanced security. + +```python +from __future__ import annotations + +from agents import ( + Agent, + HostedMCPTool, + MCPToolApprovalFunctionResult, + MCPToolApprovalRequest, + Runner, +) +from temporalio import workflow + +def approval_callback(request: MCPToolApprovalRequest) -> MCPToolApprovalFunctionResult: + """Simple approval callback that logs the request and approves by default. + + In a real application, user input would be provided through a UI or API. 
+ The approval callback executes within the Temporal workflow, so the application + can use signals or updates to receive user input. + """ + # Log the approval request for monitoring and debugging + workflow.logger.info(f"MCP tool approval requested for: {request.data.name}") + + # Default approval result - in production, this would come from user input + result: MCPToolApprovalFunctionResult = {"approve": True} + return result + +@workflow.defn +class ApprovalMCPWorkflow: + @workflow.run + async def run( + self, question: str, server_url: str = "https://gitmcp.io/openai/codex" + ) -> str: + # Create agent with approval-required MCP tool + agent = Agent( + name="Assistant", + tools=[ + HostedMCPTool( + tool_config={ + "type": "mcp", # Specify MCP tool type + "server_label": "gitmcp", # Label for server identification + "server_url": server_url, # MCP server endpoint + "require_approval": "always", # Always require approval for this server + }, + on_approval_request=approval_callback, # Approval callback function + ) + ], + ) + + # Execute agent with approval workflow + result = await Runner.run(agent, question) + return result.final_output +``` + +**Key Benefits**: +- **Security Control**: Approval required for all MCP tool executions +- **Callback Integration**: Approval callbacks execute within Temporal workflow context +- **User Input Support**: Can integrate with UI/API for real user approval +- **Monitoring**: Logs all approval requests for audit and debugging +- **Flexible Approval Logic**: Easy to customize approval decision logic + +### Approval Callback Pattern +**File**: `openai_agents/hosted_mcp/workflows/approval_mcp_workflow.py` + +This pattern demonstrates how to implement approval callbacks for MCP tool execution within Temporal workflows. + +```python +def approval_callback(request: MCPToolApprovalRequest) -> MCPToolApprovalFunctionResult: + """Simple approval callback that logs the request and approves by default. 
+ + In a real application, user input would be provided through a UI or API. + The approval callback executes within the Temporal workflow, so the application + can use signals or updates to receive user input. + """ + # Log the approval request for monitoring and debugging + workflow.logger.info(f"MCP tool approval requested for: {request.data.name}") + + # Default approval result - in production, this would come from user input + result: MCPToolApprovalFunctionResult = {"approve": True} + return result +``` + +**Key Benefits**: +- **Workflow Integration**: Approval callbacks execute within Temporal workflow context +- **User Input Support**: Can use Temporal signals and updates for user approval +- **Monitoring**: Logs all approval requests for audit and debugging +- **Flexible Logic**: Easy to customize approval decision logic +- **Production Ready**: Can integrate with real user approval systems + +### HostedMCPTool Configuration Pattern +**File**: `openai_agents/hosted_mcp/workflows/approval_mcp_workflow.py` + +This pattern demonstrates how to configure MCP tools with different approval requirements and server settings. 
+ +```python +# Simple MCP tool for trusted servers +HostedMCPTool( + tool_config={ + "type": "mcp", # Specify MCP tool type + "server_label": "gitmcp", # Label for server identification + "server_url": server_url, # MCP server endpoint + "require_approval": "never", # No approval required for trusted servers + } +) + +# Approval-required MCP tool with callback +HostedMCPTool( + tool_config={ + "type": "mcp", # Specify MCP tool type + "server_label": "gitmcp", # Label for server identification + "server_url": server_url, # MCP server endpoint + "require_approval": "always", # Always require approval for this server + }, + on_approval_request=approval_callback, # Approval callback function +) +``` + +**Key Benefits**: +- **Flexible Configuration**: Easy to switch between approval modes +- **Server Labeling**: Clear identification for debugging and monitoring +- **URL Customization**: Easy to connect to different MCP server endpoints +- **Approval Integration**: Seamless integration with approval workflows +- **Type Safety**: Strong typing for tool configuration and approval callbacks + +### Worker Configuration +**File**: `openai_agents/hosted_mcp/run_worker.py` + +This is the central worker that supports both MCP workflow patterns, providing extended timeouts for MCP server interactions. 
+ +```python +from __future__ import annotations + +import asyncio +from datetime import timedelta + +from temporalio.client import Client +from temporalio.contrib.openai_agents import ModelActivityParameters, OpenAIAgentsPlugin +from temporalio.worker import Worker + +from openai_agents.hosted_mcp.workflows.approval_mcp_workflow import ApprovalMCPWorkflow +from openai_agents.hosted_mcp.workflows.simple_mcp_workflow import SimpleMCPWorkflow + +async def main(): + # Create client connected to Temporal server + client = await Client.connect( + "localhost:7233", + plugins=[ + OpenAIAgentsPlugin( + model_params=ModelActivityParameters( + start_to_close_timeout=timedelta(seconds=60) # Extended timeout for MCP interactions + ) + ), + ], + ) + + # Create worker supporting both MCP workflow patterns + worker = Worker( + client, + task_queue="openai-agents-hosted-mcp-task-queue", # Dedicated task queue for MCP workflows + workflows=[ + SimpleMCPWorkflow, # Simple MCP connections + ApprovalMCPWorkflow, # Approval-based MCP connections + ], + activities=[ + # No custom activities needed for these workflows + ], + ) + await worker.run() + +if __name__ == "__main__": + asyncio.run(main()) +``` + +**Key Benefits**: +- **Dual Pattern Support**: Supports both simple and approval-based MCP workflows +- **Extended Timeouts**: 60-second timeouts for MCP server interactions +- **Dedicated Task Queue**: Separate queue for MCP-specific workflows +- **Workflow Registration**: Both MCP workflow patterns registered +- **Plugin Configuration**: OpenAI integration with appropriate timeout settings + +### Simple MCP Runner Pattern +**File**: `openai_agents/hosted_mcp/run_simple_mcp_workflow.py` + +This pattern demonstrates how to execute the simple MCP workflow for trusted server connections. 
+ +```python +import asyncio + +from temporalio.client import Client +from temporalio.contrib.openai_agents import OpenAIAgentsPlugin + +from openai_agents.hosted_mcp.workflows.simple_mcp_workflow import SimpleMCPWorkflow + +async def main(): + # Create client connected to Temporal server + client = await Client.connect( + "localhost:7233", + plugins=[OpenAIAgentsPlugin()], + ) + + # Execute the simple MCP workflow with repository analysis question + result = await client.execute_workflow( + SimpleMCPWorkflow.run, + "Which language is this repo written in?", # Question for MCP server + id="simple-mcp-workflow", + task_queue="openai-agents-hosted-mcp-task-queue", + ) + + print(f"Result: {result}") + +if __name__ == "__main__": + asyncio.run(main()) +``` + +**Key Benefits**: +- **Simple Execution**: Easy execution of MCP tool workflows +- **Question Customization**: Repository analysis questions can be customized +- **Workflow Identification**: Clear workflow ID for monitoring and debugging +- **Result Display**: Simple output display for MCP tool results +- **Task Queue Consistency**: Uses the same task queue as the worker + +### Approval MCP Runner Pattern +**File**: `openai_agents/hosted_mcp/run_approval_mcp_workflow.py` + +This pattern demonstrates how to execute the approval-based MCP workflow with approval callbacks. 
+ +```python +import asyncio + +from temporalio.client import Client +from temporalio.contrib.openai_agents import OpenAIAgentsPlugin + +from openai_agents.hosted_mcp.workflows.approval_mcp_workflow import ApprovalMCPWorkflow + +async def main(): + # Create client connected to Temporal server + client = await Client.connect( + "localhost:7233", + plugins=[OpenAIAgentsPlugin()], + ) + + # Execute the approval MCP workflow with repository analysis question + result = await client.execute_workflow( + ApprovalMCPWorkflow.run, + "Which language is this repo written in?", # Question for MCP server + id="approval-mcp-workflow", + task_queue="openai-agents-hosted-mcp-task-queue", + ) + + print(f"Result: {result}") + +if __name__ == "__main__": + asyncio.run(main()) +``` + +**Key Benefits**: +- **Approval Workflow**: Executes MCP tools with approval requirements +- **Callback Integration**: Approval callbacks execute within workflow context +- **Security Control**: All MCP tool executions require approval +- **Workflow Identification**: Clear workflow ID for monitoring and debugging +- **Result Display**: Simple output display for approved MCP tool results + +## 🎯 Key Benefits of This Structure + +1. **Dual Pattern Support**: Simple connections for trusted servers, approval workflows for others +2. **Security Control**: Approval-based tool execution for sensitive operations +3. **Flexible Configuration**: Easy switching between different MCP server endpoints +4. **Workflow Integration**: MCP tools work seamlessly within Temporal contexts +5. **Approval Callbacks**: Approval functions execute within workflow context +6. **Server Labeling**: Clear identification for debugging and monitoring +7. **External Integration**: Connect to hosted MCP servers for enhanced capabilities +8. 
**Production Ready**: Can integrate with real user approval systems + +## ⚠️ Important Implementation Notes + +### Task Queue Configuration +- **Worker**: Uses task queue `"openai-agents-hosted-mcp-task-queue"` +- **Runner Scripts**: Use the same task queue for consistency +- **Note**: Dedicated task queue for MCP-specific workflows + +### MCP Server Dependencies and Setup +- **External MCP Servers**: Must be accessible and operational +- **Network Connectivity**: Stable connection required for MCP server interactions +- **Approval Workflows**: Callback functions for tool execution permissions +- **Server Configuration**: Flexible server URL configuration for different endpoints + +### Specific Examples Implemented +- **Simple MCP Connection**: Direct connection to trusted GitMCP server +- **Approval MCP Workflow**: Callback-based approval for tool execution +- **Repository Analysis**: Questions about repository languages and structure +- **GitMCP Integration**: Default integration with `https://gitmcp.io/openai/codex` +- **Approval Callbacks**: Functions executing within Temporal workflow context + +### Architecture Patterns +- **Dual Pattern Design**: Simple and approval-based MCP integration patterns +- **Callback Integration**: Approval callbacks execute within workflow context +- **Server Configuration**: Flexible server URL and approval requirement configuration +- **Tool Integration**: Seamless MCP tool integration within OpenAI agents +- **Workflow Orchestration**: Temporal workflows managing MCP interactions + +### File Organization +``` +openai_agents/hosted_mcp/ +├── workflows/ # Core MCP implementations +│ ├── simple_mcp_workflow.py # Simple MCP connections +│ └── approval_mcp_workflow.py # Approval-based MCP connections +├── run_worker.py # Central worker for MCP workflows +├── run_simple_mcp_workflow.py # Simple MCP workflow runner +├── run_approval_mcp_workflow.py # Approval MCP workflow runner +└── README.md # MCP integration overview and usage +``` 
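The approval callback in `approval_mcp_workflow.py` approves every request. As a sketch of a more selective policy, the decision could be driven by an allowlist of tool names. Everything named below (`ApprovalResult`, `ALLOWED_TOOLS`, `decide_approval`, and the example tool names) is an illustrative assumption. The function takes a plain tool name, not the SDK's `MCPToolApprovalRequest`, and only the `{"approve": ...}` result shape is taken from the sample. The `reason` key is also an assumption about how a rejection might be explained.

```python
from typing import TypedDict


class ApprovalResult(TypedDict, total=False):
    """Stand-in mirroring the {"approve": ...} shape used in the sample."""

    approve: bool
    reason: str


# Hypothetical allowlist; real tool names depend on the MCP server.
ALLOWED_TOOLS = frozenset({"search_docs", "fetch_file"})


def decide_approval(tool_name: str) -> ApprovalResult:
    """Approve allowlisted tools and reject everything else with a reason."""
    if tool_name in ALLOWED_TOOLS:
        return {"approve": True}
    return {"approve": False, "reason": f"Tool '{tool_name}' is not allowlisted"}
```

Inside the real callback, such a function might be called as `decide_approval(request.data.name)`. Keeping the policy a pure function of the tool name makes it deterministic, which suits code that runs in a workflow context.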
+ +### Common Development Patterns +- **Use `require_approval: "never"`** for trusted MCP servers +- **Use `require_approval: "always"`** for servers requiring approval +- **Implement approval callbacks** for enhanced security and control +- **Configure server URLs** for different MCP server endpoints +- **Use server labels** for clear identification and debugging +- **Handle approval rejections** gracefully in production systems +- **Integrate with user approval systems** using Temporal signals and updates + +This structure ensures developers can understand: +- **MCP integration patterns** with OpenAI agents in Temporal workflows +- **Approval workflow implementation** for enhanced security and control +- **Server configuration** for different MCP server endpoints +- **Tool execution control** through approval workflows and callbacks +- **Workflow integration** of external MCP server capabilities +- **Production deployment** considerations for MCP integrations + +The hosted MCP integration serves as a bridge between OpenAI agents and external data sources, APIs, and tools while maintaining the reliability, observability, and error handling that Temporal provides. Each pattern demonstrates specific MCP integration strategies that can be adapted for custom external tool integrations and approval workflows. 
diff --git a/docs/openai_agents/MODEL_PROVIDERS.md b/docs/openai_agents/MODEL_PROVIDERS.md new file mode 100644 index 000000000..528fb986d --- /dev/null +++ b/docs/openai_agents/MODEL_PROVIDERS.md @@ -0,0 +1,550 @@ +# Model Providers Integration + +## 📑 Table of Contents + +- [Introduction](#introduction) +- [Philosophy & Challenges](#philosophy--challenges) +- [System Constraints & Features](#system-constraints--features) +- [System Overview](#system-overview) +- [System Flow](#system-flow) +- [Core Architecture](#core-architecture) +- [Interaction Flow](#interaction-flow) +- [Development Guidelines](#development-guidelines) +- [Code Examples & Best Practices](#code-examples--best-practices) + +## 🎯 Introduction + +The Model Providers Integration service demonstrates how to integrate custom Large Language Model (LLM) providers with OpenAI agents in Temporal workflows. This service showcases two key patterns: LiteLLM integration for cloud-based providers and custom model providers for local/OSS models like GPT-OSS with Ollama. 
+ +The system is designed for developers and engineering teams who want to: +- Learn how to integrate custom LLM providers with OpenAI agents in Temporal +- Understand LiteLLM integration for various cloud-based model providers +- Build custom model providers for local or open-source models +- Implement tool calling with non-OpenAI models +- Connect to local Ollama servers for cost-effective model execution +- Extend agent capabilities through custom model provider integrations +- Manage different model configurations and API endpoints + +## 🧠 Philosophy & Challenges + +### What We're Solving +- **Model Provider Diversity**: Different LLM providers have different APIs and capabilities +- **Cost Optimization**: Local models can reduce costs compared to cloud APIs +- **Custom Integration**: Need to integrate with non-standard model providers +- **Tool Calling Support**: Ensure custom models support function calling capabilities +- **Provider Abstraction**: Unified interface for different model providers +- **Local Deployment**: Support for models running on local infrastructure +- **API Compatibility**: Handle different API formats and authentication methods + +### Our Approach +- **Provider Abstraction**: Use ModelProvider interface for unified model access +- **LiteLLM Integration**: Leverage LiteLLM for cloud-based provider support +- **Custom Provider Implementation**: Build custom providers for specialized models +- **Tool Calling Support**: Ensure all models support function calling capabilities +- **Local Model Support**: Integrate with local Ollama servers for cost-effective execution +- **Workflow Integration**: Seamless integration within Temporal workflow contexts +- **Flexible Configuration**: Easy switching between different model providers + +## ⚡ System Constraints & Features + +### Key Features +- **LiteLLM Integration**: Built-in support for various cloud-based model providers +- **Custom Model Providers**: Custom provider implementation for 
specialized models +- **Local Model Support**: Integration with local Ollama servers +- **Tool Calling**: Function calling support across different model providers +- **Provider Abstraction**: Unified interface for different model types +- **Workflow Integration**: Custom models work seamlessly within Temporal contexts +- **Flexible Configuration**: Easy switching between different model providers + +### System Constraints +- **Model Compatibility**: Models must support function calling capabilities +- **Local Infrastructure**: Local models require sufficient hardware resources +- **API Rate Limits**: Cloud providers have rate limits and availability constraints +- **Model Download**: Local models require initial download and setup +- **Task Queue**: Uses `"openai-agents-model-providers-task-queue"` for all workflows +- **Timeout Management**: 30-second timeouts for model interactions +- **Provider Dependencies**: External model providers must be accessible and operational + +## 🏗️ System Overview + +```mermaid +graph TB + A[Temporal Workflow] --> B[OpenAI Agent] + B --> C{Model Provider} + + C -->|LiteLLM| D[Cloud Providers] + C -->|Custom| E[Local Models] + + D --> F[Anthropic Claude] + D --> G[Other LiteLLM Providers] + + E --> H[GPT-OSS with Ollama] + E --> I[Local Ollama Server] + + B --> J[Function Tools] + J --> K[Tool Execution] + K --> L[Model Response] + + subgraph "Provider Types" + M[LiteLLM Auto] + N[Custom Provider] + O[OpenAI Default] + end + + C --> M + C --> N + C --> O +``` + +## 🔄 System Flow + +```mermaid +sequenceDiagram + participant W as Workflow + participant A as Agent + participant M as Model Provider + participant P as Provider Service + participant T as Function Tool + + W->>A: Execute with prompt + A->>M: Request model + M->>P: Get model instance + + alt LiteLLM Provider + P->>P: Cloud API call + else Custom Provider + P->>P: Local model call + end + + A->>T: Use function tool + T->>T: Execute tool + T->>A: Tool result + A->>M: Generate 
response + M->>A: Model response + A->>W: Final output +``` + +## 🏛️ Core Architecture + +### Component Layers +1. **Workflow Layer**: Temporal workflows orchestrating agent execution +2. **Agent Layer**: OpenAI agents with custom model provider integration +3. **Provider Layer**: Model provider abstraction for different model types +4. **Model Layer**: Specific model implementations and configurations +5. **Tool Layer**: Function tools for enhanced agent capabilities +6. **Execution Layer**: Runner scripts and worker processes for deployment + +### Key Components +- **LitellmAutoWorkflow**: Workflow using LiteLLM for cloud-based providers +- **GptOssWorkflow**: Workflow using custom GPT-OSS provider with Ollama +- **LitellmProvider**: Built-in LiteLLM integration for cloud providers +- **CustomModelProvider**: Custom provider for local GPT-OSS models +- **Function Tools**: Weather tool demonstrating tool calling capabilities +- **Workflow Orchestration**: Temporal workflow management of model interactions + +## 🔗 Interaction Flow + +### Internal Communication +- Workflows orchestrate agent execution through custom model providers +- Agents communicate with models using the ModelProvider abstraction +- Model providers handle model instantiation and configuration +- Function tools execute within the workflow context +- Workflows manage the complete model interaction lifecycle + +### External Dependencies +- **Cloud Model APIs**: Anthropic, OpenAI, and other cloud providers +- **Local Ollama Server**: For local model execution +- **OpenAI API**: For default model provider fallback +- **Temporal Server**: For workflow orchestration and state management +- **Network Infrastructure**: For cloud API connectivity and local server communication + +## 💻 Development Guidelines + +### Code Organization +- **Workflow Files**: One file per model provider pattern in `workflows/` directory +- **Runner Scripts**: Individual execution scripts in root directory +- **Worker Files**: 
Provider-specific worker configurations +- **Provider Implementation**: Custom model provider classes + +### Design Patterns +- **Provider Abstraction Pattern**: Unified interface for different model types +- **LiteLLM Integration Pattern**: Built-in support for cloud-based providers +- **Custom Provider Pattern**: Custom implementation for specialized models +- **Tool Integration Pattern**: Function tools working with custom models +- **Workflow Orchestration Pattern**: Temporal workflows managing model interactions + +### Error Handling +- **Model Provider Failures**: Handle cases where custom providers fail +- **API Rate Limits**: Manage cloud provider rate limiting +- **Local Model Failures**: Handle local Ollama server issues +- **Tool Execution Errors**: Gracefully handle function tool failures +- **Provider Configuration Errors**: Handle misconfigured model providers + +## 📝 Code Examples & Best Practices + +### LiteLLM Auto Integration Pattern +**File**: `openai_agents/model_providers/workflows/litellm_auto_workflow.py` + +This pattern demonstrates LiteLLM integration for cloud-based model providers with function tool support. + +```python +from __future__ import annotations + +from agents import Agent, Runner, function_tool, set_tracing_disabled +from temporalio import workflow + +@workflow.defn +class LitellmAutoWorkflow: + @workflow.run + async def run(self, prompt: str) -> str: + # Disable tracing for this workflow to reduce overhead + set_tracing_disabled(disabled=True) + + # Define a function tool for weather information + @function_tool + def get_weather(city: str): + return f"The weather in {city} is sunny." 
+ + # Create agent with LiteLLM model and function tools + agent = Agent( + name="Assistant", + instructions="You only respond in haikus.", # Creative constraint for responses + model="anthropic/claude-3-5-sonnet-20240620", # LiteLLM model identifier + tools=[get_weather], # Function tools for enhanced capabilities + ) + + # Run the agent; the LiteLLM provider configured on the worker resolves the model name + result = await Runner.run(agent, prompt) + return result.final_output +``` + +**Key Benefits**: +- **Cloud Provider Integration**: The `provider/model` identifier selects the cloud provider through LiteLLM +- **Function Tool Support**: Tool calling works with LiteLLM-backed models +- **Creative Constraints**: Agent instructions steer the response format (here, haikus) +- **Tracing Control**: Tracing can be disabled to reduce overhead +- **Seamless Integration**: The agent runs inside a regular Temporal workflow + +### GPT-OSS with Ollama Pattern +**File**: `openai_agents/model_providers/workflows/gpt_oss_workflow.py` + +This pattern demonstrates custom model provider integration with local GPT-OSS models running on Ollama. + +```python +from __future__ import annotations + +from agents import Agent, Runner, function_tool, set_tracing_disabled +from temporalio import workflow + +@workflow.defn +class GptOssWorkflow: + @workflow.run + async def run(self, prompt: str) -> str: + # Disable tracing for this workflow to reduce overhead + set_tracing_disabled(disabled=True) + + # Define a function tool for weather information with logging + @function_tool + def get_weather(city: str): + workflow.logger.debug(f"Getting weather for {city}") # Debug logging + return f"The weather in {city} is sunny." + + # Create agent with local GPT-OSS model and function tools + agent = Agent( + name="Assistant", + instructions="You only respond in haikus. 
When asked about the weather always use the tool to get the current weather.", + model="gpt-oss:20b", # Local Ollama model identifier + tools=[get_weather], # Function tools for enhanced capabilities + ) + + # Execute agent with local model provider + result = await Runner.run(agent, prompt) + return result.final_output +``` + +**Key Benefits**: +- **Local Model Execution**: Cost-effective model execution on local infrastructure +- **Function Tool Support**: Full tool calling capabilities with local models +- **Debug Logging**: Enhanced debugging capabilities within workflows +- **Creative Constraints**: Model instructions can enforce specific response formats +- **Tool Integration**: Seamless integration of function tools with local models + +### Custom Model Provider Pattern +**File**: `openai_agents/model_providers/run_gpt_oss_worker.py` + +This pattern demonstrates how to implement custom model providers for specialized models like GPT-OSS with Ollama. + +```python +import asyncio +import logging +from datetime import timedelta +from typing import Optional + +from agents import Model, ModelProvider, OpenAIChatCompletionsModel +from openai import AsyncOpenAI +from temporalio.client import Client +from temporalio.contrib.openai_agents import ModelActivityParameters, OpenAIAgentsPlugin +from temporalio.worker import Worker + +from openai_agents.model_providers.workflows.gpt_oss_workflow import GptOssWorkflow + +# Configure Ollama client for local model access +ollama_client = AsyncOpenAI( + base_url="http://localhost:11434/v1", # Local Ollama API endpoint + api_key="ollama", # Ignored by Ollama but required by OpenAI client +) + +class CustomModelProvider(ModelProvider): + """Custom model provider for GPT-OSS models running on local Ollama server.""" + + def get_model(self, model_name: Optional[str]) -> Model: + # Create OpenAI-compatible model wrapper for GPT-OSS + model = OpenAIChatCompletionsModel( + model=model_name if model_name else "gpt-oss:20b", # Default 
model + openai_client=ollama_client, # Use local Ollama client + ) + return model + +async def main(): + # Configure logging to show workflow debug messages + logging.basicConfig(level=logging.WARNING) + logging.getLogger("temporalio.workflow").setLevel(logging.DEBUG) + + # Create client connected to Temporal server + client = await Client.connect( + "localhost:7233", + plugins=[ + OpenAIAgentsPlugin( + model_params=ModelActivityParameters( + start_to_close_timeout=timedelta(seconds=30) # 30-second timeout + ), + model_provider=CustomModelProvider(), # Use custom model provider + ), + ], + ) + + # Create worker supporting GPT-OSS workflow + worker = Worker( + client, + task_queue="openai-agents-model-providers-task-queue", # Dedicated task queue + workflows=[ + GptOssWorkflow, # Register GPT-OSS workflow + ], + ) + await worker.run() + +if __name__ == "__main__": + asyncio.run(main()) +``` + +**Key Benefits**: +- **Custom Provider Implementation**: Full control over model instantiation and configuration +- **Local Model Support**: Integration with local Ollama servers for cost-effective execution +- **OpenAI Compatibility**: Uses OpenAI client interface for seamless integration +- **Debug Logging**: Enhanced debugging capabilities for workflow development +- **Flexible Configuration**: Easy customization of model settings and timeouts + +### LiteLLM Provider Worker Pattern +**File**: `openai_agents/model_providers/run_litellm_provider_worker.py` + +This pattern demonstrates how to configure workers with LiteLLM provider for cloud-based model access. 
+ +```python +import asyncio +from datetime import timedelta + +from agents.extensions.models.litellm_provider import LitellmProvider +from temporalio.client import Client +from temporalio.contrib.openai_agents import ModelActivityParameters, OpenAIAgentsPlugin +from temporalio.worker import Worker + +from openai_agents.model_providers.workflows.litellm_auto_workflow import ( + LitellmAutoWorkflow, +) + +async def main(): + # Create client connected to Temporal server + client = await Client.connect( + "localhost:7233", + plugins=[ + OpenAIAgentsPlugin( + model_params=ModelActivityParameters( + start_to_close_timeout=timedelta(seconds=30) # 30-second timeout + ), + model_provider=LitellmProvider(), # Use built-in LiteLLM provider + ), + ], + ) + + # Create worker supporting LiteLLM workflows + worker = Worker( + client, + task_queue="openai-agents-model-providers-task-queue", # Dedicated task queue + workflows=[ + LitellmAutoWorkflow, # Register LiteLLM workflow + ], + ) + await worker.run() + +if __name__ == "__main__": + asyncio.run(main()) +``` + +**Key Benefits**: +- **Built-in Integration**: Uses built-in LiteLLM provider for easy cloud model access +- **Provider Abstraction**: Seamless integration with various cloud-based model providers +- **Timeout Management**: Appropriate timeouts for cloud API interactions +- **Workflow Registration**: Easy registration of LiteLLM-based workflows +- **Minimal Configuration**: Simple setup for cloud-based model integration + +### LiteLLM Auto Runner Pattern +**File**: `openai_agents/model_providers/run_litellm_auto_workflow.py` + +This pattern demonstrates how to execute LiteLLM workflows with cloud-based model providers. 
+ +```python +import asyncio + +from temporalio.client import Client +from temporalio.contrib.openai_agents import OpenAIAgentsPlugin + +from openai_agents.model_providers.workflows.litellm_auto_workflow import ( + LitellmAutoWorkflow, +) + +async def main(): + # Create client connected to Temporal server + client = await Client.connect( + "localhost:7233", + plugins=[OpenAIAgentsPlugin()], + ) + + # Execute LiteLLM workflow with weather question + result = await client.execute_workflow( + LitellmAutoWorkflow.run, + "What's the weather in Tokyo?", # Question for weather tool + id="litellm-auto-workflow-id", + task_queue="openai-agents-model-providers-task-queue", + ) + print(f"Result: {result}") + +if __name__ == "__main__": + asyncio.run(main()) +``` + +**Key Benefits**: +- **Simple Execution**: Easy execution of LiteLLM-based workflows +- **Question Customization**: Weather questions can be customized for different cities +- **Workflow Identification**: Clear workflow ID for monitoring and debugging +- **Result Display**: Simple output display for model responses +- **Task Queue Consistency**: Uses the same task queue as the worker + +### GPT-OSS Runner Pattern +**File**: `openai_agents/model_providers/run_gpt_oss_workflow.py` + +This pattern demonstrates how to execute GPT-OSS workflows with local Ollama models. 
+ +```python +import asyncio + +from temporalio.client import Client +from temporalio.contrib.openai_agents import OpenAIAgentsPlugin + +from openai_agents.model_providers.workflows.gpt_oss_workflow import GptOssWorkflow + +async def main(): + # Create client connected to Temporal server + client = await Client.connect( + "localhost:7233", + plugins=[OpenAIAgentsPlugin()], + ) + + # Execute GPT-OSS workflow with weather question + result = await client.execute_workflow( + GptOssWorkflow.run, + "What's the weather in Tokyo?", # Question for weather tool + id="litellm-gpt-oss-workflow-id", + task_queue="openai-agents-model-providers-task-queue", + ) + print(f"Result: {result}") + +if __name__ == "__main__": + asyncio.run(main()) +``` + +**Key Benefits**: +- **Local Model Execution**: Executes workflows using local GPT-OSS models +- **Cost Effectiveness**: No cloud API costs for model execution +- **Question Customization**: Weather questions can be customized for different cities +- **Workflow Identification**: Clear workflow ID for monitoring and debugging +- **Result Display**: Simple output display for local model responses + +## 🎯 Key Benefits of This Structure + +1. **Provider Abstraction**: Unified interface for different model types and providers +2. **Cost Optimization**: Local models reduce costs compared to cloud APIs +3. **Flexible Integration**: Easy switching between different model providers +4. **Tool Calling Support**: Full function tool capabilities across different models +5. **Local Deployment**: Support for models running on local infrastructure +6. **Cloud Provider Support**: LiteLLM integration for various cloud-based providers +7. **Custom Provider Implementation**: Full control over model configuration and behavior +8. 
**Workflow Integration**: Custom models work seamlessly within Temporal contexts + +## ⚠️ Important Implementation Notes + +### Task Queue Configuration +- **Workers**: Both `run_litellm_provider_worker.py` and `run_gpt_oss_worker.py` poll task queue `"openai-agents-model-providers-task-queue"` +- **Runner Scripts**: Must use the same task queue so that a worker picks up the workflow +- **Note**: Each worker registers only its own workflow, so run the worker that matches the workflow you execute; with both workers polling the shared queue, a workflow task can reach a worker that does not know the workflow type + +### Model Provider Dependencies and Setup +- **LiteLLM Integration**: Built-in support for cloud-based model providers +- **Ollama Server**: A local Ollama server (default `http://localhost:11434`) is required for GPT-OSS models +- **Model Downloads**: Local models must be downloaded before first use (for example, `ollama pull gpt-oss:20b`) +- **API Keys**: Cloud providers require their own API keys (for the Anthropic model used here, `ANTHROPIC_API_KEY`) + +### Specific Examples Implemented +- **LiteLLM Auto Integration**: Cloud-based model providers with function tools +- **GPT-OSS with Ollama**: Local model execution with custom provider +- **Weather Tool Integration**: Function tools working with custom models +- **Haiku Response Format**: Creative constraints for model responses +- **Custom Model Providers**: Full control over model instantiation and configuration + +### Architecture Patterns +- **Provider Abstraction Design**: Unified interface for different model types +- **LiteLLM Integration**: Built-in support for cloud-based providers +- **Custom Provider Implementation**: Full control over model configuration +- **Tool Integration**: Function tools working seamlessly with custom models +- **Workflow Orchestration**: Temporal workflows managing model interactions + +### File Organization +``` +openai_agents/model_providers/ +├── workflows/ # Core model provider implementations +│ ├── litellm_auto_workflow.py # LiteLLM cloud provider integration +│ └── gpt_oss_workflow.py # Custom GPT-OSS provider integration +├── run_litellm_provider_worker.py # LiteLLM worker configuration +├── run_gpt_oss_worker.py # GPT-OSS worker configuration +├── run_litellm_auto_workflow.py # LiteLLM workflow runner +├── 
run_gpt_oss_workflow.py # GPT-OSS workflow runner +└── README.md # Model provider overview and usage +``` + +### Common Development Patterns +- **Use `set_tracing_disabled(disabled=True)`** for performance optimization +- **Implement custom ModelProvider classes** for specialized model integration +- **Configure appropriate timeouts** for different model provider types +- **Use function tools** to enhance agent capabilities across different models +- **Handle provider failures** gracefully in production systems +- **Monitor model performance** and adjust timeouts accordingly +- **Test tool calling capabilities** with different model providers + +This structure ensures developers can understand: +- **Model provider integration patterns** with OpenAI agents in Temporal workflows +- **Custom provider implementation** for specialized models and local deployment +- **LiteLLM integration** for cloud-based model providers +- **Tool calling support** across different model types +- **Local model deployment** with Ollama and custom providers +- **Production deployment** considerations for custom model integrations + +The model providers integration serves as a bridge between OpenAI agents and various LLM providers, enabling cost optimization, local deployment, and enhanced flexibility while maintaining the reliability, observability, and error handling that Temporal provides. Each pattern demonstrates specific model provider integration strategies that can be adapted for custom model deployments and provider integrations. diff --git a/docs/openai_agents/README.md b/docs/openai_agents/README.md new file mode 100644 index 000000000..1fe3c95af --- /dev/null +++ b/docs/openai_agents/README.md @@ -0,0 +1,139 @@ +# OpenAI Agents SDK Integration with Temporal + +⚠️ **Public Preview** - This integration is experimental and its interfaces may change prior to General Availability. 
+ +This directory contains comprehensive examples demonstrating how to integrate the [OpenAI Agents SDK](https://github.com/openai/openai-agents-python) with Temporal's durable execution engine. These samples extend the OpenAI Agents SDK examples with Temporal's durability, orchestration, and observability capabilities. + +## 🏗️ **Architecture Overview** + +The integration creates a powerful synergy between two technologies: + +- **Temporal Workflows**: Provide durable execution, state management, and orchestration +- **OpenAI Agents SDK**: Deliver AI agent capabilities, tool integration, and LLM interactions + +This combination ensures that AI agent workflows are: +- **Durable**: Survive interruptions, restarts, and failures +- **Observable**: Full tracing, monitoring, and debugging capabilities +- **Scalable**: Handle complex multi-agent interactions and long-running conversations +- **Reliable**: Built-in retry mechanisms and error handling + +## 🔄 **Core Integration Patterns** + +### **Workflow-Orchestrated Agents** +Temporal workflows orchestrate the entire agent lifecycle, from initialization to completion, ensuring state persistence and fault tolerance. + +### **Agent State Management** +Workflows maintain conversation state, agent context, and execution history, enabling long-running, stateful AI interactions. + +### **Tool Integration** +Seamless integration of OpenAI's built-in tools (web search, code interpreter, file search) with custom Temporal activities for I/O operations. + +### **Multi-Agent Coordination** +Complex workflows can coordinate multiple specialized agents, each with distinct roles and responsibilities. 
+ +## 📚 **Service Documentation** + +Each service demonstrates specific integration patterns and use cases: + +### **Core Services** +- **[Basic Examples](./BASIC.md)** - Fundamental agent patterns, lifecycle management, and tool integration +- **[Agent Patterns](./AGENT_PATTERNS.md)** - Advanced multi-agent architectures, routing, and coordination patterns +- **[Tools Integration](./TOOLS.md)** - Comprehensive tool usage including code interpreter, file search, and image generation + +### **Specialized Workflows** +- **[Handoffs](./HANDOFFS.md)** - Agent collaboration and message filtering patterns +- **[Hosted MCP](./HOSTED_MCP.md)** - Model Context Protocol integration for external tool access +- **[Model Providers](./MODEL_PROVIDERS.md)** - Custom LLM provider integration (LiteLLM, Ollama, GPT-OSS) + +### **Domain-Specific Applications** +- **[Research Bot](./RESEARCH_BOT.md)** - Multi-agent research system with planning, search, and synthesis +- **[Customer Service](./CUSTOMER_SERVICE.md)** - Conversational workflows with escalation and state management +- **[Financial Research](./FINANCIAL_RESEARCH_AGENT.md)** - Complex multi-agent financial analysis system +- **[Reasoning Content](./REASONING_CONTENT.md)** - Accessing model reasoning and thought processes + +## 🚀 **Getting Started** + +### **Prerequisites** +- Temporal server [running locally](https://docs.temporal.io/cli/server#start-dev) +- Required dependencies: `uv sync --group openai-agents` +- OpenAI API key: `export OPENAI_API_KEY=your_key_here` + +### **Quick Start** +1. **Choose a Service**: Start with [Basic Examples](./BASIC.md) for fundamental concepts +2. **Run the Worker**: Execute the appropriate `run_worker.py` script +3. **Execute Workflow**: Use the corresponding `run_*_workflow.py` script +4. 
**Explore Patterns**: Move to [Agent Patterns](./AGENT_PATTERNS.md) for advanced usage + +### **Development Workflow** +```bash +# Start Temporal server +temporal server start-dev + +# Install dependencies +uv sync --group openai-agents + +# Run a specific example +uv run openai_agents/basic/run_worker.py +# In another terminal +uv run openai_agents/basic/run_hello_world_workflow.py +``` + +## 🔧 **Key Integration Features** + +### **Temporal Workflow Decorators** +```python +from temporalio import workflow + +@workflow.defn +class AgentWorkflow: + @workflow.run + async def run(self, input: str) -> str: + # Agent execution logic + pass +``` + +### **OpenAI Agents Plugin** +```python +from temporalio.contrib.openai_agents import OpenAIAgentsPlugin +from temporalio.worker import Worker + +# `client` is an already connected temporalio.client.Client +worker = Worker( + client, + task_queue="openai-agents-task-queue", + workflows=[AgentWorkflow], # A worker needs at least one workflow or activity + plugins=[OpenAIAgentsPlugin()], +) +``` + +### **Agent Integration** +```python +from agents import Agent, Runner + +# Runs inside a workflow's async run method +agent = Agent(name="MyAgent", instructions="...") +result = await Runner.run(agent, input_text) +``` + +## 📖 **Documentation Structure** + +Each service document follows a consistent structure: +- **Introduction**: Service purpose and role in the ecosystem +- **Architecture**: System design and component relationships +- **Code Examples**: Implementation patterns with file paths and benefits +- **Development Guidelines**: Best practices and common patterns +- **File Organization**: Directory structure and file purposes + +## 🔗 **Additional Resources** + +- [Temporal Python SDK Documentation](https://docs.temporal.io/python) +- [OpenAI Agents SDK Documentation](https://github.com/openai/openai-agents-python) +- [Module Documentation](https://github.com/temporalio/sdk-python/blob/main/temporalio/contrib/openai_agents/README.md) + +## 🎯 **Use Cases** + +This integration is ideal for: +- **Conversational AI**: Long-running, stateful conversations with memory +- **Multi-Agent Systems**: Coordinated AI agents working on complex tasks +- **Research & Analysis**: AI-powered 
research workflows with tool integration +- **Customer Service**: Intelligent support systems with escalation capabilities +- **Content Generation**: AI content creation with workflow orchestration +- **Data Processing**: AI-driven data analysis and transformation pipelines + +--- + +*For detailed implementation examples and specific use cases, refer to the individual service documentation linked above.* diff --git a/docs/openai_agents/REASONING_CONTENT.md b/docs/openai_agents/REASONING_CONTENT.md new file mode 100644 index 000000000..4176404e5 --- /dev/null +++ b/docs/openai_agents/REASONING_CONTENT.md @@ -0,0 +1,541 @@ +# Reasoning Content Extraction + +## 📑 Table of Contents + +- [Introduction](#introduction) +- [Philosophy & Challenges](#philosophy--challenges) +- [System Constraints & Features](#system-constraints--features) +- [System Overview](#system-overview) +- [System Flow](#system-flow) +- [Core Architecture](#core-architecture) +- [Interaction Flow](#interaction-flow) +- [Development Guidelines](#development-guidelines) +- [Code Examples & Best Practices](#code-examples--best-practices) + +## 🎯 Introduction + +The Reasoning Content Extraction service demonstrates how to access and utilize the reasoning content field from models that support it, such as deepseek-reasoner. This service showcases how to extract both the model's step-by-step thinking process and the final answer within Temporal workflows, providing transparency into AI decision-making processes. 
+ +The system is designed for developers and engineering teams who want to: +- Learn how to access reasoning content from reasoning-capable models +- Understand the model's step-by-step thinking process before final answers +- Build transparent AI systems with explainable reasoning +- Extract both reasoning and regular content from model responses +- Implement reasoning content extraction within Temporal workflows +- Use activities to handle I/O operations in workflows +- Work with models that provide detailed reasoning capabilities + +## 🧠 Philosophy & Challenges + +### What We're Solving +- **Reasoning Transparency**: Access to the model's internal thinking process +- **Explainable AI**: Understanding how models arrive at their conclusions +- **Content Separation**: Distinguishing between reasoning steps and final answers +- **I/O Handling**: Managing external API calls within Temporal workflows +- **Model Compatibility**: Working with models that support reasoning content +- **Content Extraction**: Parsing complex response structures for different content types +- **Workflow Integration**: Seamlessly integrating reasoning extraction into workflows + +### Our Approach +- **Activity-Based I/O**: Use Temporal activities for external model API calls +- **Content Parsing**: Extract both reasoning and regular content from responses +- **Model Abstraction**: Use OpenAI provider for model interactions +- **Structured Output**: Return structured results with separated content types +- **Workflow Orchestration**: Temporal workflows manage the reasoning extraction process +- **Flexible Model Selection**: Support for different reasoning-capable models +- **Content Type Detection**: Automatic detection of reasoning vs. 
regular content + +## ⚡ System Constraints & Features + +### Key Features +- **Reasoning Content Extraction**: Access to model's step-by-step thinking process +- **Dual Content Support**: Both reasoning and regular content extraction +- **Activity-Based Architecture**: Temporal activities handle external I/O operations +- **Model Flexibility**: Support for different reasoning-capable models +- **Structured Results**: Clear separation of reasoning and regular content +- **Workflow Integration**: Seamless integration within Temporal workflows +- **Content Type Detection**: Automatic parsing of different response content types + +### System Constraints +- **Model Compatibility**: Only works with models that support reasoning content +- **No Streaming**: Temporal workflows don't support streaming responses +- **I/O Operations**: External API calls must be handled in activities +- **Content Parsing**: Complex response structure parsing required +- **Task Queue**: Uses `"reasoning-content-task-queue"` for all workflows +- **Timeout Management**: 5-minute timeouts for reasoning model interactions +- **API Dependencies**: OpenAI API key and compatible model required + +## 🏗️ System Overview + +```mermaid +graph TB + A[Temporal Workflow] --> B[Reasoning Content Workflow] + B --> C[Execute Activity] + C --> D[Reasoning Activity] + D --> E[OpenAI Provider] + E --> F[Reasoning Model] + + F --> G[Model Response] + G --> H[Content Parsing] + + H --> I{Content Type} + I -->|Reasoning| J[Reasoning Content] + I -->|Message| K[Regular Content] + + J --> L[Structured Result] + K --> L + L --> M[Workflow Output] + + subgraph "Content Types" + N[Reasoning Steps] + O[Final Answer] + P[Refusal Content] + end + + H --> N + H --> O + H --> P +``` + +## 🔄 System Flow + +```mermaid +sequenceDiagram + participant W as Workflow + participant A as Activity + participant P as OpenAI Provider + participant M as Reasoning Model + participant C as Content Parser + + W->>A: Execute reasoning 
activity + A->>P: Get model instance + P->>M: Send reasoning request + M->>P: Return response with reasoning + P->>A: Model response + A->>C: Parse content types + C->>C: Extract reasoning content + C->>C: Extract regular content + C->>A: Parsed content tuple + A->>W: Return structured result +``` + +## 🏛️ Core Architecture + +### Component Layers +1. **Workflow Layer**: Temporal workflow orchestrating reasoning content extraction +2. **Activity Layer**: Temporal activities handling external I/O operations +3. **Provider Layer**: OpenAI provider for model interactions +4. **Model Layer**: Reasoning-capable models providing detailed responses +5. **Content Layer**: Content parsing and extraction logic +6. **Execution Layer**: Runner scripts and worker processes for deployment + +### Key Components +- **ReasoningContentWorkflow**: Main workflow orchestrating reasoning extraction +- **get_reasoning_response**: Activity for handling model API calls +- **OpenAIProvider**: Provider for model interactions +- **Content Parser**: Logic for extracting different content types +- **Structured Results**: Dataclass for organizing extracted content +- **Workflow Orchestration**: Temporal workflow management of reasoning extraction + +## 🔗 Interaction Flow + +### Internal Communication +- Workflows orchestrate reasoning extraction through activities +- Activities handle external model API calls and content parsing +- Content parsing logic extracts different types of response content +- Structured results organize reasoning and regular content +- Workflows manage the complete reasoning extraction lifecycle + +### External Dependencies +- **OpenAI API**: For model interactions and responses +- **Reasoning Models**: Models that support reasoning content (e.g., deepseek-reasoner) +- **Temporal Server**: For workflow orchestration and state management +- **Network Infrastructure**: For API connectivity and model communication +- **Environment Variables**: For API keys and model 
configuration + +## 💻 Development Guidelines + +### Code Organization +- **Workflow Files**: One file per reasoning pattern in `workflows/` directory +- **Activity Files**: I/O handling logic in `activities/` directory +- **Runner Scripts**: Individual execution scripts in root directory +- **Worker**: Central worker supporting reasoning workflows in `run_worker.py` + +### Design Patterns +- **Activity-Based I/O Pattern**: Use activities for external API calls +- **Content Parsing Pattern**: Extract different content types from responses +- **Structured Output Pattern**: Return organized results with separated content +- **Model Provider Pattern**: Use OpenAI provider for model interactions +- **Workflow Orchestration Pattern**: Temporal workflows managing reasoning extraction + +### Error Handling +- **Model API Failures**: Handle cases where reasoning models are unavailable +- **Content Parsing Errors**: Gracefully handle unexpected response structures +- **Activity Failures**: Handle activity execution failures +- **Model Compatibility**: Handle models that don't support reasoning content +- **Timeout Management**: Handle long-running reasoning model interactions + +## 📝 Code Examples & Best Practices + +### Reasoning Content Workflow Pattern +**File**: `openai_agents/reasoning_content/workflows/reasoning_content_workflow.py` + +This pattern demonstrates how to orchestrate reasoning content extraction using Temporal workflows and activities. 
+ +```python +from dataclasses import dataclass +from datetime import timedelta + +from temporalio import workflow + +from openai_agents.reasoning_content.activities.reasoning_activities import ( + get_reasoning_response, +) + +# Structured output for reasoning results +@dataclass +class ReasoningResult: + reasoning_content: str | None # Model's step-by-step thinking process + regular_content: str | None # Final answer or response + prompt: str # Original input prompt + +@workflow.defn +class ReasoningContentWorkflow: + @workflow.run + async def run(self, prompt: str, model_name: str | None = None) -> ReasoningResult: + # Call the activity to get the reasoning response + # Activities handle external I/O operations that workflows cannot perform directly + reasoning_content, regular_content = await workflow.execute_activity( + get_reasoning_response, + args=[prompt, model_name], + start_to_close_timeout=timedelta(minutes=5), # Extended timeout for reasoning models + ) + + # Return structured result with separated content types + return ReasoningResult( + reasoning_content=reasoning_content, + regular_content=regular_content, + prompt=prompt, + ) +``` + +**Key Benefits**: +- **Activity-Based I/O**: External API calls handled in activities, not workflows +- **Structured Results**: Clear separation of reasoning and regular content +- **Extended Timeouts**: A 5-minute start-to-close timeout leaves room for slow reasoning models +- **Content Organization**: Dataclass provides clear structure for extracted content +- **Workflow Integration**: Seamless integration within Temporal workflow contexts + +### Reasoning Content Activity Pattern +**File**: `openai_agents/reasoning_content/activities/reasoning_activities.py` + +This pattern demonstrates how to implement activities for extracting reasoning content from model responses. 
+ +```python +import os +from typing import Any, cast + +from agents import ModelSettings +from agents.models.interface import ModelTracing +from agents.models.openai_provider import OpenAIProvider +from openai.types.responses import ResponseOutputRefusal, ResponseOutputText +from temporalio import activity + +@activity.defn +async def get_reasoning_response( + prompt: str, model_name: str | None = None +) -> tuple[str | None, str | None]: + """ + Activity to get response from a reasoning-capable model. + Returns tuple of (reasoning_content, regular_content). + """ + # Use provided model name, environment variable, or default to deepseek-reasoner + model_name = model_name or os.getenv("EXAMPLE_MODEL_NAME") or "deepseek-reasoner" + + # Initialize OpenAI provider and get model instance + provider = OpenAIProvider() + model = provider.get_model(model_name) + + # Send request to reasoning-capable model + response = await model.get_response( + system_instructions="You are a helpful assistant that explains your reasoning step by step.", + input=prompt, + model_settings=ModelSettings(), + tools=[], # No tools needed for reasoning content extraction + output_schema=None, + handoffs=[], + tracing=ModelTracing.DISABLED, # Disable tracing for performance + previous_response_id=None, + prompt=None, + ) + + # Extract reasoning content and regular content from the response + reasoning_content = None + regular_content = None + + # Parse response output to identify different content types + for item in response.output: + if hasattr(item, "type") and item.type == "reasoning": + # Extract reasoning content from reasoning-type items; the summary list may be empty + if item.summary: + reasoning_content = item.summary[0].text + elif hasattr(item, "type") and item.type == "message": + # Extract regular content from message-type items + if item.content and len(item.content) > 0: + content_item = item.content[0] + if isinstance(content_item, ResponseOutputText): + # Handle text content + regular_content = content_item.text + elif 
isinstance(content_item, ResponseOutputRefusal): + # Handle refusal content + refusal_item = cast(Any, content_item) + regular_content = refusal_item.refusal + + # Return tuple of (reasoning_content, regular_content) + return reasoning_content, regular_content +``` + +**Key Benefits**: +- **Content Type Detection**: Automatic identification of reasoning vs. regular content +- **Flexible Model Selection**: Support for different reasoning-capable models +- **Comprehensive Parsing**: Handles text, refusal, and reasoning content types +- **Type Safety**: Proper type casting and validation for different content types +- **Environment Configuration**: Easy model selection through environment variables + +### Structured Result Pattern +**File**: `openai_agents/reasoning_content/workflows/reasoning_content_workflow.py` + +This pattern demonstrates how to structure and organize extracted reasoning content for clear consumption. + +```python +@dataclass +class ReasoningResult: + reasoning_content: str | None # Model's step-by-step thinking process + regular_content: str | None # Final answer or response + prompt: str # Original input prompt +``` + +**Key Benefits**: +- **Clear Organization**: Separates different types of extracted content +- **Type Safety**: Optional fields handle cases where content might not be available +- **Prompt Preservation**: Maintains original input for context and debugging +- **Easy Consumption**: Simple structure for consuming applications +- **Extensible Design**: Easy to add additional content types in the future + +### Content Parsing Pattern +**File**: `openai_agents/reasoning_content/activities/reasoning_activities.py` + +This pattern demonstrates how to parse complex model responses to extract different content types. 
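The same branching can be exercised without the SDK or a model call. The sketch below is an illustration only: `ReasoningItem`, `MessageItem`, `SummaryPart`, `TextPart`, `RefusalPart`, and `split_content` are hypothetical stand-ins, not SDK types. It also adds a guard for an empty `summary` list, which is an assumption about what a model may return rather than documented behavior.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple, Union


# Hypothetical stand-ins for the SDK's response item types (illustration only).
@dataclass
class SummaryPart:
    text: str


@dataclass
class TextPart:
    text: str


@dataclass
class RefusalPart:
    refusal: str


@dataclass
class ReasoningItem:
    summary: List[SummaryPart] = field(default_factory=list)
    type: str = "reasoning"


@dataclass
class MessageItem:
    content: List[Union[TextPart, RefusalPart]] = field(default_factory=list)
    type: str = "message"


def split_content(output: List[object]) -> Tuple[Optional[str], Optional[str]]:
    """Return (reasoning_content, regular_content) from a list of output items."""
    reasoning_content: Optional[str] = None
    regular_content: Optional[str] = None
    for item in output:
        item_type = getattr(item, "type", None)
        if item_type == "reasoning":
            summary = getattr(item, "summary", None) or []
            if summary:  # guard: skip reasoning items that carry no summary parts
                reasoning_content = summary[0].text
        elif item_type == "message":
            content = getattr(item, "content", None) or []
            if content:
                first = content[0]
                if isinstance(first, TextPart):
                    regular_content = first.text
                elif isinstance(first, RefusalPart):
                    regular_content = first.refusal
    # Items of any other type are ignored.
    return reasoning_content, regular_content
```

A function shaped like this can be unit-tested with hand-built items before it is wired into an activity. The excerpt from the sample activity follows.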
+
+```python
+# Parse response output to identify different content types
+for item in response.output:
+    if hasattr(item, "type") and item.type == "reasoning":
+        # Extract reasoning content from reasoning-type items
+        reasoning_content = item.summary[0].text
+    elif hasattr(item, "type") and item.type == "message":
+        # Extract regular content from message-type items
+        if item.content and len(item.content) > 0:
+            content_item = item.content[0]
+            if isinstance(content_item, ResponseOutputText):
+                # Handle text content
+                regular_content = content_item.text
+            elif isinstance(content_item, ResponseOutputRefusal):
+                # Handle refusal content
+                refusal_item = cast(Any, content_item)
+                regular_content = refusal_item.refusal
+```
+
+**Key Benefits**:
+- **Type Detection**: Branches on each output item's `type` (`"reasoning"` or `"message"`)
+- **Comprehensive Coverage**: Handles reasoning, text, and refusal content
+- **Safe Access**: Checks `type` and `content` before reading them; `item.summary[0]` still assumes a non-empty summary list
+- **Type Casting**: `cast(Any, ...)` relaxes static typing when reading the refusal text
+- **Error Resilience**: Items of any other type are skipped rather than raising
+
+### Worker Configuration
+**File**: `openai_agents/reasoning_content/run_worker.py`
+
+This is the central worker for the reasoning content example. It registers both the workflow and the activity.
+ +```python +#!/usr/bin/env python3 + +import asyncio + +from temporalio.client import Client +from temporalio.contrib.openai_agents import OpenAIAgentsPlugin +from temporalio.worker import Worker + +from openai_agents.reasoning_content.activities.reasoning_activities import ( + get_reasoning_response, +) +from openai_agents.reasoning_content.workflows.reasoning_content_workflow import ( + ReasoningContentWorkflow, +) + +async def main(): + # Create client connected to Temporal server + client = await Client.connect( + "localhost:7233", + plugins=[OpenAIAgentsPlugin()], + ) + + # Create worker supporting reasoning content workflows and activities + worker = Worker( + client, + task_queue="reasoning-content-task-queue", # Dedicated task queue for reasoning + workflows=[ReasoningContentWorkflow], # Register reasoning workflow + activities=[get_reasoning_response], # Register reasoning activity + ) + + print("Starting reasoning content worker...") + await worker.run() + +if __name__ == "__main__": + asyncio.run(main()) +``` + +**Key Benefits**: +- **Dual Registration**: Registers both workflows and activities +- **Dedicated Task Queue**: Separate queue for reasoning-specific operations +- **Activity Support**: Enables external I/O operations within workflows +- **Simple Setup**: Easy deployment of reasoning content capabilities +- **Clear Logging**: Informative startup messages for monitoring + +### Runner Script Pattern +**File**: `openai_agents/reasoning_content/run_reasoning_content_workflow.py` + +This pattern demonstrates how to execute reasoning content workflows with multiple demo prompts. 
+ +```python +#!/usr/bin/env python3 + +import asyncio +import os + +from temporalio.client import Client +from temporalio.contrib.openai_agents import OpenAIAgentsPlugin + +from openai_agents.reasoning_content.workflows.reasoning_content_workflow import ( + ReasoningContentWorkflow, + ReasoningResult, +) + +async def main(): + # Create client connected to Temporal server + client = await Client.connect( + "localhost:7233", + plugins=[OpenAIAgentsPlugin()], + ) + + # Demo prompts that benefit from reasoning + demo_prompts = [ + "What is the square root of 841? Please explain your reasoning.", + "Explain the concept of recursion in programming", + "Write a haiku about recursion in programming", + ] + + # Get model name from environment or use default + model_name = os.getenv("EXAMPLE_MODEL_NAME") or "deepseek-reasoner" + print(f"Using model: {model_name}") + print("Note: This example requires a model that supports reasoning content.") + print("You may need to use a specific model like deepseek-reasoner or similar.\n") + + # Execute workflow for each demo prompt + for i, prompt in enumerate(demo_prompts, 1): + print(f"=== Example {i}: {prompt} ===") + + # Execute reasoning content workflow + result: ReasoningResult = await client.execute_workflow( + ReasoningContentWorkflow.run, + args=[prompt, model_name], + id=f"reasoning-content-{i}", + task_queue="reasoning-content-task-queue", + ) + + # Display structured results + print(f"\nPrompt: {result.prompt}") + print("\nReasoning Content:") + print(result.reasoning_content or "No reasoning content provided") + print("\nRegular Content:") + print(result.regular_content or "No regular content provided") + print("-" * 50 + "\n") + +if __name__ == "__main__": + asyncio.run(main()) +``` + +**Key Benefits**: +- **Multiple Examples**: Demonstrates reasoning content with different types of prompts +- **Environment Configuration**: Easy model selection through environment variables +- **Structured Display**: Clear presentation of 
extracted reasoning and regular content +- **Workflow Identification**: Unique IDs for each example execution +- **Comprehensive Testing**: Tests reasoning capabilities across different prompt types + +## 🎯 Key Benefits of This Structure + +1. **Reasoning Transparency**: Access to model's step-by-step thinking process +2. **Explainable AI**: Understanding how models arrive at conclusions +3. **Content Separation**: Clear distinction between reasoning and final answers +4. **Activity-Based I/O**: Proper handling of external API calls in workflows +5. **Structured Results**: Organized output for easy consumption +6. **Model Flexibility**: Support for different reasoning-capable models +7. **Content Type Detection**: Automatic parsing of different response structures +8. **Workflow Integration**: Seamless integration within Temporal contexts + +## ⚠️ Important Implementation Notes + +### Task Queue Configuration +- **Worker**: Uses task queue `"reasoning-content-task-queue"` +- **Runner Scripts**: Use the same task queue for consistency +- **Note**: Dedicated task queue for reasoning content-specific workflows + +### Reasoning Content Dependencies and Setup +- **Model Compatibility**: Only works with models that support reasoning content +- **OpenAI API Key**: Required for model interactions +- **Reasoning Models**: Models like deepseek-reasoner provide reasoning content +- **Activity Registration**: Both workflows and activities must be registered + +### Specific Examples Implemented +- **Mathematical Reasoning**: Square root calculation with step-by-step explanation +- **Concept Explanation**: Programming concepts like recursion +- **Creative Writing**: Haiku generation with reasoning process +- **Content Extraction**: Automatic parsing of reasoning vs. 
regular content +- **Structured Results**: Clear organization of extracted content types + +### Architecture Patterns +- **Activity-Based I/O Design**: External API calls handled in activities +- **Content Parsing**: Automatic detection and extraction of different content types +- **Structured Output**: Organized results with clear content separation +- **Model Provider Integration**: OpenAI provider for reasoning model interactions +- **Workflow Orchestration**: Temporal workflows managing reasoning extraction + +### File Organization +``` +openai_agents/reasoning_content/ +├── workflows/ # Core reasoning workflow implementation +│ └── reasoning_content_workflow.py # Reasoning content extraction workflow +├── activities/ # I/O handling activities +│ └── reasoning_activities.py # Reasoning content extraction activity +├── run_worker.py # Central worker for reasoning workflows +├── run_reasoning_content_workflow.py # Reasoning workflow runner with demos +└── README.md # Reasoning content overview and usage +``` + +### Common Development Patterns +- **Use activities for external I/O** operations that workflows cannot perform +- **Implement content type detection** for automatic parsing of responses +- **Structure results clearly** with separated reasoning and regular content +- **Handle missing content gracefully** with optional fields and fallbacks +- **Use extended timeouts** for reasoning model interactions +- **Test with different prompt types** to validate reasoning capabilities +- **Monitor content extraction** for unexpected response formats + +This structure ensures developers can understand: +- **Reasoning content extraction patterns** with OpenAI agents in Temporal workflows +- **Activity-based I/O handling** for external API calls +- **Content parsing strategies** for different response types +- **Structured output organization** for extracted content +- **Model compatibility requirements** for reasoning content support +- **Production deployment** 
considerations for reasoning extraction systems
+
+Reasoning content extraction exposes a model's intermediate reasoning alongside its final answer, while Temporal supplies the reliability, observability, and error handling around the model call. Each pattern above can be adapted for custom explainable AI systems and reasoning content analysis.
diff --git a/docs/openai_agents/RESEARCH_BOT.md b/docs/openai_agents/RESEARCH_BOT.md
new file mode 100644
index 000000000..45280e4a0
--- /dev/null
+++ b/docs/openai_agents/RESEARCH_BOT.md
@@ -0,0 +1,324 @@
+# Research Bot
+
+## 📑 Table of Contents
+- [Introduction](#introduction)
+- [Philosophy & Challenges](#philosophy--challenges)
+- [System Constraints & Features](#system-constraints--features)
+- [System Overview](#system-overview)
+- [System Flow](#system-flow)
+- [Core Architecture](#core-architecture)
+- [Interaction Flow](#interaction-flow)
+- [Development Guidelines](#development-guidelines)
+- [Code Examples & Best Practices](#code-examples--best-practices)
+- [Key Benefits of This Structure](#key-benefits-of-this-structure)
+- [Important Implementation Notes](#important-implementation-notes)
+- [Architecture Patterns](#architecture-patterns)
+- [File Organization](#file-organization)
+- [Common Development Patterns](#common-development-patterns)
+
+## 🎯 Introduction
+The Research Bot is a multi-agent research system that orchestrates specialized AI agents to perform web-based research tasks. It extends the OpenAI Agents SDK research bot with Temporal's durable execution, so a research workflow can plan a set of searches, run them in parallel, and synthesize the results into a report while surviving interruptions.
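The plan, search, and synthesize shape described above can be sketched with plain `asyncio`, outside any Temporal workflow. This is a minimal illustration under stated assumptions: `plan_searches`, `run_search`, `write_report`, and `research` are hypothetical stand-ins for the planner, search, and writer agents, and no model or web search is called.

```python
import asyncio
from typing import List, Optional


async def plan_searches(query: str) -> List[str]:
    # Stand-in for the planner agent: derive search terms from the query.
    # The empty term is included to simulate one search that fails.
    return [f"{query} overview", f"{query} recent developments", ""]


async def run_search(term: str) -> Optional[str]:
    # Stand-in for a search agent: returns None when the search "fails".
    await asyncio.sleep(0)
    return f"summary of '{term}'" if term else None


async def write_report(query: str, summaries: List[str]) -> str:
    # Stand-in for the writer agent: join the summaries into a markdown report.
    body = "\n".join(f"- {s}" for s in summaries)
    return f"# Report: {query}\n{body}"


async def research(query: str) -> str:
    terms = await plan_searches(query)
    # Run all searches concurrently, then drop the ones that returned None.
    results = await asyncio.gather(*(run_search(t) for t in terms))
    summaries = [r for r in results if r is not None]
    return await write_report(query, summaries)


if __name__ == "__main__":
    print(asyncio.run(research("durable execution")))
```

Inside a Temporal workflow the concurrency primitive differs: this document's own example iterates with `workflow.as_completed` rather than `asyncio.gather`.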
+ +## 🧠 Philosophy & Challenges + +### What We're Solving +- **Research Complexity**: Breaking down complex research queries into manageable, focused web searches +- **Information Overload**: Synthesizing multiple search results into coherent, actionable reports +- **Scalability**: Executing multiple searches in parallel while maintaining workflow reliability +- **Quality Assurance**: Ensuring research coverage and report coherence through specialized agent roles + +### Our Approach +- **Agent Specialization**: Each agent has a focused responsibility (planning, searching, writing) +- **Parallel Execution**: Leveraging Temporal's workflow capabilities for concurrent search operations +- **Structured Output**: Using Pydantic models to ensure data consistency and type safety +- **Progressive Refinement**: Planning → Searching → Synthesis workflow for systematic research + +## ⚡ System Constraints & Features + +### Key Features +- **Multi-Agent Orchestration**: Coordinated execution of specialized research agents +- **Parallel Web Search**: Concurrent execution of multiple search queries for efficiency +- **Intelligent Planning**: AI-driven search strategy generation based on research queries +- **Structured Reporting**: Consistent output format with summaries, detailed reports, and follow-up questions +- **Durable Execution**: Temporal workflow ensures research tasks survive interruptions + +### System Constraints +- **Search Result Limits**: Summaries capped at 300 words for conciseness +- **Report Length**: Target 5-10 pages (1000+ words) for comprehensive coverage +- **Model Selection**: Different models for different tasks (GPT-4o for planning, o3-mini for writing) +- **Tool Requirements**: Search agent must use WebSearchTool (tool_choice="required") + +## 🏗️ System Overview +```mermaid +graph TB + A[User Query] --> B[Research Workflow] + B --> C[Research Manager] + C --> D[Planner Agent] + C --> E[Search Agents] + C --> F[Writer Agent] + D --> G[Web Search Plan] 
+    G --> E
+    E --> H[Search Results]
+    H --> F
+    F --> I[Final Report]
+    I --> J[Markdown Output]
+
+    subgraph "Agent Specialization"
+        D
+        E
+        F
+    end
+
+    subgraph "Temporal Workflow"
+        B
+        C
+    end
+```
+
+## 🔄 System Flow
+```mermaid
+sequenceDiagram
+    participant U as User
+    participant W as Research Workflow
+    participant M as Research Manager
+    participant P as Planner Agent
+    participant S as Search Agents
+    participant WR as Writer Agent
+
+    U->>W: Submit Research Query
+    W->>M: Execute Research
+    M->>P: Generate Search Plan
+    P->>M: Return WebSearchPlan
+    M->>S: Execute Parallel Searches
+    S->>M: Return Search Summaries
+    M->>WR: Synthesize Report
+    WR->>M: Return ReportData
+    M->>W: Return Final Report
+    W->>U: Deliver Markdown Report
+```
+
+## 🏛️ Core Architecture
+
+### Component Layers
+1. **Workflow Layer**: Temporal workflow wrapper for the research process
+2. **Orchestration Layer**: ResearchManager coordinates all agent interactions
+3. **Agent Layer**: Specialized agents for planning, searching, and writing
+4.
**Tool Layer**: WebSearchTool integration for web research capabilities + +### Key Components +- **[ResearchWorkflow]**: Temporal workflow entry point that wraps the ResearchManager +- **[ResearchManager]**: Central orchestrator that manages the entire research lifecycle +- **[PlannerAgent]**: Generates strategic search plans based on research queries +- **[SearchAgent]**: Executes web searches and produces concise summaries +- **[WriterAgent]**: Synthesizes search results into comprehensive reports + +## 🔗 Interaction Flow + +### Internal Communication +- **Sequential Planning**: Planner agent generates search strategy before execution +- **Parallel Search Execution**: Multiple search agents run concurrently using `workflow.as_completed` +- **Result Aggregation**: Search results are collected and passed to the writer agent +- **Final Synthesis**: Writer agent processes all results to create the final report + +### External Dependencies +- **Web Search API**: WebSearchTool provides internet research capabilities +- **OpenAI Models**: Different models for different agent roles (GPT-4o, o3-mini) +- **Temporal Server**: Workflow orchestration and durable execution + +## 💻 Development Guidelines + +### Code Organization +- **Agent Separation**: Each agent is defined in its own file with clear responsibilities +- **Model Definitions**: Pydantic models for structured data exchange between agents +- **Workflow Isolation**: Simple workflow wrapper around the ResearchManager +- **Runner Scripts**: Separate execution scripts for workflow and worker + +### Design Patterns +- **Agent Factory Pattern**: `new_*_agent()` functions for agent instantiation +- **Strategy Pattern**: Different agents handle different aspects of research +- **Pipeline Pattern**: Sequential processing through planning → searching → writing +- **Parallel Execution**: Concurrent search execution for improved performance + +### Error Handling +- **Graceful Degradation**: Failed searches return None and are 
filtered out
+- **Exception Isolation**: Individual search failures don't break the entire workflow
+- **Result Validation**: Pydantic models ensure data consistency and type safety
+
+## 📝 Code Examples & Best Practices
+
+### Research Manager Orchestration
+**File**: `openai_agents/research_bot/agents/research_manager.py`
+
+```python
+class ResearchManager:
+    def __init__(self):
+        self.run_config = RunConfig()
+        self.search_agent = new_search_agent()
+        self.planner_agent = new_planner_agent()
+        self.writer_agent = new_writer_agent()
+
+    async def run(self, query: str) -> str:
+        # Group the three phases under one trace
+        with trace("Research trace"):
+            search_plan = await self._plan_searches(query)
+            search_results = await self._perform_searches(search_plan)
+            report = await self._write_report(query, search_results)
+            return report.markdown_report
+```
+
+**Key Benefits**:
+- Centralized orchestration of all research phases
+- Clear separation of concerns with dedicated methods
+- Agents SDK tracing via `trace()` for observability and debugging
+- Structured workflow from planning to final report
+
+### Parallel Search Execution
+**File**: `openai_agents/research_bot/agents/research_manager.py`
+
+```python
+async def _perform_searches(self, search_plan: WebSearchPlan) -> list[str]:
+    with custom_span("Search the web"):
+        num_completed = 0
+        # Start one task per planned search so they run concurrently
+        tasks = [
+            asyncio.create_task(self._search(item))
+            for item in search_plan.searches
+        ]
+        results = []
+        # workflow.as_completed is Temporal's deterministic counterpart of asyncio.as_completed
+        for task in workflow.as_completed(tasks):
+            result = await task
+            if result is not None:
+                results.append(result)
+            num_completed += 1
+        return results
+```
+
+**Key Benefits**:
+- Concurrent execution of multiple searches for improved performance
+- Proper Temporal workflow integration with `workflow.as_completed`
+- Graceful handling of failed searches (None results are filtered)
+- Progress tracking with completion counters
+
+### Agent Configuration and Tools
+**File**: `openai_agents/research_bot/agents/search_agent.py`
+
+```python
+def new_search_agent():
+    return Agent(
+        name="Search agent",
+        instructions=INSTRUCTIONS,
+        tools=[WebSearchTool()],
+        model_settings=ModelSettings(tool_choice="required"),
+    )
+```
+
+**Key Benefits**:
+- Forced tool usage ensures WebSearchTool is always employed
+- Clear agent naming for debugging and monitoring
+- Structured instructions for consistent search behavior
+- Tool integration for web research capabilities
+
+### Structured Data Models
+**File**: `openai_agents/research_bot/agents/planner_agent.py`
+
+```python
+class WebSearchItem(BaseModel):
+    reason: str
+    "Your reasoning for why this search is important to the query."
+    query: str
+    "The search term to use for the web search."
+
+class WebSearchPlan(BaseModel):
+    searches: list[WebSearchItem]
+    """A list of web searches to perform to best answer the query."""
+```
+
+**Key Benefits**:
+- Type-safe data exchange between agents
+- Self-documenting fields with descriptive docstrings
+- Validation ensures data consistency across the workflow
+- Clear structure for search planning and execution
+
+### Workflow Integration
+**File**: `openai_agents/research_bot/workflows/research_bot_workflow.py`
+
+```python
+@workflow.defn
+class ResearchWorkflow:
+    @workflow.run
+    async def run(self, query: str) -> str:
+        return await ResearchManager().run(query)
+```
+
+**Key Benefits**:
+- Simple workflow wrapper around the ResearchManager
+- Clean separation between Temporal workflow and business logic
+- Easy to extend with additional workflow features
+- Minimal boilerplate for workflow definition
+
+### Worker Configuration
+**File**: `openai_agents/research_bot/run_worker.py`
+
+```python
+worker = Worker(
+    client,
+    task_queue="openai-agents-task-queue",
+    workflows=[
+        ResearchWorkflow,
+    ],
+)
+```
+
+**Key Benefits**:
+- Consistent task queue naming across the system
+- Extended timeouts for AI model operations (120 seconds), configured outside the excerpt above
+- OpenAIAgentsPlugin integration for agent capabilities, also outside the excerpt above
+- Simple worker setup focused on research workflows
+
+## 🎯
**Key Benefits of This Structure:** + +1. **Scalable Research**: Parallel search execution handles multiple queries efficiently +2. **Agent Specialization**: Each agent has a focused role for better performance +3. **Durable Execution**: Temporal ensures research tasks survive interruptions +4. **Structured Output**: Pydantic models guarantee data consistency +5. **Observability**: Tracing and custom spans provide debugging insights +6. **Modular Design**: Easy to extend with new agent types or research capabilities + +## ⚠️ **Important Implementation Notes:** + +- **Task Queue Consistency**: Uses `"openai-agents-task-queue"` for all operations +- **Model Selection Strategy**: Different models for different tasks (GPT-4o for planning, o3-mini for writing) +- **Search Result Filtering**: Failed searches return None and are automatically filtered out +- **Parallel Execution**: Uses `workflow.as_completed` for proper Temporal integration +- **Tool Requirements**: Search agent enforces tool usage with `tool_choice="required"` + +## 🏗️ **Architecture Patterns:** + +- **Multi-Agent System**: Coordinated execution of specialized AI agents +- **Pipeline Architecture**: Sequential processing through planning → searching → writing +- **Parallel Processing**: Concurrent search execution for improved performance +- **Workflow Orchestration**: Temporal manages the entire research lifecycle +- **Factory Pattern**: Agent instantiation through dedicated factory functions + +## 📁 **File Organization:** + +``` +openai_agents/research_bot/ +├── README.md # High-level overview and usage instructions +├── agents/ +│ ├── research_manager.py # Central orchestration and workflow management +│ ├── planner_agent.py # Search strategy generation +│ ├── search_agent.py # Web search execution and summarization +│ └── writer_agent.py # Report synthesis and generation +├── workflows/ +│ └── research_bot_workflow.py # Temporal workflow wrapper +├── run_research_workflow.py # Client script for 
executing research +└── run_worker.py # Worker configuration and execution +``` + +## 🔧 **Common Development Patterns:** + +- **Agent Creation**: Use factory functions (`new_*_agent()`) for consistent agent setup +- **Data Flow**: Follow the planning → searching → writing pipeline for new research features +- **Error Handling**: Implement graceful degradation for individual component failures +- **Tracing**: Use `trace()` and `custom_span()` for observability and debugging +- **Model Selection**: Choose appropriate models based on task complexity and requirements +- **Tool Integration**: Ensure required tools are properly configured with `tool_choice="required"` diff --git a/docs/openai_agents/TOOLS.md b/docs/openai_agents/TOOLS.md new file mode 100644 index 000000000..5617b3657 --- /dev/null +++ b/docs/openai_agents/TOOLS.md @@ -0,0 +1,792 @@ +# Tools Integration + +## 📑 Table of Contents + +- [Introduction](#introduction) +- [Philosophy & Challenges](#philosophy--challenges) +- [System Constraints & Features](#system-constraints--features) +- [System Overview](#system-overview) +- [System Flow](#system-flow) +- [Core Architecture](#core-architecture) +- [Interaction Flow](#interaction-flow) +- [Development Guidelines](#development-guidelines) +- [Code Examples & Best Practices](#code-examples--best-practices) + +## 🎯 Introduction + +The Tools Integration service demonstrates how to extend OpenAI agents with powerful external capabilities through specialized tools. This service showcases integration with code interpretation, file search, image generation, and web search tools, all implemented within Temporal's reliable workflow framework. 
+ +The system is designed for developers and engineering teams who want to: +- Learn how to integrate external tools with OpenAI agents in Temporal workflows +- Implement code execution capabilities for mathematical and data analysis tasks +- Build document search systems using vector stores and similarity search +- Create image generation workflows with DALL-E integration +- Implement web search capabilities with location-aware context +- Understand tool configuration and resource management patterns + +## 🧠 Philosophy & Challenges + +### What We're Solving +- **Tool Integration Complexity**: Managing external tool dependencies and configurations +- **Resource Management**: Handling file uploads, vector stores, and API integrations +- **Tool Configuration**: Properly configuring tools with specific parameters and resources +- **Result Processing**: Extracting and processing different types of tool outputs +- **Knowledge Base Setup**: Creating and maintaining vector stores for document search +- **Cross-Platform Compatibility**: Handling tool outputs across different operating systems + +### Our Approach +- **Tool-First Design**: Integrate tools as first-class citizens in agent workflows +- **Resource Abstraction**: Abstract complex tool configurations behind simple interfaces +- **Result Standardization**: Normalize different tool outputs into consistent formats +- **Setup Automation**: Provide automated setup scripts for complex tool dependencies +- **Temporal Integration**: Leverage workflow durability for reliable tool execution +- **Error Handling**: Implement robust error handling for external tool failures + +## ⚡ System Constraints & Features + +### Key Features +- **Code Interpreter Tool**: Execute Python code for calculations and data analysis +- **File Search Tool**: Vector-based document search using OpenAI's file search capabilities +- **Image Generation Tool**: Create images using DALL-E with configurable quality settings +- **Web Search Tool**: 
Search the web with location-aware context and user preferences +- **Knowledge Base Setup**: Automated creation of vector stores with sample content +- **Cross-Platform Support**: Handle tool outputs across macOS, Windows, and Linux + +### System Constraints +- **OpenAI API Dependencies**: Requires valid OpenAI API key and proper tool access +- **File Upload Limits**: OpenAI API has specific file size and format requirements +- **Tool Resource Limits**: Vector stores and file search have usage quotas +- **Image Quality Trade-offs**: Lower quality settings for faster generation +- **Web Search Limitations**: Location-based results may vary by region +- **Task Queue**: Uses `"openai-agents-tools-task-queue"` for all workflows + +## 🏗️ System Overview + +```mermaid +graph TB + A[Client Request] --> B[Temporal Workflow] + B --> C[Tool-Enabled Agent] + C --> D[Tool Selection] + + D --> E[Code Interpreter] + D --> F[File Search] + D --> G[Image Generation] + D --> H[Web Search] + + E --> I[Python Execution] + F --> J[Vector Store Query] + G --> K[DALL-E API] + H --> L[Web Search API] + + I --> M[Result Processing] + J --> M + K --> M + L --> M + + M --> N[Formatted Output] + N --> B + B --> O[Client Response] + + P[OpenAI API] --> E + P --> F + P --> G + P --> H + + Q[Knowledge Base] --> F + R[Setup Scripts] --> Q +``` + +## 🔄 System Flow + +```mermaid +sequenceDiagram + participant C as Client + participant T as Temporal Workflow + participant A as Tool-Enabled Agent + participant O as OpenAI API + participant V as Vector Store + participant W as Web Search + + C->>T: Start Tool Workflow + T->>A: Initialize Agent with Tools + + alt Code Interpreter + A->>O: Execute Python Code + O->>A: Return Execution Result + else File Search + A->>V: Query Vector Store + V->>A: Return Similar Documents + else Image Generation + A->>O: Generate Image with DALL-E + O->>A: Return Image Data + else Web Search + A->>W: Search with Location Context + W->>A: Return Search Results + end + + 
A->>T: Process Tool Results + T->>C: Return Formatted Output +``` + +## 🏛️ Core Architecture + +### Component Layers +1. **Workflow Layer**: Temporal workflows for tool orchestration with `@workflow.defn` +2. **Tool Layer**: Specialized tool implementations (CodeInterpreter, FileSearch, etc.) +3. **Agent Layer**: Tool-enabled agents with specific instructions and tool configurations +4. **Resource Layer**: Vector stores, file uploads, and external API integrations +5. **Setup Layer**: Automated knowledge base creation and tool configuration + +### Key Components +- **Tool Workflows**: Implement specific tool capabilities with Temporal integration +- **Tool-Enabled Agents**: Agents configured with specific tools and resources +- **Resource Management**: Handle file uploads, vector stores, and API configurations +- **Result Processing**: Extract and format different types of tool outputs +- **Setup Automation**: Scripts for creating knowledge bases and configuring tools +- **Cross-Platform Handling**: Platform-specific file operations and tool outputs + +## 🔗 Interaction Flow + +### Internal Communication +- Workflows orchestrate tool execution using `Runner.run()` with tool-enabled agents +- Agents communicate with tools through OpenAI's tool integration system +- Results are processed through standardized output formats and error handling +- File operations and vector store queries are managed through OpenAI's APIs + +### External Dependencies +- **OpenAI API**: For tool execution and resource management +- **Vector Stores**: For document search and similarity matching +- **DALL-E API**: For image generation capabilities +- **Web Search APIs**: For current information retrieval +- **File Systems**: For temporary file management and cross-platform operations + +## 💻 Development Guidelines + +### Code Organization +- **Tool Workflows**: One file per tool type in `workflows/` directory +- **Runner Scripts**: Individual execution scripts for each tool in root directory 
+- **Worker**: Central worker supporting all tools in `run_worker.py` +- **Setup Scripts**: Knowledge base creation and tool configuration in root directory + +### Design Patterns +- **Tool Integration Pattern**: Agents configured with specific tools and resources +- **Resource Management Pattern**: Automated setup and configuration of external resources +- **Result Processing Pattern**: Standardized handling of different tool outputs +- **Cross-Platform Pattern**: Platform-specific handling of file operations and outputs + +### Error Handling +- **Tool Failures**: Handle cases where external tools fail or timeout +- **Resource Limits**: Manage API quotas and file upload restrictions +- **Configuration Errors**: Handle missing or invalid tool configurations +- **Platform Differences**: Gracefully handle cross-platform compatibility issues +- **Setup Failures**: Provide clear error messages for knowledge base setup issues + +## 📝 Code Examples & Best Practices + +### Code Interpreter Tool Pattern +**File**: `openai_agents/tools/workflows/code_interpreter_workflow.py` + +This pattern enables agents to execute Python code for mathematical calculations, data analysis, and computational tasks. 
+ +```python +from __future__ import annotations + +from agents import Agent, CodeInterpreterTool, Runner +from temporalio import workflow + +@workflow.defn +class CodeInterpreterWorkflow: + @workflow.run + async def run(self, question: str) -> str: + # Create agent with code interpreter tool + agent = Agent( + name="Code interpreter", + instructions="You love doing math.", # Specialized instructions for mathematical tasks + tools=[ + CodeInterpreterTool( + tool_config={ + "type": "code_interpreter", # Specify tool type + "container": {"type": "auto"}, # Use automatic container configuration + }, + ) + ], + ) + + # Execute the agent with the mathematical question + result = await Runner.run(agent, question) + return result.final_output +``` + +**Key Benefits**: +- **Mathematical Capabilities**: Execute complex calculations and data analysis +- **Code Execution**: Run Python code in isolated containers +- **Automatic Configuration**: Use OpenAI's optimized container settings +- **Result Integration**: Seamlessly integrate code execution results into agent responses + +### File Search Tool Pattern +**File**: `openai_agents/tools/workflows/file_search_workflow.py` + +This pattern enables agents to search through uploaded documents using vector similarity and semantic understanding. 
+ +```python +from __future__ import annotations + +from agents import Agent, FileSearchTool, Runner +from temporalio import workflow + +@workflow.defn +class FileSearchWorkflow: + @workflow.run + async def run(self, question: str, vector_store_id: str) -> str: + # Create agent with file search tool + agent = Agent( + name="File searcher", + instructions="You are a helpful agent.", + tools=[ + FileSearchTool( + max_num_results=3, # Limit results for focused responses + vector_store_ids=[vector_store_id], # Specify which knowledge base to search + include_search_results=True, # Include raw search results in context + ) + ], + ) + + # Execute search with the user's question + result = await Runner.run(agent, question) + return result.final_output +``` + +**Key Benefits**: +- **Semantic Search**: Find relevant documents using vector similarity +- **Knowledge Base Integration**: Search through custom document collections +- **Result Limiting**: Control the number of search results for focused responses +- **Context Preservation**: Include search results in agent context for better responses + +### Image Generation Tool Pattern +**File**: `openai_agents/tools/workflows/image_generator_workflow.py` + +This pattern enables agents to generate images using DALL-E with configurable quality settings and result processing. 
+ +```python +from __future__ import annotations + +from dataclasses import dataclass +from typing import Optional + +from agents import Agent, ImageGenerationTool, Runner +from temporalio import workflow + +# Structured output for image generation results +@dataclass +class ImageGenerationResult: + final_output: str # Agent's text response + image_data: Optional[str] = None # Base64-encoded image data if generated + +@workflow.defn +class ImageGeneratorWorkflow: + @workflow.run + async def run(self, prompt: str) -> ImageGenerationResult: + # Create agent with image generation tool + agent = Agent( + name="Image generator", + instructions="You are a helpful agent.", + tools=[ + ImageGenerationTool( + tool_config={ + "type": "image_generation", # Specify tool type + "quality": "low", # Use lower quality for faster generation + }, + ) + ], + ) + + # Generate image based on the prompt + result = await Runner.run(agent, prompt) + + # Extract image data from tool results + image_data = None + for item in result.new_items: + if ( + item.type == "tool_call_item" + and item.raw_item.type == "image_generation_call" + and (img_result := item.raw_item.result) + ): + image_data = img_result + break + + # Return structured result with both text and image data + return ImageGenerationResult( + final_output=result.final_output, + image_data=image_data + ) +``` + +**Key Benefits**: +- **Image Generation**: Create custom images using DALL-E integration +- **Quality Control**: Configure image quality vs. generation speed trade-offs +- **Result Extraction**: Access both text responses and generated image data +- **Structured Output**: Return a `@dataclass` (`ImageGenerationResult`) so callers get typed fields for the text and the image data + +### Web Search Tool Pattern +**File**: `openai_agents/tools/workflows/web_search_workflow.py` + +This pattern enables agents to search the web for current information with location-aware context. 
+ +```python +from __future__ import annotations + +from agents import Agent, Runner, WebSearchTool +from temporalio import workflow + +@workflow.defn +class WebSearchWorkflow: + @workflow.run + async def run(self, question: str, user_city: str = "New York") -> str: + # Create agent with web search tool + agent = Agent( + name="Web searcher", + instructions="You are a helpful agent.", + tools=[ + WebSearchTool( + user_location={ + "type": "approximate", # Use approximate location for privacy + "city": user_city # Specify user's city for localized results + } + ) + ], + ) + + # Execute web search with location context + result = await Runner.run(agent, question) + return result.final_output +``` + +**Key Benefits**: +- **Current Information**: Access up-to-date information from the web +- **Location Context**: Provide location-aware search results +- **Privacy Protection**: Use approximate location instead of exact coordinates +- **Real-time Data**: Get current news, weather, and local information + +### Worker Configuration +**File**: `openai_agents/tools/run_worker.py` + +This is the central worker that supports all tool workflows, providing a single execution environment for the entire system. 
+ +```python +from __future__ import annotations + +import asyncio +from datetime import timedelta + +from temporalio.client import Client +from temporalio.contrib.openai_agents import ModelActivityParameters, OpenAIAgentsPlugin +from temporalio.worker import Worker + +# Import all tool workflow classes for registration +from openai_agents.tools.workflows.code_interpreter_workflow import CodeInterpreterWorkflow +from openai_agents.tools.workflows.file_search_workflow import FileSearchWorkflow +from openai_agents.tools.workflows.image_generator_workflow import ImageGeneratorWorkflow +from openai_agents.tools.workflows.web_search_workflow import WebSearchWorkflow + +async def main(): + # Create client connected to Temporal server + client = await Client.connect( + "localhost:7233", + plugins=[ + OpenAIAgentsPlugin( + model_params=ModelActivityParameters( + start_to_close_timeout=timedelta(seconds=60) # Extended timeout for tool operations + ) + ), + ], + ) + + # Create worker that supports all tool workflows + worker = Worker( + client, + task_queue="openai-agents-tools-task-queue", # Dedicated task queue for tools + workflows=[ + # Register all tool workflow classes for execution + CodeInterpreterWorkflow, + FileSearchWorkflow, + ImageGeneratorWorkflow, + WebSearchWorkflow, + ], + activities=[ + # No custom activities needed for these workflows + ], + ) + await worker.run() + +if __name__ == "__main__": + asyncio.run(main()) +``` + +**Key Benefits**: +- **Centralized Execution**: Single worker for all tool workflows +- **Extended Timeouts**: Longer timeouts for complex tool operations +- **Dedicated Task Queue**: Separate queue for tool-specific workflows +- **Easy Deployment**: Single process to manage and monitor all tools + +### Runner Script Pattern +**File**: `openai_agents/tools/run_code_interpreter_workflow.py` (example) + +Runner scripts provide individual execution of specific tools for testing and demonstration purposes. 
+ +```python +import asyncio + +from temporalio.client import Client +from temporalio.contrib.openai_agents import OpenAIAgentsPlugin + +from openai_agents.tools.workflows.code_interpreter_workflow import CodeInterpreterWorkflow + +async def main(): + # Create client connected to Temporal server + client = await Client.connect( + "localhost:7233", + plugins=[OpenAIAgentsPlugin()], + ) + + # Execute specific tool workflow with test input + result = await client.execute_workflow( + CodeInterpreterWorkflow.run, # Workflow to execute + "What is the square root of 273 * 312821 plus 1782?", # Mathematical question + id="code-interpreter-workflow", # Unique workflow ID + task_queue="openai-agents-tools-task-queue", # Task queue for execution + ) + print(f"Result: {result}") + +if __name__ == "__main__": + asyncio.run(main()) +``` + +**Key Benefits**: +- **Individual Testing**: Test specific tools in isolation +- **Easy Demonstration**: Simple execution for demos and presentations +- **Development Workflow**: Quick iteration during development +- **Clear Examples**: Shows exactly how to execute each tool + +### Knowledge Base Setup Pattern +**File**: `openai_agents/tools/setup_knowledge_base.py` + +This script demonstrates how to create and configure vector stores for file search capabilities. + +```python +#!/usr/bin/env python3 +""" +Setup script to create vector store with sample documents for testing FileSearchWorkflow. +Creates documents about Arrakis/Dune and uploads them to OpenAI for file search testing. +""" + +import asyncio +import os +import tempfile +from contextlib import asynccontextmanager +from pathlib import Path +from typing import Dict, List + +from openai import AsyncOpenAI + +# Sample knowledge base content for testing +KNOWLEDGE_BASE = { + "arrakis_overview": """ + Arrakis: The Desert Planet + + Arrakis, also known as Dune, is the third planet of the Canopus system. 
+ This harsh desert world is the sole source of the spice melange, the most + valuable substance in the known universe. + """, + # ... additional content sections +} + +@asynccontextmanager +async def temporary_files(content_dict: Dict[str, str]): + """Create temporary files from content dictionary.""" + temp_files = [] + try: + for name, content in content_dict.items(): + with tempfile.NamedTemporaryFile( + mode="w", suffix=".txt", delete=False, prefix=f"arrakis_{name}_" + ) as tmp: + tmp.write(content) + temp_files.append(tmp.name) + yield temp_files + finally: + # Clean up temporary files + for file_path in temp_files: + try: + os.unlink(file_path) + except Exception: + pass + +async def upload_files_to_openai(file_paths: List[str]) -> List[str]: + """Upload files to OpenAI and return file IDs.""" + client = AsyncOpenAI() + file_ids = [] + + for file_path in file_paths: + try: + with open(file_path, "rb") as file: + response = await client.files.create( + file=file, + purpose="assistants" # Specify purpose for vector store usage + ) + file_ids.append(response.id) + print(f"Uploaded {file_path}: {response.id}") + except Exception as e: + print(f"Error uploading {file_path}: {e}") + + return file_ids + +async def create_vector_store_with_assistant(file_ids: List[str]) -> str: + """Create vector store via OpenAI assistant creation.""" + client = AsyncOpenAI() + + try: + # Create assistant with file search capabilities + assistant = await client.beta.assistants.create( + name="Arrakis Knowledge Assistant", + instructions="You are an expert on the Arrakis/Dune universe.", + model="gpt-4o", + tools=[{"type": "file_search"}], + tool_resources={ + "file_search": { + "vector_stores": [ + { + "file_ids": file_ids, + "metadata": {"name": "Arrakis Knowledge Base"}, + } + ] + } + }, + ) + + # Extract vector store ID from assistant + if assistant.tool_resources and assistant.tool_resources.file_search: + vector_store_ids = assistant.tool_resources.file_search.vector_store_ids 
+ if vector_store_ids: + return vector_store_ids[0] + + raise Exception("No vector store ID found in assistant response") + + except Exception as e: + print(f"Error creating assistant: {e}") + raise + +def update_workflow_files(vector_store_id: str): + """Update workflow files with the new vector store ID.""" + import re + + files_to_update = ["run_file_search_workflow.py"] + + # Pattern to match any vector store ID with the specific comment + pattern = r'(vs_[a-f0-9]+)",\s*#\s*Vector store with Arrakis knowledge' + replacement = f'{vector_store_id}", # Vector store with Arrakis knowledge' + + for filename in files_to_update: + file_path = Path(__file__).parent / filename + if file_path.exists(): + try: + content = file_path.read_text() + if re.search(pattern, content): + updated_content = re.sub(pattern, replacement, content) + file_path.write_text(updated_content) + print(f"Updated {filename} with vector store ID") + else: + print(f"No matching pattern found in {filename}") + except Exception as e: + print(f"Error updating {filename}: {e}") + +async def main(): + """Main function to set up the knowledge base.""" + # Check for API key + if not os.getenv("OPENAI_API_KEY"): + print("Error: OPENAI_API_KEY environment variable not set") + print("Please set your OpenAI API key:") + print("export OPENAI_API_KEY='your-api-key-here'") + return + + print("Setting up Arrakis knowledge base...") + + try: + # Create temporary files and upload them + async with temporary_files(KNOWLEDGE_BASE) as temp_files: + print(f"Created {len(temp_files)} temporary files") + + file_ids = await upload_files_to_openai(temp_files) + + if not file_ids: + print("Error: No files were successfully uploaded") + return + + print(f"Successfully uploaded {len(file_ids)} files") + + # Create vector store via assistant + vector_store_id = await create_vector_store_with_assistant(file_ids) + + print(f"Created vector store: {vector_store_id}") + + # Update workflow files automatically + 
update_workflow_files(vector_store_id) + + print() + print("=" * 60) + print("KNOWLEDGE BASE SETUP COMPLETE") + print("=" * 60) + print(f"Vector Store ID: {vector_store_id}") + print(f"Files indexed: {len(file_ids)}") + print("Content: Arrakis/Dune universe knowledge") + print("=" * 60) + + except Exception as e: + print(f"Setup failed: {e}") + +if __name__ == "__main__": + asyncio.run(main()) +``` + +**Key Benefits**: +- **Automated Setup**: Create knowledge bases with minimal manual intervention +- **Content Management**: Organize and structure knowledge base content +- **File Upload**: Handle OpenAI file uploads with proper cleanup +- **Vector Store Creation**: Automatically create and configure vector stores +- **Workflow Integration**: Update workflow files with new vector store IDs + +### Cross-Platform File Handling Pattern +**File**: `openai_agents/tools/run_image_generator_workflow.py` (example) + +This pattern demonstrates how to handle platform-specific file operations and tool outputs. 
+ +```python +import asyncio +import base64 +import os +import subprocess +import sys +import tempfile + +from temporalio.client import Client +from temporalio.contrib.openai_agents import OpenAIAgentsPlugin + +from openai_agents.tools.workflows.image_generator_workflow import ImageGeneratorWorkflow + +def open_file(path: str) -> None: + """Open file using platform-specific commands.""" + if sys.platform.startswith("darwin"): + subprocess.run(["open", path], check=False) # macOS + elif os.name == "nt": # Windows + os.startfile(path) # type: ignore + elif os.name == "posix": + subprocess.run(["xdg-open", path], check=False) # Linux/Unix + else: + print(f"Don't know how to open files on this platform: {sys.platform}") + +async def main(): + # Create client connected to Temporal server + client = await Client.connect( + "localhost:7233", + plugins=[OpenAIAgentsPlugin()], + ) + + # Execute image generation workflow + result = await client.execute_workflow( + ImageGeneratorWorkflow.run, + "Create an image of a frog eating a pizza, comic book style.", + id="image-generator-workflow", + task_queue="openai-agents-tools-task-queue", + ) + + print(f"Text result: {result.final_output}") + + if result.image_data: + # Save and open the generated image + with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp: + tmp.write(base64.b64decode(result.image_data)) + temp_path = tmp.name + + print(f"Image saved to: {temp_path}") + # Open the image using platform-specific commands + open_file(temp_path) + else: + print("No image data found in result") + +if __name__ == "__main__": + asyncio.run(main()) +``` + +**Key Benefits**: +- **Cross-Platform Support**: Handle file operations across different operating systems +- **Automatic File Opening**: Use platform-appropriate commands to open generated files +- **Temporary File Management**: Write the decoded image to a temporary file; it is created with `delete=False`, so it persists until removed manually +- **Base64 Decoding**: Convert tool outputs to usable file formats + +## 🎯 Key Benefits
of This Structure + +1. **Tool Integration**: Demonstrates comprehensive external tool integration with OpenAI agents +2. **Resource Management**: Shows how to manage complex tool dependencies and configurations +3. **Result Processing**: Implements standardized handling of different tool output types +4. **Setup Automation**: Provides automated setup scripts for complex tool configurations +5. **Cross-Platform Support**: Handles tool outputs across different operating systems +6. **Knowledge Base Management**: Demonstrates vector store creation and document indexing +7. **Quality Control**: Configurable tool parameters for performance and quality trade-offs +8. **Error Handling**: Robust error handling for external tool failures and resource limits + +## ⚠️ Important Implementation Notes + +### Task Queue Configuration +- **Worker**: Uses task queue `"openai-agents-tools-task-queue"` +- **All Runner Scripts**: Use the same task queue for consistency +- **Note**: Dedicated task queue for tool-specific workflows + +### Tool Dependencies and Setup +- **OpenAI API Key**: Required for all tool operations +- **Knowledge Base Setup**: Run `setup_knowledge_base.py` before testing file search +- **File Upload Limits**: OpenAI API has specific file size and format requirements +- **Tool Access**: Ensure your OpenAI account has access to the required tools + +### Specific Examples Implemented +- **Code Interpreter**: Mathematical calculations and Python code execution +- **File Search**: Vector-based document search with Arrakis knowledge base +- **Image Generation**: DALL-E integration with configurable quality settings +- **Web Search**: Location-aware web search with user context +- **Knowledge Base**: Automated setup with sample Dune/Arrakis content + +### Architecture Patterns +- **Tool-First Design**: Tools are primary components, not afterthoughts +- **Resource Management**: Automated setup and configuration of external resources +- **Result Standardization**: 
Consistent handling of different tool output types +- **Cross-Platform Support**: Platform-specific handling of file operations +- **Setup Automation**: Scripts for creating and configuring complex tool dependencies + +### File Organization +``` +openai_agents/tools/ +├── workflows/ # Core tool implementations +│ ├── code_interpreter_workflow.py # Python code execution +│ ├── file_search_workflow.py # Vector-based document search +│ ├── image_generator_workflow.py # DALL-E image generation +│ └── web_search_workflow.py # Web search with location context +├── run_worker.py # Central worker for all tools +├── run_*.py # Individual tool runners +├── setup_knowledge_base.py # Knowledge base creation script +└── README.md # Tool overview and usage +``` + +### Common Development Patterns +- **Always check API keys** before running tool workflows +- **Use setup scripts** for complex tool configurations +- **Handle tool outputs** with proper error checking +- **Implement cross-platform** file operations for tool results +- **Configure tool parameters** for performance vs. quality trade-offs +- **Manage external resources** with proper cleanup and error handling + +This structure ensures developers can understand: +- **Tool integration patterns** with OpenAI agents in Temporal workflows +- **Resource management** for complex external dependencies +- **Result processing** for different types of tool outputs +- **Setup automation** for knowledge bases and tool configurations +- **Cross-platform compatibility** for tool outputs and file operations +- **Quality and performance** trade-offs in tool configuration + +The tools serve as building blocks for extending agent capabilities while maintaining the reliability, observability, and error handling that Temporal provides. Each tool demonstrates specific integration patterns that can be adapted for custom tool development and integration. 
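Two of the development patterns listed above, checking the API key up front and cleaning up external resources, can be sketched with the standard library alone. This is a minimal sketch, not code from the samples: the helper names `require_api_key` and `decoded_image_file` are hypothetical, and it assumes the image payload is a base64 string like the one carried in `ImageGenerationResult.image_data`.

```python
import base64
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator, Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the OpenAI API key or raise a clear error (hypothetical helper)."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY environment variable not set")
    return key


@contextmanager
def decoded_image_file(image_data: str, suffix: str = ".png") -> Iterator[str]:
    """Write base64 image data to a temp file and delete it on exit (hypothetical helper)."""
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(base64.b64decode(image_data))
        path = tmp.name
    try:
        yield path
    finally:
        # Remove the file even if the caller raises
        try:
            os.unlink(path)
        except FileNotFoundError:
            pass
```

A runner script could call `require_api_key()` before connecting to Temporal and wrap any viewing or copying of the image in `with decoded_image_file(result.image_data) as path:`. The trade-off is that the file no longer exists once the block exits, which suits scripts that copy or upload the image rather than leave it open in a viewer.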
From f48c7ca1ef7098036cb16504361b50b9fee7e216 Mon Sep 17 00:00:00 2001 From: jdnichollsc Date: Mon, 25 Aug 2025 00:49:51 -0500 Subject: [PATCH 2/4] add llms.txt file for AI assistants --- docs/openai_agents/ARCHITECTURE.md | 338 +++++++++++++++ docs/openai_agents/README.md | 28 +- docs/openai_agents/llms.txt | 635 +++++++++++++++++++++++++++++ 3 files changed, 998 insertions(+), 3 deletions(-) create mode 100644 docs/openai_agents/ARCHITECTURE.md create mode 100644 docs/openai_agents/llms.txt diff --git a/docs/openai_agents/ARCHITECTURE.md b/docs/openai_agents/ARCHITECTURE.md new file mode 100644 index 000000000..784a83579 --- /dev/null +++ b/docs/openai_agents/ARCHITECTURE.md @@ -0,0 +1,338 @@ +# OpenAI Agents SDK + Temporal Architecture Deep Dive + +## 🏗️ **Integration Architecture** + +This document provides a technical deep dive into how the OpenAI Agents SDK integrates with Temporal's durable execution engine, focusing on the architectural patterns and implementation details. + +## 🔄 **Core Integration Mechanism** + +### **Implicit Activity Creation** +The integration's key innovation is automatic Temporal Activity creation for every agent invocation: + +```python +# What you write +result = await Runner.run(agent, input="Hello") + +# What happens under the hood +# 1. Temporal creates an Activity for this agent execution +# 2. The Activity handles the OpenAI API call +# 3. Results are automatically persisted +# 4. 
Workflow state is checkpointed +``` + +### **Runner Abstraction** +The `Runner` class serves as the bridge between the OpenAI Agents SDK and Temporal: + +```python +from agents import Runner + +# Standard usage (creates implicit activities) +result = await Runner.run(agent, input="...") + +# With custom configuration +result = await Runner.run( + agent, + input="...", + # Cap the number of agent loop turns; timeouts are not a + # Runner.run() argument and are set on the plugin via + # ModelActivityParameters(start_to_close_timeout=...) + max_turns=100, +) +``` + +## 🏛️ **System Architecture** + +```mermaid +graph TB + subgraph "Client Layer" + C[Client Application] + end + + subgraph "Temporal Layer" + W[Workflow] + A[Implicit Activities] + S[Temporal Server] + end + + subgraph "OpenAI Layer" + AG[Agent Definition] + LLM[LLM API] + T[Tools] + end + + subgraph "External Services" + API[External APIs] + DB[Databases] + FS[File Systems] + end + + C --> W + W --> A + A --> AG + AG --> LLM + AG --> T + T --> API + T --> DB + T --> FS + + A --> S + S --> A +``` + +## 🔄 **Execution Flow** + +```mermaid +sequenceDiagram + participant C as Client + participant W as Workflow + participant A as Activity + participant AG as Agent + participant O as OpenAI API + participant T as Tools + + C->>W: Start Workflow + W->>A: Create Activity (implicit) + A->>AG: Initialize Agent + AG->>O: Generate Response + O->>AG: AI Response + + alt Tool Usage Required + AG->>T: Execute Tool + T->>A: Tool Result + A->>AG: Tool Response + AG->>O: Generate Final Response + O->>AG: Final Response + end + + AG->>A: Return Result + A->>W: Activity Complete + W->>C: Workflow Complete +``` + +## 🧩 **Component Architecture** + +### **Workflow Layer** +```python +@workflow.defn +class AgentWorkflow: + @workflow.run + async def run(self, input: str) -> str: + # Workflow orchestrates agent execution + # State is automatically persisted + # Failures trigger automatic retries + pass +``` + +### **Activity Layer (Implicit)** +```python +# Activities are created automatically by the integration +# Each Runner.run() call becomes a Temporal
Activity +# Benefits: +# - Automatic retries +# - Timeout handling +# - Error isolation +# - Resource management +``` + +### **Agent Layer** +```python +from agents import Agent + +agent = Agent( + name="MyAgent", + instructions="...", + model="gpt-4o", + tools=[...], + handoffs=[...] +) +``` + +## 🔧 **Configuration Patterns** + +### **Worker Configuration** +```python +from datetime import timedelta + +from temporalio.common import RetryPolicy +from temporalio.worker import Worker +from temporalio.contrib.openai_agents import OpenAIAgentsPlugin, ModelActivityParameters + +worker = Worker( + client, + task_queue="openai-agents-task-queue", + workflows=[...], + plugins=[ + OpenAIAgentsPlugin( + model_params=ModelActivityParameters( + start_to_close_timeout=timedelta(seconds=120), + retry_policy=RetryPolicy( + initial_interval=timedelta(seconds=1), + maximum_interval=timedelta(seconds=10), + maximum_attempts=3 + ) + ) + ) + ] +) +``` + +### **Agent Configuration** +```python +from agents import Agent, CodeInterpreterTool, ModelSettings, WebSearchTool + +agent = Agent( + name="ResearchAgent", + instructions="...", + model="gpt-4o", + model_settings=ModelSettings( + temperature=0.7, + max_tokens=1000, + tool_choice="auto" + ), + tools=[ + WebSearchTool(), + # CodeInterpreterTool requires a tool_config + CodeInterpreterTool( + tool_config={"type": "code_interpreter", "container": {"type": "auto"}} + ), + ] +) +``` + +## 📊 **Scaling Architecture** + +### **Horizontal Scaling** +```mermaid +graph LR + subgraph "Workflow Orchestrator" + W[Workflow] + end + + subgraph "Agent Workers" + A1[Agent Worker 1] + A2[Agent Worker 2] + A3[Agent Worker 3] + end + + subgraph "Temporal Server" + TS[Temporal Server] + end + + W --> TS + TS --> A1 + TS --> A2 + TS --> A3 +``` + +### **Independent Scaling** +- **Planner Agents**: Scale based on planning workload +- **Search Agents**: Scale based on search volume +- **Writer Agents**: Scale based on report generation needs +- **Tool Agents**: Scale based on external API usage + +## 🔍 **Observability Architecture** + +### **Integrated Tracing** +```python +from temporalio import workflow +from agents import Agent, Runner, custom_span, trace + +@workflow.defn +class ObservableWorkflow: +
@workflow.run + async def run(self, input: str) -> str: + with trace("Agent Workflow"): + with custom_span("Agent Execution"): + agent = Agent(...) + result = await Runner.run(agent, input) + return result.final_output +``` + +### **Dual Dashboard Access** +- **Temporal Dashboard**: Workflow execution, activity history, retries +- **OpenAI Dashboard**: Agent interactions, tool usage, token consumption + +## 🚨 **Error Handling Architecture** + +### **Automatic Retries** +```python +# Temporal automatically retries failed agent invocations +# Configurable retry policies per activity type +# Exponential backoff with jitter +# Maximum retry attempts per activity +``` + +### **Failure Isolation** +```python +# Agent failures don't crash the entire workflow +# Partial results can be salvaged +# Compensation logic for failed steps +# Graceful degradation strategies +``` + +### **Error Recovery Patterns** +```python +@workflow.defn +class ResilientWorkflow: + @workflow.run + async def run(self, input: str) -> str: + try: + result = await Runner.run(agent, input) + return result.final_output + except Exception as e: + # Fallback logic + return await self.fallback_agent(input) +``` + +## 🔐 **Security Architecture** + +### **API Key Management** +```python +# OpenAI API keys managed through environment variables +# Temporal server authentication and authorization +# Secure communication between components +# Audit logging for all agent interactions +``` + +### **Tool Access Control** +```python +# Tools can be restricted based on agent permissions +# External API access controlled through activities +# Input validation and sanitization +# Output filtering and validation +``` + +## 📈 **Performance Architecture** + +### **Optimization Strategies** +1. **Parallel Agent Execution**: Use `asyncio.gather()` for concurrent agents +2. **Connection Pooling**: Reuse OpenAI API connections +3. **Caching**: Cache agent responses and tool results +4. 
**Batch Processing**: Group similar agent operations + +### **Resource Management** +```python +# Automatic cleanup of agent resources +# Memory management for large conversations +# Timeout handling for long-running operations +# Resource limits per workflow execution +``` + +## 🔄 **State Management Architecture** + +### **Workflow State** +```python +@workflow.defn +class StatefulWorkflow: + def __init__(self): + self.conversation_history = [] + self.agent_context = {} + + @workflow.run + async def run(self, input: str) -> str: + # State automatically persisted between steps + self.conversation_history.append(input) + # ... agent execution + return result +``` + +### **Persistence Patterns** +- **Conversation History**: Maintain context across agent interactions +- **Agent State**: Preserve agent-specific information +- **Tool Results**: Cache and reuse tool outputs +- **Execution Metadata**: Track performance and usage metrics + +--- + +*This architecture document provides the technical foundation for understanding the integration. For implementation examples and specific use cases, refer to the individual service documentation.* diff --git a/docs/openai_agents/README.md b/docs/openai_agents/README.md index 1fe3c95af..ab35406c0 100644 --- a/docs/openai_agents/README.md +++ b/docs/openai_agents/README.md @@ -17,6 +17,23 @@ This combination ensures that AI agent workflows are: - **Scalable**: Handle complex multi-agent interactions and long-running conversations - **Reliable**: Built-in retry mechanisms and error handling +## 🚀 **Key Integration Benefits** + +### **Implicit Activity Creation** +Every agent invocation is automatically executed through a Temporal Activity, providing durability without code changes. The `Runner.run()` calls automatically create activities under the hood. + +### **Integrated Tracing** +Unified observability across both Temporal and OpenAI systems. 
View agent execution in Temporal's workflow history and OpenAI's tracing dashboard simultaneously. + +### **Horizontal Scaling** +Each agent runs in its own process or thread, enabling independent scaling. Add more capacity for specific agent types without affecting others. + +### **Production Readiness** +- **Crash-Proof Execution**: Automatic recovery from failures, restarts, and bugs +- **Rate Limit Handling**: Graceful handling of LLM API rate limits +- **Network Resilience**: Automatic retries for downstream API failures +- **State Persistence**: Workflow state automatically saved between steps + ## 🔄 **Core Integration Patterns** ### **Workflow-Orchestrated Agents** @@ -31,9 +48,11 @@ Seamless integration of OpenAI's built-in tools (web search, code interpreter, f ### **Multi-Agent Coordination** Complex workflows can coordinate multiple specialized agents, each with distinct roles and responsibilities. -## 📚 **Service Documentation** +## 📚 **Documentation Structure** -Each service demonstrates specific integration patterns and use cases: +### **Getting Started** +- **[llms.txt](./llms.txt)** - LLM-friendly summary for AI assistants and developers +- **[ARCHITECTURE.md](./ARCHITECTURE.md)** - Technical deep dive into integration patterns ### **Core Services** - **[Basic Examples](./BASIC.md)** - Fundamental agent patterns, lifecycle management, and tool integration @@ -101,11 +120,12 @@ worker = Worker( ) ``` -### **Agent Integration** +### **Agent Integration (Implicit Activities)** ```python from agents import Agent, Runner agent = Agent(name="MyAgent", instructions="...") +# This automatically creates a Temporal Activity under the hood result = await Runner.run(agent, input_text) ``` @@ -123,6 +143,8 @@ Each service documentation follows a consistent structure: - [Temporal Python SDK Documentation](https://docs.temporal.io/python) - [OpenAI Agents SDK Documentation](https://github.com/openai/openai-agents-python) - [Module 
Documentation](https://github.com/temporalio/sdk-python/blob/main/temporalio/contrib/openai_agents/README.md) +- [Temporal Blog: OpenAI Agents Integration](https://temporal.io/blog/announcing-openai-agents-sdk-integration) +- [Community Demos](https://github.com/temporal-community/openai-agents-demos) ## 🎯 **Use Cases** diff --git a/docs/openai_agents/llms.txt b/docs/openai_agents/llms.txt new file mode 100644 index 000000000..9af2c849a --- /dev/null +++ b/docs/openai_agents/llms.txt @@ -0,0 +1,635 @@ +# OpenAI Agents SDK + Temporal Integration + +## Overview +This integration combines OpenAI's Agents SDK with Temporal's durable execution engine to create production-ready AI agent workflows. The key innovation is that every agent invocation automatically becomes a Temporal Activity, providing durability, observability, and scalability without code changes. + +## Core Concept +When you call `Runner.run(agent, input)` in a Temporal workflow, it automatically: +1. Creates a Temporal Activity for the agent execution +2. Provides automatic retries, state persistence, and error handling +3. Enables horizontal scaling (each agent runs in its own process/thread) +4. 
Integrates tracing between Temporal and OpenAI systems + +--- + +# basic-usage: Basic Usage +URL: /docs/openai_agents/basic +Source: https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/BASIC.md + +## Basic Agent Workflow +```python +from temporalio import workflow +from agents import Agent, Runner + +@workflow.defn +class HelloWorldAgent: + @workflow.run + async def run(self, prompt: str) -> str: + agent = Agent( + name="Assistant", + instructions="You only respond in haikus.", + model="gpt-4o" + ) + + # This automatically creates a Temporal Activity + result = await Runner.run(agent, input=prompt) + return result.final_output +``` + +## Worker Setup +```python +from temporalio.worker import Worker +from temporalio.contrib.openai_agents import OpenAIAgentsPlugin + +worker = Worker( + client, + task_queue="openai-agents-task-queue", + workflows=[HelloWorldAgent], + plugins=[OpenAIAgentsPlugin()] +) +``` + +## Agent Lifecycle Management +```python +@workflow.defn +class AgentLifecycleWorkflow: + @workflow.run + async def run(self, input: str) -> str: + # Initialize agent with specific configuration + agent = Agent( + name="LifecycleAgent", + instructions="Process the input step by step", + model="gpt-4o", + model_settings=ModelSettings( + temperature=0.7, + max_tokens=1000 + ) + ) + + # Execute with lifecycle management + result = await Runner.run( + agent, + input=input, + run_config=RunConfig( + max_steps=50, + timeout=300 + ) + ) + + return result.final_output +``` + +--- + +# agent-patterns: Agent Patterns +URL: /docs/openai_agents/agent_patterns +Source: https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/AGENT_PATTERNS.md + +## Multi-Agent Orchestration +```python +@workflow.defn +class MultiAgentWorkflow: + @workflow.run + async def run(self, task: str) -> str: + # Planning agent + planner = Agent( + name="Planner", + instructions="Break down complex tasks into steps", + 
model="gpt-4o" + ) + + plan = await Runner.run(planner, f"Plan: {task}") + + # Execution agent + executor = Agent( + name="Executor", + instructions="Execute tasks based on plans", + model="gpt-4o" + ) + + result = await Runner.run(executor, f"Execute: {plan.final_output}") + return result.final_output +``` + +## Agent Routing with Handoffs +```python +@workflow.defn +class RoutingWorkflow: + @workflow.run + async def run(self, query: str) -> str: + # Triage agent decides routing + triage_agent = Agent( + name="Triage", + instructions="Route queries to appropriate specialists", + handoffs=[weather_agent, business_agent, tech_agent] + ) + + # Automatic handoff based on query type + result = await Runner.run(triage_agent, query) + return result.final_output +``` + +## Parallel Agent Execution +```python +@workflow.defn +class ParallelWorkflow: + @workflow.run + async def run(self, tasks: list[str]) -> list[str]: + # Create specialized agents + agents = [ + Agent(name=f"Agent{i}", instructions=f"Process {task}") + for i, task in enumerate(tasks) + ] + + # Execute agents in parallel + results = await asyncio.gather(*[ + Runner.run(agent, task) + for agent, task in zip(agents, tasks) + ]) + + return [r.final_output for r in results] +``` + +--- + +# tools-integration: Tools Integration +URL: /docs/openai_agents/tools +Source: https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/TOOLS.md + +## Built-in OpenAI Tools +```python +from agents import WebSearchTool, CodeInterpreterTool, FileSearchTool + +@workflow.defn +class ToolsWorkflow: + @workflow.run + async def run(self, query: str) -> str: + agent = Agent( + name="ResearchAgent", + instructions="Research and analyze using available tools", + tools=[ + WebSearchTool(), + CodeInterpreterTool(), + FileSearchTool() + ], + model="gpt-4o" + ) + + result = await Runner.run(agent, query) + return result.final_output +``` + +## Custom Tool Integration +```python +@workflow.defn +class 
CustomToolsWorkflow: + @workflow.run + async def run(self, request: str) -> str: + # Custom tool for database queries + db_tool = CustomTool( + name="database_query", + description="Query the database for information", + function=self.query_database + ) + + agent = Agent( + name="DataAgent", + instructions="Use database tools to answer questions", + tools=[db_tool], + model="gpt-4o" + ) + + result = await Runner.run(agent, request) + return result.final_output + + async def query_database(self, query: str) -> str: + # Database query implementation + return "Database result" +``` + +## Image Generation Tools +```python +@workflow.defn +class ImageGenerationWorkflow: + @workflow.run + async def run(self, prompt: str) -> str: + agent = Agent( + name="ImageAgent", + instructions="Generate images based on descriptions", + tools=[ImageGeneratorTool()], + model="gpt-4o" + ) + + result = await Runner.run(agent, prompt) + return result.final_output +``` + +--- + +# handoffs: Agent Handoffs +URL: /docs/openai_agents/handoffs +Source: https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/HANDOFFS.md + +## Message Filtering with Handoffs +```python +@workflow.defn +class MessageFilterWorkflow: + @workflow.run + async def run(self, message: str) -> str: + # Filter agent with handoff capabilities + filter_agent = Agent( + name="MessageFilter", + instructions="Filter and route messages appropriately", + handoffs=[ + Agent(name="SpamFilter", instructions="..."), + Agent(name="ContentModerator", instructions="..."), + Agent(name="Router", instructions="...") + ] + ) + + result = await Runner.run(filter_agent, message) + return result.final_output +``` + +## Collaborative Agent Workflows +```python +@workflow.defn +class CollaborationWorkflow: + @workflow.run + async def run(self, project: str) -> str: + # Project manager agent + manager = Agent( + name="ProjectManager", + instructions="Coordinate project execution", + handoffs=[ + 
Agent(name="Designer", instructions="..."), + Agent(name="Developer", instructions="..."), + Agent(name="Tester", instructions="...") + ] + ) + + result = await Runner.run(manager, project) + return result.final_output +``` + +--- + +# hosted-mcp: Hosted MCP Integration +URL: /docs/openai_agents/hosted_mcp +Source: https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/HOSTED_MCP.md + +## MCP Server Integration +```python +@workflow.defn +class MCPWorkflow: + @workflow.run + async def run(self, request: str) -> str: + # Agent with MCP tool access + mcp_agent = Agent( + name="MCPAgent", + instructions="Use MCP tools to access external systems", + tools=[MCPTool(server_url="http://localhost:3000")], + model="gpt-4o" + ) + + result = await Runner.run(mcp_agent, request) + return result.final_output +``` + +## Approval Workflow with MCP +```python +@workflow.defn +class ApprovalMCPWorkflow: + @workflow.run + async def run(self, request: str) -> str: + # Approval agent using MCP tools + approval_agent = Agent( + name="ApprovalAgent", + instructions="Process approval requests using MCP tools", + tools=[ + MCPTool(server_url="http://approval-system:3000"), + MCPTool(server_url="http://notification-system:3000") + ] + ) + + result = await Runner.run(approval_agent, request) + return result.final_output +``` + +--- + +# model-providers: Model Providers +URL: /docs/openai_agents/model_providers +Source: https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/MODEL_PROVIDERS.md + +## LiteLLM Integration +```python +@workflow.defn +class LiteLLMWorkflow: + @workflow.run + async def run(self, input: str) -> str: + agent = Agent( + name="LiteLLMAgent", + instructions="Process requests using LiteLLM", + model="gpt-4o", + model_settings=ModelSettings( + provider="litellm", + api_base="http://localhost:8000" + ) + ) + + result = await Runner.run(agent, input) + return result.final_output +``` + +## 
Ollama Integration +```python +@workflow.defn +class OllamaWorkflow: + @workflow.run + async def run(self, input: str) -> str: + agent = Agent( + name="OllamaAgent", + instructions="Use local Ollama models", + model="llama2", + model_settings=ModelSettings( + provider="ollama", + api_base="http://localhost:11434" + ) + ) + + result = await Runner.run(agent, input) + return result.final_output +``` + +## GPT-OSS Integration +```python +@workflow.defn +class GPTOSSWorkflow: + @workflow.run + async def run(self, input: str) -> str: + agent = Agent( + name="GPTOSSAgent", + instructions="Use open-source GPT models", + model="gpt2", + model_settings=ModelSettings( + provider="gpt-oss", + api_base="http://localhost:5000" + ) + ) + + result = await Runner.run(agent, input) + return result.final_output +``` + +--- + +# reasoning-content: Reasoning Content +URL: /docs/openai_agents/reasoning_content +Source: https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/REASONING_CONTENT.md + +## Accessing Model Reasoning +```python +@workflow.defn +class ReasoningWorkflow: + @workflow.run + async def run(self, problem: str) -> str: + agent = Agent( + name="ReasoningAgent", + instructions="Show your reasoning step by step", + model="gpt-4o", + model_settings=ModelSettings( + show_reasoning=True, + reasoning_format="step_by_step" + ) + ) + + result = await Runner.run(agent, problem) + return result.final_output +``` + +## Thought Process Extraction +```python +@workflow.defn +class ThoughtProcessWorkflow: + @workflow.run + async def run(self, question: str) -> str: + agent = Agent( + name="ThoughtAgent", + instructions="Explain your thinking process", + model="gpt-4o", + model_settings=ModelSettings( + extract_thoughts=True, + thought_format="detailed" + ) + ) + + result = await Runner.run(agent, question) + return result.final_output +``` + +--- + +# customer-service: Customer Service +URL: /docs/openai_agents/customer_service +Source: 
https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/CUSTOMER_SERVICE.md + +## Conversational Customer Service +```python +@workflow.defn +class CustomerServiceWorkflow: + def __init__(self): + self.conversation_history = [] + + @workflow.run + async def run(self, customer_query: str) -> str: + # Add to conversation history + self.conversation_history.append(customer_query) + + # Customer service agent + agent = Agent( + name="CustomerService", + instructions="Provide helpful customer support", + model="gpt-4o", + context=f"Conversation history: {self.conversation_history}" + ) + + result = await Runner.run(agent, customer_query) + + # Update history + self.conversation_history.append(result.final_output) + return result.final_output +``` + +## Escalation Workflow +```python +@workflow.defn +class EscalationWorkflow: + @workflow.run + async def run(self, issue: str) -> str: + # Initial support agent + support_agent = Agent( + name="SupportAgent", + instructions="Handle customer issues, escalate if needed", + handoffs=[ + Agent(name="SeniorSupport", instructions="..."), + Agent(name="TechnicalSpecialist", instructions="..."), + Agent(name="Manager", instructions="...") + ] + ) + + result = await Runner.run(support_agent, issue) + return result.final_output +``` + +--- + +# financial-research: Financial Research Agent +URL: /docs/openai_agents/financial_research_agent +Source: https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/FINANCIAL_RESEARCH_AGENT.md + +## Multi-Agent Financial Analysis +```python +@workflow.defn +class FinancialResearchWorkflow: + @workflow.run + async def run(self, research_request: str) -> str: + # Planning agent + planner = Agent( + name="FinancialPlanner", + instructions="Plan financial research approach", + model="gpt-4o" + ) + + plan = await Runner.run(planner, research_request) + + # Risk analysis agent + risk_agent = Agent( + name="RiskAnalyst", + 
instructions="Analyze financial risks", + model="gpt-4o" + ) + + risk_analysis = await Runner.run(risk_agent, plan.final_output) + + # Financial data agent + data_agent = Agent( + name="DataAnalyst", + instructions="Analyze financial data", + model="gpt-4o" + ) + + data_analysis = await Runner.run(data_agent, risk_analysis.final_output) + + # Synthesis agent + synthesizer = Agent( + name="Synthesizer", + instructions="Combine all analyses into final report", + model="gpt-4o" + ) + + final_report = await Runner.run( + synthesizer, + f"Plan: {plan.final_output}\nRisk: {risk_analysis.final_output}\nData: {data_analysis.final_output}" + ) + + return final_report.final_output +``` + +--- + +# research-bot: Research Bot +URL: /docs/openai_agents/research_bot +Source: https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/RESEARCH_BOT.md + +## Comprehensive Research Workflow +```python +@workflow.defn +class ResearchBotWorkflow: + @workflow.run + async def run(self, research_topic: str) -> str: + # Research manager coordinates the process + manager = Agent( + name="ResearchManager", + instructions="Coordinate comprehensive research", + model="gpt-4o" + ) + + # Planning phase + plan = await Runner.run(manager, f"Plan research for: {research_topic}") + + # Search agents for different aspects + search_agents = [ + Agent(name="AcademicSearch", instructions="Search academic sources"), + Agent(name="NewsSearch", instructions="Search recent news"), + Agent(name="TechnicalSearch", instructions="Search technical documentation") + ] + + # Parallel search execution + search_results = await asyncio.gather(*[ + Runner.run(agent, plan.final_output) + for agent in search_agents + ]) + + # Writing agent synthesizes results + writer = Agent( + name="ResearchWriter", + instructions="Write comprehensive research report", + model="gpt-4o" + ) + + synthesis_input = f"Topic: {research_topic}\nPlan: {plan.final_output}\nResults: {search_results}" + 
final_report = await Runner.run(writer, synthesis_input) + + return final_report.final_output +``` + +--- + +# Key Benefits +- **Durability**: Survives crashes, restarts, and failures +- **Scalability**: Independent scaling of different agent types +- **Observability**: Unified tracing across Temporal and OpenAI +- **Production Ready**: Automatic retries, rate limit handling, state persistence + +## Common Patterns +1. **Sequential Agents**: Run agents one after another +2. **Parallel Agents**: Use `asyncio.gather()` for concurrent execution +3. **Agent Handoffs**: Use `handoffs` parameter for agent-to-agent transitions +4. **State Management**: Leverage workflow state for conversation history +5. **Error Handling**: Temporal automatically retries failed agent invocations + +## File Structure +``` +openai_agents/ +├── basic/ # Fundamental patterns +├── agent_patterns/ # Multi-agent architectures +├── tools/ # Tool integration examples +├── handoffs/ # Agent collaboration +├── hosted_mcp/ # MCP integration +├── model_providers/ # Custom LLM providers +├── reasoning_content/ # Model reasoning access +├── customer_service/ # Conversational workflows +├── financial_research_agent/ # Complex multi-agent system +└── research_bot/ # Research workflow example +``` + +## Getting Started +1. Start Temporal server: `temporal server start-dev` +2. Install dependencies: `uv sync --group openai-agents` +3. Set OpenAI API key: `export OPENAI_API_KEY=your_key` +4. Run worker: `uv run openai_agents/basic/run_worker.py` +5. 
Execute workflow: `uv run openai_agents/basic/run_hello_world_workflow.py` + +## Best Practices +- Keep agents focused on single responsibilities +- Use descriptive agent names for debugging +- Leverage workflow state for conversation context +- Handle agent failures gracefully with Temporal's retry mechanisms +- Use appropriate timeouts for agent operations +- Monitor both Temporal and OpenAI dashboards for observability + +## Resources +- [Temporal Blog](https://temporal.io/blog/announcing-openai-agents-sdk-integration) +- [Python SDK](https://github.com/temporalio/sdk-python/tree/main/temporalio/contrib/openai_agents) +- [Community Demos](https://github.com/temporal-community/openai-agents-demos) From a1a50ba36804091f1e76c2351e3fd5e0b6efd5cd Mon Sep 17 00:00:00 2001 From: jdnichollsc Date: Mon, 25 Aug 2025 12:08:09 -0500 Subject: [PATCH 3/4] docs review --- docs/openai_agents/ARCHITECTURE.md | 180 +++++++++++++++++------------ 1 file changed, 107 insertions(+), 73 deletions(-) diff --git a/docs/openai_agents/ARCHITECTURE.md b/docs/openai_agents/ARCHITECTURE.md index 784a83579..bf17fece5 100644 --- a/docs/openai_agents/ARCHITECTURE.md +++ b/docs/openai_agents/ARCHITECTURE.md @@ -6,27 +6,28 @@ This document provides a technical deep dive into how the OpenAI Agents SDK inte ## 🔄 **Core Integration Mechanism** -### **Implicit Activity Creation** -The integration's key innovation is automatic Temporal Activity creation for every agent invocation: +### **Implicit Activity Creation for Model Invocations** +The integration's key innovation is automatic Temporal Activity creation for each **model invocation** within the agentic loop: ```python # What you write result = await Runner.run(agent, input="Hello") # What happens under the hood -# 1. Temporal creates an Activity for this agent execution -# 2. The Activity handles the OpenAI API call -# 3. Results are automatically persisted -# 4. Workflow state is checkpointed +# 1. 
Runner.run() executes inside the Temporal workflow +# 2. During the agentic loop, each model invocation becomes a separate Activity +# 3. Each Activity is stored in Temporal history +# 4. The agentic loop can be resumed from any point +# 5. Workflow state is automatically checkpointed between model invocations ``` ### **Runner Abstraction** -The `Runner` class serves as the bridge between OpenAI Agents SDK and Temporal: +The `Runner.run()` method executes **inside** the Temporal workflow, not as an Activity itself: ```python from agents import Runner -# Standard usage (creates implicit activities) +# Standard usage (executes inside workflow, creates implicit activities for model calls) result = await Runner.run(agent, input="...") # With custom configuration @@ -40,6 +41,12 @@ result = await Runner.run( ) ``` +### **Key Architectural Insight** +- **`Runner.run()`**: Executes inside the workflow, orchestrating the agentic loop +- **Model Invocations**: Each LLM call automatically becomes a Temporal Activity +- **Tool Executions**: Run in the workflow, with optional `activity_as_tool` helper +- **Resumability**: The agentic loop can resume from any model invocation point + ## 🏛️ **System Architecture** ```mermaid @@ -50,7 +57,8 @@ graph TB subgraph "Temporal Layer" W[Workflow] - A[Implicit Activities] + MA[Model Activities] + TA[Tool Activities] S[Temporal Server] end @@ -67,16 +75,20 @@ graph TB end C --> W - W --> A - A --> AG + W --> AG AG --> LLM AG --> T + + LLM --> MA + T --> TA T --> API T --> DB T --> FS - A --> S - S --> A + MA --> S + TA --> S + S --> MA + S --> TA ``` ## 🔄 **Execution Flow** @@ -85,27 +97,31 @@ graph TB sequenceDiagram participant C as Client participant W as Workflow - participant A as Activity participant AG as Agent - participant O as OpenAI API + participant M as Model Activity participant T as Tools + participant O as OpenAI API C->>W: Start Workflow - W->>A: Create Activity (implicit) - A->>AG: Initialize Agent - AG->>O: Generate 
Response - O->>AG: AI Response + W->>AG: Initialize Agent (inside workflow) - alt Tool Usage Required - AG->>T: Execute Tool - T->>A: Tool Result - A->>AG: Tool Response - AG->>O: Generate Final Response - O->>AG: Final Response + loop Agentic Loop + AG->>M: Model Invocation (creates Activity) + M->>O: LLM API Call + O->>M: AI Response + M->>AG: Model Result + + alt Tool Usage Required + AG->>T: Execute Tool (in workflow) + T->>O: Tool API Call + O->>T: Tool Result + T->>AG: Tool Response + end + + Note over AG: Checkpoint workflow state end - AG->>A: Return Result - A->>W: Activity Complete + AG->>W: Return Final Result W->>C: Workflow Complete ``` @@ -117,21 +133,22 @@ sequenceDiagram class AgentWorkflow: @workflow.run async def run(self, input: str) -> str: - # Workflow orchestrates agent execution - # State is automatically persisted - # Failures trigger automatic retries + # Workflow orchestrates the entire agentic loop + # Runner.run() executes inside this workflow + # State is automatically persisted between model invocations + # Failures can resume from any model invocation point pass ``` -### **Activity Layer (Implicit)** +### **Activity Layer (Implicit for Models)** ```python -# Activities are created automatically by the integration -# Each Runner.run() call becomes a Temporal Activity +# Model invocations automatically create Temporal Activities +# Each LLM call becomes a separate Activity stored in history # Benefits: -# - Automatic retries -# - Timeout handling -# - Error isolation -# - Resource management +# - Automatic retries for failed model calls +# - Timeout handling per model invocation +# - Error isolation between model calls +# - Resumability from any point in the agentic loop ``` ### **Agent Layer** @@ -199,10 +216,10 @@ graph LR W[Workflow] end - subgraph "Agent Workers" - A1[Agent Worker 1] - A2[Agent Worker 2] - A3[Agent Worker 3] + subgraph "Model Activity Workers" + M1[Model Worker 1] + M2[Model Worker 2] + M3[Model Worker 3] end 
subgraph "Temporal Server" @@ -210,16 +227,15 @@ graph LR end W --> TS - TS --> A1 - TS --> A2 - TS --> A3 + TS --> M1 + TS --> M2 + TS --> M3 ``` ### **Independent Scaling** -- **Planner Agents**: Scale based on planning workload -- **Search Agents**: Scale based on search volume -- **Writer Agents**: Scale based on report generation needs -- **Tool Agents**: Scale based on external API usage +- **Model Workers**: Scale based on LLM API call volume +- **Tool Workers**: Scale based on external API usage +- **Workflow Workers**: Scale based on orchestration needs ## 🔍 **Observability Architecture** @@ -240,25 +256,25 @@ class ObservableWorkflow: ``` ### **Dual Dashboard Access** -- **Temporal Dashboard**: Workflow execution, activity history, retries +- **Temporal Dashboard**: Workflow execution, model activity history, retries - **OpenAI Dashboard**: Agent interactions, tool usage, token consumption ## 🚨 **Error Handling Architecture** -### **Automatic Retries** +### **Automatic Retries for Model Calls** ```python -# Temporal automatically retries failed agent invocations -# Configurable retry policies per activity type -# Exponential backoff with jitter -# Maximum retry attempts per activity +# Temporal automatically retries failed model invocations +# Each model call is a separate Activity with its own retry policy +# Exponential backoff with jitter per model invocation +# Maximum retry attempts per model call ``` -### **Failure Isolation** +### **Failure Isolation and Resumability** ```python -# Agent failures don't crash the entire workflow -# Partial results can be salvaged -# Compensation logic for failed steps -# Graceful degradation strategies +# Model failures don't crash the entire agentic loop +# The workflow can resume from any successful model invocation +# Partial results are preserved in workflow state +# Compensation logic can be implemented for failed model calls ``` ### **Error Recovery Patterns** @@ -271,7 +287,8 @@ class ResilientWorkflow: 
result = await Runner.run(agent, input) return result.final_output except Exception as e: - # Fallback logic + # The workflow can resume from the last successful model invocation + # or implement fallback logic return await self.fallback_agent(input) ``` @@ -282,7 +299,7 @@ class ResilientWorkflow: # OpenAI API keys managed through environment variables # Temporal server authentication and authorization # Secure communication between components -# Audit logging for all agent interactions +# Audit logging for all model invocations and tool executions ``` ### **Tool Access Control** @@ -296,16 +313,16 @@ class ResilientWorkflow: ## 📈 **Performance Architecture** ### **Optimization Strategies** -1. **Parallel Agent Execution**: Use `asyncio.gather()` for concurrent agents -2. **Connection Pooling**: Reuse OpenAI API connections -3. **Caching**: Cache agent responses and tool results -4. **Batch Processing**: Group similar agent operations +1. **Parallel Model Invocations**: Use `asyncio.gather()` for concurrent model calls +2. **Connection Pooling**: Reuse OpenAI API connections across model activities +3. **Caching**: Cache model responses and tool results +4. **Batch Processing**: Group similar model operations ### **Resource Management** ```python -# Automatic cleanup of agent resources +# Automatic cleanup of model resources # Memory management for large conversations -# Timeout handling for long-running operations +# Timeout handling for long-running model calls # Resource limits per workflow execution ``` @@ -321,18 +338,35 @@ class StatefulWorkflow: @workflow.run async def run(self, input: str) -> str: - # State automatically persisted between steps + # State automatically persisted between model invocations self.conversation_history.append(input) - # ... agent execution + # ... 
agent execution with automatic checkpointing return result ``` ### **Persistence Patterns** -- **Conversation History**: Maintain context across agent interactions -- **Agent State**: Preserve agent-specific information +- **Conversation History**: Maintain context across model invocations +- **Agent State**: Preserve agent-specific information between model calls - **Tool Results**: Cache and reuse tool outputs -- **Execution Metadata**: Track performance and usage metrics +- **Execution Metadata**: Track performance and usage metrics per model invocation + +## 🎯 **Key Benefits of This Architecture** + +### **Resumability** +- Agentic loops can resume from any model invocation point +- No need to restart the entire agent execution +- Partial progress is preserved in workflow state + +### **Granular Durability** +- Each model call is a separate, durable Activity +- Fine-grained retry policies per model invocation +- Better error isolation and recovery + +### **Scalability** +- Model workers can scale independently +- Different types of model calls can use different worker pools +- Better resource utilization --- -*This architecture document provides the technical foundation for understanding the integration. For implementation examples and specific use cases, refer to the individual service documentation.* +*This architecture document provides the technical foundation for understanding the integration. The key insight is that `Runner.run()` executes inside the workflow, while model invocations automatically create Activities, enabling resumable agentic loops.* From a3ef4629bfab87d699f238a1485bba52632d88ec Mon Sep 17 00:00:00 2001 From: jdnichollsc Date: Sun, 7 Sep 2025 20:30:01 -0500 Subject: [PATCH 4/4] Enhance documentation for OpenAI Agents integration with Temporal. Added details on the requirement for `OpenAIAgentsPlugin` registration, clarified the architecture separating Runner and Agent execution, and updated task queue names in examples. 
Introduced new agent patterns and improved structured output integration using Pydantic models. --- docs/openai_agents/AGENT_PATTERNS.md | 11 + docs/openai_agents/ARCHITECTURE.md | 52 +- docs/openai_agents/BASIC.md | 8 +- docs/openai_agents/README.md | 35 +- docs/openai_agents/llms.txt | 878 +++++++++++---------------- 5 files changed, 438 insertions(+), 546 deletions(-) diff --git a/docs/openai_agents/AGENT_PATTERNS.md b/docs/openai_agents/AGENT_PATTERNS.md index f21e8eeb7..0e0bb97f4 100644 --- a/docs/openai_agents/AGENT_PATTERNS.md +++ b/docs/openai_agents/AGENT_PATTERNS.md @@ -60,6 +60,7 @@ The system is designed for developers and engineering teams who want to: - **State Persistence**: Automatic state management through Temporal - **Task Queue**: Uses `"openai-agents-patterns-task-queue"` for all workflows - **Pydantic Models**: Output validation requires structured Pydantic models +- **Plugin Requirement**: Requires `OpenAIAgentsPlugin` to be registered with the worker ## 🏗️ System Overview @@ -469,6 +470,12 @@ def orchestrator_agent() -> Agent: handoff_description="An english to french translator", ) + italian_agent = Agent( + name="italian_agent", + instructions="You translate the user's message to Italian", + handoff_description="An english to italian translator", + ) + # Main orchestrator agent that coordinates other agents as tools orchestrator_agent = Agent( name="orchestrator_agent", @@ -487,6 +494,10 @@ def orchestrator_agent() -> Agent: tool_name="translate_to_french", tool_description="Translate the user's message to French", ), + italian_agent.as_tool( + tool_name="translate_to_italian", + tool_description="Translate the user's message to Italian", + ), ], ) return orchestrator_agent diff --git a/docs/openai_agents/ARCHITECTURE.md b/docs/openai_agents/ARCHITECTURE.md index bf17fece5..cba0aaca9 100644 --- a/docs/openai_agents/ARCHITECTURE.md +++ b/docs/openai_agents/ARCHITECTURE.md @@ -7,13 +7,13 @@ This document provides a technical deep dive into 
how the OpenAI Agents SDK inte ## 🔄 **Core Integration Mechanism** ### **Implicit Activity Creation for Model Invocations** -The integration's key innovation is automatic Temporal Activity creation for each **model invocation** within the agentic loop: +The integration's key innovation is automatic Temporal Activity creation for each **model invocation** within the agentic loop. **Important**: This behavior only occurs when the `OpenAIAgentsPlugin` is registered with the Temporal worker. ```python # What you write result = await Runner.run(agent, input="Hello") -# What happens under the hood +# What happens under the hood (with OpenAIAgentsPlugin registered) # 1. Runner.run() executes inside the Temporal workflow # 2. During the agentic loop, each model invocation becomes a separate Activity # 3. Each Activity is stored in Temporal history @@ -55,36 +55,38 @@ graph TB C[Client Application] end - subgraph "Temporal Layer" - W[Workflow] - MA[Model Activities] - TA[Tool Activities] - S[Temporal Server] + subgraph "Temporal Workflow" + W[Workflow Execution] + R[Runner + Agent] + T[Tools Processing] end - subgraph "OpenAI Layer" - AG[Agent Definition] - LLM[LLM API] - T[Tools] + subgraph "Temporal Activities" + MA[Model Invocation Activities] + TA[Tool Activities] end subgraph "External Services" + LLM[LLM APIs] API[External APIs] DB[Databases] FS[File Systems] end - C --> W - W --> AG - AG --> LLM - AG --> T + subgraph "Temporal Server" + S[History & State] + end - LLM --> MA + C --> W + W --> R + R --> T + R --> MA T --> TA T --> API T --> DB T --> FS + MA --> LLM MA --> S TA --> S S --> MA @@ -97,31 +99,31 @@ graph TB sequenceDiagram participant C as Client participant W as Workflow - participant AG as Agent + participant R as Runner + Agent participant M as Model Activity participant T as Tools participant O as OpenAI API C->>W: Start Workflow - W->>AG: Initialize Agent (inside workflow) + W->>R: Initialize Runner + Agent (inside workflow) loop Agentic Loop - 
AG->>M: Model Invocation (creates Activity) + R->>M: Model Invocation (creates Activity) M->>O: LLM API Call O->>M: AI Response - M->>AG: Model Result + M->>R: Model Result alt Tool Usage Required - AG->>T: Execute Tool (in workflow) + R->>T: Execute Tool (in workflow) T->>O: Tool API Call O->>T: Tool Result - T->>AG: Tool Response + T->>R: Tool Response end - Note over AG: Checkpoint workflow state + Note over R: Checkpoint workflow state end - AG->>W: Return Final Result + R->>W: Return Final Result W->>C: Workflow Complete ``` @@ -190,6 +192,8 @@ worker = Worker( ) ``` +**Critical**: The `OpenAIAgentsPlugin` must be registered with the worker for model invocations to automatically create Temporal Activities. Without this plugin, the integration will not work as described. + ### **Agent Configuration** ```python from agents import Agent, ModelSettings diff --git a/docs/openai_agents/BASIC.md b/docs/openai_agents/BASIC.md index df5bca460..3be4c8ed9 100644 --- a/docs/openai_agents/BASIC.md +++ b/docs/openai_agents/BASIC.md @@ -367,7 +367,7 @@ class DynamicSystemPromptWorkflow: - **Flexible Behavior**: Single agent with multiple personality modes ### Usage Tracking with RunHooks -**File**: `openai_agents/basic/workflows/agent_lifecycle_workflow.py` +**File**: `openai_agents/basic/workflows/lifecycle_workflow.py` This pattern demonstrates comprehensive usage monitoring for API consumption, token tracking, and cost optimization. 
@@ -618,7 +618,7 @@ async def main(): worker = Worker( client, - task_queue="openai-agents-task-queue", + task_queue="openai-agents-basic-task-queue", workflows=[ HelloWorldAgent, ToolsWorkflow, @@ -705,9 +705,9 @@ async def test_agent_lifecycle(): ## ⚠️ Important Implementation Notes ### Task Queue Configuration -- **Worker**: Uses task queue `"openai-agents-task-queue"` +- **Worker**: Uses task queue `"openai-agents-basic-task-queue"` - **Runner Scripts**: Use task queue `"openai-agents-basic-task-queue"` -- **Note**: This mismatch means runner scripts won't connect to the worker unless the worker is started with the correct task queue +- **Note**: Task queues are consistent between worker and runner scripts ### Specific Examples Implemented - **Hello World**: Asks about recursion in programming diff --git a/docs/openai_agents/README.md b/docs/openai_agents/README.md index ab35406c0..2cb24ef75 100644 --- a/docs/openai_agents/README.md +++ b/docs/openai_agents/README.md @@ -19,8 +19,8 @@ This combination ensures that AI agent workflows are: ## 🚀 **Key Integration Benefits** -### **Implicit Activity Creation** -Every agent invocation is automatically executed through a Temporal Activity, providing durability without code changes. The `Runner.run()` calls automatically create activities under the hood. +### **Architecture: Runner in Workflow, Model Calls in Activities** +The **Runner and Agent execute within the Temporal workflow** (deterministic environment), while **model invocations automatically become Temporal Activities** (non-deterministic environment). This separation ensures that agent orchestration logic is durable and deterministic, while LLM API calls benefit from Temporal's retry mechanisms and fault tolerance. The integration provides this durability without requiring code changes to your existing Agent SDK applications. ### **Integrated Tracing** Unified observability across both Temporal and OpenAI systems. 
View agent execution in Temporal's workflow history and OpenAI's tracing dashboard simultaneously. @@ -111,22 +111,41 @@ class AgentWorkflow: ### **OpenAI Agents Plugin** ```python -from temporalio.contrib.openai_agents import OpenAIAgentsPlugin +from temporalio.contrib.openai_agents import OpenAIAgentsPlugin, ModelActivityParameters +from datetime import timedelta + +client = await Client.connect( + "localhost:7233", + plugins=[ + OpenAIAgentsPlugin( + model_params=ModelActivityParameters( + start_to_close_timeout=timedelta(seconds=30) + ) + ), + ], +) worker = Worker( client, task_queue="openai-agents-task-queue", - plugins=[OpenAIAgentsPlugin()], + workflows=[YourWorkflowClass], ) ``` -### **Agent Integration (Implicit Activities)** +### **Agent Integration (Model Invocations Create Activities)** ```python from agents import Agent, Runner -agent = Agent(name="MyAgent", instructions="...") -# This automatically creates a Temporal Activity under the hood -result = await Runner.run(agent, input_text) +@workflow.defn +class MyAgentWorkflow: + @workflow.run + async def run(self, input_text: str) -> str: + agent = Agent(name="MyAgent", instructions="...") + # Runner.run() executes inside the workflow (deterministic) + # Model invocations automatically create Temporal Activities (non-deterministic) + # (Requires OpenAIAgentsPlugin to be registered with the worker) + result = await Runner.run(agent, input_text) + return result.final_output ``` ## 📖 **Documentation Structure** diff --git a/docs/openai_agents/llms.txt b/docs/openai_agents/llms.txt index 9af2c849a..cb284a857 100644 --- a/docs/openai_agents/llms.txt +++ b/docs/openai_agents/llms.txt @@ -1,626 +1,444 @@ # OpenAI Agents SDK + Temporal Integration ## Overview -This integration combines OpenAI's Agents SDK with Temporal's durable execution engine to create production-ready AI agent workflows. 
The key innovation is that every agent invocation automatically becomes a Temporal Activity, providing durability, observability, and scalability without code changes. +This integration combines OpenAI's Agents SDK with Temporal's durable execution engine to create production-ready AI agent workflows. The key innovation is that every **model invocation** within the agentic loop automatically becomes a Temporal Activity, providing durability, observability, and scalability without code changes. **Note**: This behavior requires the `OpenAIAgentsPlugin` to be registered with the Temporal worker. ## Core Concept -When you call `Runner.run(agent, input)` in a Temporal workflow, it automatically: -1. Creates a Temporal Activity for the agent execution -2. Provides automatic retries, state persistence, and error handling -3. Enables horizontal scaling (each agent runs in its own process/thread) -4. Integrates tracing between Temporal and OpenAI systems - ---- - -# basic-usage: Basic Usage -URL: /docs/openai_agents/basic -Source: https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/BASIC.md - -## Basic Agent Workflow +When you call `Runner.run(agent, input)` in a Temporal workflow (with `OpenAIAgentsPlugin` registered), it: +1. Executes inside the Temporal workflow (not as an Activity) +2. Orchestrates the agentic loop +3. Each model invocation automatically creates a Temporal Activity +4. Provides automatic retries, state persistence, and error handling +5. Enables horizontal scaling (each agent runs in its own process/thread) +6. 
Integrates tracing between Temporal and OpenAI systems + +## Architecture Summary +- **Workflow Layer**: Temporal workflows orchestrate agent execution +- **Agent Layer**: OpenAI agents with specialized capabilities and tools +- **Activity Layer**: Model invocations become Temporal Activities automatically +- **Tool Layer**: External capabilities (web search, code execution, file search, image generation) +- **State Management**: Persistent conversation state and context across executions +- **Error Handling**: Automatic retries and fault tolerance for production reliability + +## Basic Pattern ```python from temporalio import workflow from agents import Agent, Runner @workflow.defn -class HelloWorldAgent: +class AgentWorkflow: @workflow.run - async def run(self, prompt: str) -> str: + async def run(self, input: str) -> str: agent = Agent( name="Assistant", - instructions="You only respond in haikus.", + instructions="You are a helpful assistant.", model="gpt-4o" ) - # This automatically creates a Temporal Activity - result = await Runner.run(agent, input=prompt) + # Runner.run() executes inside the workflow + # Model invocations automatically create Temporal Activities + result = await Runner.run(agent, input=input) return result.final_output ``` ## Worker Setup ```python +from temporalio.client import Client from temporalio.worker import Worker -from temporalio.contrib.openai_agents import OpenAIAgentsPlugin +from temporalio.contrib.openai_agents import OpenAIAgentsPlugin, ModelActivityParameters +from datetime import timedelta + +# Create client with plugin +client = await Client.connect( + "localhost:7233", + plugins=[ + OpenAIAgentsPlugin( + model_params=ModelActivityParameters( + start_to_close_timeout=timedelta(seconds=30) + ) + ) + ] +) +# Create worker worker = Worker( client, task_queue="openai-agents-task-queue", - workflows=[HelloWorldAgent], - plugins=[OpenAIAgentsPlugin()] + workflows=[AgentWorkflow] ) ``` -## Agent Lifecycle Management +## Multi-Agent 
Workflow ```python -@workflow.defn -class AgentLifecycleWorkflow: - @workflow.run - async def run(self, input: str) -> str: - # Initialize agent with specific configuration - agent = Agent( - name="LifecycleAgent", - instructions="Process the input step by step", - model="gpt-4o", - model_settings=ModelSettings( - temperature=0.7, - max_tokens=1000 - ) - ) - - # Execute with lifecycle management - result = await Runner.run( - agent, - input=input, - run_config=RunConfig( - max_steps=50, - timeout=300 - ) - ) - - return result.final_output -``` - ---- - -# agent-patterns: Agent Patterns -URL: /docs/openai_agents/agent_patterns -Source: https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/AGENT_PATTERNS.md +from temporalio import workflow +from agents import Agent, Runner -## Multi-Agent Orchestration -```python @workflow.defn class MultiAgentWorkflow: @workflow.run async def run(self, task: str) -> str: - # Planning agent - planner = Agent( - name="Planner", - instructions="Break down complex tasks into steps", - model="gpt-4o" - ) - + # Plan with planner agent + planner = Agent(name="Planner", instructions="...") plan = await Runner.run(planner, f"Plan: {task}") - # Execution agent - executor = Agent( - name="Executor", - instructions="Execute tasks based on plans", - model="gpt-4o" - ) - + # Execute with executor agent + executor = Agent(name="Executor", instructions="...") result = await Runner.run(executor, f"Execute: {plan.final_output}") - return result.final_output -``` - -## Agent Routing with Handoffs -```python -@workflow.defn -class RoutingWorkflow: - @workflow.run - async def run(self, query: str) -> str: - # Triage agent decides routing - triage_agent = Agent( - name="Triage", - instructions="Route queries to appropriate specialists", - handoffs=[weather_agent, business_agent, tech_agent] - ) - # Automatic handoff based on query type - result = await Runner.run(triage_agent, query) return result.final_output ``` 
-## Parallel Agent Execution -```python -@workflow.defn -class ParallelWorkflow: - @workflow.run - async def run(self, tasks: list[str]) -> list[str]: - # Create specialized agents - agents = [ - Agent(name=f"Agent{i}", instructions=f"Process {task}") - for i, task in enumerate(tasks) - ] - - # Execute agents in parallel - results = await asyncio.gather(*[ - Runner.run(agent, task) - for agent, task in zip(agents, tasks) - ]) - - return [r.final_output for r in results] -``` ---- +## Key Benefits +- **Durability**: Survives crashes, restarts, and failures +- **Scalability**: Independent scaling of different agent types +- **Observability**: Unified tracing across Temporal and OpenAI +- **Production Ready**: Automatic retries, rate limit handling, state persistence +- **Resumability**: Agentic loops can resume from any model invocation point -# tools-integration: Tools Integration -URL: /docs/openai_agents/tools -Source: https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/TOOLS.md +## Common Patterns -## Built-in OpenAI Tools +### 1. Sequential Agents ```python -from agents import WebSearchTool, CodeInterpreterTool, FileSearchTool +# Run agents one after another +planner = Agent(name="Planner", instructions="...") +plan = await Runner.run(planner, input) -@workflow.defn -class ToolsWorkflow: - @workflow.run - async def run(self, query: str) -> str: - agent = Agent( - name="ResearchAgent", - instructions="Research and analyze using available tools", - tools=[ - WebSearchTool(), - CodeInterpreterTool(), - FileSearchTool() - ], - model="gpt-4o" - ) - - result = await Runner.run(agent, query) - return result.final_output +executor = Agent(name="Executor", instructions="...") +result = await Runner.run(executor, plan.final_output) ``` -## Custom Tool Integration +### 2. 
Parallel Agents ```python -@workflow.defn -class CustomToolsWorkflow: - @workflow.run - async def run(self, request: str) -> str: - # Custom tool for database queries - db_tool = CustomTool( - name="database_query", - description="Query the database for information", - function=self.query_database - ) - - agent = Agent( - name="DataAgent", - instructions="Use database tools to answer questions", - tools=[db_tool], - model="gpt-4o" - ) - - result = await Runner.run(agent, request) - return result.final_output - - async def query_database(self, query: str) -> str: - # Database query implementation - return "Database result" +# Use asyncio.gather() for concurrent execution +results = await asyncio.gather( + Runner.run(agent1, input1), + Runner.run(agent2, input2), + Runner.run(agent3, input3) +) ``` -## Image Generation Tools +### 3. Agent Handoffs ```python -@workflow.defn -class ImageGenerationWorkflow: - @workflow.run - async def run(self, prompt: str) -> str: - agent = Agent( - name="ImageAgent", - instructions="Generate images based on descriptions", - tools=[ImageGeneratorTool()], - model="gpt-4o" - ) - - result = await Runner.run(agent, prompt) - return result.final_output +# Use handoffs parameter for agent-to-agent transitions +triage_agent = Agent( + name="Triage", + handoffs=[specialist_agent1, specialist_agent2] +) ``` ---- - -# handoffs: Agent Handoffs -URL: /docs/openai_agents/handoffs -Source: https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/HANDOFFS.md - -## Message Filtering with Handoffs +### 4. 
State Management ```python +# Leverage workflow state for conversation history @workflow.defn -class MessageFilterWorkflow: +class ConversationWorkflow: + def __init__(self): + self.history = [] + @workflow.run - async def run(self, message: str) -> str: - # Filter agent with handoff capabilities - filter_agent = Agent( - name="MessageFilter", - instructions="Filter and route messages appropriately", - handoffs=[ - Agent(name="SpamFilter", instructions="..."), - Agent(name="ContentModerator", instructions="..."), - Agent(name="Router", instructions="...") - ] - ) - - result = await Runner.run(filter_agent, message) - return result.final_output + async def run(self, message: str): + self.history.append(f"User: {message}") + result = await Runner.run(agent, self.history) + self.history.append(f"Agent: {result.final_output}") ``` -## Collaborative Agent Workflows + +### 5. Guardrails & Validation ```python -@workflow.defn -class CollaborationWorkflow: - @workflow.run - async def run(self, project: str) -> str: - # Project manager agent - manager = Agent( - name="ProjectManager", - instructions="Coordinate project execution", - handoffs=[ - Agent(name="Designer", instructions="..."), - Agent(name="Developer", instructions="..."), - Agent(name="Tester", instructions="...") - ] - ) - - result = await Runner.run(manager, project) - return result.final_output +# Input and output guardrails +@input_guardrail +async def validate_input(context, agent, input): + # Custom validation logic + return GuardrailFunctionOutput(tripwire_triggered=False) + +agent = Agent( + name="SafeAgent", + input_guardrails=[validate_input] +) ``` ---- - -# hosted-mcp: Hosted MCP Integration -URL: /docs/openai_agents/hosted_mcp -Source: https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/HOSTED_MCP.md - -## MCP Server Integration +### 6. 
Custom Model Providers ```python -@workflow.defn -class MCPWorkflow: - @workflow.run - async def run(self, request: str) -> str: - # Agent with MCP tool access - mcp_agent = Agent( - name="MCPAgent", - instructions="Use MCP tools to access external systems", - tools=[MCPTool(server_url="http://localhost:3000")], - model="gpt-4o" +# Custom model provider for local/OSS models +class CustomModelProvider(ModelProvider): + def get_model(self, model_name: str) -> Model: + return OpenAIChatCompletionsModel( + model=model_name, + openai_client=ollama_client ) - - result = await Runner.run(mcp_agent, request) - return result.final_output + +# Use in client +client = await Client.connect( + "localhost:7233", + plugins=[ + OpenAIAgentsPlugin(model_provider=CustomModelProvider()) + ] +) ``` -## Approval Workflow with MCP +### 7. Activity-Based I/O ```python +# External I/O operations in activities +@activity.defn +async def external_api_call(data: str) -> str: + # Non-deterministic operations + return await some_external_api(data) + @workflow.defn -class ApprovalMCPWorkflow: +class MyWorkflow: @workflow.run - async def run(self, request: str) -> str: - # Approval agent using MCP tools - approval_agent = Agent( - name="ApprovalAgent", - instructions="Process approval requests using MCP tools", - tools=[ - MCPTool(server_url="http://approval-system:3000"), - MCPTool(server_url="http://notification-system:3000") - ] + async def run(self, input: str): + result = await workflow.execute_activity( + external_api_call, + input, + start_to_close_timeout=timedelta(seconds=30) ) - - result = await Runner.run(approval_agent, request) - return result.final_output + return result ``` ---- +## Documentation Index -# model-providers: Model Providers -URL: /docs/openai_agents/model_providers -Source: https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/MODEL_PROVIDERS.md +### Core Integration Patterns +- 
**[BASIC.md](https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/BASIC.md)** - Fundamental agent patterns, lifecycle hooks, dynamic prompts, image processing, conversation continuity +- **[AGENT_PATTERNS.md](https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/AGENT_PATTERNS.md)** - Advanced patterns: deterministic flows, parallelization, LLM-as-a-judge, agents-as-tools, guardrails, forcing tool use -## LiteLLM Integration -```python -@workflow.defn -class LiteLLMWorkflow: - @workflow.run - async def run(self, input: str) -> str: - agent = Agent( - name="LiteLLMAgent", - instructions="Process requests using LiteLLM", - model="gpt-4o", - model_settings=ModelSettings( - provider="litellm", - api_base="http://localhost:8000" - ) - ) - - result = await Runner.run(agent, input) - return result.final_output -``` +### Tool Integration & External Services +- **[TOOLS.md](https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/TOOLS.md)** - Code interpreter, file search, image generation, web search, knowledge base setup +- **[HOSTED_MCP.md](https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/HOSTED_MCP.md)** - Model Context Protocol integration with approval workflows +- **[MODEL_PROVIDERS.md](https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/MODEL_PROVIDERS.md)** - Custom LLM providers, LiteLLM integration, local Ollama support -## Ollama Integration -```python -@workflow.defm -class OllamaWorkflow: - @workflow.run - async def run(self, input: str) -> str: - agent = Agent( - name="OllamaAgent", - instructions="Use local Ollama models", - model="llama2", - model_settings=ModelSettings( - provider="ollama", - api_base="http://localhost:11434" - ) - ) - - result = await Runner.run(agent, input) - return result.final_output +### Multi-Agent Systems +- 
**[CUSTOMER_SERVICE.md](https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/CUSTOMER_SERVICE.md)** - Persistent conversations, agent handoffs, stateful workflows +- **[FINANCIAL_RESEARCH_AGENT.md](https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/FINANCIAL_RESEARCH_AGENT.md)** - Multi-agent orchestration, parallel search execution, specialist analysis tools +- **[RESEARCH_BOT.md](https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/RESEARCH_BOT.md)** - Web research workflows, parallel search execution, intelligent planning + +### Advanced Features +- **[HANDOFFS.md](https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/HANDOFFS.md)** - Agent handoff patterns, message filtering, context management +- **[REASONING_CONTENT.md](https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/REASONING_CONTENT.md)** - Access model reasoning content, explainable AI, step-by-step thinking + +### Architecture & Implementation +- **[ARCHITECTURE.md](https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/ARCHITECTURE.md)** - Deep technical architecture, system design, execution flow diagrams +- **[README.md](https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/README.md)** - High-level overview, integration benefits, quick start guide + +## File Structure +``` +openai_agents/ +├── basic/ # Fundamental patterns (9 workflows) +├── agent_patterns/ # Multi-agent architectures (8 patterns) +├── tools/ # Tool integration examples (4 tools) +├── handoffs/ # Agent collaboration (1 workflow) +├── hosted_mcp/ # MCP integration (2 workflows) +├── model_providers/ # Custom LLM providers (2 workflows) +├── reasoning_content/ # Model reasoning access (1 workflow) +├── customer_service/ # Conversational 
workflows (1 workflow) +├── financial_research_agent/ # Complex multi-agent system (1 workflow) +└── research_bot/ # Research workflow example (1 workflow) ``` -## GPT-OSS Integration +## Task Queue Conventions +Each service uses dedicated task queues for isolation and scalability: +- `openai-agents-basic-task-queue` - Basic patterns and examples +- `openai-agents-patterns-task-queue` - Advanced agent patterns +- `openai-agents-tools-task-queue` - Tool integration workflows +- `openai-agents-handoffs-task-queue` - Agent handoff patterns +- `openai-agents-hosted-mcp-task-queue` - MCP integration workflows +- `openai-agents-model-providers-task-queue` - Custom model providers +- `reasoning-content-task-queue` - Reasoning content extraction +- `openai-agents-task-queue` - Customer service and research workflows +- `financial-research-task-queue` - Financial research agent + +## Key Conventions + +### Workflow Structure ```python @workflow.defn -class GPTOSSWorkflow: +class MyWorkflow: @workflow.run async def run(self, input: str) -> str: - agent = Agent( - name="GPTOSSAgent", - instructions="Use open-source GPT models", - model="gpt2", - model_settings=ModelSettings( - provider="gpt-oss", - api_base="http://localhost:5000" - ) - ) - - result = await Runner.run(agent, input) - return result.final_output + # Workflow implementation + pass ``` ---- - -# reasoning-content: Reasoning Content -URL: /docs/openai_agents/reasoning_content -Source: https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/REASONING_CONTENT.md - -## Accessing Model Reasoning +### Agent Configuration ```python -@workflow.defn -class ReasoningWorkflow: - @workflow.run - async def run(self, problem: str) -> str: - agent = Agent( - name="ReasoningAgent", - instructions="Show your reasoning step by step", - model="gpt-4o", - model_settings=ModelSettings( - show_reasoning=True, - reasoning_format="step_by_step" - ) - ) - - result = await Runner.run(agent, problem) 
- return result.final_output +agent = Agent( + name="DescriptiveName", # Clear, descriptive names + instructions="Specific instructions...", # Focused instructions + model="gpt-4o", # Specify model when needed + tools=[...], # Tool integration + handoffs=[...], # Agent handoffs + input_guardrails=[...], # Input validation + output_guardrails=[...] # Output validation +) ``` -## Thought Process Extraction +### Worker Setup ```python -@workflow.defn -class ThoughtProcessWorkflow: - @workflow.run - async def run(self, question: str) -> str: - agent = Agent( - name="ThoughtAgent", - instructions="Explain your thinking process", - model="gpt-4o", - model_settings=ModelSettings( - extract_thoughts=True, - thought_format="detailed" - ) - ) - - result = await Runner.run(agent, question) - return result.final_output +# Client with plugin (required for model invocations to become activities) +client = await Client.connect( + "localhost:7233", + plugins=[OpenAIAgentsPlugin()] +) + +# Worker (no plugins needed here) +worker = Worker( + client, + task_queue="service-specific-task-queue", + workflows=[WorkflowClass], + activities=[...] 
# Custom activities if needed +) ``` ---- +### Error Handling +- Temporal automatically retries failed model invocations +- Use try/except for custom error handling +- Implement graceful degradation for external tool failures +- Use workflow queries and updates for real-time error reporting -# customer-service: Customer Service -URL: /docs/openai_agents/customer_service -Source: https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/CUSTOMER_SERVICE.md +## Quick Reference -## Conversational Customer Service +### Essential Imports ```python -@workflow.defn -class CustomerServiceWorkflow: - def __init__(self): - self.conversation_history = [] - - @workflow.run - async def run(self, customer_query: str) -> str: - # Add to conversation history - self.conversation_history.append(customer_query) - - # Customer service agent - agent = Agent( - name="CustomerService", - instructions="Provide helpful customer support", - model="gpt-4o", - context=f"Conversation history: {self.conversation_history}" - ) - - result = await Runner.run(agent, customer_query) - - # Update history - self.conversation_history.append(result.final_output) - return result.final_output +# Core Temporal imports +from temporalio import workflow, activity +from temporalio.client import Client +from temporalio.worker import Worker + +# OpenAI Agents SDK imports +from agents import Agent, Runner, WebSearchTool, CodeInterpreterTool +from agents import FileSearchTool, ImageGenerationTool, HostedMCPTool +from agents import input_guardrail, output_guardrail, function_tool + +# Temporal OpenAI integration +from temporalio.contrib.openai_agents import OpenAIAgentsPlugin, ModelActivityParameters + +# Pydantic for structured data +from pydantic import BaseModel ``` -## Escalation Workflow +### Common Setup Patterns ```python +# Basic workflow setup @workflow.defn -class EscalationWorkflow: +class MyWorkflow: @workflow.run - async def run(self, issue: str) -> str: - # Initial 
support agent - support_agent = Agent( - name="SupportAgent", - instructions="Handle customer issues, escalate if needed", - handoffs=[ - Agent(name="SeniorSupport", instructions="..."), - Agent(name="TechnicalSpecialist", instructions="..."), - Agent(name="Manager", instructions="...") - ] - ) - - result = await Runner.run(support_agent, issue) + async def run(self, input: str) -> str: + agent = Agent(name="MyAgent", instructions="...") + result = await Runner.run(agent, input) return result.final_output + +# Worker setup with plugin in client +async def create_worker(): + client = await Client.connect( + "localhost:7233", + plugins=[OpenAIAgentsPlugin()] + ) + worker = Worker( + client, + task_queue="my-task-queue", + workflows=[MyWorkflow] + ) + return worker ``` ---- +## Getting Started +1. Start Temporal server: `temporal server start-dev` +2. Install dependencies: `uv sync --group openai-agents` +3. Set OpenAI API key: `export OPENAI_API_KEY=your_key` +4. Run worker: `uv run openai_agents/basic/run_worker.py` +5. 
Execute workflow: `uv run openai_agents/basic/run_hello_world_workflow.py` + +## Available Tools -# financial-research: Financial Research Agent -URL: /docs/openai_agents/financial_research_agent -Source: https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/FINANCIAL_RESEARCH_AGENT.md +### Core Tools +- **WebSearchTool**: Web search with location-aware context +- **CodeInterpreterTool**: Python code execution for calculations and data analysis +- **FileSearchTool**: Vector-based document search using OpenAI's file search +- **ImageGenerationTool**: DALL-E integration for image creation +- **HostedMCPTool**: Model Context Protocol integration with approval workflows -## Multi-Agent Financial Analysis +### Tool Configuration Examples ```python -@workflow.defn -class FinancialResearchWorkflow: - @workflow.run - async def run(self, research_request: str) -> str: - # Planning agent - planner = Agent( - name="FinancialPlanner", - instructions="Plan financial research approach", - model="gpt-4o" - ) - - plan = await Runner.run(planner, research_request) - - # Risk analysis agent - risk_agent = Agent( - name="RiskAnalyst", - instructions="Analyze financial risks", - model="gpt-4o" - ) - - risk_analysis = await Runner.run(risk_agent, plan.final_output) - - # Financial data agent - data_agent = Agent( - name="DataAnalyst", - instructions="Analyze financial data", - model="gpt-4o" - ) - - data_analysis = await Runner.run(data_agent, risk_analysis.final_output) - - # Synthesis agent - synthesizer = Agent( - name="Synthesizer", - instructions="Combine all analyses into final report", - model="gpt-4o" - ) - - final_report = await Runner.run( - synthesizer, - f"Plan: {plan.final_output}\nRisk: {risk_analysis.final_output}\nData: {data_analysis.final_output}" - ) - - return final_report.final_output -``` +# Web Search with location context +WebSearchTool(user_location={"type": "approximate", "city": "New York"}) + +# Code Interpreter with 
auto container
+CodeInterpreterTool(tool_config={
+    "type": "code_interpreter",
+    "container": {"type": "auto"}
+})
+
+# File Search with vector store
+FileSearchTool(
+    max_num_results=3,
+    vector_store_ids=["vs_123"],
+    include_search_results=True
+)
----
+# Image Generation with quality control
+ImageGenerationTool(tool_config={
+    "type": "image_generation",
+    "quality": "low"  # or "high"
+})
+
+# MCP Tool without approval
+HostedMCPTool(tool_config={
+    "type": "mcp",
+    "server_label": "gitmcp",
+    "server_url": "https://gitmcp.io/openai/codex",
+    "require_approval": "never"
+})
+
+# MCP Tool with approval callback
+HostedMCPTool(
+    tool_config={
+        "type": "mcp",
+        "server_label": "gitmcp",
+        "server_url": "https://gitmcp.io/openai/codex",
+        "require_approval": "always"
+    },
+    on_approval_request=approval_callback
+)
+```
-# research-bot: Research Bot
-URL: /docs/openai_agents/research_bot
-Source: https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/RESEARCH_BOT.md
+## Data Models & Pydantic Integration
-## Comprehensive Research Workflow
+### Common Pydantic Models
 ```python
-@workflow.defn
-class ResearchBotWorkflow:
-    @workflow.run
-    async def run(self, research_topic: str) -> str:
-        # Research manager coordinates the process
-        manager = Agent(
-            name="ResearchManager",
-            instructions="Coordinate comprehensive research",
-            model="gpt-4o"
-        )
-
-        # Planning phase
-        plan = await Runner.run(manager, f"Plan research for: {research_topic}")
-
-        # Search agents for different aspects
-        search_agents = [
-            Agent(name="AcademicSearch", instructions="Search academic sources"),
-            Agent(name="NewsSearch", instructions="Search recent news"),
-            Agent(name="TechnicalSearch", instructions="Search technical documentation")
-        ]
-
-        # Parallel search execution
-        search_results = await asyncio.gather(*[
-            Runner.run(agent, plan.final_output)
-            for agent in search_agents
-        ])
-
-        # Writing agent synthesizes results
-        writer = Agent(
-            name="ResearchWriter",
-            instructions="Write comprehensive research report",
-            model="gpt-4o"
-        )
-
-        synthesis_input = f"Topic: {research_topic}\nPlan: {plan.final_output}\nResults: {search_results}"
-        final_report = await Runner.run(writer, synthesis_input)
-
-        return final_report.final_output
+from pydantic import BaseModel
+
+# Financial research models
+class FinancialSearchItem(BaseModel):
+    reason: str
+    query: str
+
+class FinancialSearchPlan(BaseModel):
+    searches: list[FinancialSearchItem]
+
+# Image generation result
+class ImageGenerationResult(BaseModel):
+    final_output: str
+    image_data: str | None = None
+
+# Reasoning content models (from activities)
+class ReasoningResult(BaseModel):
+    reasoning_content: str | None
+    regular_content: str | None
+    prompt: str
 ```
----
-
-# Key Benefits
-- **Durability**: Survives crashes, restarts, and failures
-- **Scalability**: Independent scaling of different agent types
-- **Observability**: Unified tracing across Temporal and OpenAI
-- **Production Ready**: Automatic retries, rate limit handling, state persistence
-
-## Common Patterns
-1. **Sequential Agents**: Run agents one after another
-2. **Parallel Agents**: Use `asyncio.gather()` for concurrent execution
-3. **Agent Handoffs**: Use `handoffs` parameter for agent-to-agent transitions
-4. **State Management**: Leverage workflow state for conversation history
-5. **Error Handling**: Temporal automatically retries failed agent invocations
+### Output Type Integration
+```python
+agent = Agent(
+    name="StructuredAgent",
+    instructions="Generate structured output",
+    output_type=MyPydanticModel  # Enforces structured output
+)
-## File Structure
-```
-openai_agents/
-├── basic/                     # Fundamental patterns
-├── agent_patterns/            # Multi-agent architectures
-├── tools/                     # Tool integration examples
-├── handoffs/                  # Agent collaboration
-├── hosted_mcp/                # MCP integration
-├── model_providers/           # Custom LLM providers
-├── reasoning_content/         # Model reasoning access
-├── customer_service/          # Conversational workflows
-├── financial_research_agent/  # Complex multi-agent system
-└── research_bot/              # Research workflow example
+result = await Runner.run(agent, input)
+structured_output = result.final_output_as(MyPydanticModel)
 ```
-## Getting Started
-1. Start Temporal server: `temporal server start-dev`
-2. Install dependencies: `uv sync --group openai-agents`
-3. Set OpenAI API key: `export OPENAI_API_KEY=your_key`
-4. Run worker: `uv run openai_agents/basic/run_worker.py`
-5. Execute workflow: `uv run openai_agents/basic/run_hello_world_workflow.py`
-
 ## Best Practices
 - Keep agents focused on single responsibilities
 - Use descriptive agent names for debugging
@@ -628,8 +446,48 @@ openai_agents/
 - Handle agent failures gracefully with Temporal's retry mechanisms
 - Use appropriate timeouts for agent operations
 - Monitor both Temporal and OpenAI dashboards for observability
+- Use Pydantic models for structured data validation
+- Implement proper error handling for external tool failures
+- Use dedicated task queues for service isolation
+- Follow naming conventions for workflows and agents
+
+## Integration Summary
+
+### Core Architecture Benefits
+- **Implicit Activity Creation**: Model invocations automatically become Temporal Activities
+- **Durable Execution**: Agent workflows survive crashes, restarts, and failures
+- **Horizontal Scaling**: Each agent can run in its own process/thread
+- **Unified Observability**: Tracing across both Temporal and OpenAI systems
+- **Production Ready**: Automatic retries, rate limiting, and state persistence
+
+### Service Capabilities Overview
+- **9 Basic Workflows**: Hello world, lifecycle hooks, dynamic prompts, image processing
+- **8 Agent Patterns**: Deterministic flows, parallelization, LLM-as-a-judge, guardrails
+- **4 Tool Types**: Web search, code execution, file search, image generation
+- **2 MCP Patterns**: Simple connections and approval-based workflows
+- **2 Model Providers**: LiteLLM integration and custom Ollama providers
+- **3 Multi-Agent Systems**: Customer service, financial research, research bot
+- **1 Reasoning System**: Access to model step-by-step thinking processes
+
+### Key Technical Patterns
+1. **Workflow-First Design**: All agent execution happens within Temporal workflows
+2. **Activity-Based I/O**: External operations (API calls, file I/O) in activities
+3. **Stateful Conversations**: Persistent state across multiple interactions
+4. **Parallel Execution**: Concurrent agent execution using `asyncio.gather()`
+5. **Tool Integration**: Seamless integration of external capabilities
+6. **Error Resilience**: Automatic retries and graceful degradation
+7. **Structured Output**: Pydantic models for type-safe data exchange
+8. **Guardrails**: Input/output validation for production safety
+
+### Development Workflow
+1. **Design**: Plan agent responsibilities and workflow structure
+2. **Implement**: Create workflows with proper error handling
+3. **Test**: Use dedicated task queues for isolated testing
+4. **Deploy**: Scale workers independently based on demand
+5. **Monitor**: Use Temporal and OpenAI dashboards for observability
 
 ## Resources
 - [Temporal Blog](https://temporal.io/blog/announcing-openai-agents-sdk-integration)
 - [Python SDK](https://github.com/temporalio/sdk-python/tree/main/temporalio/contrib/openai_agents)
 - [Community Demos](https://github.com/temporal-community/openai-agents-demos)
+- [Architecture Deep Dive](https://raw.githubusercontent.com/temporalio/samples-python/refs/heads/main/docs/openai_agents/ARCHITECTURE.md)
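
Reviewer note on the "Parallel Execution" item under Key Technical Patterns: the fan-out shape it describes can be sketched with the standard library alone. This is a minimal illustration, not the code from the samples. `StubResult` and `stub_run` are hypothetical stand-ins for the SDK's run result and `Runner.run`, so no model is called and no Temporal worker is needed.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StubResult:
    """Hypothetical stand-in for the object returned by Runner.run."""
    final_output: str


async def stub_run(agent_name: str, prompt: str) -> StubResult:
    """Hypothetical stand-in for Runner.run; it only echoes its inputs."""
    await asyncio.sleep(0)  # yield control, as an awaited model call would
    return StubResult(final_output=f"[{agent_name}] {prompt}")


async def fan_out(prompt: str, agent_names: list[str]) -> list[str]:
    """Run every stub agent concurrently and return outputs in input order."""
    results = await asyncio.gather(
        *(stub_run(name, prompt) for name in agent_names)
    )
    return [result.final_output for result in results]


outputs = asyncio.run(
    fan_out("durable execution", ["AcademicSearch", "NewsSearch"])
)
```

`asyncio.gather()` returns results in the order the awaitables were passed, so a downstream synthesis step can rely on positional correspondence between agents and outputs even when the calls finish out of order.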