 **Feature**: INITIAL-10.md — Agentic Layer
 **Status**: Ready for Implementation
-**Confidence Score**: 7.5/10
+**Confidence Score**: 8.0/10
+**Last Updated**: 2026-02-01 (Post Phase-9 RAG Review)

 ---
@@ -65,7 +66,7 @@ This is the "Brain" layer that orchestrates tools from INITIAL-9 (RAG), Phase 5
   why: "Official PydanticAI docs - main reference"

 - url: https://ai.pydantic.dev/agents/
-  why: "Agent constructor, result_type, system_prompt, run/run_stream methods"
+  why: "Agent constructor, output_type, system_prompt, run/run_stream methods"

 - url: https://ai.pydantic.dev/tools/
   why: "@agent.tool decorator, RunContext, deps_type, tool parameters"
@@ -165,13 +166,14 @@ examples/agents/
 ### Known Gotchas & Library Quirks

 ```python
-# CRITICAL: PydanticAI model identifier format
-# Use "anthropic:claude-sonnet-4-20250514" NOT "claude-sonnet-4-20250514"
-agent = Agent(model="anthropic:claude-sonnet-4-20250514")
+# CRITICAL: PydanticAI model identifier format (updated Jan 2026)
+# Use "anthropic:claude-sonnet-4-5" NOT "claude-sonnet-4-5"
+# For production, pin specific version: "anthropic:claude-sonnet-4-5-20250929"
+agent = Agent(model="anthropic:claude-sonnet-4-5")

 # CRITICAL: deps_type must match RunContext generic parameter
 agent = Agent(
-    model="anthropic:claude-sonnet-4-20250514",
+    model="anthropic:claude-sonnet-4-5",
     deps_type=AgentDeps,  # Your dependency dataclass
 )
@@ -502,9 +504,12 @@ class WSEvent(BaseModel):
 ```yaml
 MODIFY: pyproject.toml
 ADD to dependencies:
-  - "pydantic-ai>=0.1.0"  # PydanticAI agent framework
-  - "anthropic>=0.40.0"  # Anthropic SDK for Claude
+  - "pydantic-ai>=1.48.0"  # PydanticAI agent framework (v1 stable, API guaranteed)
+  - "anthropic>=0.50.0"  # Anthropic SDK for Claude
   - "websockets>=13.0"  # WebSocket support (already in uvicorn[standard])
+
+NOTE: PydanticAI v1.0 was released Sept 2025 with API stability guarantee.
+Current version is 1.48.0 (Jan 2026). Do NOT use 0.x versions.
 ```

 ### Task 2: Add Agent Settings to config.py
@@ -514,7 +519,7 @@ MODIFY: app/core/config.py
 ADD after RAG settings:

     # Agent LLM Configuration
-    agent_default_model: str = "anthropic:claude-sonnet-4-20250514"
+    agent_default_model: str = "anthropic:claude-sonnet-4-5"
     agent_fallback_model: str = "openai:gpt-4o"
     agent_temperature: float = 0.1
     agent_max_tokens: int = 4096
@@ -596,28 +601,41 @@ INCLUDE:
 CREATE: app/features/agents/tools/registry_tools.py
 TOOLS:
   - list_runs(ctx, filters) -> list[RunSummary]
+    # Wraps: RegistryService.list_runs(db, page, page_size, model_type, status, store_id, product_id)
   - compare_runs(ctx, run_id_a, run_id_b) -> CompareResult
+    # Wraps: RegistryService.compare_runs(db, run_id_a, run_id_b)
   - create_alias(ctx, alias_name, run_id) -> AliasResult
+    # Wraps: RegistryService.create_alias(db, AliasCreate(...))
+    # REQUIRES HUMAN APPROVAL
   - archive_run(ctx, run_id) -> ArchiveResult
+    # Wraps: RegistryService.update_run(db, run_id, RunUpdate(status=RunStatus.ARCHIVED))
+    # NOTE: No direct archive method - use update_run with ARCHIVED status
+    # REQUIRES HUMAN APPROVAL

 CREATE: app/features/agents/tools/backtesting_tools.py
 TOOLS:
   - run_backtest(ctx, model_type, config, store_id, product_id, n_splits) -> BacktestResult
+    # Wraps: BacktestingService.run_backtest(db, store_id, product_id, start_date, end_date, config)

 CREATE: app/features/agents/tools/forecasting_tools.py
 TOOLS:
   - list_models(ctx) -> list[ModelInfo]
+    # Returns available model types: naive, seasonal_naive, moving_average, lightgbm (if enabled)

 CREATE: app/features/agents/tools/rag_tools.py
 TOOLS:
-  - retrieve_context(ctx, query, top_k) -> list[RetrievedChunk]
+  - retrieve_context(ctx, query, top_k) -> list[ChunkResult]
+    # Wraps: RAGService.retrieve(db, RetrieveRequest(query=query, top_k=top_k))
+    # NOTE: RAG service uses retrieve() not retrieve_context()
   - format_citation(ctx, chunk) -> Citation
+    # Transforms ChunkResult to Citation schema

 CRITICAL for all tools:
   - Use @agent.tool decorator (not @agent.tool_plain) for db access
   - First param is RunContext[AgentDeps]
-  - Detailed docstrings for LLM schema
+  - Detailed docstrings for LLM schema (Google/numpy style supported)
   - Structured logging with timing
+  - Match actual service method signatures from Phase 5-9 implementations
 ```

 ### Task 8: Create Agent Definitions
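The Task 7 notes say that `archive_run` has no direct service method and should wrap `update_run` with an ARCHIVED status, behind human approval. The sketch below shows that shape using only the standard library. `StubRegistryService`, `ApprovalRequired`, the `approved` flag, and the `ACTIVE` enum member are hypothetical stand-ins introduced for illustration. The real tool would take `RunContext[AgentDeps]` as its first parameter and be registered with `@agent.tool`, which this sketch omits.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum


class RunStatus(str, Enum):
    ACTIVE = "active"  # hypothetical member, only ARCHIVED appears in the plan
    ARCHIVED = "archived"


@dataclass
class RunUpdate:
    status: RunStatus


class StubRegistryService:
    """In-memory stand-in for RegistryService (hypothetical, for illustration)."""

    def __init__(self):
        self.runs = {"run-1": RunStatus.ACTIVE}

    async def update_run(self, db, run_id, update):
        self.runs[run_id] = update.status
        return {"run_id": run_id, "status": update.status.value}


class ApprovalRequired(Exception):
    """Raised when a mutating tool is called without human approval."""


async def archive_run(service, db, run_id, approved):
    # No direct archive method exists, so archiving is an update_run call
    # with status=ARCHIVED, gated behind explicit human approval.
    if not approved:
        raise ApprovalRequired(f"archive_run({run_id!r}) needs human approval")
    return await service.update_run(db, run_id, RunUpdate(status=RunStatus.ARCHIVED))


async def demo():
    service = StubRegistryService()
    try:
        await archive_run(service, db=None, run_id="run-1", approved=False)
        blocked = False
    except ApprovalRequired:
        blocked = True
    result = await archive_run(service, db=None, run_id="run-1", approved=True)
    return blocked, result, service.runs["run-1"]


blocked, archive_result, final_status = asyncio.run(demo())
```

Keeping the approval check inside the wrapper, rather than in the caller, means the service is never touched when approval is missing.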
@@ -736,11 +754,58 @@ ADD websocket: app.add_api_websocket_route("/agents/stream", websocket_stream)
 ```yaml
 CREATE: app/features/agents/tests/conftest.py
 FIXTURES:
-  - db_session: Async session with cleanup
+  - db_session: Async session with cleanup (follow registry/tests/conftest.py pattern)
   - client: AsyncClient with db override
-  - mock_anthropic: Mock Anthropic API responses
-  - sample_experiment_request: Test request
-  - sample_rag_request: Test request
+  - mock_pydantic_ai_agent: Mock PydanticAI Agent (see pattern below)
+  - sample_experiment_request: ExperimentRequest fixture
+  - sample_rag_request: RAGQueryRequest fixture
+  - sample_agent_session: AgentSession ORM fixture
+
+MOCK PATTERN (following rag/tests/conftest.py mock_embedding_service):
+```
+
+```python
+@pytest.fixture
+def mock_pydantic_ai_agent():
+    """Mock PydanticAI Agent for unit tests without LLM calls.
+
+    Follows the mock_embedding_service pattern from RAG tests.
+    Returns deterministic responses without API calls.
+    """
+    from unittest.mock import AsyncMock, MagicMock
+    from app.features.agents.schemas import ExperimentReport, RunSummary
+
+    # Create mock structured output
+    mock_report = ExperimentReport(
+        objective="Test objective",
+        methodology="Tested naive and seasonal_naive models",
+        experiments_run=2,
+        best_run=RunSummary(
+            run_id="test123",
+            model_type="seasonal_naive",
+            config={"season_length": 7},
+            metrics={"mae": 5.0, "smape": 10.0},
+        ),
+        baseline_comparison=None,
+        recommendation="Deploy seasonal_naive model",
+        approval_required=False,
+    )
+
+    # Mock result object
+    mock_result = MagicMock()
+    mock_result.output = mock_report
+    mock_result.usage.return_value = MagicMock(
+        input_tokens=100,
+        output_tokens=50,
+    )
+    mock_result.messages = []
+
+    # Mock agent
+    agent = MagicMock()
+    agent.run = AsyncMock(return_value=mock_result)
+    agent.run_stream = AsyncMock()
+
+    return agent
 ```

 ### Task 14: Create Unit Tests
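A unit test could consume the `mock_pydantic_ai_agent` pattern roughly as sketched below, using only the standard library. `SimpleNamespace` stands in for the project's `ExperimentReport` schema, which is not imported here. `build_mock_agent` and `run_experiment` are hypothetical names introduced for illustration, not part of the planned codebase.

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock


def build_mock_agent():
    """Hypothetical helper mirroring the mock_pydantic_ai_agent fixture body."""
    # Stand-in for ExperimentReport; the real schema is assumed to live in
    # app.features.agents.schemas and is not imported here.
    report = SimpleNamespace(
        recommendation="Deploy seasonal_naive model",
        approval_required=False,
    )

    result = MagicMock()
    result.output = report
    # usage is configured as a callable, matching result.usage() call sites.
    result.usage.return_value = MagicMock(input_tokens=100, output_tokens=50)

    agent = MagicMock()
    agent.run = AsyncMock(return_value=result)
    return agent


async def run_experiment(agent, prompt):
    """Code under test: awaits the agent and extracts the fields it needs."""
    result = await agent.run(prompt)
    usage = result.usage()
    return {
        "recommendation": result.output.recommendation,
        "approval_required": result.output.approval_required,
        "total_tokens": usage.input_tokens + usage.output_tokens,
    }


mock_agent = build_mock_agent()
summary = asyncio.run(run_experiment(mock_agent, "Compare baseline models"))
```

Because `agent.run` is an `AsyncMock`, the test can also assert on how it was awaited, for example with `mock_agent.run.assert_awaited_once_with(...)`.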
@@ -792,9 +857,11 @@ MODIFY: .env.example
 ADD:
   # Agent Configuration
   ANTHROPIC_API_KEY=sk-ant-...
-  AGENT_DEFAULT_MODEL=anthropic:claude-sonnet-4-20250514
+  AGENT_DEFAULT_MODEL=anthropic:claude-sonnet-4-5
+  AGENT_FALLBACK_MODEL=openai:gpt-4o
   AGENT_MAX_TOOL_CALLS=10
   AGENT_TIMEOUT_SECONDS=120
+  AGENT_TEMPERATURE=0.1
 ```

 ---
@@ -899,22 +966,31 @@ python examples/agents/websocket_client.py
 ---

-## Confidence Score: 7.5/10
+## Confidence Score: 8.0/10

 **Strengths:**
-- PydanticAI has excellent documentation
-- Clear FastAPI integration patterns
-- Existing service patterns to follow
-- Tool integrations with existing modules
+- PydanticAI v1.x provides API stability guarantee (released Sept 2025)
+- Clear FastAPI integration patterns with excellent documentation
+- Existing service patterns from Registry/RAG/Backtesting to follow
+- Tool integrations with existing modules well-defined
+- Mock patterns established in RAG tests (mock_embedding_service)

 **Risks:**
-- PydanticAI is relatively new (versioning may change)
 - WebSocket streaming with tools is complex
-- LLM rate limits may affect tests
+- LLM rate limits may affect integration tests
 - Message history serialization edge cases
+- Tool execution ordering in multi-step workflows

 **Mitigations:**
-- Pin PydanticAI version in pyproject.toml
-- Comprehensive mocking for unit tests
-- Rate-limited integration tests
+- Pin PydanticAI version >=1.48.0 in pyproject.toml
+- Comprehensive mocking following RAG test patterns
+- Rate-limited integration tests with retry logic
 - JSONB for flexible message storage
+- Timeout handling with asyncio.wait_for
+
+**Changes Since Initial Review (2026-02-01):**
+- Updated PydanticAI from 0.1.0 to 1.48.0 (v1 stable)
+- Updated Claude model identifier to claude-sonnet-4-5 format
+- Added service method mapping notes to Task 7
+- Added mock_pydantic_ai_agent fixture pattern
+- Verified tool wrappers match actual service APIs
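The "Timeout handling with asyncio.wait_for" mitigation listed above might look like the following minimal sketch. `AgentTimeoutError`, `run_with_timeout`, and the two demo coroutines are hypothetical names. In the real service the timeout value would presumably come from the `AGENT_TIMEOUT_SECONDS` setting rather than the literals used here.

```python
import asyncio


class AgentTimeoutError(Exception):
    """Hypothetical error type raised when an agent run exceeds its budget."""


async def run_with_timeout(coro_factory, timeout_seconds):
    """Await the coroutine from coro_factory, cancelling it after timeout_seconds."""
    try:
        return await asyncio.wait_for(coro_factory(), timeout=timeout_seconds)
    except asyncio.TimeoutError as exc:
        raise AgentTimeoutError(f"agent run exceeded {timeout_seconds}s") from exc


async def fast_run():
    # Stand-in for an agent call that finishes immediately.
    await asyncio.sleep(0)
    return "done"


async def slow_run():
    # Stand-in for an agent call that hangs past the budget.
    await asyncio.sleep(1.0)
    return "never returned"


async def demo():
    ok = await run_with_timeout(fast_run, timeout_seconds=0.5)
    try:
        await run_with_timeout(slow_run, timeout_seconds=0.05)
        timed_out = False
    except AgentTimeoutError:
        timed_out = True
    return ok, timed_out


ok_result, timed_out = asyncio.run(demo())
```

`asyncio.wait_for` cancels the inner task on timeout, so tools that hold database sessions or sockets should release them in `finally` blocks.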