Proposal: agentic-core public API — composable step functions per ADR-03
Context
ADR-03 defines the core crate's public API as a set of composable async functions that implement the agentic loop. Each function is a standalone step that can be:
- Called directly in standalone mode (execute() composes them)
- Wrapped in a Praxis HttpFilter for gateway mode
- Tested in isolation without HTTP infrastructure
The ADR intentionally left the function parameters as (...) placeholders. This issue proposes concrete type signatures, informed by:
- The ADR-02 store schema (three-table model: Conversation/Response/Item)
- The ConversationContext pattern in responses/serving.py
Proposed Types
ExecutionContext: shared state across steps

    pub struct ExecutionContext {
        /// Pluggable response/conversation store (ADR-02).
        pub store: Arc<dyn ResponseStore>,
        /// Inference backend caller.
        pub inference: Arc<dyn InferenceClient>,
        /// Tool registry for dispatch.
        pub tools: Arc<dyn ToolRegistry>,
        /// Gateway configuration (LLM base URL, timeouts, etc.)
        pub config: CoreConfig,
    }

Store trait (ADR-02 alignment)

    #[async_trait]
    pub trait ResponseStore: Send + Sync {
        /// Load a response by ID, returning its metadata and history_item_ids.
        async fn get_response(&self, response_id: &str) -> Result<StoredResponse, Error>;

        /// Bulk-fetch items by IDs. Caller is responsible for re-ordering.
        async fn get_items(&self, item_ids: &[String]) -> Result<Vec<StoredItem>, Error>;

        /// Persist a completed response with its items.
        async fn persist(
            &self,
            response_id: &str,
            conversation_id: Option<&str>,
            previous_response_id: Option<&str>,
            items: &[Item],
            metadata: &ResponseMetadata,
        ) -> Result<(), Error>;
    }

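Since get_items leaves re-ordering to the caller, rehydration needs a small step that puts the fetched items back into history_item_ids order. The following is a minimal sketch of that step, not the proposed implementation: the StoredItem shape shown here (an id plus an opaque string payload) and the reorder_items helper are hypothetical, and silently skipping missing IDs is an assumption that a real store might replace with an error.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the proposal's StoredItem; only `id` matters here.
#[derive(Debug, Clone, PartialEq)]
pub struct StoredItem {
    pub id: String,
    pub data: String,
}

/// Re-order bulk-fetched items so they follow `history_item_ids`.
/// IDs without a matching item are skipped (an assumption of this sketch).
pub fn reorder_items(history_item_ids: &[String], fetched: Vec<StoredItem>) -> Vec<StoredItem> {
    // Index the fetched items by ID for O(1) lookup.
    let by_id: HashMap<String, StoredItem> = fetched
        .into_iter()
        .map(|item| (item.id.clone(), item))
        .collect();

    // Walk the authoritative ID list and pull items out in that order.
    history_item_ids
        .iter()
        .filter_map(|id| by_id.get(id).cloned())
        .collect()
}
```

A HashMap lookup keeps the step linear in the number of items, which matters for long conversations.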
Inference trait

    #[async_trait]
    pub trait InferenceClient: Send + Sync {
        /// Send a request to the LLM and return a stream of output chunks.
        /// The stream yields typed events (delta text, tool calls, done).
        async fn call(
            &self,
            request: &InferenceRequest,
        ) -> Result<Pin<Box<dyn Stream<Item = Result<InferenceEvent, Error>> + Send>>, Error>;
    }

Tool dispatch trait

    #[async_trait]
    pub trait ToolRegistry: Send + Sync {
        /// Execute a tool call, returning the result.
        async fn dispatch(&self, call: &ToolCall) -> Result<ToolResult, Error>;

        /// Check if a tool is registered and available.
        fn has_tool(&self, name: &str) -> bool;
    }

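To make the dispatch contract concrete, here is a rough in-memory registry of the kind a unit test might use. It is a synchronous stand-in, so it does not implement the async trait above. InMemoryTools, ToolError, and the closure-based tool type are hypothetical names introduced for illustration. ToolCall and ToolResult mirror the fields listed under Key Domain Types.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone)]
pub struct ToolCall {
    pub id: String,
    pub call_id: String,
    pub name: String,
    pub arguments: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ToolResult {
    pub call_id: String,
    pub output: String,
}

/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
pub enum ToolError {
    UnknownTool(String),
}

/// A tool is modelled as a closure from raw JSON arguments to raw output.
type ToolFn = Box<dyn Fn(&str) -> String + Send + Sync>;

#[derive(Default)]
pub struct InMemoryTools {
    tools: HashMap<String, ToolFn>,
}

impl InMemoryTools {
    pub fn register(&mut self, name: &str, f: impl Fn(&str) -> String + Send + Sync + 'static) {
        self.tools.insert(name.to_string(), Box::new(f));
    }

    pub fn has_tool(&self, name: &str) -> bool {
        self.tools.contains_key(name)
    }

    /// Look up the tool by name and echo the call_id back on the result,
    /// so the output can be matched to the originating call.
    pub fn dispatch(&self, call: &ToolCall) -> Result<ToolResult, ToolError> {
        let tool = self
            .tools
            .get(&call.name)
            .ok_or_else(|| ToolError::UnknownTool(call.name.clone()))?;
        Ok(ToolResult {
            call_id: call.call_id.clone(),
            output: (tool)(call.arguments.as_str()),
        })
    }
}
```

Carrying call_id through to the result is the piece that lets tool outputs be appended to the conversation against the right call.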
Proposed Step Functions
1. rehydrate_conversation

    /// Reconstruct conversation history from a previous_response_id.
    ///
    /// Loads the stored response, fetches referenced items, restores ordering,
    /// and returns the full conversation ready for the next inference call.
    pub async fn rehydrate_conversation(
        previous_response_id: &str,
        ctx: &ExecutionContext,
    ) -> Result<Conversation, Error>

Returns: Conversation containing ordered history items + metadata from the prior turn.
2. call_inference

    /// Send the conversation to the LLM backend and collect the response.
    ///
    /// For non-streaming consumers, collects the full response.
    /// For streaming, returns a stream handle.
    pub async fn call_inference(
        conversation: &Conversation,
        request: &InferenceRequest,
        ctx: &ExecutionContext,
    ) -> Result<InferenceResult, Error>

Returns: InferenceResult containing output items + any tool calls detected.
3. dispatch_tools

    /// Execute all tool calls from an inference result.
    ///
    /// Dispatches each call through the ToolRegistry, collecting results.
    /// Returns the tool outputs ready to be appended to the conversation.
    pub async fn dispatch_tools(
        tool_calls: &[ToolCall],
        ctx: &ExecutionContext,
    ) -> Result<Vec<ToolResult>, Error>

4. assemble_response

    /// Build the final API response from inference output and tool results.
    ///
    /// Produces the ResponsesResponse object with stable item IDs,
    /// proper lifecycle events, and correct output typing.
    pub async fn assemble_response(
        request_id: &str,
        output: &InferenceResult,
        model: &str,
        metadata: &ResponseMetadata,
    ) -> Result<ResponsesResponse, Error>

5. persist_response

    /// Save the completed response and its items to the store.
    ///
    /// Only persists when store is enabled and the request requires it.
    pub async fn persist_response(
        response: &ResponsesResponse,
        conversation_id: Option<&str>,
        previous_response_id: Option<&str>,
        ctx: &ExecutionContext,
    ) -> Result<(), Error>

6. execute — the convenience orchestrator

    /// Run the full agentic loop: rehydrate → infer → (tool loop) → assemble → persist.
    ///
    /// This is the default composition of the step functions with standard
    /// iteration policy (max_iterations capped, tool results fed back).
    pub async fn execute(
        request: ResponsesRequest,
        ctx: &ExecutionContext,
    ) -> Result<ResponsesResponse, Error>

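The "max_iterations capped, tool results fed back" policy can be sketched without any async or HTTP machinery. The version below is a simplified synchronous illustration under stated assumptions: Turn, LoopError, and run_loop are hypothetical names, history is reduced to plain strings, and inference and tool dispatch are passed in as closures rather than going through ExecutionContext.

```rust
/// Outcome of one (hypothetical, synchronous) inference turn.
pub enum Turn {
    /// The model produced a final answer.
    Final(String),
    /// The model requested these tools; names stand in for full tool calls.
    ToolCalls(Vec<String>),
}

/// Hypothetical error for this sketch: the cap was hit with tools still pending.
#[derive(Debug, PartialEq)]
pub enum LoopError {
    MaxIterationsExceeded(usize),
}

/// Infer, run any requested tools, append their outputs to the history,
/// and repeat until a final answer arrives or the cap is reached.
pub fn run_loop(
    mut history: Vec<String>,
    max_iterations: usize,
    mut infer: impl FnMut(&[String]) -> Turn,
    mut run_tool: impl FnMut(&str) -> String,
) -> Result<String, LoopError> {
    for _ in 0..max_iterations {
        match infer(history.as_slice()) {
            Turn::Final(text) => return Ok(text),
            Turn::ToolCalls(calls) => {
                // Feed every tool result back so the next turn can see it.
                for call in &calls {
                    history.push(run_tool(call.as_str()));
                }
            }
        }
    }
    // Falling out of the loop means the model never stopped asking for tools.
    Err(LoopError::MaxIterationsExceeded(max_iterations))
}
```

Returning an explicit error when the cap is hit is one possible choice. The proposal does not say what execute() should do in that case.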
Key Domain Types

    pub struct Conversation {
        pub id: Option<String>,
        pub history: Vec<Item>,
        pub metadata: ConversationMetadata,
    }

    pub struct Item {
        pub id: String,
        pub data: serde_json::Value, // model-agnostic
    }

    pub struct InferenceRequest {
        pub model: String,
        pub input: Vec<serde_json::Value>, // history + new input
        pub tools: Vec<ToolConfig>,
        pub stream: bool,
        pub params: serde_json::Map<String, serde_json::Value>, // passthrough
    }

    pub struct InferenceResult {
        pub output: Vec<OutputItem>,
        pub tool_calls: Vec<ToolCall>,
        pub usage: Option<Usage>,
    }

    pub struct ToolCall {
        pub id: String,
        pub call_id: String,
        pub name: String,
        pub arguments: String,
    }

    pub struct ToolResult {
        pub call_id: String,
        pub output: String,
    }

Composability Example
Standalone mode (default execute())

    let response = agentic_core::execute(request, &ctx).await?;

Custom loop with guardrails

    let conversation = rehydrate_conversation(&prev_id, &ctx).await?;
    let mut inference_req = build_request(&conversation, &request);

    for _ in 0..max_iterations {
        let result = call_inference(&conversation, &inference_req, &ctx).await?;
        if result.tool_calls.is_empty() {
            return assemble_response(&req_id, &result, &model, &meta).await;
        }
        // Custom guardrail: validate tool calls before dispatch
        validate_tool_calls(&result.tool_calls)?;
        let tool_results = dispatch_tools(&result.tool_calls, &ctx).await?;
        append_tool_results(&mut inference_req, &result, &tool_results);
    }
    // Reaching this point means max_iterations was exhausted with tool calls
    // still pending; the caller decides how to report that.

Open Questions
Streaming + tool loop tension: SSE streaming to clients requires forwarding chunks in real-time. But the tool loop needs to collect the full output to detect tool calls before iterating. vLLM solves this with a ConversationContext that accumulates output while yielding each chunk to the SSE stream — effectively "tee-ing" the stream. Should call_inference return a stream that the caller both forwards and accumulates, or should we have separate streaming/non-streaming paths? Proposal: call_inference returns a Stream<InferenceEvent>. For the tool-loop path, execute() collects internally. For direct streaming, the server layer can forward events as they arrive and buffer for tool detection simultaneously.
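The "tee" idea in the question above can be illustrated with a plain iterator standing in for the async stream. This is only a sketch of the pattern, assuming a hypothetical InferenceEvent shape (the proposal names delta text, tool calls, and done, but not their fields) and hypothetical Accumulated and tee_events names. An async version would follow the same structure over a Stream.

```rust
/// Hypothetical event shape; the proposal only names the three kinds.
#[derive(Debug, Clone, PartialEq)]
pub enum InferenceEvent {
    DeltaText(String),
    ToolCall { name: String, arguments: String },
    Done,
}

/// What the tool loop needs once the stream has finished.
#[derive(Debug, Default, PartialEq)]
pub struct Accumulated {
    pub text: String,
    pub tool_calls: Vec<(String, String)>,
}

/// Forward every event as it arrives (for example to an SSE writer)
/// while also accumulating the full output for tool detection.
pub fn tee_events(
    events: impl IntoIterator<Item = InferenceEvent>,
    mut forward: impl FnMut(&InferenceEvent),
) -> Accumulated {
    let mut acc = Accumulated::default();
    for event in events {
        // The client sees the event before any accumulation happens.
        forward(&event);
        match event {
            InferenceEvent::DeltaText(chunk) => acc.text.push_str(&chunk),
            InferenceEvent::ToolCall { name, arguments } => {
                acc.tool_calls.push((name, arguments));
            }
            InferenceEvent::Done => break,
        }
    }
    acc
}
```

With this shape a single code path serves both consumers: a non-streaming caller passes a no-op forwarder, and a streaming caller passes the SSE writer.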
Error granularity: Should each step have its own error type, or share a unified Error enum? Proposal: single Error enum with variants per failure domain (Store, Inference, ToolDispatch, Serialization).
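A unified enum along the lines proposed might look like the sketch below. The four variant names come from the proposal. The String payloads, the Display messages, and the StoreTimeout and load_response helpers are assumptions added to show how a step-specific failure could be converted into the shared type.

```rust
use std::fmt;

/// One variant per failure domain; String payloads are an assumption.
#[derive(Debug)]
pub enum Error {
    Store(String),
    Inference(String),
    ToolDispatch(String),
    Serialization(String),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::Store(msg) => write!(f, "store error: {}", msg),
            Error::Inference(msg) => write!(f, "inference error: {}", msg),
            Error::ToolDispatch(msg) => write!(f, "tool dispatch error: {}", msg),
            Error::Serialization(msg) => write!(f, "serialization error: {}", msg),
        }
    }
}

impl std::error::Error for Error {}

/// Hypothetical backend-specific failure, mapped into the Store domain.
pub struct StoreTimeout {
    pub millis: u64,
}

impl From<StoreTimeout> for Error {
    fn from(e: StoreTimeout) -> Self {
        Error::Store(format!("timed out after {} ms", e.millis))
    }
}

/// Hypothetical step showing the conversion at the call site.
pub fn load_response(simulate_timeout: bool) -> Result<&'static str, Error> {
    if simulate_timeout {
        return Err(StoreTimeout { millis: 50 }.into());
    }
    Ok("resp_1")
}
```

Callers that care about the failure domain can match on the variant, while callers that only log can rely on Display.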
Where does SSE assembly live? assemble_response produces the final API object, but SSE event framing (data: ...\n\n) is HTTP-level. Proposal: assembly produces the typed response; SSE framing lives in agentic-server.
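To show how small the HTTP-level piece is once assembly yields a typed or serialized payload, here is a sketch of the framing step on its own. The sse_frame helper is a hypothetical name, and the event names used by the real server layer are not specified in this proposal. The framing itself follows the standard server-sent events wire format: an optional event: line, one data: line per payload line, and a blank line as terminator.

```rust
/// Frame one payload as a server-sent event.
/// `event` is the optional event name; `data` is the already-serialized payload.
pub fn sse_frame(event: Option<&str>, data: &str) -> String {
    let mut out = String::new();
    if let Some(name) = event {
        out.push_str("event: ");
        out.push_str(name);
        out.push('\n');
    }
    // A payload containing newlines must be split across several data: lines.
    for line in data.split('\n') {
        out.push_str("data: ");
        out.push_str(line);
        out.push('\n');
    }
    // A blank line terminates the event.
    out.push('\n');
    out
}
```

Because the function only sees strings, it can live in agentic-server without the core crate knowing anything about HTTP.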
Design Goals
- serde_json::Value at boundaries, typed internally: the gateway is model-agnostic; we don't enforce a fixed item schema at the API level
- ExecutionContext carries optional cache hints without breaking the API

Implementation Plan
Phase 1: Type definitions + traits (this issue)
- crates/agentic-core/src/types.rs
- crates/agentic-core/src/traits.rs
- Stubs with todo!() bodies
- Existing PRs (#33 "agentic-core conversation and responses database CRUD", #34 "add OGX integration with agentic loop and state hydration") can align their interfaces
Phase 2: rehydrate_conversation + persist_response
- ResponseStore trait
- #33's SQLx backend becomes one implementation of the trait
Phase 3: call_inference
- InferenceClient trait
Phase 4: dispatch_tools + execute
- ToolRegistry implementation
- execute() ties all steps together with default iteration policy
Phase 5: Praxis filter wrappers
- HttpFilter impls in agentic-praxis that delegate to step functions

Relationship to Existing PRs
- ResponseStore trait implementation. Types align with StoredResponse/StoredItem.
- OgxStore becomes a ResponseStore impl. The agentic loop in agentic.rs becomes the reference for execute().
- ExecutionContext is extensible: a session_cache: Option<Arc<dyn SessionCache>> field can be added without breaking the API.

One further open question concerns serde_json::Value vs typed items: PR #34 uses Vec<serde_json::Value> for input items (maximum flexibility, no schema enforcement), while PR #33 defines a typed InOutItem enum. Proposal: core API uses serde_json::Value at boundaries; typed wrappers are optional convenience in a types module.

cc @leseb: as discussed, happy to drive the implementation. Would appreciate feedback on the trait boundaries and type choices before I start coding.