The tool engine adds function-calling capabilities to the voice pipeline. The LLM decides when to call tools — the pipeline executes them and feeds results back.
    transcript → LLM (with tool definitions)
                 │
                 ├─ text response → TTS → audio
                 │
                 └─ tool_call(name, args) → ToolExecutor → result
                                                             │
                                                    inject into context
                                                             │
                                                    LLM (with result) → TTS → audio
- Tool definitions are passed to the LLM via `set_tools()`.
- The LLM receives the conversation and decides whether to call a tool.
- If the LLM returns `ToolCall` requests, the pipeline executes the matching tools.
- Tool results are injected as `Tool` messages in the conversation.
- The LLM is called again with the updated conversation.
- The final text response is spoken via TTS.
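
The steps above can be sketched as a small loop. This is an illustration only, not the pipeline's actual implementation: `run_turn`, `Message`, `ChatFn`, `Handler`, and the `max_rounds` guard are all hypothetical names introduced here, and only `ToolCall` and `LLMResponse` mirror structs shown in this document.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// ToolCall and LLMResponse mirror the structs documented on this page.
// Message, Handler, and ChatFn are hypothetical stand-ins.
struct ToolCall { std::string name; std::string arguments; };
struct LLMResponse { std::string text; std::vector<ToolCall> tool_calls; };
struct Message { std::string role; std::string content; };

using Handler = std::function<std::string(const std::string&, const std::string&)>;
using ChatFn = std::function<LLMResponse(const std::vector<Message>&)>;

// Runs one user turn: call the LLM, execute any requested tools, feed the
// results back, and return the text that would be handed to TTS.
std::string run_turn(const ChatFn& chat,
                     const std::map<std::string, Handler>& tools,
                     std::vector<Message>& conversation,
                     int max_rounds = 4) {  // assumed guard against endless tool loops
    for (int round = 0; round < max_rounds; ++round) {
        LLMResponse response = chat(conversation);
        if (response.tool_calls.empty()) {
            conversation.push_back({"assistant", response.text});
            return response.text;  // final answer
        }
        for (const ToolCall& call : response.tool_calls) {
            auto it = tools.find(call.name);
            std::string result = (it != tools.end())
                ? it->second(call.name, call.arguments)
                : "error: unknown tool " + call.name;
            conversation.push_back({"tool", result});  // inject into context
        }
    }
    return "";  // the model kept requesting tools; give up on this turn
}
```

Passing the LLM as a plain function makes the loop easy to exercise with a fake model that requests a tool once and then answers from the injected result.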
Tools can use either a callback handler (for platform consumers like Swift/Kotlin) or a shell command (for CLI usage). The handler takes priority when both are set.
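
The priority rule can be pictured with a short sketch. The `ToolDefinition` below is a trimmed-down, hypothetical copy holding only the two relevant fields, and `select_backend` and `Backend` are names invented for this illustration rather than part of the library.

```cpp
#include <functional>
#include <string>

// Hypothetical, reduced ToolDefinition: only the fields that affect dispatch.
struct ToolDefinition {
    std::string name;
    std::function<std::string(const std::string&, const std::string&)> handler;
    std::string command;
};

enum class Backend { Handler, Command, None };

// The handler wins whenever it is set; the shell command is the fallback.
Backend select_backend(const ToolDefinition& tool) {
    if (tool.handler) return Backend::Handler;
    if (!tool.command.empty()) return Backend::Command;
    return Backend::None;
}
```

The examples that follow define the same tool with each backend.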
ToolDefinition tool;
tool.name = "tell_time";
tool.description = "Tell the current time";
tool.cooldown = 0;
tool.handler = [](const std::string& name, const std::string& args) -> std::string {
return "3:14 PM";
};
pipeline.tool_registry().add(tool);

// Shell-command tool
ToolDefinition tool;
tool.name = "tell_time";
tool.description = "Tell the current time";
tool.triggers = {}; // optional, for standalone IntentMatcher use
tool.command = "date '+%I:%M %p'";
tool.timeout = 5; // seconds (advisory)
tool.cooldown = 30; // seconds between invocations

// Callback-based tool
const char* my_handler(const char* name, const char* args, void* ctx) {
return "3:14 PM";
}
const char* triggers[] = {"what time", NULL};
sc_tool_definition_t tool = {
.name = "tell_time",
.description = "Tell the current time",
.triggers = triggers,
.handler = my_handler,
.handler_context = NULL,
.timeout = 5,
.cooldown = 30
};
sc_pipeline_add_tool(pipeline, tool);

Tools can be loaded from JSON (shell-command tools only):
[
{
"name": "tell_time",
"description": "Tell the current time",
"triggers": [],
"command": "date '+%I:%M %p'",
"timeout": 5,
"cooldown": 30
}
]

// C++ API
pipeline.tool_registry().load_json(json_string);
// C API
sc_pipeline_load_tools_json(pipeline, json_string);

`LLMInterface::chat()` returns an `LLMResponse` containing text and/or tool calls:
struct ToolCall {
std::string name; // tool name
std::string arguments; // JSON arguments
};
struct LLMResponse {
std::string text;
std::vector<ToolCall> tool_calls;
};

The LLM implementation decides when to call tools based on the conversation and the tool definitions provided via `set_tools()`.
Each tool has a cooldown period (in seconds). After execution, the same tool cannot be triggered again until the cooldown expires. Set cooldown = 0 to disable.
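
A cooldown check of this kind might look like the sketch below. `CooldownGate` and `try_acquire` are hypothetical names, and the sketch assumes the cooldown is measured from the moment a call is admitted; the library's actual bookkeeping may differ.

```cpp
#include <chrono>
#include <string>
#include <unordered_map>

using Clock = std::chrono::steady_clock;

// Hypothetical helper: remembers when each tool last ran and rejects calls
// that arrive before that tool's cooldown has elapsed.
class CooldownGate {
public:
    // Returns true and records `now` if the tool may run, false otherwise.
    // A cooldown of 0 disables the check.
    bool try_acquire(const std::string& tool, int cooldown_seconds,
                     Clock::time_point now = Clock::now()) {
        auto it = last_run_.find(tool);
        if (cooldown_seconds > 0 && it != last_run_.end() &&
            now - it->second < std::chrono::seconds(cooldown_seconds)) {
            return false;  // still cooling down
        }
        last_run_[tool] = now;
        return true;
    }

private:
    std::unordered_map<std::string, Clock::time_point> last_run_;
};
```

A monotonic clock is used so that wall-clock adjustments cannot shorten or extend a cooldown, and the `now` parameter lets tests supply their own timestamps.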
When mask_tool_results is enabled (default), tool result messages are dropped before conversation messages during history trimming. Tool outputs are self-contained — the LLM already acted on them, so the raw output can be dropped while keeping user/assistant turns. The last 2 messages are always protected to avoid dropping the current turn's tool result before the LLM sees it.
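
The trimming order can be sketched as follows. This is an illustration under assumptions: it limits history by message count (the real limit may be token-based), and `trim_history`, `Message`, `max_messages`, and `protected_tail` are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Message { std::string role; std::string content; };  // hypothetical shape

// Trims `history` to at most `max_messages`. With masking on, the oldest
// tool results are dropped first, then the oldest remaining messages.
// The last `protected_tail` messages are never removed.
void trim_history(std::vector<Message>& history, std::size_t max_messages,
                  bool mask_tool_results = true, std::size_t protected_tail = 2) {
    // Number of leading messages that are allowed to be dropped.
    auto droppable = [&]() -> std::size_t {
        return history.size() > protected_tail ? history.size() - protected_tail : 0;
    };
    if (mask_tool_results) {
        for (std::size_t i = 0; i < droppable() && history.size() > max_messages;) {
            if (history[i].role == "tool") {
                history.erase(history.begin() + static_cast<std::ptrdiff_t>(i));
            } else {
                ++i;
            }
        }
    }
    while (history.size() > max_messages && droppable() > 0) {
        history.erase(history.begin());
    }
}
```

With a history of `user, tool, assistant, user, tool` and a limit of four, only the older tool result is removed; the newest one sits in the protected tail and survives.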
IntentMatcher is available as a standalone component for regex-based pattern matching — useful outside the pipeline for quick intent classification. The pipeline itself does not use it; the LLM decides tool calls.
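
To give a feel for regex-based intent classification, here is a minimal sketch built on `std::regex`. It is not the library's `IntentMatcher`: the class name `SimpleIntentMatcher`, its `add` and `match` methods, and the first-match-wins rule are all assumptions made for illustration.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for IntentMatcher: each intent owns a list of
// case-insensitive regex triggers; the first intent with a hit wins.
class SimpleIntentMatcher {
public:
    void add(const std::string& intent, const std::vector<std::string>& triggers) {
        std::vector<std::regex> compiled;
        for (const std::string& t : triggers) {
            compiled.emplace_back(
                t, std::regex_constants::ECMAScript | std::regex_constants::icase);
        }
        intents_.emplace_back(intent, std::move(compiled));
    }

    // Returns the matching intent name, or an empty string if nothing matched.
    std::string match(const std::string& utterance) const {
        for (const auto& entry : intents_) {
            for (const std::regex& re : entry.second) {
                if (std::regex_search(utterance, re)) return entry.first;
            }
        }
        return "";
    }

private:
    std::vector<std::pair<std::string, std::vector<std::regex>>> intents_;
};
```

Patterns are compiled once in `add`, so repeated calls to `match` only pay for the search itself.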
| Event | When | Payload |
|---|---|---|
| `ToolCallStarted` | LLM requested a tool call | `text` = tool name |
| `ToolCallCompleted` | Tool execution finished | `text` = tool output |
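
An event callback that reacts to these two events might look like the sketch below. The event type is not shown in this document, so `PipelineEvent`, `EventType`, and `ToolEventLog` are hypothetical; the only detail taken from the table is that the payload arrives in a `text` field.

```cpp
#include <string>
#include <vector>

// Hypothetical event shape; only the `text` payload is taken from the table.
enum class EventType { ToolCallStarted, ToolCallCompleted };
struct PipelineEvent { EventType type; std::string text; };

// Collects one human-readable log line per tool event.
struct ToolEventLog {
    std::vector<std::string> lines;

    void on_event(const PipelineEvent& e) {
        switch (e.type) {
            case EventType::ToolCallStarted:
                lines.push_back("started: " + e.text);    // text = tool name
                break;
            case EventType::ToolCallCompleted:
                lines.push_back("completed: " + e.text);  // text = tool output
                break;
        }
    }
};
```

A handler like this is useful for showing "checking the time..." style feedback in a UI while a slow tool is still running.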
VoicePipeline pipeline(stt, tts, &llm, vad, config, on_event);
// Callback-based tool
ToolDefinition tool;
tool.name = "tell_time";
tool.description = "Tell the current time";
tool.cooldown = 0;
tool.handler = [](const std::string& name, const std::string& args) -> std::string {
return "3:14 PM";
};
pipeline.tool_registry().add(tool);
pipeline.start();