

github-actions[bot] edited this page Apr 18, 2026 · 6 revisions

Tool Calling in TabbyAPI

Tool calling is available for supported models and is enabled by selecting a tool format in the model config. This can also be specified per model in tabby_config.yml.

Most tool-calling models are also reasoning models, so it is recommended to enable reasoning as well, with the appropriate reasoning tags (these cannot currently be inferred from the model's template):

model:
    reasoning: true
    reasoning_start_token: "<think>"
    reasoning_end_token: "</think>"
    tool_format: qwen3_5
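To illustrate what the reasoning tags above do, here is a minimal sketch (not TabbyAPI's actual implementation) of how a completion containing reasoning can be split into a hidden-reasoning segment and a final answer using those configured tokens:

```python
# Illustrative only: the tag strings mirror the reasoning_start_token /
# reasoning_end_token values in the config above.
REASONING_START = "<think>"
REASONING_END = "</think>"

def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); if no tags are present, all text is answer."""
    start = text.find(REASONING_START)
    end = text.find(REASONING_END)
    if start == -1 or end == -1:
        return "", text.strip()
    reasoning = text[start + len(REASONING_START):end].strip()
    answer = text[end + len(REASONING_END):].strip()
    return reasoning, answer

reasoning, answer = split_reasoning("<think>User wants JSON.</think>{\"ok\": true}")
```

Because the tags cannot be inferred from the template, a mismatch between the configured tokens and what the model actually emits means the reasoning is never stripped, which is why setting them explicitly matters.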

Supported formats

Below are the currently recognized formats:

| tool_format | Aliases | Model types |
| --- | --- | --- |
| qwen3_coder | qwen3_5, step3_5 | Qwen3-Coder, Qwen3-Next, Qwen3.5 |
| minimax_m2 | | Minimax-M2, Minimax-M2.1, Minimax-M2.5 |
| glm4_5 | glm4_6, glm4_7 | GLM4.5, GLM4.6, GLM4.7 |
| mistral_old ¹ | | (older Mistral-family models) |
| mistral | | Codestral 2508+, Devstral-Small 2507+, Magistral-Medium 2506+, Magistral-Small 2506+, Ministral-3 2512+, Mistral-Medium-3.1 2508+, Mistral-Small-3.2 2506+ |
| gemma4 | | Gemma 4-it |

¹ Older Mistral models tend to have unreliable tool calling support, and even newer ones are often released without official chat templates, or with templates that omit any tool formatting. Tokenization also changes frequently between model releases. Your mileage may vary.

Clients

TabbyAPI should support any software that uses the OAI tool calling API. However, the standard is still evolving: no two clients agree on exactly what it looks like, and models are trained with different assumptions as well. Below are notes on various client software and how each relates to TabbyAPI's tool calling support.
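For reference, this is the general shape of the OpenAI-style request body such clients send to the /v1/chat/completions endpoint. The tool name and parameters here ("get_weather", "city") are hypothetical placeholders, and "my-model" stands in for whatever model is loaded:

```python
# Sketch of a standard OpenAI-compatible tool calling request payload.
# Tool name, parameters, and model name are illustrative, not real values.
payload = {
    "model": "my-model",
    "messages": [
        {"role": "user", "content": "What's the weather in Paris?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",
}
```

Clients diverge mainly in the details around this shape (streaming deltas, parallel calls, how strictly they validate arguments), which is where the per-client notes below come in.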

OpenCode

  • OpenCode by default forces categorical sampling, overriding TabbyAPI's defaults with top-P = 1.0. This confuses some models, so if you're experiencing occasional random gibberish in your output, check your OpenCode config and make sure sampling is configured there, e.g.:

    "agent": {
      "build": {
        "top_p": 0.8
      },
      "plan": {
        "top_p": 0.8
      }
    }
  • OpenCode doesn't explicitly enable reasoning in the request by default. For some models this doesn't matter; for others (e.g. Gemma4) you can set force_enable_thinking: true in TabbyAPI.
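Regardless of client, tool calls come back in the standard OpenAI response shape, with arguments encoded as a JSON string. A minimal sketch of extracting them from a (hand-constructed, hypothetical) assistant message:

```python
import json

# Hypothetical assistant message in the standard OpenAI shape; a real one
# would come from the /v1/chat/completions response.
message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_0",
            "type": "function",
            "function": {
                "name": "get_weather",
                "arguments": "{\"city\": \"Paris\"}",
            },
        }
    ],
}

def extract_tool_calls(message: dict) -> list[tuple[str, dict]]:
    """Return (name, parsed-arguments) pairs from an assistant message."""
    calls = []
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        calls.append((fn["name"], json.loads(fn["arguments"])))
    return calls

calls = extract_tool_calls(message)
```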
