Gambiarra to accept tool calls.#559
Conversation
|
@microsoft-github-policy-service agree |
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds a local patching mechanism to modify the bundled llama.cpp server at build time (without moving the submodule pointer) so it can accept OpenAI-style tools / tool_choice requests and translate strict JSON tool-call outputs into OpenAI-compatible message.tool_calls.
Changes:
- Introduces a Python patch applier that updates
llama.cppsources (including a CORS preflight tweak) prior to builds. - Adds a unified diff patch that adapts the llama.cpp server utilities for tool calls / tool results conversion.
- Updates the environment setup script to apply patches automatically before compiling.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| utils/apply_local_patches.py | Implements local unified-diff patch application and a direct CORS block replacement. |
| setup_env.py | Runs the local patch step automatically before build steps. |
| patches/llama-server-tools.patch | Patch content that modifies llama.cpp server utils to support tool calls and tool result formatting. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (2)
patches/llama-server-tools.patch:1
nlohmann::json::contains()throwstype_errorwhen the JSON value is not an object. If the model outputs valid JSON that’s an array/number/string (e.g.,[],"ok"), this will crash the server instead of returning “no tool calls”. Add an early guard likeif (!parsed.is_object()) return json::array();(or equivalent) before callingcontains().
diff --git a/examples/server/utils.hpp b/examples/server/utils.hpp
patches/llama-server-tools.patch:1
- This fallback treats any JSON object containing
"name"(or"function") as a tool call, which can misclassify normal “JSON-only” answers (common for structured output) as tool invocations and incorrectly setfinish_reason = "tool_calls". Tighten the detection criteria (e.g., requiretool_calls/tool_call, or requireargumentsalongsidename, and/or only attempt this upgrade when the request actually providedtools).
diff --git a/examples/server/utils.hpp b/examples/server/utils.hpp
Adds a patch mechanism to support OpenAI-style tool calls in the bundled llama.cpp server without changing the submodule pointer.
Changes
Adds patches/llama-server-tools.patch.
Adds utils/apply_local_patches.py.
Updates setup_env.py to apply local patches before building.
Enables the server to accept tools / tool_choice requests instead of rejecting them.
Converts strict JSON tool-call responses into OpenAI-compatible message.tool_calls.
Handles both tool_calls and tool_call response shapes.
Converts tool result messages into user-visible observations so the model can produce a final answer.