extension/llm/server: document pi integration#19999
Open
mergennachin wants to merge 8 commits into
Open
Conversation
[ghstack-poisoned]
Contributor
Author
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19999
Note: Links to docs will display an error until the docs builds have been completed. ❌ 3 New Failures, 1 Unrelated FailureAs of commit 83986e5 with merge base eeb0646 ( NEW FAILURES - The following jobs have failed:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
This was referenced Jun 3, 2026
[ghstack-poisoned]
mergennachin
added a commit
that referenced
this pull request
Jun 3, 2026
Add an operational recipe to the server README for pointing pi (or any OpenAI-compatible harness) at the ExecuTorch server for local tool-use: the launch command, useful flags (--no-think / --enable-prefix-cache / --max-context / --allow-chatml-fallback), client base_url/model/api_key settings, the supported chat-completions + Hermes/Qwen tool-call contract (only tool_choice auto/none/unset; response_format/logprobs/top_p!=1/seed rejected), and reliability guidance. Docs only; no runtime or dependency changes. ghstack-source-id: 672db61 ghstack-comment-id: 4617420672 Pull-Request: #19999
[ghstack-poisoned]
mergennachin
added a commit
that referenced
this pull request
Jun 4, 2026
Add an operational recipe to the server README for pointing pi (or any OpenAI-compatible harness) at the ExecuTorch server for local tool-use: the launch command, useful flags (--no-think / --enable-prefix-cache / --max-context / --allow-chatml-fallback), client base_url/model/api_key settings, the supported chat-completions + Hermes/Qwen tool-call contract (only tool_choice auto/none/unset; response_format/logprobs/top_p!=1/seed rejected), and reliability guidance. Docs only; no runtime or dependency changes. ghstack-source-id: 9e816d2 ghstack-comment-id: 4617420672 Pull-Request: #19999
psiddh
approved these changes
Jun 4, 2026
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Add an operational recipe to the server README for pointing pi (or any
OpenAI-compatible harness) at the ExecuTorch server for local tool-use: the
launch command, useful flags (--no-think / --max-context /
--allow-chatml-fallback), client base_url/model/api_key settings, the supported
chat-completions + Hermes/Qwen tool-call contract (only tool_choice
auto/none/unset; response_format/logprobs/top_p!=1/seed rejected), and
reliability guidance. Docs only; no runtime or dependency changes.
Part of #20001