extension/llm/server: document pi integration#19999

Open
mergennachin wants to merge 8 commits into gh/mergennachin/5/head from gh/mergennachin/6/head

mergennachin commented Jun 3, 2026

Contributor

Add an operational recipe to the server README for pointing pi (or any
OpenAI-compatible harness) at the ExecuTorch server for local tool-use: the
launch command, useful flags (--no-think / --max-context /
--allow-chatml-fallback), client base_url/model/api_key settings, the supported
chat-completions + Hermes/Qwen tool-call contract (only tool_choice
auto/none/unset; response_format/logprobs/top_p!=1/seed rejected), and
reliability guidance. Docs only; no runtime or dependency changes.
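
For reviewers who want a quick client-side sanity check, the request contract described above can be sketched roughly as follows. This is an illustration only, not code from the README or the server: `contract_violations` is a hypothetical helper name, and treating any non-null `response_format`, `logprobs`, or `seed` value as "rejected" is an assumption about how the server reads those fields.

```python
from typing import Any

# Fields the description lists as rejected (assumption: any non-null value counts).
REJECTED_FIELDS = ("response_format", "logprobs", "seed")
# tool_choice values the description lists as supported, besides leaving it unset.
ALLOWED_TOOL_CHOICE = ("auto", "none")


def contract_violations(payload: dict[str, Any]) -> list[str]:
    """Return reasons a chat-completions payload falls outside the described contract."""
    problems: list[str] = []

    tool_choice = payload.get("tool_choice")
    if tool_choice is not None and tool_choice not in ALLOWED_TOOL_CHOICE:
        problems.append(f"tool_choice={tool_choice!r} (only auto/none/unset)")

    for field in REJECTED_FIELDS:
        if payload.get(field) is not None:
            problems.append(f"{field} is set")

    top_p = payload.get("top_p")
    if top_p is not None and top_p != 1:
        problems.append(f"top_p={top_p} (must be 1 or unset)")

    return problems
```

A harness could run a check like this before sending a request, so an unsupported field shows up as a readable local message instead of a server-side rejection.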

Part of #20001

[ghstack-poisoned]

pytorch-bot Bot commented Jun 3, 2026


🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19999

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 1 Unrelated Failure

As of commit 83986e5 with merge base eeb0646:

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

[ghstack-poisoned]
mergennachin added a commit that referenced this pull request Jun 3, 2026
Add an operational recipe to the server README for pointing pi (or any
OpenAI-compatible harness) at the ExecuTorch server for local tool-use: the
launch command, useful flags (--no-think / --enable-prefix-cache / --max-context
/ --allow-chatml-fallback), client base_url/model/api_key settings, the
supported chat-completions + Hermes/Qwen tool-call contract (only
tool_choice auto/none/unset; response_format/logprobs/top_p!=1/seed rejected),
and reliability guidance. Docs only; no runtime or dependency changes.

ghstack-source-id: 672db61
ghstack-comment-id: 4617420672
Pull-Request: #19999
[ghstack-poisoned]
mergennachin added a commit that referenced this pull request Jun 4, 2026
Add an operational recipe to the server README for pointing pi (or any
OpenAI-compatible harness) at the ExecuTorch server for local tool-use: the
launch command, useful flags (--no-think / --enable-prefix-cache / --max-context
/ --allow-chatml-fallback), client base_url/model/api_key settings, the
supported chat-completions + Hermes/Qwen tool-call contract (only
tool_choice auto/none/unset; response_format/logprobs/top_p!=1/seed rejected),
and reliability guidance. Docs only; no runtime or dependency changes.

ghstack-source-id: 9e816d2
ghstack-comment-id: 4617420672
Pull-Request: #19999
[ghstack-poisoned]
@mergennachin mergennachin marked this pull request as ready for review June 5, 2026 19:00
[ghstack-poisoned]
Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

2 participants