fix(provider-swift): map defective chat-template render to 422, not cascading 500 (#242) by shwniscool · Pull Request #248 · Layr-Labs/d-inference

shwniscool · 2026-05-29T17:15:33Z

What

Fixes the 500 "upper filter requires string" error thrown by provider-swift whenever a request carrying tool definitions is routed to a model whose chat_template.jinja is not portable to swift-jinja, specifically mlx-community/gemma-4-26b-a4b-it-8bit (#242).

Why

The model's chat_template.jinja renders X | upper against an Undefined/None value. CPython's jinja2 (used by oMLX and other backends) is permissive and propagates Undefined/"" through the filter, so the template "works" everywhere else. swift-jinja is stricter and raises "upper filter requires string".
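The difference in filter behaviour can be sketched in a few lines of Swift. This is only a toy model of the two behaviours described above: `TemplateValue`, `FilterError`, `strictUpper`, `permissiveUpper`, and `defaultThenUpper` are hypothetical names introduced for illustration and are not part of swift-jinja or jinja2.

```swift
// Hypothetical stand-in for a template runtime value (not swift-jinja's real type).
enum TemplateValue {
    case string(String)
    case undefined
}

// Hypothetical error type carrying the message quoted in the issue.
struct FilterError: Error, Equatable {
    let message: String
}

// Strict behaviour, as this PR describes swift-jinja: a non-string input is rejected.
func strictUpper(_ value: TemplateValue) throws -> String {
    guard case .string(let s) = value else {
        throw FilterError(message: "upper filter requires string")
    }
    return s.uppercased()
}

// Permissive behaviour, as this PR describes CPython jinja2: Undefined renders as "".
func permissiveUpper(_ value: TemplateValue) -> String {
    if case .string(let s) = value { return s.uppercased() }
    return ""
}

// Models the template-side patch `(X | default('')) | upper`:
// substitute "" for Undefined before the strict filter ever sees it.
func defaultThenUpper(_ value: TemplateValue) throws -> String {
    if case .undefined = value { return try strictUpper(.string("")) }
    return try strictUpper(value)
}
```

With a defined string all three functions agree; only the Undefined input separates them, which is why the template appears to work on permissive backends.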

The render happens inside MultiModelBatchSchedulerEngine.streamChatCompletion (and applyTemplate), which rethrew the raw error verbatim. The error then fell through mapInferenceErrorToStatus to a generic 500. The coordinator reads a 500 as a provider fault and reroutes the request, so a deterministic, request-shaped failure turns into a cascading "model load failed" across every provider serving the model, impacting all agent harnesses that send tools.
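To make the cascade concrete, here is a minimal sketch of the routing decision as described above. The coordinator's real code is not shown in this PR, so `CoordinatorAction`, `action(forStatus:)`, and `providersMarkedFaulty` are assumptions made purely for illustration.

```swift
// Hypothetical model of what the coordinator does with a provider response.
enum CoordinatorAction: Equatable {
    case deliver               // success: hand the response back to the caller
    case failRequest           // request-shaped error: fail cleanly, provider stays healthy
    case rerouteAndMarkFaulty  // provider fault: mark it faulty and try the next provider
}

// Assumed policy: 5xx means provider fault, any other non-2xx is the request's problem.
func action(forStatus status: Int) -> CoordinatorAction {
    switch status {
    case 200..<300: return .deliver
    case 500..<600: return .rerouteAndMarkFaulty
    default: return .failRequest
    }
}

// Simulates a deterministic failure: every provider returns the same status for the
// same request. Returns how many providers end up marked faulty.
func providersMarkedFaulty(statusFromEveryProvider status: Int, providerCount: Int) -> Int {
    var faulty = 0
    for _ in 0..<providerCount {
        switch action(forStatus: status) {
        case .rerouteAndMarkFaulty:
            faulty += 1        // reroute to the next provider, which fails the same way
        case .deliver, .failRequest:
            return faulty      // routing stops here
        }
    }
    return faulty
}
```

Under this assumed policy, a 500 from three providers marks all three faulty, while a 422 stops at the first provider with none marked.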

What this PR changes (provider-side defensive guard — the "should" item in the issue)

- New typed error MultiModelBatchSchedulerEngineError.templateRenderingFailed(String).
- Defensive try/catch around applyChatTemplate in both streamChatCompletion and applyTemplate: any non-typed render failure is wrapped into .templateRenderingFailed (the verbatim underlying message is preserved for operator debugging); already-typed engine errors pass through unchanged.
- Status mapping .templateRenderingFailed → 422 in ProviderLoop+ErrorMapping. A 422 fails the request cleanly without marking the provider faulty or triggering a reroute.
- Tests (MultiModelBatchSchedulerEngineTests): the 422 mapping, verbatim-message preservation, and a tokenizer-driven applyTemplate test that sends a tool definition and asserts the swift-jinja failure surfaces as .templateRenderingFailed → 422 (no live model required).
- CHANGELOG entry under Provider (Swift) → Bug Fixes.
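The shape of the guard and the status mapping listed above might look roughly like the following. This is a simplified sketch, not the PR's diff: the enum is reduced to two cases, and `UpstreamRenderError`, `renderGuarded`, and `status(for:)` are hypothetical stand-ins (the last one for mapInferenceErrorToStatus, whose real signature is not shown here).

```swift
// Reduced stand-in for the engine's error enum; the real one has more cases.
enum MultiModelBatchSchedulerEngineError: Error, Equatable {
    case queueFull
    case templateRenderingFailed(String)
}

// Hypothetical untyped error, standing in for whatever the template renderer throws.
struct UpstreamRenderError: Error, CustomStringConvertible {
    let description: String
}

// Hypothetical helper: run the render step, pass typed engine errors through
// unchanged, and wrap anything else while preserving its message verbatim.
func renderGuarded<T>(_ render: () throws -> T) throws -> T {
    do {
        return try render()
    } catch let error as MultiModelBatchSchedulerEngineError {
        throw error
    } catch {
        throw MultiModelBatchSchedulerEngineError.templateRenderingFailed(
            String(describing: error)
        )
    }
}

// Hypothetical status mapping: template failures become 422, the rest stay 500.
func status(for error: Error) -> Int {
    guard let typed = error as? MultiModelBatchSchedulerEngineError else { return 500 }
    switch typed {
    case .templateRenderingFailed: return 422
    case .queueFull: return 500
    }
}
```

The point of the typed case is that the mapping layer can tell a request-shaped failure from a provider fault without parsing message strings.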

Remaining items from the issue (NOT in this PR — they are publish/infra, not repo code)

These require artifacts/access outside this repo and the actual re-vended template:

- Patch chat_template.jinja: replace X | upper with (X | default('')) | upper throughout. The template is not in this repo; it ships inside the model snapshot on R2.
- Re-vend the patched template from a new R2 prefix.
- Bump aggregate_sha256 in the coordinator catalog. This lives in the model_versions Postgres table (r2_prefix + aggregate_sha256), populated at publish time, not in a static repo file. It must be set to the real digest produced by the re-vend.
- (Optional) Upstream the template patch to mlx-community/gemma-4-26b-a4b-it-8bit.

Once the patched template is re-vended, the 422 guard here remains valuable as defense-in-depth against any future non-portable template.

Verification note

provider-swift is a macOS/MLX Swift package and its dependencies (MLXLMServer, mlx-swift-lm) do not build in this Linux CI sandbox, so I could not run swift test here. The new tests are written against the verified upstream type signatures (ApplyTemplateRequest, OpenAITool, MLXLMCommon.Tokenizer) and mirror existing test patterns; please confirm they pass in the macOS test job.

Resolves #242 (provider-side guard).


…ascading 500 (Layr-Labs#242)

When a model's chat_template.jinja throws while rendering a request that carries tool definitions (e.g. mlx-community/gemma-4-26b-a4b-it-8bit's `X | upper` on an Undefined value, which CPython jinja2 tolerates but swift-jinja rejects with "upper filter requires string"), MultiModelBatchSchedulerEngine rethrew the raw error. It fell through mapInferenceErrorToStatus to a generic 500, which the coordinator reads as a provider fault and reroutes -- cascading into "model load failed" across every provider that serves the model.

Add a typed MultiModelBatchSchedulerEngineError.templateRenderingFailed case and wrap applyChatTemplate failures in both streamChatCompletion and applyTemplate. Map it to 422 (unprocessable) so the request fails cleanly and the provider stays healthy. Add unit + tokenizer-driven tests covering the tool-definition render path.

Note: the underlying template defect, R2 re-vend, and coordinator aggregate_sha256 bump are publish/infra steps tracked separately in the issue -- this PR is the provider-side defensive guard.

vercel · 2026-05-29T17:15:38Z

Someone is attempting to deploy a commit to the EigenLabs Team on Vercel.

A member of the Team first needs to authorize it.

hankbobtheresearchoor

Review Summary

Clean, well-scoped PR. The root cause is clear (swift-jinja is stricter than CPython jinja2), the fix is defensive in the right place (engine level, before the error hits the coordinator), and the 422 mapping is semantically correct (the request is unprocessable given this model's template, not a transient provider fault).

Build + test results (macOS, M3 Ultra):

swift build --product darkbloom -c debug — clean (1674 steps)
All 3 new unit tests pass (status mapping, message preservation, applyTemplate integration)
All 3 existing error-map tests still pass
Full ProviderCore test suite passes

One minor observation below, nothing blocking.

LGTM 🚢

hankbobtheresearchoor · 2026-05-29T17:41:13Z

                messages: messages, tools: toolSpecs, additionalContext: nil
            )
-        } catch {
+        } catch let error as MultiModelBatchSchedulerEngineError {


🔵 Observation (non-blocking): This catch let error as MultiModelBatchSchedulerEngineError catches every typed engine error, not just template-related ones: queueFull, tokenBudgetExhausted, requestRejected, etc. Today that's harmless because applyChatTemplate won't throw capacity/congestion errors (those come from the scheduler, not the tokenizer). But if a future maintainer adds a new typed engine error that can be thrown from the tokenization path (e.g. a tokenizer-warming timeout), it would silently pass through here and get the wrong status code downstream. Consider narrowing this catch to only .templateRenderingFailed (the one case that can actually originate from the tokenizer), or adding a comment noting the trust boundary between tokenizer errors and scheduler errors.

Good call — narrowed in ece91e6. The pass-through now matches only .templateRenderingFailed (so its message isn't double-wrapped), and everything else out of the render block is wrapped as .templateRenderingFailed. Added a comment at both call sites documenting the tokenizer↔scheduler trust boundary you flagged, so a future typed error on the tokenizer path can't silently slip out with the wrong status.
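The narrowed pass-through described in this reply could be sketched as below. It is an illustration under assumptions, not the code from ece91e6: the enum is cut down to two cases and `renderNarrowed` is a hypothetical helper name.

```swift
// Reduced stand-in for the engine's error enum; the real one has more cases.
enum MultiModelBatchSchedulerEngineError: Error, Equatable {
    case queueFull
    case templateRenderingFailed(String)
}

// Hypothetical helper showing the narrowed catch.
// Trust boundary: everything thrown inside `render` comes from the tokenizer /
// template path, so it is reported as a template failure no matter its type.
func renderNarrowed<T>(_ render: () throws -> T) throws -> T {
    do {
        return try render()
    } catch MultiModelBatchSchedulerEngineError.templateRenderingFailed(let message) {
        // Already the right case: rethrow as-is so the message is not double-wrapped.
        throw MultiModelBatchSchedulerEngineError.templateRenderingFailed(message)
    } catch {
        // Any other error, typed or not, is wrapped with its description preserved.
        throw MultiModelBatchSchedulerEngineError.templateRenderingFailed(
            String(describing: error)
        )
    }
}
```

Compared with catching every MultiModelBatchSchedulerEngineError, this version means a typed error such as queueFull escaping the render block can no longer reach the status mapper under its original case.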

…ngFailed (review Layr-Labs#248) Address hankbob's review note on Layr-Labs#248: the broad `catch as MultiModelBatchSchedulerEngineError` passed through every typed engine error, so a future typed error thrown from the tokenizer path could silently slip out with the wrong status. Narrow the pass-through to only .templateRenderingFailed (avoids double-wrapping its message) and wrap everything else from the render block as .templateRenderingFailed. Document the tokenizer<->scheduler trust boundary at both call sites.

hankbobtheresearchoor approved these changes May 29, 2026


hankbobtheresearchoor reviewed May 29, 2026

shwniscool wants to merge 2 commits into Layr-Labs:master from shwniscool:fix/242-swift-jinja-template-render-4xx
