feat(http_server): expose tokenizer SHA256 on /get_model_info for parity verification #15
Open

DavidBellamy wants to merge 19 commits into `main`
Conversation
## Summary
Add a `tokenizer_sha256` field to the `/get_model_info` endpoint that returns a deterministic hash of the active tokenizer's canonical JSON form. This lets clients verify that two SGLang instances (or an SGLang instance and a separate trainer/embedding service) are using bit-identical tokenizers before relying on cross-process token-id assumptions.
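A client-side check could then look like the following sketch. It assumes only what this PR describes: each `/get_model_info` response is a JSON object whose `tokenizer_sha256` field may be absent or null. The helper name `tokenizers_match` is hypothetical, not part of the PR.

```python
from typing import Optional


def tokenizers_match(info_a: dict, info_b: dict) -> Optional[bool]:
    """Compare tokenizer_sha256 fields from two /get_model_info responses.

    Returns True/False when both sides report a hash, or None when either
    side could not hash its tokenizer (field absent or null), in which case
    the caller must fall back to some other consistency check.
    """
    a = info_a.get("tokenizer_sha256")
    b = info_b.get("tokenizer_sha256")
    if a is None or b is None:
        return None  # indeterminate: at least one tokenizer lacks .to_str()
    return a == b
```

Returning `None` rather than `False` for the unhashable case keeps "definitely mismatched" distinct from "could not verify", so a startup check can fail hard on the former and merely warn on the latter.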
## Why
When token IDs cross process boundaries (e.g. an inference worker emits token IDs that another component uses as input to a separate process), the two sides must use bit-identical tokenizers — including merges, special tokens, and byte fallbacks. A subtle mismatch silently corrupts downstream logic in ways that are hard to diagnose because the IDs still look plausible.
Exposing a tokenizer hash on the existing model-info endpoint gives clients a one-call way to do this consistency check at startup.
## Changes (`python/sglang/srt/entrypoints/http_server.py`)

## Behavior

- `tokenizer_sha256` is optional: `None` when the tokenizer doesn't support `.to_str()`. Existing clients ignore unknown fields, so the change is backward compatible.
- The hash is computed once and cached. No new dependencies.

## Provenance
One of five focused PRs that supersede #3.