feat(gateway): debug-level request/response logging for TITO verification #18
Open
DavidBellamy wants to merge 19 commits into main
…alistic perf and auto-discover ut (sgl-project#22086)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>

…gl-project#21649)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
…tion

Add two debug-level log calls in sgl-model-gateway/src/routers/http/router.rs that dump the request body forwarded to a worker and the response body received back, gated behind the smg::tito_debug tracing target.

Tokens-in tokens-out (TITO) mode requires the agent's extra_body["input_ids"] to land at the worker exactly as sent, and the worker's completion_token_ids to ride back unchanged. When that round-trip breaks, we want to see the actual bytes on the wire without rebuilding the gateway with extra prints.

Activation: RUST_LOG=info,smg::tito_debug=debug

Logs only fire at DEBUG level; gated by tracing::enabled!. Zero overhead at INFO/WARN. Operators turn them on only when investigating a TITO regression.
Summary
Add two debug-level log calls in `sgl-model-gateway/src/routers/http/router.rs` that dump the request body forwarded to a worker and the response body received back, gated behind the `smg::tito_debug` tracing target.
Why
Tokens-in tokens-out (TITO) mode requires the agent's `extra_body["input_ids"]` to land at the worker exactly as sent, and the worker's `completion_token_ids` to ride back unchanged. When that round-trip breaks, we want to see the actual bytes on the wire without rebuilding the gateway with extra prints.
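Checking the round-trip against the logged bodies comes down to an element-wise comparison of token ID sequences. A minimal sketch of that comparison, assuming the token IDs have already been extracted from the logged JSON as `u32` values; `first_mismatch` is a hypothetical helper for illustration and is not part of the gateway:

```rust
/// Hypothetical helper (not part of the gateway): returns the index of the
/// first position where two token-ID sequences differ, or `None` if they are
/// identical. A length difference counts as a mismatch at the shorter length.
fn first_mismatch(sent: &[u32], received: &[u32]) -> Option<usize> {
    // Compare the overlapping prefix element by element.
    if let Some(i) = sent.iter().zip(received.iter()).position(|(a, b)| a != b) {
        return Some(i);
    }
    // The prefixes agree, so any remaining difference is a truncation or an
    // extension of one sequence relative to the other.
    if sent.len() != received.len() {
        return Some(sent.len().min(received.len()));
    }
    None
}
```

Reporting an index rather than a plain boolean points at where the sequence was altered, which helps distinguish a truncation from a re-tokenization.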
Gating behind a custom tracing target (`smg::tito_debug`) lets operators flip TITO logs on with a single env var without enabling all of `smg`.
Activation
```
RUST_LOG=info,smg::tito_debug=debug
```
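To illustrate why this filter string enables the TITO logs without enabling debug output for the rest of `smg`, here is a simplified model of how a comma-separated directive list resolves: the most specific matching target prefix wins, and a bare level acts as the global default. This is a sketch under those assumptions, not the actual `tracing-subscriber` implementation (which also handles spans and field filters); `Level`, `parse_level`, and `is_enabled` are hypothetical names, and `smg::routers::http` in the usage note is an illustrative target.

```rust
/// Log levels ordered from least to most verbose (a simplified stand-in for
/// `tracing::Level`).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

/// Parses a level name case-insensitively.
fn parse_level(s: &str) -> Option<Level> {
    match s.to_ascii_lowercase().as_str() {
        "error" => Some(Level::Error),
        "warn" => Some(Level::Warn),
        "info" => Some(Level::Info),
        "debug" => Some(Level::Debug),
        "trace" => Some(Level::Trace),
        _ => None,
    }
}

/// Simplified model: returns whether an event with `target` and `level`
/// passes a filter string such as "info,smg::tito_debug=debug".
fn is_enabled(filter: &str, target: &str, level: Level) -> bool {
    // Track the longest matching target prefix and its maximum level.
    let mut best: Option<(usize, Level)> = None;
    for directive in filter.split(',').map(str::trim).filter(|d| !d.is_empty()) {
        let (prefix, max) = match directive.split_once('=') {
            // "target=level"; skip directives whose level does not parse.
            Some((t, l)) => match parse_level(l) {
                Some(l) => (t, l),
                None => continue,
            },
            None => match parse_level(directive) {
                // A bare level is the global default (empty prefix).
                Some(l) => ("", l),
                // A bare target enables every level for that target.
                None => (directive, Level::Trace),
            },
        };
        if target.starts_with(prefix) && best.map_or(true, |(len, _)| prefix.len() >= len) {
            best = Some((prefix.len(), max));
        }
    }
    best.map_or(false, |(_, max)| level <= max)
}
```

Under this model, `is_enabled("info,smg::tito_debug=debug", "smg::tito_debug", Level::Debug)` is true, while the same call with target `smg::routers::http` is false, because only the global `info` default applies to it.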
Behavior
- Logs fire only at DEBUG level on the `smg::tito_debug` target; each call is gated by `tracing::enabled!`.
- Zero overhead at INFO/WARN.
- Operators turn them on only when investigating a TITO regression.
Future
Additional debug points (e.g. inside the chat-completion-aware path that parses the response choices) can be added under the same `smg::tito_debug` target as needed. This PR is the first installment of TITO instrumentation that is expected to grow over time.
Provenance
One of five focused PRs that supersede #3.