perf(api): Stream JSON query responses to reduce peak memory #8030
Draft
phacops wants to merge 1 commit into
snuba-api builds the full serialized JSON response body as a single string via json.dumps before returning it, holding that whole copy in memory in addition to the result object graph. Under concurrency this contributes to OOMs.

Encode responses incrementally with simplejson's iterencode and return the generator directly from the query endpoints, so the WSGI server (Granian) consumes chunks as produced and the process never holds the entire serialized body at once.

Bytes are routed through a custom default (encoding=None) so undecodable values are hex-encoded inline instead of raising UnicodeDecodeError mid-stream, preserving the previous RAW_BYTESTRING__ behavior. dump_payload is kept as a thin wrapper over the new stream_payload.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Agent transcript: https://claudescope.sentry.dev/share/lE8HCAkDe8F1lBD4bRmx8sdFv-cGMrE-YmHGlNFyigk
Context

snuba-api has been OOMing in production. While auditing the read path for memory hot spots, one clear, contained win stood out: every query response is fully serialized into a single JSON string via json.dumps before being returned, so the process holds that entire string in memory in addition to the result object graph. Under concurrency this adds a full serialized-size copy per in-flight request.

(Worker-recycling guardrails API_WORKERS_MAX_RSS / API_WORKERS_LIFETIME are already set in prod, so the remaining lever is reducing per-request peak.)

Change

- Add stream_payload(), which encodes the response incrementally with simplejson's iterencode and yields chunks. The two query response callsites (/query, /snql, /mql via dataset_query, plus the storage-delete result path) now return this generator directly. Werkzeug treats it as a streamed response and Granian writes chunks as they are produced, so we never hold the whole serialized body at once.
- Bytes are routed through a custom default (_payload_default) with encoding=None. This is required for streaming: simplejson otherwise raises UnicodeDecodeError on invalid UTF-8 bytes, which on a streamed response would fire mid-stream after bytes were already sent. The inline handler decodes valid UTF-8 and hex-encodes the rest, preserving the existing RAW_BYTESTRING__<hex> behavior. This replaces the old post-hoc _sanitize_payload retry.
- dump_payload() is kept as a thin wrapper ("".join(stream_payload(...))) so existing callers/tests are unaffected.

Behavioral note

Streamed responses use chunked transfer encoding and carry no Content-Length header (previously computed from the full string). The primary consumer (sentry's snuba client over urllib3) handles chunked responses fine.

Out of scope (follow-ups)

This only removes the serialized-string copy. The other copies on the read path (the native driver's buffered rows, the list[dict] rebuild in clickhouse/native.py, and the result-cache rapidjson.dumps copy) require rearchitecting the Reader/cache contract and are intentionally left for separate work.

Testing

- pytest tests/web/test_views.py: existing test_response_dumping unchanged; added test_stream_payload (incremental chunks + valid/invalid bytes) and test_streamed_response (a generator-backed Response reports is_streamed and parses end-to-end).
- mypy snuba/web/views.py and pre-commit clean.

🤖 Generated with Claude Code
Agent transcript: https://claudescope.sentry.dev/share/8nrIX8NvceNRXqVEO5-f1DnOqTRtDlATlFL7Z04Z9ts
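
For reviewers who want a feel for the approach without opening the diff, here is a minimal, self-contained sketch of incremental encoding with an inline bytes fallback. It is not the code in this PR. It uses the standard library's json.JSONEncoder.iterencode as a stand-in for simplejson's iterencode so that it runs without extra dependencies; stdlib json always sends bytes to default, which is the behavior the description attributes to encoding=None in simplejson. The names stream_payload, dump_payload and _payload_default are borrowed from the description, but their bodies, the app function and the sample payload are assumptions made for illustration.

```python
import json
from typing import Any, Callable, Iterable, Iterator


def _payload_default(value: Any) -> Any:
    """Fallback for values the encoder cannot serialize natively (illustrative)."""
    if isinstance(value, bytes):
        try:
            return value.decode("utf-8")
        except UnicodeDecodeError:
            # Hex-encode undecodable bytes inline rather than raising mid-stream.
            return "RAW_BYTESTRING__" + value.hex()
    raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")


def stream_payload(payload: Any) -> Iterator[str]:
    """Yield the JSON encoding of payload piece by piece."""
    # Assumption: stdlib encoder used here in place of simplejson.
    yield from json.JSONEncoder(default=_payload_default).iterencode(payload)


def dump_payload(payload: Any) -> str:
    """Thin wrapper for callers that still want one string."""
    return "".join(stream_payload(payload))


def app(environ: dict, start_response: Callable[..., Any]) -> Iterable[bytes]:
    """Hypothetical WSGI endpoint that hands the generator to the server."""
    result = {"data": [{"name": b"ok", "raw": b"\xff\xfe"}], "meta": []}
    # No Content-Length: the body size is unknown until the generator is drained.
    start_response("200 OK", [("Content-Type", "application/json")])
    return (chunk.encode("utf-8") for chunk in stream_payload(result))
```

In this sketch the undecodable value b"\xff\xfe" is emitted as the string RAW_BYTESTRING__fffe, and joining the chunks from stream_payload gives the same string that dump_payload returns. Only the result object graph and the current chunk need to be alive at any point during encoding.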