
perf(api): Stream JSON query responses to reduce peak memory #8030

Draft
phacops wants to merge 1 commit into master from feat/stream-api-json-responses

@phacops phacops commented Jun 14, 2026

Contributor

Context

snuba-api has been OOMing in production. While auditing the read path for memory hot spots, one clear, contained win stood out: every query response is fully serialized into a single JSON string via json.dumps before being returned, so the process holds that entire string in memory in addition to the result object graph. Under concurrency this adds a full serialized-size copy per in-flight request.

(Worker-recycling guardrails API_WORKERS_MAX_RSS / API_WORKERS_LIFETIME are already set in prod, so the remaining lever is reducing per-request peak.)
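To illustrate the per-request cost described above, here is a rough, standalone sketch that compares peak allocations when a body is built as one string versus consumed chunk by chunk. It uses the standard library's json and tracemalloc rather than snuba's code or simplejson, and the row shape, row count, and function names are made up for the demonstration. It is not a measurement of snuba-api itself.

```python
import json
import tracemalloc

# Hypothetical result rows; built before tracing so only encoding is measured.
rows = [{"id": i, "name": f"row-{i}"} for i in range(5_000)]


def peak_bytes(fn):
    """Return the peak number of bytes allocated while fn() runs."""
    tracemalloc.start()
    try:
        fn()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def encode_full():
    # The whole serialized body exists as a single string.
    body = json.dumps(rows)
    return len(body)


def encode_streamed():
    # Each chunk is dropped after use, as a server writing to a socket would.
    total = 0
    for chunk in json.JSONEncoder().iterencode(rows):
        total += len(chunk)
    return total


full_peak = peak_bytes(encode_full)
streamed_peak = peak_bytes(encode_streamed)
print(f"full: {full_peak} bytes, streamed: {streamed_peak} bytes")
```

The absolute numbers depend on the interpreter and payload; the point is only that the streamed path does not need an allocation proportional to the body size.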

Change

  • Add stream_payload(), which encodes the response incrementally with simplejson's iterencode and yields chunks. The two query response callsites (dataset_query, which serves /query, /snql, and /mql, plus the storage-delete result path) now return this generator directly. Werkzeug treats it as a streamed response and Granian writes chunks as they are produced, so the process never holds the whole serialized body at once.
  • Bytes handling moves inline into a custom default (_payload_default) with encoding=None. This is required for streaming: simplejson otherwise raises UnicodeDecodeError on invalid UTF-8 bytes, which on a streamed response would fire mid-stream, after part of the body had already been sent. The inline handler decodes valid UTF-8 and hex-encodes the rest, preserving the existing RAW_BYTESTRING__<hex> behavior. This replaces the old post-hoc _sanitize_payload retry.
  • dump_payload() is kept as a thin wrapper ("".join(stream_payload(...))) so existing callers/tests are unaffected.
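The shape of the change listed above can be sketched roughly as follows. This is not the code in the PR: it substitutes the standard library's json for simplejson so that it runs without dependencies (stdlib json always routes bytes to the default hook, which is the behavior the PR gets from simplejson with encoding=None). The function names carry a _sketch suffix to mark them as illustrative, and the lowercase hex formatting after the RAW_BYTESTRING__ prefix is an assumption.

```python
import json
from typing import Any, Iterator

RAW_PREFIX = "RAW_BYTESTRING__"  # prefix named in the PR description


def payload_default_sketch(value: Any) -> Any:
    """Fallback for values the encoder cannot serialize natively."""
    if isinstance(value, bytes):
        try:
            return value.decode("utf-8")
        except UnicodeDecodeError:
            # Assumption: undecodable bytes become prefix + lowercase hex.
            return RAW_PREFIX + value.hex()
    raise TypeError(f"{type(value).__name__} is not JSON serializable")


def stream_payload_sketch(payload: Any) -> Iterator[str]:
    """Yield the JSON encoding of payload in small chunks."""
    encoder = json.JSONEncoder(default=payload_default_sketch)
    yield from encoder.iterencode(payload)


def dump_payload_sketch(payload: Any) -> str:
    """Thin wrapper for callers that still want a single string."""
    return "".join(stream_payload_sketch(payload))


if __name__ == "__main__":
    for chunk in stream_payload_sketch({"ok": b"abc", "raw": b"\xff\xfe"}):
        print(repr(chunk))
```

Because invalid bytes are handled inside the default hook, the generator never raises a decode error partway through, which is what makes it safe to hand to the server after the response headers have gone out.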

Behavioral note

Streamed responses use chunked transfer encoding and carry no Content-Length header (previously computed from the full string). The primary consumer (sentry's snuba client over urllib3) handles chunked responses fine.
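To show why Content-Length disappears, here is a minimal plain-WSGI sketch. It uses only the standard library (not Werkzeug or Granian), and the app, the call_app helper, and the payload are hypothetical. The application returns a generator, so the body size is unknown when the headers are sent, and the server has to fall back to chunked transfer encoding or close the connection to mark the end of the body.

```python
import json
from wsgiref.util import setup_testing_defaults

PAYLOAD = {"data": [{"n": i} for i in range(3)]}  # hypothetical response body


def app(environ, start_response):
    # No Content-Length: the size is unknown until the generator is exhausted.
    start_response("200 OK", [("Content-Type", "application/json")])
    encoder = json.JSONEncoder()
    return (chunk.encode("utf-8") for chunk in encoder.iterencode(PAYLOAD))


def call_app(wsgi_app):
    """Invoke a WSGI app in-process and collect status, headers, and body."""
    environ = {}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(wsgi_app(environ, start_response))
    return captured, body


captured, body = call_app(app)
print(captured["status"], sorted(captured["headers"]))
```

A client that relies on Content-Length (for progress reporting or preallocation, say) would need to tolerate its absence; clients that simply read until the end of the body are unaffected.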

Out of scope (follow-ups)

This only removes the serialized-string copy. The other copies on the read path — the native driver's buffered rows, the list[dict] rebuild in clickhouse/native.py, and the result-cache rapidjson.dumps copy — require rearchitecting the Reader/cache contract and are intentionally left for separate work.

Testing

  • pytest tests/web/test_views.py: existing test_response_dumping is unchanged; added test_stream_payload (incremental chunks plus valid and invalid bytes) and test_streamed_response (a generator-backed Response reports is_streamed and its body parses end-to-end).
  • mypy snuba/web/views.py and pre-commit clean.

🤖 Generated with Claude Code

Agent transcript: https://claudescope.sentry.dev/share/8nrIX8NvceNRXqVEO5-f1DnOqTRtDlATlFL7Z04Z9ts

snuba-api builds the full serialized JSON response body as a single string via
json.dumps before returning it, holding that whole copy in memory in addition to
the result object graph. Under concurrency this contributes to OOMs.

Encode responses incrementally with simplejson's iterencode and return the
generator directly from the query endpoints, so the WSGI server (Granian)
consumes chunks as produced and the process never holds the entire serialized
body at once. Bytes are routed through a custom default (encoding=None) so
undecodable values are hex-encoded inline instead of raising UnicodeDecodeError
mid-stream, preserving the previous RAW_BYTESTRING__ behavior. dump_payload is
kept as a thin wrapper over the new stream_payload.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Agent transcript: https://claudescope.sentry.dev/share/lE8HCAkDe8F1lBD4bRmx8sdFv-cGMrE-YmHGlNFyigk
@phacops phacops requested a review from a team as a code owner June 14, 2026 21:39
@phacops phacops marked this pull request as draft June 14, 2026 21:51