feat(span-buffer): Add a new write/read option for removing payloads from sets #107809
Conversation
…from sets

There is a hypothesis that some of the CPU usage on Redis comes from copying the compressed payload bytes into the new sets when sets are merged. Add a new write/read method that stores payloads over a certain size under a separate key. These keys are then fetched and their payloads added back into the segment when it is flushed.
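As a rough sketch of the idea (hypothetical names and cutoff; the real implementation uses the `span-buf:p:ld:` prefix, a configurable cutoff option, and the Lua scripts in `src/sentry/spans/buffer.py`):

```python
import redis

# Illustrative constants; the prefix is taken from the diff, the cutoff is really an option.
PAYLOAD_KEY_PREFIX = b"span-buf:p:ld:"
PAYLOAD_CUTOFF_SIZE = 16 * 1024

client = redis.Redis()


def write_span(set_key: bytes, payload_key: bytes, compressed: bytes, ttl: int = 3600) -> None:
    """Store large payloads under their own key; the set only carries a reference."""
    if len(compressed) > PAYLOAD_CUTOFF_SIZE:
        client.setex(payload_key, ttl, compressed)  # payload lives outside the set
        client.sadd(set_key, payload_key)           # merging sets only moves the reference
    else:
        client.sadd(set_key, compressed)            # small payloads stay inline


def read_segment(set_key: bytes) -> list[bytes]:
    """Read a segment, resolving payload references with a single MGET."""
    inline: list[bytes] = []
    payload_keys: list[bytes] = []
    for member in client.smembers(set_key):
        if member.startswith(PAYLOAD_KEY_PREFIX):
            payload_keys.append(member)
        else:
            inline.append(member)
    if payload_keys:
        inline.extend(v for v in client.mget(*payload_keys) if v is not None)
    return inline
```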
```python
if span_data.starswith("span-buf:p:ld:"):
    payload_keys.append(span_data)
else:
    decompressed_spans.extend(self._decompress_batch(span_data))

if payload_keys:
    payloads = self.client.mget(*payload_keys)
    decompressed_spans.extend(map(self._decompressed_batch, payloads))
```
Bug: The payloads variable is reassigned from a dict to a list, causing a TypeError on subsequent dictionary-style access.
Severity: CRITICAL
Suggested Fix
Use a different variable name for the result of self.client.mget(*payload_keys) to avoid overwriting the payloads dictionary. Then, iterate through the results to populate the original payloads dictionary correctly.
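A sketch of that rename against the hunk above; the per-segment bookkeeping on the `payloads` dict (the `del payloads[key]` / `payloads[key].extend(...)` accesses described in the report further down) is left out here:

```python
if payload_keys:
    # A distinct name keeps `payloads` (segment key -> list of span bytes) intact
    fetched_payloads = self.client.mget(*payload_keys)
```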
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: src/sentry/spans/buffer.py#L818-L825
Potential issue: In the `_load_segment_data` method, the `payloads` variable is
initialized as a dictionary. However, it is later reassigned to the result of
`self.client.mget()`, which returns a list. Subsequent code attempts to access this
variable using dictionary key lookups (e.g., `payloads[key]`), which will fail with a
`TypeError` because list indices must be integers, not the `bytes` keys being used. This
occurs when the `read_from_payload_set` feature is enabled and segments are being
flushed.
```python
if write_to_payload_set:
    p.hdel(payload_set_redirect_map_key, *span_ids)
    p.unlink(
        *(self._get_payload_key(project_id, trace_id, s) for s in span_ids)
```
Bug: The _get_payload_key method is called with the wrong number and types of arguments, which will cause a TypeError.
Severity: CRITICAL
Suggested Fix
Combine and decode the project_id and trace_id bytes into a single project_and_trace string before calling _get_payload_key. The call should be updated to self._get_payload_key(f"{project_id.decode()}:{trace_id.decode()}", s).
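In context, the corrected call could look like this sketch (whether the span ids in `span_ids` also need decoding depends on `_get_payload_key`, which is not shown in this hunk):

```python
if write_to_payload_set:
    p.hdel(payload_set_redirect_map_key, *span_ids)
    # _get_payload_key expects a single "project:trace" string, not raw bytes
    project_and_trace = f"{project_id.decode()}:{trace_id.decode()}"
    p.unlink(
        *(self._get_payload_key(project_and_trace, s) for s in span_ids)
    )
```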
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: src/sentry/spans/buffer.py#L956
Potential issue: The `_get_payload_key` method expects two string arguments:
`project_and_trace` and `span_id`. However, in the `done_flush_segments` method, it is
called with three separate `bytes` arguments: `project_id`, `trace_id`, and `s`. This
mismatch in argument count and types will cause a `TypeError` when the cleanup phase of
segment flushing is executed with the `write_to_payload_set` feature enabled.
```python
if write_to_payload_set:
    if (
        not self.add_buffer_payload_sha
```
Bug: The add_buffer_payload_sha attribute is accessed before it is initialized in __init__, which will cause an AttributeError.
Severity: CRITICAL
Suggested Fix
Initialize the attribute in the __init__ method by adding the line self.add_buffer_payload_sha: str | None = None, consistent with how similar attributes like add_buffer_sha are handled.
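The suggested line in context, mirroring how `add_buffer_sha` is described as being handled (the exact surrounding code in `__init__` is assumed here):

```python
# In SpansBuffer.__init__, next to the existing script-SHA attribute:
self.add_buffer_sha: str | None = None
self.add_buffer_payload_sha: str | None = None  # read by _ensure_scripts when write_to_payload_set is on
```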
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: src/sentry/spans/buffer.py#L609
Potential issue: The `SpansBuffer` class `__init__` method does not initialize the
`add_buffer_payload_sha` attribute. However, this attribute is accessed within the
`_ensure_scripts` method when the `write_to_payload_set` feature is enabled. This will
lead to an `AttributeError` when `process_spans` is called, as it tries to access an
attribute that has not been defined on the object instance.
Cursor Bugbot has reviewed your changes and found 5 potential issues.
```python
    decompressed_spans.extend(self._decompress_batch(span_data))

if payload_keys:
    payloads = self.client.mget(*payload_keys)
```
Variable shadowing overwrites payloads dict with list
High Severity
The variable payloads is defined at line 795 as a dictionary mapping segment keys to lists of bytes. Line 824 reassigns payloads to the result of mget(), which is a list. This breaks lines 832 and 836 which expect payloads to remain a dictionary and attempt del payloads[key] and payloads[key].extend(...). When payload keys are fetched, this causes a TypeError.
Additional Locations (1)
| p.delete(b"span-buf:hrs:" + payload_set_key) | ||
| p.delete(b"span-buf:ic:" + payload_set_key) | ||
| p.delete(b"span-buf:ibc:" + payload_set_key) | ||
| p.unlink(payload_set_key) |
Missing payload_set_key in queue zrem call
Medium Severity
The zrem call only removes zset_key and set_key from the flush queue, but not payload_set_key. When reading from payload sets, the queue contains payload_set_key entries, so they won't be removed during cleanup, leading to orphaned queue entries that could cause duplicate flush attempts.
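A sketch of the adjusted cleanup, assuming the flush queue key is held in a variable here called `queue_key` (the actual name is not visible in this hunk):

```python
# Remove all per-segment entries from the flush queue, including the payload set key
p.zrem(queue_key, zset_key, set_key, payload_set_key)
```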
```python
span_data = scan_value[0] if isinstance(scan_value, tuple) else scan_value
decompressed_spans.extend(self._decompress_batch(span_data))
# Payload keys are prefixed with span-buf:p:ld: and have to be fetched separately
if span_data.starswith("span-buf:p:ld:"):
```
```python
if payload_keys:
    payloads = self.client.mget(*payload_keys)
    decompressed_spans.extend(map(self._decompressed_batch, payloads))
```
map results not flattened before extend call
High Severity
The _decompress_batch method returns a list[bytes], so map(...) produces an iterator of lists. Using extend with this iterator adds each list as an element to decompressed_spans rather than flattening the bytes. The results need to be flattened, e.g., with itertools.chain.from_iterable.
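A sketch of the flattening fix using `itertools.chain.from_iterable`, also skipping the `None` results MGET returns for missing keys:

```python
import itertools

if payload_keys:
    fetched_payloads = self.client.mget(*payload_keys)
    decompressed_spans.extend(
        itertools.chain.from_iterable(
            self._decompress_batch(payload)
            for payload in fetched_payloads
            if payload is not None
        )
    )
```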
```python
payload_key = self._get_payload_key(project_and_trace, parent_span_id)
compressed = prepared.keys()[0]  # Compressed payloads only have one key
if payload_cutoff_size > 0 and len(compressed) > payload_cutoff_size:
    p.set(payload_key, compressed)
```
Missing TTL on payload keys causes resource leak
Medium Severity
The p.set(payload_key, compressed) call creates a Redis key without any TTL. Other span buffer keys get TTL set either in the Lua scripts (via expire calls) or explicitly. If a segment is never flushed or cleanup fails, these payload keys will persist in Redis indefinitely, causing a resource leak. The call needs to include a TTL, e.g., using p.setex(payload_key, redis_ttl, compressed).
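A sketch of the suggested change, assuming a `redis_ttl` value is available in this scope as the comment implies:

```python
if payload_cutoff_size > 0 and len(compressed) > payload_cutoff_size:
    # Expire payload keys along with the rest of the segment state
    p.setex(payload_key, redis_ttl, compressed)
```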


There is a hypothesis that some of the CPU usage on Redis is from copying the compressed payload
bytes into the new sets when the sets are merged. Add a new write/read method that stores payloads
over a certain size into a separate key. This key is then fetched and populated when flushing the
segment.
THIS IS MISSING TESTS