
perf: cache backtrace line parsing and Line object creation#2905

Open
HazAT wants to merge 1 commit into master from perf/backtrace-parse-caching

Conversation


@HazAT (Member) commented Mar 18, 2026

⚠️ Needs closer review — class-level mutable caches with bounded size

Part of #2901 (reduce memory allocations by ~53%)

Changes

Add two layers of caching to Backtrace::Line.parse to avoid redundant work when the same backtrace lines appear across multiple exceptions (which is the common case in production):

  1. Parse data cache: Caches the extracted (file, number, method, module_name) tuple by the raw unparsed line string. Avoids re-running the regex match and string extraction on cache hit.

  2. Line object cache: Caches complete Line objects by (unparsed_line, in_app_pattern) pair. Avoids creating new Line objects entirely when the same line has been seen with the same pattern.

Both caches are bounded to 2048 entries and clear entirely when the limit is reached (simple, no LRU overhead).

Also caches the compiled in_app_pattern Regexp in Backtrace.parse to avoid Regexp.new on every exception capture.

Review focus

  • Is a 2048-entry bound reasonable? In most apps the number of unique backtrace lines will be far smaller.
  • The full-clear strategy on limit hit — acceptable simplicity tradeoff?
  • Line objects are effectively immutable after creation — is this assumption safe across all usage patterns?
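The two cache layers described above can be sketched roughly as follows. This is an illustrative, self-contained version, not the PR's actual code: the module name `LineParser`, `CACHE_LIMIT`, and the simplified regex are mine.

```ruby
module LineParser
  Line = Struct.new(:file, :number, :method_name, :in_app)

  CACHE_LIMIT = 2048
  # Simplified "file:line:in `method'" pattern for illustration only.
  PARSE_RE = /\A(.+?):(\d+)(?::in [`'](.+)')?\z/

  @parse_cache = {} # raw line => [file, number, method]
  @line_cache  = {} # raw line => { in_app_pattern => Line }

  def self.parse(unparsed, in_app_pattern)
    # Layer 2: reuse a complete Line seen with this exact pattern.
    if (per_pattern = @line_cache[unparsed]) && (hit = per_pattern[in_app_pattern])
      return hit
    end

    # Layer 1: reuse the extracted fields, skipping the regex on a hit.
    file, number, meth = @parse_cache.fetch(unparsed) do
      @parse_cache.clear if @parse_cache.size >= CACHE_LIMIT
      m = PARSE_RE.match(unparsed)
      @parse_cache[unparsed] = m ? [m[1], Integer(m[2]), m[3]] : [nil, nil, nil]
    end

    line = Line.new(file, number, meth, !!(file && file.match?(in_app_pattern)))

    # Bounded full-clear strategy: simple, no LRU bookkeeping.
    @line_cache.clear if @line_cache.size >= CACHE_LIMIT
    (@line_cache[unparsed] ||= {})[in_app_pattern] = line
    line
  end
end
```

The second layer keys on the pattern as well because the same raw line can be in-app under one configuration and not under another.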

HazAT added a commit that referenced this pull request Mar 18, 2026
Reduce total allocated memory from 442k to 206k bytes (-53.5%) and
objects from 3305 to 1538 (-53.5%) per Rails exception capture.

All changes are internal optimizations with zero behavior changes.

Key optimizations:
- Cache longest_load_path and compute_filename results (class-level,
  invalidated on $LOAD_PATH changes)
- Cache backtrace line parsing and Line/Frame object creation (bounded
  at 2048 entries)
- Optimize LineCache with Hash#fetch, direct context setting, and
  per-(filename, lineno) caching
- Avoid unnecessary allocations: indexed regex captures, match? instead
  of =~, byteslice, single-pass iteration in StacktraceBuilder
- RequestInterface: avoid env.dup, cache header name transforms, ASCII
  fast-path for encoding
- Scope/BreadcrumbBuffer: shallow dup instead of deep_dup where inner
  values are not mutated after duplication
- Hub#add_breadcrumb: hint default nil instead of {} to avoid empty
  hash allocation

See sub-PRs for detailed review by risk level:
- #2902 (low risk) — hot path allocation avoidance
- #2903 (low risk) — LineCache optimization
- #2904 (medium risk) — load path and filename caching
- #2905 (needs review) — backtrace parse caching
- #2906 (needs review) — Frame object caching
- #2907 (needs review) — Scope/BreadcrumbBuffer shallow dup
- #2908 (medium risk) — RequestInterface optimizations
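One of the commit-message bullets above ("match? instead of =~") comes down to allocation behavior: Regexp matching with `=~` materializes a MatchData object (reachable via `$~`) even when only a yes/no answer is needed, while `Regexp#match?` returns a plain boolean. A tiny sketch with made-up data, not project code:

```ruby
line = "app/models/user.rb:10:in `save'"

# =~ allocates a MatchData (kept in $~) even though we only want a boolean:
in_app_via_match_op = !(line =~ /\Aapp\//).nil?

# match? answers the same question without allocating MatchData:
in_app_via_match_q = line.match?(/\Aapp\//)
```

On a hot path that runs once per backtrace line per exception, avoiding that per-call MatchData adds up.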
Comment on lines +38 to +42
@parse_cache = {}

# Cache complete Line objects by (unparsed_line, in_app_pattern) to avoid
# re-creating identical Line objects across exceptions.
@line_object_cache = {}
Bug: The new caching mechanisms in Backtrace::Line use non-thread-safe Hash objects, creating race conditions in multi-threaded environments during cache read, write, and clear operations.
Severity: MEDIUM

Suggested Fix

Replace the plain Hash objects used for @parse_cache and @line_object_cache with a thread-safe alternative. Since the project already depends on concurrent-ruby, using Concurrent::Map is the recommended approach, as it is already used for other caches in the codebase.

Location: sentry-ruby/lib/sentry/backtrace/line.rb#L38-L42

Potential issue: The new caching mechanisms (`@parse_cache`, `@line_object_cache`) in
`Backtrace::Line` use plain `Hash` objects, which are not thread-safe. In a
multi-threaded environment, concurrent read/write operations can lead to race
conditions. Specifically, the non-atomic sequence of checking the cache size, clearing
it, and then writing a new entry can result in lost entries and an inconsistent cache
state. This can degrade cache effectiveness and potentially cause errors on non-GIL Ruby
implementations like JRuby.
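The reviewer's concrete suggestion is Concurrent::Map, since sentry-ruby already depends on concurrent-ruby and uses it for the existing line cache. The stdlib-only sketch below (names are mine) shows the property being asked for: the size check, clear, and write happen as one atomic step under a Mutex, so the cache stays consistent even without a GIL.

```ruby
module ParseCache
  LIMIT = 2048
  @lock  = Mutex.new
  @cache = {}

  # Returns the cached value for +key+, computing it via the block at
  # most once; check-size / clear / write are atomic under the lock.
  def self.fetch(key)
    @lock.synchronize do
      @cache.clear if @cache.size >= LIMIT
      @cache.fetch(key) { @cache[key] = yield }
    end
  end
end
```

Note the tradeoff: computing inside `synchronize` serializes the parsing work across threads, whereas `Concurrent::Map#compute_if_absent` avoids funneling every lookup through a single global lock, which is why it is the better fit here.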


# Cache complete Line objects by (unparsed_line, in_app_pattern) to avoid
# re-creating identical Line objects across exceptions.
@line_object_cache = {}

Thread-unsafe plain Hash caches risk corruption on JRuby

Medium Severity

The three new class-level caches (@parse_cache, @line_object_cache, @in_app_pattern_cache) use plain {} hashes, but they're accessed concurrently from multiple threads during exception capture. On JRuby (which the SDK explicitly supports via JAVA_INPUT_FORMAT), there is no GIL, and concurrent reads/writes to a plain Hash can corrupt its internal structure. The existing line_cache in the same file already uses Concurrent::Map for this exact reason.


@sl0thentr0py force-pushed the perf/backtrace-parse-caching branch from 8741117 to c0c148c on Mar 30, 2026 at 10:18
⚠️ Needs closer review — introduces class-level mutable caches.

Add two layers of caching to Backtrace::Line.parse to avoid redundant
work when the same backtrace lines appear across multiple exceptions
(which is the common case in production):

1. Parse data cache: Caches the extracted (file, number, method,
   module_name) tuple by the raw unparsed line string. Avoids re-running
   the regex match and string extraction on cache hit.

2. Line object cache: Caches complete Line objects by (unparsed_line,
   in_app_pattern) pair. Avoids creating new Line objects entirely when
   the same line has been seen with the same pattern.

Both caches are bounded to 2048 entries and clear entirely when the
limit is reached (simple, no LRU overhead).

Also cache the compiled in_app_pattern Regexp in Backtrace.parse to
avoid Regexp.new on every exception capture.

Safety: Line objects are effectively immutable after creation (all
attributes are set in initialize and only read afterwards). The parse
inputs are deterministic — same unparsed_line always produces the same
parsed data.
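The safety note above rests on Line objects never being mutated after creation. Not part of the PR, but one way to turn that assumption into an enforced guarantee would be to freeze objects before caching them, so any accidental mutation of a shared cached Line raises instead of silently leaking across exceptions (sketch with a stand-in Struct):

```ruby
Line = Struct.new(:file, :number)

# Freeze before caching: shared cached objects become read-only.
line = Line.new("app/models/user.rb", 10).freeze

mutated = begin
  line.number = 99
  true
rescue FrozenError
  false
end
# mutated is false: the frozen cached Line cannot be changed in place
```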
@sl0thentr0py force-pushed the perf/backtrace-parse-caching branch from c0c148c to 7523126 on Mar 30, 2026 at 10:19
if @line_object_cache.size >= PARSE_CACHE_LIMIT
@line_object_cache.clear
end
pattern_cache = (@line_object_cache[object_cache_key] ||= {})

Bug: The variable object_cache_key is used without being defined, which will raise a NameError at runtime and crash the method.
Severity: CRITICAL

Suggested Fix

Replace the undefined variable object_cache_key on line 81 with the unparsed_line variable, which is the intended cache key.

Location: sentry-ruby/lib/sentry/backtrace/line.rb#L81

Potential issue: In the `Backtrace::Line.parse` method, line 81 references a local
variable `object_cache_key` that is never assigned a value. This will cause a
`NameError` every time the method is called, preventing the Sentry SDK from capturing
any exceptions. The variable appears to be a typo and was likely intended to be
`unparsed_line`, which is a parameter to the method. This crash makes the associated
caching logic non-functional.


cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 3 total unresolved issues (including 1 from previous review).


if @line_object_cache.size >= PARSE_CACHE_LIMIT
@line_object_cache.clear
end
pattern_cache = (@line_object_cache[object_cache_key] ||= {})

Undefined method object_cache_key causes runtime error

High Severity

object_cache_key on line 81 is never defined anywhere — not as a method, variable, or constant. Every call to Line.parse will raise a NameError. This appears to be an incomplete refactor; the intended key was likely unparsed_line.


pattern_cache = (@line_object_cache[object_cache_key] ||= {})
pattern_cache[in_app_pattern] = line

line

Line object cache is written but never read

Medium Severity

The @line_object_cache is populated on every call but never consulted before creating a new Line object on line 75. A cache lookup (and early return on hit) is missing, so every invocation still allocates a new Line, defeating the stated purpose of the object cache. The cache currently only wastes memory.
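Taking this together with the undefined object_cache_key reported above, the intended shape of the object-cache path is presumably something like the following. This is a sketch with a stub Line, not the actual patch: key the cache by unparsed_line, and consult it before constructing a new object.

```ruby
module BacktraceLineCache
  Line = Struct.new(:unparsed, :in_app)
  PARSE_CACHE_LIMIT = 2048
  @line_object_cache = {}

  def self.parse(unparsed_line, in_app_pattern)
    # Lookup first: this early return is what the reviewed code is missing.
    if (per_pattern = @line_object_cache[unparsed_line]) &&
       (cached = per_pattern[in_app_pattern])
      return cached
    end

    line = Line.new(unparsed_line, unparsed_line.match?(in_app_pattern))

    @line_object_cache.clear if @line_object_cache.size >= PARSE_CACHE_LIMIT
    # Key by unparsed_line, not the undefined object_cache_key.
    (@line_object_cache[unparsed_line] ||= {})[in_app_pattern] = line
    line
  end
end
```

With both fixes in place, a repeated (line, pattern) pair returns the identical cached object instead of allocating a new one.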

