Fix text rendering corruption from mid-render atlas bind stomp; font cache hardening#302
Conversation
Rasterizing a cache-missed glyph inside the render loop rebinds texture unit 0 to the upload target and leaves it bound (LLImageGL::setSubImage uploads with skip_unbind; nextOpenPos binds brand-new sheets to create their GL textures). Quads already queued for the previous atlas then flushed under the stomped binding and sampled another atlas page, rendering fragments of unrelated glyphs - the occasional "text shows other text / broken glyphs" corruption reported since the font rework. The legacy per-codepoint `last_char != wch` flush removed in 7c05b2d had bounded the misdraw to ~1 glyph, hiding it; idle-sheet eviction made re-rasterization (and thus the stomp window) recurrent. llfontgl.cpp: - render: route every batch submit through a flush_batch helper that re-asserts the batch's own atlas texture before drawing. bind() is a cached no-op when the binding didn't move, so the common path pays nothing. Applied to the shaped path, codepoint path, end-of-pass flush, and pass B. - render: skip quad emission when the slot's atlas page can't resolve (released sheet) instead of emitting quads against whatever texture happens to be bound. Pen advance still runs so layout is preserved. llfontbitmapcache.cpp: - Zero-clear Color (BGRA) sheets on allocation. LLImageRaw doesn't zero its buffer, so emoji atlas borders and inter-glyph gutters uploaded heap garbage - latent today under point sampling, visible the moment the shadow shader's dilated taps or linear filtering sample past a glyph edge. Grayscale sheets were already cleared. llfontfreetype.cpp: - getGlyphInfo / getGlyphInfoByIndex: treat Unspecified as match-any at cache probe time and resolve to Grayscale only when rasterizing. Resolving before the probe made every measurement-path lookup of an already-rendered color glyph miss and rasterize a duplicate grayscale copy (a full COLRv1 paint walk plus a wasted Grayscale atlas slot per emoji). llvertexbuffer.h: - LLVertexBufferData's value constructor cross-assigned model_view into mProjection and projection into mModelView. No live caller (drawWithMatrix has none), but wrong regardless. tests/llfontgl_test.cpp: - Regression test: capture a render of a Latin prefix plus a cold U+1F995 through gGL.beginList and require the prefix batch's recorded mTexName to be the grayscale atlas page its UVs were built against; pre-fix it carried the emoji sheet bound by the mid-render upload. Headless-only (OSMesa) like the rest of the render group. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…e guards Follow-ups from the corruption audit, all in the glyph/shape cache layer. llfontfreetype.cpp: - getGlyphInfoByIndex: degrade Color requests to Grayscale on faces with neither a color table (CBDT/sbix/COLR) nor OT-SVG before the cache probe. FT_LOAD_COLOR on such a face rasterizes the same grayscale outline anyway, but the typed probe missed the existing Grayscale entry and rasterized a duplicate gray bitmap into fresh atlas slots published under a Color-typed entry. use_color=true is the UI default for every glyph, and reload paths pre-warm ASCII as Grayscale, so the whole ASCII set was stored twice per font - x8 phase slots on subpixel faces. llfontgl.cpp: - getCacheGeneration: return the sum of the per-instance atlas generations of the head face plus every fallback face instead of the global counter. Glyphs can only come from those atlases, so the sum ticks for exactly the mutations that can stale this font's captured UVs. The global counter invalidated EVERY cached text vertex/width buffer viewer-wide whenever any font rasterized a glyph; during glyph churn (first CJK chat fill, post-eviction warm-up) each regen rasterized more glyphs and re-invalidated everything again for several frames. Monotonic by construction - every component only takes fresh values from the shared counter, so no A+1/B-1 aliasing. - Holders of shapeLine references (render, getWidthF32, charFromPixelOffset, maxDrawableChars, firstDrawableChar) snapshot LLFontShaping::cacheMutationCount() after shaping and llassert equality after their last dereference, so a use-after-invalidation trips a debug assert instead of reading freed glyph runs. build_shape_layout snapshots up front too so the empty-slice early return carries the live count - a default 0 tripped the assert on every empty-label measurement (LLButton::resize with "") once anything had ever shaped. llfontshaping.cpp/.h: - New cacheMutationCount(): bumped by miss-inserts (covering their evictions) and clears. Cache hits don't bump - an LRU splice never touches the index or a glyph vector. - clearCacheForFace: also drop entries rooted at OTHER heads whose glyph runs reference the dying face. Every fallback-sourced LLShapedGlyph stores a raw pointer to the fallback that owns its glyph_id, while invalidation was keyed only on the entry's root. Today's lifecycle makes the sweep defensive (fallbacks outlive heads via LLPointer chains; reloads clear globally), but enforcing it here keeps a future runtime fallback-removal path correct by construction. llfontface.cpp/.h: - insertGlyphInfo: keep the already-published entry on a duplicate (glyph_index, type) publish, delete the incoming one, and return the survivor. The old replace-in-place deleted an entry that callers up the stack (render loop, kerning prefetch) may still hold. Duplicate publishes assert in debug - they mean an upstream dedup probe was skipped. addShapedGlyphFromFont continues with the return value. llfontbitmapcache.h: - Update getGlobalGeneration's comment: production no longer compares against it directly; it remains the uniqueness source per-instance generations draw from, plus a test/diagnostic accessor. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…gistry tidy The remaining low-priority items from the corruption audit. llfontbitmapcache.cpp/.h: - Recycle released sheet slots. nextOpenPos tracks the active allocation target explicitly (mCurrentSheet, per atlas type) instead of assuming it's the last slot, and a new sheet reuses the first released slot before growing the vectors. The purge-before-release contract (glyph entries erased before releaseSheet, generation bump rebuilds captured vertex buffers) guarantees nothing references a released index, so reuse is safe — and a long session cycling glyph working sets through eviction no longer grows the slot vectors monotonically. llfontface.cpp/.h, llfontfreetype.cpp/.h: - Move the atlas eviction sweep (collectGarbage) and its throttle clock onto LLFontFace. The atlas and glyph map are face state; with the throttle per-head, N freetypes sharing one face re-swept the same atlas once per head per interval. LLFontFreetype::collectGarbage forwards, so sweepGlyphCaches callers are unchanged and siblings no-op inside the shared interval. llfontfreetype.cpp: - getXKerning: probe the kern table of the face that owns BOTH glyphs instead of always the head's. Legacy 'kern' tables map pairs of that face's own glyph indices; feeding the head's table foreign fallback indices was benign for GPOS-era fonts (no legacy table) but garbage-prone on old fonts, and silently dropped real kerning for pairs sharing a fallback face. Mixed-face pairs have no defined kerning and return 0. Pen policy (UNFITTED vs DEFAULT) stays the head's — it owns the layout loop the result feeds. llfontregistry.cpp/.h, llfontgl.cpp, llfontfreetype.h: - getNumFaces becomes a static probe; createFont no longer allocates a throwaway LLFontGL per search path just to count collection faces, and allocates the load wrapper lazily inside the face loop. - New storeFont helper: mFontMap values are owned raw pointers, so assigning into an occupied slot leaked the incumbent; reload()'s re-seat paths and createFont now route through it. - createFont also publishes the head under its canonical (template-name + normalized-size) key when free, so a differently-spelled descriptor that normalizes to the same font shares the freetype through a thin wrapper instead of re-walking every font file. tests/llfontbitmapcache_test.cpp, tests/llfontfreetype_test.cpp: - Re-pin the eviction tests to the recycling contract: released slots come back live instead of staying nullptr placeholders, slot counts stay bounded across release cycles, generation still advances. Co-Authored-By: Claude Fable 5 (1M context) <noreply@anthropic.com>
…orms Closes the mixed-atlas limitation noted when the shader shadow path landed. atlasTexelSize was pushed once per string from the HEAD face's atlas dimensions, but one shadowed string can batch glyphs from differently-sized atlases (a 512px head sheet vs a 1024px fallback emoji sheet), putting the fallback glyphs' shadow taps at the wrong distance — and captured vertex buffers replay per-batch texture binds but cannot replay per-batch uniforms, so a uniform could never be right for mixed strings. uiF.glsl: - Compute atlasTexelSize from textureSize(diffuseMap, 0) inside the shadow branch. Exact for whichever atlas each draw actually binds, including captured-buffer replay. shadowMode is now the shader's only shadow uniform, per-pass constant by design. - The channel half of the old limitation was already solved when sampleAtlasAlpha switched to .a sampling (the grayscale atlas's RG swizzle maps coverage to .a; BGRA is native) — the C++ side just never caught up. llfontgl.cpp, llfontvertexbuffer.cpp/.h, llviewershadermgr.cpp: - Drop the atlasTexelSize pushes and the genBuffers texel-size capture (mLastAtlasTexelW/H), and the grayscaleAtlas pushes — that uniform no longer exists in the shader, so all three sites were setting nothing. Only shadowMode remains: pushed before pass A, reset before pass B, re-pushed on captured replay, defaulted to 0 at shader creation. Known pre-existing edge for the shader-shadow re-enable checklist (inert while sEnableShaderShadow is off): with shadow != NO_SHADOW the underline quad is emitted before pass A, so capture stores it in mShadowBufferList and replay draws it under shadowMode != 0 — DROP is visually identical (white-texture taps collapse to passthrough math), SOFT renders the underline slightly denser than direct mode. Co-Authored-By: Claude Fable 5 (1M context) <noreply@anthropic.com>
|
Warning Review limit reached
More reviews will be available in 5 minutes and 26 seconds. Learn how PR review limits work. Your organization has reached its usage spending cap. Adjust your spending cap in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (6)
📝 WalkthroughWalkthroughThis PR refactors font rendering to safely handle glyph atlas mutations during frames by introducing sheet slot recycling, face-level garbage collection with throttling, and render-time mutation tracking that prevents texture sampling of stale atlases. ChangesFont Rendering Atlas Safety and Garbage Collection
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested labels
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
indra/llrender/tests/llfontfreetype_test.cpp (1)
1011-1023:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winPurge the face-owned glyph entries before releasing sheet 0 in this test.
This test now exercises slot recycling through
LLFontFreetype, but it callsreleaseSheet()while'A'is still published inLLFontFace::mGlyphInfoMap. That leaves the fixture in a state production explicitly avoids: a later lookup of'A'would reuse the stale entry and sample whatever glyph got rebuilt into slot 0. If the goal is to pin the real rasterizer lifecycle, delete the sheet's glyph entries first like tests8and9already do.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@indra/llrender/tests/llfontfreetype_test.cpp` around lines 1011 - 1023, Before calling cache->releaseSheet(EFontGlyphType::Grayscale, 0) purge the face-owned glyph entries that reference sheet 0 so the test doesn't leave stale LLFontFace::mGlyphInfoMap entries; replicate the approach used in tests 8 and 9 by iterating the face's mGlyphInfoMap (or calling the existing helper used by those tests) to erase any glyph entries tied to sheet 0, then call releaseSheet and continue with the getGlyphInfo(L'B', ...) assertions.indra/llrender/llfontbitmapcache.cpp (1)
75-92:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winDon't retouch released slots on null reads.
releaseSheet()now relies on a released slot keepingmLastUsedTime == 0, but both accessors calltouchSheet()before checking whether the slot is null. Any diagnostic/test probe of a released sheet will make it look recently used again, which breaks the documented contract and makestest<9>timing-dependent.Suggested fix
LLImageRaw *LLFontBitmapCache::getImageRaw(EFontGlyphType bitmap_type, U32 bitmap_num) const { const U32 bitmap_idx = static_cast<U32>(bitmap_type); if (bitmap_type >= EFontGlyphType::Count || bitmap_num >= mImageRawVec[bitmap_idx].size()) return nullptr; + if (mImageRawVec[bitmap_idx][bitmap_num].isNull()) + return nullptr; + touchSheet(bitmap_type, bitmap_num); return mImageRawVec[bitmap_idx][bitmap_num]; } LLImageGL *LLFontBitmapCache::getImageGL(EFontGlyphType bitmap_type, U32 bitmap_num) const { const U32 bitmap_idx = static_cast<U32>(bitmap_type); if (bitmap_type >= EFontGlyphType::Count || bitmap_num >= mImageGLVec[bitmap_idx].size()) return nullptr; + if (mImageGLVec[bitmap_idx][bitmap_num].isNull()) + return nullptr; + touchSheet(bitmap_type, bitmap_num); return mImageGLVec[bitmap_idx][bitmap_num]; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@indra/llrender/llfontbitmapcache.cpp` around lines 75 - 92, getImageRaw and getImageGL call touchSheet() before verifying the slot is non-null, which re-marks released slots as recently used and breaks releaseSheet()'s mLastUsedTime==0 contract; change both functions (getImageRaw, getImageGL) to first check bitmap_type bounds and that mImageRawVec[mIdx][bitmap_num] / mImageGLVec[mIdx][bitmap_num] is non-null (and/or that mLastUsedTime != 0) and only then call touchSheet(bitmap_type, bitmap_num) before returning the image, so released (null) slots are not retouched.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@indra/llrender/llfontgl.cpp`:
- Around line 186-223: LLFontGL::getCacheGeneration currently returns S32 and
sums LLFontBitmapCache::getCacheGeneration() values, which can overflow; change
this stamp to a wider unsigned type (e.g. uint64_t / U64) end-to-end: update
LLFontGL::getCacheGeneration's return type, the local variable gen, and any
callers/storage sites (notably LLFontVertexBuffer and LLFontWidthBuffer fields
and any comparisons or assignments) to use the wider unsigned type, update
LLFontBitmapCache::getCacheGeneration signature if necessary, and ensure all
comparisons/assignments use the new unsigned type to preserve the monotonic
non-overflowing property.
In `@indra/llrender/llfontregistry.cpp`:
- Around line 1982-1986: The current restore only re-seats saved heads via
storeFont(desc, head) after reload() fails, but reload() clears template entries
and members like mFontSizes, mFamilySizes, and mFamilyMeta; fix by snapshotting
the full registry state before calling reload() (including template entries,
mFontSizes, mFamilySizes, mFamilyMeta, and any other registry maps), and on
parse failure restore those saved members in addition to re-seating heads, or
alternatively perform the reload into a temporary LLFontRegistry instance and
swap it into place only on success; update code around reload(), storeFont, and
the places manipulating mFontSizes/mFamilySizes/mFamilyMeta so getFont(),
nameToSize(), and getAvailableFamilies() see the restored state after a failed
reload.
---
Outside diff comments:
In `@indra/llrender/llfontbitmapcache.cpp`:
- Around line 75-92: getImageRaw and getImageGL call touchSheet() before
verifying the slot is non-null, which re-marks released slots as recently used
and breaks releaseSheet()'s mLastUsedTime==0 contract; change both functions
(getImageRaw, getImageGL) to first check bitmap_type bounds and that
mImageRawVec[mIdx][bitmap_num] / mImageGLVec[mIdx][bitmap_num] is non-null
(and/or that mLastUsedTime != 0) and only then call touchSheet(bitmap_type,
bitmap_num) before returning the image, so released (null) slots are not
retouched.
In `@indra/llrender/tests/llfontfreetype_test.cpp`:
- Around line 1011-1023: Before calling
cache->releaseSheet(EFontGlyphType::Grayscale, 0) purge the face-owned glyph
entries that reference sheet 0 so the test doesn't leave stale
LLFontFace::mGlyphInfoMap entries; replicate the approach used in tests 8 and 9
by iterating the face's mGlyphInfoMap (or calling the existing helper used by
those tests) to erase any glyph entries tied to sheet 0, then call releaseSheet
and continue with the getGlyphInfo(L'B', ...) assertions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 026f1f4d-469e-4ed5-b3e9-f1e869a58b82
⛔ Files ignored due to path filters (1)
indra/newview/app_settings/shaders/class1/interface/uiF.glslis excluded by!**/*.glsl
📒 Files selected for processing (18)
indra/llrender/llfontbitmapcache.cppindra/llrender/llfontbitmapcache.hindra/llrender/llfontface.cppindra/llrender/llfontface.hindra/llrender/llfontfreetype.cppindra/llrender/llfontfreetype.hindra/llrender/llfontgl.cppindra/llrender/llfontregistry.cppindra/llrender/llfontregistry.hindra/llrender/llfontshaping.cppindra/llrender/llfontshaping.hindra/llrender/llfontvertexbuffer.cppindra/llrender/llfontvertexbuffer.hindra/llrender/llvertexbuffer.hindra/llrender/tests/llfontbitmapcache_test.cppindra/llrender/tests/llfontfreetype_test.cppindra/llrender/tests/llfontgl_test.cppindra/newview/llviewershadermgr.cpp
…, touch guards CodeRabbit findings, all four taken: - LLFontGL::getCacheGeneration (and the mLastFontCacheGen consumers in LLFontVertexBuffer / LLFontWidthBuffer) widen to U64. The stamp is a sum of per-face S32 generations; an S32 accumulator could in principle overflow (signed UB) and break the strictly-increases contract. The per-face generations stay S32 — they're raw draws from the global counter, summed into the wide accumulator. - LLFontRegistry::reload snapshots the full parse-time state (font map incl. templates, size tables, family metadata, applied-overrides LLSD) and restores it wholesale when parseFontInfo fails. Previously only the heads were re-seated, leaving nameToSize / getAvailableFamilies / uncached getFont running against an emptied registry until the next successful reload. The map copy is safe: a failed parse only adds nullptr template placeholders, so the restore can't double-own or leak. - LLFontBitmapCache::getImageRaw/getImageGL no longer touchSheet before the released-slot check. releaseSheet zeroes the slot's last-used time and recycling relies on it staying zero; a null read of a released sheet shouldn't make the slot look recently used. - llfontfreetype_test test<6> purges the face-owned glyph entries referencing sheet 0 before its direct releaseSheet, matching the purge-before-release contract the production sweep maintains (and the idiom tests 8/9 already use). Co-Authored-By: Claude Fable 5 (1M context) <noreply@anthropic.com>
Summary
Resolves the occasional text corruption reported since the HarfBuzz font rework — text rendering fragments of other strings or broken glyphs, unaffected by disabling
CollectFontVertexBuffers— plus the full set of findings from the audit that diagnosis came out of.Root cause (
f04a566b)LLFontGL::renderqueues up to 120 glyphs of quads in CPU arrays before submitting, but rasterizing a cache-missed glyph mid-loop rebinds texture unit 0 and leaves the upload target bound (LLImageGL::setSubImagewithskip_unbind;nextOpenPosbinding brand-new sheets). The pending batch then flushed under the stomped binding, sampling another atlas page — i.e. other glyphs. The legacy per-codepointlast_char != wchflush removed in 7c05b2d had bounded the misdraw to ~1 glyph (invisible); with real batching, whole runs garbled. Idle-sheet eviction made re-rasterization — and therefore the corruption window — recurrent.Every batch submit now routes through a
flush_batchhelper that re-asserts the batch's own atlas texture first (cached no-op when the binding didn't move). Also: Color atlas sheets are zero-cleared on allocation (gutters uploaded heap garbage),Unspecifiedglyph lookups restore match-any semantics, andLLVertexBufferData's ctor had its modelview/projection cross-assigned. Includes a headless regression test that captures a render viagGL.beginListand pins the prefix batch's recordedmTexName.Cache hardening (
04a89893)getCacheGeneration()returns a per-font sum of head + fallback atlas generations instead of the global counter — one font's glyph churn no longer regenerates every cached text buffer viewer-wide.insertGlyphInfokeeps the already-published entry on duplicate publish (callers may hold it) and asserts in debug.LLFontShaping::cacheMutationCount()tripwire: all fiveshapeLinereference holders assert validity after their last dereference.clearCacheForFacealso sweeps entries whose glyph runs reference the dying face, enforcing the fallback-pointer lifecycle invariant instead of assuming it.Follow-ups (
ffcf3d51)mCurrentSheettracking — slot vectors stay bounded across eviction cycles; eviction tests re-pinned.collectGarbage+ throttle moved onto the sharedLLFontFace— one sweep per face per interval instead of per head.getXKerningprobes the kern table of the face owning both glyphs (mixed-face pairs return 0; same-fallback pairs now actually kern).getNumFacesprobe,storeFontownership guard formFontMapslots, canonical-key alias so descriptor spellings share freetypes.Shader shadow (
c7bce2fc)uiF.glslderives the atlas texel size fromtextureSize()per draw instead of a per-string uniform sized off the head atlas — correct for mixed text+emoji strings whose batches bind differently-sized atlases, including captured-buffer replay (which rebinds textures but can't replay uniforms). DeadgrayscaleAtlaspushes removed (shadowModeis the only shadow uniform left). Inert whilesEnableShaderShadowremains off.Testing
llassert).llfontgl_test.cpptest<6>, BUILD_HEADLESS/OSMesa).🤖 Generated with Claude Code