Harden LLMessageSystem UDP paths, fix packet ring corruption, cut per-packet allocations#304

Merged
RyeMutt merged 7 commits into develop from rye/message-system-fixes
Jun 11, 2026

@RyeMutt RyeMutt commented Jun 11, 2026

Review sweep of LLMessageSystem and its hot UDP receive/decode/dispatch paths. Six commits, each standalone; defect fixes first, then the allocation work.

Security / robustness

  • Out-of-bounds read decoding variable-length fields (lltemplatemessagereader.cpp): decodeData() trusted the wire-claimed size of MVT_VARIABLE fields. A malformed or hostile packet could drive the copy up to 64KB past the receive buffer via a 2-byte length, or hand new U8[] a negative size via the 4-byte form. The claimed size is now clamped to the bytes remaining in the packet. Regression test included (test<46>, verified to fail against the unclamped decoder).
  • zeroCodeExpand() overflow case 3 kept decoding into a scrambled buffer after resetting outptr; it now discards the packet like cases 1 and 2.
  • Const-violation on Variable-1 truncation (lltemplatemessagebuilder.cpp): oversized payloads were NUL-terminated by casting away const and writing into the caller's buffer — UB through std::string::c_str(), a crash for a literal. The stored copy is terminated instead; wire bytes unchanged (test<47>).
  • getString(char*) could expose stale stack: the span between the copied data and buffer_size-1 was left uninitialized, so an unterminated wire string read back as garbage. Both overloads now terminate at the exact copied length (getData() reports it).
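
The clamp described in the first bullet can be sketched as follows. This is a minimal illustration under assumptions, not the actual decodeData() code: `sketchClampClaimedSize` and its parameter names are hypothetical, and the real decoder may handle the ran-off-end case differently.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: bound a wire-claimed field size by the bytes that
// actually remain in the received packet. Not the real decodeData() code.
inline std::int32_t sketchClampClaimedSize(std::int32_t claimed,
                                           std::int32_t receive_size,
                                           std::int32_t decode_pos)
{
    // Bytes left after the current decode position; never negative.
    const std::int32_t remaining =
        std::max<std::int32_t>(receive_size - decode_pos, 0);
    // A negative claim (possible with a 4-byte length) becomes 0; an
    // oversized claim is cut down to what the packet really holds.
    return std::clamp(claimed, std::int32_t{0}, remaining);
}
```

With a 100-byte packet and the decoder at offset 90, a claimed size of 65535 comes back as 10, so the copy can no longer run past the receive buffer.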

Packet ring (root cause behind the asserts disabled in cbf99f6)

bufferInboundPacket() received straight into the ring slot at mHeadIndex before knowing whether a packet had arrived. With the ring full, that slot is the oldest unread packet, and the would-block receive terminating every drainSocket() pass clobbered it to size 0 and desynced the byte accounting — once per drain under sustained overload. It now receives into a scratch buffer and only commits the slot for a real packet (matching the SOCKS branch), and the three asserts are restored as llassert since they're true invariants again.

Also: a runt SOCKS wrapper no longer terminates the drain loop early with packets still queued.
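
The scratch-buffer approach can be sketched as below. Every name here (`SketchRing`, `SketchSlot`, `SketchRecvFn`, the slot count and packet size) is hypothetical, and the overwrite-oldest policy is an assumption taken from the description above; it is not the LLPacketRing implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>

// All names and sizes are hypothetical; this is not the LLPacketRing code.
constexpr std::size_t kSketchSlots = 4;
constexpr std::size_t kSketchMaxPacket = 64;

struct SketchSlot {
    std::array<unsigned char, kSketchMaxPacket> data{};
    std::size_t size = 0;
};

// Models a non-blocking receive: fills up to `cap` bytes and returns the
// count, or 0 when nothing is waiting (the would-block case).
using SketchRecvFn = std::function<std::size_t(unsigned char*, std::size_t)>;

struct SketchRing {
    std::array<SketchSlot, kSketchSlots> slots{};
    std::size_t head = 0;   // slot the next packet is written to
    std::size_t count = 0;  // buffered, unread packets
    std::size_t bytes = 0;  // total buffered bytes

    std::size_t bufferInbound(const SketchRecvFn& recv_fn) {
        unsigned char scratch[kSketchMaxPacket];
        const std::size_t n = recv_fn(scratch, kSketchMaxPacket);
        assert(n <= kSketchMaxPacket);
        if (n == 0) {
            return 0;  // nothing arrived: the ring is left untouched
        }
        SketchSlot& slot = slots[head];
        if (count == kSketchSlots) {
            // Ring full: head is the oldest unread packet, so account for
            // dropping it before the slot is reused.
            bytes -= slot.size;
            --count;
        }
        std::memcpy(slot.data.data(), scratch, n);
        slot.size = n;
        bytes += n;
        ++count;
        head = (head + 1) % kSketchSlots;
        return n;
    }
};
```

Because the would-block receive only ever touches `scratch`, a full ring keeps its oldest packet and its byte count after an empty poll.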

Correctness

  • sendMessage() computed the appended-ack budget with unsigned math, so over-MTU payloads (ChildAgentUpdate, SendXferPacket) wrapped to a huge space_left and got acks appended despite having no room.
  • checkPacketInID()'s gap filler compared ids with plain <, misreading a small forward gap across the 24-bit wrap as out-of-order (log spam + resync without loss marking). It now uses the modular forward distance.
  • Receive-count stats finally record the message number, so dumpReceiveCounts() (the message-storm diagnostic) attributes counts instead of printing nothing.
  • Outbound zerocode compression stats reinstated (TODO'd out since the babbage era); accounted in sendMessage() where the counters live.
  • operator<<(LLMessageSystem) probed mMessageNumbers with operator[], permanently inserting null entries; big-endian MVT_S16Array swizzle iterated n % 2 elements instead of n / 2.
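
The unsigned wrap in the ack budget is easy to reproduce in isolation. The sketch below uses hypothetical constants (`kSketchMtu`, `kSketchAckBytes`) and function names; the real sendMessage() values and structure are not shown in this PR text. Where the description says signed math yields a negative count that skips the append, this sketch simply returns 0.

```cpp
#include <cstdint>

// Hypothetical values, chosen only to make the arithmetic visible.
constexpr std::uint32_t kSketchMtu = 1200;
constexpr std::uint32_t kSketchAckBytes = 4;

// Pre-fix shape: the subtraction is unsigned, so an over-MTU payload
// wraps around to a value close to 2^32.
inline std::uint32_t sketchAcksUnsigned(std::uint32_t payload_size) {
    const std::uint32_t space_left = kSketchMtu - payload_size;
    return space_left / kSketchAckBytes;
}

// Post-fix shape: signed math goes negative, and a non-positive budget
// means no acks are appended.
inline std::int32_t sketchAcksSigned(std::uint32_t payload_size) {
    const std::int32_t space_left =
        static_cast<std::int32_t>(kSketchMtu) -
        static_cast<std::int32_t>(payload_size);
    if (space_left <= 0) {
        return 0;
    }
    return space_left / static_cast<std::int32_t>(kSketchAckBytes);
}
```

For a 1500-byte payload the unsigned version reports room for over a billion acks, while the signed version reports none.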

Hot-path allocations

Previously every variable of every block of every inbound packet cost a map-node insert plus a new U8[] + copy, all freed again at clearMessage() — thousands of allocations per frame under object-update load.

  • LLMsgVarData stores payloads ≤ 24 bytes inline (covers every fixed wire type; largest is LLVector3d), so only long Variable fields heap-allocate. getData() computes the pointer, so vector growth can't dangle.
  • The per-block LLIndexedVector (vector + std::map index) is replaced by an insertion-ordered flat vector scanned by prehashed name pointer — same API and iteration order, zero per-entry node allocations. operator[] inserts keyed by name; the reader's getSize() overloads query with find() so lookups never insert.
  • decodeData() takes the variable ref from addVariable() instead of re-finding it by name; getString(std::string&) no longer zero-fills a 1501-byte buffer per call.

Testing

All llmessage suites pass in RelWithDebInfo (builder 47/47 incl. two new regression tests, llsdmessagebuilder 46/46, llsdmessagereader 21/21, message, parser, dispatcher 4/4); library also builds clean in Debug with the restored asserts active. Both new tests verified to fail against the pre-fix code.

🤖 Generated with Claude Code

RyeMutt and others added 6 commits June 11, 2026 00:43
…tore asserts

bufferInboundPacket() received straight into the ring slot at mHeadIndex
before knowing whether a packet had actually arrived. When the ring is
full, that slot holds the oldest *unread* packet, and the would-block
receive that terminates every drainSocket() pass reset its size to 0 and
desynced the byte/packet accounting. Since the drain path runs exactly
when message processing falls behind and the ring pins at its cap, every
drain under sustained overload corrupted one buffered packet and leaked
its byte count.

Receive into a scratch buffer and only commit the slot once we have a
real packet, the same way the SOCKS branch already works.

This was the single root cause behind all three asserts disabled in
cbf99f6 (they were raw assert(), so they only ever fired in Debug).
With the clobbering gone they are true invariants again; restore them as
llassert with comments on why each holds.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Signed-off-by: Rye <rye@alchemyviewer.org>
…ive stats

- zeroCodeExpand() overflow case 3 reset outptr to the buffer start and
  kept decoding, handing a scrambled packet to the parser. Discard the
  malformed packet instead, mirroring overflow cases 1 and 2.
- sendMessage() computed the appended-ack budget with unsigned math, so
  an over-MTU payload (ChildAgentUpdate, SendXferPacket) wrapped to a
  huge space_left and got acks appended despite having no room. Signed
  math yields a negative count and skips the append.
- Record the message number in the receive-count list. It was never set
  (TODO from the babbage era), so dumpReceiveCounts() -- the diagnostic
  that fires on message storms -- has always attributed nothing.
- operator<< probed mMessageNumbers with operator[], permanently
  inserting a null entry for every missing id it touched. Use find().
- Log packets at exactly the minimum valid size (>= vs >).
- Fix the big-endian MVT_S16Array swizzle iterating n % 2 elements
  instead of n / 2. Inert on little-endian builds.
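
The operator[] versus find() difference is standard std::map behavior and can be shown in a few lines. The map type and function names below are hypothetical; the actual type of mMessageNumbers is not shown here.

```cpp
#include <map>
#include <string>

// Hypothetical id -> template-name map, standing in for mMessageNumbers.
using SketchTemplateMap = std::map<unsigned, const std::string*>;

// Probing with operator[] default-constructs (null) an entry on a miss.
inline const std::string* sketchProbeWithIndex(SketchTemplateMap& m,
                                               unsigned id) {
    return m[id];
}

// Probing with find() never mutates the map, so it also works on const.
inline const std::string* sketchProbeWithFind(const SketchTemplateMap& m,
                                              unsigned id) {
    const auto it = m.find(id);
    return it == m.end() ? nullptr : it->second;
}
```

Both probes return a null pointer for a missing id, but only the operator[] version leaves a permanent null entry behind.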

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Signed-off-by: Rye <rye@alchemyviewer.org>
…truncation

When more than 255 bytes were stuffed into a Variable-1 field, addData()
NUL-terminated the payload by casting away const and writing into the
caller's buffer -- undefined behavior through std::string::c_str() (both
addString overloads route here) and a straight crash for a string
literal.

Clamp the size, let addData() copy the payload, then terminate the
*stored* copy. Wire bytes and stored bytes are unchanged; only the
caller mutation is gone.
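
A minimal sketch of the "terminate the stored copy" approach, assuming the 255-byte limit stated above. `sketchStoreVariable1` is a hypothetical helper, not the real addData() path.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Largest payload a Variable-1 field can carry (1-byte length prefix).
constexpr std::size_t kSketchVar1Max = 255;

// Hypothetical helper: copy at most 255 bytes and, when truncation
// happened, NUL-terminate the copy. The caller's buffer is only read.
inline std::vector<unsigned char> sketchStoreVariable1(const char* src,
                                                       std::size_t size) {
    const std::size_t stored = std::min(size, kSketchVar1Max);
    std::vector<unsigned char> copy(src, src + stored);
    if (size > kSketchVar1Max) {
        copy[kSketchVar1Max - 1] = '\0';  // last stored byte, index 254
    }
    return copy;
}
```

Since `src` is never written through, passing a string literal or the result of std::string::c_str() is safe.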

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Signed-off-by: Rye <rye@alchemyviewer.org>
…raffic

Hardening:
- decodeData() trusted the wire-claimed size of MVT_VARIABLE fields. A
  malformed or hostile packet could drive the copy up to 64KB past the
  receive buffer via a 2-byte length, or hand new U8[] a negative size
  via the 4-byte form. Clamp the claimed size to the bytes remaining in
  the packet and log it as ran-off-end. Regression test included
  (test<46>, verified to fail against the unclamped decoder).

Allocation overhaul -- previously every variable of every block of every
inbound packet cost a map-node insert plus a new U8[] + copy, all freed
again at clearMessage():
- LLMsgVarData stores payloads up to 24 bytes inline, sized to cover
  every fixed wire type (largest is LLVector3d), so only long Variable
  fields heap-allocate. getData() computes the pointer so vector growth
  cannot dangle into a moved-from element.
- Replace the per-block LLIndexedVector (vector + std::map index) with
  an insertion-ordered flat vector scanned by prehashed name pointer:
  same API and iteration order, zero per-entry node allocations.
  operator[] inserts keyed by name so repeated misses return the same
  entry, and the reader's getSize() overloads query with find() so
  lookups never insert at all.
- decodeData() takes the variable ref returned by addVariable() instead
  of re-finding it by name for addData(); reader getData() does one
  lookup instead of find() + operator[].
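
The flat-vector container can be sketched as below. `SketchFlatMap` and `SketchEntry` are hypothetical names, and the sketch assumes names are interned so that pointer equality identifies a name; it is not the real LLMsgVarDataMap.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical entry: the name is an interned pointer, compared by address.
struct SketchEntry {
    const char* name;
    int value;
};

class SketchFlatMap {
public:
    // Linear scan by name pointer; never inserts. nullptr when absent.
    SketchEntry* find(const char* name) {
        for (auto& e : mEntries) {
            if (e.name == name) {
                return &e;
            }
        }
        return nullptr;
    }

    // Inserting access: a repeated miss on the same name returns the same
    // entry instead of growing the vector again.
    int& operator[](const char* name) {
        if (SketchEntry* e = find(name)) {
            return e->value;
        }
        mEntries.push_back({name, 0});
        return mEntries.back().value;
    }

    std::size_t size() const { return mEntries.size(); }
    std::vector<SketchEntry>::const_iterator begin() const {
        return mEntries.begin();
    }
    std::vector<SketchEntry>::const_iterator end() const {
        return mEntries.end();
    }

private:
    std::vector<SketchEntry> mEntries;  // insertion order is iteration order
};
```

For the handful of variables in a typical block, a linear scan over contiguous entries avoids the per-node allocation a std::map index would need.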

test<47> covers the Variable-1 truncation fix from the previous commit:
caller buffer untouched, stored copy clamped and NUL-terminated.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Signed-off-by: Rye <rye@alchemyviewer.org>
…sion

getData() now returns the number of bytes it copied, and both getString()
overloads NUL-terminate at exactly that length. The std::string overload
no longer zero-fills a 1501-byte stack buffer on every chat/name/string
read, and the char* overload no longer leaves the span between the copied
data and buffer_size-1 uninitialized, where an unterminated wire string
could expose stale stack contents to the caller's strlen.
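
Terminating at the copied length can be sketched as follows. Both functions are hypothetical stand-ins with simplified signatures; the real getData() and getString() take block and variable names rather than raw pointers.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for getData(): copies at most max_size bytes and
// reports how many were copied.
inline std::size_t sketchGetData(const unsigned char* field,
                                 std::size_t field_size,
                                 char* out, std::size_t max_size) {
    const std::size_t n = std::min(field_size, max_size);
    if (n != 0) {
        std::memcpy(out, field, n);
    }
    return n;
}

// Hypothetical stand-in for getString(char*): the terminator goes right
// after the copied bytes, so nothing uninitialized sits before it.
inline void sketchGetString(const unsigned char* field,
                            std::size_t field_size,
                            char* out, std::size_t buffer_size) {
    if (buffer_size == 0) {
        return;
    }
    const std::size_t copied =
        sketchGetData(field, field_size, out, buffer_size - 1);
    out[copied] = '\0';
}
```

An unterminated three-byte wire string read into a 16-byte buffer yields a string of length 3, whatever the buffer held before.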

Reinstate the outbound zerocode compression stats (TODO'd out since the
babbage era, leaving summarizeLogs reporting zeros) by accounting them in
LLMessageSystem::sendMessage(), which owns the counters and can detect
the buffer swap compressMessage() performs only when compression wins.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Signed-off-by: Rye <rye@alchemyviewer.org>
…ss the wrap

A SOCKS datagram no larger than its header made bufferInboundPacket()
report 0, which drainSocket() reads as "socket empty", stranding any
packets still queued behind it. Discard the runt but report its raw
size so the drain keeps going; the existing accounting counts it as
received-but-dropped.

checkPacketInID()'s gap filler compared packet ids with plain <, so a
small forward gap spanning the 24-bit wrap (expecting 0xFFFFF8, got
0x000002) was misread as an out-of-order arrival: log spam plus a
resync that never marked the skipped ids as potentially lost. Use the
modular forward distance instead; the fill loop already wraps. A
genuinely old id yields a huge modular distance and still takes the
out-of-order path.
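
The modular forward distance for 24-bit ids works out as in this sketch. The function names and the `max_gap` threshold are hypothetical; the cutoff checkPacketInID() actually uses is not stated in this PR text.

```cpp
#include <cstdint>

// Packet ids occupy 24 bits and wrap from 0xFFFFFF back to 0.
constexpr std::uint32_t kSketchIdMask = 0xFFFFFF;

// How far `got` lies ahead of `expected`, modulo 2^24.
inline std::uint32_t sketchForwardDistance(std::uint32_t expected,
                                           std::uint32_t got) {
    return (got - expected) & kSketchIdMask;
}

// Hypothetical classification: a small non-zero distance is a forward gap;
// a genuinely old id produces a huge distance and is treated as
// out-of-order.
inline bool sketchIsForwardGap(std::uint32_t expected, std::uint32_t got,
                               std::uint32_t max_gap) {
    const std::uint32_t d = sketchForwardDistance(expected, got);
    return d != 0 && d <= max_gap;
}
```

For the example in the commit message, expecting 0xFFFFF8 and receiving 0x000002 gives a distance of 10, whereas a plain `<` comparison sees 2 < 0xFFFFF8 and concludes the packet is old.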

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Signed-off-by: Rye <rye@alchemyviewer.org>

coderabbitai Bot commented Jun 11, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 963ac4e1-879c-4d15-ab11-7a71ad537473

📥 Commits

Reviewing files that changed from the base of the PR and between 5541ce2 and a07fa99.

📒 Files selected for processing (2)
  • indra/llmessage/llmessagetemplate.cpp
  • indra/llmessage/lltemplatemessagereader.cpp
📝 Walkthrough

This PR refactors the LL message system to support inline storage for small variable-data payloads, adds bounds-aware decoding to prevent buffer overruns, and strengthens packet ring buffering invariants. The changes include a new LLMsgVarDataMap container, safe iterator-based variable lookup, and explicit handling of truncated strings with proper NUL-termination.

Changes

Message System Safety and Inline Storage

| Layer | File(s) | Summary |
| --- | --- | --- |
| Inline variable-data storage foundation | indra/llmessage/llmessagetemplate.h, indra/llmessage/llmessagetemplate.cpp, indra/llmessage/llsdmessagebuilder.cpp | LLMsgVarData gains inline storage (mInlineData buffer and INLINE_DATA_SIZE constant); accessors return heap or inline pointers based on size. New LLMsgVarDataMap container replaces LLIndexedVector. LLMsgBlkData::addVariable returns LLMsgVarData& and assignment-based initialization enables proper routing to inline or heap storage in addData. |
| Message reader return-type and variable lookup | indra/llmessage/lltemplatemessagereader.h, indra/llmessage/lltemplatemessagereader.cpp | getData returns S32 byte-count instead of void. Variable lookup switched to iterator-based find to avoid implicit insertion. Error paths return 0 on missing blocks/variables or size mismatches, enabling safe downstream operations. |
| Variable-length field bounds checking | indra/llmessage/lltemplatemessagereader.cpp | During message decode, wire-claimed length for variable fields is bounded against remaining packet bytes; oversized lengths are clamped to zero with logging. Decoding uses local LLMsgVarData& reference and vardata.addData calls instead of routing through the block, ensuring safe writes into the new inline/heap storage. |
| String reading with proper termination | indra/llmessage/lltemplatemessagereader.cpp | Both getString overloads use the S32 return value from getData to NUL-terminate at actual copied length (clamped to buffer capacity), rather than fixed termination points. |
| Message builder truncation with stored NUL-termination | indra/llmessage/lltemplatemessagebuilder.cpp | For oversized MVT_VARIABLE strings, truncation now clamps to 255 bytes, encodes via addData with explicit size, and NUL-terminates the stored buffer at index 254. Caller's input buffer remains unmodified. |
| Packet ring buffering with stronger invariants | indra/llmessage/llpacketring.cpp | receiveOrDropBufferedPacket adds debug assertions for invariants. bufferInboundPacket explicitly discards SOCKS runt datagrams and refactors the non-SOCKS path to receive into a scratch buffer first, only committing the ring slot after confirming packet_size > 0. Prevents clobbering unread packets and keeps accounting aligned. |
| Message system robustness updates | indra/llmessage/message.cpp, indra/llmessage/message.h | Packet logging threshold relaxed to >= minimum size. Compression counters increment only on buffer swap. ACK-append space uses signed math. Message-number tracking derives from template with null-safe fallback. Stream output uses find() to avoid mutation. zeroCodeExpand discards malformed packets. Big-endian MVT_S16Array byte-swap loop corrected to n/2 element count. Packet-gap check uses modular forward distance. |
| Message reader and builder round-trip tests | indra/llmessage/tests/lltemplatemessagebuilder_test.cpp | test<46> validates safe clamping of oversized on-wire variable-length fields to zero-length. test<47> validates string truncation and NUL-termination: confirms builder does not mutate caller input and reader round-trips a truncated, NUL-terminated 254-character string. |

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

c/cpp

🐰 A variable's home, no longer far,
Now fits inside, both near and par.
With bounds that check before they read,
No overflow, no panic—we're freed!
The builder truncates with NUL in place,
And readers safely claim their space.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 17.65% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the three main objectives: hardening LLMessageSystem UDP paths, fixing packet ring corruption, and reducing per-packet allocations. |
| Description check | ✅ Passed | The PR description comprehensively covers security/robustness fixes, packet ring issues, correctness improvements, and allocation optimizations with specific details and testing confirmation. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@indra/llmessage/llmessagetemplate.cpp`:
- Around line 49-56: The copy at the end of the constructor uses
htolememcpy(dest, data, mType, size) which applies endian-swizzling based on
mType even though new entries created via LLMsgBlkData::addData() are
initialized as MVT_U8; change the call to use the incoming/declared type (the
variable that represents the incoming data's MVT, not mType) so the initial copy
uses that type (e.g., pass the incomingType or dataType variable) when calling
htolememcpy, leaving mType for later updates; update references around
mData/mInlineData/INLINE_DATA_SIZE and htolememcpy accordingly.

In `@indra/llmessage/lltemplatemessagereader.cpp`:
- Around line 676-681: The code zeroes tsize but leaves decode_pos unchanged,
causing remaining payload bytes to be mis-parsed; when the on-wire length is
invalid (tsize > llmax(mReceiveSize - decode_pos, 0)), after calling
logRanOffEndOfPacket (and respecting custom), advance decode_pos to mReceiveSize
(i.e. consume the rest of the packet) and set tsize = 0 so later field reads
will see no bytes rather than reinterpret packet garbage; alternatively you may
choose to bail out by returning an error from the enclosing decode
function—apply this change in the block referencing decode_pos, tsize,
mReceiveSize and logRanOffEndOfPacket.
- Around line 684-685: The code calls vardata.addData(&buffer[decode_pos],
tsize, mvci.getType()) which discards the original wire data_size causing
LLMsgVarData::mDataSize to remain -1; update the call site in
lltemplatemessagereader.cpp so addData is invoked with the original data_size
(not tsize) so the stored LLMsgVarData preserves the wire length-width; this
will restore correct behavior for LLSDMessageBuilder::copyFromMessageData() /
copyToBuilder() round-trips by ensuring getDataSize() reflects the wire format
used when decoding.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 25127de0-c891-4f96-85ec-5b2fcd90e50f

📥 Commits

Reviewing files that changed from the base of the PR and between 9f7fe9c and 5541ce2.

📒 Files selected for processing (11)
  • indra/llmessage/llcircuit.cpp
  • indra/llmessage/llmessagetemplate.cpp
  • indra/llmessage/llmessagetemplate.h
  • indra/llmessage/llpacketring.cpp
  • indra/llmessage/llsdmessagebuilder.cpp
  • indra/llmessage/lltemplatemessagebuilder.cpp
  • indra/llmessage/lltemplatemessagereader.cpp
  • indra/llmessage/lltemplatemessagereader.h
  • indra/llmessage/message.cpp
  • indra/llmessage/message.h
  • indra/llmessage/tests/lltemplatemessagebuilder_test.cpp

…ared type

When a variable-length field claims more bytes than remain, advance
decode_pos to the end of the packet so subsequent fields zero-fill --
consistent with the other ran-off-end paths -- rather than parsing the
malformed field's payload as later fields.
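
The effect of consuming the rest of the packet can be sketched with a toy decoder. The wire format here (a 1-byte length prefix per field) and the name `sketchDecodeFields` are assumptions for illustration only; the real template decoder is considerably more involved.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy decoder: each field is a 1-byte length followed by that many bytes.
// Hypothetical format and names, for illustration only.
inline std::vector<std::vector<std::uint8_t>> sketchDecodeFields(
    const std::uint8_t* buf, std::size_t receive_size,
    std::size_t field_count)
{
    std::vector<std::vector<std::uint8_t>> fields;
    std::size_t pos = 0;
    for (std::size_t i = 0; i < field_count; ++i) {
        std::vector<std::uint8_t> field;
        if (pos < receive_size) {
            const std::size_t claimed = buf[pos++];
            const std::size_t remaining = receive_size - pos;
            if (claimed > remaining) {
                // Ran off the end: consume the rest of the packet so later
                // fields come back empty instead of parsing leftover bytes.
                pos = receive_size;
            } else {
                field.assign(buf + pos, buf + pos + claimed);
                pos += claimed;
            }
        }
        fields.push_back(field);
    }
    return fields;
}
```

If `pos` were left where it was after the bad length, the next field would read a payload byte of the malformed field as its own length prefix.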

LLMsgVarData::addData() now swizzles with the caller-declared type
instead of mType, which is still the default MVT_U8 when an entry is
created without addVariable() (pre-existing; identical under the old
LLIndexedVector). No effect on little-endian builds.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Signed-off-by: Rye <rye@alchemyviewer.org>
@RyeMutt RyeMutt merged commit e1a56e0 into develop Jun 11, 2026
17 checks passed
@RyeMutt RyeMutt deleted the rye/message-system-fixes branch June 11, 2026 07:01