@lL1l1 lL1l1 commented Dec 20, 2025

Issue

The profiler can send unserializable data through the upvalues included in a function's debug information. The existing code handles the basic cases of functions and cfunctions, but nothing else. Tables are the main problem: they are commonly captured as upvalues, they can contain cyclic references (the engine serializes recursively without cycle protection, which causes a stack overflow), and they can reference unserializable data of many different types.

Description of the proposed changes

  1. Look in the engine to find serializable types, and document them in utilities.lua in the SERIALIZABLE_TYPES table
  2. Add the utility function SerializableDeepCopy to convert unserializable values/tables to serializable versions.
  3. Add a benchmark comparing value access via cfunc vs. Lua. This is what led me to discover the issue in the first place, since it upvalued ArmyBrains, which has a cyclic reference through Brain.CDR.Brain and references userdata (_c_objects) and other objects (threads).
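
A minimal sketch of the idea behind item 2, assuming plain Lua and a reduced set of serializable types. `SerializableDeepCopySketch` and `SerializableTypes` are hypothetical names used for illustration and are not the code in this PR; the real list of types lives in SERIALIZABLE_TYPES in utilities.lua.

```lua
-- Types assumed safe to send across the sim-UI boundary.
-- This list is an assumption for the sketch, not the engine's actual list.
local SerializableTypes = {
    ['nil'] = true,
    boolean = true,
    number = true,
    string = true,
    table = true,
}

-- Hypothetical helper: copies `value`, replacing anything unserializable
-- with its string form and cutting cycles with a marker string.
local function SerializableDeepCopySketch(value, seen)
    local valueType = type(value)
    if not SerializableTypes[valueType] then
        -- functions, threads and userdata become strings
        return tostring(value)
    end
    if valueType ~= 'table' then
        return value
    end

    seen = seen or {}
    if seen[value] then
        -- Already visited: return a marker instead of recursing again.
        -- Note that shared (non-cyclic) references also become markers here.
        return '_seenTableId = "' .. seen[value] .. '"'
    end
    seen[value] = tostring(value)

    local copy = {}
    for k, v in pairs(value) do
        copy[SerializableDeepCopySketch(k, seen)] = SerializableDeepCopySketch(v, seen)
    end
    return copy
end

-- Usage: a two-table cycle with a function value
local a = { name = 'a', callback = function() end }
local b = { parent = a }
a.child = b
local copy = SerializableDeepCopySketch(a)
```

With this input, `copy.callback` is a string and `copy.child.parent` is a marker string rather than a reference back to `a`, so a recursive serializer terminates.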

Testing done on the proposed changes

The benchmark loads when you navigate to its file in the profiler and select one of the local/upvalue lua functions (this means the ArmyBrains table was serialized successfully).

An example command:

local a = {ForkThread(function()end)}
local b = {}
local c = {}
local d = {}
a.b = b
a.c = c
b.d = d
c.d = d
d.a = a
local utils = import('/lua/utilities.lua')
reprsl(utils.SerializableDeepCopy(a), {depth = 4})

Checklist

Summary by CodeRabbit

  • Chores
    • Added a new benchmarking suite to measure and analyze performance metrics.
    • Enhanced serialization utilities to improve data handling and compatibility across system boundaries.
    • Improved profiler infrastructure for better data processing and cycle detection.


utility for converting unserializable data to serializable data that can be sent across the sim-ui boundary
@lL1l1 lL1l1 requested review from 4z0t, Hdt80bro and speed2CZ December 20, 2025 04:37
@lL1l1 lL1l1 added the type: bug and area: sim (Area that is affected by the Simulation of the Game) labels Dec 20, 2025
@coderabbitai

coderabbitai bot commented Dec 20, 2025

Walkthrough

Three files updated to add Lua benchmark infrastructure and serialization utilities: a new benchmark module measuring ArmyBrains value-access performance across six different patterns, a serialization helper function with cycle detection for UI-sim boundary data transfer, and Profiler refactored to use the new serialization approach instead of in-place upvalue sanitization.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Benchmark module: lua/benchmarks/value-access.lua | New file adding six benchmark functions (BrainGlobalAccess, BrainUpvalueAccess, BrainLocalAccess, BrainGlobalCFuncAccess, BrainUpvaluedCFuncAccess, BrainLocalCFuncAccess) measuring elapsed time for different ArmyBrains data access patterns; includes ModuleName and BenchmarkData metadata table. |
| Serialization utilities: lua/utilities.lua | New public function SerializableDeepCopy(t) that recursively deep-copies tables while converting non-serializable types to strings; includes cycle detection via backrefs map and introduces SERIALIZABLE_TYPES enumeration for identifying safe-to-transfer types. |
| Profiler refactoring: lua/sim/Profiler.lua | Imports SerializableDeepCopy and replaces in-place upvalue sanitization loop (which substituted functions with placeholder strings) with direct deep-copy of info.upvalues. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Hoppy times with benchmarks bright,
Access patterns measured right,
Cycles caught, data blessed,
Serialization puts strings to test!
From loop to table, deep we go,
Copy safely, fast or slow. 🎯

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 22.22% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The pull request title accurately describes the main change: fixing the profiler's handling of unserializable data in function upvalues. |
| Description check | ✅ Passed | The pull request description comprehensively covers the issue, proposed changes, testing performed, and includes all checklist items marked as complete. |

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3fb6b69 and cf050a9.

📒 Files selected for processing (3)
  • lua/benchmarks/value-access.lua
  • lua/sim/Profiler.lua
  • lua/utilities.lua
🧰 Additional context used
🧬 Code graph analysis (2)
lua/sim/Profiler.lua (2)
engine/Library.lua (1)
  • import (51-52)
lua/utilities.lua (1)
  • SerializableDeepCopy (589-631)
lua/benchmarks/value-access.lua (1)
engine/Sim.lua (1)
  • GetArmyBrain (588-589)
🔇 Additional comments (7)
lua/utilities.lua (2)

572-580: LGTM - Clear documentation of serializable types.

The constant provides a clear enumeration of types that can safely cross the UI-sim boundary. Using a lookup table is efficient for type checking.


583-631: Well-structured recursive implementation with cycle detection.

The function correctly handles the serialization requirements. A few observations:

  1. The generic type annotation (@generic T) is slightly misleading since the function may transform types (e.g., functions → strings), but this is acceptable for documentation purposes.

  2. Line 624: Serializing keys with CreateSerializableAny(k) could theoretically cause key collisions if two distinct non-serializable keys (e.g., different function references) produce the same string representation. This is an unlikely edge case but worth noting for complex inputs.

lua/sim/Profiler.lua (2)

10-11: LGTM - Clean import of the new utility.

The import follows the established pattern in this file and provides the serialization capability needed for upvalues.


403-405: Good refactoring to use the centralized serialization utility.

This is cleaner than the previous approach of manually replacing non-serializable upvalues with placeholder strings. The SerializableDeepCopy function handles cycles and all non-serializable types uniformly.

lua/benchmarks/value-access.lua (3)

1-18: Good benchmark documentation with clear methodology.

The header comments provide useful context with actual timing results and a clear conclusion. The BenchmarkData table properly maps function names to descriptive titles for the benchmark UI.


20-65: Consistent benchmark structure for Lua table access patterns.

The three functions (BrainGlobalAccess, BrainUpvalueAccess, BrainLocalAccess) correctly isolate the variable lookup mechanism being tested:

  • Global: resolves ArmyBrains through _G each iteration
  • Upvalue: uses the pre-captured local at line 35
  • Local: uses the function-scoped local created before timing starts

The pattern of caching the timer and creating an unused assignment variable is appropriate for preventing optimization-related measurement artifacts.
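
The three lookup mechanisms can be sketched in plain Lua as follows. This is an illustration under assumptions, not the benchmark file from this PR: `ArmyBrainsStandIn`, `UpvaluedBrains`, and the function names are hypothetical, and `os.clock` stands in for whatever timer the real benchmarks use.

```lua
-- Stand-in for the game's global ArmyBrains table (hypothetical data).
ArmyBrainsStandIn = { { Name = 'ARMY_1' }, { Name = 'ARMY_2' } }

-- Captured once at file scope; the functions below see it as an upvalue.
local UpvaluedBrains = ArmyBrainsStandIn

local function GlobalAccess(iterations)
    local timer = os.clock            -- cache the timer outside the timed region
    local brain
    local start = timer()
    for _ = 1, iterations do
        brain = ArmyBrainsStandIn[1]  -- global lookup on every iteration
    end
    return timer() - start, brain
end

local function UpvalueAccess(iterations)
    local timer = os.clock
    local brain
    local start = timer()
    for _ = 1, iterations do
        brain = UpvaluedBrains[1]     -- upvalue lookup on every iteration
    end
    return timer() - start, brain
end

local function LocalAccess(iterations)
    local timer = os.clock
    local brains = ArmyBrainsStandIn  -- function-scoped local, set before timing
    local brain
    local start = timer()
    for _ = 1, iterations do
        brain = brains[1]             -- local lookup on every iteration
    end
    return timer() - start, brain
end
```

Each function returns the elapsed time and the last value read, so the three can be compared at the same iteration count. Capturing a table such as `UpvaluedBrains` as an upvalue is the kind of reference that ends up in a function's debug information.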


67-113: CFunc benchmarks mirror the structure of Lua access benchmarks.

The three CFunc-based functions follow the same pattern, providing a fair comparison. The upvalue at line 82 for GetArmyBrain is the type of reference that triggered the serialization issue this PR addresses.



@lL1l1 lL1l1 marked this pull request as ready for review December 20, 2025 04:39
b._seenTableId = tostring(_t)
end
-- format makes it easy to find in repr output
return '_seenTableId = "' .. b._seenTableId .. '"'
Contributor

If the table ID never changes, why dirty the table with an unoriginal field to store the seen table ID? We could use tostring every time.

Contributor Author

The table ID should change after the table is copied across the sim-ui boundary, so calling tostring on the copied table won't give useful information.
An alternative that avoids dirtying the table would be to return an extra table documenting all the references that were converted to strings.

Contributor

But you already are using tostring; as you've written it right now, if a table's _seenTableId field exists, it is always filled with tostring applied to that table. You gave a fine reason to use tostring in your previous reply, so I'm not sure why you're saying it doesn't give any useful information here.

Contributor Author

if a table's _seenTableId field exists, it is always filled with tostring applied to that table

This is not true, because the table is presumably copied across sim-ui, where the field holds tostring applied to the old table from before the copy. When you read the copied table, you need the old tostring ID to rebuild cyclic references.

Contributor

Oh I see - I thought _seenTableId was only being used for serialization (the return value on line 618). But you also want to leave it as a marker as part of the data in case you want to deserialize it (which we haven't set up) rather than only using the data we pass for textual purposes.
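
For illustration only, here is a rough sketch of what such a deserialization step could look like on the receiving side. Nothing like this exists in the PR; `RebuildReferences` is a hypothetical name, and the sketch assumes the received data is a plain tree in which the first copy of a revisited table carries a `_seenTableId` field and later references are strings of the form `_seenTableId = "<id>"`.

```lua
-- Hypothetical helper: turns marker strings back into table references.
-- Marker strings used as table keys are not handled in this sketch.
local function RebuildReferences(root)
    -- Pass 1: index every table that carries a _seenTableId field.
    -- The received data has no cycles yet, so plain recursion terminates.
    local byId = {}
    local function index(t)
        if t._seenTableId then
            byId[t._seenTableId] = t
        end
        for _, v in pairs(t) do
            if type(v) == 'table' then
                index(v)
            end
        end
    end
    index(root)

    -- Pass 2: swap marker strings for the tables they name.
    -- `visited` guards against the cycles this pass reintroduces.
    local visited = {}
    local function link(t)
        if visited[t] then
            return
        end
        visited[t] = true
        for k, v in pairs(t) do
            if type(v) == 'string' then
                local _, _, id = string.find(v, '^_seenTableId = "(.*)"$')
                if id and byId[id] then
                    t[k] = byId[id] -- assigning to an existing field is safe during pairs
                end
            elseif type(v) == 'table' then
                link(v)
            end
        end
    end
    link(root)
    return root
end

-- Usage with hand-written data in the assumed format
local received = {
    _seenTableId = 'table: 0x01',
    child = { parent = '_seenTableId = "table: 0x01"' },
}
RebuildReferences(received)
```

After the call, `received.child.parent` refers to `received` again. The stored ID is only a label here; it works because it was written before the copy, which is the point made in the reply above.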

