Date: 2025-11-17
Author: AI Analysis
Purpose: Analyze opportunities to simplify and modularize the GC with minimal codebase impact
Status: ANALYSIS - Recommendations for incremental improvement
The Lua garbage collector is a highly optimized tri-color incremental mark-and-sweep collector with generational optimization. After comprehensive analysis of the implementation and existing documentation, this report identifies 7 major opportunities for simplification and modularization that can:
- ✅ Reduce complexity by ~30-40%
- ✅ Improve maintainability through better separation of concerns
- ✅ Minimize codebase impact through incremental refactoring
- ✅ Preserve performance (target: ≤4.33s, ≤3% regression from 4.20s baseline)
- ✅ Maintain C API compatibility completely
Key Finding: The GC is fundamentally necessary for Lua's semantics (circular references, weak tables, resurrection), but its implementation complexity can be significantly reduced without changing its core algorithm.
Recommended Strategy: Incremental modularization focusing on phase extraction, state machine cleanup, and list consolidation - NOT wholesale GC removal.
- Current GC Architecture
- Complexity Analysis
- Simplification Opportunities
- Modularization Strategy
- Implementation Phases
- Impact Assessment
- Performance Considerations
- Recommendations
Type: Tri-color incremental mark-and-sweep with generational optimization
Color Encoding (lgc.h:107-148):
- WHITE: Unmarked (candidate for collection) - 2 shades distinguish the current cycle from the previous one, so objects allocated during a sweep are not freed
- GRAY: Marked but children unprocessed (work queue)
- BLACK: Fully processed (safe until next cycle)
Critical Invariant: Black objects never point to white objects (enforced by write barriers)
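To make the invariant concrete, here is a minimal sketch on a toy object graph. The `Node` type, `invariantHolds()` and `blacken()` are hypothetical names for illustration only; the real collector packs the color bits into each object's `marked` byte and does not store child vectors.

```cpp
#include <cassert>
#include <vector>

// Toy model only: not the representation used in lgc.h.
enum class Color { White, Gray, Black };

struct Node {
    Color color = Color::White;
    std::vector<Node*> children;  // outgoing references
};

// Returns true when no black node references a white node.
bool invariantHolds(const std::vector<Node*>& heap) {
    for (const Node* n : heap) {
        if (n->color != Color::Black) continue;
        for (const Node* child : n->children) {
            if (child->color == Color::White) return false;
        }
    }
    return true;
}

// Processes one gray node: shade its white children gray, then turn it black.
void blacken(Node& n) {
    assert(n.color == Color::Gray);
    for (Node* child : n.children) {
        if (child->color == Color::White) child->color = Color::Gray;
    }
    n.color = Color::Black;
}
```

Shading the children before blackening the parent is what keeps the invariant true at every step of marking; a later write that stores a white object into a black one is the case the write barriers exist to repair.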
Pause → Propagate → EnterAtomic → Atomic → SweepAllGC → SweepFinObj →
SweepToBeFnz → SweepEnd → CallFin → Pause
9 states (8 working phases plus the idle Pause state) with complex state transitions
Incremental Mode Lists (10):
1. allgc - All collectable objects
2. sweepgc - Current sweep position (pointer-to-pointer)
3. finobj - Objects with finalizers
4. gray - Gray work queue
5. grayagain - Objects to revisit in atomic
6. weak - Weak-value tables
7. ephemeron - Weak-key tables
8. allweak - All-weak tables
9. tobefnz - Ready for finalization
10. fixedgc - Never collected (interned strings)
Generational Mode Pointers (+6):
11. survival - Survived one cycle
12. old1 - Survived two cycles
13. reallyold - Survived 3+ cycles
14-16. Finobj generations (finobjsur, finobjold1, finobjrold)
enum class GCAge : lu_byte {
New = 0, // Created this cycle
Survival = 1, // Survived 1 cycle
Old0 = 2, // Barrier-aged (not truly old)
Old1 = 3, // Survived 2 cycles
Old = 4, // Truly old (skip in minor GC)
Touched1 = 5, // Old object modified this cycle
Touched2 = 6 // Old object modified last cycle
};
Age Transitions: Complex state machine with 7 states and conditional transitions
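As an illustration of the sweep-time part of these transitions, the sketch below expresses them as a lookup table. The mapping is modeled on the `nextage[]` table in upstream Lua 5.4's `sweepgen()`; that this port keeps the same mapping is an assumption, and `kNextAge` and `nextAge()` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using lu_byte = std::uint8_t;  // stand-in for Lua's lu_byte typedef

enum class GCAge : lu_byte {
    New = 0, Survival = 1, Old0 = 2, Old1 = 3, Old = 4, Touched1 = 5, Touched2 = 6
};

// Age assigned to a surviving object during a young-collection sweep,
// indexed by its current age (assumed to match upstream Lua 5.4).
constexpr std::array<GCAge, 7> kNextAge = {
    GCAge::Survival,  // from New
    GCAge::Old1,      // from Survival
    GCAge::Old1,      // from Old0
    GCAge::Old,       // from Old1
    GCAge::Old,       // from Old (unchanged)
    GCAge::Touched1,  // from Touched1 (unchanged by the sweep)
    GCAge::Touched2   // from Touched2 (unchanged by the sweep)
};

inline GCAge nextAge(GCAge age) {
    const auto index = static_cast<std::size_t>(age);
    assert(index < kNextAge.size());
    return kNextAge[index];
}
```

The sweep leaves the two touched states alone; in upstream Lua they advance in a separate pass over the gray lists, which is part of why the full state machine is hard to follow from any single function.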
Forward Barrier (lgc.cpp:309-324):
- When: Black→White write
- Action: Mark white object gray (restore invariant)
- Generational: Promote to Old0 if writer is old
Backward Barrier (lgc.cpp:331-342):
- When: Old object modified
- Action: Mark as "touched", add to grayagain list
- Purpose: Ensure old objects revisited if modified
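The two barriers can be sketched on a toy collector as follows. `Obj`, `Collector`, `barrierForward()` and `barrierBack()` are hypothetical stand-ins, not the signatures of `luaC_barrier_()` and `luaC_barrierback_()`, and a single `old` flag stands in for the full age field.

```cpp
#include <cassert>
#include <vector>

enum class Color { White, Gray, Black };

struct Obj {
    Color color = Color::White;
    bool old = false;      // toy stand-in for the generational age
    bool touched = false;  // toy stand-in for Touched1
};

struct Collector {
    std::vector<Obj*> gray;       // work queue
    std::vector<Obj*> grayagain;  // revisited in the atomic phase
};

// Forward barrier: a black parent now references a white child.
// Shade the child so the tri-color invariant holds again.
inline void barrierForward(Collector& gc, Obj& parent, Obj& child) {
    if (parent.color != Color::Black || child.color != Color::White) return;
    child.color = Color::Gray;
    if (parent.old) child.old = true;  // toy version of "promote to Old0"
    gc.gray.push_back(&child);
}

// Backward barrier: a black parent was modified. Re-gray the parent
// and queue it for a later revisit instead of chasing each new child.
inline void barrierBack(Collector& gc, Obj& parent) {
    if (parent.color != Color::Black) return;
    parent.color = Color::Gray;
    if (parent.old) parent.touched = true;
    gc.grayagain.push_back(&parent);
}
```

The early returns are the fast path: most writes do not involve a black parent, so the barrier usually costs only a comparison.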
Files:
- lgc.h - 479 lines
- lgc.cpp - 1,950 lines
- Total: ~2,400 lines of GC code
Complexity Hotspots:
- atomic() - 40 lines of critical sequential operations (lgc.cpp:1649-1687)
- convergeephemerons() - Iterative convergence loop (lgc.cpp:809-828)
- sweeplist() - Pointer-to-pointer manipulation (lgc.cpp:990-1007)
- youngcollection() - Generational GC with 6 sweep phases (lgc.cpp:1439-1484)
- GCTM() - Finalizer execution with error handling (lgc.cpp:1095-1130)
| Component | Complexity Score | Primary Cause | Lines of Code |
|---|---|---|---|
| Generational GC | ⭐⭐⭐⭐⭐ | 7 age states, 6 generational lists, age transitions | ~400 lines |
| Ephemeron Tables | ⭐⭐⭐⭐⭐ | Convergence loop, transitive marking | ~150 lines |
| Finalization | ⭐⭐⭐⭐ | Resurrection, re-finalization, error handling | ~250 lines |
| Write Barriers | ⭐⭐⭐⭐ | Two types, generational interactions | ~150 lines |
| State Machine | ⭐⭐⭐ | 8 phases, complex transitions | ~200 lines |
| List Management | ⭐⭐⭐ | 16 lists, pointer-to-pointer sweep | ~300 lines |
| Weak Tables | ⭐⭐⭐ | 3 modes, clearing logic | ~150 lines |
Total Complexity: ~1,600 lines of high-complexity code out of 1,950 total
Already Completed (Phase 91 from CLAUDE.md):
global_State has been refactored into subsystems:
- GCAccounting - totalbytes, GCdebt, GCmarked, etc.
- GCParameters - GCpause, GCstepmul, GCstepsize, etc.
- GCObjectLists - All 16 GC lists
Benefits Achieved:
- ✅ Better organization of GC state
- ✅ Clear ownership of fields
- ✅ Type safety improvements
Remaining Gaps:
- ❌ No phase extraction (all phases in single file)
- ❌ No algorithmic separation (mark/sweep/finalize)
- ❌ State machine still implicit (switch statements)
- ❌ No encapsulation of generational vs incremental modes
P1: Generational Mode Complexity (400 lines, 5/5 complexity)
- 7 age states (only need 3-4)
- 6 additional lists beyond incremental mode
- Complex age transition logic (nextage[] array)
- Barrier promotion logic (Old0 → Old1 → Old)
- Minor vs major collection decision logic
P2: Ephemeron Convergence (150 lines, 5/5 complexity)
- Iterative convergence loop (can run multiple times)
- Direction alternation (forward/backward traversal)
- Re-marking logic
- Performance unpredictable (depends on graph structure)
P3: Phase Coupling (200 lines, 3/5 complexity)
- All phases in single singlestep() function
- No clear phase abstraction
- State machine implicit in switch statement
- Difficult to test individual phases
P4: List Proliferation (300 lines, 3/5 complexity)
- 16 lists with overlapping purposes
- Complex list movement logic (e.g., allgc → finobj → tobefnz)
- Pointer-to-pointer sweep complicates abstraction
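For readers unfamiliar with the pointer-to-pointer idiom mentioned in P4, here is a minimal sketch on a toy singly linked list. `Obj` and `sweepList()` are hypothetical; the real `sweeplist()` also handles the two white shades, age updates, and a per-step object limit.

```cpp
#include <cassert>
#include <cstddef>

struct Obj {
    Obj* next = nullptr;
    bool marked = false;
};

// Unlinks and deletes every unmarked node and clears the mark on survivors.
// 'p' always points at the link that leads to the current node, so removing
// the head needs no special case.
std::size_t sweepList(Obj** p) {
    assert(p != nullptr);
    std::size_t freed = 0;
    while (*p != nullptr) {
        Obj* cur = *p;
        if (!cur->marked) {
            *p = cur->next;  // bypass the dead node
            delete cur;
            ++freed;
        } else {
            cur->marked = false;  // reset for the next cycle
            p = &cur->next;
        }
    }
    return freed;
}
```

Because the cursor is a link rather than a node, an incremental sweep can stop at any point and resume later from the saved `Obj**`, which is the role the `sweepgc` entry plays in the list above. It is also why the list is awkward to hide behind an iterator-style abstraction.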
P5: Finalization Complexity (250 lines, 4/5 complexity)
- Resurrection support (requires extra remark phase)
- Re-finalization support (FINALIZEDBIT manipulation)
- Error handling during __gc execution
- Emergency GC prevention during finalization
Current State: Generational mode is always available, adding 400 lines of complexity
Proposal: Make generational mode compile-time optional
#if defined(LUA_USE_GENERATIONAL_GC)
// Generational code (400 lines)
#else
// Incremental-only mode (simpler)
#endif
Benefits:
- ✅ Reduces code size by 20% when disabled
- ✅ Eliminates 7-state age machine (only need 2: new, old)
- ✅ Removes 6 generational lists
- ✅ Simplifies barrier logic (only forward barrier needed)
- ✅ Easier to understand and maintain
Costs:
- ⚠️ Performance regression for long-running programs (generational is faster)
- ⚠️ Some users may rely on generational mode
- ⚠️ API still needs to support lua_gc(L, LUA_GCSETMODE, LUA_GCGEN)
Recommendation: Implement this - Keep generational as default, but allow disabling for embedded systems
Effort: 15-20 hours
Risk: LOW (can be feature-flagged)
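One way the API cost could be handled is for a mode request to fail softly when generational support is compiled out. The sketch below assumes that policy; `GCConfig`, `GCMode`, `kGenerationalCompiledIn` and `trySetMode()` are hypothetical names, and only the `LUA_USE_GENERATIONAL_GC` macro comes from the proposal.

```cpp
#include <cassert>

#if defined(LUA_USE_GENERATIONAL_GC)
constexpr bool kGenerationalCompiledIn = true;
#else
constexpr bool kGenerationalCompiledIn = false;
#endif

enum class GCMode { Incremental, Generational };

struct GCConfig {
    GCMode mode = GCMode::Incremental;
};

// Returns true if the requested mode was applied. When generational support
// is compiled out, the request is rejected and the current mode is kept
// (an assumed policy, not one stated in this report).
inline bool trySetMode(GCConfig& cfg, GCMode requested) {
    // Guards against out-of-range values cast from an integer API argument.
    assert(requested == GCMode::Incremental || requested == GCMode::Generational);
    if (requested == GCMode::Generational && !kGenerationalCompiledIn) {
        return false;
    }
    cfg.mode = requested;
    return true;
}
```

Keeping the check in one helper means the public entry point compiles identically in both builds, and only the helper's result differs.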
Current State: 7 age states with complex transitions
Proposal: Reduce to 4 core ages in generational mode
enum class GCAge : lu_byte {
Young = 0, // New + Survival (merged)
Old = 1, // Old0 + Old1 + Old (merged)
Touched1 = 2, // Modified once
Touched2 = 3 // Modified twice
};
Transition Simplification:
Young ──(survive GC)──→ Old
Old ──(modified)──→ Touched1 ──(GC)──→ Touched2 ──(GC)──→ Old
Benefits:
- ✅ Simpler state machine (4 states vs 7)
- ✅ Clearer semantics (young/old distinction only)
- ✅ Less conditional logic in age advancement
Costs:
- ⚠️ May promote objects to old sooner (more conservative)
- ⚠️ Slightly less generational optimization
Recommendation: Consider implementing if generational mode retained
Effort: 10-15 hours
Risk: MEDIUM (requires careful performance testing)
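The proposed 4-state machine is small enough to write as two pure functions, sketched below. `afterCollection()` and `afterWrite()` are hypothetical names. The diagram does not say what a write to an already-touched object does, so restarting at Touched1 is an assumption here.

```cpp
#include <cassert>
#include <cstdint>

enum class GCAge : std::uint8_t { Young = 0, Old = 1, Touched1 = 2, Touched2 = 3 };

// Age after the object survives one collection.
inline GCAge afterCollection(GCAge age) {
    switch (age) {
        case GCAge::Young:    return GCAge::Old;
        case GCAge::Old:      return GCAge::Old;
        case GCAge::Touched1: return GCAge::Touched2;
        case GCAge::Touched2: return GCAge::Old;
    }
    assert(false && "invalid GCAge value");
    return age;
}

// Age after a write barrier fires on the object. Young objects are scanned
// every cycle anyway, so they need no tracking. Assumption: any write to a
// non-young object (re)starts the touched sequence at Touched1.
inline GCAge afterWrite(GCAge age) {
    return age == GCAge::Young ? GCAge::Young : GCAge::Touched1;
}
```

With the transitions in this form, the promotion cost noted above is visible directly: a Young object becomes Old after a single survival, where the 7-state design requires two.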
Current State: All phases in single singlestep() function with switch statement
Proposal: Extract phases into separate classes with interface
class GCPhase {
public:
virtual ~GCPhase() = default;
virtual l_mem execute(lua_State* L, int fast) = 0;
virtual GCState getState() const = 0;
virtual GCPhase* next() = 0; // Next phase in sequence
};
class PropagatePhase : public GCPhase {
l_mem execute(lua_State* L, int fast) override {
if (fast || G(L)->getGray() == NULL) {
return 1; // Done, move to next phase
}
return propagatemark(G(L));
}
GCState getState() const override { return GCState::Propagate; }
GCPhase* next() override { return &g_enterAtomicPhase; }
};
class AtomicPhase : public GCPhase { /* ... */ };
class SweepPhase : public GCPhase { /* ... */ };
// ... etc
Benefits:
- ✅ Clear separation of concerns (one class per phase)
- ✅ Easier to test individual phases
- ✅ Better encapsulation of phase-specific logic
- ✅ Explicit state machine (next() method)
- ✅ Can use polymorphism for phase-specific behavior
Costs:
- ⚠️ Virtual function overhead (minimal - only 1 call per GC step)
- ⚠️ More files to manage (8 phase classes)
Recommendation: Strongly recommend - Significantly improves maintainability
Effort: 30-40 hours
Risk: LOW (can be done incrementally, phase by phase)
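To show how a driver would use such phase objects, here is a reduced, self-contained toy with three phases instead of the full set. `ToyGC`, `Phase`, the `g_*` instances and `singleStep()` are hypothetical; the real phases would take `lua_State*` and return work units as in the interface above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy collector state; real phases would take lua_State* instead.
struct ToyGC {
    int grayCount = 0;             // pending gray objects
    std::vector<std::string> log;  // names of the phases that ran
};

class Phase {
public:
    virtual ~Phase() = default;
    virtual bool execute(ToyGC& gc) = 0;  // true once the phase is finished
    virtual Phase* next() = 0;            // successor in the cycle
    virtual const char* name() const = 0;
};

struct PausePhase final : Phase {
    bool execute(ToyGC&) override { return true; }
    Phase* next() override;
    const char* name() const override { return "pause"; }
};

struct PropagatePhase final : Phase {
    bool execute(ToyGC& gc) override {
        if (gc.grayCount > 0) --gc.grayCount;  // one unit of marking work
        return gc.grayCount == 0;
    }
    Phase* next() override;
    const char* name() const override { return "propagate"; }
};

struct SweepPhase final : Phase {
    bool execute(ToyGC&) override { return true; }
    Phase* next() override;
    const char* name() const override { return "sweep"; }
};

// One shared instance per phase, so next() can return stable pointers.
PausePhase g_pause;
PropagatePhase g_propagate;
SweepPhase g_sweep;

Phase* PausePhase::next() { return &g_propagate; }
Phase* PropagatePhase::next() { return &g_sweep; }
Phase* SweepPhase::next() { return &g_pause; }

// One incremental step: run the current phase, advance only when it is done.
Phase* singleStep(ToyGC& gc, Phase* current) {
    assert(current != nullptr);
    gc.log.push_back(current->name());
    return current->execute(gc) ? current->next() : current;
}
```

Each phase can be tested by constructing a `ToyGC` in the state of interest and calling `execute()` directly, which is the testability benefit listed above. The driver itself makes one virtual call to `execute()` per step, plus calls to `name()` and `next()` that a production version could avoid.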
Current State: 5 separate gray lists (gray, grayagain, weak, ephemeron, allweak)
Proposal: Use single gray list with priority/category field
enum class GrayCategory : lu_byte {
Normal = 0, // Regular gray objects
Again = 1, // To revisit in atomic
WeakVal = 2, // Weak-value tables
WeakKey = 3, // Ephemeron tables
WeakBoth = 4 // All-weak tables
};
// Add to GCObject:
GrayCategory gray_category;
// Single list traversal:
for (GCObject* obj : gray_list) {
switch (obj->gray_category) {
case GrayCategory::Normal:
propagatemark(g, obj);
break;
case GrayCategory::Again:
/* ... */
break;
// ...
}
}
Benefits:
- ✅ Reduces from 5 lists to 1
- ✅ Simpler list management
- ✅ Easier to understand object state
- ✅ Can prioritize processing order easily
Costs:
- ⚠️ Adds 1 byte per GCObject (gray_category field)
- ⚠️ May reduce cache locality if categories mixed in list
- ⚠️ Weak table convergence may need special handling
Recommendation: Maybe - Benefits unclear, adds field to every object
Effort: 25-30 hours
Risk: MEDIUM (affects hot path)
Current State: Convergence loop with direction alternation
do {
changed = 0;
for each ephemeron table {
if (traverseephemeron(g, h, dir)) {
propagateall(g);
changed = 1;
}
}
dir = !dir; // Alternate direction
} while (changed);
Proposal: Use fixed-point iteration without direction alternation
while (true) {
int marked_count = 0;
for each ephemeron table {
marked_count += traverseephemeron(g, h);
}
if (marked_count == 0) break; // Fixed point reached
propagateall(g);
}
Benefits:
- ✅ Simpler logic (no direction tracking)
- ✅ Easier to understand convergence condition
- ✅ More predictable performance
Costs:
- ⚠️ May take more iterations to converge (direction was optimization)
- ⚠️ Performance impact depends on ephemeron graph structure
Recommendation: Consider - Simplifies code, but needs benchmarking
Effort: 8-10 hours
Risk: MEDIUM (performance-sensitive)
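The iteration-count cost can be seen in a small runnable model of the fixed-point proposal. This is a toy: `Obj`, `EphemeronTable`, `traverseEphemeron()` and `convergeEphemerons()` are hypothetical, and marking a value stands in for the whole `propagateall()` step.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Obj {
    bool marked = false;
};

// An ephemeron table: a value stays alive only while its key is alive.
using EphemeronTable = std::vector<std::pair<Obj*, Obj*>>;  // (key, value)

// Marks values whose keys are already marked; returns how many were newly marked.
std::size_t traverseEphemeron(const EphemeronTable& table) {
    std::size_t newlyMarked = 0;
    for (const auto& entry : table) {
        assert(entry.first != nullptr && entry.second != nullptr);
        if (entry.first->marked && !entry.second->marked) {
            entry.second->marked = true;
            ++newlyMarked;
        }
    }
    return newlyMarked;
}

// Repeats full passes until one pass marks nothing; returns the pass count.
std::size_t convergeEphemerons(const std::vector<EphemeronTable>& tables) {
    std::size_t passes = 0;
    while (true) {
        ++passes;
        std::size_t marked = 0;
        for (const auto& table : tables) {
            marked += traverseEphemeron(table);
        }
        if (marked == 0) break;  // fixed point reached
    }
    return passes;
}
```

If a table holds the chain a → b → c with its entries stored in reverse dependency order, each forward pass resolves only one link. Chains like this are what the direction alternation in the current code is meant to shorten, so the returned pass count is a natural metric for the benchmark this proposal calls for.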
Current State: Implicit state machine in switch statement
Proposal: Use explicit state pattern with std::variant
using GCPhaseVariant = std::variant<
PauseState,
PropagateState,
EnterAtomicState,
AtomicState,
SweepAllGCState,
SweepFinObjState,
SweepToBeFnzState,
SweepEndState,
CallFinState
>;
class GCStateMachine {
private:
GCPhaseVariant current_phase;
public:
l_mem step(lua_State* L, int fast) {
return std::visit([&](auto& phase) {
return phase.execute(L, fast);
}, current_phase);
}
void transition(GCPhaseVariant next) {
current_phase = next;
}
};
Benefits:
- ✅ Type-safe state transitions
- ✅ Explicit state representation
- ✅ Compile-time exhaustiveness checking
- ✅ Better debugging (can inspect current_phase)
Costs:
- ⚠️ std::variant overhead (minimal)
- ⚠️ More complex type system
- ⚠️ Requires C++17 std::variant
Recommendation: Consider - Improves type safety, but adds complexity
Effort: 20-25 hours
Risk: LOW (mostly refactoring, same logic)
Current State: Finalization logic spread across multiple functions
Proposal: Extract into a FinalizerManager class
class FinalizerManager {
private:
global_State* g;
GCObject* finobj; // Objects with finalizers
GCObject* tobefnz; // Ready for finalization
public:
// Check if object needs finalizer
void checkFinalizer(GCObject* obj, Table* mt);
// Separate unreachable objects
void separateUnreachable(bool all);
// Execute one finalizer
int executeNext(lua_State* L);
// Execute all pending finalizers
void executeAll(lua_State* L);
// Handle resurrection after __gc
void handleResurrections(lua_State* L);
};
Benefits:
- ✅ Clear encapsulation of finalization logic
- ✅ Easier to test finalization in isolation
- ✅ Better separation from GC main loop
- ✅ Can add finalization metrics/debugging
Costs:
- ⚠️ Additional abstraction layer
- ⚠️ Need to pass FinalizerManager around
Recommendation: Strongly recommend - Improves maintainability significantly
Effort: 20-25 hours
Risk: LOW (mostly refactoring)
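As a sketch of what `separateUnreachable()` might do internally, the toy below moves unmarked objects from `finobj` to the tail of `tobefnz`. Appending at the tail, so that finalizers run in list order, is an assumption modeled on upstream Lua's `separatetobefnz()`. The `Obj` and `FinalizerLists` types are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

struct Obj {
    Obj* next = nullptr;
    bool marked = false;
};

struct FinalizerLists {
    Obj* finobj = nullptr;   // objects with finalizers
    Obj* tobefnz = nullptr;  // unreachable, waiting for __gc
};

// Moves unmarked objects (or every object when 'all' is true) from finobj
// to the end of tobefnz, preserving their relative order.
std::size_t separateUnreachable(FinalizerLists& lists, bool all) {
    // The two lists must be disjoint unless both are empty.
    assert(lists.finobj == nullptr || lists.finobj != lists.tobefnz);
    Obj** tail = &lists.tobefnz;
    while (*tail != nullptr) tail = &(*tail)->next;  // find current tail

    std::size_t moved = 0;
    Obj** p = &lists.finobj;
    while (*p != nullptr) {
        Obj* cur = *p;
        if (cur->marked && !all) {
            p = &cur->next;  // still reachable: stays in finobj
        } else {
            *p = cur->next;  // unlink from finobj
            cur->next = nullptr;
            *tail = cur;     // append to tobefnz
            tail = &cur->next;
            ++moved;
        }
    }
    return moved;
}
```

Owning both list heads in one struct is what lets this logic be exercised without a full `global_State`, which is the isolation benefit claimed above.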
src/memory/gc/
├── gc_core.h / .cpp # Main GC interface and coordination
├── gc_phases/
│ ├── phase_base.h # GCPhase interface
│ ├── phase_pause.cpp # Pause phase
│ ├── phase_propagate.cpp # Propagate phase
│ ├── phase_atomic.cpp # Atomic phase
│ ├── phase_sweep.cpp # Sweep phases (all 4)
│ └── phase_finalize.cpp # CallFin phase
├── gc_marking.h / .cpp # Mark algorithms (reallymarkobject, propagatemark, etc.)
├── gc_sweeping.h / .cpp # Sweep algorithms (sweeplist, sweeptolive, etc.)
├── gc_finalizer.h / .cpp # Finalization (separatetobefnz, GCTM, etc.)
├── gc_barriers.h / .cpp # Write barriers (barrier_, barrierback_)
├── gc_weak.h / .cpp # Weak table handling (clearbykeys, clearbyvalues, convergeephemerons)
├── gc_generational.h / .cpp # Generational mode (optional, #ifdef)
└── gc_state.h # GC state enums and types
Total: 12-15 files vs current 2 files
GCCore (gc_core.h):
- Public interface: luaC_step(), luaC_fullgc(), luaC_newobj(), etc.
- Coordinates phase execution
- Manages state machine
- Delegates to specialized modules
GCMarking (gc_marking.h):
- Mark algorithms (DFS traversal)
- Gray list management
- Tri-color invariant maintenance
- Internal use only (called by phases)
GCSweeping (gc_sweeping.h):
- Sweep algorithms (pointer-to-pointer)
- Object freeing (freeobj)
- List traversal
- Internal use only
GCFinalizer (gc_finalizer.h):
- Finalizer detection
- Object separation
- __gc execution
- Resurrection handling
- Internal use only
GCBarriers (gc_barriers.h):
- Forward barrier
- Backward barrier
- Generational barrier logic (if enabled)
- Public (called from setters in lobject.h, ltable.cpp, etc.)
GCWeak (gc_weak.h):
- Weak table traversal
- Ephemeron convergence
- Weak reference clearing
- Internal use only
GCGenerational (gc_generational.h - optional):
- Age management
- Minor/major collection
- Generational list management
- Internal use only (ifdef LUA_USE_GENERATIONAL_GC)
gc_core
├──> gc_phases (all phases)
│ ├──> gc_marking
│ ├──> gc_sweeping
│ ├──> gc_finalizer
│ └──> gc_weak
├──> gc_barriers (public interface)
└──> gc_generational (optional)
Key Principle: Core coordinates, phases execute, algorithms implement
Goal: Move marking logic to separate module
Steps:
- Create gc_marking.h and gc_marking.cpp
- Move reallymarkobject(), propagatemark(), propagateall()
- Move markvalue(), markobject(), markmt(), markbeingfnz()
- Create GCMarking class with methods
- Update callers to use new interface
- Test: All tests must pass, performance ≤4.33s
Deliverable: gc_marking.h/.cpp with 6-8 public methods
Goal: Move sweep logic to separate module
Steps:
- Create gc_sweeping.h and gc_sweeping.cpp
- Move sweeplist(), sweeptolive(), entersweep()
- Move freeobj() and related functions
- Create GCSweeping class
- Update callers
- Test: All tests must pass, performance ≤4.33s
Deliverable: gc_sweeping.h/.cpp with 4-5 public methods
Goal: Encapsulate finalization logic
Steps:
- Create gc_finalizer.h and gc_finalizer.cpp
- Move GCObject::checkFinalizer() to GCFinalizer::check()
- Move separatetobefnz(), GCTM(), callallpendingfinalizers()
- Create FinalizerManager class
- Handle resurrection logic
- Update callers
- Test: Finalization tests (gc.lua, errors.lua)
Deliverable: gc_finalizer.h/.cpp with FinalizerManager class
Goal: Isolate weak table complexity
Steps:
- Create gc_weak.h and gc_weak.cpp
- Move traverseweakvalue(), traverseephemeron(), convergeephemerons()
- Move clearbykeys(), clearbyvalues(), getmode()
- Create WeakTableManager class
- Implement simplified convergence (Opportunity 5)
- Update callers
- Test: Weak table tests
Deliverable: gc_weak.h/.cpp with WeakTableManager class
Goal: Centralize barrier logic
Steps:
- Create gc_barriers.h and gc_barriers.cpp
- Move luaC_barrier_(), luaC_barrierback_()
- Create BarrierManager class
- Keep macros in lgc.h (inline wrappers)
- Update barrier call sites if needed
- Test: Barrier stress tests
Deliverable: gc_barriers.h/.cpp with BarrierManager class
Goal: Implement phase extraction (Opportunity 3)
Steps:
- Create gc_phases/phase_base.h with GCPhase interface
- Create individual phase classes (8 classes)
- Implement execute() and next() for each
- Refactor singlestep() to use phase objects
- Test each phase individually
- Test full GC cycle
- Benchmark performance
Deliverable: 8 phase classes + refactored singlestep()
Goal: Make generational mode compile-time optional (Opportunity 1)
Steps:
- Create gc_generational.h/.cpp
- Move all generational code under #ifdef LUA_USE_GENERATIONAL_GC
- Provide fallback for incremental-only mode
- Add CMake option LUA_ENABLE_GENERATIONAL_GC=ON (default)
- Test both modes
- Benchmark incremental-only mode
Deliverable: Optional generational mode, ~400 lines conditionally compiled
Goal: Comprehensive testing of new architecture
Steps:
- Create unit tests for each module
- Create integration tests for GC cycles
- Stress test with large programs
- Memory leak testing (Valgrind)
- Performance benchmarking (5-run average)
- Update documentation
- Create module dependency diagram
Deliverable: Test suite, benchmarks, updated docs
| File | Current Lines | Changes | New Lines | Impact |
|---|---|---|---|---|
| lgc.cpp | 1,950 | Extract to modules | ~500 | HIGH - 75% reduction |
| lgc.h | 479 | Keep interface, move internals | ~200 | MEDIUM - 60% reduction |
| lobject.h | ~900 | Update barrier macros | ~900 | LOW - Minor changes |
| ltable.cpp | ~800 | Update barrier calls | ~800 | LOW - No changes |
| lvm.cpp | ~2000 | Update barrier calls | ~2000 | LOW - No changes |
| New files | 0 | Create modules | ~2,000 | N/A - New code |
Total Impact:
- Files modified: ~8-10 (lgc.cpp, lgc.h, barrier call sites)
- Files created: ~12-15 (new modules)
- Net lines changed: ~500 lines modified, ~2,000 lines moved (not new)
- Codebase impact: MODERATE (mostly reorganization, minimal logic changes)
Public API (lua.h, lualib.h, lauxlib.h):
- ❌ NO CHANGES - 100% compatible
Internal API (lgc.h - used by VM):
- ✅ Barrier macros: Same interface (inline wrappers)
- ✅ luaC_step(): Same signature
- ✅ luaC_fullgc(): Same signature
- ✅ luaC_newobj(): Same signature
- ✅ Object placement new operators: Unchanged
Impact: ZERO - Complete API compatibility
Expected:
- Phase extraction: +0-2% overhead (virtual function calls)
- Module boundaries: +0-1% overhead (function calls vs inline)
- Optional generational: -5% performance if disabled (acceptable tradeoff)
- Simplified convergence: +0-3% overhead (depends on ephemeron usage)
Target: ≤4.33s (≤3% from 4.20s baseline)
Mitigation:
- Inline critical functions at module boundaries
- Profile after each phase, revert if regression >3%
- Use LTO (Link-Time Optimization) to inline across modules
Critical Paths (must not regress):
- VM execution (lvm.cpp) → Barrier calls (every object write)
- Table operations (ltable.cpp) → Barrier calls
- Object allocation (luaC_newobj) → Called on every allocation
- Incremental step (luaC_step) → Called every N allocations
Optimization Strategy:
- Keep barriers as inline macros (no function call overhead)
- Inline luaC_newobj() if possible
- Profile luaC_step() carefully after phase extraction
- Use __attribute__((always_inline)) for critical functions
Primary: all.lua test suite (current: 4.20s avg)
Stress Tests:
- gc.lua - Basic GC correctness
- gengc.lua - Generational GC stress
- Large allocation - 10M+ objects
- Deep call stacks - 1000+ levels
- Many weak tables - 100+ ephemeron tables
- Circular references - Complex object graphs
Acceptance Criteria:
- All tests pass ("final OK !!!")
- Performance ≤4.33s (≤3% regression)
- No memory leaks (Valgrind clean)
- No crashes under stress
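The performance criterion can be checked mechanically. The sketch below assumes the benchmark harness reports per-run wall-clock seconds; `averageSeconds()` and `withinBudget()` are hypothetical helpers, while the 4.20s baseline and 3% budget come from this report.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

constexpr double kBaselineSeconds = 4.20;  // all.lua average, from this report
constexpr double kMaxRegression = 0.03;    // 3% budget

// Mean wall-clock time across benchmark runs.
inline double averageSeconds(const std::vector<double>& runs) {
    assert(!runs.empty());
    return std::accumulate(runs.begin(), runs.end(), 0.0) /
           static_cast<double>(runs.size());
}

// True when the average stays within the regression budget.
inline bool withinBudget(const std::vector<double>& runs) {
    return averageSeconds(runs) <= kBaselineSeconds * (1.0 + kMaxRegression);
}
```

The exact limit is 4.20 × 1.03 = 4.326s, which this report rounds to 4.33s. Averaging several runs, as the 5-run benchmarking step suggests, keeps a single noisy run from failing or passing the gate on its own.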
✅ RECOMMEND: Phase 1-5 (Extract Modules)
- Effort: 75-100 hours
- Risk: LOW (mostly code movement)
- Benefit: 40% code organization improvement
- Priority: HIGH
Modules to extract first:
- ✅ GCMarking - Clear responsibility (marking algorithms)
- ✅ GCSweeping - Clear responsibility (sweep algorithms)
- ✅ FinalizerManager - High complexity, good isolation
- ✅ WeakTableManager - High complexity, clear boundary
- ✅ BarrierManager - Clear interface, widely used
Expected Outcome: lgc.cpp reduced from 1,950 to ~500 lines
🤔 CONSIDER: Phase Extraction (Opportunity 3)
- Effort: 30-40 hours
- Risk: MEDIUM (affects GC main loop)
- Benefit: Better testability, clearer state machine
- Priority: MEDIUM
Decision Criteria:
- If testing individual phases is valuable → Implement
- If performance overhead >2% → Don't implement
- Prototype first, measure overhead
🔬 RESEARCH: Optional Generational Mode (Opportunity 1)
- Effort: 15-20 hours
- Risk: MEDIUM (performance-sensitive)
- Benefit: 20% code reduction when disabled, simpler mental model
- Priority: LOW
Decision Criteria:
- Benchmark incremental-only mode first
- If regression <10% for long-running programs → Implement as option
- If regression >10% → Don't implement
🔬 RESEARCH: Simplified Age Management (Opportunity 2)
- Effort: 10-15 hours
- Risk: MEDIUM
- Benefit: Simpler state machine (7→4 states)
- Priority: LOW
Decision Criteria:
- Only if generational mode retained
- Benchmark carefully with GC-heavy workloads
- If regression >5% → Don't implement
🔬 RESEARCH: Ephemeron Simplification (Opportunity 5)
- Effort: 8-10 hours
- Risk: MEDIUM
- Benefit: Simpler convergence logic
- Priority: LOW
Decision Criteria:
- Benchmark with ephemeron-heavy code
- If iteration count increases >20% → Don't implement
- If performance acceptable → Implement
❌ DO NOT: Consolidate Gray Lists (Opportunity 4)
- Reason: Adds 1 byte per object, unclear benefits
- Alternative: Keep current 5-list approach
❌ DO NOT: State Machine with std::variant (Opportunity 6)
- Reason: Adds complexity, minimal benefit
- Alternative: Explicit state machine with phase classes (Opportunity 3) if needed
❌ DO NOT: GC Removal (from GC_REMOVAL_PLAN.md)
- Reason: Fundamentally incompatible with Lua semantics
- Reason: Circular references require collection
- Reason: std::shared_ptr doesn't solve the problem (cycle detection is still required, which is itself a GC)
- Verdict: Keep GC, improve its implementation
Current State:
- 1,950 lines of complex GC code in 2 files
- 16 linked lists, 8 phases, 7 age states
- High complexity in generational mode, ephemerons, finalization
- Already has some modularization (global_State subsystems)
Simplification Potential:
- 40% code organization improvement through module extraction (HIGH CONFIDENCE)
- 20% code reduction through optional generational mode (MEDIUM CONFIDENCE)
- 10% complexity reduction through simplified convergence/ages (LOW CONFIDENCE)
Recommended Strategy:
- ✅ Extract 5 core modules (marking, sweeping, finalizer, weak, barriers) - 75-100 hours
- 🤔 Consider phase extraction if testability important - 30-40 hours
- 🔬 Research optional generational after modules extracted - 15-20 hours
Total Effort: 120-160 hours for high-confidence improvements
Milestone 1: Module Extraction (8-10 weeks, 75-100 hours)
- Week 1-2: Extract GCMarking module
- Week 3-4: Extract GCSweeping module
- Week 5-6: Extract FinalizerManager
- Week 7-8: Extract WeakTableManager
- Week 9-10: Extract BarrierManager, testing, benchmarking
Milestone 2: Phase Extraction (4-5 weeks, 30-40 hours) - OPTIONAL
- Week 1-2: Create phase interface, implement 4 phase classes
- Week 3-4: Implement remaining 4 phase classes
- Week 5: Testing, benchmarking, decision point
Milestone 3: Optional Generational (2-3 weeks, 15-20 hours) - OPTIONAL
- Week 1-2: Extract generational code, create compile option
- Week 3: Testing, benchmarking, decision point
Total Timeline: 10-18 weeks depending on optional milestones
Code Quality:
- ✅ lgc.cpp reduced from 1,950 to ≤600 lines
- ✅ Clear module boundaries with <10 public methods each
- ✅ Each module testable in isolation
- ✅ Module dependency graph is acyclic
Performance:
- ✅ All tests pass ("final OK !!!")
- ✅ Performance ≤4.33s (≤3% from 4.20s baseline)
- ✅ No memory leaks (Valgrind clean)
- ✅ No crashes under stress testing
Maintainability:
- ✅ Each module has single responsibility
- ✅ Clear encapsulation (private implementation)
- ✅ Well-documented interfaces
- ✅ Easier to onboard new contributors
The Lua garbage collector is a sophisticated, performance-critical system that is fundamentally necessary for Lua's semantics. Attempts to remove it entirely (as explored in GC_REMOVAL_PLAN.md) are not feasible due to circular references, weak tables, and resurrection semantics.
However, the implementation complexity can be significantly reduced through:
- ✅ Module extraction (HIGH VALUE, LOW RISK) - Extract 5 core modules to reduce lgc.cpp from 1,950 to ~500-600 lines
- 🤔 Phase extraction (MEDIUM VALUE, MEDIUM RISK) - Consider if better testability needed
- 🔬 Optional generational (UNCERTAIN VALUE, MEDIUM RISK) - Research for embedded systems
Recommended Next Steps:
- Start with Module Extraction (Phases 1-5) - 75-100 hours
- Benchmark after each module to ensure ≤3% regression
- Evaluate phase extraction after modules complete
- Defer generational research until modules stable
Key Principle: Incremental improvement with continuous validation - each change must pass all tests and meet performance targets before proceeding.
Expected Outcome: 40% improvement in code organization with zero performance regression and 100% API compatibility.
Document Version: 1.0
Date: 2025-11-17
Status: ANALYSIS COMPLETE - Ready for Phase 1 implementation approval
Related Documents:
- CLAUDE.md - Project status and guidelines
- GC_PITFALLS_ANALYSIS.md - Detailed GC architecture analysis
- GC_REMOVAL_PLAN.md - Why GC removal is not feasible
- GC_REMOVAL_OWNERSHIP_PLAN.md - Alternative ownership approaches (not recommended)