Skip to content

code-coach review: PR #7 perf series — 1 must-fix + 2 should-fix + 2 suggestions #8

@psmon

Description

@psmon

Source

Verdict

Severity Count
Must-fix 1
Should-fix 2
Suggestion 2

PR #7 closed without merge so the contributor can iterate. The ExePath fix (04baa6f) is independently correct and welcome as its own PR.

Findings

M1 (Must-fix) — AgentBotActor leaks live HashSet<string> across actor boundary

File: Project/ZeroCommon/Actors/AgentBotActor.cs:173

// PR
Sender.Tell(new ActiveConversationsReply(_activeConversations));

// Fix
Sender.Tell(new ActiveConversationsReply([.. _activeConversations]));

_activeConversations is a HashSet<string> only safe to touch on the AgentBotActor's dispatcher. Akka's invariant: messages must not share mutable state. Once the bare HashSet reference crosses the Tell boundary, the receiver (or its Ask continuation) can enumerate while the bot processes a MarkConversationActive / ClearConversationActive / ResetReactorSessionInvalidOperationException: Collection was modified during enumeration, or undefined results from a mid-rehash Contains. The HashSet typically holds 1–3 entries; the perf upside is essentially zero versus the correctness regression.

[.. _activeConversations] (collection-expression spread → string[]) keeps the allocation-cheap shape the optimization plan wanted without breaking the actor invariant.

S1 (Should-fix) — ActiveConversationsReply.Active narrowed IReadOnlyList<string>IReadOnlyCollection<string>

File: Project/ZeroCommon/Actors/Messages.cs:271

// PR
public sealed record ActiveConversationsReply(IReadOnlyCollection<string> Active);

// Fix
public sealed record ActiveConversationsReply(IReadOnlyList<string> Active);

Public actor message contract. IReadOnlyList<T> exposes indexer + stable order; IReadOnlyCollection<T> does not. Existing tests happen not to use the indexer, so they pass — but downstream consumers that do (or future code that wants positional comparison) silently break. The narrowing isn't required for the M1 fix: [.. _activeConversations] returns string[], which already implements IReadOnlyList<string>.

S2 (Should-fix) — ExternalChatSession.Messages = _history.AsReadOnly() exposes a live view of mutable history

File: Project/ZeroCommon/Llm/ExternalChatSession.cs:55 (and Project/ZeroCommon/Llm/Tools/ExternalAgentToolLoop.cs:202)

// PR
Messages = _history.AsReadOnly(),

// Fix
Messages = _history.ToArray(),

List<T>.AsReadOnly() is a wrapper over the same backing list; the previous _history.ToList() was a snapshot. PR ships a test pinning the regressed semantic:

[Fact]
public void LlmRequest_Messages_is_not_mutated_through_readonly_wrapper()
{
    var history = new List<LlmMessage> { LlmMessage.User("a") };
    var request = new LlmRequest { Messages = history.AsReadOnly() };
    history.Add(LlmMessage.User("b"));
    Assert.Equal(2, request.Messages.Count);  // ← demonstrates external mutation IS visible
}

Test name says "is_not_mutated" but the assertion proves external mutation IS reflected through the wrapper.

In OpenAiCompatibleProvider.BuildRequestBody body construction + serialization is synchronous before the first await, so single in-flight requests don't race today. But any future await inserted between body-build and HttpClient.SendAsync, or any concurrent fan-out, would silently include later-appended turns on the wire. Same applies to ExternalAgentToolLoop.

_history.ToArray() keeps the perf shape (single allocation, no wrapper indirection) and restores the snapshot semantic. OptimizationTests.cs should be updated and the misleading test name fixed.

Sg1 (Suggestion) — AppLogger size estimate undercounts UTF-8 by ~3× for CJK

File: Project/ZeroCommon/AppLogger.cs:138

// PR
_fileSizeEstimate += entry.Length + 2;

// Suggested
_fileSizeEstimate += entry.Length * 3 + 2;   // conservative CJK-aware bound

string.Length is UTF-16 code units. CJK BMP characters encode to 3 UTF-8 bytes. The optimization plan claims "대부분 ASCII 위주" but this codebase's logs are Korean-heavy. With the new estimator, rotation triggers at ~3× the configured MaxLogBytes for Korean-dominant log streams. Multiply by 3 (conservative bound) or restore GetByteCount — the cost on a single short log line is negligible.

The PR's own test (AppLogger.Log("한글 테스트 메시지 日本語テスト 中文测试")) only asserts no-crash, so the regression slips past CI.

Sg2 (Suggestion) — VoiceSegmenterFlow._buffer could stay typed as MemoryStream? (LSP)

File: Project/ZeroCommon/Voice/Streams/VoiceSegmenterFlow.cs:63, 99

// PR
private RecyclableMemoryStream? _buffer;
_buffer = (RecyclableMemoryStream)PoolManager.GetStream("VoiceSegmenter", capacityHint);

// Suggested
private MemoryStream? _buffer;
_buffer = PoolManager.GetStream("VoiceSegmenter", capacityHint);   // no cast

No RecyclableMemoryStream-specific API is used. RecyclableMemoryStreamManager.GetStream returns MemoryStream precisely so callers don't couple to the derived type. Pooling behavior is preserved either way — Dispose is virtual.

Recommendation

  1. Re-open as a fresh PR (or new branch) with M1, S1, S2 applied.
  2. Sg1, Sg2 can land in the new PR or a follow-up — author's call.
  3. Consider splitting 04baa6f (ExePath quoting) into its own PR so the bug fix can merge immediately while the perf work iterates.
  4. Consider adding harness/knowledge/akka-message-immutability.md so the next contributor doesn't hit the same .ToList() removal trap.

Closes-when

  • M1 applied: AgentBotActor:173 uses defensive copy
  • S1 applied: ActiveConversationsReply.Active typed IReadOnlyList<string>
  • S2 applied: ExternalChatSession.cs:55 and ExternalAgentToolLoop.cs:202 use snapshot semantics; misleading test name fixed
  • Sg1 applied or explicitly accepted (CJK undercount documented)
  • Sg2 applied or explicitly accepted

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions