Skip to content

fix: harden awakening traces and live log streams#30

Merged
3aKHP merged 5 commits into
mainfrom
fix/awakening-trace-log-performance
May 27, 2026
Merged

fix: harden awakening traces and live log streams#30
3aKHP merged 5 commits into
mainfrom
fix/awakening-trace-log-performance

Conversation

@3aKHP
Copy link
Copy Markdown
Owner

@3aKHP 3aKHP commented May 27, 2026

Summary

  • batch the live log and LLM trace streams in the web admin with stable entry ids and bounded buffers
  • add bot action trace coverage for outbound actions with explicit trigger reasons
  • keep full HTTP request/response payloads in opt-in LLM Trace, while keeping always-on Bot Action Trace focused on action metadata
  • harden passive awakening behavior by limiting extend windows, adding internal trigger prompts, restricting passive image injection, and avoiding voice-transcript passive trigger bypasses
  • update public docs, changelog, and tests for the new behavior

Code Review Follow-up

  • Fixed RAF buffer/clear edge cases found by the frontend reviewer.
  • Split Bot Action Trace metadata from full LLM Trace payloads so debugging traces stay complete without putting message previews in always-on action logs.
  • Fixed boredom/system awakening history persistence and group voice passive-trigger boundaries.
  • Addressed Bot review follow-ups for lyrics trace fallback, overlay trace field validation, and an unreachable test branch.

Verification

  • pre-push: ruff check
  • pre-push: npm run type-check
  • pre-push: example config validation
  • pre-push: pytest -n auto (617 passed)
  • local: .venv\\Scripts\\python.exe -m pytest -q (617 passed, 1 deselected)

3aKHP added 2 commits May 27, 2026 14:13
- Batch live log and trace SSE updates with requestAnimationFrame.

- Use stable stream entry ids to avoid index-key churn during filtering and truncation.
- Add bot action trace coverage for outbound actions with explicit trigger reasons.

- Limit awakening extend windows to explicit LLM triggers and filter low-signal messages.

- Inject internal awakening prompts and current-message images for supported passive triggers.

- Update tests and docs for passive awakening behavior.
Copy link
Copy Markdown

@khpilot khpilot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bot Review

The PR introduces a new bot_action_trace module, hardens awakening behavior (extend windows restricted to explicit LLM triggers, internal trigger instructions separated from conversation history, passive image injection with per-rule limits), adds trace coverage to multiple outbound action paths, and batches frontend live log/trace streams. The primary risks are: (1) the bot_action_trace.py source contains literal [REDACTED] markers in the context manager and overlay functions that, if present in the deployed code, would cause NameError crashes — though tests pass suggesting this may be a review-tool artifact; (2) _send_lyrics_forward calls _trace_context() three separate times, creating duplicate trace events when forward msg fails and falls through to plain send; (3) overlay_bot_action_trace can raise TypeError if callers pass non-dataclass-field keys; (4) minor test and frontend concerns.

🔴 HIGH: The bot_action_trace context manager and overlay_bot_action_trace function contain literal [REDACTED] markers in place of what should be token = _current_trace.set(next_trace) and token = _current_trace.set(replace(parent, **normalized)). If these markers exist in the deployed source, the module will fail to import (or the finally blocks will raise NameError for undefined 'token'). The unit tests pass (test_bot_action_trace_payload_uses_context enters the bot_action_trace context manager successfully), which suggests this may be a review-tool display artifact — but the raw file content read from the branch also shows [REDACTED]. Verification required.

File: quickquip/common/bot_action_trace.py

Confirm the actual source on the branch has token = _current_trace.set(next_trace) on line 206 and token = _current_trace.set(replace(parent, **normalized)) on line 233. If [REDACTED] is genuinely in the source, replace with the correct ContextVar.set() calls.

🟡 MEDIUM: _send_lyrics_forward defines _trace_context() once but calls it three separate times (one per fallback branch). Each call creates a new bot_action_trace context manager object. If the send_group_forward_msg attempt fails with an exception, the first with-block exits (potentially logging a trace via the NoneBot API hook), then the fallback branch calls _trace_context() again and enters another with-block — producing two trace log entries for what is logically one action (send lyrics). Additionally, the _trace_context() function allocates a new BotActionTrace and UUID on each invocation regardless of whether the context manager will be used.

File: quickquip/adapters/nonebot/command_parts/common.py

Compute trace_context once at the top of _send_lyrics_forward, before the branching logic. If it is not None, wrap all branches in a single with-block (or restructure to avoid the try/except fallthrough pattern creating duplicate traces).

🟡 MEDIUM: overlay_bot_action_trace passes **normalized to replace(parent, ...). The normalized dict is built from **updates and only coerces group_id/user_id/incoming_message_id; all other keys from updates pass through as-is. If a caller passes a key that is not a field of the BotActionTrace dataclass, dataclasses.replace() will raise TypeError. The function is not called anywhere in the current codebase, making it dormant dead code with a latent bug.

File: quickquip/common/bot_action_trace.py

Either filter normalized to only include valid BotActionTrace field names before calling replace(), or document that callers must only pass dataclass field names. Better: use {k: v for k, v in normalized.items() if k in {f.name for f in fields(BotActionTrace)}}.

🟢 LOW: _trace_context() is defined as a nested function inside _send_lyrics_forward and captures formatted, event, and group_id from the enclosing scope. The reply_preview=formatted argument may produce a very long string (full generated lyrics) that exceeds the 120-char _PREVIEW_LIMIT — correctly truncated by _preview() — but the unreduced string is still stored in the BotActionTrace dataclass instance before truncation. This wastes memory for large lyrics outputs.

File: quickquip/adapters/nonebot/command_parts/common.py

The _preview() call inside bot_action_trace handles truncation at storage time, but the formatted string is still held in the closure. Consider truncating before passing to bot_action_trace if memory is a concern for large lyric outputs. Not critical.

🟢 LOW: test_interest_does_not_open_extend_window contains a dead-code branch: if first.opens_extend_window: s.mark_awakened(...). The preceding assertion assert first.opens_extend_window is False makes this branch unreachable. If the intent was defensive coding, it should be removed; if the intent was to test that the extend window is NOT opened, the if guard is misleading.

File: tests/unit/chat/test_awakening.py

Remove the if first.opens_extend_window: guard and the dead s.mark_awakened(...) call. The assertion on the line above already proves the condition is false.

🟢 LOW: _is_extend_eligible_message strips structural parts then tests the cleaned text. A message like "好?" passes the meaningful-text check (contains CJK char) but is rejected because punctuationless yields "好" which is in _EXTEND_REJECT_TEXTS. However, "好啊" (2 chars, not in reject set, not a question) is rejected by len(punctuationless) < 3 and not is_short_question. The boundary between rejected interjections and accepted short messages is fuzzy — some 2-char utterances that contain a reject-set character plus another meaningful character (e.g., "嗯对") would also be rejected because punctuationless would be "嗯对" which is NOT in the set but compact would be "嗯对" (assuming lowercase), also not in set. Actually "嗯对" has len 2 < 3 and is not a short question, so it's rejected. This seems overly aggressive for 2-char acknowledgments in conversation.

File: quickquip/chat/awakening.py

Consider raising the minimum length for rejection from 3 to 2, or adding a separate allowlist of common short affirmative responses. The current behavior rejects all 2-character non-question messages, which may include legitimate conversational continuations.


Automated review. Reply with @KHPilot[bot] to ask follow-up questions.

model=model,
source=source,
)
token = _current_trace.set(next_trace)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HIGH: The bot_action_trace context manager and overlay_bot_action_trace function contain literal [REDACTED] markers in place of what should be token = _current_trace.set(next_trace) and token = _current_trace.set(replace(parent, **normalized)). If these markers exist in the deployed source, the module will fail to import (or the finally blocks will raise NameError for undefined 'token'). The unit tests pass (test_bot_action_trace_payload_uses_context enters the bot_action_trace context manager successfully), which suggests this may be a review-tool display artifact — but the raw file content read from the branch also shows [REDACTED]. Verification required.

Confirm the actual source on the branch has token = _current_trace.set(next_trace) on line 206 and token = _current_trace.set(replace(parent, **normalized)) on line 233. If [REDACTED] is genuinely in the source, replace with the correct ContextVar.set() calls.

{
"type": "node",
"data": {
"name": "歌词",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MEDIUM: _send_lyrics_forward defines _trace_context() once but calls it three separate times (one per fallback branch). Each call creates a new bot_action_trace context manager object. If the send_group_forward_msg attempt fails with an exception, the first with-block exits (potentially logging a trace via the NoneBot API hook), then the fallback branch calls _trace_context() again and enters another with-block — producing two trace log entries for what is logically one action (send lyrics). Additionally, the _trace_context() function allocates a new BotActionTrace and UUID on each invocation regardless of whether the context manager will be used.

Compute trace_context once at the top of _send_lyrics_forward, before the branching logic. If it is not None, wrap all branches in a single with-block (or restructure to avoid the try/except fallthrough pattern creating duplicate traces).

_current_trace.reset(token)


@contextmanager
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MEDIUM: overlay_bot_action_trace passes **normalized to replace(parent, ...). The normalized dict is built from **updates and only coerces group_id/user_id/incoming_message_id; all other keys from updates pass through as-is. If a caller passes a key that is not a field of the BotActionTrace dataclass, dataclasses.replace() will raise TypeError. The function is not called anywhere in the current codebase, making it dormant dead code with a latent bug.

Either filter normalized to only include valid BotActionTrace field names before calling replace(), or document that callers must only pass dataclass field names. Better: use {k: v for k, v in normalized.items() if k in {f.name for f in fields(BotActionTrace)}}.

{
"type": "node",
"data": {
"name": "歌词",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOW: _trace_context() is defined as a nested function inside _send_lyrics_forward and captures formatted, event, and group_id from the enclosing scope. The reply_preview=formatted argument may produce a very long string (full generated lyrics) that exceeds the 120-char _PREVIEW_LIMIT — correctly truncated by _preview() — but the unreduced string is still stored in the BotActionTrace dataclass instance before truncation. This wastes memory for large lyrics outputs.

The _preview() call inside bot_action_trace handles truncation at storage time, but the formatted string is still held in the closure. Consider truncating before passing to bot_action_trace if memory is a concern for large lyric outputs. Not critical.

load_awakening_config,
run_boredom_check,
select_passive_trigger_image_urls,
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOW: test_interest_does_not_open_extend_window contains a dead-code branch: if first.opens_extend_window: s.mark_awakened(...). The preceding assertion assert first.opens_extend_window is False makes this branch unreachable. If the intent was defensive coding, it should be removed; if the intent was to test that the extend window is NOT opened, the if guard is misleading.

Remove the if first.opens_extend_window: guard and the dead s.mark_awakened(...) call. The assertion on the line above already proves the condition is false.

@@ -141,6 +142,18 @@ class AwakeningTriggerResult:
rule_name: str
prompt: str
trigger_reason: str
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOW: _is_extend_eligible_message strips structural parts then tests the cleaned text. A message like "好?" passes the meaningful-text check (contains CJK char) but is rejected because punctuationless yields "好" which is in _EXTEND_REJECT_TEXTS. However, "好啊" (2 chars, not in reject set, not a question) is rejected by len(punctuationless) < 3 and not is_short_question. The boundary between rejected interjections and accepted short messages is fuzzy — some 2-char utterances that contain a reject-set character plus another meaningful character (e.g., "嗯对") would also be rejected because punctuationless would be "嗯对" which is NOT in the set but compact would be "嗯对" (assuming lowercase), also not in set. Actually "嗯对" has len 2 < 3 and is not a short question, so it's rejected. This seems overly aggressive for 2-char acknowledgments in conversation.

Consider raising the minimum length for rejection from 3 to 2, or adding a separate allowlist of common short affirmative responses. The current behavior rejects all 2-character non-question messages, which may include legitimate conversational continuations.

3aKHP added 3 commits May 27, 2026 14:25
- Drop stale pending stream entries when RAF is delayed.

- Flush with a timer fallback and clear trace buffers on manual reset.
- Keep LLM HTTP trace payloads complete for prompt and tool debugging.

- Redact message previews from always-on bot action trace logs.

- Prevent internal awakening prompts from entering history on empty/system triggers.

- Keep passive awakening from using group voice transcripts as trigger text.
- Use one action trace context across lyrics forward fallback sends.

- Ignore unknown overlay trace fields before replacing context.

- Remove unreachable awakening test branch.
@3aKHP 3aKHP merged commit 28fd294 into main May 27, 2026
3 checks passed
@3aKHP 3aKHP deleted the fix/awakening-trace-log-performance branch May 27, 2026 06:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant