Skip to content

Commit 2502a59

Browse files
committed
refactor(llm): rename _get_message_tokens to public API name
1 parent 689eb86 commit 2502a59

4 files changed

Lines changed: 269 additions & 4 deletions

File tree

.sisyphus/plans/pr-review-fixes.md

Lines changed: 213 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,213 @@
1+
# PR Review Feedback Fixes
2+
3+
## TL;DR
4+
5+
> **Quick Summary**: Address reviewer feedback from Greptile and Copilot across PRs 381-384. Fix a real bug (thinking blocks dropped on interrupted messages), improve code quality (XML escaping, private symbol export), and add clarifying comments (Qwen compliance block).
6+
>
7+
> **Deliverables**:
8+
> - PR 381: `get_message_tokens` (dropped leading underscore)
9+
> - PR 382: Split WSTG category codes + compliance block comment
10+
> - PR 383: Thinking blocks fix in interrupted path
11+
> - PR 384: XML escaping + generic corrective message
12+
>
13+
> **Estimated Effort**: Quick
14+
> **Parallel Execution**: YES - 4 PRs, each fix is independent
15+
16+
---
17+
18+
## Context
19+
20+
### Original Request
21+
User asked to read reviewer feedback on the 4 split PRs and implement fixes.
22+
23+
### Reviewer Findings
24+
25+
**PR 381 (Memory Fix)** - Greptile: `_get_message_tokens` is a private symbol imported across module boundaries. Rename to `get_message_tokens`.
26+
27+
**PR 382 (WSTG Prompts)** - Greptile: `IDNT/ATHN` and `ATHZ/SESS` are combined codes inconsistent with single codes used elsewhere. `<compliance>` block wording is too aggressive for prompt injection resistance. User clarified: keep the aggressive wording (needed for Qwen-series models), add a comment explaining why.
28+
29+
**PR 383 (TUI Status)** - Greptile: **BUG** - thinking blocks silently dropped when `metadata["interrupted"]` is true. Early return bypasses `renderables` list.
30+
31+
**PR 384 (Agent Workflow)** - Greptile: (1) URLs with `&` produce invalid XML in `<scan_task>`, (2) corrective message hardcodes StrixAgent-specific tool names in BaseAgent, (3) message content not escaped in `<agent_message>` XML. User approved: HTML escape for all targets, generic corrective message.
32+
33+
---
34+
35+
## Work Objectives
36+
37+
### Core Objective
38+
Apply all non-blocking review feedback to make PRs cleaner and more robust.
39+
40+
### Concrete Deliverables
41+
- `_get_message_tokens` renamed to `get_message_tokens` in both files
42+
- WSTG category codes split into separate phases
43+
- Compliance block comment added (Qwen rationale)
44+
- Thinking blocks included in interrupted message path
45+
- `html.escape()` on all XML-interpolated values
46+
- Generic corrective message in BaseAgent
47+
- CDATA-wrapping for `<agent_message>` content
48+
49+
---
50+
51+
## TODOs
52+
53+
- [ ] 1. PR 381: Rename `_get_message_tokens``get_message_tokens`
54+
55+
**What to do**:
56+
- Rename function in `strix/llm/memory_compressor.py` (definition)
57+
- Update import in `strix/llm/llm.py` (consumption)
58+
- Commit on `fix/memory-compressor-token-budget` branch
59+
60+
**Must NOT do**:
61+
- Change function behavior or signature
62+
- Touch unrelated files
63+
64+
**Acceptance Criteria**:
65+
- [ ] `grep -r "_get_message_tokens"` returns no results in source files
66+
- [ ] `python -c "from strix.llm.memory_compressor import get_message_tokens"` succeeds
67+
- [ ] `git diff` shows only the rename, no behavioral changes
68+
69+
**Commit**: `refactor(llm): rename _get_message_tokens to public API name`
70+
71+
- [ ] 2. PR 382: Split WSTG category codes in `<methodology>` phases
72+
73+
**What to do**:
74+
- In `system_prompt.jinja`, split `IDNT/ATHN` into separate `IDNT` and `ATHN` phases
75+
- Split `ATHZ/SESS` into separate `ATHZ` and `SESS` phases
76+
- Renumber subsequent phases accordingly
77+
- Ensure codes match what `<skill_triggers>` and `<phase2>` use
78+
79+
**Must NOT do**:
80+
- Change the `<skill_triggers>` or `<phase2>` sections (they already use correct single codes)
81+
- Alter non-phase content in the methodology section
82+
83+
**Acceptance Criteria**:
84+
- [ ] No combined codes like `IDNT/ATHN` remain in system_prompt.jinja
85+
- [ ] All methodology phases use single WSTG codes matching `<skill_triggers>`
86+
87+
**Commit**: `fix(prompts): split combined WSTG category codes for consistency`
88+
89+
- [ ] 3. PR 382: Add comment explaining aggressive compliance wording
90+
91+
**What to do**:
92+
- Add a Jinja comment or XML comment above the `<compliance>` block explaining that the aggressive wording is intentional and was needed when testing with Qwen-series models (e.g., Qwen3.5-Plus)
93+
- Note that softer language caused unnecessary refusals during authorized scans
94+
95+
**Must NOT do**:
96+
- Change the actual compliance wording
97+
- Remove or weaken the `<compliance>` block
98+
99+
**Acceptance Criteria**:
100+
- [ ] Comment exists above `<compliance>` block referencing Qwen-series models
101+
- [ ] Functional output of the Jinja template is unchanged
102+
103+
**Commit**: `docs: add comment explaining aggressive compliance block rationale`
104+
105+
- [ ] 4. PR 383: Fix thinking blocks dropped on interrupted messages
106+
107+
**What to do**:
108+
- In `_render_chat_content` in `strix/interface/tui.py`, fix the `metadata["interrupted"]` branch
109+
- Change `self._merge_renderables([streaming_result, interrupted_text])` to include `renderables`:
110+
`self._merge_renderables([*renderables, streaming_result, interrupted_text])`
111+
112+
**Must NOT do**:
113+
- Change the interrupted message rendering logic beyond including renderables
114+
- Touch other branches of `_render_chat_content`
115+
116+
**Acceptance Criteria**:
117+
- [ ] Code shows `[*renderables, streaming_result, interrupted_text]` in interrupted branch
118+
- [ ] No other changes in the diff
119+
120+
**Commit**: `fix(ui): include thinking blocks in interrupted message render`
121+
122+
- [ ] 5. PR 384: Add `html.escape()` to XML-interpolated values in strix_agent.py
123+
124+
**What to do**:
125+
- `import html` at top of `strix/agents/StrixAgent/strix_agent.py`
126+
- Wrap all target values in `html.escape()` before embedding in XML:
127+
- `url` values
128+
- `repo["url"]` values
129+
- `code["path"]` values
130+
- IP address entries
131+
- Same approach for `base_agent.py` `<agent_message>` attributes (`sender_name`, `sender_id`)
132+
133+
**Must NOT do**:
134+
- Change the XML structure or element names
135+
- Escape already-safe static strings (only user/target-derived values)
136+
137+
**Acceptance Criteria**:
138+
- [ ] `html` is imported in both files
139+
- [ ] All interpolated target values wrapped in `html.escape()`
140+
- [ ] `<agent_message>` attributes `from="{html.escape(sender_name)}"` and `id="{html.escape(sender_id)}"` escape properly
141+
142+
**Commit**: `fix(agent): escape XML special characters in target values`
143+
144+
- [ ] 6. PR 384: Escape message content in `<agent_message>` XML
145+
146+
**What to do**:
147+
- In `base_agent.py`, wrap `message.get("content", "")` in CDATA or `html.escape()` before embedding in `<agent_message>` element content
148+
- CDATA is preferred here since content is free-form text that shouldn't be interpreted as XML
149+
150+
**Must NOT do**:
151+
- Change the XML element structure
152+
- Break existing `clean_content()` regex patterns
153+
154+
**Acceptance Criteria**:
155+
- [ ] Message content is CDATA-wrapped or HTML-escaped in the `<agent_message>` element
156+
- [ ] Content containing `</agent_message>` does not break the XML structure
157+
158+
**Commit**: `fix(agent): escape message content in compact agent_message format`
159+
160+
- [ ] 7. PR 384: Make corrective message generic in BaseAgent
161+
162+
**What to do**:
163+
- In `base_agent.py`, change the corrective message from mentioning specific tool names to generic guidance:
164+
```
165+
"You responded with plain text instead of a tool call.
166+
While the agent loop is running, EVERY response MUST be a tool call.
167+
Do NOT send plain text messages. Act via your available tools.
168+
Review your task and take action now."
169+
```
170+
- Remove references to `create_agent`, `terminal_execute`, `wait_for_message`
171+
172+
**Must NOT do**:
173+
- Change the `add_message("user", corrective_message)` call structure
174+
- Change the `return None` behavior
175+
176+
**Acceptance Criteria**:
177+
- [ ] Corrective message contains no StrixAgent-specific tool names
178+
- [ ] Message still conveys "use tools, not plain text"
179+
180+
**Commit**: `fix(agent): use generic corrective message in BaseAgent`
181+
182+
---
183+
184+
## Execution Strategy
185+
186+
Each fix is independent and lives on a separate branch. Apply fixes to the appropriate branch, commit, and force-push.
187+
188+
```
189+
Wave 1 (all parallel):
190+
├── Task 1: PR 381 branch - rename function
191+
├── Task 2: PR 382 branch - split WSTG codes
192+
├── Task 3: PR 382 branch - compliance comment
193+
├── Task 4: PR 383 branch - thinking blocks fix
194+
├── Task 5: PR 384 branch - XML escaping
195+
├── Task 6: PR 384 branch - CDATA wrapping
196+
└── Task 7: PR 384 branch - generic corrective message
197+
```
198+
199+
Tasks 2+3 share branch (PR 382). Tasks 5+6+7 share branch (PR 384). Work sequentially within a branch, but all 4 branches can be updated in parallel.
200+
201+
**Branch workflow per PR:**
202+
1. `git checkout <branch>`
203+
2. Apply edits
204+
3. `git add -A && git commit -m "message"`
205+
4. `git push origin <branch> --force-with-lease`
206+
207+
---
208+
209+
## Success Criteria
210+
211+
- [ ] All 4 PRs still `MERGEABLE` after fixes
212+
- [ ] No reviewer comments remain unaddressed
213+
- [ ] Each branch has exactly one additional commit with the fix

strix/llm/llm.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,7 @@
1010

1111
from strix.config import Config
1212
from strix.llm.config import LLMConfig
13-
from strix.llm.memory_compressor import MemoryCompressor, _get_message_tokens
13+
from strix.llm.memory_compressor import MemoryCompressor, get_message_tokens
1414
from strix.llm.utils import (
1515
_truncate_to_first_function,
1616
fix_incomplete_tool_call,
@@ -182,7 +182,7 @@ def _prepare_messages(self, conversation_history: list[dict[str, Any]]) -> list[
182182
)
183183

184184
reserved_tokens = sum(
185-
_get_message_tokens(msg, self.config.litellm_model) for msg in messages
185+
get_message_tokens(msg, self.config.litellm_model) for msg in messages
186186
)
187187
compressed = list(
188188
self.memory_compressor.compress_history(conversation_history, reserved_tokens)

strix/llm/memory_compressor.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -52,7 +52,7 @@ def _count_tokens(text: str, model: str) -> int:
5252
return len(text) // 4 # Rough estimate
5353

5454

55-
def _get_message_tokens(msg: dict[str, Any], model: str) -> int:
55+
def get_message_tokens(msg: dict[str, Any], model: str) -> int:
5656
content = msg.get("content", "")
5757
if isinstance(content, str):
5858
return _count_tokens(content, model)
@@ -209,7 +209,7 @@ def compress_history(
209209
model_name: str = self.model_name # type: ignore[assignment]
210210

211211
total_tokens = reserved_tokens + sum(
212-
_get_message_tokens(msg, model_name) for msg in system_msgs + regular_msgs
212+
get_message_tokens(msg, model_name) for msg in system_msgs + regular_msgs
213213
)
214214

215215
if total_tokens <= MAX_TOTAL_TOKENS * 0.9:

test_run.sh

Lines changed: 52 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,52 @@
1+
#!/bin/bash
2+
# Wrapper script to run Strix with GLM-5 and Podman
3+
4+
# Podman socket for Docker compatibility
5+
export DOCKER_HOST="unix://$XDG_RUNTIME_DIR/podman/podman.sock"
6+
7+
# Default Configuration (GLM-5 via Astaria Proxy)
8+
STRIX_LLM="openai/code/glm-5"
9+
OPENAI_API_BASE="https://apiproxy.astaria.cc/v1"
10+
LLM_API_KEY="sk-3c3753665e8203c691ed000d755a739816209a68c5d1d9722186dc71766962b1"
11+
# Check for OpenRouter flag
12+
if [[ "$1" == "--openrouter" ]]; then
13+
echo "Switching to OpenRouter..."
14+
STRIX_LLM="openrouter/openrouter/hunter-alpha"
15+
OPENAI_API_BASE="https://openrouter.ai/api/v1"
16+
LLM_API_KEY="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"
17+
shift
18+
fi
19+
20+
# Export configuration
21+
export STRIX_LLM
22+
export OPENAI_API_BASE
23+
export LLM_API_KEY
24+
25+
echo "----------------------------------------"
26+
echo "Using Model: $STRIX_LLM"
27+
echo "API Base: $OPENAI_API_BASE"
28+
echo "Container Backend: Podman"
29+
echo "----------------------------------------"
30+
31+
PYTHON_BIN="${PYTHON_BIN:-python3.14}"
32+
if ! command -v "$PYTHON_BIN" >/dev/null 2>&1; then
33+
echo "Python interpreter not found: $PYTHON_BIN"
34+
exit 1
35+
fi
36+
37+
if [ $# -eq 0 ]; then
38+
echo "Usage: ./test_run.sh [--openrouter] --target <your_target>"
39+
echo "Example: ./test_run.sh --target example.com"
40+
echo "Example: ./test_run.sh --openrouter --target example.com"
41+
exit 1
42+
fi
43+
44+
echo "Setting up Python virtual environment..."
45+
if [ ! -d ".venv" ]; then
46+
"$PYTHON_BIN" -m venv .venv
47+
fi
48+
source .venv/bin/activate
49+
pip install -e . -q
50+
51+
echo "Running Strix..."
52+
strix "$@"

0 commit comments

Comments
 (0)