
feat: WSTG methodology and UI updates#2

Open
0xhis wants to merge 53 commits into main from prompt-optimization

Conversation

@0xhis 0xhis (Owner) commented Feb 25, 2026

Adding WSTG methodology prompts and live UI statuses.

@0xhis 0xhis requested a review from Copilot February 25, 2026 03:55
@0xhis 0xhis marked this pull request as ready for review February 25, 2026 03:59

Copilot AI left a comment


Pull request overview

This PR updates Strix’s prompting/methodology content to be OWASP WSTG-aligned and adds live “system message” status text to the TUI via telemetry updates.

Changes:

  • Reworked multiple system prompts/skill markdowns to explicitly map phases and subagents to OWASP WSTG categories.
  • Added system_message support to telemetry agents and surfaced it in the TUI status line.
  • Hooked LLM generation + sandbox initialization to emit live status messages (e.g., “Waiting for LLM provider…”).

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| strix/tools/web_search/web_search_actions.py | Reformats the web-search system prompt into structured sections. |
| strix/telemetry/tracer.py | Adds agent system_message tracking and an update helper. |
| strix/skills/scan_modes/standard.md | Updates Standard mode to be WSTG-aligned and wraps content in tags. |
| strix/skills/scan_modes/quick.md | Updates Quick mode to be WSTG-aligned; adds `<instructions>`/`<constraints>`/`<mindset>` blocks. |
| strix/skills/scan_modes/deep.md | Updates Deep mode to be WSTG-aligned and expands advanced categories. |
| strix/skills/coordination/root_agent.md | Updates root agent coordination guidance to be WSTG-aligned. |
| strix/llm/llm.py | Emits tracer system messages during memory compression and streaming generation. |
| strix/llm/dedupe.py | Reformats dedupe judge prompt into structured sections with explicit output rules. |
| strix/interface/tui.py | Displays system_message as the animated "Initializing/Waiting/…" status text. |
| strix/agents/base_agent.py | Emits a system message during sandbox setup. |
| strix/agents/StrixAgent/system_prompt.jinja | Reworks the main agent system prompt to include WSTG-aligned methodology and headings. |



```diff
 def update_agent_status(
-    self, agent_id: str, status: str, error_message: str | None = None
+    self, agent_id: str, status: str, error_message: str | None = None, system_message: str | None = None
```

Copilot AI Feb 25, 2026


This function signature is over the configured 100-char line length (ruff E501) and will likely fail linting. Please wrap the parameters across multiple lines (the rest of the file generally formats multi-arg defs this way).

Suggested change:

```diff
-    self, agent_id: str, status: str, error_message: str | None = None, system_message: str | None = None
+    self,
+    agent_id: str,
+    status: str,
+    error_message: str | None = None,
+    system_message: str | None = None,
```

strix/llm/llm.py (Outdated)
Comment on lines +129 to +131

```python
async for response in self._stream(messages):
    if tracer and self.agent_id:
        tracer.update_agent_system_message(self.agent_id, "Generating response...")
```

Copilot AI Feb 25, 2026


update_agent_system_message() is called for every streamed chunk. Since _stream() yields many partial chunks, this can create a very hot loop of tracer/UI updates and unnecessary contention. Consider setting the system message once before entering the async for loop (or only on the first chunk), and avoid updating it on every yield.

Suggested change:

```diff
-async for response in self._stream(messages):
-    if tracer and self.agent_id:
-        tracer.update_agent_system_message(self.agent_id, "Generating response...")
+if tracer and self.agent_id:
+    tracer.update_agent_system_message(self.agent_id, "Generating response...")
+async for response in self._stream(messages):
```
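As a rough, self-contained illustration of this point (the `Tracer` and `stream()` stand-ins below are hypothetical, not Strix's real classes): hoisting the update out of the loop produces exactly one tracer call, no matter how many chunks the stream yields.

```python
import asyncio

class Tracer:
    """Hypothetical stand-in for the telemetry tracer."""
    def __init__(self) -> None:
        self.update_count = 0
        self.last_message = None

    def update_agent_system_message(self, agent_id: str, message: str) -> None:
        self.update_count += 1
        self.last_message = message

async def stream(messages):
    """Simulate a provider yielding many partial chunks."""
    for chunk in ["Hel", "lo", ", wor", "ld"]:
        await asyncio.sleep(0)
        yield chunk

async def generate(tracer: Tracer, agent_id: str, messages: list) -> str:
    # Set the status once, before consuming the stream,
    # instead of once per yielded chunk.
    if tracer and agent_id:
        tracer.update_agent_system_message(agent_id, "Generating response...")
    parts = []
    async for chunk in stream(messages):
        parts.append(chunk)
    return "".join(parts)

tracer = Tracer()
result = asyncio.run(generate(tracer, "agent-1", []))
```

With per-chunk updates, `update_count` would equal the chunk count; with the hoisted call it stays at one.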


Copilot AI left a comment


Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.



strix/llm/llm.py (Outdated)

```python
try:
    if tracer and self.agent_id:
        tracer.update_agent_system_message(self.agent_id, "Waiting for LLM provider...")
        tracer.update_agent_system_message(self.agent_id, "Generating response...")
```

Copilot AI Feb 25, 2026


The two consecutive update_agent_system_message calls here overwrite each other before the coroutine ever hits an await, so the "Waiting for LLM provider..." state will never be visible. Also, setting "Generating response..." before _stream() starts means the UI will say "Generating" while it's actually blocked awaiting the provider response.

Consider setting only one message before the first await (e.g., "Waiting…"), and then switching to "Generating…" only after the first stream chunk is received (or insert a yield point before updating again).

Suggested change:

```diff
-        tracer.update_agent_system_message(self.agent_id, "Generating response...")
```
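A hedged sketch of the two-phase approach the comment describes (the `Tracer` and `stream()` stand-ins are hypothetical, not the real Strix classes): set "Waiting…" before the first await, and flip to "Generating…" only when the first chunk actually arrives.

```python
import asyncio

class Tracer:
    """Hypothetical stand-in tracer that records every status change."""
    def __init__(self) -> None:
        self.messages = []

    def update_agent_system_message(self, agent_id: str, message: str) -> None:
        self.messages.append(message)

async def stream(messages):
    for chunk in ["a", "b", "c"]:
        await asyncio.sleep(0)  # a real provider would block on I/O here
        yield chunk

async def generate(tracer: Tracer, agent_id: str, messages: list) -> str:
    # One status before the first await: visibly "waiting" while blocked.
    tracer.update_agent_system_message(agent_id, "Waiting for LLM provider...")
    parts = []
    first = True
    async for chunk in stream(messages):
        if first:
            # Switch only once tokens actually arrive.
            tracer.update_agent_system_message(agent_id, "Generating response...")
            first = False
        parts.append(chunk)
    return "".join(parts)

tracer = Tracer()
out = asyncio.run(generate(tracer, "agent-1", []))
```

Both statuses are now separated by a real suspension point, so each gets a chance to render.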

```python
final_response = None

if tracer:
    tracer.update_agent_system_message(self.state.agent_id, "Thinking...")
```

Copilot AI Feb 25, 2026


update_agent_system_message("Thinking...") is immediately followed by llm.generate(...), which currently updates the system message again before the first await in that coroutine. In practice this means "Thinking..." will be overwritten before the UI can ever render it.

To make the status useful, either centralize status updates in one layer (agent loop vs LLM), or ensure the "Thinking..." update occurs right before a real await/yield point so it can be observed.

Suggested change:

```diff
         tracer.update_agent_system_message(self.state.agent_id, "Thinking...")
+        # Yield control so the "Thinking..." status can be observed before LLM updates it.
+        await asyncio.sleep(0)
```
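To see why the `await asyncio.sleep(0)` matters, here is a small self-contained demo (hypothetical `Tracer` and a polling `ui` task, not Strix code): without a yield point between two status writes, the first status is never observable by another task.

```python
import asyncio

class Tracer:
    """Hypothetical stand-in tracer holding the current status."""
    def __init__(self) -> None:
        self.message = None

async def ui(tracer: Tracer, seen: set) -> None:
    # Poll the status each time this task gets control of the event loop.
    for _ in range(5):
        if tracer.message is not None:
            seen.add(tracer.message)
        await asyncio.sleep(0)

async def agent(tracer: Tracer, yield_between: bool) -> None:
    tracer.message = "Thinking..."
    if yield_between:
        await asyncio.sleep(0)  # give the UI a chance to render
    tracer.message = "Generating response..."
    await asyncio.sleep(0)

async def run(yield_between: bool) -> set:
    tracer, seen = Tracer(), set()
    ui_task = asyncio.create_task(ui(tracer, seen))
    await agent(tracer, yield_between)
    await ui_task
    return seen

seen_without = asyncio.run(run(yield_between=False))
seen_with = asyncio.run(run(yield_between=True))
```

Without the intermediate yield the UI only ever observes the second status; with it, both statuses are visible.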


Copilot AI left a comment


Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.



Comment on lines +1692 to +1702

```python
if hasattr(msg_renderable, "renderable") and hasattr(msg_renderable.renderable, "_text") and not msg_renderable.renderable._text:
    pass
elif getattr(msg_renderable, "plain", True):
    renderables.append(msg_renderable)

if not renderables:
    return None

if len(renderables) == 1:
    return renderables[0]
```

Copilot AI Feb 25, 2026


AgentMessageRenderer.render_simple() returns a rich.text.Text, so the hasattr(msg_renderable, "renderable") ... _text branch will never run and adds unnecessary complexity. You can simplify this to a single emptiness check (e.g., if msg_renderable.plain:) and also remove the trailing whitespace on the elif line to satisfy style checks.

Suggested change:

```diff
-if hasattr(msg_renderable, "renderable") and hasattr(msg_renderable.renderable, "_text") and not msg_renderable.renderable._text:
-    pass
-elif getattr(msg_renderable, "plain", True):
+if msg_renderable.plain:
     renderables.append(msg_renderable)

 if not renderables:
     return None

 if len(renderables) == 1:
     return renderables[0]
```

Comment on lines 270 to 272
7. **SCALE AGENT COUNT TO SCOPE** - Number of agents should correlate with target size and difficulty; avoid both agent sprawl and under-staffing
8. **CHILDREN ARE MEANINGFUL SUBTASKS** - Child agents must be focused subtasks that directly support their parent's task; do NOT create unrelated children
9. **UNIQUENESS** - Do not create two agents with the same task; ensure clear, non-overlapping responsibilities for every agent

Copilot AI Feb 25, 2026


The numbered list under “Simple Workflow Rules” has duplicate item numbers (two items labeled 7.). This can confuse readers and downstream prompt parsing; renumber the list so each item number is unique and sequential.

Suggested change:

```diff
-7. **SCALE AGENT COUNT TO SCOPE** - Number of agents should correlate with target size and difficulty; avoid both agent sprawl and under-staffing
-8. **CHILDREN ARE MEANINGFUL SUBTASKS** - Child agents must be focused subtasks that directly support their parent's task; do NOT create unrelated children
-9. **UNIQUENESS** - Do not create two agents with the same task; ensure clear, non-overlapping responsibilities for every agent
+8. **SCALE AGENT COUNT TO SCOPE** - Number of agents should correlate with target size and difficulty; avoid both agent sprawl and under-staffing
+9. **CHILDREN ARE MEANINGFUL SUBTASKS** - Child agents must be focused subtasks that directly support their parent's task; do NOT create unrelated children
+10. **UNIQUENESS** - Do not create two agents with the same task; ensure clear, non-overlapping responsibilities for every agent
```

Comment on lines 1666 to +1669

```python
if role == "user":
    return UserMessageRenderer.render_simple(content)

renderables = []
```

Copilot AI Feb 25, 2026


thinking_blocks rendering is gated by the if not content: return None guard just above this block. Since BaseAgent logs assistant messages with clean_content(...), tool-call-only outputs often become empty strings, so the UI will drop the entire message (including thinking_blocks/interrupted metadata). Consider moving the empty-content early return below the thinking_blocks/interrupted handling (or only applying it for role == "user").
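A rough sketch of the suggested reordering (the `render_message` helper below is hypothetical, not the actual TUI method): collect metadata before dropping empty content, so a tool-call-only message keeps its thinking blocks.

```python
def render_message(role, content, thinking_blocks=None, interrupted=False):
    # Hypothetical simplified renderer: gather metadata first,
    # then drop the message only if *nothing* is renderable.
    renderables = []
    if thinking_blocks:
        renderables.append(("thinking", thinking_blocks))
    if interrupted:
        renderables.append(("interrupted", True))
    if content:
        renderables.append(("content", content))
    return renderables or None

shown = render_message("assistant", "", thinking_blocks=["step 1"])
```

An assistant message whose content was cleaned down to an empty string still renders its thinking blocks, while a fully empty message is dropped.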

```python
metadata = {}
if thinking_blocks:
    metadata["thinking_blocks"] = thinking_blocks
```

Copilot AI Feb 25, 2026


There’s trailing whitespace on the blank line here (and potentially other nearby blank lines). The repo runs pre-commit’s trailing-whitespace hook, so this will fail checks; remove the extra spaces so the line is truly empty.

Suggested change: strip the trailing spaces so the blank line is truly empty.
Comment on lines +1694 to +1702

```python
elif getattr(msg_renderable, "plain", True):
    renderables.append(msg_renderable)

if not renderables:
    return None

if len(renderables) == 1:
    return renderables[0]
```

Copilot AI Feb 25, 2026


There are several lines with trailing whitespace in this block (e.g., after the elif condition and on blank lines). Since the repo uses the trailing-whitespace pre-commit hook, please strip trailing spaces here to avoid CI/pre-commit failures.

Suggested change (whitespace-only; trailing spaces stripped):

```python
elif getattr(msg_renderable, "plain", True):
    renderables.append(msg_renderable)

if not renderables:
    return None

if len(renderables) == 1:
    return renderables[0]
```


Copilot AI left a comment


Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated no new comments.



@0xhis 0xhis force-pushed the prompt-optimization branch from 1039de9 to 6d4e9e8 on February 26, 2026 01:28
@0xhis 0xhis force-pushed the prompt-optimization branch from 52468cc to e7e03e0 on March 9, 2026 17:46
ST-2 and others added 25 commits March 9, 2026 10:59
…hering

- Restructures Phase 1 into explicit subagent delegation rules
- Root agent no longer runs recon/crawling/code analysis directly
- Adds black-box, white-box, and combined mode subagent templates
- Renames Phase 2 section to reflect dependency on gathered context
- Extract .renderable from ThinkRenderer.render() in tui.py for consistency
- Remove dead thinking_blocks parameter from add_message() in state.py
- Pass tracer into _stream() instead of importing in hot path in llm.py
- Add overflow indicator (+N more) when truncating tool displays in base_agent.py
…eation

- Add SKILLS ARE MANDATORY rule to Critical Rules section
- Update BLACK-BOX examples to include skills= in every agent creation
- Update WHITE-BOX examples to include skills= in every agent creation
- Add Skill Assignment Triggers section with 15 scenario→skill mappings
- Add warning that agents without skills lack vulnerability methodology

Fixes regression where subagents were spawning without vulnerability
skills loaded, causing shallow testing (no SQLi, XSS, etc.)
Add regex patterns to normalize <function>name> and <parameter>key> into
proper <function=name> and <parameter=key> format before parsing.
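The normalization described in that commit could look roughly like this (a sketch; the pattern and helper name are assumptions, not the actual Strix implementation):

```python
import re

# Rewrite malformed "<function>name>" / "<parameter>key>" openers into the
# "<function=name>" / "<parameter=key>" form the tool-call parser expects.
_MALFORMED_TAG = re.compile(r"<(function|parameter)>([\w.-]+)>")

def normalize_tool_tags(text: str) -> str:
    return _MALFORMED_TAG.sub(r"<\1=\2>", text)

fixed = normalize_tool_tags(
    "<function>run_scan><parameter>target>example.com</parameter></function>"
)
```

Already well-formed tags are left untouched, since the pattern only matches the `>` immediately after the tag name.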
