
feat: WSTG methodology and UI updates#2

Open
0xhis wants to merge 53 commits into main from prompt-optimization

Conversation

@0xhis 0xhis (Owner) commented Feb 25, 2026

Adding WSTG methodology prompts and live UI statuses.

@0xhis 0xhis requested a review from Copilot February 25, 2026 03:55
@0xhis 0xhis marked this pull request as ready for review February 25, 2026 03:59

Copilot AI left a comment


Pull request overview

This PR updates Strix’s prompting/methodology content to be OWASP WSTG-aligned and adds live “system message” status text to the TUI via telemetry updates.

Changes:

  • Reworked multiple system prompts/skill markdowns to explicitly map phases and subagents to OWASP WSTG categories.
  • Added system_message support to telemetry agents and surfaced it in the TUI status line.
  • Hooked LLM generation + sandbox initialization to emit live status messages (e.g., “Waiting for LLM provider…”).

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| strix/tools/web_search/web_search_actions.py | Reformats the web-search system prompt into structured sections. |
| strix/telemetry/tracer.py | Adds agent system_message tracking and an update helper. |
| strix/skills/scan_modes/standard.md | Updates Standard mode to be WSTG-aligned and wraps content in tags. |
| strix/skills/scan_modes/quick.md | Updates Quick mode to be WSTG-aligned; adds `<instructions>`/`<constraints>`/`<mindset>` blocks. |
| strix/skills/scan_modes/deep.md | Updates Deep mode to be WSTG-aligned and expands advanced categories. |
| strix/skills/coordination/root_agent.md | Updates root agent coordination guidance to be WSTG-aligned. |
| strix/llm/llm.py | Emits tracer system messages during memory compression and streaming generation. |
| strix/llm/dedupe.py | Reformats dedupe judge prompt into structured sections with explicit output rules. |
| strix/interface/tui.py | Displays system_message as the animated "Initializing/Waiting/…" status text. |
| strix/agents/base_agent.py | Emits a system message during sandbox setup. |
| strix/agents/StrixAgent/system_prompt.jinja | Reworks the main agent system prompt to include WSTG-aligned methodology and headings. |



```diff
 def update_agent_status(
-    self, agent_id: str, status: str, error_message: str | None = None
+    self, agent_id: str, status: str, error_message: str | None = None, system_message: str | None = None
```

Copilot AI Feb 25, 2026


This function signature is over the configured 100-char line length (ruff E501) and will likely fail linting. Please wrap the parameters across multiple lines (the rest of the file generally formats multi-arg defs this way).

Suggested change:

```diff
-    self, agent_id: str, status: str, error_message: str | None = None, system_message: str | None = None
+    self,
+    agent_id: str,
+    status: str,
+    error_message: str | None = None,
+    system_message: str | None = None,
```

strix/llm/llm.py (Outdated)
Comment on lines +129 to +131

```python
async for response in self._stream(messages):
    if tracer and self.agent_id:
        tracer.update_agent_system_message(self.agent_id, "Generating response...")
```

Copilot AI Feb 25, 2026


update_agent_system_message() is called for every streamed chunk. Since _stream() yields many partial chunks, this can create a very hot loop of tracer/UI updates and unnecessary contention. Consider setting the system message once before entering the async for loop (or only on the first chunk), and avoid updating it on every yield.

Suggested change:

```diff
-async for response in self._stream(messages):
-    if tracer and self.agent_id:
-        tracer.update_agent_system_message(self.agent_id, "Generating response...")
+if tracer and self.agent_id:
+    tracer.update_agent_system_message(self.agent_id, "Generating response...")
+async for response in self._stream(messages):
```
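As a rough, self-contained illustration of this point (the `Tracer` and `stream()` stand-ins below are hypothetical, not Strix's real classes): hoisting the update out of the loop produces exactly one tracer call, no matter how many chunks the stream yields.

```python
import asyncio

class Tracer:
    """Hypothetical stand-in for the telemetry tracer."""
    def __init__(self) -> None:
        self.update_count = 0
        self.last_message = None

    def update_agent_system_message(self, agent_id: str, message: str) -> None:
        self.update_count += 1
        self.last_message = message

async def stream(messages):
    """Simulate a provider yielding many partial chunks."""
    for chunk in ["Hel", "lo", ", wor", "ld"]:
        await asyncio.sleep(0)
        yield chunk

async def generate(tracer: Tracer, agent_id: str, messages: list) -> str:
    # Set the status once, before consuming the stream,
    # instead of once per yielded chunk.
    if tracer and agent_id:
        tracer.update_agent_system_message(agent_id, "Generating response...")
    parts = []
    async for chunk in stream(messages):
        parts.append(chunk)
    return "".join(parts)

tracer = Tracer()
result = asyncio.run(generate(tracer, "agent-1", []))
```

With per-chunk updates, `update_count` would equal the chunk count; with the hoisted call it stays at one.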


Copilot AI left a comment


Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.



strix/llm/llm.py (Outdated)

```python
try:
    if tracer and self.agent_id:
        tracer.update_agent_system_message(self.agent_id, "Waiting for LLM provider...")
        tracer.update_agent_system_message(self.agent_id, "Generating response...")
```

Copilot AI Feb 25, 2026


The two consecutive update_agent_system_message calls here overwrite each other before the coroutine ever hits an await, so the "Waiting for LLM provider..." state will never be visible. Also, setting "Generating response..." before _stream() starts means the UI will say "Generating" while it's actually blocked awaiting the provider response.

Consider setting only one message before the first await (e.g., "Waiting…"), and then switching to "Generating…" only after the first stream chunk is received (or insert a yield point before updating again).

Suggested change:

```diff
-        tracer.update_agent_system_message(self.agent_id, "Generating response...")
```
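A hedged sketch of the two-phase approach the comment describes (the `Tracer` and `stream()` stand-ins are hypothetical, not the real Strix classes): set "Waiting…" before the first await, and flip to "Generating…" only when the first chunk actually arrives.

```python
import asyncio

class Tracer:
    """Hypothetical stand-in tracer that records every status change."""
    def __init__(self) -> None:
        self.messages = []

    def update_agent_system_message(self, agent_id: str, message: str) -> None:
        self.messages.append(message)

async def stream(messages):
    for chunk in ["a", "b", "c"]:
        await asyncio.sleep(0)  # a real provider would block on I/O here
        yield chunk

async def generate(tracer: Tracer, agent_id: str, messages: list) -> str:
    # One status before the first await: visibly "waiting" while blocked.
    tracer.update_agent_system_message(agent_id, "Waiting for LLM provider...")
    parts = []
    first = True
    async for chunk in stream(messages):
        if first:
            # Switch only once tokens actually arrive.
            tracer.update_agent_system_message(agent_id, "Generating response...")
            first = False
        parts.append(chunk)
    return "".join(parts)

tracer = Tracer()
out = asyncio.run(generate(tracer, "agent-1", []))
```

Both statuses are now separated by a real suspension point, so each gets a chance to render.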

```python
final_response = None

if tracer:
    tracer.update_agent_system_message(self.state.agent_id, "Thinking...")
```

Copilot AI Feb 25, 2026


update_agent_system_message("Thinking...") is immediately followed by llm.generate(...), which currently updates the system message again before the first await in that coroutine. In practice this means "Thinking..." will be overwritten before the UI can ever render it.

To make the status useful, either centralize status updates in one layer (agent loop vs LLM), or ensure the "Thinking..." update occurs right before a real await/yield point so it can be observed.

Suggested change:

```diff
         tracer.update_agent_system_message(self.state.agent_id, "Thinking...")
+        # Yield control so the "Thinking..." status can be observed before LLM updates it.
+        await asyncio.sleep(0)
```
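To see why the `await asyncio.sleep(0)` matters, here is a small self-contained demo (hypothetical `Tracer` and a polling `ui` task, not Strix code): without a yield point between two status writes, the first status is never observable by another task.

```python
import asyncio

class Tracer:
    """Hypothetical stand-in tracer holding the current status."""
    def __init__(self) -> None:
        self.message = None

async def ui(tracer: Tracer, seen: set) -> None:
    # Poll the status each time this task gets control of the event loop.
    for _ in range(5):
        if tracer.message is not None:
            seen.add(tracer.message)
        await asyncio.sleep(0)

async def agent(tracer: Tracer, yield_between: bool) -> None:
    tracer.message = "Thinking..."
    if yield_between:
        await asyncio.sleep(0)  # give the UI a chance to render
    tracer.message = "Generating response..."
    await asyncio.sleep(0)

async def run(yield_between: bool) -> set:
    tracer, seen = Tracer(), set()
    ui_task = asyncio.create_task(ui(tracer, seen))
    await agent(tracer, yield_between)
    await ui_task
    return seen

seen_without = asyncio.run(run(yield_between=False))
seen_with = asyncio.run(run(yield_between=True))
```

Without the intermediate yield the UI only ever observes the second status; with it, both statuses are visible.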


Copilot AI left a comment


Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.



Comment on lines +1692 to +1702

```python
if hasattr(msg_renderable, "renderable") and hasattr(msg_renderable.renderable, "_text") and not msg_renderable.renderable._text:
    pass
elif getattr(msg_renderable, "plain", True):
    renderables.append(msg_renderable)

if not renderables:
    return None

if len(renderables) == 1:
    return renderables[0]
```

Copilot AI Feb 25, 2026


AgentMessageRenderer.render_simple() returns a rich.text.Text, so the hasattr(msg_renderable, "renderable") ... _text branch will never run and adds unnecessary complexity. You can simplify this to a single emptiness check (e.g., if msg_renderable.plain:) and also remove the trailing whitespace on the elif line to satisfy style checks.

Suggested change:

```diff
-if hasattr(msg_renderable, "renderable") and hasattr(msg_renderable.renderable, "_text") and not msg_renderable.renderable._text:
-    pass
-elif getattr(msg_renderable, "plain", True):
+if msg_renderable.plain:
     renderables.append(msg_renderable)

 if not renderables:
     return None

 if len(renderables) == 1:
     return renderables[0]
```

Comment on lines 270 to 272
7. **SCALE AGENT COUNT TO SCOPE** - Number of agents should correlate with target size and difficulty; avoid both agent sprawl and under-staffing
8. **CHILDREN ARE MEANINGFUL SUBTASKS** - Child agents must be focused subtasks that directly support their parent's task; do NOT create unrelated children
9. **UNIQUENESS** - Do not create two agents with the same task; ensure clear, non-overlapping responsibilities for every agent

Copilot AI Feb 25, 2026


The numbered list under “Simple Workflow Rules” has duplicate item numbers (two items labeled 7.). This can confuse readers and downstream prompt parsing; renumber the list so each item number is unique and sequential.

Suggested change:

```diff
-7. **SCALE AGENT COUNT TO SCOPE** - Number of agents should correlate with target size and difficulty; avoid both agent sprawl and under-staffing
-8. **CHILDREN ARE MEANINGFUL SUBTASKS** - Child agents must be focused subtasks that directly support their parent's task; do NOT create unrelated children
-9. **UNIQUENESS** - Do not create two agents with the same task; ensure clear, non-overlapping responsibilities for every agent
+8. **SCALE AGENT COUNT TO SCOPE** - Number of agents should correlate with target size and difficulty; avoid both agent sprawl and under-staffing
+9. **CHILDREN ARE MEANINGFUL SUBTASKS** - Child agents must be focused subtasks that directly support their parent's task; do NOT create unrelated children
+10. **UNIQUENESS** - Do not create two agents with the same task; ensure clear, non-overlapping responsibilities for every agent
```

Comment on lines 1666 to +1669

```python
if role == "user":
    return UserMessageRenderer.render_simple(content)

renderables = []
```

Copilot AI Feb 25, 2026


thinking_blocks rendering is gated by the if not content: return None guard just above this block. Since BaseAgent logs assistant messages with clean_content(...), tool-call-only outputs often become empty strings, so the UI will drop the entire message (including thinking_blocks/interrupted metadata). Consider moving the empty-content early return below the thinking_blocks/interrupted handling (or only applying it for role == "user").
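A rough sketch of the suggested reordering (the `render_message` helper below is hypothetical, not the actual TUI method): collect metadata before dropping empty content, so a tool-call-only message keeps its thinking blocks.

```python
def render_message(role, content, thinking_blocks=None, interrupted=False):
    # Hypothetical simplified renderer: gather metadata first,
    # then drop the message only if *nothing* is renderable.
    renderables = []
    if thinking_blocks:
        renderables.append(("thinking", thinking_blocks))
    if interrupted:
        renderables.append(("interrupted", True))
    if content:
        renderables.append(("content", content))
    return renderables or None

shown = render_message("assistant", "", thinking_blocks=["step 1"])
```

An assistant message whose content was cleaned down to an empty string still renders its thinking blocks, while a fully empty message is dropped.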

```python
metadata = {}
if thinking_blocks:
    metadata["thinking_blocks"] = thinking_blocks
```

Copilot AI Feb 25, 2026


There’s trailing whitespace on the blank line here (and potentially other nearby blank lines). The repo runs pre-commit’s trailing-whitespace hook, so this will fail checks; remove the extra spaces so the line is truly empty.

Suggested change: strip the trailing spaces so the blank line is truly empty.
Comment on lines +1694 to +1702

```python
elif getattr(msg_renderable, "plain", True):
    renderables.append(msg_renderable)

if not renderables:
    return None

if len(renderables) == 1:
    return renderables[0]
```

Copilot AI Feb 25, 2026


There are several lines with trailing whitespace in this block (e.g., after the elif condition and on blank lines). Since the repo uses the trailing-whitespace pre-commit hook, please strip trailing spaces here to avoid CI/pre-commit failures.

Suggested change (whitespace-only; trailing spaces stripped):

```python
elif getattr(msg_renderable, "plain", True):
    renderables.append(msg_renderable)

if not renderables:
    return None

if len(renderables) == 1:
    return renderables[0]
```


Copilot AI left a comment


Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated no new comments.



@0xhis 0xhis force-pushed the prompt-optimization branch from 1039de9 to 6d4e9e8 on February 26, 2026 01:28
@0xhis 0xhis force-pushed the prompt-optimization branch from 52468cc to e7e03e0 on March 9, 2026 17:46
ST-2 and others added 25 commits March 9, 2026 10:59
…hering

- Restructures Phase 1 into explicit subagent delegation rules
- Root agent no longer runs recon/crawling/code analysis directly
- Adds black-box, white-box, and combined mode subagent templates
- Renames Phase 2 section to reflect dependency on gathered context
- Extract .renderable from ThinkRenderer.render() in tui.py for consistency
- Remove dead thinking_blocks parameter from add_message() in state.py
- Pass tracer into _stream() instead of importing in hot path in llm.py
- Add overflow indicator (+N more) when truncating tool displays in base_agent.py
…eation

- Add SKILLS ARE MANDATORY rule to Critical Rules section
- Update BLACK-BOX examples to include skills= in every agent creation
- Update WHITE-BOX examples to include skills= in every agent creation
- Add Skill Assignment Triggers section with 15 scenario→skill mappings
- Add warning that agents without skills lack vulnerability methodology

Fixes regression where subagents were spawning without vulnerability
skills loaded, causing shallow testing (no SQLi, XSS, etc.)
Add regex patterns to normalize <function>name> and <parameter>key> into
proper <function=name> and <parameter=key> format before parsing.
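The normalization described in that commit could look roughly like this (a sketch; the pattern and helper name are assumptions, not the actual Strix implementation):

```python
import re

# Rewrite malformed "<function>name>" / "<parameter>key>" openers into the
# "<function=name>" / "<parameter=key>" form the tool-call parser expects.
_MALFORMED_TAG = re.compile(r"<(function|parameter)>([\w.-]+)>")

def normalize_tool_tags(text: str) -> str:
    return _MALFORMED_TAG.sub(r"<\1=\2>", text)

fixed = normalize_tool_tags(
    "<function>run_scan><parameter>target>example.com</parameter></function>"
)
```

Already well-formed tags are left untouched, since the pattern only matches the `>` immediately after the tag name.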
