Fix web testing flag completion and safe log rendering #143

Open
Ledraw wants to merge 1 commit into ipa-lab:main from Ledraw:fix/web-testing-logging-and-flags

Ledraw commented May 12, 2026

Summary

This PR fixes two issues found while running WebTestingWithExplanation against CTF-style web targets:

  1. Prevent logger crashes when Rich tries to parse untrusted tool output as markup.
  2. Allow WebTestingWithExplanation to stop after a discovered CTF flag is submitted.

Root Cause

Rich MarkupError in logger output

LocalLogger and RemoteLogger passed raw message/tool output directly into rich.panel.Panel.

Rich parses plain strings as markup by default. If an HTTP response contains binary-looking content or text such as [/...], Rich interprets it as a closing markup tag, finds no matching opening tag, and raises MarkupError.

This can happen when the web testing agent requests resources such as favicons, images, fonts, archives, or other arbitrary binary or text responses.
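
The failure mode can be reproduced outside the project with a few lines of Rich. This is a minimal sketch, not the project's logger code: the `untrusted` string and the `render_panel` helper are made up for illustration. It shows a raw string raising `MarkupError` inside a `Panel`, while the same string wrapped in `rich.text.Text` renders literally.

```python
import io

from rich.console import Console
from rich.errors import MarkupError
from rich.panel import Panel
from rich.text import Text

# Hypothetical tool output containing text that looks like a closing markup tag.
untrusted = "GET /static/app.css -> 200 [/static/fonts/icons.woff]"


def render_panel(renderable) -> str:
    """Render a Panel into an in-memory buffer and return the produced text."""
    console = Console(file=io.StringIO(), width=80)
    console.print(Panel(renderable, title="tool result"))
    return console.file.getvalue()


# A plain str is parsed as console markup, so the stray "[/...]" is rejected.
try:
    render_panel(untrusted)
    raw_string_failed = False
except MarkupError:
    raw_string_failed = True

# Text() is treated as literal content; no markup parsing takes place.
safe_output = render_panel(Text(untrusted))

print("raw string raised MarkupError:", raw_string_failed)
print(safe_output)
```

Wrapping in `Text` leaves the original string untouched, so the same raw value can still be written to storage unchanged. `rich.markup.escape` would be an alternative when markup in the surrounding message has to stay active.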

WebTestingWithExplanation did not stop after unknown CTF flags

SubmitFlag only accepted flags from the preconfigured flags list. This works for known benchmark flags, but real CTF flags are often unknown before discovery.

As a result, when the model found and submitted a valid CTF flag that was not preconfigured, the tool returned Not a valid flag, the success callback was not called, and the run continued instead of stopping.

Changes

  • Render logged message content, tool arguments, and tool results as rich.text.Text.
  • Keep the original raw values stored in the log database.
  • Add accept_any_flag support to SubmitFlag.
  • Enable accept_any_flag by default for WebTestingWithExplanation.
  • Keep the old strict flag behavior available by setting --accept_any_flag=False.
  • Add regression tests for safe Rich rendering and WebTesting flag completion.
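
The difference between strict and permissive flag handling can be sketched as follows. This is an illustrative stand-in under stated assumptions, not the actual `SubmitFlag` implementation: the class name `FlagSubmitter`, its fields, and the "Flag accepted" return string are hypothetical, and only the `accept_any_flag` name and the "Not a valid flag" message come from the description above.

```python
from dataclasses import dataclass, field
from typing import Callable, Set


@dataclass
class FlagSubmitter:
    """Hypothetical stand-in for a flag-submission capability."""

    known_flags: Set[str]
    on_success: Callable[[], None]
    accept_any_flag: bool = False
    submitted: Set[str] = field(default_factory=set)

    def __call__(self, flag: str) -> str:
        flag = flag.strip()
        if not flag:
            return "Not a valid flag"
        # Strict mode: only preconfigured flags are accepted.
        if not self.accept_any_flag and flag not in self.known_flags:
            return "Not a valid flag"
        self.submitted.add(flag)
        # Permissive mode finishes on the first flag; strict mode finishes
        # once every preconfigured flag has been submitted.
        if self.accept_any_flag or self.submitted >= self.known_flags:
            self.on_success()
        return "Flag accepted"


calls = []
strict = FlagSubmitter({"FLAG{known}"}, on_success=lambda: calls.append("strict"))
lenient = FlagSubmitter(
    {"FLAG{known}"},
    on_success=lambda: calls.append("lenient"),
    accept_any_flag=True,
)

strict_result = strict("FLAG{discovered}")    # rejected, callback not called
lenient_result = lenient("FLAG{discovered}")  # accepted, callback called
print(strict_result, "|", lenient_result, "|", calls)
```

In this sketch the permissive mode trusts the model's judgement about what counts as a flag, which suits targets whose flags are unknown in advance; the strict mode remains the better fit for benchmarks with a fixed flag list.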

Testing

ruff check src/hackingBuddyGPT/utils/logging.py src/hackingBuddyGPT/capabilities/submit_flag.py src/hackingBuddyGPT/usecases/web/with_explanation.py tests/test_logging.py tests/test_web_testing.py
pytest

Results:

All checks passed
81 passed, 3 warnings
