feat(examples): defense-in-depth security analyzer by Fieldnote-Echo · Pull Request #2472 · OpenHands/software-agent-sdk

Fieldnote-Echo · 2026-03-16T22:08:09Z

Summary

Single-file custom SecurityAnalyzerBase example with four layers: two-corpus extraction (executable fields for shell patterns, all fields for injection patterns), Unicode normalization (NFKC + invisible/bidi stripping), segment-aware deterministic policy rails, and regex pattern scanning with max-severity ensemble fusion.

Two design decisions intentionally diverge from #2067:

Max-severity fusion instead of noisy-OR — analyzers are correlated (same input) and the SDK boundary is categorical
Stdlib-only Unicode normalization instead of homoglyph replacement — TR39 confusable detection is documented as a known limitation via strict xfails, not silently omitted

Test coverage: baseline suite (103 tests) covering extraction, two-corpus separation, normalization, rails, pattern classification, ensemble fusion, and confirmation policy. Adversarial suite (32 tests) with TDD bug fixes, strict xfails for irreducible limitations, and hostile-input stress tests.

Closes #2067

Checklist

If the PR is changing/adding functionality, are there tests to reflect this?
If there is an example, have you run the example to make sure that it works?
If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
Is the github CI passing?

…al tests Single-file example implementing layered security analysis for agent tool calls: whitelisted extraction with field boundaries, Unicode normalization (NFKC + invisible/bidi stripping), segment-aware deterministic policy rails, regex pattern scanning, and max-severity ensemble fusion. Includes baseline test suite (93 tests) covering extraction, normalization, rails, pattern classification, ensemble fusion, and confirmation policy. Adversarial test suite (32 tests) with TDD-driven bug fixes, strict xfails documenting irreducible limitations, and hostile input stress tests. Closes OpenHands#2067

The 30k-char extraction cap silently drops content past the limit. Integrators who don't read the adversarial test suite won't know they're exposed if their agent processes large inputs.

The digit-prefixed example filename (45_...) requires importlib to import. Move the hack into conftest.py so it runs once at collection time; both test files now just reference sys.modules.

enyst · 2026-03-16T22:26:04Z

@OpenHands Do a /codereview on this PR. Investigate deeply the relevant code in the codebase.

openhands-ai · 2026-03-16T22:26:29Z

I'm on it! enyst can track my progress at all-hands.dev

enyst

I did a deeper pass on the example + tests and found two important correctness issues that look worth fixing before merge. Current targeted tests/examples pass, but both of the cases below reproduce locally with small ActionEvent probes.

[examples/01_standalone_sdk/45_defense_in_depth_security.py, Lines 153-161 and 563-577] 🔍 Correctness: _extract_segments() feeds thought, reasoning_content, and summary into the same corpus that shell-destructive rails/patterns scan. That means non-executable reasoning text can flip an otherwise safe action to HIGH. Repro: an action whose actual command is ls /tmp but whose thought says I should avoid rm -rf / currently returns HIGH in both PatternSecurityAnalyzer and EnsembleSecurityAnalyzer. That seems to undercut the stated goal of avoiding reasoning-related false positives. I think the safest fix is to keep shell/permission detection scoped to executable fields (tool_call.arguments, maybe tool metadata) and, if you still want thought/summary coverage, handle that in a separate prompt-injection/text-only path.

[examples/01_standalone_sdk/45_defense_in_depth_security.py, Lines 345-346 and 446] 🛠️ Correctness: the raw-disk dd detection only matches if=... of=/dev/.... dd operands are order-independent, so common destructive forms like dd of=/dev/sda if=/dev/zero and dd bs=1M of=/dev/sda if=/dev/zero currently fall through as LOW end-to-end (rail PASS, pattern LOW, ensemble LOW). Please make this check order-independent and add a regression test for of= before if=.

Fieldnote-Echo · 2026-03-16T22:39:36Z

Thank you @enyst for the feedback. I am drafting fixes and will push again when they are ready.

Finding 1: shell-destructive patterns now scan only executable fields (tool_name, tool_call.name, tool_call.arguments). Thought, reasoning, and summary are scanned only for injection/social-engineering patterns. Prevents false positives when an agent reasons about dangerous commands it chose not to run. Finding 2: dd raw-disk detection now matches of=/dev/ regardless of operand order. Adds regression tests for reversed and interleaved operands.

Fieldnote-Echo · 2026-03-16T22:53:28Z

@enyst Thanks OpenHands for the thorough review. Both findings are fixed in c398cf2: extraction now splits into executable and text corpora so shell patterns never see reasoning content, and the dd rail matches of=/dev/ regardless of operand order.

Co-authored-by: openhands <openhands@all-hands.dev>

Companion to OpenHands/software-agent-sdk#2472.

Fieldnote-Echo · 2026-03-17T21:13:20Z

Companion docs PR: OpenHands/docs#402

csmith49

Thanks for the contribution here! I'll leave some file-level comments/questions, but I also have a small project org. concern with the tests.

It's definitely a change in how things are organized to add SDK tests to things that are only in the examples. The current pattern is that the examples surface things in the SDK, so we don't need "example tests" because they're captured by the standard unit tests.

We could definitely just roll with this, but an alternative I prefer would be to move the relevant security analyzers to the SDK proper and just have the example surface how to use/configure them. They're general enough and useful enough that people building on the SDK might want to use them without the dynamic module loading tricks relied upon in the examples.

ETA: another alternative might be turning the example into a folder and package the tests directly with it instead of in tests/sdk.

@xingyaoww Any thoughts on the organization here?

csmith49 · 2026-03-18T19:37:56Z

examples/01_standalone_sdk/45_defense_in_depth_security.py

+    return _PASS
+
+
+def _evaluate_rail(content: str) -> RailDecision:


I'm a little confused on the distinction between this function and the PatternSecurityAnalyzer. Could we just map the has_* fields in the _evaluate_rail_segments function to a pattern and get the same result?

csmith49 · 2026-03-18T19:39:52Z

examples/01_standalone_sdk/45_defense_in_depth_security.py

+       {LOW, MEDIUM, HIGH} and UNKNOWN. If any concrete result exists,
+       return the highest. If ALL are UNKNOWN, propagate UNKNOWN.
+
+    Why max-severity instead of noisy-OR: the analyzers are correlated


The docs here present max-severity as an alternative to noisy-OR but I'm not sure where the latter is actually defined. Was this part of an earlier design iteration or do we use it elsewhere in the SDK?

csmith49 · 2026-03-18T19:43:41Z

examples/01_standalone_sdk/45_defense_in_depth_security.py

+        # Step 1-2: Policy rails (on executable-field segments only)
+        # Rails detect shell-level threats; reasoning text would cause
+        # false positives (e.g. thought "avoid rm -rf" on a safe command).
+        if self.enable_policy_rails:


Related to my question on PatternSecurityAnalyzer and _evaluate_rail_segments: the rails get converted directly to SecurityRisk.HIGH, which may not be needed if those rails are just expressed as patterns.

One alternative: keep the EnsembleSecurityAnalyzer a true ensemble that doesn't add any other logic except combining other security analyzers, and move this enable_policy_rails-guarded logic to a more primitive security analyzer (either as patterns or a separate PolicyRailSecurityAnalyzer).

csmith49 · 2026-03-18T19:48:43Z

examples/01_standalone_sdk/45_defense_in_depth_security.py

+# ---------------------------------------------------------------------------
+
+# Severity ordering for concrete (non-UNKNOWN) risk levels.
+_SEVERITY_ORDER = {SecurityRisk.LOW: 0, SecurityRisk.MEDIUM: 1, SecurityRisk.HIGH: 2}


I understand you're probably trying to keep everything self-contained in this example. If we end up wanting to move this to the SDK proper this kind of logic should probably live in an overloaded SecurityRisk.__lt__ method, which would let us replace the

max(concrete, key=lambda r: _SEVERITY_ORDER[r])

logic on line 778 with just

max(concrete)

Fieldnote-Echo · 2026-03-18T21:51:51Z

@csmith49 Thank you Calvin for the substantive feedback. It seems the requested changes only make sense if this is moved into the SDK proper, so I will wait for feedback from @xingyaoww before pushing any new commits.

xingyaoww · 2026-03-19T17:45:37Z

prefer would be to move the relevant security analyzers to the SDK proper and just have the example surface how to use/configure them. They're general enough and useful enough that people building on the SDK might want to use them without the dynamic module loading tricks relied upon in the examples.

Agree with this! We could potentially just come up with a new type of SecurityAnalyzer instead of dumping all the logic in the example script.

I'd love to get this actually into SDK, maybe even as a security analyzer that always run since it doesn't seems to consume too much resource 🙏

Fieldnote-Echo · 2026-03-20T17:48:13Z

@xingyaoww @csmith49 Solid direction, thank you. I'll refactor it into a new SecurityAnalyzer type in the SDK. Before I push anything, I'll study the existing analyzer module structure to make sure it fits the current patterns. Any preferences on where it should live under openhands_aci/?

csmith49 · 2026-03-20T17:54:28Z

I expect most everything can live as a sub-module in openhands.sdk.security, much like the grayswan security analyzer we already have. But your contribution will represent a significant portion of that module, so do what makes sense for this approach!

all-hands-bot · 2026-03-26T12:34:02Z

[Automatic Post]: It has been a while since there was any activity on this PR. @Fieldnote-Echo, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.

all-hands-bot · 2026-03-26T12:34:10Z

[Automatic Post]: It has been a while since there was any activity on this PR. @Fieldnote-Echo, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.

Fieldnote-Echo · 2026-03-26T16:46:23Z

@all-hands-bot still working, preparing offline before sending any pushes. If it's being considered for the SDK proper I want to make sure it earns the spot.

Fieldnote-Echo mentioned this pull request Mar 16, 2026

docs(examples): Custom SecurityAnalyzerBase with defense-in-depth patterns #2067

Open

Fieldnote-Echo and others added 3 commits March 16, 2026 17:15

docs: surface _EXTRACT_HARD_CAP tradeoff for downstream integrators

e18fc80

The 30k-char extraction cap silently drops content past the limit. Integrators who don't read the adversarial test suite won't know they're exposed if their agent processes large inputs.

Merge branch 'main' into feat/defense-in-depth-security-analyzer

3433b1e

refactor(tests): extract importlib hack into conftest.py

1e50a1a

The digit-prefixed example filename (45_...) requires importlib to import. Move the hack into conftest.py so it runs once at collection time; both test files now just reference sys.modules.

enyst reviewed Mar 16, 2026

View reviewed changes

This comment was marked as duplicate.

Sign in to view

Fieldnote-Echo and others added 2 commits March 17, 2026 15:41

style: apply pre-commit ruff-format fixes

90bc325

style: fix ruff-format in conftest.py

86ce55f

Co-authored-by: openhands <openhands@all-hands.dev>

Fieldnote-Echo added a commit to Fieldnote-Echo/docs that referenced this pull request Mar 17, 2026

docs: add defense-in-depth security analyzer section

181da50

Companion to OpenHands/software-agent-sdk#2472.

Fieldnote-Echo mentioned this pull request Mar 17, 2026

docs: add defense-in-depth security analyzer section OpenHands/docs#402

Open

3 tasks

enyst requested review from csmith49 and neubig March 18, 2026 08:03

csmith49 requested changes Mar 18, 2026

View reviewed changes

csmith49 reviewed Mar 18, 2026

View reviewed changes

		return _PASS


		def _evaluate_rail(content: str) -> RailDecision:

Conversation

Fieldnote-Echo commented Mar 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Checklist

Uh oh!

enyst commented Mar 16, 2026

Uh oh!

openhands-ai bot commented Mar 16, 2026

Uh oh!

enyst left a comment

Choose a reason for hiding this comment

Uh oh!

This comment was marked as duplicate.

Fieldnote-Echo commented Mar 16, 2026

Uh oh!

Fieldnote-Echo commented Mar 16, 2026

Uh oh!

Fieldnote-Echo commented Mar 17, 2026

Uh oh!

csmith49 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

csmith49 Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

csmith49 Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

csmith49 Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

csmith49 Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

Fieldnote-Echo commented Mar 18, 2026

Uh oh!

xingyaoww commented Mar 19, 2026

Uh oh!

Fieldnote-Echo commented Mar 20, 2026

Uh oh!

csmith49 commented Mar 20, 2026

Uh oh!

all-hands-bot commented Mar 26, 2026

Uh oh!

all-hands-bot commented Mar 26, 2026

Uh oh!

Fieldnote-Echo commented Mar 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Fieldnote-Echo commented Mar 16, 2026 •

edited

Loading

csmith49 left a comment •

edited

Loading