For more details on the base class implementation, see the [source code](https://github.com/OpenHands/software-agent-sdk/blob/main/openhands-sdk/openhands/sdk/security/analyzer.py).
</Tip>

### Defense-in-Depth Security Analyzer

The LLM-based analyzer above relies on the model to assess risk. But what if
the model itself is compromised, or the action contains encoding evasions that
trick the LLM into rating a dangerous command as safe?

A **defense-in-depth** approach stacks multiple independent layers so each
covers the others' blind spots. The example below implements four layers in
a single file, using the standard library plus the SDK and Pydantic: no
model calls, no external services, and no extra dependencies beyond the
SDK's normal runtime environment.

1. **Extraction with two corpora**: separates *what the agent will do*
   (tool metadata and tool-call content) from *what it thought about*
   (reasoning, summary).
   Shell-destructive patterns only scan executable fields, so an agent that
   thinks "I should avoid rm -rf /" while running `ls /tmp` is correctly
# Every agent action now passes through the analyzer.
# HIGH -> confirmation prompt. MEDIUM -> allowed.
# UNKNOWN -> confirmed by default (confirm_unknown=True).
```

<Warning>
`conversation.execute_tool()` bypasses the analyzer and confirmation policy.
This example protects normal agent actions in the conversation loop; hard
enforcement for direct tool calls requires hooks.
</Warning>

#### Key design decisions

Understanding *why* the example is built this way helps you decide what to
keep, modify, or replace when adapting it:

- **Two corpora, not one.** Shell patterns on reasoning text produce false
  positives whenever the model discusses dangerous commands it chose not to
  run. Injection patterns (instruction overrides, mode switching) are
  textual attacks that make sense in any field. The split eliminates the
  first problem without losing the second.

- **Max-severity, not noisy-OR.** The analyzers scan the same input, so
  their verdicts are correlated. Noisy-OR assumes independent signals, so it
  would count the same evidence more than once. Max-severity makes no such
  assumption and is simpler to audit.

- **UNKNOWN is first-class.** Some analyzers may return UNKNOWN when they
  cannot assess an action or are not fully configured. The ensemble
  preserves UNKNOWN unless at least one analyzer returns a concrete risk.
  If the ensemble promoted UNKNOWN to HIGH, composing with optional
  analyzers would be unusable.

- **Stdlib-only normalization.** NFKC normalization plus invisible/bidi
  stripping covers the most common encoding evasions. Full confusable
  detection (TR39) is documented as a known limitation, not silently
  omitted.

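The max-severity and UNKNOWN rules above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the example's actual code: `Risk` is a hypothetical stand-in for the SDK's `SecurityRisk`, and `fuse_max_severity` is an assumed helper name.

```python
from enum import Enum


class Risk(Enum):
    """Hypothetical stand-in for the SDK's SecurityRisk levels."""

    UNKNOWN = "UNKNOWN"
    LOW = "LOW"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"


# Ordering applies to concrete levels only; UNKNOWN is deliberately excluded
# so it can never outrank or be confused with a real assessment.
_SEVERITY = {Risk.LOW: 0, Risk.MEDIUM: 1, Risk.HIGH: 2}


def fuse_max_severity(results):
    """Return the highest concrete risk, or UNKNOWN if no analyzer gave one."""
    concrete = [r for r in results if r is not Risk.UNKNOWN]
    if not concrete:
        return Risk.UNKNOWN
    return max(concrete, key=_SEVERITY.__getitem__)
```

With this rule, an optional analyzer that returns UNKNOWN neither raises nor lowers the result as long as some other analyzer produced a concrete level, which is what makes optional analyzers composable.
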
#### Known limitations

The example documents its boundaries explicitly:

| Limitation | Why it exists | What would fix it |
|---|---|---|
| No hard-deny at the `SecurityAnalyzer` boundary | The SDK analyzer returns `SecurityRisk`, not block/allow decisions | Hook-based enforcement |
| `conversation.execute_tool()` bypasses analyzer checks | Direct tool execution skips the normal agent decision path | Avoid bypass path or enforce through hooks |
| No Cyrillic/homoglyph detection | NFKC maps compatibility variants, not cross-script confusables | Unicode TR39 tables (not in stdlib) |
| Content beyond the 30k extraction cap is not scanned | Hard cap prevents regex denial-of-service | Raise the cap (increases ReDoS exposure) |
| `thinking_blocks` not scanned | Scanning reasoning artifacts would create high false-positive risk by treating internal deliberation as executable intent | Separate injection-only scan of CoT |
| `curl \| node` not detected | Interpreter list covers sh/bash/python/perl/ruby only | Expand the list (increases false positives) |

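The normalization step and its homoglyph blind spot can be illustrated with a short stdlib-only sketch. The `normalize` helper and the `_INVISIBLE` character set are assumptions chosen for illustration; the actual example may strip a different set of characters.

```python
import unicodedata

# Illustrative (not exhaustive) set of zero-width and bidi control characters.
_INVISIBLE = {
    "\u200b", "\u200c", "\u200d", "\u2060", "\ufeff",  # zero-width chars, BOM
    "\u200e", "\u200f",                                # LRM, RLM marks
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # embeddings, overrides
    "\u2066", "\u2067", "\u2068", "\u2069",            # isolates
}


def normalize(text):
    """Strip invisible/bidi characters, then apply NFKC normalization."""
    stripped = "".join(ch for ch in text if ch not in _INVISIBLE)
    return unicodedata.normalize("NFKC", stripped)


# Fullwidth letters are compatibility variants, so NFKC folds them to ASCII.
fullwidth = normalize("\uff52\uff4d -rf /")

# A zero-width space hidden inside a command name is removed.
hidden = normalize("r\u200bm -rf /")

# Cyrillic "\u0441" looks like Latin "c" but is a different script;
# NFKC leaves it unchanged, which is the limitation in the table above.
homoglyph = normalize("\u0441url")
```

Here `fullwidth` and `hidden` both come out as the plain ASCII string `rm -rf /`, while `homoglyph` still differs from `curl`. Closing that gap requires confusable tables such as those in Unicode TR39, which the standard library does not ship.
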
#### Ready-to-run example

<Note>
Full defense-in-depth example: [examples/01_standalone_sdk/45_defense_in_depth_security.py](https://github.com/OpenHands/software-agent-sdk/blob/main/examples/01_standalone_sdk/45_defense_in_depth_security.py)