feat: asymmetric trigger detection — reduce false positives for negative queries by melanie531 · Pull Request #5 · aws-samples/sample-agent-skill-eval

melanie531 · 2026-03-19T05:18:39Z

Problem

Skills with common-word names (e.g. text-summary, acme-compliance) suffer high false-positive rates in trigger evaluation. When a should_trigger=false query is evaluated, the agent often casually mentions the skill name in its text output (e.g. "I could use text-summary here") without actually activating the skill. The current code treats any text mention as a trigger, causing negative queries to fail.

This was observed in CI pipelines where:

acme-compliance trigger score dropped to 50/100
text-summary trigger eval failed entirely

Solution

Refactor _detect_skill_trigger_from_parsed() into _classify_trigger_signal() that returns signal strength:

Signal	Meaning	Examples
`"tool"`	Strong — agent used a tool to activate the skill	Read SKILL.md, Bash script execution, Skill tool
`"text"`	Weak — agent mentioned skill name in text output	`"Using text-summary to..."`
`"none"`	No trigger detected	—
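
The classification in the table could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `classify_trigger_signal`, the shape of the parsed output (a dict with `tool_calls` and `text` keys), and the substring matching rule are all assumptions made for the example.

```python
from typing import Literal

Signal = Literal["tool", "text", "none"]

# Tools whose use is treated as a strong activation signal (assumed set,
# mirroring the "Examples" column: Read SKILL.md, Bash script, Skill tool).
STRONG_TOOLS = {"Skill", "Read", "Bash"}


def classify_trigger_signal(parsed: dict, skill_name: str) -> Signal:
    """Hypothetical stand-in for _classify_trigger_signal()."""
    name = skill_name.lower()

    # Strong signal: a tool call whose arguments reference the skill,
    # e.g. reading <skill>/SKILL.md or running one of its scripts.
    for call in parsed.get("tool_calls", []):
        args = str(call.get("input", "")).lower()
        if call.get("name") in STRONG_TOOLS and name in args:
            return "tool"

    # Weak signal: the skill name only appears in the agent's text output.
    if name in parsed.get("text", "").lower():
        return "text"

    return "none"
```

Checking tool calls before text means an output that both mentions and activates the skill is reported as `"tool"`, the stronger of the two signals.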

Asymmetric detection logic:

should_trigger=true queries: both tool and text signals count as triggers (unchanged)
should_trigger=false queries: only tool signals count — text mentions are ignored

This means a negative query won't fail just because the agent casually mentioned the skill name.
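
The asymmetric rule above can be expressed as a small decision function. The sketch below is illustrative only: `is_triggered` and `query_passes` are hypothetical names, and the signal is assumed to be one of the strings `"tool"`, `"text"`, or `"none"`.

```python
def is_triggered(signal: str, should_trigger: bool) -> bool:
    """Decide whether a signal counts as a trigger for this query type."""
    if should_trigger:
        # Positive queries: tool use or a text mention both count.
        return signal in ("tool", "text")
    # Negative queries: only actual tool use counts; text mentions are ignored.
    return signal == "tool"


def query_passes(signal: str, should_trigger: bool) -> bool:
    """A query passes when the detected outcome matches the expectation."""
    return is_triggered(signal, should_trigger) == should_trigger
```

Under this rule, a `"text"` signal passes in both directions: it satisfies a positive query and no longer fails a negative one.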

Backward Compatibility

_detect_skill_trigger_from_parsed() still exists and returns bool (wraps _classify_trigger_signal)
_detect_skill_trigger() unchanged
All 663 existing tests pass + 12 new tests added
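
One plausible way to keep the boolean API while delegating to the new classifier is sketched below. The PR only states that the old function wraps the new one; the mapping shown here (any signal other than `"none"` yields `True`) is an assumption chosen to match the previous "any mention is a trigger" behavior, and the classifier is a simplified placeholder.

```python
def classify_signal(parsed: dict, skill_name: str) -> str:
    """Simplified placeholder for the classifier; returns 'tool', 'text', or 'none'."""
    if any(skill_name in str(call) for call in parsed.get("tool_calls", [])):
        return "tool"
    if skill_name in parsed.get("text", ""):
        return "text"
    return "none"


def detect_skill_trigger_from_parsed(parsed: dict, skill_name: str) -> bool:
    """Hypothetical bool wrapper: existing callers keep getting True/False."""
    # Assumed mapping: both strong and weak signals count, as before the refactor.
    return classify_signal(parsed, skill_name) != "none"
```

Callers that need the asymmetric behavior would use the signal-returning function directly, while callers of the boolean function see no change.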

Tests Added

TestClassifyTriggerSignal (8 tests) — signal classification for tool/text/none
TestAsymmetricTriggerDetection (3 tests) — asymmetric behavior for positive vs negative queries

…gative queries

Refactor _detect_skill_trigger_from_parsed into _classify_trigger_signal that returns signal strength ('tool', 'text', or 'none') instead of a boolean.

Key change: for should_trigger=false queries, only 'tool' signals (Read SKILL.md, Bash script execution, Skill tool invocation) count as triggers. Text-only mentions of the skill name are ignored, since agents commonly mention skill names (e.g. 'text-summary', 'compliance') in their output without intending to activate the skill.

For should_trigger=true queries, both 'tool' and 'text' signals count, preserving existing behavior.

This fixes false positives that caused trigger eval failures for skills with common-word names (like text-summary, acme-compliance).

melanie531 merged commit c5639a5 into aws-samples:main Mar 19, 2026
2 checks passed

melanie531 merged 1 commit into aws-samples:main from melanie531:feat/trigger-detection-improvements
