fix(skills): prevent LLM from refusing pagination and denying filter support #248
Open
yuvalk wants to merge 3 commits into
Conversation
The LLM was paginating through all CVEs (or refusing to) when the user asked "how many critical CVEs" — a question answerable from meta.total_items on a single filtered call. Add an explicit counting workflow example that teaches this pattern. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…dling The LLM was refusing to paginate large CVE result sets, claiming it was "beyond operational capacity." Add a [STRICT] section that explicitly forbids refusal, teaches the meta.total_items counting pattern, and reinforces the filter-first approach for large datasets. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The LLM falsely claimed vulnerability__get_system_cves does not support severity or remediation filtering. Elevate the filter parameter section to [STRICT] priority and add an explicit instruction not to deny the existence of listed parameters. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
dmartinol (Collaborator) approved these changes on May 27, 2026 and left a comment:
Will you raise a separate PR for the other issue about using the wrong tool for counting?
how many systems are in the lightspeed inventory?
[Responded with:]
/PLANNING/
Call vulnerability__get_systems without any filters to get the total number of systems in the Lightspeed Vulnerability inventory.
...
So it pulled the list of hosts that talk to the vulnerability service, not the hosts in the inventory service.
...
Summary
The LLM (Gemini 2.5 Flash) was refusing to paginate through CVE results and falsely claiming that `vulnerability__get_system_cves` doesn't support severity or remediation filtering. It responded with "processing such a large volume of data page by page...is beyond my current operational capacity", language not grounded in any instruction we gave it.

Root cause: the LLM hallucinated both the tool limitation and a capacity constraint. Our existing skills documented the correct filters and pagination behavior, but the instructions weren't assertive enough to prevent the model from inventing reasons to give up.
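For reviewers who want the intended behavior in concrete terms, here is a minimal sketch of the single-call counting pattern that the skills are meant to teach. It is not code from this PR: `fake_get_system_cves` is a hypothetical in-memory stand-in for the real tool, the `limit` and `offset` parameter names are assumptions, and only `severity` and `meta.total_items` are taken from the description above.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical sample data standing in for one host's CVE list.
_SAMPLE_CVES: List[Dict[str, str]] = [
    {"id": "sample-1", "severity": "Critical"},
    {"id": "sample-2", "severity": "Low"},
    {"id": "sample-3", "severity": "Critical"},
    {"id": "sample-4", "severity": "Moderate"},
]


def fake_get_system_cves(
    severity: Optional[str] = None, limit: int = 20, offset: int = 0
) -> Dict[str, Any]:
    """Hypothetical stand-in for the tool: filters first, then returns one page."""
    matching = [
        cve for cve in _SAMPLE_CVES
        if severity is None or cve["severity"] == severity
    ]
    return {
        "data": matching[offset:offset + limit],
        # total_items counts every match, not only the rows on this page.
        "meta": {"total_items": len(matching)},
    }


def count_matching(call_tool: Callable[..., Dict[str, Any]], **filters: Any) -> int:
    """Answer a "how many" question from one filtered call, without paginating."""
    response = call_tool(limit=1, **filters)  # assumed parameter name: limit
    return int(response["meta"]["total_items"])


if __name__ == "__main__":
    print(count_matching(fake_get_system_cves, severity="Critical"))
```

The point of the sketch is that the count comes from the response metadata of a filtered request, so the page size can be as small as the API allows and no further pages are fetched.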
Changes
- `pagination-handling/SKILL.md`: Added a `[STRICT]` "Never Refuse to Paginate or Count" section that explicitly forbids refusal, teaches the `meta.total_items` counting pattern (no need to paginate for "how many" queries), and reinforces the filter-first approach
- `multi-step-workflows/SKILL.md`: Elevated filter parameter lists to `[STRICT]` priority with an explicit instruction not to deny the existence of listed parameters. Added a counting workflow example showing how to answer "how many critical remediable CVEs on host X" with a single filtered call

Test plan
- `make lint` passes
- `make test` passes (5 pre-existing LiteLLM failures unrelated to this change)
- `severity=Critical` filter instead of fetching all 4,134 CVEs
- `meta.total_items` without paginating

🤖 Generated with Claude Code