fix(skills): prevent LLM from refusing pagination and denying filter support#248

Open
yuvalk wants to merge 3 commits into
RHEcosystemAppEng:mainfrom
yuvalk:fix/prevent-llm-pagination-refusal

@yuvalk
Collaborator

@yuvalk yuvalk commented May 26, 2026

Summary

The LLM (Gemini 2.5 Flash) was refusing to paginate through CVE results and falsely claiming that vulnerability__get_system_cves doesn't support severity or remediation filtering. It responded with "processing such a large volume of data page by page...is beyond my current operational capacity" — language not grounded in any instruction we gave it.

Root cause: the LLM hallucinated both the tool limitation and a capacity constraint. Our existing skills documented the correct filters and pagination behavior, but the instructions weren't assertive enough to prevent the model from inventing reasons to give up.

Changes

  • pagination-handling/SKILL.md — Added a [STRICT] "Never Refuse to Paginate or Count" section that explicitly forbids refusal, teaches the meta.total_items counting pattern (no need to paginate for "how many" queries), and reinforces the filter-first approach
  • multi-step-workflows/SKILL.md — Elevated filter parameter lists to [STRICT] priority with an explicit instruction not to deny the existence of listed parameters. Added a counting workflow example showing how to answer "how many critical remediable CVEs on host X" with a single filtered call
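The counting pattern described in the changes above can be sketched in a few lines. This is only an illustration under stated assumptions: `get_system_cves` here is a hypothetical in-memory stand-in for vulnerability__get_system_cves, the `remediable`, `limit`, and `offset` parameter names are invented for the sketch, and the only response detail taken from this PR is that the total is reported in meta.total_items.

```python
from typing import Any, Optional

# Hypothetical fixture data: 200 CVEs, 20 of them Critical, 10 of those remediable.
_FAKE_CVES = [
    {
        "id": f"CVE-2026-{i:04d}",
        "severity": "Critical" if i % 10 == 0 else "Moderate",
        "remediable": i % 20 == 0,
    }
    for i in range(1, 201)
]


def get_system_cves(
    system: str,
    severity: Optional[str] = None,
    remediable: Optional[bool] = None,
    limit: int = 20,
    offset: int = 0,
) -> dict[str, Any]:
    """Stand-in for the real tool (assumed shape): filter server-side, return one page."""
    rows = [
        cve
        for cve in _FAKE_CVES
        if (severity is None or cve["severity"] == severity)
        and (remediable is None or cve["remediable"] == remediable)
    ]
    return {
        # total_items counts every match, not just the rows on this page.
        "meta": {"total_items": len(rows), "limit": limit, "offset": offset},
        "data": rows[offset : offset + limit],
    }


def count_cves(system: str, **filters: Any) -> int:
    """Answer a 'how many' question with one filtered call and no pagination."""
    page = get_system_cves(system, limit=1, **filters)
    return page["meta"]["total_items"]


critical_remediable = count_cves("host-x", severity="Critical", remediable=True)
print(critical_remediable)
```

The point of the design is that the filter narrows the result set before any page is fetched, so the count is available from the first response regardless of how many pages the full listing would span.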

Test plan

  • make lint passes
  • make test passes (5 pre-existing LiteLLM failures unrelated to this change)
  • Deploy to staging and test with the query: "How many critical CVEs need to be fixed on dur-pgs02.hcc-lab.com?"
  • Verify the agent uses severity=Critical filter instead of fetching all 4,134 CVEs
  • Verify the agent reports a count from meta.total_items without paginating
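The last two staging checks could be partly automated against a recorded tool-call trace. The sketch below assumes a hypothetical trace format (a list of dicts with `tool` and `args` keys); the actual agent logs may look different, and `check_counting_trace` is an invented helper, not part of this repository.

```python
from typing import Any

EXPECTED_TOOL = "vulnerability__get_system_cves"


def check_counting_trace(calls: list[dict[str, Any]]) -> list[str]:
    """Return a list of problems found in a trace for a 'how many critical CVEs' query."""
    problems: list[str] = []
    cve_calls = [call for call in calls if call["tool"] == EXPECTED_TOOL]

    # A count should come from meta.total_items on a single call, not from paging.
    if len(cve_calls) != 1:
        problems.append(f"expected 1 call to {EXPECTED_TOOL}, saw {len(cve_calls)}")

    # Every call should carry the severity filter instead of fetching all CVEs.
    for call in cve_calls:
        if call["args"].get("severity") != "Critical":
            problems.append("call is missing severity=Critical")

    return problems


good_trace = [{"tool": EXPECTED_TOOL, "args": {"severity": "Critical"}}]
bad_trace = [{"tool": EXPECTED_TOOL, "args": {"offset": n * 20}} for n in range(3)]

print(check_counting_trace(good_trace))
print(check_counting_trace(bad_trace))
```

An empty list means the trace matches the expected behavior; the unfiltered, paginated trace is flagged once for the call count and once per call that lacks the filter.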

🤖 Generated with Claude Code

yuvalk and others added 3 commits May 26, 2026 18:50
The LLM was paginating through all CVEs (or refusing to) when the user
asked "how many critical CVEs" — a question answerable from
meta.total_items on a single filtered call. Add an explicit counting
workflow example that teaches this pattern.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…dling

The LLM was refusing to paginate large CVE result sets, claiming it was
"beyond operational capacity." Add a [STRICT] section that explicitly
forbids refusal, teaches the meta.total_items counting pattern, and
reinforces the filter-first approach for large datasets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The LLM falsely claimed vulnerability__get_system_cves does not support
severity or remediation filtering. Elevate the filter parameter section
to [STRICT] priority and add an explicit instruction not to deny the
existence of listed parameters.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Collaborator

@luis5tb luis5tb left a comment

LGTM

Collaborator

@dmartinol dmartinol left a comment

Will you raise a separate PR for the other issue about using the wrong tool for counting?

how many systems are in the lightspeed inventory?
[Responded with:]
/PLANNING/
Call vulnerability__get_systems without any filters to get the total number of systems in the Lightspeed Vulnerability inventory.
...
So it pulled the list of hosts that report to the vulnerability service, not the list from the inventory service.
...
