
eval: rh-sre-fleet-inventory #22

Open
GuyZivRH wants to merge 20 commits into main from eval/rh-sre-fleet-inventory

Conversation

@GuyZivRH
Collaborator

@GuyZivRH GuyZivRH commented May 5, 2026

A/B evaluation for rh-sre fleet inventory skill.

Made with Cursor

GuyZivRH and others added 2 commits May 5, 2026 09:42
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
@GuyZivRH GuyZivRH marked this pull request as ready for review May 5, 2026 11:10
@GuyZivRH GuyZivRH marked this pull request as draft May 5, 2026 12:04
@GuyZivRH GuyZivRH marked this pull request as ready for review May 6, 2026 04:18
@GuyZivRH GuyZivRH marked this pull request as draft May 6, 2026 07:20
GuyZivRH and others added 3 commits May 6, 2026 11:34
Strip methodology from instruction, add tests for stale <7d check-in
heuristic, Vulnerable/Patched/Not Affected status strings, system UUID
tracking, and EOL RHEL compliance flagging.

Co-authored-by: Cursor <cursoragent@cursor.com>
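
The staleness heuristic named in the commit above could be tested along these lines. This is a minimal sketch, not the PR's actual test code: it assumes a system counts as stale when its last check-in is more than 7 days old, and the helper name `is_stale` and the ISO 8601 timestamp format are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold: the commit only mentions a "7d" check-in heuristic.
STALE_AFTER = timedelta(days=7)


def is_stale(last_seen: str, now: datetime) -> bool:
    """Return True if the last check-in is older than the threshold.

    `last_seen` is assumed to be an ISO 8601 timestamp; naive values
    are treated as UTC.
    """
    seen = datetime.fromisoformat(last_seen)
    if seen.tzinfo is None:
        seen = seen.replace(tzinfo=timezone.utc)
    return now - seen > STALE_AFTER
```

Passing `now` explicitly keeps the check deterministic, so a test can pin the evaluation date instead of depending on the wall clock.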
Replace broad keyword matching with case-sensitive checks for exact
fields: Vulnerable/Patched/Not Affected status strings, stale, last_seen,
remediation_available, get_cve_systems, display_name, fqdn.

Co-authored-by: Cursor <cursoragent@cursor.com>
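
A case-sensitive, exact-term check of the kind this commit describes might look like the sketch below. The term lists come from the commit message; the function name `missing_terms` and the whole-word matching rule are assumptions for illustration, not the PR's implementation.

```python
import re

# Terms taken from the commit message above.
STATUS_STRINGS = ("Vulnerable", "Patched", "Not Affected")
FIELD_NAMES = (
    "stale",
    "last_seen",
    "remediation_available",
    "get_cve_systems",
    "display_name",
    "fqdn",
)


def missing_terms(report: str) -> list[str]:
    """Return the required terms that do not appear in the report.

    Matching is case-sensitive and requires the term to stand alone,
    so "vulnerable" or "Unpatched" does not satisfy the check.
    """
    missing = []
    for term in STATUS_STRINGS + FIELD_NAMES:
        pattern = rf"(?<!\w){re.escape(term)}(?!\w)"
        if not re.search(pattern, report):
            missing.append(term)
    return missing
```

Returning the list of missing terms, rather than a bare boolean, makes a failing assertion say which field the report lacked.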
CVE- appears in any CVE report trivially. Now requires the exact
tool name get_cve_systems for meaningful discrimination.

Co-authored-by: Cursor <cursoragent@cursor.com>
@GuyZivRH GuyZivRH marked this pull request as ready for review May 6, 2026 20:53
GuyZivRH and others added 3 commits May 6, 2026 23:53
…wledge

Tests now check for: case-sensitive status strings (Vulnerable/Patched/
Not Affected), last_seen staleness, remediation_available flag,
display_name/fqdn identifiers, get_host_details/get_cve_systems tools.

Co-authored-by: Cursor <cursoragent@cursor.com>
- Add CLAUDE.md as treatment-only system prompt
- Move docs from supportive/docs/ to docs/ at submission root
- Nest skills under skills/fleet-inventory/ with sibling mcp-lightspeed-validator
- Fix all doc reference paths to /docs/ for container resolution
- Add skill usage hint to instruction.md
- Sync enhanced mock-lightspeed-mcp.py with pagination support

Co-authored-by: Cursor <cursoragent@cursor.com>
@GuyZivRH GuyZivRH marked this pull request as draft May 7, 2026 10:16
GuyZivRH and others added 3 commits May 7, 2026 13:54
Wire up the .ai-index/ semantic index so the skilled agent
discovers the docs library via CLAUDE.md system prompt.
Also deduplicate prior accidental repetitions.

Co-authored-by: Cursor <cursoragent@cursor.com>
Subagents were refusing to call MCP tools (e.g. inventory__list_hosts)
because CLAUDE.md's skill-first rule was interpreted too literally,
causing deadlock loops and timeouts during fleet-inventory evaluation.

Co-authored-by: Cursor <cursoragent@cursor.com>
Tests checked for exact snake_case API field names (display_name, fqdn,
last_seen, remediation_available) but the agent writes readable column
headers. Accept both forms. Also handle empty LLM responses in judge.

Co-authored-by: Cursor <cursoragent@cursor.com>
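
The two fixes in this commit can be sketched as follows. The readable header spellings, the helper names `has_field` and `parse_judge_verdict`, and the `PASS` verdict prefix are all hypothetical; the PR does not show the judge's real response format.

```python
from typing import Optional

# Each API field maps to the forms a report may use for it.
# The readable spellings are assumed, not taken from the PR.
ACCEPTED_FORMS = {
    "display_name": ("display_name", "Display Name"),
    "fqdn": ("fqdn", "FQDN"),
    "last_seen": ("last_seen", "Last Seen"),
    "remediation_available": ("remediation_available", "Remediation Available"),
}


def has_field(report: str, field: str) -> bool:
    """Accept either the snake_case name or its readable column header."""
    return any(form in report for form in ACCEPTED_FORMS[field])


def parse_judge_verdict(response: Optional[str]) -> bool:
    """Treat an empty or missing judge response as a failed check
    instead of raising while parsing it."""
    if response is None or not response.strip():
        return False
    # Assumed convention: the judge starts its reply with PASS or FAIL.
    return response.strip().upper().startswith("PASS")
```

Failing closed on an empty response means a judge outage cannot count as a pass; a retry could be layered on top if empty responses turn out to be transient.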
@GuyZivRH GuyZivRH marked this pull request as ready for review May 7, 2026 14:02
@GuyZivRH GuyZivRH marked this pull request as draft May 11, 2026 06:03
@GuyZivRH GuyZivRH marked this pull request as ready for review May 11, 2026 23:49
@GuyZivRH GuyZivRH marked this pull request as draft May 14, 2026 08:16
@GuyZivRH GuyZivRH marked this pull request as ready for review May 14, 2026 10:54
@GuyZivRH GuyZivRH marked this pull request as draft May 18, 2026 11:21
Replace vague conceptual checks with get_host_details tool, lightspeed
validator prereq, fleet size (63 systems), specific CVE references
(CVE-2024-12345), environment breakdown, and remediation transition.

Co-authored-by: Cursor <cursoragent@cursor.com>
@GuyZivRH GuyZivRH marked this pull request as ready for review May 18, 2026 14:36
@GuyZivRH GuyZivRH marked this pull request as draft May 24, 2026 18:09
Co-authored-by: Cursor <cursoragent@cursor.com>
@GuyZivRH GuyZivRH marked this pull request as ready for review May 24, 2026 18:24
@GuyZivRH GuyZivRH marked this pull request as draft May 24, 2026 18:32
@GuyZivRH GuyZivRH marked this pull request as ready for review May 25, 2026 06:05
@GuyZivRH GuyZivRH marked this pull request as draft May 25, 2026 06:07
@GuyZivRH GuyZivRH marked this pull request as ready for review May 25, 2026 06:07
