Note
This issue is intentionally less prescriptive than #1152. Implementers have latitude in how they approach the problem. #1152 exists as a technical reference with additional design hints (e.g., specific libraries, library placement) but should not be treated as a requirements document for this task.
Problem Context
We already have a structured SSVC domain model in Python (Pydantic-based), including DecisionPoint, DecisionPointValue, and Selection. What is missing is a reusable evaluation skill that can take a single decision point plus arbitrary evidence and produce a valid Selection in a consistent, repeatable way.
This is the first vertical slice toward agent-assisted SSVC classification. It intentionally does not include full SSVC tree evaluation or multi-decision-point orchestration.
Objective
Implement a reusable "decision-point evaluation skill" that enables an agent (or tool-using LLM) to evaluate exactly one SSVC DecisionPoint against a bounded evidence set and produce a Selection.
The evaluation model is explicitly elimination-based (via negativa):
Start from the full set of allowed DecisionPoint.values
Use provided evidence to eliminate values that are contradicted or unsupported
Return the remaining viable values as the Selection
If no values can be eliminated with confidence, return the full set with an explanation that evidence is insufficient to disambiguate
This is not a "best answer selection" problem. It is a constraint reduction problem over a closed value set.
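The elimination model above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: `Value` is a hypothetical stand-in for `DecisionPointValue`, `eliminate` is an invented helper name, and the sample values are illustrative rather than copied from the SSVC repository.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """Hypothetical stand-in for DecisionPointValue (key and description only)."""

    key: str
    description: str


def eliminate(values: tuple[Value, ...], contradicted_keys: set[str]) -> tuple[Value, ...]:
    """Drop contradicted values from the closed set; never return an empty result."""
    unknown = contradicted_keys - {v.key for v in values}
    if unknown:
        # Only values defined by the decision point can be eliminated.
        raise ValueError(f"keys not in the decision point: {sorted(unknown)}")
    remaining = tuple(v for v in values if v.key not in contradicted_keys)
    # Eliminating everything is treated as insufficient evidence,
    # so fall back to the full set instead of returning nothing.
    return remaining or values


# Illustrative values only (assumption, not taken from the repository).
VALUES = (
    Value("N", "No evidence of exploitation"),
    Value("P", "Proof of concept exists"),
    Value("A", "Active exploitation observed"),
)
```

Calling `eliminate(VALUES, {"N"})` leaves the `P` and `A` values, while an empty set of contradicted keys returns all three unchanged.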
Inputs
DecisionPoint (src/ssvc/decision_points/base.py)
Fully defined Pydantic object
Provides: name, description, version, namespace, key, and a tuple of valid DecisionPointValue objects
Each DecisionPointValue has a name, key, and description — these descriptions are the primary semantic source for reasoning about which values apply
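DecisionPoint documentation (optional, supplemental)
DecisionPointValue.description fields are the minimum required semantics
Evidence bundle
No requirement for preprocessing or normalization upstream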
Core Behavior
The evaluator must:
Treat the DecisionPoint.values as the complete hypothesis space
Analyze evidence to determine which values are:
contradicted — directly ruled out by evidence → eliminate
weakly supported — mentioned or partially consistent with evidence but not conclusively confirmed → remain possible
not addressed — evidence says nothing about the value → remain possible
Produce a reduced candidate set of values
Note: eliminating all values would be an error — if evidence would eliminate all candidates, treat this as insufficient evidence and return the full set
If ambiguity remains, explicitly surface:
why multiple values remain viable
what evidence would disambiguate them
(in interactive mode) optionally ask the user for that disambiguating information
If no evidence is sufficient to eliminate any values:
return all values as viable
explicitly indicate insufficiency of evidence
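One way to represent the three evidence categories and the fallback rules above is a per-value verdict that is reduced to a candidate set. The sketch below assumes the verdicts have already been produced (for example by an LLM step); `Verdict`, `Reduction`, and `reduce_candidates` are hypothetical names, and the "what evidence would disambiguate" part is left out because it depends on the reasoning component.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    CONTRADICTED = "contradicted"          # eliminate
    WEAKLY_SUPPORTED = "weakly_supported"  # remains possible
    NOT_ADDRESSED = "not_addressed"        # remains possible


@dataclass(frozen=True)
class Reduction:
    remaining: tuple[str, ...]   # value keys that are still viable
    insufficient_evidence: bool  # True when nothing could be safely eliminated
    explanation: str


def reduce_candidates(verdicts: dict[str, Verdict]) -> Reduction:
    """Reduce per-value verdicts (keyed by value key) to a candidate set."""
    if not verdicts:
        raise ValueError("a decision point must define at least one value")
    all_keys = tuple(verdicts)
    remaining = tuple(k for k, v in verdicts.items() if v is not Verdict.CONTRADICTED)
    if not remaining:
        # Eliminating every value is an error condition: return the full set.
        return Reduction(all_keys, True, "Evidence would eliminate every value; returning the full set.")
    if len(remaining) == len(all_keys) and len(all_keys) > 1:
        return Reduction(all_keys, True, "No value could be eliminated; evidence is insufficient to disambiguate.")
    if len(remaining) > 1:
        return Reduction(remaining, False, "Multiple values remain viable: " + ", ".join(remaining))
    return Reduction(remaining, False, f"Only {remaining[0]} remains viable.")
```

Keeping the verdicts separate from the reduction step means the fallback rules can be tested without any model in the loop.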
Output Contract
Return a valid Selection object (src/ssvc/selection.py):
Must reference the original DecisionPoint (via matching namespace, key, and version)
values must be a non-empty list of MinimalDecisionPointValue objects (key only) drawn from the original DecisionPoint.values
Must pass Pydantic validation — the JSON schema is already generated and available at data/schema/v2/SelectionList_2_0_0.schema.json and can be regenerated via make regenerate_json
Note on SelectionList: Selection is the appropriate output for a single decision-point evaluation. If the calling context needs a timestamped, multi-selection record, wrapping in a SelectionList is the caller's responsibility and is out of scope here.
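The contract above adds cross-checks that go beyond schema conformance: the selection must point at the original decision point and use only its value keys. The following sketch uses plain dataclasses as hypothetical stand-ins (`DecisionPointStub`, `SelectionStub`) so it runs without the project installed; in the real skill the Pydantic `Selection` model would handle schema validation, and `check_selection` and `produce_with_retry` are invented helper names showing one possible retry loop for outputs that fail the checks.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class DecisionPointStub:
    """Hypothetical stand-in for the identifying fields of DecisionPoint."""

    namespace: str
    key: str
    version: str
    value_keys: tuple[str, ...]


@dataclass(frozen=True)
class SelectionStub:
    """Hypothetical stand-in for Selection (values are value keys only)."""

    namespace: str
    key: str
    version: str
    values: tuple[str, ...]


def check_selection(dp: DecisionPointStub, sel: SelectionStub) -> list[str]:
    """Return a list of contract violations; an empty list means the selection is acceptable."""
    problems = []
    if (sel.namespace, sel.key, sel.version) != (dp.namespace, dp.key, dp.version):
        problems.append("selection does not reference the original decision point")
    if not sel.values:
        problems.append("values must be non-empty")
    extra = [k for k in sel.values if k not in dp.value_keys]
    if extra:
        problems.append(f"values not in the decision point: {extra}")
    return problems


def produce_with_retry(
    dp: DecisionPointStub,
    propose: Callable[[list[str]], SelectionStub],
    max_attempts: int = 3,
) -> SelectionStub:
    """Call propose() until its output passes the checks, feeding back the previous problems."""
    problems: list[str] = []
    for _ in range(max_attempts):
        candidate = propose(problems)
        problems = check_selection(dp, candidate)
        if not problems:
            return candidate
    raise ValueError(f"no valid selection after {max_attempts} attempts: {problems}")
```

Passing the previous problems back into `propose` gives an LLM-backed proposer something concrete to correct on the next attempt.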
"Valid output" means schema-conformant: a Selection that passes Pydantic validation. LLM inference variability is acceptable; outputs that fail schema validation should be retried or corrected before being returned.
Interaction Modes
The skill should support two execution modes:
1. Pipeline mode
Fully automated
Returns Selection only
No external interaction; evidence provided is all there is
2. Interactive mode
May request additional evidence if necessary
May ask clarifying questions when evidence is insufficient to distinguish between remaining values
Must still converge to a valid Selection — if the user cannot or does not provide disambiguating information, fall back to returning all remaining viable values
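The two modes can share one code path that differs only in whether a question callback is used. This is a sketch under stated assumptions: `narrow` is a hypothetical callable standing in for the evidence-driven elimination step, `ask` stands in for whatever channel reaches the user, and `max_questions` is an invented bound to guarantee convergence.

```python
from __future__ import annotations

from enum import Enum
from typing import Callable, Iterable, Optional


class Mode(Enum):
    PIPELINE = "pipeline"
    INTERACTIVE = "interactive"


def evaluate(
    all_keys: tuple[str, ...],
    evidence: Iterable[str],
    narrow: Callable[[tuple[str, ...], list[str]], tuple[str, ...]],
    mode: Mode = Mode.PIPELINE,
    ask: Optional[Callable[[tuple[str, ...]], Optional[str]]] = None,
    max_questions: int = 2,
) -> tuple[str, ...]:
    """Return the viable value keys; interactive mode may gather more evidence first."""
    collected = list(evidence)
    # An empty result from narrow() falls back to the full set.
    remaining = narrow(all_keys, collected) or all_keys
    if mode is Mode.PIPELINE or ask is None:
        return remaining  # no external interaction: the evidence provided is all there is
    for _ in range(max_questions):
        if len(remaining) == 1:
            break
        answer = ask(remaining)
        if not answer:
            break  # user cannot or does not answer: keep the remaining viable values
        collected.append(answer)
        remaining = narrow(all_keys, collected) or all_keys
    return remaining
```

Because the loop is bounded and every exit returns a non-empty tuple, interactive mode always converges to something that can be turned into a Selection.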
What is an "Agent Skill"?
A skill in this project is a SKILL.md file (plus any supporting code or resources) placed under .agents/skills/. A stub already exists at .agents/skills/ssvc/evaluate-decision-point/SKILL.md. This issue is to implement that stub.
See .agents/skills/README.md for the skill format. A first working version can be purely a well-structured SKILL.md with supporting Python tooling; sophistication can be added iteratively.
Key Design Constraints
The evaluator is generic across all SSVC decision points
No per-decision-point custom logic should be required
DecisionPoint definition (name, description, value names/descriptions) is the primary source of semantics
Evidence is treated as unstructured input with no schema guarantees
Output must always be schema-valid (see Output Contract above)
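If an LLM is used, the generic constraint suggests building the prompt entirely from the decision point's own definition, with no per-decision-point branches. The sketch below is one possible shape, assuming hypothetical stand-in classes (`ValueStub`, `DPView`) and an invented `build_prompt` helper; the prompt wording itself is illustrative.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ValueStub:
    """Hypothetical stand-in for DecisionPointValue."""

    name: str
    key: str
    description: str


@dataclass(frozen=True)
class DPView:
    """Hypothetical stand-in for the semantic fields of DecisionPoint."""

    name: str
    description: str
    values: tuple[ValueStub, ...]


def build_prompt(dp: DPView, evidence: str) -> str:
    """Render a prompt from the decision point definition and unstructured evidence."""
    lines = [
        f"Decision point: {dp.name}",
        f"Description: {dp.description}",
        "Allowed values (the complete hypothesis space):",
    ]
    # Value descriptions are the primary semantic source, so include each one verbatim.
    lines += [f"- {v.key}: {v.name}. {v.description}" for v in dp.values]
    lines += [
        "Evidence (unstructured, may be incomplete):",
        evidence.strip() or "(none provided)",
        "List the keys of values the evidence contradicts. Do not pick a best answer.",
    ]
    return "\n".join(lines)
```

Nothing in the function depends on which decision point is passed in, so the same code serves every decision point that exposes a name, a description, and described values.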
Non-Goals (Explicit Out of Scope)
Full SSVC decision tree evaluation
Multi-decision-point orchestration or batching
External retrieval / search / indexing systems
Designing new SSVC ontology or modifying existing decision points
Building a general-purpose autonomous agent framework
Implementation Notes (Guidance, Not Requirements)
Implementers are expected to decide how to structure:
prompt construction strategy (if using LLMs)
evidence selection/filtering heuristics
intermediate reasoning representations (if any)
orchestration between Python and LLM components
However, the following are required:
reuse existing Pydantic models for all structured IO
enforce schema validation on outputs
ensure the evaluator operates purely within the constraints of the provided DecisionPoint
Suggestions:
A first version could consist solely of a well-written SKILL.md and a small Python helper that loads/validates objects — get the full workflow working before refining quality of each step
Success Criteria
A minimal successful implementation:
Future Extension Path (Context Only)
This work is expected to become the first component in a larger system that:
That future scope is explicitly not part of this task.