Agent Skill: Single Decision-Point SSVC Evaluator (Elimination-Based Selection Generation) #1156

> [!note]
> This issue is intentionally **less prescriptive** than #1152. Implementers have latitude in how they approach the problem. #1152 exists as a technical reference with additional design hints (e.g., specific libraries, library placement) but should not be treated as a requirements document for this task.

#### Problem Context

We already have a structured SSVC domain model in Python (Pydantic-based), including `DecisionPoint`, `DecisionPointValue`, and `Selection`. What is missing is a reusable evaluation *skill* that can take a single decision point plus arbitrary evidence and produce a valid `Selection` in a consistent, repeatable way.

This is the first vertical slice toward agent-assisted SSVC classification. It intentionally does not include full SSVC tree evaluation or multi-decision-point orchestration.

---

#### Objective

Implement a reusable "decision-point evaluation skill" that enables an agent (or tool-using LLM) to evaluate exactly one SSVC `DecisionPoint` against a bounded evidence set and produce a `Selection`.

The evaluation model is explicitly **elimination-based (via negativa)**:

* Start from the full set of allowed `DecisionPoint.values`
* Use the provided evidence to eliminate values that it contradicts (values that are only weakly supported or not addressed remain candidates; see Core Behavior)
* Return the remaining viable values as the `Selection`
* If no values can be eliminated with confidence, return the full set with an explanation that evidence is insufficient to disambiguate

This is not a "best answer selection" problem. It is a constraint reduction problem over a closed value set.
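The reduction described above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: `Value` is a hypothetical stand-in for `DecisionPointValue`, the `eliminate` helper and the sample values are invented for the example, and the set of contradicted keys is assumed to come from whatever reasoning step (LLM or otherwise) the implementer chooses.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """Hypothetical stand-in for DecisionPointValue (key, name, description)."""
    key: str
    name: str
    description: str


def eliminate(values: tuple[Value, ...], contradicted: set[str]) -> tuple[Value, ...]:
    """Drop values whose key is contradicted; keep the rest in original order."""
    remaining = tuple(v for v in values if v.key not in contradicted)
    # Eliminating every value is treated as insufficient evidence:
    # fall back to the full closed set instead of returning nothing.
    return remaining or values


# Illustrative values only; not copied from a real SSVC decision point.
VALUES = (
    Value("N", "None", "No evidence of exploitation."),
    Value("P", "PoC", "A proof of concept exists."),
    Value("A", "Active", "Exploitation has been observed."),
)

print([v.key for v in eliminate(VALUES, {"N"})])  # ['P', 'A']
print([v.key for v in eliminate(VALUES, set())])  # ['N', 'P', 'A']
```

The function never ranks or scores values; it only removes candidates from a closed set, which is the distinction from "best answer selection" drawn above.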

---

#### Inputs

1. **DecisionPoint** (`src/ssvc/decision_points/base.py`)

   * Fully defined Pydantic object
   * Provides: name, description, version, namespace, key, and a tuple of valid `DecisionPointValue` objects
   * Each `DecisionPointValue` has a `name`, `key`, and `description` — these descriptions are the primary semantic source for reasoning about which values apply

2. **DecisionPoint documentation** (optional, supplemental)

   * Additional context beyond what is embedded in the Pydantic object: extended definitions, worked examples, usage notes
   * May be absent; the embedded `DecisionPointValue.description` fields are the minimum required semantics

3. **Evidence bundle**

   * Arbitrary user-provided text
   * May include: incident reports, vulnerability descriptions, email threads, documentation excerpts, unstructured notes
   * No requirement for preprocessing or normalization upstream

---

#### Core Behavior

The evaluator must:

1. Treat the `DecisionPoint.values` as the complete hypothesis space
2. Analyze evidence to determine which values are:
   * **contradicted** — directly ruled out by evidence → eliminate
   * **weakly supported** — mentioned or partially consistent with evidence but not conclusively confirmed → remain possible
   * **not addressed** — evidence says nothing about the value → remain possible
3. Produce a reduced candidate set of values

   * Note: eliminating *all* values would be an error — if evidence would eliminate all candidates, treat this as insufficient evidence and return the full set

4. If ambiguity remains, explicitly surface:
   * why multiple values remain viable
   * what evidence would disambiguate them
   * *(in interactive mode)* optionally ask the user for that disambiguating information

5. If the evidence is insufficient to eliminate any value:
   * return all values as viable
   * explicitly indicate insufficiency of evidence
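One possible shape for the steps above, assuming the reasoning step has already assigned each value key one of the three verdicts. The `Verdict` enum, the `Reduction` record, and `reduce_candidates` are hypothetical names introduced for this sketch; none of them exist in the codebase.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    CONTRADICTED = "contradicted"
    WEAKLY_SUPPORTED = "weakly supported"
    NOT_ADDRESSED = "not addressed"


@dataclass
class Reduction:
    remaining: list[str]          # keys of values that are still viable
    insufficient_evidence: bool   # True when nothing could be ruled out
    notes: list[str]              # human-readable explanation of the outcome


def reduce_candidates(verdicts: dict[str, Verdict]) -> Reduction:
    """Apply the elimination rules to per-value verdicts (key -> verdict)."""
    all_keys = list(verdicts)
    remaining = [k for k, v in verdicts.items() if v is not Verdict.CONTRADICTED]
    if not remaining:
        # Eliminating everything is an error case: return the full set.
        note = "Evidence contradicted every value; returning the full set."
        return Reduction(all_keys, True, [note])
    if len(remaining) == len(all_keys):
        note = "No value could be eliminated; evidence is insufficient."
        return Reduction(remaining, True, [note])
    notes = []
    if len(remaining) > 1:
        # Ambiguity remains: record why each surviving value is still viable.
        notes = [f"{k} remains viable ({verdicts[k].value})." for k in remaining]
    return Reduction(remaining, False, notes)


result = reduce_candidates({
    "N": Verdict.CONTRADICTED,
    "P": Verdict.WEAKLY_SUPPORTED,
    "A": Verdict.NOT_ADDRESSED,
})
print(result.remaining)  # ['P', 'A']
print(result.notes)
```

Keeping the verdicts separate from the final key list makes it straightforward to report why multiple values remain and what kind of evidence would separate them.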

---

#### Output Contract

Return a valid `Selection` object (`src/ssvc/selection.py`):

* Must reference the original `DecisionPoint` (via matching `namespace`, `key`, and `version`)
* `values` must be a non-empty list of `MinimalDecisionPointValue` objects (key only) drawn from the original `DecisionPoint.values`
* Must pass Pydantic validation — the JSON schema is already generated and available at `data/schema/v2/SelectionList_2_0_0.schema.json` and can be regenerated via `make regenerate_json`

> **Note on `SelectionList`:** `Selection` is the appropriate output for a single decision-point evaluation. If the calling context needs a timestamped, multi-selection record, wrapping in a `SelectionList` is the caller's responsibility — it is out of scope here.

**"Valid output"** means schema-conformant: a `Selection` that passes Pydantic validation. LLM inference variability is acceptable; outputs that fail schema validation should be retried or corrected before being returned.
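A validate-and-retry loop might look like the sketch below. Plain dictionaries stand in for the Pydantic models so the example is self-contained; in the real skill the check would be the `Selection` model's own validation. `check_selection`, `evaluate_with_retry`, the `propose` callback, and the sample decision point are all hypothetical, and the fallback to the full value set once retries are exhausted is an assumption rather than something this issue requires.

```python
from __future__ import annotations

from typing import Callable


def check_selection(candidate: dict, decision_point: dict) -> list[str]:
    """Return a list of problems; an empty list means the candidate passes."""
    problems = []
    for field in ("namespace", "key", "version"):
        if candidate.get(field) != decision_point[field]:
            problems.append(f"{field} does not match the decision point")
    allowed = {v["key"] for v in decision_point["values"]}
    chosen = [v.get("key") for v in candidate.get("values", [])]
    if not chosen:
        problems.append("values must be non-empty")
    unknown = [k for k in chosen if k not in allowed]
    if unknown:
        problems.append(f"unknown value keys: {unknown}")
    return problems


def evaluate_with_retry(
    propose: Callable[[list[str]], dict],
    decision_point: dict,
    max_attempts: int = 3,
) -> dict:
    """Call propose(feedback) until its output passes the checks."""
    feedback: list[str] = []
    for _ in range(max_attempts):
        candidate = propose(feedback)
        feedback = check_selection(candidate, decision_point)
        if not feedback:
            return candidate
    # Assumption: after exhausting retries, return the full value set.
    return {
        "namespace": decision_point["namespace"],
        "key": decision_point["key"],
        "version": decision_point["version"],
        "values": [{"key": v["key"]} for v in decision_point["values"]],
    }


# Illustrative decision point; the identifiers are made up.
DP = {"namespace": "example", "key": "EX", "version": "1.0.0",
      "values": [{"key": "N"}, {"key": "P"}, {"key": "A"}]}
header = {"namespace": "example", "key": "EX", "version": "1.0.0"}
attempts = iter([
    {**header, "values": [{"key": "Z"}]},                # rejected: unknown key
    {**header, "values": [{"key": "P"}, {"key": "A"}]},  # accepted
])
selection = evaluate_with_retry(lambda feedback: next(attempts), DP)
print([v["key"] for v in selection["values"]])  # ['P', 'A']
```

Passing the previous attempt's problems back into `propose` lets an LLM-backed implementation correct its output instead of resampling blindly.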

---

#### Interaction Modes

The skill should support two execution modes:

##### 1. Pipeline mode

* Fully automated
* Returns `Selection` only
* No external interaction; the provided evidence is all there is

##### 2. Interactive mode

* May request additional evidence if necessary
* May ask clarifying questions when evidence is insufficient to distinguish between remaining values
* Must still converge to a valid `Selection` — if the user cannot or does not provide disambiguating information, fall back to returning all remaining viable values
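The two modes can share one code path if interactivity is modeled as an optional callback. The sketch below assumes a hypothetical `evaluate` function that maps evidence text to the list of remaining value keys; `run`, `ask`, `max_rounds`, and the keyword-based `toy_evaluate` are illustrative stand-ins, not part of the project.

```python
from __future__ import annotations

from typing import Callable, Optional


def run(
    evidence: str,
    evaluate: Callable[[str], list[str]],
    ask: Optional[Callable[[str], Optional[str]]] = None,
    max_rounds: int = 2,
) -> list[str]:
    """Pipeline mode when ask is None; interactive mode otherwise."""
    remaining = evaluate(evidence)
    if ask is None:
        return remaining  # pipeline mode: the given evidence is all there is
    rounds = 0
    while len(remaining) > 1 and rounds < max_rounds:
        answer = ask(f"Values {remaining} remain viable. What would rule one out?")
        if not answer:
            break  # user cannot help: fall back to the current candidates
        evidence = evidence + "\n" + answer
        remaining = evaluate(evidence)
        rounds += 1
    return remaining


def toy_evaluate(evidence: str) -> list[str]:
    """Keyword-matching placeholder for the real reasoning step."""
    text = evidence.lower()
    keys = ["N", "P", "A"]
    if "exploit code" in text:
        keys.remove("N")
    if "attacks observed" in text:
        keys = ["A"]
    return keys


print(run("Public exploit code exists.", toy_evaluate))  # ['P', 'A']
print(run("Public exploit code exists.", toy_evaluate,
          ask=lambda question: "Attacks observed last week."))  # ['A']
```

Bounding the number of questions guarantees convergence: the loop always terminates with whatever candidates remain, which is then wrapped in a `Selection`.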

---

#### What is an "Agent Skill"?

A skill in this project is a SKILL.md file (plus any supporting code or resources) placed under `.agents/skills/`. A stub already exists at `.agents/skills/ssvc/evaluate-decision-point/SKILL.md`. This issue is to implement that stub.

See `.agents/skills/README.md` for the skill format. A first working version can be purely a well-structured `SKILL.md` with supporting Python tooling; sophistication can be added iteratively.

---

#### Key Design Constraints

* The evaluator is generic across all SSVC decision points
* No per-decision-point custom logic should be required
* `DecisionPoint` definition (name, description, value names/descriptions) is the primary source of semantics
* Evidence is treated as unstructured input with no schema guarantees
* Output must always be schema-valid (see Output Contract above)
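Because the decision point definition carries the semantics, a generic evaluator can build its reasoning context from those fields alone. The sketch below renders a prompt from a dictionary with the fields listed under Inputs; `render_prompt`, the prompt wording, and the sample decision point are hypothetical, and an implementation that does not use an LLM would replace this step entirely.

```python
from __future__ import annotations


def render_prompt(decision_point: dict, evidence: str) -> str:
    """Build an elimination prompt from the decision point's own fields."""
    lines = [
        f"Decision point: {decision_point['name']} (version {decision_point['version']})",
        decision_point["description"],
        "",
        "Candidate values:",
    ]
    for value in decision_point["values"]:
        lines.append(f"- {value['key']}: {value['name']}. {value['description']}")
    lines += [
        "",
        "Evidence:",
        evidence.strip(),
        "",
        "List the keys of the values that the evidence contradicts.",
        "Do not eliminate a value merely because the evidence does not mention it.",
    ]
    return "\n".join(lines)


# Illustrative decision point; not copied from the SSVC registry.
SAMPLE_DP = {
    "name": "Example Exploitation",
    "version": "1.0.0",
    "description": "The present state of exploitation of the vulnerability.",
    "values": [
        {"key": "N", "name": "None", "description": "No evidence of exploitation."},
        {"key": "A", "name": "Active", "description": "Exploitation has been observed."},
    ],
}
prompt = render_prompt(SAMPLE_DP, "  Vendor advisory reports attacks in progress.  ")
print(prompt)
```

Nothing in the function refers to a particular decision point, so the same code serves every decision point without custom logic.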

---

#### Non-Goals (Explicit Out of Scope)

* Full SSVC decision tree evaluation
* Multi-decision-point orchestration or batching
* External retrieval / search / indexing systems
* Designing new SSVC ontology or modifying existing decision points
* Building a general-purpose autonomous agent framework

---

#### Implementation Notes (Guidance, Not Requirements)

Implementers are expected to decide how to structure:

* prompt construction strategy (if using LLMs)
* evidence selection/filtering heuristics
* intermediate reasoning representations (if any)
* orchestration between Python and LLM components

However, the following are required:

* reuse existing Pydantic models for all structured IO
* enforce schema validation on outputs
* ensure the evaluator operates purely within the constraints of the provided `DecisionPoint`

Suggestions:

* A first version could consist solely of a well-written `SKILL.md` and a small Python helper that loads/validates objects — get the full workflow working before refining quality of each step
* See #1152 for additional design hints (library placement, framework options, retry semantics) — treat as optional reference, not requirements

---

#### Success Criteria

A minimal successful implementation:

* Accepts any valid `DecisionPoint`
* Accepts arbitrary evidence text
* Produces a schema-valid `Selection`
* Correctly reduces candidate values when evidence supports elimination
* Preserves all values when evidence is insufficient
* Can run in both pipeline and interactive modes

---

#### Future Extension Path (Context Only)

This work is expected to become the first component in a larger system that:

* composes multiple decision-point evaluations into full SSVC evaluation trees
* exposes the evaluator as a CLI tool
* is eventually wrapped as a service and/or agent "skill" interface

That future scope is explicitly not part of this task.
