Add multiclass classification (OVO, OVA)

## Context

gpredomics currently supports binary classification only (2 classes: 0 vs 1). Many clinical and biological problems involve multiple classes (e.g., disease subtypes, treatment response categories, multiple conditions).

## Proposed approaches

### One-vs-All (OVA / OVR)
- Train K binary classifiers, each separating one class from all others
- At prediction time, assign the class with the highest score/confidence
- **Pros**: simple, only K models needed, each model is a standard gpredomics binary model (fully interpretable)
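- **Cons**: class imbalance (one class vs all others); the K models are trained independently, so their scores are not calibrated against each other and comparing them directly can be unreliable; may produce ambiguous regions where several classifiers, or none, predict positive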
- **Implementation**: can be orchestrated externally (run gpredomics K times with relabeled y), or integrated into the engine for convenience
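
As a rough illustration of the external orchestration route, the sketch below relabels `y` for one OVA sub-problem and picks the class with the highest score at prediction time. It is not gpredomics code: `relabel_one_vs_all`, `predict_ova`, and the idea of a per-class scorer callable are hypothetical names chosen for this example, and it assumes the K scores are comparable enough to take a maximum over.

```python
from typing import Callable, Dict, Hashable, List, Sequence

# Hypothetical scorer type: maps one sample to a real-valued score,
# where a higher score means "more likely the positive class".
Scorer = Callable[[Sequence[float]], float]


def relabel_one_vs_all(y: Sequence[Hashable], positive: Hashable) -> List[int]:
    """Return binary labels: 1 for `positive`, 0 for every other class."""
    return [1 if label == positive else 0 for label in y]


def predict_ova(scorers: Dict[Hashable, Scorer], x: Sequence[float]) -> Hashable:
    """Assign the class whose binary model gives the highest score.

    Ties are broken by dict insertion order (the first class reaching the
    maximum wins); a real implementation may want an explicit rule.
    """
    return max(scorers, key=lambda cls: scorers[cls](x))
```

Each relabeled `y` would feed one ordinary binary run; the K resulting models are then wrapped as scorers and passed to `predict_ova`.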

### One-vs-One (OVO)
- Train K(K-1)/2 binary classifiers, one for each pair of classes
- At prediction time, each classifier votes; assign the class with the most votes
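- **Pros**: each pairwise classifier sees only two classes, so its sub-problem is usually better balanced than in OVA (exactly balanced only when the two classes have similar sizes); often better separation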
- **Cons**: quadratic number of models; voting ties possible; harder to interpret the ensemble
- **Implementation**: more complex orchestration; needs a voting/aggregation layer
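
A minimal sketch of what the OVO orchestration and voting layer could look like, assuming each pairwise model is a callable that returns 0 for the first class of its pair and 1 for the second. The function names (`class_pairs`, `pairwise_subset`, `predict_ovo`) are hypothetical, and the alphabetical tie-break is only a placeholder for whatever rule the project settles on.

```python
from collections import Counter
from itertools import combinations


def class_pairs(classes):
    """Enumerate the K(K-1)/2 unordered class pairs in a stable order."""
    return list(combinations(sorted(classes), 2))


def pairwise_subset(X, y, a, b):
    """Keep only samples of classes a and b; relabel a -> 0 and b -> 1."""
    X_sub, y_sub = [], []
    for x, label in zip(X, y):
        if label == a or label == b:
            X_sub.append(x)
            y_sub.append(0 if label == a else 1)
    return X_sub, y_sub


def predict_ovo(pair_models, x):
    """Majority vote over pairwise models.

    `pair_models` maps (a, b) -> callable returning 0 (vote for a) or
    1 (vote for b); it is assumed to be non-empty.
    Returns (predicted_class, tie_flag). Ties are broken alphabetically
    here, which is an arbitrary placeholder rule.
    """
    votes = Counter()
    for (a, b), model in pair_models.items():
        votes[b if model(x) == 1 else a] += 1
    top = max(votes.values())
    winners = sorted(cls for cls, n in votes.items() if n == top)
    return winners[0], len(winners) > 1
```

Returning the tie flag alongside the prediction lets the caller decide how to resolve or report ambiguous samples instead of hiding them.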

### Comparison

| Approach | # Models | Balance | Interpretability | Complexity |
|----------|----------|---------|-----------------|------------|
| OVA | K | Imbalanced | High (each model is standalone) | Low |
| OVO | K(K-1)/2 | Balanced | Medium (ensemble of pairwise models) | Medium |
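
To make the "# Models" column concrete, the small helper below evaluates both formulas for a given K. The function name `n_models` is hypothetical, and the strategy strings simply reuse the `ova` / `ovo` values proposed for `param.yaml`.

```python
def n_models(k: int, strategy: str) -> int:
    """Number of binary models needed for k classes under each strategy."""
    if k < 2:
        raise ValueError("need at least 2 classes")
    if strategy == "ova":
        return k
    if strategy == "ovo":
        return k * (k - 1) // 2
    raise ValueError(f"unknown strategy: {strategy!r}")


for k in (3, 5, 10):
    print(k, n_models(k, "ova"), n_models(k, "ovo"))
```

For K = 3 both strategies need 3 models; the gap opens quickly afterwards (5 vs 10 at K = 5, 10 vs 45 at K = 10).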

## Design considerations

- **Jury integration**: The existing voting/jury system could potentially be reused for combining OVO classifiers
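- **Feature importance**: Open question: how should feature importance be aggregated across the binary models (per class, per pair, or as a single global ranking)?
- **Cross-validation**: CV should maintain class proportions across all classes (stratified k-fold; the number of folds is independent of the number of classes K)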
- **param.yaml**: Need a new parameter for multiclass strategy (`multiclass: ova` or `multiclass: ovo`)
- **Output format**: Results should show per-class metrics (sensitivity, specificity, etc.) and a confusion matrix
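
Two of the points above, stratified folds and per-class metrics, can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than a proposal for the engine's API: all function names are hypothetical, fold assignment is a deterministic round-robin with no shuffling, and class labels are assumed to be sortable.

```python
from collections import defaultdict


def stratified_folds(y, n_folds):
    """Spread each class round-robin over the folds.

    Per-class counts differ by at most one between folds; the running
    offset keeps overall fold sizes even as well.
    """
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    folds = [[] for _ in range(n_folds)]
    offset = 0
    for label in sorted(by_class):
        for j, i in enumerate(by_class[label]):
            folds[(offset + j) % n_folds].append(i)
        offset += len(by_class[label])
    return folds


def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_metrics(matrix, classes):
    """One-vs-rest sensitivity and specificity for every class."""
    total = sum(sum(row) for row in matrix)
    metrics = {}
    for i, c in enumerate(classes):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp
        fp = sum(row[i] for row in matrix) - tp
        tn = total - tp - fn - fp
        metrics[c] = {
            "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
            "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        }
    return metrics
```

The per-class figures are computed one-vs-rest from a single K×K confusion matrix, so the same reporting code would work regardless of whether predictions came from OVA or OVO.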

## Related work

- [predomicsmc](https://github.com/UMMISCO/predomicsmc): an existing multiclass extension of Predomics (R implementation)
- scikit-learn's OneVsRestClassifier / OneVsOneClassifier as design reference

## Suggested implementation path

1. Start with OVA — simpler, each sub-model is a standard gpredomics run
2. Add OVO as an option later
3. Consider whether multiclass should be a core engine feature or an orchestration layer (wrapper script / predomicsapp-web feature)
