Context
gpredomics currently supports binary classification only (2 classes: 0 vs 1). Many clinical and biological problems involve multiple classes (e.g., disease subtypes, treatment response categories, multiple conditions).
Proposed approaches
One-vs-All (OVA / OVR)
- Train K binary classifiers, each separating one class from all others
- At prediction time, assign the class with the highest score/confidence
- Pros: simple, only K models needed, each model is a standard gpredomics binary model (fully interpretable)
- Cons: class imbalance (one class vs all others); scores from independently trained models are not calibrated against each other; some samples may be predicted positive by several classifiers, or by none
- Implementation: can be orchestrated externally (run gpredomics K times with relabeled y), or integrated into the engine for convenience
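The external-orchestration variant can be sketched as below. This is a minimal illustration, not gpredomics code: `fit_binary` is a hypothetical stand-in for one binary gpredomics run that returns a scoring function, and `toy_fit_binary` is a placeholder trainer (negative squared distance to the positive-class centroid) included only so the sketch runs on its own.

```python
from typing import Callable, Dict, Hashable, List, Sequence

# Assumption: a trained binary model is represented as a function sample -> score.
Scorer = Callable[[Sequence[float]], float]


def ova_relabel(y: Sequence[Hashable], positive: Hashable) -> List[int]:
    """Relabel a multiclass target: 1 for `positive`, 0 for every other class."""
    return [1 if label == positive else 0 for label in y]


def ova_fit(X, y, fit_binary: Callable[[list, List[int]], Scorer]) -> Dict[Hashable, Scorer]:
    """Train one binary model per class (K models in total).

    `fit_binary` is hypothetical: it stands in for one binary training run.
    """
    classes = list(dict.fromkeys(y))  # unique classes, first-seen order
    return {c: fit_binary(X, ova_relabel(y, c)) for c in classes}


def ova_predict(models: Dict[Hashable, Scorer], x) -> Hashable:
    """Assign the class whose model scores highest (first class wins on exact ties)."""
    return max(models, key=lambda c: models[c](x))


def toy_fit_binary(X, y01) -> Scorer:
    """Placeholder trainer: score is the negative squared distance to the positive centroid."""
    pos = [row for row, label in zip(X, y01) if label == 1]
    centroid = [sum(col) / len(pos) for col in zip(*pos)]
    return lambda x: -sum((a - b) ** 2 for a, b in zip(x, centroid))


X = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9], [0.0, 5.0], [0.2, 5.1]]
y = ["A", "A", "B", "B", "C", "C"]
models = ova_fit(X, y, toy_fit_binary)
print(ova_predict(models, [4.8, 5.1]))  # B
```

Because the argmax compares raw scores across independently trained models, the calibration concern listed under Cons applies directly to `ova_predict`.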
One-vs-One (OVO)
- Train K×(K-1)/2 binary classifiers, one for each pair of classes
- At prediction time, each classifier votes; assign the class with the most votes
- Pros: each pairwise classifier trains only on the samples of its two classes, so sub-problems are typically better balanced than in OVA; often better separation
- Cons: quadratic number of models; voting ties possible; harder to interpret the ensemble
- Implementation: more complex orchestration; needs a voting/aggregation layer
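The pairwise training and voting layer might look roughly like this. It is a sketch under assumptions: `fit_binary` is a hypothetical stand-in for one binary gpredomics run that returns a 0/1 predictor, `toy_fit_binary` is a placeholder nearest-centroid trainer, and ties are reported rather than resolved.

```python
from collections import Counter
from itertools import combinations


def ovo_subset(X, y, a, b):
    """Keep only samples of classes a and b; relabel a -> 0 and b -> 1."""
    rows = [(x, 1 if label == b else 0) for x, label in zip(X, y) if label in (a, b)]
    return [x for x, _ in rows], [t for _, t in rows]


def ovo_fit(X, y, fit_binary):
    """Train one model per unordered class pair: K*(K-1)/2 models in total.

    `fit_binary` is hypothetical: it stands in for one binary training run.
    """
    classes = list(dict.fromkeys(y))  # unique classes, first-seen order
    return {(a, b): fit_binary(*ovo_subset(X, y, a, b)) for a, b in combinations(classes, 2)}


def ovo_vote(models, x):
    """Each pairwise model casts one vote for one of its two classes."""
    votes = Counter()
    for (a, b), predict01 in models.items():
        votes[b if predict01(x) == 1 else a] += 1
    return votes


def ovo_predict(models, x):
    """Return every class with the maximum vote count; more than one entry means a tie."""
    votes = ovo_vote(models, x)
    top = max(votes.values())
    return [c for c, v in votes.items() if v == top]


def toy_fit_binary(X, y01):
    """Placeholder trainer: predict the label of the nearer class centroid."""
    def centroid(label):
        rows = [x for x, t in zip(X, y01) if t == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    def dist(x, c):
        return sum((p - q) ** 2 for p, q in zip(x, c))

    c0, c1 = centroid(0), centroid(1)
    return lambda x: 1 if dist(x, c1) < dist(x, c0) else 0


X = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9], [0.0, 5.0], [0.2, 5.1]]
y = ["A", "A", "B", "B", "C", "C"]
models = ovo_fit(X, y, toy_fit_binary)
print(len(models))                       # 3 pairwise models for K = 3
print(ovo_predict(models, [4.8, 5.1]))   # ['B']
```

Returning the list of tied classes leaves the tie-breaking policy (for example, falling back to summed scores) to the caller, which is one of the open points listed under Cons.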
Comparison
| Approach | # Models | Balance | Interpretability | Complexity |
|---|---|---|---|---|
| OVA | K | Imbalanced | High (each model is standalone) | Low |
| OVO | K(K-1)/2 | Balanced | Medium (ensemble of pairwise models) | Medium |
Design considerations
- Jury integration: The existing voting/jury system could potentially be reused for combining OVO classifiers
- Feature importance: How to aggregate feature importance across multiple binary models
- Cross-validation: CV should maintain class proportions across all classes (stratified K-fold)
- param.yaml: Need a new parameter for multiclass strategy (multiclass: ova or multiclass: ovo)
- Output format: Results should show per-class metrics (sensitivity, specificity, etc.) and a confusion matrix
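For the output-format point, per-class sensitivity and specificity can be derived from a multiclass confusion matrix by treating each class in turn as "positive" and all others as "negative". The sketch below illustrates that computation only; the function names and the dictionary layout are assumptions, not an existing gpredomics output format.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_metrics(matrix, classes):
    """One-vs-rest sensitivity and specificity for each class."""
    total = sum(sum(row) for row in matrix)
    metrics = {}
    for i, c in enumerate(classes):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                 # true class c, predicted otherwise
        fp = sum(row[i] for row in matrix) - tp  # predicted c, true class otherwise
        tn = total - tp - fn - fp
        metrics[c] = {
            "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
            "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        }
    return metrics


classes = ["A", "B", "C"]
y_true = ["A", "A", "B", "B", "C", "C"]
y_pred = ["A", "B", "B", "B", "C", "A"]
cm = confusion_matrix(y_true, y_pred, classes)
print(cm)  # [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
print(per_class_metrics(cm, classes)["A"])  # {'sensitivity': 0.5, 'specificity': 0.75}
```

The same matrix works for both OVA and OVO predictions, since it depends only on the final assigned class.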
Related work
- predomicsmc — existing multiclass extension via Predomics (R implementation)
- scikit-learn's OneVsRestClassifier / OneVsOneClassifier as design reference
Suggested implementation path
- Start with OVA — simpler, each sub-model is a standard gpredomics run
- Add OVO as an option later
- Consider whether multiclass should be a core engine feature or an orchestration layer (wrapper script / predomicsapp-web feature)