CellDivider — Prediction Models

This folder contains model code and artifacts used by the CellDivider prediction pipeline. The repository focuses on three primary modelling approaches used for phenotype prediction from processed expression features:

ElasticNet (regularized linear model)
Multilayer Perceptron (MLP) neural network
XGBoost (gradient-boosted trees)

Models

ElasticNet

Description: Linear regression with combined L1/L2 regularization (a mix of Lasso and Ridge). Useful as a baseline and for interpretable feature selection.
Key hyperparameters:
- alpha (overall regularization strength)
- l1_ratio (mix between L1 and L2 regularization)
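
As a rough illustration of how these two hyperparameters interact, here is a minimal sketch using scikit-learn's `ElasticNet`. It assumes a regression target and uses synthetic data; the feature matrix, the coefficient values, and the chosen `alpha`/`l1_ratio` are placeholders, not the settings used by the CellDivider pipeline.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

# Synthetic stand-in for processed expression features (assumption):
# 200 samples, 50 features, only the first 5 carry signal.
rng = np.random.default_rng(0)
n_samples, n_features = 200, 50
X = rng.normal(size=(n_samples, n_features))
true_coef = np.zeros(n_features)
true_coef[:5] = [2.0, -1.0, 0.5, 1.5, -2.0]
y = X @ true_coef + 0.1 * rng.normal(size=n_samples)

# alpha scales the whole penalty; l1_ratio splits it between L1 and L2.
# l1_ratio=1.0 would be pure Lasso, l1_ratio=0.0 pure Ridge-style shrinkage.
model = ElasticNet(alpha=0.1, l1_ratio=0.5)
model.fit(X, y)

# Features with non-zero coefficients are the ones the model kept.
selected = np.flatnonzero(model.coef_)
r2 = model.score(X, y)
print(f"kept {selected.size} of {n_features} features, train R^2 = {r2:.3f}")
```

Raising `alpha` shrinks all coefficients harder, while moving `l1_ratio` toward 1 pushes more of them to exactly zero, which is what makes the model usable for feature selection.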

Multilayer Perceptron (MLP)

Description: Feed-forward neural network with one or more hidden layers and non-linear activations.
Key hyperparameters:
- hidden_dim (hidden layer sizes)
- num_layers (number of hidden layers)
- dropout_rate (regularization)
- start_lr (initial learning rate for the Adam optimizer)
- batch size (dataloader batch size)

The main training code for the MLP can be found in mlp/train_mlp.py
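
The sketch below shows one way the hyperparameters listed above could map onto a PyTorch model and training loop. It is not the code in mlp/train_mlp.py: the `build_mlp` helper, the ReLU activation, the MSE loss, the single shared hidden width, and the synthetic data are all assumptions made for illustration.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def build_mlp(input_dim, hidden_dim, num_layers, dropout_rate, output_dim=1):
    """Hypothetical helper: num_layers hidden layers, each hidden_dim wide."""
    layers = []
    in_features = input_dim
    for _ in range(num_layers):
        layers += [
            nn.Linear(in_features, hidden_dim),
            nn.ReLU(),                 # non-linear activation (assumed)
            nn.Dropout(dropout_rate),  # dropout regularization
        ]
        in_features = hidden_dim
    layers.append(nn.Linear(in_features, output_dim))
    return nn.Sequential(*layers)


# Synthetic regression data standing in for expression features (assumption).
torch.manual_seed(0)
X = torch.randn(256, 20)
y = X[:, :3].sum(dim=1, keepdim=True) + 0.1 * torch.randn(256, 1)

start_lr = 1e-3   # initial learning rate for Adam
batch_size = 32   # dataloader batch size

model = build_mlp(input_dim=20, hidden_dim=64, num_layers=2, dropout_rate=0.1)
optimizer = torch.optim.Adam(model.parameters(), lr=start_lr)
loader = DataLoader(TensorDataset(X, y), batch_size=batch_size, shuffle=True)
loss_fn = nn.MSELoss()

model.train()
for epoch in range(5):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()

# Switch to eval mode so dropout is disabled at prediction time.
model.eval()
with torch.no_grad():
    preds = model(X)
print(preds.shape)
```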

XGBoost

Description: Gradient-boosted decision trees.
Key hyperparameters:
- n_estimators (number of trees)
- max_depth (maximum tree depth)
- learning_rate (shrinkage)
- subsample, colsample_bytree (row/column sampling for regularization)
- gamma (minimum loss reduction required to split a node; higher values make the trees more conservative)

Installation

Activate your Python environment (conda or venv is recommended), then install the dependencies:

pip install -r requirements.txt

If the GPU install does not work out of the box, install PyTorch for your GPU setup by following the instructions at https://pytorch.org/get-started/locally/
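
To check whether the installed PyTorch build can see a GPU, a small script along these lines may help. The `describe_torch_install` helper is hypothetical and not part of this repository; it only uses standard PyTorch calls.

```python
import importlib.util


def describe_torch_install() -> str:
    """Hypothetical helper: report whether PyTorch is installed and sees a GPU."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment."
    import torch

    if torch.cuda.is_available():
        name = torch.cuda.get_device_name(0)
        return f"PyTorch {torch.__version__} with CUDA device: {name}"
    return f"PyTorch {torch.__version__} (CPU only; no CUDA device visible)"


status = describe_torch_install()
print(status)
```

If the script reports a CPU-only install on a machine with a GPU, reinstalling PyTorch with the command generated on the page linked above is the usual next step.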
