ReSample

Python Streamlit Pandas Plotly

Overview

ReSample is a web application that automatically balances datasets for machine learning tasks. Correcting class imbalance can improve model generalization and performance. The user-friendly interface lets you handle missing values, balance class distributions, visualize the results, and export the processed dataset.

ReSample also includes a recommendation model that suggests the ten best-suited combinations of balancing methods, based on the size and imbalance ratio of the uploaded dataset.
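
As a rough illustration of the two inputs the recommendation model relies on, the sketch below computes a dataset's size and imbalance ratio. The function name `describe_imbalance` is hypothetical, and the ratio definition used here (majority class count divided by minority class count) is an assumption; ReSample may define it differently.

```python
from collections import Counter


def describe_imbalance(labels):
    """Return (n_samples, imbalance_ratio) for a sequence of class labels.

    Assumption: imbalance ratio = majority class count / minority class count.
    """
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes to measure imbalance")
    majority = max(counts.values())
    minority = min(counts.values())
    return len(labels), majority / minority


# Example: 90 negative and 10 positive samples.
labels = ["neg"] * 90 + ["pos"] * 10
n_samples, ratio = describe_imbalance(labels)
print(n_samples, ratio)  # 100 9.0
```

A ratio of 1.0 means the classes are perfectly balanced; larger values indicate a stronger imbalance.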


Live Demo

Try the app, deployed with Streamlit, here:

Streamlit App


Features

  1. Upload or use a sample dataset.
  2. Handle missing values with various strategies (drop, or fill with median/mode/mean).
  3. Get recommendations for balancing methods based on dataset size and imbalance ratio.
  4. Balance class distribution:
    • Oversampling:
      • Random Oversampling
      • SMOTE
      • Borderline SMOTE
      • B-SMOTE SVM
      • ADASYN
    • Undersampling:
      • Random Undersampling
      • NearMiss (version 1, 2 and 3)
      • TomekLinks
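      • CNN (Condensed Nearest Neighbour)
      • ENN (Edited Nearest Neighbours)
      • OSS (One-Sided Selection)
      • NCR (Neighbourhood Cleaning Rule)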
  5. Visualize data before and after balancing (pie and bar charts).
  6. Export the processed dataset.
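
To make steps 2 and 4 more concrete, here is a minimal pandas sketch of the missing-value strategies and of random oversampling, the simplest balancing method in the list. It is not taken from the ReSample code base: the helper names `handle_missing` and `random_oversample`, the column names, and the choice to apply median/mean only to numeric columns are all assumptions made for illustration.

```python
import pandas as pd


def handle_missing(df, strategy="median"):
    """Hypothetical helper: drop rows with missing values or fill them."""
    if strategy == "drop":
        return df.dropna()
    filled = df.copy()
    numeric = filled.select_dtypes(include="number").columns
    if strategy == "median":
        filled[numeric] = filled[numeric].fillna(filled[numeric].median())
    elif strategy == "mean":
        filled[numeric] = filled[numeric].fillna(filled[numeric].mean())
    elif strategy == "mode":
        # mode() may return several rows on ties; use the first one.
        filled = filled.fillna(filled.mode().iloc[0])
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return filled


def random_oversample(df, target, random_state=0):
    """Hypothetical helper: duplicate random rows of each smaller class
    until every class has as many rows as the largest one."""
    counts = df[target].value_counts()
    n_max = counts.max()
    parts = []
    for label, n in counts.items():
        group = df[df[target] == label]
        parts.append(group)
        if n < n_max:
            extra = group.sample(n=n_max - n, replace=True,
                                 random_state=random_state)
            parts.append(extra)
    return pd.concat(parts, ignore_index=True)


# Toy dataset: one missing feature value, classes split 3 to 1.
df = pd.DataFrame({"x": [1.0, None, 3.0, 5.0],
                   "y": ["a", "a", "a", "b"]})
clean = handle_missing(df, strategy="median")
balanced = random_oversample(clean, target="y")
print(balanced["y"].value_counts().to_dict())
```

The other methods in the list (SMOTE, NearMiss, TomekLinks, and so on) generate or remove samples based on nearest neighbours rather than plain duplication, so they are usually taken from a dedicated library instead of being written by hand.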