Skip to content

Akimuddinshaikh/Master-s-Research-Project

Repository files navigation

Analysis of Scalable and Efficient Approaches for Liver Disease Detection A Hybrid Approach Using Texture-Based Features & Deep Learning

πŸ“Œ Author: Akimuddin Aslam Shaikh πŸ“ Institution: National College of Ireland

πŸ“Œ Project Overview Liver disease detection using machine learning and deep learning techniques has gained attention for its potential to improve diagnostic accuracy and patient outcomes. This study introduces a unique hybrid approach by combining texture-based Gray-Level Co-occurrence Matrix (GLCM) features with high-dimensional deep learning features extracted from ResNet50.

The research focuses on leveraging unlabeled medical imaging datasets by integrating unsupervised clustering (KMeans, Agglomerative Clustering) with supervised classification (Random Forest, SVM, XGBoost). The results show a scalable and efficient methodology for detecting liver diseases with high precision.

πŸ” Key Highlights βœ… Extracted GLCM features and ResNet50 deep learning features from medical imaging datasets. βœ… Implemented unsupervised clustering (KMeans & Agglomerative Clustering) to generate pseudo-labels. βœ… Trained classifiers (Random Forest, SVM, XGBoost) using extracted feature sets. βœ… Combined features (GLCM + ResNet50) with cluster labels for improved classification accuracy. βœ… Achieved near-perfect accuracy (99%-100%) in specific combinations but addressed overfitting concerns.

πŸ“Š Machine Learning Techniques & Results Feature Set Clustering Method Classifier Accuracy ResNet50 Features KMeans RF/SVM/XGBoost ~95% ResNet50 Features Agglomerative Clustering RF/SVM/XGBoost 99%-100% GLCM Features KMeans RF/SVM/XGBoost ~98% GLCM Features Agglomerative Clustering RF/SVM/XGBoost 100% (Potential Overfitting) GLCM + ResNet50 Features KMeans SVM ~96% GLCM + ResNet50 Features Agglomerative Clustering SVM 99% πŸ”Ή Key Finding: Combining ResNet50 deep learning features with Agglomerative Clustering labels yielded the highest classification accuracy (99%-100%), significantly outperforming other combinations. πŸ”Ή Overfitting Observed: Results exceeding 99% accuracy in real-world medical datasets indicate a potential overfitting issue, requiring further validation. πŸ”Ή Pseudo-Labeling Approach Improved Performance: Integrating pseudo-labeling from clustering techniques enhanced classification accuracy.

πŸ› οΈ Technologies & Tools Used Python 🐍 Scikit-Learn (Machine Learning Models) TensorFlow/Keras (Deep Learning & ResNet50) OpenCV (GLCM Feature Extraction) SciPy & NumPy (Clustering & Statistical Analysis) Matplotlib & Seaborn (Data Visualization) πŸ“Œ Results & Insights βœ… GLCM + ResNet50 features provide a robust framework for liver disease detection. βœ… Agglomerative Clustering outperformed KMeans in feature representation. βœ… Pseudo-labeling from clustering significantly improved classification accuracy. βœ… The study presents a scalable, efficient, and high-precision methodology for medical imaging analysis.

πŸ“Œ Future Work βœ” Validate the approach on larger and more diverse medical imaging datasets. βœ” Implement Transfer Learning with advanced architectures (e.g., Vision Transformers). βœ” Optimize feature selection to mitigate overfitting risks. βœ” Develop an explainable AI (XAI) model to enhance medical interpretability.

About

A hybrid approach combining texture-based (GLCM) and deep learning (ResNet50) features with unsupervised clustering and supervised classification for detecting liver diseases. Achieved 99%-100% accuracy using SVM, XGBoost, and Random Forest on pseudo-labeled medical imaging datasets

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors