
XAI-Techniques--Students-Dropout-Dataset

This project applies Explainable AI techniques to a Student Dropout dataset, covering pre-, in-, and post-modeling explanations, as well as an analysis of their quality. The project was developed for the "Advanced Topics on Machine Learning" course, in the 1st semester of the 1st year of the Master's Degree in Artificial Intelligence.


Programming Language: Python

Project Objective

The objective of this project is to analyze and compare multiple XAI approaches, evaluating:

  • The type of insights provided by each technique;
  • The consistency of the explanations;
  • The differences between interpretable (glass-box) and complex (black-box) models.

Project Structure

Task 1 – Pre-Modelling Explanations

Notebooks:

  • task_1_1_all_data_analysis.ipynb
  • task_1_2_data_analysis.ipynb

This task focuses on exploratory data analysis before model training.
The dataset is analyzed to understand:

  • Feature distributions;
  • Relationships between features and the target variable;
  • Potential data issues and relevant patterns.

These insights support informed decisions in later modeling stages.
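
The pre-modeling analysis described above can be sketched as follows. This is a minimal illustration using a tiny synthetic stand-in with hypothetical column names, not the actual `datasets/data.csv`: it shows per-feature distributions and group means against the target, the kind of checks Task 1 performs.

```python
import pandas as pd

# Tiny synthetic stand-in for the dropout dataset (hypothetical columns).
df = pd.DataFrame({
    "admission_grade": [120, 150, 135, 110, 160, 145],
    "age_at_enrollment": [18, 19, 21, 25, 18, 20],
    "target": ["Dropout", "Graduate", "Graduate", "Dropout", "Graduate", "Dropout"],
})

# Per-feature summary statistics (feature distributions)
print(df.describe())

# Mean of each numeric feature grouped by the target variable, to spot
# relationships between features and the target
group_means = df.groupby("target").mean(numeric_only=True)
print(group_means)
```
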

Task 2 – In-Modelling Explanations

Notebook:

  • task_2_in_modelling.ipynb

In this task, an interpretable (glass-box) model is trained.
The analysis focuses on:

  • Feature importance;
  • Model parameters and learned relationships;
  • The interpretability offered directly by the model.
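
As a sketch of this task, the snippet below trains a shallow decision tree (the glass-box model used in `task_2_in_modelling.ipynb`) and reads off its feature importances and decision rules directly. The Iris dataset is a stand-in here; the project uses the dropout dataset instead.

```python
from sklearn.datasets import load_iris  # stand-in for the dropout dataset
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True, as_frame=True)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Feature importances learned by the glass-box model
for name, imp in zip(X.columns, tree.feature_importances_):
    print(f"{name}: {imp:.3f}")

# The learned relationships are directly readable as decision rules
print(export_text(tree, feature_names=list(X.columns)))
```

Because the model itself is interpretable, no post-hoc technique is needed: the printed rules are the explanation.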

Task 3 – Post-Modelling Explanations

Notebooks:

  • task_3_and_4_mlp.ipynb
  • task_3_and_4_xgboost.ipynb

This stage involves training black-box models (MLP and XGBoost) and applying post-hoc XAI techniques, including:

  • Simplification-based methods to approximate model behavior;
  • Feature-based explanation techniques;
  • Example-based explanations for individual predictions.

The explanations obtained from different methods are compared and discussed.
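
One feature-based, model-agnostic post-hoc technique of the kind this stage applies is permutation importance; the sketch below runs it on an MLP trained on synthetic stand-in data (the notebooks use the dropout dataset and may use different techniques, such as SHAP or LIME).

```python
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data; the project trains on the dropout dataset instead
X, y = make_classification(n_samples=300, n_features=5, random_state=0)
mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0).fit(X, y)

# Permutation importance: shuffle each feature and measure the score drop;
# a large drop means the black box relies on that feature
result = permutation_importance(mlp, X, y, n_repeats=10, random_state=0)
for i, imp in enumerate(result.importances_mean):
    print(f"feature_{i}: {imp:.3f}")
```
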

Task 4 – Quality of the Explanations

This task evaluates the quality of the generated explanations using functionally-grounded metrics.
The results are analyzed to assess explanation reliability and to suggest possible interpretability improvements.
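
One functionally-grounded metric that can be computed without human evaluation is surrogate fidelity: train a simple model to mimic the black box's predictions and measure how often the two agree. The sketch below uses a random forest as a stand-in black box (the project uses MLP and XGBoost) and a shallow decision tree as the surrogate; it is an illustration of the metric, not the project's exact evaluation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

# Stand-in black-box model on synthetic data
X, y = make_classification(n_samples=400, n_features=6, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

# Surrogate trained on the black box's predictions, not the true labels
bb_preds = black_box.predict(X)
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, bb_preds)

# Fidelity: fraction of samples where the surrogate agrees with the black box
fidelity = accuracy_score(bb_preds, surrogate.predict(X))
print(f"fidelity = {fidelity:.2f}")
```

A fidelity close to 1.0 suggests the surrogate's explanation faithfully reflects the black box's behavior on this data; a low fidelity means the explanation should not be trusted.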

About the repository:

TAACproject
├── datasets
│   ├── data_all_pca_21_components       # The components generated by PCA on the Original Dataset
│   ├── data_all_preprocessed.csv       # Preprocessed original dataset
│   ├── data_all.csv       # Original Dataset
│   ├── data_preprocessed.csv       # Preprocessed original dataset without "Enrolled"
│   ├── data.csv       # Original Dataset without "Enrolled"
├── pickle_jar
│   ├── mlp_model.pkl       # Saved MLP model
├── ProjectStatment       # The project statement
├── Report_TAAC__DS2_G3.pdf       # The report of the project
├── task_1_1_all_data_analysis.ipynb       # Task 1 with all data ("Enrolled", "Graduated", "Dropout")
├── task_1_2_data_analysis.ipynb       # Task 1 without "Enrolled"
├── task_2_in_modelling.ipynb       # Task 2 with a Decision Tree as the glass-box model
├── task_3_and_4_mlp.ipynb       # Task 3 and 4 with MLP
├── task_3_and_4_xgboost.ipynb       # Task 3 and 4 with XGBoost

This repository contains all the code and analyses developed throughout the project.

Link to the course:

This course is part of the first semester of the first year of the Master's Degree in Artificial Intelligence at FEUP and FCUP in the academic year 2025/2026. You can find more information about this course at the following link:
