This project classifies patients into drug categories based on age, sex, blood pressure, cholesterol level, and sodium-to-potassium ratio. We train two machine learning models, Naive Bayes and Logistic Regression, and compare how well each predicts the correct drug category.
The dataset includes patient data with the following features:
• Age: Patient’s age.
• Sex: Gender of the patient (Male/Female).
• BP: Blood pressure level (High, Normal, Low).
• Cholesterol: Cholesterol level (High, Normal).
• Na_to_K: Sodium-to-potassium ratio in blood.
• Drug: Target column representing the type of drug prescribed to the patient.
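Since Sex, BP, and Cholesterol are categorical, they need to be turned into numbers before either model can use them. The sketch below shows one possible encoding with pandas. The four rows are made up for illustration and are not taken from the project's dataset, the category spellings follow the list above, and the integer mappings (`sex_map`, `bp_order`, `chol_order`) are assumptions rather than the project's actual preprocessing.

```python
import pandas as pd

# Hypothetical rows for illustration only; not real patient data.
rows = [
    {"Age": 23, "Sex": "Female", "BP": "High", "Cholesterol": "High", "Na_to_K": 25.4, "Drug": "DrugA"},
    {"Age": 47, "Sex": "Male", "BP": "Low", "Cholesterol": "High", "Na_to_K": 13.1, "Drug": "DrugB"},
    {"Age": 61, "Sex": "Female", "BP": "Normal", "Cholesterol": "Normal", "Na_to_K": 10.1, "Drug": "DrugA"},
    {"Age": 35, "Sex": "Male", "BP": "High", "Cholesterol": "Normal", "Na_to_K": 18.0, "Drug": "DrugB"},
]
df = pd.DataFrame(rows)

# Assumed mappings: BP and Cholesterol are treated as ordered levels, Sex as binary.
sex_map = {"Female": 0, "Male": 1}
bp_order = {"Low": 0, "Normal": 1, "High": 2}
chol_order = {"Normal": 0, "High": 1}

# Feature matrix with every column numeric.
X = pd.DataFrame({
    "Age": df["Age"],
    "Sex": df["Sex"].map(sex_map),
    "BP": df["BP"].map(bp_order),
    "Cholesterol": df["Cholesterol"].map(chol_order),
    "Na_to_K": df["Na_to_K"],
})

# Target column stays as string labels; scikit-learn classifiers accept them directly.
y = df["Drug"]

print(X)
```

One-hot encoding (for example with `pd.get_dummies`) is an alternative if treating BP as an ordered scale is not appropriate.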
Each patient is assigned to exactly one drug category (DrugA, DrugB, etc.), so this is a multiclass classification problem. We compare the following models:
• Naive Bayes Classifier
• Logistic Regression
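A minimal sketch of training both models side by side with scikit-learn is shown below. It runs on synthetic data from `make_classification` so that it is self-contained. The choice of `GaussianNB` as the Naive Bayes variant, the 75/25 split, and `max_iter=1000` are assumptions, not settings confirmed by this project.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for the encoded patient features (5 columns, 3 classes).
X, y = make_classification(
    n_samples=300,
    n_features=5,
    n_informative=3,
    n_redundant=0,
    n_classes=3,
    random_state=0,
)

# Hold out a stratified test set so every class appears in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Assumed model settings; the project may use different variants or parameters.
models = {
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# Fit each model on the same split and record its test accuracy.
accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracy[name] = model.score(X_test, y_test)
    print(f"{name}: test accuracy = {accuracy[name]:.3f}")
```

Training both models on the same split keeps the comparison fair, since any difference in score then comes from the model rather than from the data each one saw.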
We evaluate the models with the following metrics:
• Confusion Matrix: To show the correct and incorrect classifications for each class.
• ROC Curve and AUC: To compare how well each model separates the drug classes across decision thresholds. Because the target has more than two classes, the curve is computed with a multiclass scheme such as one-vs-rest.
• Precision, Recall, F1-Score: To analyze the classification performance of each model across all drug classes.
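The metrics above can be computed with scikit-learn roughly as follows. The labels and probabilities here are hand-written for illustration and are not results from this project. The one-vs-rest averaging for AUC is an assumption about how the multiclass ROC comparison is done.

```python
import numpy as np
from sklearn.metrics import (
    confusion_matrix,
    precision_recall_fscore_support,
    roc_auc_score,
)

labels = ["DrugA", "DrugB", "DrugC"]

# Hypothetical true labels and predicted class probabilities (rows sum to 1).
y_true = ["DrugA", "DrugA", "DrugB", "DrugB", "DrugC", "DrugC"]
y_score = np.array([
    [0.6, 0.3, 0.1],
    [0.4, 0.5, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.6, 0.1],
    [0.1, 0.2, 0.7],
    [0.5, 0.1, 0.4],
])

# Predicted class = column with the highest probability.
y_pred = [labels[i] for i in y_score.argmax(axis=1)]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)

# Per-class precision, recall, F1, and support (number of true samples).
precision, recall, f1, support = precision_recall_fscore_support(
    y_true, y_pred, labels=labels, zero_division=0
)
for name, p, r, f in zip(labels, precision, recall, f1):
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")

# Multiclass AUC, averaged over one-vs-rest comparisons.
auc = roc_auc_score(y_true, y_score, multi_class="ovr", labels=labels)
print(f"One-vs-rest AUC: {auc:.3f}")
```

With a fitted model, `y_score` would come from `model.predict_proba(X_test)` and `y_pred` from `model.predict(X_test)`.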