NLP Project - Email Classification & Similarity Clustering

This project is part of the Data Science Lab course. We classified emails as spam or ham using several Naive Bayes models and compared them with LSTM, an RNN-based model. We also fine-tuned a BERT model for the same task and compared the results. The models are as follows:

Naive Bayes Models

  • Gaussian Naive Bayes
  • Multinomial Naive Bayes
  • Binomial Naive Bayes
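
To illustrate how a Naive Bayes spam/ham classifier works, here is a minimal pure-Python sketch of the multinomial variant with add-one (Laplace) smoothing. It is not the project's actual code, which may well rely on a library implementation. The class `MultinomialNaiveBayes`, the `tokenize` helper, and the toy messages are all hypothetical and exist only for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class MultinomialNaiveBayes:
    """Hypothetical sketch: multinomial Naive Bayes over word counts."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def log_scores(self, text):
        """Return the unnormalized log posterior for each class."""
        scores = {}
        vocab_size = len(self.vocab)
        for label, doc_count in self.class_counts.items():
            score = math.log(doc_count / self.total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token]
                # Add-one smoothing keeps unseen (class, word) pairs nonzero.
                score += math.log((count + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy, made-up training data for illustration only (not the project dataset).
train_texts = [
    "win a free prize now",
    "free money claim your prize",
    "meeting agenda for monday",
    "please review the project report",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = MultinomialNaiveBayes().fit(train_texts, train_labels)
print(model.predict("claim your free prize"))
print(model.predict("project meeting on monday"))
```

On this toy data the first message is labelled spam and the second ham. The Gaussian variant would instead model continuous features (for example TF-IDF values) with a per-class normal distribution, and a binary-feature variant would model only word presence or absence.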

LSTM Model

  • Uni-directional
  • Bi-directional

The research paper we used for our work: https://www.sciencedirect.com/science/article/pii/S1877050921007493