IMDb Sentiment Analysis — Contextual Word Classification - Shanghai Jiao Tong University/Mines Paris

Authors: Wael Ben Slima & Marko Babic
Notebook: Wael_Marko_IMDB_Sentiment_Analysis.ipynb
Dataset: IMDb Dataset of 50K Movie Reviews

Project Overview

This notebook implements a contextual word sentiment classification model using the IMDb movie review dataset.
The primary goal is to classify individual words as positive, negative, or neutral, using the review-level sentiment labels (the only labels the dataset provides) together with the context of the surrounding words.

For example:

“beautiful” → Positive
“defeat” → Negative

Dataset Description

The IMDb dataset contains 50,000 movie reviews, split into:

25,000 for training
25,000 for testing

Each review is labeled as either positive or negative.

Workflow

1. Data Loading & Preprocessing

Load data using Pandas.
Clean the text: remove HTML tags and punctuation, convert to lowercase, strip numbers, and remove stopwords.
Tokenize and pad sequences for model input.
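
The cleaning, tokenizing, and padding steps could look roughly like the sketch below. It uses only the Python standard library so that it stays self-contained; the notebook itself may rely on Keras or NLTK utilities instead. The stopword list, the function names (`clean_review`, `build_vocab`, `encode_and_pad`), and the defaults `max_words=20000` and `max_len=200` are illustrative assumptions, not values taken from the notebook.

```python
import html
import re
from collections import Counter

# Tiny illustrative stopword list (assumption); the notebook's actual list is not given here.
STOPWORDS = {"a", "an", "the", "is", "was", "it", "this", "and", "of", "to", "in"}


def clean_review(text):
    """Strip HTML tags, lowercase, drop punctuation and digits, remove stopwords."""
    text = re.sub(r"<[^>]+>", " ", text)    # remove tags such as <br />
    text = html.unescape(text).lower()      # decode entities like &amp;, then lowercase
    text = re.sub(r"[^a-z\s]", " ", text)   # keep ASCII letters and whitespace only
    return [tok for tok in text.split() if tok not in STOPWORDS]


def build_vocab(token_lists, max_words=20000):
    """Map the most frequent words to integer ids; 0 is padding, 1 is out-of-vocabulary."""
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}


def encode_and_pad(tokens, vocab, max_len=200):
    """Convert tokens to ids, truncate to max_len, and right-pad with zeros."""
    ids = [vocab.get(tok, 1) for tok in tokens][:max_len]
    return ids + [0] * (max_len - len(ids))
```

Reserving id 0 for padding and id 1 for unknown words keeps real vocabulary entries distinct from filler positions, which matters if the embedding layer is later asked to ignore padding.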

2. Model Building

Utilize TensorFlow / Keras.
Architecture includes:
- Embedding layer
- (Bi)LSTM layer to capture contextual dependencies
- Dense output layers for classification
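
A minimal Keras sketch of such an architecture is shown here. The layer sizes, the single sigmoid output unit (which predicts review-level positive/negative sentiment), the optimizer, and the function name `build_model` are all assumptions made for illustration; the notebook's actual hyperparameters and output layer may differ.

```python
import tensorflow as tf


def build_model(vocab_size=20000, embedding_dim=64, lstm_units=64):
    """Embedding -> bidirectional LSTM -> dense layers, for binary sentiment."""
    model = tf.keras.Sequential([
        # Maps each integer word id to a dense vector.
        tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embedding_dim),
        # Reads the sequence in both directions to capture context around each word.
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(lstm_units)),
        tf.keras.layers.Dense(32, activation="relu"),
        # One unit with sigmoid: estimated probability that the review is positive.
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```

Calling `build_model()` returns a compiled model that accepts batches of padded integer sequences.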

3. Training

Train on the review-level sentiment labels (positive or negative).
Use callbacks (e.g. ModelCheckpoint) to save the best model.
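
As a rough illustration of the checkpointing step, the helper below wires a `ModelCheckpoint` callback into `model.fit`. The helper name `train_with_checkpoint`, the file name `best_model.keras`, the monitored quantity `val_loss`, the 20% validation split, and the epoch and batch-size defaults are assumptions, not settings confirmed by the notebook.

```python
import tensorflow as tf


def train_with_checkpoint(model, x_train, y_train,
                          checkpoint_path="best_model.keras", epochs=3):
    """Fit the model and save it whenever the validation loss improves."""
    checkpoint = tf.keras.callbacks.ModelCheckpoint(
        filepath=checkpoint_path,
        monitor="val_loss",      # quantity used to decide what counts as "best"
        save_best_only=True,     # overwrite the file only on improvement
    )
    return model.fit(
        x_train, y_train,
        validation_split=0.2,    # hold out the last 20% of the data for validation
        epochs=epochs,
        batch_size=32,
        callbacks=[checkpoint],
        verbose=0,
    )
```

The returned `History` object holds per-epoch metrics in `history.history`, which is what the accuracy and loss curves are plotted from.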

4. Evaluation

Plot accuracy and loss curves.
Compute confusion matrix and classification metrics.
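
To make the metrics concrete, here is a small dependency-free sketch of a binary confusion matrix and the usual scores derived from it. The function names and the label encoding (0 = negative, 1 = positive) are assumptions; the notebook may well use scikit-learn's `confusion_matrix` and `classification_report` instead.

```python
def confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary labels (0 = negative, 1 = positive)."""
    matrix = [[0, 0], [0, 0]]
    for truth, pred in zip(y_true, y_pred):
        matrix[truth][pred] += 1   # row = true label, column = predicted label
    return matrix


def classification_metrics(matrix):
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    (tn, fp), (fn, tp) = matrix
    total = tn + fp + fn + tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Predicted probabilities from a sigmoid output would first be thresholded (commonly at 0.5) to obtain the 0/1 predictions passed in as `y_pred`.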
