A simple Python tool to manually label text data for sentiment analysis. It loads a CSV file of movie reviews, shows each review in the terminal, and lets you assign a label: positive, neutral, or negative. The script then saves a new CSV with your labels, ready for basic analysis or future machine learning work.
Features
Loads text data from a CSV file (IMDB movie reviews).
Interactive labeling in the terminal with input validation.
Exports a new labeled_data.csv file with a sentiment_label column.
Beginner‑friendly example of data annotation for AI projects.
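The interactive labeling with input validation could look roughly like the sketch below. The source of label_sentiments.py is not reproduced in this README, so the names `VALID_LABELS`, `ask_label`, and `label_texts` are hypothetical, and the `ask` parameter is an assumption added so the prompt can be replaced in tests.

```python
# Hypothetical sketch of a validated labeling loop; not the actual script.
VALID_LABELS = ("positive", "neutral", "negative")


def ask_label(text, ask=input, valid=VALID_LABELS):
    """Show one review and keep prompting until a valid label is entered."""
    print(f"\nReview: {text}")
    while True:
        # Normalize the answer so " Positive " and "POSITIVE" are accepted.
        answer = ask(f"Label ({'/'.join(valid)}): ").strip().lower()
        if answer in valid:
            return answer
        print(f"'{answer}' is not a valid label, please try again.")


def label_texts(texts, ask=input):
    """Collect one validated label per text, in the same order."""
    return [ask_label(text, ask=ask) for text in texts]
```

In a script like this one, the returned list could then be stored in a sentiment_label column before the DataFrame is written out with `to_csv`.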
Requirements
Python 3
pandas
Install dependencies from the project folder with:

    python -m pip install pandas

TensorFlow is not installed here because it does not yet have an official build for the Python version used in this project. The labeled dataset is still designed for later use with TensorFlow or other ML frameworks.
How to Use
Clone or download this repository.
Make sure the CSV file (for example, IMDB Dataset.csv) is in the same folder as label_sentiments.py.
If your CSV or text column has a different name, update these lines in label_sentiments.py:

    df = pd.read_csv('IMDB Dataset.csv')  # CSV file name
    texts = df['review'][:20]             # text column name and number of rows

From the project folder, run:

    python label_sentiments.py

For each review shown, enter one of:
positive
neutral
negative
When you finish, open labeled_data.csv to see your labels in the sentiment_label column.
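To check the result without opening the file by hand, a short pandas snippet can count how many reviews received each label. This is a minimal sketch: the helper name `summarize_labels` is hypothetical, and it assumes the output file has the sentiment_label column described above.

```python
import pandas as pd


def summarize_labels(df, column="sentiment_label"):
    """Return the number of rows per label, including labels never used."""
    counts = df[column].value_counts()
    # Reindex so all three labels appear, with 0 for any that were not used.
    return counts.reindex(["positive", "neutral", "negative"], fill_value=0)


# Example usage once labeled_data.csv exists in the project folder:
#   labeled = pd.read_csv("labeled_data.csv")
#   print(summarize_labels(labeled))
```

A strongly skewed count is worth noticing early, since only the first 20 rows are labeled by default.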
Screenshots
Setup and labeling in the terminal:
Labeled output with the new sentiment_label column:
Future Ideas
Use this labeled dataset to train a simple sentiment classifier with TensorFlow once a compatible build is available.
Add a small GUI or web interface for labeling instead of using the terminal.
Extend labels beyond positive, neutral, and negative (for example, very positive, very negative).
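If the label set is extended as suggested, one option is to map each label to an integer score so the column can later be fed to a classifier. The sketch below is only an illustration: `LABEL_SCALE` and `encode_labels` are hypothetical names, and the five-point scale is an assumption, not something the current script produces.

```python
# Hypothetical five-point scale; the current script only uses three labels.
LABEL_SCALE = {
    "very negative": -2,
    "negative": -1,
    "neutral": 0,
    "positive": 1,
    "very positive": 2,
}


def encode_labels(labels, scale=LABEL_SCALE):
    """Convert text labels to integer scores, rejecting unknown labels."""
    unknown = sorted(set(labels) - set(scale))
    if unknown:
        raise ValueError(f"Unknown labels: {unknown}")
    return [scale[label] for label in labels]
```

Failing loudly on unknown labels keeps typos in the CSV from silently turning into missing training targets.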
Project created as a beginner‑friendly data annotation exercise to showcase Python, pandas, and practical AI/ML workflow skills.