Break Through Tech AI Studio: Meta 1B

Finding Dataset Bias using Language Models

This project was developed in collaboration with Meta researchers Megan Ung and Candace Ross as part of the Break Through Tech fellowship.

Our model was trained by detecting bias in Reddit comments across multiple groups (gender, race, religion, and orientation) using NLP techniques and then fine-tuning on the BERT model.

🌐 Live Demo

Access the interactive NLP Bias Detection application here: https://nlp-bias-detection.vercel.app/

👥 Team Members

Name	GitHub Handle	Email
Rytham Dawar	@Rytham1	Rythamdawar30@gmail.com
Leonardo Siu	@baller7215	leonardo.siu.dev@gmail.com
Rianna Lei	@riannalei	rxlei@calpoly.edu
Wen Fan	@Wen-qqi	wqfan05@gmail.com
Jay Chan	@jayc-10	jayc10@uci.edu
Hala Khattab	@halakhattab	hala.khattaab@gmail.com

🎯 Project Highlights

Prepared and standardized the RedditBias dataset, creating robust train/validation/test splits to support reliable model evaluation.
Fine-tuned a base language model on the RedditBias dataset, optimizing performance through iterative training and validation on Wandb.
Conducted final model evaluation, documented findings, and deployed results on full-stack website.

👩🏽‍💻 Setup and Installation

Install dependencies:

cd backend
pip install -r requirements.txt

Test the model:
```
python test_model.py
```

Running the Backend Server

The backend provides a REST API for bias detection inference using FastAPI.

Start the server:
```
cd backend
uvicorn backend:app --reload
```
The --reload flag enables auto-reload on code changes.
Access the API:
- API will be available at: http://127.0.0.1:8000
- Interactive API documentation (Swagger UI): http://127.0.0.1:8000/docs

API Endpoints:

GET / - Health check endpoint

POST /predict - Bias detection endpoint

Request body:

{
  "text": "Your text to analyze here"
}

Response:

{
  "prediction": "biased" | "non_biased",
  "confidence": 0.xxxx,
  "probabilities": {
    "non_biased": 0.xxxx,
    "biased": 0.xxxx
  }
}

Example usage with curl:

curl -X POST "http://127.0.0.1:8000/predict" \
     -H "Content-Type: application/json" \
     -d '{"text": "Your text here"}'

Example usage with Python:

import requests

response = requests.post(
    "http://127.0.0.1:8000/predict",
    json={"text": "Your text to analyze"}
)
print(response.json())

🏗️ Project Overview

As Large Language Models (LLMs) become increasingly embedded in real-world applications, concerns around bias and fairness have grown. The models directly learn from large-scale text data which often includes historical, social, and demographic bias. When unleft, the biases can propagate stereotypes and lead to discriminatory outcomes. This project focuses on detecting and evaluating demographic bias in text before it is used. Our work highlights the challenges of bias in NLP systems and practical steps towards building more responsible and interpretable AI tools.

📊 Data Exploration

RedditBias Dataset: 28,000 Data Points
- Categorized into: Gender, Race, Orientation, Religion
Consolidated data into binary labels (0 = Not Biased, 1 = Biased)
Removed duplicates and handled missing values
Standardized and normalized all text (lowercased everything, removed URLS and special characters), and handled contradictions and whitespace
Marked trigger words and created word count feature

🧠 Model Development

We utilized weights and biases to track our 20+ fine tuning runs.
Originally, our model was overfitting so we applied different techniques which included applying regularization through weight decay and early dropout, adjusting learning rate, batch size, and epochs.
The solution to our issue was modifying the learning rate and learning rate scheduler type to linear rather than a constant one.
Additionally, we ran into a very high validation loss in which we continued hyperparameter tuning with a focus on learning rate and batch size.

📈 Results & Key Findings

Built a baseline Logistic Regression model for a supervised learning task, using data cleaning and filtering that reduced the dataset from ~28,000 to ~11,000 high-quality datapoints.
Achieved strong performance with ~0.40 training loss and ~0.45 validation loss, indicating good generalization.
Improved model effectiveness from 0.74 to 0.83 F1 score (12.16% relative increase), demonstrating feature engineering and establishing a solid foundation for future development.

🚀 Next Steps

Continue hyperparameter tuning
- Further refine our model to reduce validation loss
Expand dataset source
- Sample a wider range of data beyond the RedditBias dataset
- Test the fine-tuned model on other datasets
Experiment with other models for comparison
- RoBERTa, DistilBERT, DeBERTa

🙏 Acknowledgements

Rajshri Jain (Break Through Tech)
Candace Ross (Research Scientist @ Meta)
Megan Ung (Research Engineer @ Meta)

Name		Name	Last commit message	Last commit date
Latest commit History 41 Commits
backend		backend
frontend		frontend
.DS_Store		.DS_Store
.gitignore		.gitignore
README.md		README.md
app.py		app.py
dataset-bias-detection-project.ipynb		dataset-bias-detection-project.ipynb
requirements.txt		requirements.txt
requirements_hf.txt		requirements_hf.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Break Through Tech AI Studio: Meta 1B

Finding Dataset Bias using Language Models

🌐 Live Demo

👥 Team Members

🎯 Project Highlights

👩🏽‍💻 Setup and Installation

Running the Backend Server

🏗️ Project Overview

📊 Data Exploration

🧠 Model Development

📈 Results & Key Findings

🚀 Next Steps

🙏 Acknowledgements

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Break Through Tech AI Studio: Meta 1B

Finding Dataset Bias using Language Models

🌐 Live Demo

👥 Team Members

🎯 Project Highlights

👩🏽‍💻 Setup and Installation

Running the Backend Server

🏗️ Project Overview

📊 Data Exploration

🧠 Model Development

📈 Results & Key Findings

🚀 Next Steps

🙏 Acknowledgements

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages