Production-ready sentiment analysis API powered by a fine-tuned DistilBERT model. Analyze customer reviews at scale with 90.2% accuracy on the test set.
- ✅ HuggingFace Expertise: Model fine-tuning, dataset creation, model hub publishing
- ✅ Transformer Models: Fine-tuning DistilBERT for domain-specific tasks
- ✅ Production ML: Model optimization, API deployment, monitoring
- ✅ MLOps: Training pipelines, model versioning, A/B testing setup
- ✅ API Development: FastAPI with async support, batch processing
- ✅ Documentation: Comprehensive model cards, API docs, deployment guides
graph LR
A[Customer Review] --> B[FastAPI Endpoint]
B --> C[Preprocessing]
C --> D[DistilBERT Model]
D --> E[Post-processing]
E --> F[JSON Response]
G[Batch Reviews] --> H[Async Processing]
H --> D
style D fill:#ffe1e1
style F fill:#e1ffe1
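The post-processing step in the diagram turns raw model outputs into the fields returned by the API. The following is a minimal sketch of that idea, not the project's actual code: the helper name `postprocess`, the label order, and the rounding to four decimals are assumptions made for illustration.

```python
import math

# Assumed label order; the real model config may order its classes differently.
LABELS = ("negative", "neutral", "positive")


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def postprocess(logits, labels=LABELS):
    """Build a response-shaped dict from one row of logits (hypothetical helper)."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    probs = softmax(logits)
    scores = {label: round(p, 4) for label, p in zip(labels, probs)}
    best = max(scores, key=scores.get)  # label with the highest probability
    return {"sentiment": best, "confidence": scores[best], "scores": scores}


if __name__ == "__main__":
    print(postprocess([-1.2, -0.8, 2.5]))
```

The confidence is simply the probability of the winning class, so it always matches one entry in `scores`.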
- 🚀 Fast Inference: < 50ms response time
- 📊 Batch Processing: Handle multiple reviews efficiently
- 🎯 High Accuracy: 90.2% on test set
- 📈 Confidence Scores: Get prediction confidence
- 🔄 Async Support: Non-blocking requests
- 📝 Comprehensive Logging: Track all predictions
- 🐳 Docker Ready: One-command deployment
- ⚡ Optimized: Quantized version available (4x smaller)
- 🌍 Public: Published on HuggingFace Hub
- 📚 Well-Documented: Complete model card
- 🧪 Tested: 90+ unit and integration tests
- 🔧 Flexible: Easy to fine-tune on your data
🎮 Interactive Demo on HuggingFace Spaces
Docker (Recommended)
git clone https://github.com/IberaSoft/sentiment-analysis-api.git
cd sentiment-analysis-api
docker-compose up -d
# Test
curl -X POST "http://localhost:8000/api/v1/predict" \
  -H "Content-Type: application/json" \
  -d '{"text": "This product is amazing!"}'
Local Development
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8000
Visit http://localhost:8000/docs for interactive API documentation.
from transformers import pipeline

# Load the published model from the HuggingFace Hub
classifier = pipeline(
    "sentiment-analysis",
    model="IberaSoft/customer-sentiment-analyzer"
)
result = classifier("This product exceeded my expectations!")
print(result)  # [{'label': 'positive', 'score': 0.9823}]
Base URL: http://localhost:8000/api/v1
Interactive Docs: Visit /docs for Swagger UI
# Single prediction
curl -X POST "http://localhost:8000/api/v1/predict" \
  -H "Content-Type: application/json" \
  -d '{"text": "Great product!"}'
# Response
{
  "sentiment": "positive",
  "confidence": 0.94,
  "scores": {"positive": 0.94, "negative": 0.03, "neutral": 0.03},
  "processing_time_ms": 35
}
- POST /predict - Analyze a single text
- POST /predict/batch - Analyze multiple texts (max 100)
- GET /model/info - Model information and metrics
- GET /health - Health check
Full API documentation: See docs/API.md
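For callers who prefer Python to curl, the endpoints above could be wrapped roughly as follows. This is a sketch using only the standard library. The single-text body `{"text": ...}` comes from the curl example, but the batch body `{"texts": [...]}` is an assumption, as are the helper names, so check docs/API.md before relying on it.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/api/v1"  # base URL given in this README
MAX_BATCH = 100  # documented limit for /predict/batch


def chunk_texts(texts, size=MAX_BATCH):
    """Split a list of texts into chunks no longer than `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [texts[i:i + size] for i in range(0, len(texts), size)]


def build_request(path, payload):
    """Create a JSON POST request object without sending it."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text, timeout=10):
    """Send one text to /predict and return the decoded JSON response."""
    request = build_request("/predict", {"text": text})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def predict_many(texts, timeout=30):
    """Send texts to /predict/batch in chunks of at most MAX_BATCH.

    Assumption: the batch body is {"texts": [...]}. Returns one decoded
    response per chunk, in order.
    """
    results = []
    for chunk in chunk_texts(texts):
        request = build_request("/predict/batch", {"texts": chunk})
        with urllib.request.urlopen(request, timeout=timeout) as response:
            results.append(json.loads(response.read().decode("utf-8")))
    return results
```

With the server running locally, `predict("Great product!")` should return a dict with the `sentiment`, `confidence`, and `scores` keys shown in the sample response.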
| Metric | Score |
|---|---|
| Accuracy | 90.2% |
| F1 Score | 0.89 |
| Precision | 0.90 |
| Recall | 0.89 |
| Inference Time | 35ms (CPU) |
| Actual / Predicted | Pos | Neu | Neg |
|---|---|---|---|
| Pos | 728 | 45 | 27 |
| Neu | 38 | 430 | 32 |
| Neg | 22 | 48 | 630 |
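The confusion matrix above can be turned into accuracy and per-class precision and recall with a few lines of arithmetic. The sketch below uses the matrix exactly as printed; the function names are illustrative. On these counts it gives an accuracy of 0.894, slightly below the 90.2% in the metrics table, so the two figures may come from different evaluation runs.

```python
# Rows are actual classes, columns are predicted classes (Pos, Neu, Neg).
LABELS = ("positive", "neutral", "negative")
MATRIX = [
    [728, 45, 27],
    [38, 430, 32],
    [22, 48, 630],
]


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def precision_recall(matrix, labels=LABELS):
    """Return {label: (precision, recall)} computed per class."""
    stats = {}
    for i, label in enumerate(labels):
        true_pos = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum
        actual = sum(matrix[i])                    # row sum
        stats[label] = (true_pos / predicted, true_pos / actual)
    return stats


if __name__ == "__main__":
    print(f"accuracy: {accuracy(MATRIX):.3f}")
    for label, (prec, rec) in precision_recall(MATRIX).items():
        print(f"{label}: precision={prec:.3f} recall={rec:.3f}")
```

The neutral class has the lowest precision and recall of the three, which is where most of the misclassifications in the matrix fall.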
| Batch Size | Throughput (req/s) | Latency P95 (ms) |
|---|---|---|
| 1 | 28 | 45 |
| 8 | 89 | 120 |
| 32 | 156 | 280 |
Tested on Intel i7-11700K
flowchart TB
subgraph Client
A[Web/Mobile App]
B[Backend Service]
end
subgraph API Layer
C[FastAPI Server]
D[Request Validation]
E[Response Formatting]
end
subgraph ML Layer
F[Preprocessing]
G[DistilBERT Model]
H[Postprocessing]
end
subgraph Storage
I[Model Cache]
J[Logs]
end
A --> C
B --> C
C --> D
D --> F
F --> G
G --> H
H --> E
E --> C
G <--> I
C --> J
style G fill:#ffe1e1
style C fill:#e3f2fd
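The "Request Validation" box in the diagram is handled by Pydantic in this project. To show the kind of checks involved without extra dependencies, here is a stand-in written with standard-library dataclasses. It is not the project's schema: the class names, the `texts` field, and the `MAX_CHARS` cap are hypothetical, and only the batch limit of 100 comes from the endpoint list.

```python
from dataclasses import dataclass

MAX_BATCH = 100   # batch limit stated for /predict/batch
MAX_CHARS = 5000  # hypothetical per-text length cap, not taken from the project


@dataclass(frozen=True)
class PredictRequest:
    """Single-text request; raises ValueError on invalid input."""
    text: str

    def __post_init__(self):
        if not isinstance(self.text, str) or not self.text.strip():
            raise ValueError("text must be a non-empty string")
        if len(self.text) > MAX_CHARS:
            raise ValueError(f"text exceeds {MAX_CHARS} characters")


@dataclass(frozen=True)
class BatchRequest:
    """Batch request; applies the single-text checks to every item."""
    texts: tuple

    def __post_init__(self):
        if not 1 <= len(self.texts) <= MAX_BATCH:
            raise ValueError(f"batch must contain 1 to {MAX_BATCH} texts")
        for text in self.texts:
            PredictRequest(text)  # reuse the single-text validation


if __name__ == "__main__":
    print(PredictRequest("Great product!"))
    try:
        BatchRequest(tuple("x" for _ in range(101)))
    except ValueError as err:
        print("rejected:", err)
```

Rejecting bad input before it reaches the model keeps the ML layer simple, since it can then assume every text is a non-empty string.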
| Component | Technology | Purpose |
|---|---|---|
| ML Framework | HuggingFace Transformers | Model training & inference |
| Base Model | DistilBERT | Pre-trained transformer |
| API Framework | FastAPI | REST API server |
| Web Server | Uvicorn | ASGI server |
| Validation | Pydantic | Request/response validation |
| Testing | Pytest | Unit & integration tests |
| Load Testing | Locust | Performance testing |
| Containerization | Docker | Deployment |
| CI/CD | GitHub Actions | Automated testing & deployment |
| Monitoring | Prometheus | Metrics collection |
sentiment-analysis-api/
├── app/
│   ├── main.py                 # FastAPI application
│   ├── config.py               # Configuration
│   ├── api/
│   │   └── endpoints/
│   │       ├── predict.py      # Prediction endpoints
│   │       ├── batch.py        # Batch processing
│   │       └── health.py       # Health checks
│   ├── core/
│   │   ├── model.py            # Model loading & inference
│   │   ├── preprocessing.py    # Text preprocessing
│   │   └── cache.py            # Response caching
│   ├── schemas/
│   │   ├── request.py          # Request models
│   │   └── response.py         # Response models
│   └── utils/
│       ├── logger.py           # Logging configuration
│       └── metrics.py          # Prometheus metrics
│
├── training/
│   ├── prepare_dataset.py      # Dataset preparation
│   ├── train.py                # Model training
│   ├── evaluate.py             # Model evaluation
│   ├── optimize.py             # Model optimization
│   └── configs/
│       └── training_config.yaml
│
├── tests/
│   ├── unit/
│   │   ├── test_preprocessing.py
│   │   ├── test_model.py
│   │   └── test_api.py
│   ├── integration/
│   │   └── test_end_to_end.py
│   └── load/
│       └── locustfile.py
│
├── scripts/
│   ├── download_data.py
│   ├── upload_to_hf.py
│   └── benchmark.py
│
├── notebooks/
│   ├── 01_data_exploration.ipynb
│   ├── 02_model_training.ipynb
│   └── 03_error_analysis.ipynb
│
├── docs/
│   ├── API.md                  # API reference
│   ├── TRAINING.md             # Model training guide
│   ├── DEPLOYMENT.md           # Deployment options
│   ├── SPACES_GUIDE.md         # HuggingFace Spaces setup
│   ├── HF_TOKEN_GUIDE.md       # Token setup guide
│   └── TROUBLESHOOTING.md      # Common issues & solutions
│
├── .github/
│   └── workflows/
│       ├── test.yml
│       └── deploy.yml
│
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
├── requirements-dev.txt
├── .env.example
├── README.md
└── LICENSE
git clone https://github.com/IberaSoft/sentiment-analysis-api.git
cd sentiment-analysis-api
python -m venv venv
source venv/bin/activate
pip install -r requirements-dev.txt
pre-commit install
cd training
python prepare_dataset.py --output-dir ./data
python train.py --config configs/training_config.yaml
python evaluate.py --model-dir ./models/customer-sentiment-v1
python ../scripts/upload_to_hf.py --model-dir ./models/customer-sentiment-v1
Full training guide: See docs/TRAINING.md
pytest tests/ -v # All tests
pytest tests/ --cov=app --cov-report=html # With coverage
docker-compose up -d
- Fork this repository
- Create Space on HuggingFace
- Connect GitHub repo
- Auto-deploys!
Spaces guide: See docs/SPACES_GUIDE.md
Supports AWS, GCP, Azure, DigitalOcean, and more.
Full deployment guide: See docs/DEPLOYMENT.md
Metrics: Available at /metrics (Prometheus format)
Logging: Structured JSON logs for easy parsing
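Structured JSON logging of the kind mentioned above can be done with the standard `logging` module alone. The sketch below is one way to do it, not the contents of the project's `logger.py`: the `JsonFormatter` class, the logger name, and the convention of passing extra fields under a `fields` key are all assumptions.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record):
        entry = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Extra fields passed via extra={"fields": {...}} (assumed convention).
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry)


def build_logger(name="sentiment_api", stream=None):
    """Return a logger that writes JSON lines to `stream` (stderr by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.info(
        "prediction",
        extra={"fields": {"sentiment": "positive", "processing_time_ms": 35}},
    )
```

Because every line is valid JSON, log shippers and tools such as `jq` can filter on fields like `sentiment` without custom parsing.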
Fine-tune on your own data:
python training/train.py \
  --base-model IberaSoft/customer-sentiment-analyzer \
  --dataset your-username/your-dataset \
  --output-dir ./models/custom-model
See docs/TRAINING.md for details.
- API Reference - Complete API documentation
- Training Guide - Train and fine-tune the model
- Deployment Guide - Deploy to production
- Spaces Guide - HuggingFace Spaces setup
- Token Guide - HuggingFace token setup
- Troubleshooting - Common issues & solutions
- Multi-language support (Spanish, French, German)
- Aspect-based sentiment analysis
- Confidence calibration improvements
- Real-time model updates
- ONNX optimization
- Model distillation (smaller, faster)
- GPU batch processing
- Response streaming
- Multi-model ensemble
- Active learning pipeline
- A/B testing framework
- Explainability (SHAP, LIME)
- Multi-tenancy support
- Custom model training UI
- Advanced analytics dashboard
- SLA monitoring
Contributions welcome! See CONTRIBUTING.md for guidelines.
Ways to contribute:
- 🐛 Report bugs
- 💡 Suggest features
- 📝 Improve documentation
- 🧪 Add tests
- 🎨 Improve UI/UX
This project is licensed under the MIT License - see LICENSE for details.
- HuggingFace for the Transformers library and model hub
- The FastAPI team for an excellent framework
- The DistilBERT authors for the efficient base model
- The community for feedback and contributions
Built with ❤️ by an aspiring AI/ML Engineer
Try the live demo: HuggingFace Spaces