A comprehensive course on deploying Large Language Models (LLMs) efficiently and cost-effectively.
- Load and fine-tune pre-trained transformer models
- Apply optimization techniques: distillation, pruning, quantization
- Deploy models using FastAPI, Gradio, Docker, and AWS ECS
- Implement production best practices
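To give a flavor of the optimization module, here is a minimal, library-free sketch of the idea behind quantization (symmetric int8). The function names are illustrative, not from any course notebook:

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats to integers in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127  # assumes a non-empty, non-zero weight list
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [x * scale for x in q]

weights = [0.5, -1.0, 0.25]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# restored is close to weights, but each value now fits in a single byte
```

The course covers production-grade versions of this idea (e.g. via dedicated quantization libraries); this sketch only shows the round-trip at the heart of the technique.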
Open any notebook directly in Colab:
https://colab.research.google.com/github/[your-repo]/blob/main/[notebook-path]
Tip: An easy way to convert a Jupyter Notebook link from GitHub to Google Colab is to change
`https://github.com/...` to `https://githubtocolab.com/...` in the URL.
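That substitution is a simple host rewrite; a small sketch (the helper name `to_colab_url` is illustrative):

```python
def to_colab_url(github_url: str) -> str:
    # Swapping the host makes the notebook open in Colab via githubtocolab.com
    return github_url.replace("https://github.com/", "https://githubtocolab.com/", 1)

print(to_colab_url("https://github.com/user/repo/blob/main/notebook.ipynb"))
# → https://githubtocolab.com/user/repo/blob/main/notebook.ipynb
```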
    pip install -r requirements.txt
    jupyter lab

| Module | Topic | Notebooks |
|---|---|---|
| 00 | Course Intro | 1 |
| 01 | Foundations | 2 |
| 02 | Fine-Tuning | 3 |
| 03 | Optimization | 5 |
| 04 | Deployment | 4 |
| 05 | Capstone | 1 |
- Python 3.8+
- Basic understanding of machine learning
- Familiarity with PyTorch (helpful but not required)
- `transformers` - Hugging Face Transformers
- `torch` - PyTorch
- `datasets` - Hugging Face Datasets
- `gradio` - Web UI framework
- `fastapi` - REST API framework
