This repository contains five hands-on projects, one per week, followed by a capstone in week six. The first five projects build capability with large language models, retrieval, tool use, research workflows, and multimodality; in the capstone you design your own system, tool, or startup idea based on what you have learned. The instructions below are generic and apply to all projects. Each project also includes additional instructions specific to that project.
Each week, a new project is added to the repo at a specific release date and time. The weekly release includes the notebook, data, and environment file.
You can run the projects either on Google Colab (no local setup required) or locally (using Conda environments for reproducibility).
- Upload the notebook for the current week to Colab.
- If needed, add your API tokens with `os.environ[...] = "value"`.
- Ensure that any local file paths are adjusted for Colab.
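For example, a token can be set in a Colab cell without hard-coding it into the notebook (the key name below is only an example; use whichever provider key you need):

```python
import os
from getpass import getpass

# Prompt for the key interactively so it is never saved in the notebook file.
os.environ["OPENAI_API_KEY"] = getpass("Paste your API key: ")
```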
Each project comes with an environment.yml file that specifies its dependencies, which keeps environments consistent across machines (a minimal example appears after the setup steps below).
- Install Miniconda or Anaconda.
- Create and activate the environment from the provided YAML file. The environment name is set inside the YAML; you can change it if desired:

conda env create -f environment.yml
conda activate <ENV_NAME>
- Launch Jupyter and open the notebook for the current week:
jupyter notebook
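For reference, a minimal environment.yml has this shape; the name, channels, and packages below are illustrative, not the actual pins shipped with any week's release:

```yaml
# Illustrative sketch only; use the environment.yml from the weekly release.
name: llm-projects
channels:
  - conda-forge
dependencies:
  - python=3.11
  - pip
  - pip:
      - transformers
      - torch
```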
Recommendation: Use Colab for projects 1 and 5, and local development for projects 2, 3, and 4.
The projects are designed so they do not require specific API keys or tokens by default. However, they are flexible: you can swap in different LLMs, providers, and systems, and depending on what you experiment with, you may need API keys or tokens from certain providers.
Possible API keys you might need:
- `OPENAI_API_KEY` for OpenAI models
- `ANTHROPIC_API_KEY` for Claude models
- `GOOGLE_API_KEY` for Gemini models
- `HUGGINGFACEHUB_API_TOKEN` for Hugging Face hosted models and datasets
- `TAVILY_API_KEY` or `SERPAPI_API_KEY` for web search tools
- `PINECONE_API_KEY`, or alternatives if using remote vector stores
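If you do switch providers, a quick sanity check like the following can catch missing keys before a long run (the key list is a placeholder; trim it to the providers you actually use):

```python
import os

# Placeholder list; keep only the keys your chosen providers require.
required = ["OPENAI_API_KEY", "TAVILY_API_KEY"]
missing = [key for key in required if not os.environ.get(key)]
if missing:
    raise RuntimeError(f"Missing API keys: {', '.join(missing)}")
```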
- Projects are designed flexibly: they guide you step by step and provide the workflow, and you implement the sections marked with "your code here".
- There are multiple ways to implement each section. Feel free to deviate from the provided template and experiment with different algorithms, models, and systems.
- No submission is required. In the live deep-dive sessions, we will review each project in detail and show one possible implementation.
- Post questions in the corresponding Q/A space. You are also welcome to share your thoughts, opinions, and interesting findings in the same space.
An introductory project to explore how prompts, tokenization, and decoding settings work in practice, building the foundation for effective use of large language models.
Learning objectives:
- Tokenization of raw text into discrete tokens
- Basics of GPT-2 and Transformer architectures
- Loading pre-trained LLMs with Hugging Face
- Decoding strategies for text generation
- Completion vs. instruction-tuned models
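As a taste of what the notebook covers, here is a minimal sketch of loading GPT-2 with Hugging Face, inspecting the tokenization, and sampling a completion (illustrative only, not the notebook's exact code):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Large language models are"
inputs = tokenizer(prompt, return_tensors="pt")
print(tokenizer.convert_ids_to_tokens(inputs.input_ids[0]))  # the raw tokens

# Sampling-based decoding; try different temperature/top_p values.
output = model.generate(
    **inputs, max_new_tokens=40, do_sample=True, temperature=0.8, top_p=0.9
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```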
A hands-on project to build a retrieval-based chatbot that answers customer questions for an imaginary e-commerce store.
Learning objectives:
- Ingest and chunk unstructured documents
- Create embeddings and index with FAISS
- Retrieve context and design prompts
- Run an open-weight LLM locally with Ollama
- Build a RAG (Retrieval-Augmented Generation) pipeline
- Package the chatbot in a minimal Streamlit UI
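The retrieval core of such a pipeline can be sketched in a few lines; the embedding model below (sentence-transformers) is an assumption for illustration, and the project may use a different one:

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

docs = [
    "Returns are accepted within 30 days of delivery.",
    "Standard shipping takes 3-5 business days.",
]
embedder = SentenceTransformer("all-MiniLM-L6-v2")
vectors = embedder.encode(docs, normalize_embeddings=True)

# Inner product on normalized vectors equals cosine similarity.
index = faiss.IndexFlatIP(vectors.shape[1])
index.add(np.asarray(vectors, dtype="float32"))

question = "How long does shipping take?"
query = embedder.encode([question], normalize_embeddings=True)
scores, ids = index.search(np.asarray(query, dtype="float32"), k=1)

# The retrieved chunk becomes context for the LLM prompt.
prompt = f"Answer using only this context:\n{docs[ids[0][0]]}\n\nQuestion: {question}"
```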
A project to create a simplified Perplexity-style agent that searches the web, reads content, and provides answers.
Learning objectives:
- Understand why tool calling is useful for LLMs
- Implement a loop to parse model calls and execute Python functions
- Use function schemas (docstrings and type hints) to scale across tools
- Apply LangChain for function calling, reasoning, and multi-step planning
- Combine Llama 3 8B Instruct with a web search tool to build an ask-the-web agent
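The heart of such an agent is a small dispatch loop: the model emits a structured tool call, your code executes the matching Python function, and the result is fed back to the model. A toy version, assuming the model outputs JSON calls (the real project would use the model's native function calling or LangChain):

```python
import json

def web_search(query: str) -> str:
    """Search the web and return a short snippet."""
    return f"[stubbed results for: {query}]"  # swap in a real search tool

TOOLS = {"web_search": web_search}

def execute_tool_call(model_output: str) -> str:
    # Assumes the model emits e.g. {"tool": "web_search", "args": {"query": "..."}}
    call = json.loads(model_output)
    return TOOLS[call["tool"]](**call["args"])

print(execute_tool_call('{"tool": "web_search", "args": {"query": "LLM news"}}'))
```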
A project focused on reasoning workflows, where you design a multi-step agent that plans, gathers evidence, and synthesizes findings.
Learning objectives:
- Apply inference-time scaling methods (zero-shot/few-shot CoT, self-consistency, sequential decoding, tree-of-thoughts)
- Gain intuition for training reasoning models with the STaR approach
- Build a deep-research agent that combines reasoning with live web search
- Extend deep research into a multi-agent system
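For example, self-consistency amounts to "sample several chains of thought, then majority-vote the final answers"; the sketch below uses a fake `generate` stub standing in for whatever LLM you call:

```python
import random
from collections import Counter

def generate(prompt: str, temperature: float = 0.8) -> str:
    """Placeholder for an LLM call; returns a fake chain of thought here."""
    return f"Step 1: reason.\nStep 2: compute.\n{random.choice(['42', '42', '41'])}"

def self_consistent_answer(question: str, n_samples: int = 5) -> str:
    prompt = f"{question}\nLet's think step by step."
    finals = []
    for _ in range(n_samples):
        reasoning = generate(prompt)  # high temperature gives diverse chains
        finals.append(reasoning.strip().splitlines()[-1])  # last line as answer
    return Counter(finals).most_common(1)[0][0]  # majority vote

print(self_consistent_answer("What is 6 * 7?"))
```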
A project to build an agent that combines textual question answering with image and video generation capabilities within a unified system.
Learning objectives:
- Generate images from text using Stable Diffusion XL
- Create short clips with a text-to-video model
- Build a multimodal agent that handles questions and media requests
- Develop a simple Gradio UI to interact with the agent
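Image generation with Stable Diffusion XL reduces to a few lines with the diffusers library; the model id below is the public SDXL base checkpoint, and fp16 on a GPU is assumed:

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Assumes a CUDA GPU; drop torch_dtype and .to("cuda") to run (slowly) on CPU.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

image = pipe("a watercolor fox in a snowy forest").images[0]
image.save("fox.png")
```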
Purpose: Design and build your own system based on what you learned in weeks 1 to 5. This can be a product prototype, an internal tool, a research workflow, or the first step toward a startup idea. The hope is that some projects will continue after the cohort, using the connections and community built here.
The following documentation pages cover the core libraries and services used across projects:
- Conda documentation: Manage isolated Python environments and dependencies with Conda
- Pip documentation: Install and manage Python packages with pip
- duckduckgo-search: Python library to query DuckDuckGo search results programmatically
- gradio: Build quick interactive demos and UIs for machine learning models
- Streamlit documentation: Build and deploy simple web apps for data and ML projects
- huggingface_hub: Access and share models, datasets, and spaces on Hugging Face Hub
- langchain: Framework for building applications powered by LLMs with memory, tools, and chains
- numpy: Core library for numerical computing and array operations in Python
- openai: Official API docs for using OpenAI models like GPT and embeddings
- tiktoken: Fast tokenizer library for OpenAI models, used for counting tokens
- torch: PyTorch deep learning framework for training and running models
- transformers: Hugging Face library for using pre-trained LLMs and fine-tuning them
- llama-index: Data framework for connecting external data sources to LLMs
- chromadb: Open-source vector database for storing and retrieving embeddings in RAG systems