This repository contains the completed homework assignments for the Large Language Models course, taught by Dr. MJ Dousti and Dr. Y Yaghoobzadeh.

The first assignment consists of two main parts:

- **Getting Started with LLMs**
  We investigate different tokenizers, compare base models with instruction-tuned models, and explore chat templates (see the chat-template sketch after this list).
- **Fine-tuning using LoRA**
  We fine-tune a base model on a small emotion-detection classification dataset and compare its performance quantitatively against both the untouched base model and Meta's instruction-tuned model (see the LoRA sketch below).
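
For reference, a minimal sketch of what the chat-template part exercises, using the standard `transformers` tokenizer API; the model name is an illustrative placeholder, not necessarily the checkpoint used in the assignment.

```python
from transformers import AutoTokenizer

# Illustrative model choice; any instruction-tuned checkpoint that ships
# a chat template behaves the same way.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What does a tokenizer do?"},
]

# Render the conversation with the model's own template instead of
# concatenating role strings by hand.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)  # shows the special tokens the template inserts
```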
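
And a sketch of attaching LoRA adapters with the `peft` library; the rank, alpha, and target modules are typical illustrative values rather than the assignment's exact configuration.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Placeholder base model for illustration.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")

# LoRA injects trainable low-rank matrices into the chosen projection
# layers while the original weights stay frozen.
lora = LoraConfig(
    r=16,                                 # adapter rank
    lora_alpha=32,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # adapters are a tiny fraction of the weights
```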

The second assignment consists of three main parts:

- **In-context Learning**
  We compare several in-context learning methods, including Role Prompting, Zero-Shot Chain of Thought (CoT), and Few-Shot CoT, using the `Llama-3.2-3B-Instruct-bnb-4bit` model on the GSM8K benchmark dataset. All experiments are conducted with the Unsloth library (see the few-shot CoT sketch after this list).
- **Human Preference Alignment**
  We investigate and compare methods for aligning models with human preferences: Reinforcement Learning from Human Feedback (RLHF) via Proximal Policy Optimization (PPO), Direct Preference Optimization (DPO), and Odds Ratio Preference Optimization (ORPO) (see the DPO sketch below).
- **Evaluating the Impact of Alignment on ICL**
  We re-evaluate in-context learning (ICL) performance after aligning the model with DPO and ORPO, analyzing how alignment affects the model's ability to follow different prompting strategies.
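
As a reference for the prompting part, a minimal sketch of few-shot CoT prompt construction for GSM8K-style questions; the exemplar and its worked solution are invented for illustration.

```python
# One worked exemplar; real few-shot prompts typically use several.
EXEMPLAR = (
    "Q: Sara has 3 boxes with 4 pens in each box. How many pens does she have?\n"
    "A: Each box holds 4 pens and there are 3 boxes, so 3 * 4 = 12. "
    "The answer is 12.\n\n"
)

def few_shot_cot_prompt(question: str) -> str:
    """Prepend worked examples so the model imitates step-by-step reasoning."""
    return EXEMPLAR + f"Q: {question}\nA: Let's think step by step."

print(few_shot_cot_prompt("A train travels 60 km per hour for 2 hours. How far does it go?"))
```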
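
And a skeleton of a DPO run with the `trl` library; the model and dataset names are placeholders, and the trainer's argument names (e.g. `processing_class` versus the older `tokenizer`) vary across `trl` versions.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Placeholder model; DPO needs a preference dataset with
# prompt/chosen/rejected columns.
model_name = "meta-llama/Llama-3.2-1B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
prefs = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="dpo-out", beta=0.1),  # beta sets the implicit KL penalty
    train_dataset=prefs,
    processing_class=tokenizer,
)
trainer.train()
```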

The third assignment consists of two main parts:

- **LLM as a Judge**
  We use the `Phi-3-3.8B` model as a judge and evaluate it on the `prometheus-eval/Feedback-Bench` dataset. The implementation uses the LangChain library (see the judge-chain sketch after this list).
- **RAG**
  We construct a Retrieval-Augmented Generation (RAG) pipeline with LangChain and compare sparse and semantic retrieval methods (see the retrieval sketch below).
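
A minimal sketch of an LLM-as-judge chain in LangChain's LCEL style; the rubric wording and output format are illustrative rather than Feedback-Bench's actual rubric, and `llm` stands in for whichever chat-model wrapper serves Phi-3.

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

# Illustrative grading rubric; the assignment's prompt may differ.
judge_prompt = ChatPromptTemplate.from_template(
    "You are a strict grader.\n"
    "Instruction: {instruction}\n"
    "Response: {response}\n"
    "Score the response from 1 to 5 and justify your score briefly.\n"
    "Format: Feedback: <text> [RESULT] <score>"
)

def build_judge_chain(llm):
    """`llm` is any LangChain chat-model wrapper (e.g. one serving Phi-3)."""
    # LCEL pipeline: prompt -> model -> plain string
    return judge_prompt | llm | StrOutputParser()
```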
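
And a sketch contrasting the two retrieval styles; the corpus, embedding model, and `k` values are illustrative.

```python
from langchain_community.retrievers import BM25Retriever  # needs the rank_bm25 package
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings

# Toy corpus; a real pipeline loads and chunks documents first.
texts = [
    "LoRA adds low-rank adapters to frozen weights.",
    "BM25 ranks documents by term-frequency statistics.",
    "Dense retrieval embeds queries and documents into one vector space.",
]

# Sparse retrieval: lexical matching, no embedding model involved.
sparse = BM25Retriever.from_texts(texts)
sparse.k = 2

# Semantic retrieval: embed the texts and search by vector similarity.
emb = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
dense = FAISS.from_texts(texts, emb).as_retriever(search_kwargs={"k": 2})

query = "How does lexical search rank results?"
print(sparse.invoke(query))  # BM25 favors exact term overlap
print(dense.invoke(query))   # embeddings can match paraphrases
```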

The fourth assignment consists of three main parts:

- **Quantization**
  We investigate various quantization methods and fine-tune a model using QLoRA (see the QLoRA sketch after this list).
- **Self-Explanations**
  We explore LLM self-explanations through two approaches, Explanation-to-Prediction (E→P) and Prediction-to-Explanation (P→E), implementing both and evaluating their effectiveness on sentiment analysis tasks.
- **Text2SQL using a ReAct Agent**
  We progressively build and evaluate several Text-to-SQL pipelines: a simple prompt-based baseline, then a graph-based routing system that leverages chain-of-thought reasoning and schema awareness, and finally a ReAct agent that interacts with the schema through tools (see the schema-tool sketch below). Each stage demonstrates a different strategy for generating SQL from natural language with LLMs.
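
A minimal QLoRA setup sketch: load the frozen base model in 4-bit NF4 via `bitsandbytes`, then attach LoRA adapters with `peft`; the model name and hyperparameters are illustrative.

```python
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for the matmuls
    bnb_4bit_use_double_quant=True,         # quantize the quantization constants too
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B", quantization_config=bnb  # placeholder model
)
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))
```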
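
And a sketch of the kind of schema-inspection helpers a ReAct agent could expose as tools; the function names are hypothetical, and the queries assume a local SQLite database.

```python
import sqlite3

def list_tables(db_path: str) -> list[str]:
    """Return the names of all tables in the database."""
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
    return [name for (name,) in rows]

def describe_table(db_path: str, table: str) -> str:
    """Return 'column: type' lines for one table, to feed the agent's context."""
    with sqlite3.connect(db_path) as conn:
        cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return "\n".join(f"{col[1]}: {col[2]}" for col in cols)
```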