This repository documents my personal journey into practical deep learning.
It contains hands-on notebooks, experiments, and notes as I learn and apply deep learning concepts through practice.
- Jupyter Notebooks: Experiments, exercises, and model training logs.
- Implementations: Practical code inspired by real-world use cases.
- Learning Notes: Observations and "gotchas" encountered while building intuition around DL frameworks.
The goal of this repository is learning by doing:
- Understanding concepts through experimentation.
- Making mistakes and improving over time.
- Building intuition around deep learning workflows and data pipelines.
This repository uses the following primary datasets for machine learning and deep learning tasks:
- Source: Kaggle - Microsoft Cats vs Dogs Dataset
- Categories: Cat and Dog.
- Important Note: This dataset contains corrupted images (notably `666.jpg` and `11702.jpg`) and non-image system files like `Thumbs.db`. A validation step is mandatory during data loading to prevent training crashes.
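
The validation step mentioned above could look something like the following. This is only a minimal, standard-library sketch: the helper names `is_probably_jpeg` and `list_valid_images` are hypothetical, and it assumes every usable image is a real JPEG, so a file with a `.jpg` name but a different encoding would also be skipped. It checks the file signature only and does not decode the image.

```python
from pathlib import Path

# Every JPEG file starts with these three bytes (start-of-image marker).
JPEG_MAGIC = b"\xff\xd8\xff"


def is_probably_jpeg(path: Path) -> bool:
    """Return True if the file is readable and starts with the JPEG signature.

    Empty files and non-image files such as Thumbs.db fail this check.
    """
    try:
        with path.open("rb") as fh:
            return fh.read(3) == JPEG_MAGIC
    except OSError:
        return False


def list_valid_images(root: str) -> list[Path]:
    """Collect files under `root` (recursively) that pass the signature check."""
    valid = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and is_probably_jpeg(path):
            valid.append(path)
    return valid
```

A signature check is cheap but shallow. Fully decoding each file with an image library (for example Pillow's `Image.open(path).verify()`) would catch truncated images that this sketch lets through.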
- Source: Kaggle - UTKFace (New)
- Categories: Human faces with diverse demographics (over 20,000 images).
- Labels: Age, Gender, and Ethnicity.
- Filename Format: `[age]_[gender]_[race]_[date&time].jpg`
| Label | Description | Mapping |
|---|---|---|
| Age | Integer | 0 to 116 |
| Gender | Binary | 0 (Male), 1 (Female) |
| Race | Categorical | 0 (White), 1 (Black), 2 (Asian), 3 (Indian), 4 (Others) |
Tip for Implementation: Since the UTKFace dataset encodes labels in the filename, you will need a custom parser (e.g., using Python's `os.listdir` and `str.split('_')`) to extract the target variables.
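
As a rough illustration of such a parser, here is a minimal sketch. The function names `parse_utkface_filename` and `load_labels` are hypothetical, and skipping malformed filenames (returning `None`) is an assumption about how to handle files that do not follow the format, not something the dataset prescribes.

```python
import os
from typing import Optional

# Label mappings taken from the table above.
GENDERS = {0: "Male", 1: "Female"}
RACES = {0: "White", 1: "Black", 2: "Asian", 3: "Indian", 4: "Others"}


def parse_utkface_filename(filename: str) -> Optional[dict]:
    """Parse '[age]_[gender]_[race]_[date&time].jpg'; return None if malformed."""
    parts = filename.split("_")
    if len(parts) < 4:
        return None  # at least one label field is missing
    try:
        age, gender, race = int(parts[0]), int(parts[1]), int(parts[2])
    except ValueError:
        return None  # a label field is not an integer
    if gender not in GENDERS or race not in RACES:
        return None  # label value outside the documented mapping
    return {"age": age, "gender": gender, "race": race}


def load_labels(folder: str) -> dict:
    """Map each well-formed filename in `folder` to its parsed labels."""
    labels = {}
    for name in os.listdir(folder):
        parsed = parse_utkface_filename(name)
        if parsed is not None:
            labels[name] = parsed
    return labels
```

Returning `None` instead of raising keeps a single badly named file from stopping the whole load. Counting the skipped files would be a reasonable addition.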
- Source: Kaggle - Coin Image Dataset
- Categories: US Coins (Pennies, Dimes, Nickels, Quarters).
- Size: 750 pictures.
- Task: Object detection using YOLO.
- Annotation: Whatever the image content, the images must be organized and separated before they are annotated with Label Studio.
- Source: ManyThings.org (Tatoeba project)
- Files: `fra-eng.zip` (extracts to `fra.txt`)
- Format: Tab-separated text file structured as: `[English sentence] \t [French sentence] \t [Attribution]`
- Task: Used for building and training a Transformer model from scratch.
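
Reading the tab-separated format described above might look like this. It is a minimal sketch under a few assumptions: the file is UTF-8 encoded, the attribution column is not needed for training, and `parse_pair` and `load_pairs` are hypothetical helper names.

```python
from typing import List, Optional, Tuple


def parse_pair(line: str) -> Optional[Tuple[str, str]]:
    """Split one line into (english, french), dropping the attribution column."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 2:
        return None  # blank or malformed line
    return fields[0], fields[1]


def load_pairs(path: str, limit: Optional[int] = None) -> List[Tuple[str, str]]:
    """Read up to `limit` sentence pairs from a file such as fra.txt."""
    pairs = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            pair = parse_pair(line)
            if pair is not None:
                pairs.append(pair)
            if limit is not None and len(pairs) >= limit:
                break
    return pairs
```

The `limit` argument is there so a small subset can be loaded for quick experiments before training on the full file.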
- Understanding LSTMs: A visual and intuitive explanation of Long Short-Term Memory networks and how they solve the vanishing gradient problem.
- Backpropagation in RNNs: A deep dive into how gradients flow through time.
- RNN Architectures & Use Cases: A guide to choosing the right structure for sequential data.
- Convolutions and Backpropagation: Visualizing the math behind the spatial feature extraction.
- Padding and Strides in CNN: Understanding how these parameters affect the output feature map size.
- Practical Guide to Data Augmentation: Techniques to improve model generalization (crucial for datasets like UTKFace).
- Types of Optimizers: Comparison between SGD, Adam, RMSprop, and more.
- Weight Initialization Techniques: Why Xavier and He initialization are vital for training deep architectures.
- How to use Kaggle GPU: Maximizing free compute resources for training your models.
This repo reflects my progress step by step as I grow my skills in deep learning.