Deep Learning Projects

A collection of deep learning implementations covering transformer architectures, multimodal systems, and retrieval-augmented generation.

Projects

Qwen3 Language Model from Scratch

A modern Transformer-based language model built from the ground up, implementing all core components of the Qwen3 architecture (two of which are sketched after the list):

  • Grouped Query Attention mechanism
  • Root Mean Square Layer Normalization
  • Feed Forward networks
  • Key-Value caching for efficient inference
  • Complete Transformer blocks
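
As a flavor of these pieces, here is a minimal PyTorch sketch of two of them: RMSNorm and the K/V head sharing behind Grouped Query Attention. The names `RMSNorm` and `expand_kv` are illustrative, not taken from the project.

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root Mean Square Layer Normalization: no mean-centering, no bias."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale each vector by the reciprocal of its root-mean-square,
        # then apply a learned per-feature gain.
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

def expand_kv(kv: torch.Tensor, n_rep: int) -> torch.Tensor:
    # Grouped Query Attention keeps only a few K/V heads; each one is
    # shared by n_rep query heads. Repeating along the head dimension
    # lets the standard attention math proceed unchanged.
    # kv: (batch, n_kv_heads, seq_len, head_dim)
    return kv.repeat_interleave(n_rep, dim=1)
```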

View Project →

Multimodal RAG over PDF Documents

An end-to-end Retrieval-Augmented Generation pipeline that processes PDF documents containing both text and images. Features (an indexing and retrieval sketch follows the list):

  • Multimodal embeddings with Jina-CLIP
  • Vector database storage with ChromaDB
  • Image and text extraction from PDFs
  • Question-answering with Phi-3-Vision
  • Interactive chat interface
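
A minimal sketch of the indexing and retrieval steps, assuming Jina-CLIP is loaded through transformers with trust_remote_code and an in-memory ChromaDB client; the chunk texts and collection name are placeholders, and PDF extraction plus the Phi-3-Vision answering step are omitted:

```python
import chromadb
from transformers import AutoModel

# Jina-CLIP embeds text and images into a shared vector space.
model = AutoModel.from_pretrained("jinaai/jina-clip-v1", trust_remote_code=True)

client = chromadb.Client()  # in-memory; use PersistentClient(path=...) to persist
collection = client.create_collection(name="pdf_chunks")

# Index text chunks extracted from a PDF (extraction step omitted).
chunks = ["First extracted passage...", "Second extracted passage..."]
collection.add(
    ids=[f"chunk-{i}" for i in range(len(chunks))],
    documents=chunks,
    embeddings=model.encode_text(chunks).tolist(),
)

# Retrieve the chunks closest to a user question; the hits would then be
# handed to the vision-language model as context.
hits = collection.query(
    query_embeddings=model.encode_text(["What does the document say about X?"]).tolist(),
    n_results=3,
)
print(hits["documents"])
```

Because the embeddings are multimodal, image crops extracted from the PDF can be added to the same collection via `model.encode_image(...)` and retrieved by the same text queries.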

View Project →

LoRA Fine-Tuning of Vision Transformers

Parameter-efficient fine-tuning of Vision Transformer models using Low-Rank Adaptation (LoRA) for food image classification. Features (the adapter setup is sketched after the list):

  • LoRA integration reducing trainable parameters by 98.56%
  • Vision Transformer (ViT) architecture
  • Food101 dataset with 101 food categories
  • Data augmentation pipeline
  • Mixed precision training
  • Experiment tracking with Weights & Biases
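
A rough sketch of the adapter setup using Hugging Face PEFT; the base checkpoint, rank, and target modules below are assumptions rather than the project's actual configuration:

```python
from transformers import ViTForImageClassification
from peft import LoraConfig, get_peft_model

# Base ViT with a fresh 101-way classification head for Food101.
model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224-in21k",
    num_labels=101,
)

# Inject low-rank adapters into the attention projections; keep the new
# classifier head fully trainable so it can learn the food categories.
config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=["query", "value"],
    lora_dropout=0.1,
    modules_to_save=["classifier"],
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # reports the sharp drop in trainable weights
```

Freezing everything except the adapters and the classification head is the mechanism behind the large reduction in trainable parameters reported above.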

View Project →

NER Validation Dataset Curation

A pipeline for curating validation datasets from 216,930 Jeopardy questions to evaluate Named Entity Recognition (NER) algorithms. Features (the sampling step is sketched after the list):

  • LLM-based classification using Qwen3-4B-Instruct
  • Stratified sampling maintaining category distribution
  • Three linguistic challenge categories (numbers, non-English words, unusual proper nouns)
  • GPU-accelerated batch processing with checkpointing
  • Statistical analysis across the full dataset
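
A small pandas sketch of the stratified step, preserving each category's share of the full dataset; the function and column names are hypothetical:

```python
import pandas as pd

def stratified_sample(df: pd.DataFrame, by: str, n: int, seed: int = 42) -> pd.DataFrame:
    """Draw roughly n rows while keeping each category's relative frequency."""
    parts = []
    for category, frac in df[by].value_counts(normalize=True).items():
        group = df[df[by] == category]
        take = min(len(group), max(1, round(frac * n)))
        parts.append(group.sample(n=take, random_state=seed))
    # Concatenate the per-category samples and shuffle the result.
    return pd.concat(parts).sample(frac=1, random_state=seed)

# e.g. validation = stratified_sample(questions, by="challenge_category", n=500)
```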

View Project →
