I build production computer vision and deep learning systems. Currently at Vetology Innovations in Chicago, where I've taken models from raw research ideas to deployed production infrastructure at scale. I've shipped 50+ models across classification, detection, and segmentation, trained on A100 GPUs on datasets of 400K+ images.
My focus spans self-supervised learning, foundation models, vision-language models, generative models, and MLOps. I care about the full stack: from pretraining and architecture design to optimized inference in production.
currently : Data Scientist @ Vetology Innovations, Chicago
building : foundation models, vision-language systems, production CV pipelines
background : MS Data Science, Stevens Institute of Technology
interests : SSL pretraining · vision-language · generative models · efficient inference
Self-Supervised & Foundation Models
ssl-comparison: BYOL, SimCLR, MAE, DINOv2, MoCo on the same datasets, apples to apples
mae-pretraining: MAE from scratch with reconstruction visualization
medical-foundation-model: domain-specific foundation model pretrained on 100K+ images
ijepa-implementation: clean I-JEPA implementation
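For context on the SSL comparison work above, here is a minimal NumPy sketch of the InfoNCE objective that SimCLR-style contrastive methods optimize. The function name, temperature value, and toy shapes are illustrative assumptions, not code from these repos:

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.1):
    """InfoNCE loss between two batches of embeddings.

    z1, z2: (N, D) arrays; z1[i] and z2[i] are two augmented views
    of the same image (the positive pair); all other rows are negatives.
    """
    # L2-normalize so dot products are cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature          # (N, N) similarity matrix
    # Row-wise log-softmax; the positive pair sits on the diagonal.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

Perfectly aligned views drive the loss toward zero, while unrelated pairs push it toward log N, which is why larger batches give harder negatives.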
Vision-Language & Multimodal
llava-medical-vqa: LLaVA fine-tuned for domain-specific visual Q&A
clip-finetuning: domain-specific CLIP/SigLIP fine-tuning with zero-shot evaluation
vlm-comparison: LLaVA vs InternVL vs Qwen-VL vs PaliGemma, benchmarked head to head
visual-rag: retrieval-augmented generation over image embeddings
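The zero-shot evaluation in the CLIP fine-tuning work boils down to a similarity lookup. A minimal sketch, assuming precomputed image and per-class text embeddings (function name and temperature are illustrative):

```python
import numpy as np

def zero_shot_classify(image_emb, text_embs, temperature=0.07):
    """CLIP-style zero-shot classification.

    image_emb: (D,) embedding of one image.
    text_embs: (C, D) one text embedding per class prompt.
    Returns a (C,) probability vector over classes.
    """
    # Normalize so the dot product is cosine similarity.
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = txt @ img / temperature
    probs = np.exp(logits - logits.max())     # stable softmax
    return probs / probs.sum()
```

No classifier head is trained; swapping the class set is just swapping the text prompts, which is what makes zero-shot evaluation cheap.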
Generative Models
diffusion-model-from-scratch: DDPM from scratch with a full training pipeline
diffusion-transformer: DiT implementation with scalable-architecture experiments
controlnet-finetuning: ControlNet fine-tuning for conditioned image generation
gan-progression: DCGAN-to-StyleGAN2 progression with training stability
Detection, Tracking & Segmentation
yolo-benchmark: YOLOv8 vs v9 vs v10 vs YOLO-World speed/accuracy benchmark
medical-image-segmentation: UNet++, DeepLabV3+, and SegFormer in an end-to-end pipeline
sam2-finetuning: SAM2 fine-tuned on custom domains with prompt strategies
open-vocabulary-detection: Grounding DINO and YOLO-World for zero-shot detection
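Every accuracy number in detection benchmarks like the ones above rests on box IoU, the matching criterion behind mAP. A minimal sketch for axis-aligned boxes in (x1, y1, x2, y2) format:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Intersection rectangle corners.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```

A prediction typically counts as a true positive when its IoU with a ground-truth box clears a threshold (0.5 for classic mAP@50, averaged over 0.5 to 0.95 for COCO-style mAP).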
Agentic AI
vision-agent · medical-diagnosis-agent · visual-rag · auto-image-annotator · image-search-engine · multi-agent-ml-experiments
Detection, Tracking & Video
yolo-benchmark · open-vocabulary-detection · sam2-video-tracking · multi-object-tracking · action-recognition · temporal-action-localization
Generative Models
diffusion-model-from-scratch · diffusion-transformer · controlnet-finetuning · medical-image-synthesis · gan-progression · dreambooth-finetuning
Autonomous Vehicles & 3D Vision
lidar-object-detection · multi-sensor-fusion · 3d-object-detection · depth-estimation · 3d-scene-reconstruction · occupancy-prediction
Efficient Models & Edge AI
model-compression · tensorrt-optimization · onnx-deployment · quantization-aware-training · knowledge-distillation · edge-vision-pipeline
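As a taste of what the quantization work above involves: post-training int8 quantization maps float weights to integers plus a scale. A minimal symmetric per-tensor sketch (function names are illustrative; real toolchains like TensorRT or ONNX Runtime handle this per-channel with calibration):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: w ~= scale * q."""
    scale = np.abs(w).max() / 127.0           # map the largest |weight| to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from int8 values."""
    return q.astype(np.float32) * scale
```

The round-trip error is bounded by half the scale, which is why per-channel scales (one per output channel) recover most of the accuracy that a single per-tensor scale loses.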
MLOps & Infrastructure
ml-project-template · distributed-training · triton-inference-server · model-monitoring · ci-cd-for-ml · experiment-tracking
Deep Learning Fundamentals
transformer-from-scratch · vision-transformer-from-scratch · autograd-engine · paper-implementations · diffusion-math-walkthrough · backpropagation-from-scratch
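The autograd-engine and backpropagation-from-scratch projects center on one idea: record the computation graph, then apply the chain rule in reverse topological order. A minimal scalar sketch in the spirit of micrograd (class and method names are illustrative):

```python
class Value:
    """Minimal scalar autograd node: + and * build a graph, and
    backward() propagates gradients through it via the chain rule."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def _backward():
            # d(a+b)/da = d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # d(a*b)/da = b, d(a*b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then run local backward rules
        # from the output back to the leaves.
        topo, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    build(p)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()
```

Gradients accumulate with `+=` because a node can feed several downstream ops; that one detail is where most from-scratch implementations first go wrong.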