SharvenRane/README.md

I build production computer vision and deep learning systems. I'm currently at Vetology Innovations in Chicago, where I've taken models from raw research ideas to deployed production infrastructure at scale. I've shipped over 50 models across classification, detection, and segmentation, trained on A100s with 400K+ images.

My focus spans self-supervised learning, foundation models, vision-language models, generative models, and MLOps. I care about the full stack: from pretraining and architecture design to optimized inference in production.

currently  : Data Scientist @ Vetology Innovations, Chicago
building   : foundation models, vision-language systems, production CV pipelines
background : MS Data Science, Stevens Institute of Technology
interests  : SSL pretraining · vision-language · generative models · efficient inference

Things I work with

PyTorch · Python · Azure · MLflow · ONNX · Docker · Apache Airflow · NumPy · OpenCV · Linux


What I've been building

Self-Supervised & Foundation Models

ssl-comparison: BYOL, SimCLR, MAE, DINOv2, MoCo on the same datasets, apples to apples

mae-pretraining: MAE from scratch with reconstruction visualization

medical-foundation-model: domain-specific foundation model pretrained on 100K+ images

ijepa-implementation: clean I-JEPA implementation

Vision-Language & Multimodal

llava-medical-vqa: LLaVA fine-tuned for domain-specific visual Q&A

clip-finetuning: domain-specific CLIP/SigLIP fine-tuning with zero-shot evaluation

vlm-comparison: LLaVA vs InternVL vs Qwen-VL vs PaliGemma benchmarked

visual-rag: retrieval-augmented generation using image embeddings

Generative Models

diffusion-model-from-scratch: DDPM from scratch with full training pipeline

diffusion-transformer: DiT implementation with scalable architecture experiments

controlnet-finetuning: ControlNet fine-tuning for conditioned image generation

gan-progression: DCGAN to StyleGAN2 progression with a focus on training stability

Detection, Tracking & Segmentation

yolo-benchmark: YOLOv8 vs v9 vs v10 vs YOLO-World speed/accuracy benchmark

medical-image-segmentation: UNet++, DeepLabV3+, SegFormer end-to-end pipeline

sam2-finetuning: SAM2 fine-tuned on custom domains with prompt strategies

open-vocabulary-detection: Grounding DINO and YOLO-World for zero-shot detection



More repos by area

Agentic AI

vision-agent · medical-diagnosis-agent · visual-rag · auto-image-annotator · image-search-engine · multi-agent-ml-experiments

Detection, Tracking & Video

yolo-benchmark · open-vocabulary-detection · sam2-video-tracking · multi-object-tracking · action-recognition · temporal-action-localization

Generative Models

diffusion-model-from-scratch · diffusion-transformer · controlnet-finetuning · medical-image-synthesis · gan-progression · dreambooth-finetuning

Autonomous Vehicles & 3D Vision

lidar-object-detection · multi-sensor-fusion · 3d-object-detection · depth-estimation · 3d-scene-reconstruction · occupancy-prediction

Efficient Models & Edge AI

model-compression · tensorrt-optimization · onnx-deployment · quantization-aware-training · knowledge-distillation · edge-vision-pipeline

MLOps & Infrastructure

ml-project-template · distributed-training · triton-inference-server · model-monitoring · ci-cd-for-ml · experiment-tracking

Deep Learning Fundamentals

transformer-from-scratch · vision-transformer-from-scratch · autograd-engine · paper-implementations · diffusion-math-walkthrough · backpropagation-from-scratch


LinkedIn Email

Pinned

  1. llava-medical-vqa

    Visual question answering for medical imaging using LLaVA

    Python

  2. medical-foundation-model

    Domain-specific foundation model pretrained on 100K+ medical images using SSL

    Python

  3. medical-image-segmentation

    Production segmentation pipeline: UNet++, DeepLabV3+, SegFormer on medical imaging datasets

    Python

  4. ssl-comparison

    Comprehensive benchmark of self-supervised learning methods: BYOL, SimCLR, MAE, DINOv2, MoCo

    Python

  5. vision-agent

    Autonomous vision agent that analyzes images, runs models, and generates structured reports

    Python

  6. whole-slide-image-pipeline

    Full WSI pipeline: OpenSlide → patch extraction → MIL aggregation → slide-level prediction

    Python