I bridge the gap between research-grade ML models and production hardware. My focus is on air-gapped multimodal agents, medical diagnostic pipelines, and high-concurrency streaming systems.
Local NotebookLM & Mind-Mapper
Built a local research assistant using LLM-aided OCR to extract handwritten "stickers" and notes. Powered by a local LLM, it dynamically generates interactive, expandable flowcharts and a vectorized knowledge base for local RAG.
⚙️ Flow: Image Ingest → LLM OCR → Entity Extraction → Dynamic Expanding Flowchart
🛠️ Tech: Local LLM, Dynamic UI, OCR, Local RAG
Nebulai Ecosystem
Architected a local LLM cluster and an agentic routing system. Cut token costs by 30% and operational costs by 60%.
⚙️ Flow: User Query → Router Agent → vLLM Cluster → Cost-Optimized Output
🛠️ Tech: vLLM, Agentic Orchestration, Cost Optimization
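The routing step in the flow above can be illustrated with a minimal sketch. It assumes a simple heuristic router that sends short, factual queries to a cheaper model and everything else to a larger one. The tier names, relative costs, hint words, and word-count threshold are all hypothetical and are not taken from the Nebulai system itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                  # hypothetical model identifier
    cost_per_1k_tokens: float  # hypothetical relative cost


# Two illustrative tiers; a real cluster may expose more.
SMALL = ModelTier("small-model", 0.1)
LARGE = ModelTier("large-model", 1.0)

# Hypothetical keywords suggesting the query needs multi-step reasoning.
REASONING_HINTS = ("why", "explain", "compare", "derive", "prove")


def route(query: str, max_small_words: int = 30) -> ModelTier:
    """Pick a model tier from cheap surface features of the query."""
    words = [w.strip(".,?!;:") for w in query.lower().split()]
    needs_reasoning = any(hint in words for hint in REASONING_HINTS)
    if needs_reasoning or len(words) > max_small_words:
        return LARGE
    return SMALL


if __name__ == "__main__":
    for q in ("What is the capital of France?",
              "Explain why transformers scale"):
        print(q, "->", route(q).name)
```

A production router would more likely use a classifier or an LLM call to make this decision; the keyword heuristic only shows where the cost saving comes from, which is keeping easy queries off the expensive tier.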
OpTomo (Medical Diagnostic Pipeline)
Designed an end-to-end inference pipeline for breast cancer detection, migrating heavy Python logic to hardware-accelerated bindings for a 40% latency reduction.
⚙️ Flow: Medical Scan → Python Preprocessing → C++ Inference Bindings → Real-Time Diagnosis
🛠️ Tech: C++, Python, Hardware Optimization, Edge Inference
Secure Enterprise RAG
Built an air-gapped "Talk-to-your-Data" tool utilizing hybrid search and robust data engineering pipelines.
⚙️ Flow: Enterprise Data → Apache NiFi Ingest → Hybrid Search (BM25 + Vector) → Secure LLM
🛠️ Tech: Apache NiFi, Milvus / Pinecone, Hybrid Search, Air-gapped LLM
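As an illustration of the hybrid search step, here is a minimal sketch of one common way to merge a BM25 ranking with a vector-search ranking: reciprocal rank fusion (RRF). The project description does not say which fusion method was used, so RRF is an assumption, and the function name, document IDs, and the constant k=60 are chosen for illustration only.

```python
from collections import defaultdict
from typing import Iterable, List, Sequence


def reciprocal_rank_fusion(rankings: Iterable[Sequence[str]],
                           k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in
    (rank starts at 1); documents are returned by descending total score,
    with ties broken alphabetically for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    bm25_hits = ["a", "b", "c"]    # hypothetical keyword-search results
    vector_hits = ["b", "d", "a"]  # hypothetical embedding-search results
    print(reciprocal_rank_fusion([bm25_hits, vector_hits]))
```

RRF works on ranks rather than raw scores, so BM25 scores and vector similarities never need to be normalized onto a common scale before merging.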
Mobile Edge-AI Engine
Engineered a hot-swappable mobile inference engine capable of switching neural architectures at runtime without requiring app store updates.
⚙️ Flow: Android App → Kotlin Orchestrator → ONNX Model Hot-Swap → On-Device Inference
🛠️ Tech: Kotlin, ONNX, Android SDK, Neural Architecture
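The hot-swap idea can be sketched independently of Android and ONNX. The example below is written in Python rather than Kotlin to stay consistent with the other sketches on this page, and it stands in plain callables for loaded models. The class name `ModelSlot` and its methods are hypothetical. It shows only the core pattern: prepare the new model off to the side, then replace the active reference under a lock so callers always see a complete model.

```python
import threading
from typing import Any, Callable

Model = Callable[[Any], Any]  # stand-in for a loaded inference session


class ModelSlot:
    """Holds the active model and allows replacing it at runtime."""

    def __init__(self, model: Model) -> None:
        self._lock = threading.Lock()
        self._model = model

    def predict(self, x: Any) -> Any:
        # Take a reference under the lock, then run inference outside it
        # so a slow prediction never blocks a swap.
        with self._lock:
            model = self._model
        return model(x)

    def swap(self, load_new_model: Callable[[], Model]) -> None:
        # Load first (possibly slow), then switch the reference quickly.
        new_model = load_new_model()
        with self._lock:
            self._model = new_model


if __name__ == "__main__":
    slot = ModelSlot(lambda x: x + 1)   # hypothetical "architecture A"
    print(slot.predict(1))
    slot.swap(lambda: (lambda x: x * 10))  # hypothetical "architecture B"
    print(slot.predict(1))
```

In-flight predictions finish on the old model while new calls pick up the replacement, which is what lets the architecture change without restarting or updating the app.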
Edge AI & Hardware Optimization
Data Engineering, RAG & Databases
Computer Vision, Analysis & Visualization

