Systems Engineer | Edge AI · Agentic RAG · Distributed Autonomous Infrastructure
I design and build deterministic software architectures for resource-constrained systems. My work focuses on pulling heavy AI workloads out of the cloud and optimizing them to run efficiently on bare-metal and ARM edge infrastructure.
With a background rooted in mechanical engineering and a career built shipping production robotics R&D in Bengaluru, I treat software as an extension of physical constraints. I write production-grade Python and systems automation that accounts for thermal throttling, memory leaks, I/O bottlenecks, and network failures before a single line of code goes live.
A deterministic, graph-based compiler that converts arbitrary system requirements into fully audited, deployable software assets.
- Fractal Decomposition: Avoided sequential prompting pitfalls by building a stateful, cyclic execution graph that recursively splits monolithic code requirements down to 4 isolation layers.
- Context Optimization: Engineered a 3-tier token budget manager (Manifest → Contract Surface → Git Diffs), resulting in a 7x reduction in token overhead.
- Static & Semantic Verification: Built pie.py (Production Intuition Engine) to run deterministic AST parsing alongside an LLM-as-a-judge node, enforcing over 10,000 structural engineering constraints.
- Crash-Safe Persistence: Wrote an asynchronous checkpoint engine that commits runtime state to local disk, surviving mid-run API failures without data loss.
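
The Context Optimization bullet above describes filling a token budget in tier order. A minimal sketch of that idea follows. It is an assumption about the general shape, not the project's actual manager: `Tier` and `pack_context` are hypothetical names, and whitespace-separated words stand in for real tokenizer counts.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    """One context tier, e.g. manifest, contract surface, or git diffs."""
    name: str
    text: str


def pack_context(tiers: list[Tier], budget: int) -> list[tuple[str, str]]:
    """Fill the budget tier by tier, highest priority first.

    The tier that overflows the budget is truncated; later tiers are dropped.
    Words are a crude stand-in for tokens (assumption for illustration).
    """
    packed: list[tuple[str, str]] = []
    remaining = budget
    for tier in tiers:
        if remaining <= 0:
            break
        kept = tier.text.split()[:remaining]
        packed.append((tier.name, " ".join(kept)))
        remaining -= len(kept)
    return packed
```

Ordering the list as manifest, contract surface, then diffs means the cheapest, most structural context always survives, and the bulkiest tier is the first to be cut.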
Stack: Python, LangGraph, FastAPI, Pydantic v2, SQLite
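
Crash-safe persistence of the kind described above is commonly built on an atomic write-then-rename. The sketch below illustrates that pattern only. The function names and the JSON file layout are hypothetical, not the project's actual checkpoint engine.

```python
import asyncio
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Optional


def save_checkpoint(path: Path, state: dict[str, Any]) -> None:
    """Write state atomically: a crash leaves either the old or the new file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the target directory so the rename stays on
    # one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # force bytes to disk before the rename
        os.replace(tmp_name, path)  # atomic swap of the checkpoint file
    except BaseException:
        # Remove the partial temp file; the previous checkpoint is untouched.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def load_checkpoint(path: Path) -> Optional[dict[str, Any]]:
    """Return the last committed state, or None if no checkpoint exists."""
    if not path.exists():
        return None
    with path.open(encoding="utf-8") as f:
        return json.load(f)


async def save_checkpoint_async(path: Path, state: dict[str, Any]) -> None:
    """Run the blocking write in a worker thread so the event loop stays free."""
    await asyncio.to_thread(save_checkpoint, path, state)
```

On restart, a runner would call `load_checkpoint` first and resume from the returned state if one exists.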
A production-grade document synthesis engine designed to process non-textual, high-density corporate data structures natively.
- Visual-Spatial Ingestion: Utilized ColPali vision-language embeddings to parse complex structural assets (tables, schematics, charts) without lossy OCR steps.
- Hybrid Indexing: Configured custom Qdrant vector spaces optimized for hybrid keyword and dense-spatial searches.
- Deterministic Citations: Enforced strict, metadata-validated tracing layers ([Source: filename, Page: N]) to prevent hallucinated data injection.
- Fault-Tolerant Networking: Designed a KeyRotator abstraction layer to absorb aggressive provider rate-limiting and handle horizontal network failovers seamlessly.
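
The citation format above lends itself to a deterministic check against index metadata. A minimal sketch follows, assuming a hypothetical `page_counts` mapping from filename to page count. The helper name is illustrative and not taken from the project.

```python
import re

# Matches citations of the form [Source: filename, Page: N].
CITATION_RE = re.compile(
    r"\[Source:\s*(?P<file>[^,\]]+?)\s*,\s*Page:\s*(?P<page>\d+)\]"
)


def find_invalid_citations(answer: str, page_counts: dict[str, int]) -> list[str]:
    """Return every citation that names an unknown file or an out-of-range page."""
    invalid: list[str] = []
    for match in CITATION_RE.finditer(answer):
        total = page_counts.get(match.group("file"))
        page = int(match.group("page"))
        if total is None or not 1 <= page <= total:
            invalid.append(match.group(0))
    return invalid
```

An answer would be rejected or regenerated whenever the returned list is non-empty.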
Stack: ColPali, Qdrant, Groq API, FastAPI, Docker
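
One way to picture the KeyRotator layer mentioned above is as a wrapper that retries a request with the next API key whenever the provider signals rate limiting. This is a sketch under assumptions: the `RateLimited` exception and the `call` interface are hypothetical, and the real abstraction may differ.

```python
import itertools
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error a client wrapper raises on a rate-limit response."""


class KeyRotator:
    """Cycle through API keys, moving on when the current key is rate-limited."""

    def __init__(self, keys: Sequence[str]) -> None:
        if not keys:
            raise ValueError("at least one API key is required")
        self._keys = list(keys)
        self._cycle = itertools.cycle(self._keys)
        self._current = next(self._cycle)

    def call(self, request: Callable[[str], T]) -> T:
        """Try each key at most once per call, rotating on RateLimited."""
        last_error: Optional[RateLimited] = None
        for _ in range(len(self._keys)):
            try:
                return request(self._current)
            except RateLimited as exc:
                last_error = exc
                self._current = next(self._cycle)
        raise RuntimeError("all API keys are rate-limited") from last_error
```

The rotator remembers the last working key, so subsequent calls do not retry keys that were just throttled.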
A cloud-independent, low-latency execution runtime built explicitly for ARM-based embedded compute platforms.
- Inference Latency Hardening: Quantized and compiled vision architectures down to localized Int8 execution providers, dropping perception-to-action overhead to <100ms on a Raspberry Pi 5.
- Asynchronous Pipelines: Deployed local audio capture, Whisper processing, local LLM evaluation, and TTS into independent, zero-copy memory pipelines.
- Low-Latency Telemetry: Re-engineered spatial transmission using WebRTC data channels with sub-15ms system command latency.
- Enterprise Hardening: Secured systems via RSA-signed OTA updates, mutual TLS tunnels, and a local SQLCipher storage layer.
Stack: Python, TFLite, YOLOv8, WebRTC, ROS2, OpenCV
High-throughput system monitoring built at the kernel layer for minimal performance degradation.
- Kernel Instrumentation: Deployed custom eBPF probes directly into the kernel data path to log system events with minimal overhead.
- Distributed Logging: Directed asynchronous event packets through an Apache Kafka streaming pipeline into an optimized InfluxDB time-series backend.
- State Tracking: Designed dynamic Grafana dashboards paired with automated webhook alerting to identify anomalous system load spikes instantly.
Stack: eBPF, Apache Kafka, InfluxDB, Grafana
- Zero-Cloud Compute Dependency: Moved heavy vision pipelines from expensive x86 cloud clusters down to on-device ARM edge chips, hitting a hard sub-100ms processing threshold.
- 30% Reduction in Telemetry Latency: Replaced legacy, synchronous polling protocols with event-driven WebSocket and WebRTC pipelines to minimize transport layer overhead.
- 40% Faster Prototyping Lifecycles: Bridged the communication gap between mechanical CAD pipelines and ML lifecycles by designing software abstractions that account for real-world mechanical constraints.
- 25% Uptime Improvement: Designed robust, failure-tolerant Python scripts with strict process isolation, preventing cascading hardware crashes in real-world deployments.
"I don't look at machine learning as a magical black box; I treat an LLM or neural network as a highly non-linear, stochastic software module that requires the same strict testing, validation, and cost constraints as any legacy database or compiler."
- Compute is Never Free: An elegant architecture is defined by how small a model it needs to solve a problem deterministically, not how many parameters it can throw at it.
- Fail Gracefully at the Boundary: When a sensor fails, a camera drops a frame, or an external API times out, the system should degrade safely, not throw an unhandled exception and crash the entire machine.
- Hardware and Software are Married: Building software without understanding memory maps, CPU architectures, cache lines, or thermal thresholds is how bad code makes great hardware look slow.
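
As an illustration of the "Fail Gracefully at the Boundary" principle, the sketch below wraps a boundary read so that a failure returns the last known good value, flagged as degraded, instead of propagating an exception. `GuardedSource` and `Reading` are hypothetical names chosen for this example.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")
log = logging.getLogger("boundary")


@dataclass
class Reading(Generic[T]):
    value: Optional[T]  # None if no good value has ever been read
    degraded: bool      # True when the value is stale or missing


class GuardedSource(Generic[T]):
    """Wrap a sensor, camera, or API read so failures degrade instead of crash."""

    def __init__(self, read: Callable[[], T]) -> None:
        self._read = read
        self._last_good: Optional[T] = None

    def poll(self) -> Reading[T]:
        try:
            value = self._read()
        except (OSError, ValueError) as exc:  # TimeoutError is an OSError
            log.warning("boundary read failed, degrading: %s", exc)
            return Reading(self._last_good, degraded=True)
        self._last_good = value
        return Reading(value, degraded=False)
```

Callers branch on `degraded` to decide whether to slow down, hold position, or alert, rather than handling exceptions deep inside the control loop.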
- Current R&D: Building asynchronous, stateful agent networks and local Edge AI perception graphs.
- Current Deep-Dive: Advanced weight distillation, structural quantization-aware training, and model fine-tuning with LoRA/QLoRA.
- Open To: Principal, Lead, or Senior Systems Engineering roles within Edge AI, Agentic Systems Design, or Intelligent Robotics Infrastructure.
Equipped for global asynchronous remote engineering.