- View Portfolio | For Recruiters | For Consulting Clients | Performance Benchmarks | Technology Comparisons
Hi, I'm Drew, a Rust developer building advanced AI infrastructure and GPU-accelerated systems.
What you'll find here:
- 🦀 Rust-first AI systems: LLM routers, RAG backends, GPU-accelerated inference using Claude Code, OpenCode
- ⚡ CUDA-integrated performance: Matrix operations 15-40x faster, custom GPU kernels for Nvidia clusters
- 🔒 Memory-safe ML: Zero-copy operations, no GC pauses, predictable latency for production AI
- 🏗️ Full-stack cloud engineering: Pulumi IaC, Kubernetes orchestration, CI/CD pipelines, infrastructure as code
- 🧠 Open-source tools: CLI utilities, automation frameworks, MCP servers, developer experience
- 🤖 Systems expertise: Linux optimization, container runtimes, cross-platform tooling
My Focus:
- Rust + AI Engineering: Memory-safe LLM systems using burn, candle, candle-core for production workloads
- GPU Optimization: CUDA kernels, TensorRT integration, Nvidia NIM and Blueprints for scalable AI
- Cloud Infrastructure: Pulumi IaC, Kubernetes orchestration, RunAI clusters, multi-cloud patterns
- Performance Engineering: Real-world benchmarks, up to 40x speedups, production optimization with full-stack visibility
- Developer Tooling: CLI-first workflows, containerized AI environments, MCP servers
Philosophy: Rust's ownership model extended to system architecture: composable components with memory-safe, zero-cost abstractions. Build systems that are reproducible, performant, and that respect the principle of least surprise with maximum transparency.
Rust isn't just a language—it's a competitive advantage for production AI systems.
| Metric | Python | Rust |
|---|---|---|
| Memory Safety | Runtime errors | Compile-time guarantees |
| Performance | Good | 2-5x faster |
| Concurrency | GIL limitations | True parallelism |
| Memory Overhead | 50-100MB+ interpreter | Minimal, no runtime or GC |
| Deployment | Heavy interpreter | Single binary |
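As a concrete illustration of the concurrency row, here is a minimal sketch (not taken from any project in this repo) of CPU-bound work spread across real OS threads, the kind of parallelism the GIL blocks in stock CPython:

```rust
use std::thread;

// Sum a slice across real OS threads. Unlike CPython under the GIL,
// these threads execute CPU-bound work genuinely in parallel.
fn parallel_sum(data: &[u64], n_threads: usize) -> u64 {
    let chunk = data.len().div_ceil(n_threads).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk)
            .map(|c| s.spawn(move || c.iter().sum::<u64>()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).sum()
    })
}

fn main() {
    let data: Vec<u64> = (1..=1_000_000).collect();
    // Sum of 1..=1_000_000 is 500_000_500_000.
    println!("{}", parallel_sum(&data, 4));
}
```

`thread::scope` lets the threads borrow `data` directly, so no copies or reference counting are needed; the borrow checker guarantees the threads finish before the slice goes out of scope.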
- Language: Rust with Tokio async runtime
- Purpose: Intelligent LLM prompt routing using reinforcement learning algorithms
- Tech: Claude Code integration, OpenCode MCP, model evaluation, adaptive routing
- Performance: Sub-millisecond routing decisions with minimal memory overhead
- GitHub: awdemos/merlin
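The routing idea can be sketched as an epsilon-greedy bandit. This is a hypothetical illustration of RL-based prompt routing, not Merlin's actual API or algorithm:

```rust
// Hypothetical epsilon-greedy router: keeps a running reward estimate
// per backend model and exploits the best one most of the time.
// All names here are illustrative, not Merlin's real interface.
struct Router {
    values: Vec<f64>, // estimated reward per model
    counts: Vec<u64>, // times each model was chosen
    epsilon: f64,     // exploration probability
}

impl Router {
    fn new(n_models: usize, epsilon: f64) -> Self {
        Self { values: vec![0.0; n_models], counts: vec![0; n_models], epsilon }
    }

    /// Pick a model. `roll` and `random_arm` stand in for an RNG so the
    /// decision logic stays deterministic and testable.
    fn choose(&self, roll: f64, random_arm: usize) -> usize {
        if roll < self.epsilon {
            random_arm // explore
        } else {
            // exploit: argmax over estimated values
            self.values
                .iter()
                .enumerate()
                .max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
                .map(|(i, _)| i)
                .unwrap()
        }
    }

    /// Incremental-mean update after observing a reward
    /// (e.g. a score for answer quality or latency).
    fn update(&mut self, arm: usize, reward: f64) {
        self.counts[arm] += 1;
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm] as f64;
    }
}
```

Because `choose` is just an argmax over a small vector, the decision itself is a few nanoseconds of work; the sub-millisecond budget is spent elsewhere (feature extraction, I/O).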
- Language: Rust backend + Python Chainlit frontend
- Purpose: Retrieval-augmented generation pipeline with streaming
- Tech: Async inference using burn/candle, vector embeddings, concurrent processing, Claude Code assistance
- GitHub: demos/llm/chainlit_rust_rag
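A minimal sketch of the streaming shape of such a pipeline (illustrative only; the real project uses async inference with burn/candle rather than plain threads): a producer forwards retrieved chunks over a channel while the consumer streams them out as they arrive.

```rust
use std::sync::mpsc;
use std::thread;

// Producer/consumer streaming: retrieval runs on its own thread and
// forwards chunks as soon as they are ready, so the frontend can start
// rendering before retrieval finishes. Names are hypothetical.
fn stream_chunks(docs: Vec<String>) -> mpsc::Receiver<String> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        for doc in docs {
            // Real pipeline: embed the query, score `doc`, then stream
            // generated tokens; here we just forward the chunk as-is.
            let _ = tx.send(doc);
        }
    });
    rx
}

fn main() {
    let rx = stream_chunks(vec!["chunk-1".into(), "chunk-2".into()]);
    for chunk in rx {
        println!("{chunk}");
    }
}
```

The receiver implements `IntoIterator`, so the consumer loop ends cleanly when the producer drops its sender.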
- Language: Rust-based (Gentoo fork, Btrfs, Cosmic Desktop)
- Purpose: Linux distro designed for AI workloads from the ground up
- Tech: Rust init system, Btrfs filesystem, AI-optimized tooling, Claude Code integration
- GitHub: awdemos/RegicideOS
- Language: Rust with CUDA bindings
- Purpose: 1024x1024 matrix multiplication demonstrating GPU optimization
- Performance: 15x faster than CPU, 40x faster with cuBLAS; a production-proven speedup
- GitHub: demos/rust/rust_matrix_multiplication
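For context on what the GPU numbers are measured against, here is a plausible CPU reference in the same spirit: a naive O(n³) multiply over row-major buffers. This is illustrative; the repo's actual benchmark harness may differ.

```rust
// Naive O(n^3) matrix multiply over row-major f32 buffers: the kind of
// CPU baseline that 15x (custom kernel) and 40x (cuBLAS) figures are
// typically measured against. Illustrative, not the repo's exact code.
fn matmul_cpu(a: &[f32], b: &[f32], n: usize) -> Vec<f32> {
    let mut c = vec![0.0f32; n * n];
    for i in 0..n {
        for k in 0..n {
            let aik = a[i * n + k];
            for j in 0..n {
                // i-k-j loop order keeps the inner accesses sequential,
                // which is markedly faster than the textbook i-j-k order.
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
    c
}

fn main() {
    // 2x2 sanity check: [[1,2],[3,4]] * [[5,6],[7,8]] = [[19,22],[43,50]]
    let c = matmul_cpu(&[1.0, 2.0, 3.0, 4.0], &[5.0, 6.0, 7.0, 8.0], 2);
    println!("{c:?}");
}
```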
Verified Expertise Across AI Infrastructure, GPU Computing, and Cloud Engineering
| Certification | Focus Area |
|---|---|
| AI Advisor - Technical Sales | AI solutions architecture & sales engineering |
| AI Infrastructure Operations Fundamentals | GPU cluster operations & management |
| BCM Administration | NVIDIA Base Command Manager administration |
| DGX SuperPOD Administration | Enterprise AI infrastructure deployment |
| InfiniBand Essentials | High-performance networking fundamentals |
| InfiniBand Network Administration | Network configuration & optimization |
| Introduction to Networking | Network architecture foundations |
| NVIDIA Pro: Agentic AI | Agentic AI systems & implementation |
| NVIDIA Pro: AI Infrastructure | Advanced AI infrastructure design |
| Certification | Focus Area |
|---|---|
| AI Business Practitioner | Business AI strategy & implementation |
| AI Technical Practitioner | Technical AI solution design |
| Partner Certification (EAP Comms) | Partner technical enablement |
| Certification | Provider | Focus Area |
|---|---|---|
| AI Certificate | IBM / edX | AI fundamentals & applications |
| Technical Sales Professional | Pure Storage | Storage infrastructure & solutions |
| Securing Software Supply Chain with Sigstore | Linux Foundation | Supply chain security |
| Credential | Provider |
|---|---|
| Supply Chain Security | Chainguard |
| Securing Software Supply Chain | Sigstore / Linux Foundation |
| Course | Focus Area |
|---|---|
| Acting Responsibly with Generative AI | AI ethics & responsible use |
| Empathy and Emotional Intelligence at Work | Leadership & collaboration |
| AI Empathy Certificate | Human-AI interaction |
| Dignity and Respect in the Global Workplace | Inclusive workplace practices |
| Preventing Workplace Violence | Workplace safety |
| Healthcare Introduction | Healthcare industry fundamentals |
This portfolio demonstrates practical application of NVIDIA technologies across multiple projects:
| Project | NVIDIA Technologies Used |
|---|---|
| LLM Deployment Demos | NVIDIA GPUs, CUDA optimization |
| AI Infrastructure Demos | NVIDIA container runtime, MIG (Multi-Instance GPU) |
| MLOps Pipelines | NVIDIA Triton Inference Server, RAPIDS |
All certificates available in the certs/ directory for verification.
Why Consider This Portfolio?
Deep Technical Expertise:
- Rust, CUDA & GPU: Production experience building AI systems with memory-safe Rust and GPU acceleration
- Claude Code & OpenCode: Deep expertise in AI-assisted development with MCP servers
- Kubernetes: 50+ deployment patterns across AWS, GCP, Azure with RunAI integration
- AI/ML Infrastructure: Production LLM deployments, MLOps pipelines with Rust backends
- Modern IaC: Pulumi (Go), Terraform, GitOps practices
- Full-Stack Cloud Engineering: Infrastructure as code, CI/CD pipelines, observability, monitoring
Open Source Contributions:
- Merlin — Rust-based LLM router with reinforcement learning
- RegicideOS — AI-native Rust Linux distribution
- chainlit_rust_rag — Production RAG backend in Rust
- rust_matrix_multiplication — CUDA-accelerated ops (15x faster)
Technical Skills Demonstrated:
- Memory-safe ML frameworks (burn, candle, candle-core)
- GPU kernel development and optimization for Nvidia clusters
- Async systems programming with Tokio for high-throughput AI workloads
- Nvidia NIM (NVIDIA Inference Microservices) and Blueprints integration
- Infrastructure as Code with declarative Pulumi definitions
- RunAI clusters and multi-cloud orchestration
- CI/CD pipelines with container-native approaches
- Comprehensive observability and monitoring
- Model Context Protocol (MCP) servers for AI agents
- Container-isolated development environments for AI workflows
- Multi-agent coordination and parallel processing
- Linux kernel and container runtime optimization
- Systemd services and container orchestration
Contact for Recruiting:
- 🐙 GitHub Issues — Create an issue to reach out
- 📧 Use GitHub's email contact feature (if public on my profile)
Learn More:
- 🦀 Rust Demos & AI Projects — See production Rust code in action
- 📊 Performance Benchmarks — Quantifiable metrics and improvements
- 🔬 Technology Comparisons — Deep technical analysis
- 📚 Rust for AI Guide — Comprehensive documentation
- 🤖 Linux & Systems — Kernel and container runtime optimization
I help organizations:
- Build Rust-first AI infrastructure with 2-5x performance improvements
- Optimize Nvidia clusters and RunAI deployments for cost efficiency
- Develop memory-safe ML systems using burn, candle, and custom CUDA kernels
- Implement Infrastructure as Code with Pulumi across multi-cloud environments
- Deploy scalable AI platforms with Kubernetes, CI/CD, and full observability
Proven Results:
"Reduced our AI costs by 60% using Rust-based infrastructure while improving performance. The CUDA optimizations and memory safety were game-changing." — CTO, FinTech Startup

"Transitioned our AI platform to Rust with zero downtime. Production performance improved 3x with predictable latency." — VP Engineering, AI Company
How to Work With Me:
- Initial Consultation: Free 10-minute discovery call
- Engagement Models:
- Tier 1: Strategy & Planning — $250/hr, 10-hour minimum
- Infrastructure Assessment
- Rust + AI Architecture Review
- Cost Optimization Analysis
- Technology Roadmap
- Team Training on Rust and CUDA
- Tier 2: Full Implementation — $5,000/project, exclusive to one client
- Complete Infrastructure Overhaul
- AI/ML Pipeline Development in Rust
- Kubernetes Migration
- GPU Optimization and CUDA Kernel Development
- Ongoing Support (retainer-based)
📅 Schedule Free Consultation: cal.com/aiconsulting
What You Get:
- Production-ready Rust code (see demos in this repo)
- Knowledge transfer and team training
- Ongoing support and optimization
- Transparent pricing and clear timelines
Practical Impact:
"The best consulting delivers value that lasts long after engagement ends. My Rust expertise enables building systems that are composable, memory-safe, and performant. Zero GC pauses means predictable AI performance—critical for production workloads. Infrastructure as Code with Pulumi ensures reproducible deployments across clouds."
Enterprise-Grade Practices: The patterns demonstrated in this portfolio aren't just for startups—they scale:
- Compliance-Ready: SLSA Level 2/3 supply chain hardening for regulated industries
- Multi-Tenant Security: Zero-trust architecture with proper RBAC and isolation
- Audit Trails: Comprehensive logging, monitoring, and traceability across all systems
- Disaster Recovery: Immutable infrastructure with backup and restore strategies
- Scalable Architecture: Horizontal scaling with proper state management and orchestration
Production-Grade Infrastructure Patterns & Demos
| Directory | Description | Technologies | Highlights |
|---|---|---|---|
| kubernetes/ | 100+ deployment patterns | K8s, EKS, GKE, Talos, Cilium | Multi-cloud, zero-trust, GPU-optimized, RunAI integration |
| llm/ | AI/ML infrastructure | Mistral, OpenAI, Nvidia GPUs | Fine-tuning, inference, RAG pipelines, GPU optimization |
| dagger-go-ci/ | CI/CD pipelines | Dagger, Tekton, Go | Container-native, reproducible, platform detection |
| pulumi-azure-tenant/ | Multi-tenant IaC | Pulumi (Go), Azure | Secure, scalable patterns, GitOps |
| rust/ | Rust CLI tools & AI systems | Rust, Tokio, CUDA | Performance-critical tools, memory safety |
| python/ | Python best practices | Poetry, Type hints | Production-ready patterns |
| ai-agent-tools/ | AI agent infrastructure | MCP, container-use, OpenCode | Multi-agent systems, isolated workspaces |
| dev-experience/ | Developer tooling | Zerobrew, tmux, neovim | Cross-platform automation, dotfiles |
```bash
# Clone repository
git clone https://github.com/awdemos/demos.git
cd demos

# Explore available demos
ls -la
```

Enterprise-Grade AI Stack
Production experience building and operating complete AI/ML infrastructure:
- NVIDIA GPU Operator: Automated GPU provisioning in Kubernetes
- MIG (Multi-Instance GPU): Partitioning for multi-tenant efficiency
- DCGM Monitoring: Real-time GPU metrics and telemetry
- CUDA Toolkit 12.1.0: Optimized workflows and memory management
- NVIDIA Container Toolkit: Seamless GPU access in containers
- NIM (Inference Microservices): Scalable AI model deployment with RunAI clusters
- Triton Inference Server: Production model serving with GPU acceleration
- MLflow: Experiment tracking, model registry, and lineage
- Ray: Distributed computing for training and inference
- Argo Workflows: ML pipeline orchestration with GitOps
- Model Serving: Production deployments (Mistral, OpenAI, custom models)
- Rust-First ML: Burn framework for memory-safe ML workloads
- Inference Optimization: TensorRT, batch processing, resource management
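The throughput gains behind batch processing come from grouping requests before they reach the GPU. A deliberately tiny sketch of that micro-batching idea (Triton's dynamic batcher is far more sophisticated, adding timeouts and priority queues):

```rust
// Micro-batching: group incoming requests into fixed-size batches so
// each GPU invocation amortizes its launch overhead across many inputs.
// Illustrative sketch only.
fn micro_batch<T>(requests: Vec<T>, max_batch: usize) -> Vec<Vec<T>> {
    assert!(max_batch > 0, "batch size must be positive");
    let mut batches = Vec::new();
    let mut it = requests.into_iter().peekable();
    while it.peek().is_some() {
        batches.push(it.by_ref().take(max_batch).collect());
    }
    batches
}

fn main() {
    // 5 requests with batch size 2 -> batches of sizes [2, 2, 1]
    let batches = micro_batch(vec!["a", "b", "c", "d", "e"], 2);
    println!("{batches:?}");
}
```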
Explore Demos:
- demos/llm/ — LLM infrastructure with GPU optimization
- demos/rust/ — Rust-based AI tools and performance benchmarks
- demos/kubernetes/ — GPU-enabled Kubernetes deployments with RunAI
Always Learning, Always Building
Active areas of investigation and experimentation:
- MCP (Model Context Protocol) — Building custom tools for AI agents in Rust
- Parallel Agent Workflows — Running multiple AI agents simultaneously for complex tasks
- Async Agent Coordination — Background task management and result aggregation
- Container-Isolated Environments — Safe execution of AI-generated code
- AI-Native Tools — Editors and IDEs with LLM-first design (Claude Code, OpenCode)
- Automated Code Review — Using AI for architecture validation
- Self-Healing Infrastructure — Systems that detect and fix issues autonomously
- GPU Optimization — CUDA kernels, memory management, and NVIDIA TensorRT acceleration
- NVIDIA DCGM Integration — Deep GPU monitoring and telemetry for production systems
- Rust-Based AI Infrastructure — Performance-critical ML tooling
- Resource Scheduling — Efficient NVIDIA GPU allocation for multi-tenant systems
- RunAI Clusters — Scalable AI infrastructure on Nvidia GPUs
- SLSA in Production — End-to-end supply chain verification
- Zero-Knowledge Workloads — Confidential AI on untrusted infrastructure
- Hardened Container Images — Minimal attack surfaces for AI services
Want to Collaborate? These areas are actively evolving. If you're working on similar problems or want to explore together, let's connect.
- efrit — Native elisp coding agent running in Emacs. Nushell port in progress.
- Voice of the Dead — SOTA TTS project
- Merlin — LLM router written in Rust. Utilizes RL to route LLM prompts intelligently. GPL 3.0 project.
- chainlit_rust_rag — Rust backend for RAG pipeline with Chainlit frontend
- RegicideOS — AI-native, Rust-first Linux distribution based on Gentoo, Btrfs, Cosmic-Desktop
- DCAP — Dynamic Configuration and Application Platform for distributed systems
- symbolic_ai_elisp_knowledge_base — Open-source reimagining of a Cyc-style knowledge base
- Dotfiles — Complete development environment with 300+ lines of Makefile automation, cross-platform support (macOS, Linux, WSL, Alpine), AI/ML stack integration, and comprehensive documentation
- container-use integration — Isolated development environments for AI coding agents with branch isolation and diff/review workflows
- MCP Servers — Production examples extending AI agents with custom tools (CLI execution, API integration, web search)
- Talos — Best in class Kubernetes OS
- Pulumi — Infrastructure as Code in general purpose programming languages
- RunAI — Scalable AI infrastructure on Nvidia GPUs
- vCluster — Virtual Kubernetes clusters
- Cilium — eBPF-based networking and security
- Cloudflare — Cost-effective cloud services
- Railway — Instant deployments, effortless scale
- GPTScript — Natural language scripting
- Claude Code — I use it daily
- OpenCode — Modern AI development environment
- pairup — AI Pair Programming in Neovim
- ComfyUI — Stable diffusion framework
- container-use — Isolated development environments for AI agents (Dagger)
- bincapz — Container image security analysis
- Colima — Container runtime for macOS/Linux
- Dive — Docker image layer analysis
- Podman — Daemonless container engine
- nerdctl — Docker-compatible containerd CLI
- slim — Container image optimization (30x reduction)
- Kitty Terminal — Fast, GPU-accelerated terminal
- Cursor IDE — AI-powered development environment
- Devcontainer — Containerized development
- Devpod — Automated dev environments
- Chainguard — Software supply chain security and minimal base images
- SLSA Framework — Supply chain Levels for Software Artifacts (implemented in dotfiles)
- GrapheneOS — Security-focused Android distribution
- NitroPC — Open-source secure PC
For Recruiters:
- 📧 Use GitHub's email (if public) or create an issue to reach out
- 📋 Review Featured Projects for evidence of expertise
For Consulting:
- 📅 Schedule Free Consultation
- 💼 Review For Consulting Clients section
Open Source:
- 🐙 Follow on GitHub for new projects
- ⭐ Star interesting projects to show appreciation
Open Source by Default
Everything in this portfolio is open source, documented, and reproducible. I believe in:
- Transparent systems - No black boxes, all decisions documented
- Knowledge sharing - Comprehensive guides and troubleshooting documentation
- Composable tools - Every component replaceable and well-integrated
- Security-first - SLSA implementation, immutable infrastructure, supply chain integrity
- Dotfiles Repository — Complete development environment with AI/ML stack, GPU orchestration, MCP servers, and advanced developer tooling
- MCP Guide — Comprehensive Model Context Protocol implementation examples
- AI Coding Tools — Terminal-focused AI assistance workflows
- SLSA Implementation — Supply chain security hardening
- Rust for AI Guide — Comprehensive Rust ecosystem for AI/ML workloads
- Technology Comparisons — Deep analysis of Kubernetes, LLM serving, IaC, CI/CD, and service mesh tools
- Performance Benchmarks — Quantifiable metrics from production deployments
- Screenshots Guide — Instructions for creating visual assets to showcase demo projects
- Production-ready infrastructure patterns from real deployments
- Security best practices (SLSA Level 2/3, immutable infrastructure)
- Multi-agent AI system architectures
- Cross-platform developer experience automation
This portfolio and associated dotfiles repository aren't just demos—they represent production patterns that solve actual problems:
- Cost Reduction: NVIDIA GPU scheduling and MIG partitioning that cut AI infrastructure spend by 50%+
- Reliability: GitOps workflows that have maintained 99.9%+ uptime across multiple clients
- Velocity: Automated CI/CD pipelines that reduced deployment times from hours to minutes
- Performance: CUDA optimization and Triton Inference Server deployments that improved inference throughput by 3-5x
- Security: SLSA implementation that passed external audits for regulated industries
Open Source ≠ Only Open Source
While this repository contains openly available tools, patterns, and examples, expertise demonstrated here is equally applicable to proprietary, confidential, or regulated environments. The principles—automation, reproducibility, transparency—work everywhere.
- Multi-Agent Orchestration: Building systems where AI agents collaborate with domain experts
- Self-Healing Infrastructure: Systems that detect and remediate issues autonomously
- AI-Native Tooling: Development environments optimized for AI-assisted workflows
- Quantum-Resistant Cryptography: Preparing infrastructure for post-quantum security requirements
- Distributed Training at Scale: Optimizing ML pipelines across heterogeneous NVIDIA GPU clusters
- CUDA Kernel Development: Custom GPU kernels for specialized AI workloads
- NVIDIA MIG Optimization: Advanced GPU partitioning strategies for multi-tenant efficiency
Beyond Basic Automation
Building systems that improve themselves:
- AAS (Artificial Age Score) Monad Framework — Mathematically-grounded scoring for configuration evolution, contradiction detection, and guided optimization
- Parallel Experimentation — Multiple isolated configurations evaluated objectively for optimal states
- Appetition-Driven Updates — Systems that evolve toward better configurations through measurable feedback
- Git Worktree + container-use — Parallel feature development with isolated AI agent environments
- Workmux — Project-based tmux session management with automatic workspace setup
- Multi-Agent Coordination — Multiple AI agents working simultaneously in isolated environments
- Immutable Infrastructure — All infrastructure declarative and version-controlled
- Security-First Design — SLSA Level 2/3 implementation, supply chain integrity
- Observability — Comprehensive monitoring (NVIDIA DCGM, MLflow, Prometheus dashboards)
Production-Grade Security Practices
Building systems that are secure by design:
- SLSA Implementation — Full supply chain provenance for artifacts
- Verifiable Builds — Reproducible builds with attestation
- Dependency Verification — SBOM generation and vulnerability scanning
- Zero-Trust Networking — Cilium, eBPF-based security policies
- Secrets Management — HashiCorp Vault, Kubernetes secrets encryption
- Container Hardening — Chainguard images, bincapz security analysis
- Code Signing — GPG signing for all commits and releases
- Security Auditing — Regular penetration testing and dependency updates
- Compliance-Ready — Infrastructure designed for SOC2 and ISO27001
Tools Used:
- Chainguard — Software supply chain security
- bincapz — Container image security
- SLSA Framework — Supply chain security standards
Modern Productivity Stack
Tools and workflows that make development faster and more reliable:
- Neovim / AstroVim — Lua-configured, LSP-powered editing with AI integration
- Kitty Terminal — GPU-accelerated, multiplexed terminal workflow
- Tmux — Session management with workmux for project-based automation
- Zellij — Modern terminal workspace alternative
- Dagger — Programmable CI/CD pipelines (Go SDK)
- container-use — Isolated environments for AI agents and testing
- nerdctl / Podman — Daemonless container engines
- slim — 30x container image size reduction
- Rust-Based Ecosystem — Modern replacements for coreutils, fd, ripgrep, bat, zoxide
- Nushell — Data-focused shell with structured data manipulation
- Homebrew / Nix — Reproducible package management
- btop — Real-time system monitoring
- htop — Process management with GPU metrics
- glances — Web-based system monitoring
- lazydocker — Terminal UI for Docker/containerd
Why This Matters:
"Tools aren't just utilities—they're force multipliers. A well-configured development environment can save 2-3 hours per day through automation, faster feedback loops, and reduced context switching."
While this is my demo repository, create an issue if you would like to connect with me further!
All original code in this repository is released under MIT License. Third-party components may have different licenses — please refer to their respective documentation.
© 2026 — Portfolio demonstrating Rust + AI infrastructure expertise.
