```python
class AIEngineer:
    def __init__(self):
        self.name = "Shashibhushan Singh"
        self.role = "AI Engineer — LLM + Backend"
        self.location = "India 🇮🇳"
        self.experience = ["AWS Cloud Engineer @ Amazon", "AI/LLM Engineer"]
        self.focus = ["LLM Applications", "RAG Systems", "AI Agents", "Backend APIs"]
        self.currently = "Building production-grade LLM backends"
        self.learning = ["Fine-tuning (LoRA/QLoRA)", "Multi-agent systems", "MLOps"]
        self.ask_me = ["LangChain", "FastAPI", "Vector DBs", "AWS", "Prompt Engineering"]
        self.contact = "itshashi.io@gmail.com"

    def __str__(self):
        return "From cloud infrastructure to intelligent AI systems 🚀"
```

Designed, deployed, and maintained scalable cloud infrastructure on AWS
- Architected multi-region, highly available systems on EC2, ECS, Lambda, and EKS
- Built automated CI/CD pipelines using CodePipeline, CodeBuild, and GitHub Actions
- Managed data infrastructure with S3, RDS, DynamoDB, and Redshift
- Implemented infrastructure-as-code with Terraform and AWS CloudFormation
- Set up observability stacks using CloudWatch, X-Ray, and Grafana
- Enforced security best practices: IAM roles, VPC design, KMS, GuardDuty
- Reduced infrastructure costs by optimizing Reserved Instances and Auto Scaling policies
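The cost-optimization point above can be made concrete with a tiny back-of-the-envelope helper. The hourly rates in the example are illustrative only, not official AWS pricing:

```python
def reserved_savings(on_demand_hourly: float, reserved_hourly: float,
                     hours_per_month: int = 730) -> dict:
    """Compare the monthly cost of an always-on on-demand instance vs. a
    Reserved Instance, and report the percentage saved."""
    on_demand = on_demand_hourly * hours_per_month
    reserved = reserved_hourly * hours_per_month
    return {
        "on_demand_monthly": round(on_demand, 2),
        "reserved_monthly": round(reserved, 2),
        "savings_pct": round(100 * (on_demand - reserved) / on_demand, 1),
    }

# Illustrative (made-up) rates for a single instance:
print(reserved_savings(on_demand_hourly=0.096, reserved_hourly=0.060))
```

The same arithmetic, applied fleet-wide, is what makes Reserved Instance planning worth automating.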
Building production AI systems powered by large language models
- Designing RAG pipelines over enterprise document corpora with retrieval + reranking
- Building LLM agents with tool use, persistent memory, and multi-step reasoning
- Shipping async FastAPI backends with streaming (SSE/WebSocket) for real-time AI
- Integrating OpenAI, Anthropic Claude, and open-source models (Mistral, LLaMA)
- Implementing LLM observability with LangSmith and Langfuse
- Exploring LoRA/QLoRA fine-tuning for domain-specific model adaptation
📄 RAG Pipelines → Ingest docs · Chunk · Embed · Retrieve · Rerank · Answer
🤖 AI Agents → Tool use · ReAct loops · Multi-agent orchestration
⚡ LLM APIs → FastAPI · Streaming (SSE) · Async · Auth · Rate limiting
☁️ Cloud Infrastructure → AWS multi-region · IaC · Auto-scaling · Cost optimization
🧪 Eval Systems → RAGAS · LLM-as-judge · Regression testing
🔐 Safe AI → Guardrails · Prompt injection defense · Output validation
📦 Fine-tuning → LoRA · QLoRA · PEFT · Instruction datasets
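The RAG flow above (chunk → embed → retrieve) fits in a few lines once you stub the embedding step. The bag-of-words "embedding" below is a toy stand-in; a real pipeline would call an embedding model and a vector store such as Qdrant:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; real pipelines use a learned embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by similarity to the query and return the top-k for the LLM."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "Qdrant is a vector database for similarity search.",
    "FastAPI is an async Python web framework.",
    "Reranking reorders retrieved chunks before answering.",
]
print(retrieve("which vector database to use", chunks, k=1))
```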
End-to-end RAG pipeline over enterprise documents
- PDF/HTML ingestion → semantic chunking → Qdrant vector store
- Hybrid search (BM25 + dense) + Cohere reranking
- Streaming FastAPI backend with LangSmith tracing
- Deployed on AWS ECS + S3 with CloudWatch monitoring
FastAPI · LangChain · Qdrant · OpenAI · Docker · AWS
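Hybrid search needs a way to merge the BM25 and dense rankings into one list. Reciprocal Rank Fusion is one common choice, shown here as an illustration (not necessarily the exact fusion this project uses):

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: merge several ranked lists (e.g. BM25 and dense
    retrieval) by scoring each doc as the sum of 1 / (k + rank) across lists."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_ranking = ["doc_a", "doc_b", "doc_c"]   # lexical hits
dense_ranking = ["doc_b", "doc_c", "doc_a"]  # semantic hits
print(rrf([bm25_ranking, dense_ranking]))
```

A doc that ranks decently in both lists beats one that tops only a single list, which is exactly the behavior you want before handing candidates to a reranker.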
Multi-tool LLM agent with persistent memory
- ReAct loop with web search, code execution, and DB query tools
- Long-term memory via Redis + embeddings
- Streamed responses over WebSocket
- Serverless deployment on AWS Lambda + API Gateway
LangGraph · FastAPI · Redis · OpenAI · PostgreSQL · AWS
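The ReAct loop at the heart of an agent like this is small: the model alternates Thought → Action → Observation until it emits a final answer. In the sketch below `fake_llm` is a stub standing in for a real model call, and the calculator is a toy tool:

```python
def calculator(expr: str) -> str:
    # Toy tool; never eval untrusted input in production.
    return str(eval(expr, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def fake_llm(transcript: str) -> str:
    """Stub model: requests the calculator once, then answers."""
    if "Observation:" not in transcript:
        return "Action: calculator[2 + 3]"
    return "Final Answer: 5"

def react(question: str, max_steps: int = 4) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = fake_llm(transcript)
        if step.startswith("Final Answer:"):
            return step.removeprefix("Final Answer:").strip()
        # Parse "Action: tool[arg]", run the tool, feed the result back in.
        tool, arg = step.removeprefix("Action: ").rstrip("]").split("[", 1)
        observation = TOOLS[tool](arg)
        transcript += f"\n{step}\nObservation: {observation}"
    return "gave up"

print(react("What is 2 + 3?"))
```

Swapping `fake_llm` for a real model call and adding search/DB tools gives the structure the project describes; LangGraph formalizes the same loop as a state graph.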
Unified API gateway across OpenAI, Anthropic, and open-source models
- Smart routing by cost, latency, and capability
- Semantic caching to cut costs by ~40%
- Per-user rate limiting and API key management
- Infra managed via Terraform on AWS ECS + ElastiCache
LiteLLM · FastAPI · Redis · Docker · Terraform · AWS
AWS cost analysis and auto-optimization tool (from Amazon experience)
- Automated detection of idle EC2, oversized RDS, and unused Elastic IPs
- Scheduled Lambda jobs to snapshot and right-size instances
- Savings report dashboard via CloudWatch + QuickSight
Python · AWS Lambda · Boto3 · CloudFormation · CloudWatch
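The idle-detection step boils down to a threshold over sampled CPU utilization. The sketch below shows only that classification logic; the real tool would first pull the datapoints from CloudWatch via Boto3, and the instance IDs and numbers here are made up:

```python
def find_idle_instances(cpu_stats: dict[str, list[float]],
                        threshold_pct: float = 5.0) -> list[str]:
    """Flag instances whose average CPU over the sampled window stays below threshold."""
    idle = []
    for instance_id, datapoints in cpu_stats.items():
        if datapoints and sum(datapoints) / len(datapoints) < threshold_pct:
            idle.append(instance_id)
    return idle

# Hypothetical sampled CPU averages (percent) over the lookback window:
stats = {
    "i-0abc": [1.2, 0.8, 2.1],     # idle candidate
    "i-0def": [45.0, 60.2, 38.7],  # busy, leave alone
}
print(find_idle_instances(stats))
```

Flagged instances then feed the scheduled Lambda jobs that snapshot and right-size them.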
✅ AWS Cloud Engineer @ Amazon
└─ EC2 · ECS · Lambda · EKS · S3 · RDS · DynamoDB
└─ Terraform · CloudFormation · IAM · VPC · CloudWatch
✅ LLM Foundations
└─ Transformers · Attention · Tokenization · Embeddings
✅ LLM APIs & Prompt Engineering
└─ OpenAI · Anthropic · Few-shot · CoT · Structured outputs
✅ RAG Systems
└─ Chunking · Vector DBs · LangChain · LlamaIndex · Reranking
✅ FastAPI AI Backends
└─ Async · Streaming · WebSockets · Pydantic · Auth
🔄 Fine-tuning (LoRA / QLoRA) ← currently here
🔄 Multi-agent systems (LangGraph, CrewAI)
⬜ LLM inference optimization (vLLM, TensorRT-LLM)
⬜ MLOps at scale (BentoML, Kubeflow)
⬜ Multimodal models (vision + language)
I'm open to collaborating on LLM projects, AI tooling, cloud architecture, and backend systems.
Building something interesting? Let's talk.
"Cloud gives you the scale. AI gives you the intelligence. Together — that's the future."


