I specialize in AI evaluation, LLM infrastructure, and quality assurance for production-grade systems. Currently focused on end-to-end RAG development, automated evaluation pipelines, and multi-modal model testing to ensure reliable, high-precision AI outcomes.
- RAGAS Evaluation Framework Integration: integrated automated evaluation across end-to-end RAG systems.
- Multi-modal AI Testing: built visual evaluation datasets and benchmarking workflows for VLMs.
- Automated Evaluation Pipelines: designed modular "evals" framework for one-click benchmarking and insights.
- Speech & Voice AI QA: fine-tuned Whisper for ASR optimization and production-grade voice agents.
From: 21 April 2026 - To: 21 May 2026
Total Time: 175 hrs 1 min
Python     66 hrs 11 mins   35.49 %
JSON       41 hrs 2 mins    22.01 %
Markdown   25 hrs 31 mins   13.69 %
Text       14 hrs 46 mins   07.92 %
Other      11 hrs 26 mins   06.14 %
Bash       8 hrs 20 mins    04.48 %
Jinja2     4 hrs 35 mins    02.47 %
CSV        3 hrs 56 mins    02.11 %
YAML       2 hrs 21 mins    01.26 %
TOML       2 hrs 10 mins    01.17 %
