
Debu Sinha

Principal ML Engineer @ Databricks | Author | IEEE Senior Member

I build production-grade ML and GenAI systems, with a focus on deployment, evaluation, and reliability. My contributions include implementing core evaluation capabilities in MLflow and building tooling used in enterprise AI systems.


Technical Contributions

MLflow Core Contributor

MLflow is one of the most widely adopted open-source ML lifecycle platforms globally (23K+ stars, 18M+ monthly PyPI downloads).

Production GenAI Evaluation Capabilities

Implemented 3 of 5 third-party scorer integrations for MLflow's GenAI evaluation framework. Each contribution was independently reviewed, approved, and merged by senior MLflow maintainers:

| Integration | PR | Maintainer Review | Production Impact |
| --- | --- | --- | --- |
| Phoenix (Arize) | #19473 | Reviewed by @smoorjani, @B-Step62 | Hallucination detection, relevance scoring |
| TruLens | #19492 | Reviewed by @smoorjani, @B-Step62 | Groundedness, context relevance, agent evaluation |
| Guardrails AI | #20038 | Reviewed by @smoorjani | Safety validators (toxicity, PII, jailbreak detection) |

These capabilities are now available to ML practitioners and enterprises worldwide.

Infrastructure & Deployment Contributions

| PR | Capability | Status | Impact |
| --- | --- | --- | --- |
| #19152 | LLM Judge inference parameters | ✅ Merged | Temperature, top_p control for evaluation |
| #19248 | Configurable scorer parallelism | ✅ Merged | MLFLOW_GENAI_EVAL_MAX_SCORER_WORKERS |
| #20344 | UV package manager integration | 🔄 In Review | Automatic dependency inference |
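
The scorer-parallelism setting listed above is exposed as an environment variable, so one way to apply it is to export it before an evaluation run starts. The following is a minimal sketch, assuming the variable is read as an integer worker count; the helper name `set_scorer_workers` and the value 4 are illustrative and not part of MLflow.

```python
import os

# Variable name taken from the contributions table; how MLflow interprets
# the value (assumed here: a positive integer worker count) is an assumption.
ENV_VAR = "MLFLOW_GENAI_EVAL_MAX_SCORER_WORKERS"


def set_scorer_workers(count: int) -> str:
    """Hypothetical helper: validate the worker count and export it."""
    if count < 1:
        raise ValueError("worker count must be a positive integer")
    os.environ[ENV_VAR] = str(count)
    return os.environ[ENV_VAR]


# Set the variable in this process before starting an evaluation run.
set_scorer_workers(4)
```

Setting the variable in the shell before launching Python would have the same effect; the helper only adds a basic sanity check.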

UV Integration Design Doc


Cross-Project Contributions

Bug fixes independently reviewed and merged by external maintainers:

| Repository | PR | Issue Resolved |
| --- | --- | --- |
| truera/trulens | #2328 | Instrumentation crash on non-callable objects |
| truera/trulens | #2308 | Databricks structured outputs compatibility |

Open contributions under review:

| Repository | PR | Proposed Fix |
| --- | --- | --- |
| langchain-ai/langgraph | #6547 | Type signature for conditional edges |
| langchain-ai/langgraph | #6544 | functools.partial handling in ToolNode |

MLflow-Modal Deployment Plugin

Production deployment capability enabling MLflow models to run on serverless GPU infrastructure:

pip install mlflow-modal-deploy
mlflow deployments create -t modal -m models:/my-model/1 --name my-deployment
  • Auto-scaling from zero to thousands of GPUs (T4 → H200)
  • Sub-second cold starts
  • Native MLflow deployment interface
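
For scripted deployments, the same CLI call can be assembled programmatically. This is a minimal sketch that only builds the argument list from the flags shown above; the helper name `build_create_command` is hypothetical, and actually running the command assumes `mlflow` and the plugin are installed.

```python
import shlex


def build_create_command(model_uri: str, name: str, target: str = "modal") -> list[str]:
    """Hypothetical helper: mirror the `mlflow deployments create` flags shown above."""
    return [
        "mlflow", "deployments", "create",
        "-t", target,
        "-m", model_uri,
        "--name", name,
    ]


cmd = build_create_command("models:/my-model/1", "my-deployment")
print(shlex.join(cmd))

# To execute it (requires mlflow and mlflow-modal-deploy to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues when model URIs or deployment names come from variables.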

GitHub | PyPI


Publications

Book

Practical Machine Learning on Databricks
Packt Publishing, 2023 | 244 pages

End-to-end guide for building production ML systems. Best-seller in category.


Research Papers

Research Affiliate, Johns Hopkins University

| Paper | Focus | Venue |
| --- | --- | --- |
| The Semantic Illusion | Hallucination detection failure modes | arXiv:2512.15068 |
| Demystifying Large Language Models | LLM architecture survey | IJCET |
| Reinforcement Learning for Real-World Impact | RL applications | IJSRCET |
| AI in Healthcare | Clinical ML pipelines | IRJMETS |

Speaking

| Event | Topic |
| --- | --- |
| TechFutures 2025 (NYC) | End-to-End MLOps Workshop |
| Data Con LA 2022 | Databricks Feature Store |
| Data Con LA 2021 | Fraud Detection at Scale |
| NYU Guest Lecture | ML Pipelines with Apache Spark |

Professional

IEEE Senior Member: recognition for significant contributions to the profession (requires 10+ years of experience and documented achievements)


Contact

Pinned

  1. Databricks-GenAI-Series (Public)

    All the resources related to GenAI hands on workshop.

    Python 24 48

  2. PacktPublishing/Practical-Machine-Learning-on-Databricks (Public)

    Practical Machine Learning on Databricks, published by packt

    Python 22 38

  3. mlflow (Public)

    Forked from mlflow/mlflow

    The open source developer platform to build AI agents and models with confidence. Enhance your AI applications with end-to-end tracking, observability, and evaluations, all in one integrated platform.

    Python

  4. mlflow-modal-deploy (Public)

    MLflow deployment plugin for Modal serverless GPU infrastructure

    Python 2

  5. cross-region-model-serving-dab (Public)

    Production-ready Databricks Asset Bundle for cross-region ML model serving using Delta Sharing. Deploy models and feature tables across workspaces with zero-copy data access and automated online fe…

    Python 1