mukund1985/README.md

Mukund Pandey

Staff ML Engineer · Meta

Building agentic AI systems, LLM eval infrastructure, and XAI pipelines at billion-user scale. Current focus: making AI systems that can explain themselves, fail safely, and be trusted in production.


What I work on

  • Agentic AI — multi-step agent architectures, tool-use, planning, and safety guardrails at scale
  • LLM Eval Infrastructure — consistency, hallucination detection, factual grounding, response drift
  • LLM Inference Infrastructure — high-throughput model serving, torch.compile optimisation, KV cache efficiency, production latency SLAs
  • MLOps & Observability — drift detection, model monitoring, evaluation pipelines, contributor to evidentlyai/evidently
  • Explainable AI (XAI) — decision explainability hooks, counterfactual reasoning, causal attribution
  • Security ML — real-time risk scoring, access intelligence, anomaly detection at billion-event scale

Research

Evaluating Agentic AI in the Wild: Failure Modes, Drift Patterns, and a Production Evaluation Framework

Identifies seven failure modes in production agentic AI systems and introduces PAEF (Production Agentic Evaluation Framework), validated in four controlled experiments. Reference implementation: llm-eval-toolkit.


Open source

My repos

| Repo | Description |
| --- | --- |
| llm-eval-toolkit | Production-grade framework for evaluating LLM agent outputs: consistency, grounding, hallucination, drift |
| agentic-safety-patterns | Pattern library for safe agentic systems: circuit breakers, explainability hooks, rollback, audit logging |
| retrieval-ranking-eval | Dense retrieval + cross-encoder reranking pipeline benchmarked on BEIR datasets (NDCG@K, Recall@K, MRR) |
| QuantumAI-IntradayRiskDemo | Intraday risk pipeline: LSTM volatility forecasting + quantum-inspired QUBO/D-Wave portfolio optimisation |

Upstream contributions

| Repo | Contribution |
| --- | --- |
| evidentlyai/evidently | Merged PR #1318: ROUGE score descriptor (rouge1/2/L, F/P/R variants, 737 lines, 31 tests) |
| evidentlyai/evidently | Merged PR: KL-divergence drift score bug fix |
| vllm-project/vllm | Open PR #41381: torch.compile config hash typing cleanups + cache_key_factors debug expansion |

Stack

Python · PyTorch · HuggingFace · scikit-learn · LangChain

MLflow · FastAPI · Ray

Apache Spark · Kubernetes · AWS · GCP · Azure


Find me

LinkedIn · Medium · GitHub


