Shehrozkashif/README.md

👋 Hi, I'm Shehroz Kashif

AI Engineer | Software Engineer | LLM & MLOps Researcher
Research Assistant @ Micro Electronics Research Lab (MERL)
LFX’25 Mentee @ RISC-V International

Open-source contributor focused on production-ready AI systems, LLM evaluation, and reproducible ML pipelines.


🚀 About Me

I’m an AI Engineer and Researcher working at the intersection of LLMs, MLOps, and open-source systems.
My work focuses on building reliable, testable, and deployment-ready AI pipelines rather than experimental-only models.

Current Focus

  • 🧠 LLM evaluation & benchmarking (functional, syntactic, adversarial)
  • πŸ›‘οΈ Hallucination mitigation in private LLMs using GAN-based approaches
  • βš™οΈ Reproducible ML pipelines with CI/CD, logging, and SLA-aware validation
  • πŸ“Š RISC-V data & tooling for machine-readable specifications and verification

πŸ’‘ I care deeply about making AI systems trustworthy in production.


🧠 Roles & Affiliations

  • πŸ”Ή Research Assistant β€” Micro Electronics Research Lab (MERL)
    Working on LLM evaluation pipelines, benchmarking frameworks, and RISC-V-related tooling

  • πŸ”Ή LFX’25 Mentee β€” RISC-V International
    Contributing to machine-readable RISC-V specifications, schemas, and CI validation pipelines


🧰 Tech Stack

πŸ”€ Languages

Python Β· Scala Β· Verilog Β· Java Β· Shell Β· JavaScript Β· HTML Β· CSS

🧠 AI / ML

PyTorch Β· TensorFlow Β· Hugging Face Transformers Β· GANs Β· LLM Evaluation
NumPy Β· Pandas Β· Scikit-learn

βš™οΈ MLOps & Engineering

CI/CD Β· Docker Β· REST/gRPC Β· Logging & Monitoring Β· Reproducible Pipelines
Git Β· GitHub Actions Β· Linux Β· pytest

🧾 Data & Config

JSON Β· YAML Β· MySQL


💡 Featured Projects

🛡️ AI4org: GAN-based Hallucination Mitigation for Private LLMs

🔗 https://github.com/merledu/ai4org

  • Built a privacy-first ML pipeline to detect and mitigate hallucinations in private LLMs
  • Designed a GAN-style generator/discriminator for hallucination detection
  • End-to-end pipeline: ingestion → validation → reproducible training → containerized inference
  • Integrated CI/CD, automated testing, and monitoring for production readiness

📌 Designed for enterprise and on-prem LLM deployments where reliability matters.


🔬 ArcheV: LLM Benchmark Suite

🔗 https://github.com/merledu/ArcheV

  • Engineered a reproducible LLM benchmarking framework
  • Standardized JSON I/O and CI-driven evaluation pipelines
  • Validates functional and syntactic correctness to support deployment decisions
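
As a rough illustration of the standardized JSON I/O idea, the sketch below aggregates per-task results into syntactic and functional pass rates. It is not ArcheV's actual schema: every field name (`task`, `model`, `syntax_ok`, `functional_ok`) and the sample records are hypothetical placeholders.

```python
import json

# Hypothetical result records; ArcheV's real JSON layout may differ.
RESULTS_JSON = """
[
  {"task": "alu", "model": "model-a", "syntax_ok": true, "functional_ok": true},
  {"task": "regfile", "model": "model-a", "syntax_ok": true, "functional_ok": false},
  {"task": "decoder", "model": "model-a", "syntax_ok": false, "functional_ok": false}
]
"""


def summarize(records):
    """Return syntactic and functional pass rates over a list of result records."""
    total = len(records)
    if total == 0:
        return {"total": 0, "syntax_pass_rate": 0.0, "functional_pass_rate": 0.0}
    syntax_passes = sum(1 for r in records if r["syntax_ok"])
    # Output that fails the syntax check is never counted as functionally correct.
    functional_passes = sum(
        1 for r in records if r["syntax_ok"] and r["functional_ok"]
    )
    return {
        "total": total,
        "syntax_pass_rate": syntax_passes / total,
        "functional_pass_rate": functional_passes / total,
    }


summary = summarize(json.loads(RESULTS_JSON))
print(json.dumps(summary, indent=2))
```

Keeping the summary itself as JSON means a CI job could compare it against a threshold or a previous run without extra parsing.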

📘 RISC-V Unified Database

🔗 https://github.com/riscv-software-src/riscv-unified-db

  • Maintained versioned YAML/JSON schemas for RISC-V tooling
  • Implemented CI validation to ensure data integrity and observability
  • Improved downstream reliability for tooling and ML pipelines
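
A minimal sketch of what a CI data-integrity check can look like, using only the Python standard library. The required fields (`name`, `long_name`, `description`) and the sample entries are assumptions made for illustration; the real riscv-unified-db schemas and validation tooling are richer than this.

```python
# Hypothetical required fields and their types; not the project's real schema.
REQUIRED_FIELDS = {"name": str, "long_name": str, "description": str}


def validate_entry(entry):
    """Return a list of human-readable problems found in one data entry."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in entry:
            problems.append(f"missing field: {field}")
        elif not isinstance(entry[field], expected_type):
            actual = type(entry[field]).__name__
            problems.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    return problems


# Illustrative entries, as they might look after loading a YAML or JSON file.
good = {"name": "add", "long_name": "Integer add", "description": "Adds two registers."}
bad = {"name": "add", "description": 42}

print(validate_entry(good))
print(validate_entry(bad))
```

In a CI job, a non-empty problem list would typically be turned into a non-zero exit code so that the workflow fails before bad data reaches downstream tools.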

🏆 Highlights & Achievements

  • 🎓 Linux Foundation Mentorship Program (LFX) 2025
  • 🧪 Research Assistant at MERL
  • 📊 Improved LLM benchmarking reliability by ~25%
  • 🧠 Hands-on experience with LLMs, GANs, MLOps, and CI/CD
  • Contributor to open-source and research-grade tooling

📈 GitHub Stats


📫 Connect With Me


⭐ If you find my work useful, feel free to star a repository.
🀝 Open to collaborations in AI, LLMs, MLOps, and open-source systems.

Pinned

  1. riscv-software-src/riscv-unified-db (Public)

    Monorepo containing a machine-readable database of the RISC-V specification and artifact generation tools

    C++ 130 80

  2. Vermithor (Public)

    RISC-V RV32I 5-stage pipelined processor

    Scala

  3. merledu/ArcheV (Public)

    RISC-V RV32I RTL benchmark for evaluating large language models.

    Verilog 3

  4. merledu/ai4org (Public)

    Python 1