Applied ML researcher/engineer. I build scalable LLM systems, agentic AI, and robotics, bridging research → production.
- LLM Pre-training & Fine-tuning: Distributed pipelines (DeepSpeed, FSDP, SMP, LoRA/PEFT, CPU offload) for 7B–70B models; cut training time from 57h to 5.6h across multi-GPU/multi-node setups, benchmarked and scaled for production.
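As a hedged illustration of why LoRA (mentioned above) shrinks the fine-tuning footprint: instead of updating a full d×k weight matrix, only two low-rank factors B (d×r) and A (r×k) are trained. The 4096×4096 projection and rank 16 below are hypothetical example values, not figures from the work itself:

```python
def lora_trainable_params(d: int, k: int, r: int) -> int:
    """Trainable parameters for a LoRA update dW = B @ A,
    with B of shape (d, r) and A of shape (r, k)."""
    return d * r + r * k

# Hypothetical 4096x4096 attention projection at rank 16:
full = 4096 * 4096                            # full fine-tune: 16,777,216 params
lora = lora_trainable_params(4096, 4096, 16)  # LoRA: 131,072 params
assert full // lora == 128                    # ~0.8% of the full matrix
```

The same arithmetic is why LoRA adapters are cheap to store and swap at serving time.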
- Inference Optimization: Triton + TensorRT + vLLM with fused attention, multi-LoRA adapters, and quantization → 70% latency reduction at 80+ QPS in production-grade deployments.
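A minimal sketch of one technique named above, symmetric per-tensor int8 quantization, in pure Python with illustrative weights (not tied to any specific deployment here); real systems would use TensorRT or vLLM kernels for this:

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization:
    map floats in [-amax, amax] onto integers in [-127, 127]."""
    amax = max(abs(w) for w in weights)
    scale = amax / 127 if amax else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 codes."""
    return [v * scale for v in q]

weights = [0.42, -1.3, 0.07, 0.9, -0.55]   # hypothetical values
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
# round-trip error is bounded by half a quantization step
assert all(abs(w - r) <= scale / 2 + 1e-12 for w, r in zip(weights, recovered))
```

Shrinking weights to 8-bit codes cuts memory traffic, which is where much of the latency reduction comes from.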
- Embodied & Agentic AI: Modular exoskeleton (LLM + vision + speech), demoed at ICML 2025. Multi-agent orchestration for GenAI campaigns and a designer assistant with RAG, query rewriting, and hallucination detection.
- Mechanistic Interpretability: Crosscoders (sparse autoencoders) to probe LLM instruction tuning; open-source pipeline on Hugging Face.
- NP-hard / HPC Projects: Brick Maestro, a Lego assembly optimizer built on HPC with AWS ParallelCluster; presented at AWS re:Invent and the AWS Paris Summit.
- Foundations: Deep learning for face anti-spoofing (thesis), TA at NIT Rourkela, algorithm optimization (Karatsuba + quad Itoh–Tsujii) at DRDO.
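For the Karatsuba part of the DRDO work above, a minimal sketch of the classic divide-and-conquer trick (the quad Itoh–Tsujii inversion is omitted; the actual optimized implementation is not shown here):

```python
def karatsuba(x: int, y: int) -> int:
    """Multiply x and y with Karatsuba's method:
    3 recursive multiplications instead of 4."""
    if x < 10 or y < 10:  # base case: small operand
        return x * y
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask
    a = karatsuba(x_hi, y_hi)                # high parts
    b = karatsuba(x_lo, y_lo)                # low parts
    c = karatsuba(x_hi + x_lo, y_hi + y_lo)  # cross term
    # (x_hi*2^h + x_lo) * (y_hi*2^h + y_lo)
    return (a << (2 * half)) + ((c - a - b) << half) + b
```

The recursion reduces multiplication from O(n^2) to roughly O(n^1.585) in the operand size, which is why it pays off for the large-integer arithmetic used in cryptographic field operations.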
- Training: PyTorch, DeepSpeed, FSDP, SMP, LoRA/PEFT, Multi-GPU/Node
- Inference: TensorRT-LLM, vLLM, SGLang, LoRA adapters, Quantization & Distillation
- Infra: AWS (SageMaker, EKS, HyperPod, ParallelCluster), Docker, Prometheus, Grafana
- Other: HPC, distributed LLM scaling, agentic AI