rajesh-agrawal/README.md

👋 Hi, I'm Rajesh Agrawal

🚀 Senior Site Reliability Engineering (SRE) Lead | Cloud Infrastructure | DevOps | Platform Engineering | GitOps | MLOps | Kubernetes

  • 👀 Contributed to enhancing observability and monitoring of mission-critical ISG applications used by 1.5 million connected machines.
  • 📫 How to reach me ...
  • [email protected]

🔧 About Me

I'm a results-driven SRE Lead with nearly two decades of experience building scalable, secure, and reliable systems. From high-volume ingestion platforms to cloud-native microservices, I specialize in driving operational excellence and modernizing infrastructure with automation-first principles.

💡 My expertise lies at the intersection of:

  • 🌐 Cloud Architecture (AWS, Hybrid, On-Prem)
  • ☸️ Kubernetes & Container Orchestration (EKS, ECS, OpenShift)
  • 🔁 GitOps & CI/CD (Helm, ArgoCD, GitHub Actions, Terraform)
  • 📊 Observability (Datadog, ELK, Dashboards, SLO/SLIs)
  • 🧠 MLOps Enablement (Kubeflow, Model Pipelines, GPU workloads)
  • 🔐 Security, Resiliency & Cost Optimization

🛠️ Tech Stack & Tools

| Category | Tools & Technologies |
| --- | --- |
| Cloud | AWS (EC2, S3, RDS, Lambda), Azure (AKS), Hybrid |
| Kubernetes | EKS, ECS, OpenShift, Helm, Kubeflow |
| Automation | Terraform, GitHub Actions, Jenkins, ArgoCD |
| Languages | Python, Golang, Bash |
| Monitoring | Datadog, ELK Stack, Prometheus |
| CI/CD & GitOps | GitHub Actions, ArgoCD, Helm, SOPS |
| MLOps | Jupyter, ML Pipelines, GPU Scheduling, Tolerations |
| Infra as Code | Terraform, Kustomize |
| Security | IAM, Taints/Tolerations, Node Affinity, Compliance |

📌 What I'm Working On

  • 🌍 Designing a multi-region observability pipeline for microservices
  • 🧪 Building an SRE Maturity Metrics Framework for product teams
  • 🔄 Improving GitOps workflows with validation gates and auto-rollbacks
  • 🧠 Exploring MLOps strategies for model lifecycle and GPU job optimization

📚 Recent Interests

  • Kubernetes internals: nodeAffinity, taints, tolerations, scheduling policies
  • GitOps workflows with security-focused automation
  • Scalable observability patterns
  • Cost optimization and failover design in AWS
  • Hands-on CKAD and OpenShift certification prep

🤝 Let's Connect


πŸ” GitHub Stats

Rajesh's GitHub Stats Top Languages


“SRE is not just a job, it's a mindset: blending engineering with empathy, resilience, and continuous learning.”

Pinned

  1. PrometheusClient (Public)

    Java 1

  2. SpringBootReaderWriterDriver (Public)

    Java 1

  3. UITestStrategy (Public)

    TypeScript 1

  4. spring-otel-metrics-push (Public)

    Metrics, logs and traces exporter

    Java