Sr Site Reliability Engineer · AWS Certified · Kubernetes Administrator
15+ years ensuring reliability, scalability, and performance across high-traffic e-commerce platforms serving millions of customers. I bridge development, platform engineering, and operations—enabling fast releases without compromising system stability.
I'm a customer-focused SRE professional at Ahold Delhaize USA, supporting high-traffic e-commerce platforms that power some of America's most-loved grocery brands—Stop & Shop, Giant, Food Lion, Hannaford, and Peapod. I specialize in ensuring reliability, scalability, performance, and observability across web and mobile applications backed by GraphQL APIs and containerized microservices running in Azure.
My journey spans Nordstrom, AT&T, and Innova Solutions, and ranges from building CI/CD pipelines and containerizing microservices to standing up full observability stacks and driving incident management. I hold an M.S. in Computer Science from California State University, East Bay and a B.S. in Computer Science from Osmania University. I'm both an AWS Certified DevOps Engineer and a Certified Kubernetes Administrator.
When I'm not on call, I'm exploring canyons, learning new cloud patterns, or mentoring the next generation of engineers.
Site Reliability Engineer specializing in large-scale e-commerce platforms, focused on Kubernetes reliability, observability engineering, AI-driven incident automation, and performance optimization across web and mobile ecosystems. Passionate about reducing MTTR, improving customer experience, and building resilient distributed systems in Azure cloud environments.
A deep look at how I drive reliability across frontend, mobile, and e-commerce platforms.
Concrete outcomes from my focus on reliability, automation, and cross-team collaboration.
Led root cause analysis for major incidents and implemented long-term fixes that reduced Mean Time To Recovery (MTTR) by 20%.
Built system resilience through preventive monitoring and deployment guardrails, reducing critical production incidents by 25%.
Maintained 99.9% uptime SLAs while cutting infrastructure costs across multi-cloud environments.
Improved frontend performance on the pages that affect checkout conversion rates, using targeted real user monitoring (RUM) insights and API optimization.
Led weekly cross-team releases using blue-green and canary patterns, with zero customer-facing downtime during deployments.
Pioneered AI-driven incident triage and observability intelligence, positioning the team for next-generation SRE automation.
15+ years building infrastructure and reliability at scale for leading enterprises.
M.S. in Computer Science, California State University, East Bay, 2015
B.S. in Computer Science, Osmania University, India, 2012
Professional Level
CKA — CNCF