About

Most infrastructure content falls into one of two traps: either it's so simplified that it's useless in production, or it's so dense that only the author understands it.

ScaleDeep exists because I got tired of both.

The Short Version

I'm Raju Gupta. I've spent the last 13+ years building and operating infrastructure at companies like Meesho, Razorpay, and Plivo—environments where "a bad day" means millions of failed requests, not a slow loading spinner.

At Meesho, I lead infrastructure for systems handling millions of requests per second across thousands of Kubernetes deployables. This includes the ML infrastructure that powers recommendations and search ranking at scale. Before that, I built payment infrastructure at Razorpay during its hypergrowth phase.

My work sits at the intersection of DevOps, SRE, and MLOps—because in modern systems, you can't separate the model from the infrastructure that serves it. I hold CKA, RHCE, and RHCSS certifications, but more importantly, I've been on-call when things break at 3 AM.

This isn't a humblebrag. It's context: the problems you face at 100 RPS are fundamentally different from the problems at 1M RPS, and most content out there doesn't acknowledge that gap.

ScaleDeep is my attempt to bridge it.

Why This Exists

Here's what I've noticed: engineers at smaller companies often can't find practical guidance for scaling challenges because most "scale" content comes from FAANG engineers solving FAANG-specific problems. Meanwhile, engineers at larger companies are often too busy firefighting to write about what they've learned.

The same problem is even worse in MLOps. Most ML content focuses on model architecture and training, while the unglamorous work of deploying, monitoring, and maintaining models in production gets ignored. But that's where most ML projects actually fail.

I wanted to create something different—content that's:

Deep enough for senior engineers who've seen some things and want to see more
Accessible enough for newcomers who want to understand why, not just how
Practical enough to implement without a team of 50 and a $10M infrastructure budget
Honest about trade-offs because every architecture decision is a bet

What You'll Find Here

Systems Deep Dives — Thorough breakdowns of how things actually work under the hood. Kubernetes scheduler internals. eBPF-based observability. Infrastructure architecture. The stuff that's hard to find explained well.

Hyperscale Patterns — Patterns that only emerge (or matter) at massive scale. Rate limiting strategies that don't fall apart at 1M RPS. Cell-based architectures. Consistent hashing in practice.

MLOps & AI Infrastructure — The operational side of machine learning that nobody talks about. Data pipeline drift detection. Feature store architectures. Model serving at scale. How to build ML systems that don't page you at 2 AM because predictions started drifting silently.

Practical Playbooks — Step-by-step guides you can actually implement. Building drift detection systems. SLO frameworks that work. Incident response automation that doesn't suck.

Mental Models — Frameworks for thinking about systems. The Universal Scalability Law. CAP theorem in practice (not theory). How to reason about failure modes before they happen.

What You Won't Find

AI-generated filler that says nothing
Fabricated "war stories" for engagement
Vendor-sponsored content pretending to be neutral
Oversimplified explanations that would get you fired if you followed them
Gatekeeping jargon that exists to make the author feel smart
ML content that ignores the infrastructure reality

The Name

ScaleDeep — it's not subtle, but it's accurate. Scale is the context I work in. Depth is what I try to bring. The tagline writes itself: Where Scale Meets Depth.

Get In Touch

I'm always interested in hearing from engineers dealing with interesting scaling challenges—whether that's infrastructure, MLOps, or the messy intersection of both. Especially the problems that don't have clean solutions.

Find me on LinkedIn or check out my personal site at rajugupta.me.

If something I wrote helped you solve a problem, or if you spotted an error that would embarrass me in a code review, I'd genuinely like to know.