The monolith-to-microservices talk
I gave a talk at Google Cloud Next 2022 titled “Migrating Monolith Applications into Microservices.” The slot was 30 minutes. The title looked tactical; the talk was a careful argument for why most microservices migrations fail. Here is what I’d say differently four years later.
The 30-minute structure
The slot mechanics matter. A 30-minute slot at Next is roughly:
- 4 minutes intro and credibility setup
- 18 minutes content
- 5 minutes Q&A
- 3 minutes reserved for the changeover
Eighteen minutes of content forces ruthless cuts. The first draft was 45 minutes of material. The shipped version had three things:
- A framing of why monolith-to-microservices is hard.
- One worked example of how to do it carefully.
- A short list of “don’t do this” patterns.
Everything else got cut. Architecture debates, vendor comparisons, specific framework choices — all out. The discipline was: every slide had to either change someone’s mind or change what they did on Monday.
The argument
The talk’s core claim: most microservices migrations fail because the team picks the wrong moment to split.
Three common wrong moments:
- Splitting before the monolith hurts. The team has heard that microservices are “better.” They split prematurely. They inherit the operational complexity of microservices without the scaling or team-autonomy benefits.
- Splitting at the wrong boundary. The team splits services along technical lines (database vs API vs background workers) rather than along team / business-capability lines. The result is high-coupling services that always deploy together.
- Splitting while still actively changing the schema. The team splits before the data model has stabilised. Every schema change becomes a cross-service migration; velocity collapses.
The right moment, by contrast: when the monolith is genuinely slowing the team down, when the boundary is obvious (one team owns one capability), and when the schema is stable enough that cross-service contracts can be set.
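One rough way to test whether a boundary is wrong is to look at deploy history: services that nearly always ship in the same release are probably one service in practice. The sketch below is an illustration, not something from the talk. It assumes a hypothetical release log in which each entry is the set of services shipped together, and the function names and the 0.8 threshold are arbitrary choices.

```python
from collections import Counter
from itertools import combinations


def co_deploy_ratios(releases):
    """For each pair of services, return the fraction of releases touching
    either service that touched both (Jaccard similarity over releases)."""
    single = Counter()  # number of releases that include each service
    paired = Counter()  # number of releases that include both services of a pair
    for release in releases:
        services = sorted(set(release))
        single.update(services)
        paired.update(combinations(services, 2))
    ratios = {}
    for (a, b), both in paired.items():
        either = single[a] + single[b] - both
        ratios[(a, b)] = both / either
    return ratios


def suspicious_pairs(releases, threshold=0.8):
    """Pairs whose co-deploy ratio is at least `threshold` (an assumed cutoff)."""
    ratios = co_deploy_ratios(releases)
    return sorted(pair for pair, ratio in ratios.items() if ratio >= threshold)


# Hypothetical release log: each entry is the set of services shipped together.
releases = [
    {"api", "workers"},
    {"api", "workers"},
    {"api", "workers", "search"},
    {"search"},
    {"api", "workers"},
]
print(suspicious_pairs(releases))  # [('api', 'workers')]
```

Here `api` and `workers` ship together in every release that touches either of them, which is the signature of a split along technical rather than business-capability lines.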
The worked example
I walked through a hypothetical migration: a monolithic e-commerce backend, split into checkout / catalog / search / orders.
The detail I emphasised: each split was a separate quarter. Don’t split four services in parallel. Pick one, ship it, learn from it, then do the next. The team’s operational maturity grows with each split; trying to do them all at once means none of them get the attention they need.
I also showed what not to do: a “big bang” migration where everything moves at once. The audience nodded knowingly. Several of them had lived through it.
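One common way to split a single service without a big bang is to put a small router in front of the monolith and move one capability over gradually (often called the strangler-fig approach; this is not necessarily what the talk's slides showed). A minimal sketch, assuming the four capabilities from the example above; the rollout table, the backend names, and the percentages are all hypothetical.

```python
import hashlib

# Hypothetical routing table: capability -> percentage of traffic sent to the
# extracted service. Only checkout has started moving; the rest stay put.
ROLLOUT = {"checkout": 25, "catalog": 0, "search": 0, "orders": 0}


def bucket(user_id: str) -> int:
    """Map a user to a stable bucket in [0, 100) so routing does not flap
    between requests from the same user."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def route(capability: str, user_id: str, rollout=ROLLOUT) -> str:
    """Return the backend that should serve this request: the extracted
    service for users inside the rollout percentage, otherwise the monolith."""
    percent = rollout.get(capability, 0)
    if bucket(user_id) < percent:
        return f"{capability}-service"
    return "monolith"


print(route("catalog", "user-42"))  # monolith: catalog has not been split yet
```

Raising one capability's percentage to 100 over a quarter, then starting the next, keeps each split small enough to learn from before committing to another.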
The “don’t do this” list
Six anti-patterns in five minutes:
- Distributed monolith. Services that always deploy together, share a database, and can’t be independently versioned. The worst of both worlds.
- Shared schema. Multiple services writing to the same database tables. Every schema migration is a cross-team coordination exercise.
- No service-level SLO. Without per-service SLOs, every incident becomes a cross-team blame exercise. With SLOs, the ownership is clear.
- No idempotency keys. Distributed systems without idempotency keys can’t safely retry. Retries become a source of bugs.
- No circuit breakers. One slow downstream takes down every upstream service that calls it.
- No observability investment. Microservices without distributed tracing are unobservable in production.
Each anti-pattern took 50 seconds. The list was the most-quoted part of the talk.
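To make the idempotency-key item concrete, here is a minimal sketch of a handler that is safe to retry. It is a toy, not the talk's example: the `PaymentService` class, its method names, and the in-memory dictionary are assumptions for illustration, and a real service would keep keys in a durable store and guard against concurrent requests.

```python
class PaymentService:
    """Toy in-memory service that deduplicates requests by idempotency key."""

    def __init__(self):
        self._results = {}  # idempotency key -> stored response
        self.charges = []   # side effects actually performed

    def charge(self, idempotency_key: str, order_id: str, amount_cents: int) -> dict:
        # A retried request carries the same key, so return the stored
        # response instead of performing the charge a second time.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges.append((order_id, amount_cents))
        response = {"order_id": order_id, "amount_cents": amount_cents, "status": "charged"}
        self._results[idempotency_key] = response
        return response


service = PaymentService()
first = service.charge("key-123", "order-1", 4999)
retry = service.charge("key-123", "order-1", 4999)  # e.g. the client timed out and retried
print(first == retry, len(service.charges))  # True 1
```

The client generates the key once per logical operation and reuses it on every retry; without it, the second call would charge the customer twice.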
What the Q&A revealed
The five-minute Q&A told me what the audience actually cared about. The questions clustered around three themes:
- “How do we know when we’re ready?” The maturity question. Most teams ask this after they’ve started splitting.
- “How do we get buy-in from leadership?” The political question. Tech is the easy part.
- “What do we do about our existing distributed monolith?” The painful-reality question. Most teams aren’t migrating from a clean monolith; they’re trying to fix a distributed monolith.
I had partial answers for all three. The third one — fixing a distributed monolith — is genuinely harder than starting from a monolith. The talk didn’t cover it; the Q&A did.
What I’d say differently in 2026
Four things changed in the four years since:
- Service meshes matured. Istio and Linkerd are now stable. Some of the cross-cutting concerns the talk recommended you build yourself (retries, circuit breakers, observability) can come from a mesh, which shifts the build-versus-adopt trade-off.
- Cloud Run / serverless changed the math. Many “microservice” workloads are better as Cloud Run services than as Kubernetes pods. The operational tax is lower; the autoscale shape is often a better fit.
- gRPC + protobuf is the default contract. The talk had a slide saying “consider gRPC.” It would now say “use gRPC unless you can’t.”
- The agentic AI / multi-agent shape is the new microservices. Each agent is a microservice with policy wrappers around its inputs and outputs. Most of the microservice patterns transfer; the boundary question changes because agent capabilities are more fluid than business capabilities.
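For a sense of what moving to a mesh takes off your plate, this is roughly the kind of hand-rolled circuit breaker a team would otherwise write and maintain in every service. It is a minimal sketch under stated assumptions: the class name, the thresholds, and the injectable `clock` parameter are illustrative, and it is not thread-safe.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected without being attempted."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to wait before a trial call
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("downstream marked unhealthy; failing fast")
            # Cool-down elapsed: let one trial call through (half-open).
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller wraps each downstream request, for example `breaker.call(fetch_inventory, sku)` with a hypothetical `fetch_inventory` function, and treats `CircuitOpenError` as a signal to fall back or fail fast rather than queue behind a slow dependency.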
If I gave the talk in 2026, the structure would be the same. The specifics would update. The core claim — most migrations fail because teams pick the wrong moment — is still right.
What speaking at Next was actually like
The audience was experienced. The Q&A was sharp. The hallway conversations after were the most valuable part — three conversations with engineers who’d lived through the exact failure modes the talk described, comparing notes.
The talk itself was a forcing function for clarity. I had to know the material well enough to cut 45 minutes to 18 and have the cut version make sense. The discipline made me a better engineer on that topic; the audience benefit was secondary.
For early-career engineers thinking about speaking at conferences: the work is upstream of the talk. The talk crystallises the work. If your work is sound, the talk will be sound. If the work is shaky, the talk will be too — and the audience will know.