A Caution Against Microservices

Here's a story about Steve, the CTO of a Series-B startup called SaaSCo. It's a story I've heard a dozen times before. To me, this story is now a fable, a cautionary tale.

SaaSCo is six years old, heavily capitalized, and perpetually "6 months away from profitability". They have a decent product that's marketed well. Steve has overseen the growth of the product and engineering team from 3 to 30. The product lives on a monolith written in something mature and boring, like PHP, Python, or Ruby.

SaaSCo's product has a huge footprint. The important parts are mature and work well. There's a good deal of technical debt in the codebase. Ever since the front-end shifted to React a few years ago, big parts of the legacy codebase have been gradually deprecated.

The engineering team seems antsy. Deadlines are rarely hit. The product is just too big and complex. Changes often have unintended consequences. Somebody's always refactoring something, and internal documentation is essentially worthless.

Steve has heard the microservices success stories, and so has the engineering team. They're eager to try it. Many of them feel that the Python backend is boring, and they're itching to work with new technologies.

Steve decides to start the long and arduous process of converting the monolith to microservices. It'll be a big undertaking, but he figures it'll all be worth it because of the increased velocity the team will have afterwards.

So SaaSCo engineering starts methodically breaking things out into microservices. First they remove image processing and replace it with an IMagick-based microservice. Then comes the SpamAssassin service, and then the stats-processor service. This all goes pretty smoothly. Those services were already on the fringes of the monolith, and separated easily. Note that the team is now supporting four service deployments: the three new services plus the monolith.

Next they try breaking authentication out into a microservice. This is tougher and scarier than planned, but it works out. Encouraged, they start breaking out more core services into microservices.

But then they hit a wall. They quickly discover that the microservices are all talking to each other constantly, and have hidden dependencies nobody noticed during planning. The legacy monolith is still running, albeit in a crippled mode, and has become a sort of service router for the app. Steve decides to pump the brakes on the aggressive microservice-ization in order to stabilize a bit. The solution is to build shared support libraries that help microservices interface with one another. There is also now a microservices style guide that specifies which languages and libraries should be used, so that most of the team can support most of the services. This, of course, upsets the engineering team, because a few of them really wanted to use Elixir in one of the services.

Steve now has a lot more problems than he had with the monolith, which, by the way, never fully goes away. Alongside what's left of it, Steve now has a dozen different services built by different people using a patchwork of different technologies, libraries, and API conventions.

The development velocity does not actually improve after shifting to microservices. Years go by in this state of microservices purgatory, and then with SaaSCo v4.0 the team proudly announces that they're going back to a monolithic model, except now with a new shiny framework!

I think the moral of the story goes something like this: Steve didn't have a services problem, he had a complexity problem. Moving to microservices does not necessarily solve complexity problems. Instead of focusing on the "DevOps" layer and topology, Steve should have started with an even more abstract view of what the system components are and where the complexities lie. The simple truth is that complex things are hard, and most successful business platforms are complex. And it turns out that teams of engineers are not typically experts at highly complex system design, so most will reach for solutions that sound right but don't actually address the root problem.

I'm not claiming that microservices aren't useful or a solid design pattern. I'm just saying that I've seen a lot of CTOs reach for the microservices hammer when they really needed a more subtle tool. In most cases I've observed personally, a much better approach is to reorganize your monolith into a shared-library model. Monoliths by themselves aren't evil. Microservices by themselves aren't good.
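To make "shared-library model" a little more concrete, here's a minimal sketch of what I have in mind. Everything in it is hypothetical and invented for illustration (the `Authenticator` interface, `fit_within`, `ThumbnailEndpoint`); it is not SaaSCo's code or anyone's real API. The idea it tries to show is that each piece of the monolith exposes a small, explicit interface, and the application composes those pieces with ordinary in-process calls instead of network hops.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol, Tuple


# --- hypothetical "auth" library: a narrow, explicit interface ---------
class Authenticator(Protocol):
    def user_for_token(self, token: str) -> Optional[str]: ...


@dataclass
class InMemoryAuthenticator:
    """Toy implementation; a real one would hit a database or cache."""
    tokens: Dict[str, str]  # token -> username

    def user_for_token(self, token: str) -> Optional[str]:
        return self.tokens.get(token)


# --- hypothetical "images" library --------------------------------------
def fit_within(width: int, height: int, max_side: int) -> Tuple[int, int]:
    """Scale dimensions down proportionally so neither side exceeds max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


# --- application code: depends only on the interfaces above -------------
@dataclass
class ThumbnailEndpoint:
    auth: Authenticator
    max_side: int = 200

    def handle(self, token: str, width: int, height: int) -> dict:
        # A plain function call, not an HTTP request to another service.
        user = self.auth.user_for_token(token)
        if user is None:
            return {"status": 401}
        return {
            "status": 200,
            "user": user,
            "size": fit_within(width, height, self.max_side),
        }
```

Calling `ThumbnailEndpoint(auth=InMemoryAuthenticator({"t1": "steve"})).handle("t1", 4000, 2000)` exercises both libraries in a single process. The boundaries are just as explicit as they would be between services, but there is still only one thing to deploy, and if one of these libraries ever really does need to become a service, the interface is already drawn.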

I would go even further and offer this rule of thumb: one should not convert a mature monolith to microservices. The move from monolith to microservices, for a sufficiently complex platform, is very expensive and fraught with peril. It's a large jump: microservices are not just one step away from a monolith. The more complex a platform is, the more inertia it has, and the tougher it'll be to move it that large distance to a new platform. Moving straight to microservices in this case is like trying to stop a boulder in its path, whereas a subtle refactoring to a shared-library model is more like nudging the boulder so that it doesn't crash into the village below.

By the numbers: in my career I've personally observed maybe 20 examples of CTOs choosing microservices. Of the fresh, green-field platforms that chose microservices, 8 of 10 were successful. However, of the mature platforms that decided to rearchitect to microservices, I've seen only one company actually pull it off to completion. In other words, I've observed a 90% failure rate among large monoliths attempting a complete move to microservices. The rest have ended up in a state of purgatory, or have rolled back their efforts.
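For anyone who wants the arithmetic spelled out, here's a quick sketch. The one assumption is that the roughly 20 examples split evenly, 10 green-field and 10 mature; the green-field count comes from the text, while the mature count is inferred from the 90% figure. These are personal tallies, not a study.

```python
def failure_pct(total: int, successes: int) -> int:
    """Whole-number failure percentage, using integer arithmetic only."""
    return 100 * (total - successes) // total


# Tallies from the text; mature_total = 10 is an assumption implied
# by "one success" combined with "90% failure rate".
greenfield_total, greenfield_successes = 10, 8
mature_total, mature_successes = 10, 1

greenfield_failure = failure_pct(greenfield_total, greenfield_successes)  # 20
mature_failure = failure_pct(mature_total, mature_successes)              # 90
```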

So this is my caution against microservices: if Steve's story feels all-too-familiar to you, please think twice before trying to step in front of the boulder.

Burak Kanber, Engineer

Author, tech CTO, engineer, and some other things too.
