Scaling can be a weighty subject…
It’s important to keep your AWS bill at a healthy weight.
You should probably consider reducing your mass-ive infrastructure. (yeah yeah, mass isn’t weight, I get it… geez)
Wow… How many scale puns can this guy come up with?
“A pun is the lowest form of humor—when you don’t think of it first.” ― Oscar Levant
What Exactly is Software Architecture?
To break it down quickly, software architecture is the high-level, abstract design of a software system. It’s important to understand that it’s the design of the entire system: it should take into account the relationships between the elements of the software system as well as the technologies they interact with.
Figure 1: Example of a high-level software architecture [1]
I’m going to break down different architecture methodologies in a future series, but for now this should hold you over.
The Problem of Scale
Scaling, Scaling, Scaling. It has become a buzzword in the industry.
“But is it scalable?” What does this buzzy sentence really mean?
Scaling is all about how you add resources to your software when the system is reaching a soft or hard limit on processing, memory, or communication. Simply put, if your system is hitting a constraint, how do you remove the constraint?
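To make "hitting a constraint" a little more concrete, here is a minimal sketch that reports which resources are at or above a soft limit. The `Utilization` class, the `binding_constraints` function, and the 80% threshold are all hypothetical names and values chosen for illustration, not something a particular monitoring tool provides.

```python
from dataclasses import dataclass


@dataclass
class Utilization:
    """Fraction of capacity in use (0.0 to 1.0) for each resource."""
    cpu: float
    memory: float
    network: float


def binding_constraints(u: Utilization, soft_limit: float = 0.8) -> list:
    """Return the resources at or above the soft limit, most utilized first.

    The 0.8 default is an assumed threshold, not a recommendation.
    """
    usage = {"cpu": u.cpu, "memory": u.memory, "network": u.network}
    over = [name for name, value in usage.items() if value >= soft_limit]
    return sorted(over, key=lambda name: usage[name], reverse=True)


# Memory and network are over the soft limit here; CPU is not.
print(binding_constraints(Utilization(cpu=0.55, memory=0.93, network=0.81)))
```

Whichever resource comes back first is the constraint to remove first, and the right way to remove it depends on the kind of system you are scaling.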
Needing to scale your application is generally a good thing. It means you’re doing something right. What makes scaling difficult are questions about “how” you add the necessary resources, and whether or not it’s stable and reliable.
For any given problem there are a number of possible solutions. Part of being a discerning engineer is knowing which problems require which solutions, and knowing when to stop. Two good examples are 1) choosing between a monolithic and a microservice architecture, and 2) choosing between first normal form and fifth normal form [2] when normalizing a database design. Understanding these principles comes with experience and, honestly, a great deal of trial and error.
Scaling software is not generally cut and dried. It requires a thorough understanding of the problem space and of the desired state of the system you are scaling. For example, you would not scale a database in the same way that you would scale an API endpoint, because they have different requirements and expected functions.
A well-defined architecture and design can help guide the process assuming it’s done properly. However, far too many systems are built through an organic process of “implementation design.” Implementation design is a byproduct of a fail-fast mentality in which a team will make mistakes, re-adjust quickly, and reimplement. This kind of design methodology leads to a volatile architecture of mixed technology and methods, which are generally non-optimal. Again, though, a good architecture is a balancing problem between design and fast failure.
Ultimately, even small efforts in architecture design can reap large rewards right away, with less overall “failure.” So, don’t be intimidated by it! In fact, I would postulate that learning from those who have already failed at something lets you succeed faster than repeating their mistakes yourself. It’s exactly what mentorship is based on. Even in machine learning, transfer learning [3] can be incredibly effective: you don’t have to build a model from scratch when you can start from one that has already been trained. The same holds for architecture design as it relates to scaling.
All that being said, it is easy, like in most things, to go too far with scaling. It is not an all-or-nothing choice, and mixing solutions is likely the best route.
Vertical Scaling
Vertical scaling means adding resources, such as memory or CPUs, to an individual machine so that the software running on it has more capacity to work with.
For example, if your application needs more memory to be able to handle more requests you would add more memory to the server. Adding memory in this way would make that new memory available as soon as the system is back up and running, and the application can take advantage of it.
This all sounds reasonable, right? What’s the problem?
Example Time (Costs) of Vertical Scaling
Although the ability to add resources to an existing system simplifies scaling, there are limits. These limits include available technology and, more importantly, how much money you have to throw at the problem.
In these graphs you can clearly see that, as the memory in an AWS M5 Series EC2 instance increases, the cost of the instance increases at an exponential rate [4].
Vertically scaling using this pricing model is not viable for even the largest enterprises, and, honestly, there is a better and less expensive option. Furthermore, even assuming you could afford to vertically scale, there is still a hard limit to how far you can scale.
For example, the largest M5 Series EC2 instance you can get has 384 GB of memory and 96 CPUs. If you need more than that, you either 1) find a higher-tier, more expensive system that can give you more, or 2) accept that you cannot scale any further. Either way, you will eventually hit an upper limit that you cannot scale past. In fact, even if you could scale memory and CPU infinitely, you would still have network throughput to take into account.
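As a rough illustration of that ceiling, the sketch below checks whether a workload fits on the largest instance quoted above and counts how many times you could double memory before running out of room. The function names are hypothetical, and the 8 GB starting point is borrowed from the smaller instance discussed later in this post.

```python
# Largest M5 Series size quoted in the text above.
MAX_MEMORY_GB = 384
MAX_CPUS = 96


def fits_on_one_machine(memory_gb: float, cpus: int) -> bool:
    """True if the requirement fits within the largest single instance."""
    return memory_gb <= MAX_MEMORY_GB and cpus <= MAX_CPUS


def doublings_until_ceiling(start_memory_gb: float) -> int:
    """Count how many times memory can double and still fit on one machine."""
    count = 0
    memory = start_memory_gb
    while memory * 2 <= MAX_MEMORY_GB:
        memory *= 2
        count += 1
    return count


print(fits_on_one_machine(256, 64))   # within the ceiling
print(fits_on_one_machine(512, 64))   # memory exceeds the ceiling
print(doublings_until_ceiling(8))     # 8 -> 16 -> 32 -> 64 -> 128 -> 256
```

Starting from 8 GB, you get five doublings before the next one no longer fits, which is not many if your load is growing quickly.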
Obviously, after slamming vertical scaling, I should probably present a solution to the computational and monetary barriers I just dumped on you, right?
Horizontal Scaling
Enter “horizontal scaling” to save the day and fix all of our problems, right? Yes and no. Horizontal scaling is the idea that, rather than buying bigger and bigger systems to increase computational power, you add more small systems that communicate with each other. It borrows directly from the move from single-core to multi-core processors: instead of making one core faster, you add more cores.
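Here is a minimal sketch of the idea, assuming the simplest possible distribution strategy (round robin) and hypothetical node names. Real load balancers also handle health checks, retries, and uneven load, none of which is modeled here.

```python
class RoundRobinPool:
    """Hand each incoming request to the next worker in turn."""

    def __init__(self, workers):
        self.workers = list(workers)
        if not self.workers:
            raise ValueError("need at least one worker")
        self._counter = 0

    def add_worker(self, name):
        """Scale out: add one more small system to the pool."""
        self.workers.append(name)

    def route(self, request):
        """Return the worker that should handle this request."""
        worker = self.workers[self._counter % len(self.workers)]
        self._counter += 1
        return worker


pool = RoundRobinPool(["node-1", "node-2"])
print([pool.route(i) for i in range(4)])

# Capacity is added by growing the pool, not by resizing a machine.
pool.add_worker("node-3")
print([pool.route(i) for i in range(3)])
```

The point of the sketch is the `add_worker` call: under horizontal scaling, adding capacity is a change to the pool rather than a change to any one machine.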
Example Time (Costs) of Horizontal Scaling
Using the AWS pricing data [4], let’s compare an m5a.large and an m5dn.24xlarge EC2 instance.

| | m5a.large | m5dn.24xlarge |
|---|---|---|
| Memory | 8 GB | 384 GB |
| CPUs | 2 | 96 |
| Cost per hour | $0.054 | $4.113 |
| Cost per year | $473.04 | $36,029.88 |
Let’s say, for example, that you want access to the same amount of memory and CPUs as a m5dn.24xlarge, but you don’t need to use a single system.
To achieve the same performance (with some caveats), you would need 48 m5a.large systems to get the full 384 GB of memory and 96 CPUs that the m5dn.24xlarge has.
However, if you calculate the price of 48 m5a.large systems, the cost is only $2.592/hr, or $22,705.92/year. So, scaling out by buying multiple m5a.large systems will save you ~37% of the cost of a single m5dn.24xlarge system.
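If you want to check the arithmetic yourself, the sketch below reproduces it from the specs and hourly prices quoted above. The helper name `instances_needed` is my own, and a year is assumed to be 365 days (8,760 hours).

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 hours; leap years ignored

# Specs and hourly prices as quoted in this post.
SMALL = {"name": "m5a.large", "memory_gb": 8, "cpus": 2, "hourly": 0.054}
LARGE = {"name": "m5dn.24xlarge", "memory_gb": 384, "cpus": 96, "hourly": 4.113}


def instances_needed(target, instance):
    """Smallest instance count that covers both memory and CPU of the target."""
    by_memory = math.ceil(target["memory_gb"] / instance["memory_gb"])
    by_cpu = math.ceil(target["cpus"] / instance["cpus"])
    return max(by_memory, by_cpu)


count = instances_needed(LARGE, SMALL)
fleet_hourly = count * SMALL["hourly"]
fleet_yearly = fleet_hourly * HOURS_PER_YEAR
large_yearly = LARGE["hourly"] * HOURS_PER_YEAR
savings = 1 - fleet_hourly / LARGE["hourly"]

print(f"{count} x {SMALL['name']}: ${fleet_hourly:.3f}/hr, ${fleet_yearly:,.2f}/year")
print(f"1 x {LARGE['name']}: ${LARGE['hourly']:.3f}/hr, ${large_yearly:,.2f}/year")
print(f"Savings from scaling out: {savings:.0%}")
```

Both the memory ratio (384 / 8) and the CPU ratio (96 / 2) come out to 48, which is why a single instance count covers both requirements here.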
Ok, that’s great and all, but what’s the catch?
Well, most likely if your application is currently something that can only be vertically scaled, then horizontal scaling doesn’t immediately help you. That would likely require a new design and architecture that is built to handle horizontal scaling. This is not an insurmountable task, although it likely feels that way. Many organizations have made the shift to a horizontally scaling architecture, but it does take proper design and planning to make the move worth it.
Also, there are some caveats to horizontal scaling. Because you’re decreasing your system size and moving to multiple coordinated systems, you are going to be increasing network IO, which has added expense. There is also increased complexity in managing horizontally scaled architectures. So, the numbers are not truly comparable on a one-to-one basis, and the memory and CPU numbers are not directly translatable either since each system will have an OS and other applications running on the system. Despite these stipulations, though, at a base level horizontal scaling is still much more affordable than vertical scaling.
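To get a feel for how much the per-system overhead matters, here is a rough sketch that reserves a fixed amount of memory on every instance for the OS and other processes. The 1 GB overhead is a made-up figure purely for illustration, and network IO and management costs are not modeled at all.

```python
import math

# Specs and hourly prices as quoted in this post.
SMALL_MEMORY_GB, SMALL_HOURLY = 8, 0.054
LARGE_MEMORY_GB, LARGE_HOURLY = 384, 4.113

# Hypothetical: memory each instance loses to the OS and other applications.
OVERHEAD_GB = 1.0


def instances_for_usable_memory(target_usable_gb, instance_memory_gb, overhead_gb):
    """Instances needed once per-instance overhead is subtracted."""
    usable = instance_memory_gb - overhead_gb
    if usable <= 0:
        raise ValueError("overhead consumes the whole instance")
    return math.ceil(target_usable_gb / usable)


# The large instance pays the overhead once; each small instance pays it again.
target_usable_gb = LARGE_MEMORY_GB - OVERHEAD_GB
count = instances_for_usable_memory(target_usable_gb, SMALL_MEMORY_GB, OVERHEAD_GB)
fleet_hourly = count * SMALL_HOURLY
savings = 1 - fleet_hourly / LARGE_HOURLY

print(f"{count} small instances at ${fleet_hourly:.3f}/hr")
print(f"Savings after overhead: {savings:.0%}")
```

Under that assumption you need more than 48 small instances, so the savings shrink, but the fleet still comes out cheaper than the single large instance.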
How do I scale my application?
So, we have discussed why horizontal scaling is beneficial to my scaling problem, but how do I do it?
There are a number of different ways to move from a vertically scaling application to a horizontally scaling one, but each of them falls into the “discerning engineer” category. Your best bet is to take the current architecture of your software and evaluate different areas of it to see where you can take advantage of a horizontal model. Start small, and make incremental improvements.
For the last several years there has been a massive push for microservices. Before that, it was service oriented architecture. And before that, it was monolithic applications.
There always seems to be a cool new technology that becomes popular and ends up being implemented far more often than it should be. Personally, I believe microservices fall into this category. Designing architecture is about understanding the requirements and limitations of a system and building solutions around them. Not everything should be a microservice and, for that matter, not everything should be a monolith. A good design acknowledges that there is no universal, all-inclusive approach to creating a software solution.
My next posts are going to dive into monolithic / microservice architectures and different ways to scale both.