You connect to web servers every day, usually without realizing it. When you do, a load balancer (a router or switch of sorts) distributes incoming requests like yours across several servers and keeps you on the same server until your visit is over.
Load balancers are used when a single server can’t handle the number of incoming connections or the background processing they require. They’re designed to increase the number of clients you can serve concurrently and to provide redundancy in case of a server failure.
Even load-balanced servers can run into trouble related to traffic load. I’ve seen many examples of how a server farm can hit performance thresholds because of load balancing setup choices initially made without anticipating the effect of business growth.
As your business grows, you don’t want to suffer from growing pains, service disruptions, or the costs of troubleshooting to figure out how to adapt because the initial design doesn’t allow you to grow beyond inherent capacity.
In a future article, I’ll cover the basics of how the load balancer functions so you can see which basic features to use and when to use them. This isn’t meant to be vendor specific, but I’ll refer to features and functions used on the F5 LTMs because they’re so popular. This isn’t going to be a comprehensive study guide, either. We’ll cover only what you need to understand how planning up front can save you a lot of headaches down the road. There’s too much to cover without turning this into a book, so we’ll focus on widely differing parameters that illustrate the extremes of the choices that can affect performance scalability.
Points to Consider
There are a few things you need to consider when deciding on the setup. Keep these in mind as we step through some of the feature functionality.
- Do all of the servers in any one of your application farms have the same capabilities?
- Do you expect each of your client connections to originate from unique IP addresses?
- Do you expect each of your client connections to originate from a single NAT’d IP?
- Do you have a traffic baseline already established for this application?
In addition, keep in mind that when your gateways make application calls to the APP server tier, or the APP server tier makes calls to the database (DB), the calling server is a client, too. Treat it like any other client. This becomes especially important for the third point above (connections originating from a single NAT’d IP).
(Note: DBs are usually not placed behind load balancers; more often they are clustered using DB-specific clustering functionality. They can still be load-balanced, however.)
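To see why the source-address questions above matter, here is a minimal Python sketch. It assumes the load balancer pins clients to servers by source IP alone, and it uses a simple hash as a stand-in for that mapping. The server names, the addresses, and the helper functions are all hypothetical; this is an illustration of the effect, not how any particular product implements it.

```python
import hashlib
from collections import Counter

# Hypothetical pool members, for illustration only.
SERVERS = ["app1", "app2", "app3"]


def pick_by_source_ip(source_ip, servers):
    """Map a source IP to one server, the same way every time.

    Assumption: a hash of the address stands in for a source-IP
    based persistence decision.
    """
    digest = hashlib.sha256(source_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def distribution(source_ips):
    """Count how many connections land on each server."""
    return Counter(pick_by_source_ip(ip, SERVERS) for ip in source_ips)


# 300 clients, each arriving from its own address.
unique_clients = [f"10.0.{i // 250}.{i % 250 + 1}" for i in range(300)]

# 300 clients that all arrive from one NAT'd address
# (for example, a gateway tier calling the APP tier).
natted_clients = ["203.0.113.7"] * 300

unique_spread = distribution(unique_clients)
nat_spread = distribution(natted_clients)

print("unique source IPs:", dict(unique_spread))
print("single NAT'd IP:  ", dict(nat_spread))
```

With unique addresses the connections spread across the pool. With one NAT'd address, every connection lands on the same server, no matter how many servers you add.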
Load Balancing Design Models
You might have a simple web server, or a web-based application requiring a gateway, application server, and database. The simple web server would be a single level of servers, and the web-facing application might be tiered.
Client 1 — LB — Server1, Server2, Server3, etc.
Client 2 — LB — GW(s) — LB — APP(s) — DB(s)
The simple web server is a collection of servers on one level behind a single LB (or a pair in high-availability mode), all serving the same content, most likely static. They may even run the application code locally and fetch from a common file/application server or local storage. This model might be used within a business for internal applications that employees use daily or periodically. The single-level model is considered scalable only horizontally.
The web-based application would have a web server (GW) behind an LB. Based on the selections you make on a form, the gateway makes calls to the application (APP) server, which sits behind another LB. The application server calls the database to fetch the data you asked for and returns it to the gateway, which returns it to your computer screen. This model would be used for Internet-facing applications where you expect a lot of connections and need the inherent security of your data not being stored on the Internet-exposed server. It is also the model where a lack of scalability hurts your business the most. The multi-tiered model is scalable both horizontally at each level and vertically.
Even so, your choice of load balancing parameters below can render a multi-tiered application unscalable, or at least place limitations on its growth potential.
There are a lot of additional features and functionality in LBs, but I’ll keep it simple and as basic as possible. In this and next week’s article, we’ll be examining the tiered model. In a real-life scenario you’ll have firewalls, NAT, intrusion detection, and packet inspection. A lot of that can be done on the load balancer itself… but we’ll avoid going there.
So your incoming request hits the LB. To decide where it goes, the LB relies on three things: health monitors, which take server health readings to determine which servers can accept your request; a load balancing method, or “next available server” decision-making process (usually referred to as the load balancing algorithm); and a type of persistence, the criteria it uses to ensure your connection stays on that one server.
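The way these three pieces fit together can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how an F5 LTM or any other product works internally: the `LoadBalancer` class, its method names, the round-robin algorithm, and the idea of a generic "client key" for persistence are all hypothetical choices made for illustration.

```python
from itertools import cycle


class LoadBalancer:
    """Toy model: health state + round-robin algorithm + persistence."""

    def __init__(self, servers):
        self.servers = list(servers)
        # Health monitors would update this set; here it is set by hand.
        self.healthy = set(self.servers)
        # Load balancing algorithm: plain round robin (an assumption).
        self._rotation = cycle(self.servers)
        # Persistence table: client key -> server it is stuck to.
        self.persistence = {}

    def mark_down(self, server):
        """Record a failed health check."""
        self.healthy.discard(server)

    def mark_up(self, server):
        """Record a passing health check for a known pool member."""
        if server in self.servers:
            self.healthy.add(server)

    def _next_healthy(self):
        """Advance the rotation until a healthy server is found."""
        for _ in range(len(self.servers)):
            candidate = next(self._rotation)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers in pool")

    def route(self, client_key):
        """Return the server for this client, honoring persistence."""
        server = self.persistence.get(client_key)
        if server is None or server not in self.healthy:
            # New client, or its server failed: ask the algorithm.
            server = self._next_healthy()
            self.persistence[client_key] = server
        return server


lb = LoadBalancer(["server1", "server2", "server3"])
print(lb.route("client-a"))  # first pick from the rotation
print(lb.route("client-b"))  # next pick from the rotation
print(lb.route("client-a"))  # persistence: same server as before
lb.mark_down(lb.route("client-a"))
print(lb.route("client-a"))  # re-balanced to a healthy server
```

Note that the persistence lookup comes first: the algorithm only gets a say when a client has no entry yet or its server is down. What you choose as the client key determines how evenly the algorithm can ever spread the load.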
The most critical of the three is persistence, and this is also the one thing that can foul your scalability more than any other choice. As we continue next week, you’ll see why.