It all started when a friend of mine complained about how the load balancing algorithm he was using wasn't delivering the outcomes he was expecting. Because of his somewhat unique use case I was very intrigued, so I decided to help him solve the issue. Not long after, I found myself researching the literature on all the load balancing algorithms I could find. After a while, I decided to build a this simulation to better evaluate the algorithms based on:
Each algorithm serves a particular use case on a different side of the spectrum. Which should cater to most applications, ranging from CDNs to CRUD applications that keep their state in a centralized database.
Algorithm | Load Distribution | Cache Friendliness | Guarantees |
---|---|---|---|
Round-Robin | Uniform load distribution, if the requests take the same time. | Not friendly, because it assigns work to hosts randomly. | All hosts will get the same number of requests. |
Two Random Choices | Max load of any server is O(log log number_of_servers) . |
Not friendly, because it assigns work to hosts randomly. | Host's max load will be O(log log number_of_servers) , with high probability. |
Consistent Hashing | Load distribution is skewed, due to the random slicing of the ring. | Very friendly, because it'll always assign the same key to the same host, unless the topology changes. | When the topology changes, only 1/number_of_server will be assigned to different hosts. |
Bounded Consistent Hashing | Max Load of a host will never exceed ceil(average_load*scale_factor) . |
Very Friendly, unless a host load reaches its max, then it'll distribute requests coming to it to other hosts. | Adding or removing a host changes location of 1/(e^2) where e is scale_factor-1 - e in our case is 0.25-. |
The simulation is derived from a day's worth of anonymized real traffic logs donated from a major Online Classifieds site in the Middle East, which makes the study more meaningful than a randomly generated traffic.
before explaining the simulation let's talk about its Components:
Now that we got this out of the way, let's see how the simulation works: