Using HAProxy for load balancing FileCloud Servers
Although starting your own personal cloud server with FileCloud is pretty easy, scaling that server to meet higher demand is a far bigger concern. A larger, more powerful server helps up to a point, but there comes a time when traffic is so heavy that a single dedicated server cannot handle the load. In such cases, you need to distribute traffic across several servers in a smart way. This idea is known as load balancing.
We can implement load balancing at two levels: first for the servers that run FileCloud, and second for the database, which in this case is MongoDB. The two cases are illustrated below.
To demonstrate this, we need three virtual machines: one running FileCloud normally with the database on the same system (System A), one running FileCloud but connecting to the database on System A for storage (System B), and one running HAProxy as a load balancer (System C), which routes each request to either System A or System B. It is important to understand that, because both FileCloud instances use the same database, the response you get is the same no matter which system (A or B) handles the request.
Setting Up Systems A and B:
The first step is to install FileCloud normally on System A. You can follow the guide here to install it.
Next, install FileCloud the same way on System B, but point its database to the one on System A. The admin documentation explains which file to edit and what to change.
Installing and configuring HAProxy:
The third part is usually the toughest, since you need to set up a load balancer. We use HAProxy in our case, running on an Ubuntu server. Install HAProxy by running:
sudo apt-get install haproxy
To enable it, edit /etc/default/haproxy and set the value of ENABLED to 1. Next, edit the configuration file, which is located at /etc/haproxy/haproxy.cfg. Although the HAProxy documentation is self-explanatory, it doesn’t provide a sample working config file, which might be an issue. A sample is shown here:
defaults
    mode http
    timeout connect 5000
    timeout client 50000
    timeout server 50000

listen appname
    bind 0.0.0.0:80
    stats uri /haproxy?stats
    stats realm Strictly\ Private
    stats auth A_Username:<password>
    stats auth Another_User:<password>
    server serverA <server_ip>:80 check
    server serverB <server_ip>:80 check
Here is a brief explanation of the options.
- maxconn: maximum number of concurrent connections. The default value is 2000.
- user, group: the user and group that HAProxy runs as.
- retries: number of retries on connection failure.
- timeout: maximum time to wait, in milliseconds unless a unit is given. timeout connect limits how long HAProxy waits for a connection attempt to a server to succeed, while timeout client and timeout server limit inactivity on the client and server sides. Here the client and server timeouts have the same value.
- option redispatch: redispatches a connection to a live server if the one it was assigned to is down. The alternative is option persist, which forces requests to be sent to servers that are down.
- We listen on port 80 because we are setting this up for HTTP requests only.
- stats: enables the statistics page; a demo can be viewed on the official site. You can view the stats for your system at xxx.xxx.xxx.xxx/haproxy?stats
- balance: the load balancing algorithm to use. Options include roundrobin, static-rr, leastconn, source, uri and url_param.
- server: declares a backend server. The check option enables health checks on that server.
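The sample config leaves out several of the options just described (maxconn, user, group, retries, option redispatch and balance). As a rough sketch only, the shell snippet below writes a fuller config to a scratch file and, if HAProxy is installed, asks it to check the file with haproxy -c. The scratch path, the backend addresses 10.0.0.1 and 10.0.0.2, the password change_me, and the values chosen for retries and balance are assumptions for illustration, not values from a real FileCloud setup.

```shell
# Sketch: write a fuller sample HAProxy config to a scratch file.
# Assumed values: the scratch path, the backend IPs, the stats password,
# and the retries/balance settings. Replace them with your own.
CFG="${TMPDIR:-/tmp}/haproxy-sample.cfg"

cat > "$CFG" <<'EOF'
global
    maxconn 2000
    user haproxy
    group haproxy

defaults
    mode http
    retries 3
    option redispatch
    timeout connect 5000
    timeout client 50000
    timeout server 50000

listen appname
    bind 0.0.0.0:80
    stats uri /haproxy?stats
    stats realm Strictly\ Private
    stats auth A_Username:change_me
    balance roundrobin
    server serverA 10.0.0.1:80 check
    server serverB 10.0.0.2:80 check
EOF

# haproxy -c only parses and checks the file; it does not start the proxy.
if command -v haproxy >/dev/null 2>&1; then
    haproxy -c -f "$CFG" || echo "config check failed for $CFG" >&2
else
    echo "haproxy not installed; wrote $CFG without validating it"
fi
```

Once the file passes the check and the placeholder values are replaced, it can be copied over /etc/haproxy/haproxy.cfg.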
Once you have saved the configuration file, start HAProxy by running service haproxy start.
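The enable, start and check steps can be sketched in shell as follows. The helper names enable_haproxy and probe_balancer are hypothetical, and the snippet assumes that curl is installed and that your release's /etc/default/haproxy uses the ENABLED flag. It only defines functions, so nothing changes until you call them.

```shell
# Hypothetical helper: set ENABLED=1 in a defaults file such as
# /etc/default/haproxy, appending the line if it is not there yet.
enable_haproxy() {
    defaults_file="$1"
    if grep -q '^ENABLED=' "$defaults_file"; then
        # Rewrite through a temp file, then copy the content back so the
        # original file keeps its owner and permissions.
        sed 's/^ENABLED=.*/ENABLED=1/' "$defaults_file" > "$defaults_file.tmp" &&
            cat "$defaults_file.tmp" > "$defaults_file" &&
            rm -f "$defaults_file.tmp"
    else
        echo 'ENABLED=1' >> "$defaults_file"
    fi
}

# Hypothetical helper: send COUNT requests to URL and print each HTTP
# status code. curl reports 000 when it gets no HTTP response at all.
probe_balancer() {
    url="$1"
    count="${2:-4}"
    i=1
    while [ "$i" -le "$count" ]; do
        code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$url")
        echo "request $i: $code"
        i=$((i + 1))
    done
}

# Example, run as root on System C:
#   enable_haproxy /etc/default/haproxy
#   service haproxy start
#   probe_balancer http://localhost/ 4
```

Status codes alone do not show which backend answered. The stats page at /haproxy?stats lists session counts per server, so after a few requests both serverA and serverB should show traffic.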
FileCloud uses MongoDB for its database storage, and MongoDB supports replication and sharding out of the box. Here is a great tutorial on scaling MongoDB.
We hope that this post helped you understand the basic implementation of load balancing for FileCloud. If you have any issues, feel free to leave them in the comments below.