Why would we need to set up Rsync? Sometimes we need to keep files in sync between two or more servers. In particular, if you are building a distributed platform that load-balances across multiple servers, uploading new or modified files to each server separately is not practical. Linux/Unix provides a very useful tool called Rsync, which copies only the new or changed files to the destination.
Rsync is a fast and handy command-line utility that synchronizes files and directories between two locations through a remote shell, or from or to a remote Rsync daemon. It provides fast incremental file transfer by transferring only the differences between the source and the destination. Rsync can be used within a single server to back up files to a different directory. It can also be used across multiple servers to copy new or modified files from one server to another.
To set up Rsync, first check whether it’s installed, and install it if it isn’t. The commands below check for it and install it.
$ rsync --version
rsync version 3.1.2 protocol version 31
If you see a version line like the one above, it’s already installed. Otherwise, run the command that matches your system.
For Debian-based systems:
$ sudo apt install rsync -y
For Red Hat-based systems:
$ sudo yum install rsync
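The check and the install can also be combined into one small function. This is only a minimal sketch: it assumes the system has either apt or yum and that the user may run sudo, and the name ensure_rsync is hypothetical.

```shell
#!/bin/sh
# ensure_rsync (hypothetical helper): install rsync only when it is missing.
# Assumes either apt or yum is available and that sudo works for this user.
ensure_rsync() {
    if command -v rsync >/dev/null 2>&1; then
        echo "rsync is already installed"
        return 0
    fi
    if command -v apt >/dev/null 2>&1; then
        sudo apt install rsync -y      # Debian-based systems
    elif command -v yum >/dev/null 2>&1; then
        sudo yum install rsync -y      # Red Hat-based systems
    else
        echo "No supported package manager found" >&2
        return 1
    fi
}
```

Calling ensure_rsync at the top of a provisioning script makes the step safe to repeat on every server.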
Once Rsync is installed, log in to the destination server over SSH as “root”. If you want to sync from Server A to Server B, log in to Server B first.
For the sake of examples, I’m taking two arbitrary IPs for the servers.
Server A: 184.108.40.206
Server B: 220.127.116.11
Log in to Server B as “root” and switch to the user that owns the destination directory. Then generate an SSH key pair for that user:
$ su username
$ cd ~/
$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/username/.ssh/id_rsa): [press enter]
Enter passphrase (empty for no passphrase): [press enter, no passphrase]
Enter same passphrase again: [press enter]
Make sure to leave the passphrase empty while generating the keys, so that Rsync can later run without prompting. The private/public key pair has now been generated for the user. The public key needs to be copied over to the source server, so print it with the following command and keep it available for later use.
$ cat ~/.ssh/id_rsa.pub
Now log in to the source server, Server A, as root and switch to the user that owns the source directory (i.e. su username). Then open that user’s authorized_keys file:
$ vi ~/.ssh/authorized_keys
Paste the previously copied key on a new line at the end of the file and save it. Now go back to the destination server, Server B, and run the following command to check that key-based authentication has been set up properly.
$ ssh username@184.108.40.206
Here 184.108.40.206 is the IP of the source server. It should show you something like:
Last login: Tue Apr 14 11:46:47 2015 from IP 18.104.22.168
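The manual edit of authorized_keys on Server A can also be scripted. The sketch below uses a hypothetical helper, add_authorized_key, that appends a public key only if it is not already present and then tightens the permissions, since sshd with its default StrictModes setting may ignore key files that other users can write to. The helper name and its optional second argument are illustrative assumptions.

```shell
#!/bin/sh
# add_authorized_key (hypothetical helper): append a public key to
# authorized_keys once, then restrict permissions on the directory and file.
add_authorized_key() {
    key="$1"                        # the full public key line
    ssh_dir="${2:-$HOME/.ssh}"      # defaults to the current user's ~/.ssh
    auth_file="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir"
    touch "$auth_file"

    # If the file does not end with a newline, add one so that the new key
    # starts on its own line.
    if [ -n "$(tail -c 1 "$auth_file")" ]; then
        echo >> "$auth_file"
    fi

    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -qxF -e "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi

    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
}
```

Run as the source-directory user on Server A, a call such as add_authorized_key "ssh-rsa AAAA... username@server-b" would replace the vi step. If password login is enabled on Server A, running ssh-copy-id from Server B is another common way to install the key.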
Set up Rsync:
Now exit from the source server so that you are back on the destination server. Run the following rsync command to check that Rsync is working properly.
$ rsync -avzhe ssh username@184.108.40.206:/absolute/path/to/the/directory/at/source/ /absolute/path/to/the/directory/at/destination/
If you are wondering what the options after rsync do, here is a list of some of the useful ones:
- -a / --archive: Archive mode, which copies files and directories recursively and preserves symbolic links, file and folder permissions, user and group ownership, and timestamps
- -v: Verbose output
- -z / --compress: Compress the data during transfer
- -h: Show the numbers in the output in a human-readable format
- -e: Specify the remote shell to use, e.g. ssh
- -r: Copy data recursively; unlike -a, it doesn’t preserve timestamps and permissions
- -l: Copy symlinks as symlinks during the sync
- -u: Skip files that are newer at the destination than at the source
- --delete: Delete files that exist at the destination but not at the source
- --exclude: Exclude specific files or directories from the sync, e.g. --exclude=node_modules --exclude=vendor
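Because --delete removes files at the destination, it can help to preview a sync before running it. The sketch below is one way to do that, using rsync’s --dry-run option; the helper name preview_then_sync is hypothetical, and the excluded directory is only an example.

```shell
#!/bin/sh
# preview_then_sync (hypothetical helper): show what rsync would change
# with --dry-run, then run the real transfer with the same options.
preview_then_sync() {
    src="$1"     # source directory; the trailing slash added below copies its contents
    dest="$2"    # destination directory

    # Shared options: archive mode, verbose, human-readable sizes, delete
    # files that are missing from the source, and skip node_modules.
    set -- -avh --delete --exclude=node_modules

    echo "Dry run:"
    rsync --dry-run "$@" "$src/" "$dest/"
    echo "Real run:"
    rsync "$@" "$src/" "$dest/"
}
```

For example, preview_then_sync /path/to/source /path/to/destination prints the planned changes first and then applies them. The same options work when the source is a remote path such as the one used above.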
Note: If SSH on the source server is listening on a port other than the default port 22, you can specify the port by passing it to ssh inside the -e option:
$ rsync -avzhe "ssh -p 2222" username@184.108.40.206:/absolute/path/source/directory/ /absolute/path/destination/directory/
Now set up the Rsync command as a cron job that runs every 5 minutes on Server B.
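A cron setup along these lines might look like the sketch below. The script name sync.sh, the log file path, and the lock directory are assumptions made for illustration. The lock is there because a transfer that takes longer than 5 minutes would otherwise overlap with the next scheduled run.

```shell
#!/bin/sh
# sync.sh (hypothetical name), meant to be started by cron on Server B.
# Crontab entry for the destination user (add with `crontab -e`), every 5 minutes:
#   */5 * * * * /home/username/sync.sh sync >> /home/username/rsync-cron.log 2>&1

SRC="username@184.108.40.206:/absolute/path/to/the/directory/at/source/"
DEST="/absolute/path/to/the/directory/at/destination/"
LOCK_DIR="${LOCK_DIR:-/tmp/rsync-sync.lock}"   # assumed lock location

# Run a command unless another run still holds the lock. mkdir is atomic,
# so only one process can create the lock directory at a time.
run_locked() {
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "previous sync still running, skipping this run"
        return 0
    fi
    "$@"
    status=$?
    rmdir "$LOCK_DIR"
    return "$status"
}

# Only start the transfer when called as "sync.sh sync".
if [ "${1:-}" = "sync" ]; then
    run_locked rsync -az -e ssh "$SRC" "$DEST"
fi
```

If the script is killed in the middle of a run, the lock directory stays behind and has to be removed by hand. On Linux, flock from util-linux is an alternative that releases the lock automatically.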
Many hosting providers supply servers with one public IP and one private IP, especially if you have opted for the dedicated server model. The public IP works over the internet. The private IP works over the internal network, but only if the server nodes are in the same network zone. Use the private IP if possible; otherwise, copying the files will consume public bandwidth.
Follow the same steps for all the destination servers, like Server-C, Server-D, etc., if you have more than two nodes. You can even chain the copy rather than copying everything from one node, because that might put additional load on the source node. For example, you can set up Rsync from Server-A to Server-B, Server-B to Server-C, and so on. However, file updates will reach the later nodes after some delay, as it’s a chained copy.