rsync is a program that synchronizes files and directories between local and remote locations. It can interact with multiple operating systems, work over SSH, provide incremental backups, execute commands on a remote machine, and replace the cp and scp commands. The rsync program is an invaluable asset for any system administrator who runs a server or manages a network of computers: it simplifies the process of making backups in general, and it can also serve as the basis of a complete backup solution. For this reason, the purpose of this process is to offer a suitable starting point for a small utility that will quickly become your trusted friend.
To Start With: What Do You Need?
To complete this process, you will need a working installation of the CentOS 7 operating system with root privileges, a console-based text editor of your choice, and a connection to the Internet in order to download additional packages.
The Process
During the course of this process, it will be assumed that you know the location of the source files and directories that you wish to synchronize and that a suitable destination is available:
- To begin this process, log in as root and install rsync by typing:
yum install rsync
- Now, create a target directory for our synchronization (change the folder name appropriately):
mkdir ~/sync-target
- To begin the synchronization process, run the following command, replacing /path/to/source/files/ with the path to your own source directory. Keep the trailing slash on the source path: it tells rsync to copy the contents of the directory rather than the directory itself:
rsync -avz --delete /path/to/source/files/ ~/sync-target
- Having used the Return key to confirm the preceding instruction, your system will now respond with a live report of what is being copied. When this process has finished, you can then compare both directories to see that the contents are exactly the same. To do this, use the diff command (if both are the same, no output will be written):
diff -r /path/to/source/files/ ~/sync-target
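Because --delete removes files from the target, it can be worth previewing a run before committing to it. The following is a minimal sketch rather than part of the steps above: it assumes two throwaway directories created with mktemp (the file names report.txt and old.txt are invented for illustration) and uses rsync's -n (--dry-run) option to list what would change before the real run.

```shell
#!/bin/sh
# Sketch: preview a mirror operation before running it for real.
# Both directories are throwaway examples created with mktemp.
src=$(mktemp -d)
dst=$(mktemp -d)

echo "keep me" > "$src/report.txt"
echo "stale"   > "$dst/old.txt"      # exists only on the target

# -n (--dry-run) reports what would be copied or deleted, but changes nothing.
rsync -avn --delete "$src/" "$dst/"

# The real run: report.txt is copied and old.txt is removed from the target.
rsync -av --delete "$src/" "$dst/"

# diff -r prints nothing and returns 0 when both trees match.
diff -r "$src" "$dst" && echo "directories are identical"
```

If the dry run lists a deletion you did not expect, check the source path and its trailing slash before running the command without -n.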
How it works…
In this process, we considered the use of rsync through the command line. Of course, this is only one of the many ways that this tool can be used, but by using this approach we were able to explore a handful of the features provided by this very valuable utility.
So, what did we learn from this experience?
rsync is not intended to be complicated. It is a fast and efficient file synchronization tool that is designed to be versatile by giving you complete access to an array of features on the command line. It can be used to maintain an exact copy (or mirror) of the source directory on the same machine or on a completely different system, and it does this by copying all the files once and then only updating the files that have changed the next time you run it. This can save a great deal of bandwidth, which makes rsync a strong choice when copying data over the network. The --delete option is important, as it instructs rsync to delete files on the target that do not exist in the source. The remaining flags tell rsync to use -a (archive mode) in order to recursively copy files and directories while keeping permissions and time-based information; -v (verbose mode) so that you can see what is happening; and -z to compress the data during the file transfer in order to save bandwidth and reduce the amount of time required to complete the entire process.
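The "copy everything once, then only what changed" behavior can be observed directly. The sketch below is an illustration under assumed names (throwaway mktemp directories, files a.txt and b.txt); it adds the -i (--itemize-changes) option, which is not used elsewhere in this process, so that each run reports only the items it actually updated.

```shell
#!/bin/sh
# Sketch: watch rsync skip unchanged files on repeat runs.
# Directories and file names are throwaway examples.
src=$(mktemp -d)
dst=$(mktemp -d)
echo "alpha" > "$src/a.txt"
echo "beta"  > "$src/b.txt"

# First run: the target is empty, so both files are copied.
first=$(rsync -ai "$src/" "$dst/")

# Second run: nothing has changed, so no files are transferred.
second=$(rsync -ai "$src/" "$dst/")

# Change one file. Its size grows, so rsync's size-and-time check notices it.
echo "more alpha" >> "$src/a.txt"

# Third run: only the changed file is transferred.
third=$(rsync -ai "$src/" "$dst/")

printf 'first run:\n%s\n\nsecond run:\n%s\n\nthird run:\n%s\n' \
    "$first" "$second" "$third"
```

By default rsync decides whether a file has changed by comparing its size and modification time, which is why the example changes the size of a.txt rather than rewriting it with content of the same length.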
As you can see, rsync is very flexible and has many options that go beyond the purpose of this process, but if you want to exclude certain files you could always extend the original instruction by invoking the --exclude flag. By doing this, you tell rsync to back up an entire directory but ensure that it does not include a predefined pattern of files and folders. For example, if you are copying files from your server to a USB device and you do not want to include large files (such as a .iso image) or ZIP files, then your command may look similar to this:
rsync --delete -avz --exclude="*.zip" --exclude="*.iso" /path/to/source/ /path/to/external/disk/
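To check what an exclude pattern really filters out, it can help to try it on disposable data first. This is a minimal sketch under assumed names: the directories come from mktemp, and backup.zip and centos.iso are small text files standing in for real archives and images.

```shell
#!/bin/sh
# Sketch: confirm which files an --exclude pattern filters out.
# Directory and file names are throwaway examples.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir "$src/docs"
echo "notes"   > "$src/docs/notes.txt"
echo "archive" > "$src/docs/backup.zip"   # stand-in for a real ZIP file
echo "image"   > "$src/centos.iso"        # stand-in for a real ISO image

# A pattern without a slash, such as *.zip, is matched against file names
# at any depth of the source tree.
rsync -a --delete --exclude="*.zip" --exclude="*.iso" "$src/" "$dst/"

# Only docs/notes.txt should be listed on the target.
find "$dst" -type f
```

When --delete and --exclude are combined, excluded files that already exist on the target are left in place; rsync only removes them if --delete-excluded is added as well.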
On a final note, there is the subject of verbosity. Verbose output is very useful, but rsync reports sizes in bytes by default, which can be a source of confusion. To change this, you can invoke rsync with the -h (or human-readable) option, as shown next. Note that an exclude pattern containing a path is matched relative to the source directory (/home/ in this example), so it is written without a leading home/:
rsync -avzh --exclude="path/to/file.txt" /home/ /path/to/external/disk/
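Excluding a single file by path can be tried out in the same way. The sketch below is an illustration only, using throwaway mktemp directories and invented file names; it shows a pattern written relative to the source directory, together with -h for human-readable sizes in the summary.

```shell
#!/bin/sh
# Sketch: exclude one file by its path, with human-readable sizes (-h).
# Directory and file names are throwaway examples.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/path/to"
echo "skip me" > "$src/path/to/file.txt"
echo "copy me" > "$src/path/to/other.txt"

# The pattern is written relative to the source directory, so
# "path/to/file.txt" matches $src/path/to/file.txt.
rsync -avh --exclude="path/to/file.txt" "$src/" "$dst/"

# other.txt is present on the target; file.txt is not.
find "$dst" -type f
```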