RAID is a system that protects against data loss, for example by mirroring data across two storage devices. In my opinion it should be part of a server's data backup strategy.
I have a lot of experience with setting up software RAID1 arrays. In most cases I use Btrfs to mirror the devices of a physical RAID0 cluster. A repository which shows an example of how I use a software RAID1 can be found here:
In this article I give an overview of one way to manage backups of dedicated root servers. It is not meant to guide you through every step; the goal is to present a concept.
Servers
One of my backup setups consists of two servers running Arch Linux. Both servers use a similar configuration: a LUKS-encrypted virtual RAID1. The production server in the scheme is called arch-server1. All applications run on it, most of them in Docker containers. For this reason a script creates a local backup of all Docker volumes. This backup is later pulled by the server arch-server2, which in this scheme is only responsible for backing up the data of the production server. Therefore a less powerful server is sufficient for this role.
Root dedicated server backup scheme
RAID
As mentioned above, both servers use a RAID1. This guarantees that a copy of the data still exists on one drive if the other one fails. In general I prefer software RAID with Btrfs because I'm used to this system. If you want a performance advantage, you can also implement a RAID10 setup. For my use cases RAID1 is enough, so I didn't need to dive deep into the other RAID configurations.
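For illustration, a Btrfs RAID1 on top of two LUKS-encrypted devices can be created roughly like this (a minimal sketch, not my exact setup; the device and mapper names are placeholders):

# Encrypt both devices and open them (device names are placeholders).
cryptsetup luksFormat /dev/sda2
cryptsetup luksFormat /dev/sdb2
cryptsetup open /dev/sda2 crypt0
cryptsetup open /dev/sdb2 crypt1

# Create a Btrfs filesystem that mirrors data and metadata across both devices.
mkfs.btrfs -d raid1 -m raid1 /dev/mapper/crypt0 /dev/mapper/crypt1
mount /dev/mapper/crypt0 /mnt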
Folder Structure
I use a specific backup folder structure which is the same on all servers. This structure allows me to easily find the right backup and to synchronize it across multiple server instances. The structure I use is the following:
/Backups/
This folder contains all backups.

/Backups/{{machine}}/
The sub-folder {{machine}} is a unique ID per machine, which makes it possible to back up data from multiple instances without conflicts. The hashing with sha256sum is done for security reasons.

/Backups/{{machine}}/{{script}}/
The sub-folder {{script}} under {{machine}} is the name of the script which created the backup. This avoids conflicts if different scripts are used to back up data.

/Backups/{{machine}}/{{script}}/latest
This folder contains the latest state of the backup.

/Backups/{{machine}}/{{script}}/deleted/{{time}}
This folder contains the files which have been deleted.
Backup folder structure
The folder name {{machine}} is generated by the following shell expression:
$(sha256sum /etc/machine-id | head -c 64)
The folder name {{time}} is generated by the following shell expression:
"$(date '+%Y%m%d%H%M%S')"
Local Backup
I manage all configuration files of the production server with Ansible, so I don't need to back them up. The only data I need to back up is the data stored in the Docker volumes. I do this with the script docker-volume-update. The role native-docker-volume-backup is responsible for installing the script on the server and for installing a systemd timer which calls the script daily in the time window between 3:00 am and 4:00 am UTC.
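As a rough sketch, a systemd timer for such a one-hour window could look like the following (the unit names and the installation path of the script are assumptions, not necessarily what the role installs):

# docker-volume-update.timer (name assumed)
[Unit]
Description=Daily local backup of all Docker volumes

[Timer]
# Start at 3:00 UTC and spread the actual run randomly within one hour.
OnCalendar=*-*-* 03:00:00 UTC
RandomizedDelaySec=1h
Persistent=true

[Install]
WantedBy=timers.target

# docker-volume-update.service (called by the timer; script path assumed)
[Service]
Type=oneshot
ExecStart=/usr/local/bin/docker-volume-update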
Remote Backup
The remote backup server pulls the local backup with the script pull-remote-backup. The role native-pull-remote-backups is responsible for installing the script on the server and for installing a systemd timer which calls the script daily in the time window between 10:00 pm and 11:00 pm UTC.
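A minimal sketch of such a pull with rsync could look like this (hostname, paths and the use of rsync are assumptions; the actual pull-remote-backup script may work differently):

remote="arch-server1"
machine="$(ssh "${remote}" 'sha256sum /etc/machine-id | head -c 64')"
script="docker-volume-update"
timestamp="$(date '+%Y%m%d%H%M%S')"

# Mirror the latest backup from the production server; files removed there
# are moved into a timestamped "deleted" folder, matching the structure above.
rsync -a --delete \
  --backup --backup-dir="/Backups/${machine}/${script}/deleted/${timestamp}" \
  "${remote}:/Backups/${machine}/${script}/latest/" \
  "/Backups/${machine}/${script}/latest/"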
Disconnected Backup
To reduce the risk of data loss it is also recommended to store backups on an offline storage medium.