Securely Back Up Your PostgreSQL Cluster to an Encrypted Archive
Data security is paramount. When you need to back up or migrate an entire PostgreSQL cluster, you're handling sensitive information that requires robust protection. This guide will walk you through the best way to export all your databases, roles (users), and global objects to a single, encrypted file that only the root user can access. 🔐
Why Encrypt Your Backup?
A full cluster dump contains everything: your data, your application logic (functions, procedures), and user credentials.
- Encryption protects this data from being read by anyone who might gain unauthorized access to the backup file itself.
- Root-Only Permissions add another layer of security on the server, ensuring only the system's most privileged user can manage this critical archive.
An End-To-End Encrypted and Secured Pipeline
We are going to assemble a small script that handles the steps required to export our database cluster as securely as possible. At no point in this pipeline does unencrypted data touch the disk or become exposed to unauthorized access.
Is This the Best / Only Way To Do This?
Absolutely not. Each command shown below can be swapped out for another option depending on your preferences or security requirements. The real takeaway here is a workflow that handles your backups without ever writing unencrypted data to disk, even temporarily. I offer the example below for guidance.
How Does This Scale?
Depending on the size of your databases and the cluster as a whole, you may have to alter your approach to backing up your data, but the pipeline below is still useful.
pg_dumpall creates an SQL script that will recreate all global objects and databases, restoring your cluster to the state it was in when it was backed up. If you have particularly sizable databases in your cluster, a single plain-text dump of everything can become unwieldy to create, store, and restore.
pg_dump is similar to pg_dumpall but it only backs up a single database. No global objects are included.
pg_basebackup will do a backup of the data files for your cluster as they exist on disk.
Depending on your scenario and needs, you may need to use a combination of the above commands, but you can still use the pipeline below to secure their output, although you will have to set up a loop to handle the multiple files that pg_basebackup outputs.
If you switch to using pg_dump to divide your backups into separate files, don't forget to also use pg_dumpall --globals-only to grab your global objects, as sketched below.
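A rough sketch of that split approach, reusing the encryption stage described below (the psql query simply lists every non-template database; treat the whole loop as illustrative, not as part of the final script):
for db in $(psql -U postgres -Atc "SELECT datname FROM pg_database WHERE NOT datistemplate"); do
  pg_dump -U postgres "$db" | openssl enc -aes-256-cbc -salt -pbkdf2 -pass file:/[your key file path]/.backup.key > "$db.sql.enc"
done
pg_dumpall -U postgres --globals-only | openssl enc -aes-256-cbc -salt -pbkdf2 -pass file:/[your key file path]/.backup.key > globals.sql.enc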
TL;DR
Here is the final script for those who do not want to read through the entire write-up below:
echo "Exporting Postgres DBs"
tmpfile=$(mktemp)
pg_dumpall -U postgres | openssl enc -aes-256-cbc -salt -pbkdf2 -pass file:/[your key file path]/.backup.key > "$tmpfile"
install -m 600 "$tmpfile" /[your backup path]/your_encrypted_archive.sql.enc
rm "$tmpfile"
echo "Postgres DB Export Complete."
Prerequisites
- Root or sudo access to your Linux server.
- A running PostgreSQL cluster.
- Familiarity with the command line.
Here is a step-by-step walkthrough of what this script is doing:
Step 1: Establish a Secure Temporary File
For this we use mktemp. It's designed to create a temporary file or folder with a random string name to prevent collisions. The items created have permissions configured so that only the owner may read or write to the file/folder, ensuring that the data is secure during this step. On top of that level of security, we're also going to encrypt the data before it touches this temp file in step 4.
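You can check this for yourself (assuming GNU coreutils, whose mktemp creates files with mode 600):
tmpfile=$(mktemp)
ls -l "$tmpfile"   # -rw------- : only the owner can read or write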
Step 2: Dump the Entire PostgreSQL Cluster
For PostgreSQL, the best tool for this job is pg_dumpall. It's designed to dump the entire cluster, including all databases as well as global objects like roles and tablespaces, into a single script file. This is much simpler than handling databases and users separately.
For security and permissions reasons, it's best practice to run this command as the postgres system user.
If you want your archive to clear out existing databases, roles, and tablespaces during the recovery process, insert the --clean option into your pg_dumpall command above. More on that below.
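With that option in place, the dump line from the TL;DR script becomes:
pg_dumpall -U postgres --clean | openssl enc -aes-256-cbc -salt -pbkdf2 -pass file:/[your key file path]/.backup.key > "$tmpfile"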
Step 3 (optional): Compress DB Dump Data
I don't have it in this script, but if you wanted to compress your db dump, you would add a pipe just after the pg_dumpall command. A stream compressor such as gzip (or xz or zstd) is the right tool for this spot; tar on its own only bundles files and can't compress a raw stream. The compressed output is then handed off to openssl for encryption.
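A minimal sketch of that variant, assuming gzip is available (note the extra .gz in the destination name so the restore steps know to decompress):
pg_dumpall -U postgres | gzip | openssl enc -aes-256-cbc -salt -pbkdf2 -pass file:/[your key file path]/.backup.key > "$tmpfile"
install -m 600 "$tmpfile" /[your backup path]/your_encrypted_archive.sql.gz.enc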
Step 4: Encrypt the DB Dump Before It Hits the Disk 🛡️
You will see above that pg_dumpall is piped into openssl. This portion encrypts the pg_dumpall output using the algorithms and features that we desire. Which options to use is beyond the scope of this post; I may create another post on this topic in the future. There are other options for encryption, such as gpg. The main point is that you encrypt your backup before sending it to disk, and that you understand the steps required to decrypt it in the event that you ever need to do a restore.
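As an illustration of the gpg route, a rough equivalent of the openssl stage might look like this (a sketch only; --pinentry-mode loopback behavior varies by GnuPG version, so verify on your system first):
pg_dumpall -U postgres | gpg --batch --pinentry-mode loopback --passphrase-file /[your key file path]/.backup.key --symmetric --cipher-algo AES256 -o - > "$tmpfile"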
One important detail here is the use of a .key file to store the encryption passphrase. This file must have the right permissions assigned, or else all of the work you are putting into securing your data will be wasted. I have this file set to be readable only by the account used to run backups. As a hint for those who need it: you would use chmod and chown to set these permissions, like so.
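For example, if root is the account that runs your backups:
chown root:root /[your key file path]/.backup.key
chmod 600 /[your key file path]/.backup.key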
Note that the output of this encryption process lands in the temporary file that was created by mktemp. So at this point you have a temporary file that is encrypted and sits in a secure location where only the owner running the script has access to it.
Step 5: Securely Move the Data to its Destination
We use install instead of a simple mv because it copies the file and sets its permissions (-m 600) in a single operation, so the archive never sits at its destination with looser permissions than we intend (the rm afterward completes the "move").
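If the script runs as root, install can also pin the owner and group in that same step (-o and -g are standard install flags):
install -o root -g root -m 600 "$tmpfile" /[your backup path]/your_encrypted_archive.sql.enc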
Note that the final file has an extension of .sql.enc. This should help others understand that they are looking at an encrypted SQL backup.
Step 6: Finalize and Clean Up
Here we simply delete the temporary file and echo a message that the process completed.
How to Restore (Verification)
So you've messed up your database and/or cluster and you need to do a restore. Well, fortunately you've kept your backups safe and you documented your restore procedure. Right? ;-)
You'll need to decrypt your encrypted backup to a secure folder that only the owner has read/write access to (you may want to use sudo to do this as root):
openssl enc -d -aes-256-cbc -pbkdf2 -pass file:/[your key file path]/.backup.key -in [your_encrypted_archive.sql.enc] -out [your_decrypted_archive.sql]
If you inserted a compression command per step 3, you would decrypt to a .sql.gz file instead and then have to decompress your archive. With the gzip variant shown there, that looks like this:
gunzip your_decrypted_archive.sql.gz
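As an aside, the same mktemp from step 1 can hand you that owner-only folder (GNU mktemp -d creates directories with mode 700):
restore_dir=$(mktemp -d)   # drwx------ : only the owner can enter it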
I like to open the decrypted file to make sure that it looks like an SQL backup. You could use nano or vi for this, but head is the better choice, as your archive is likely a large file:
head -n 100 your_decrypted_archive.sql
Finally, import the archive to restore your cluster.
WARNING!
There are a few things to consider here:
1. If you included the --clean option for the pg_dumpall command, the following import will attempt to drop and recreate all databases in the target cluster before restoring the data contained within the archive.
2. If you omitted the --clean option in your pg_dumpall command, the following import will attempt to insert the archive data into the existing databases, which may or may not suit your objectives. If this doesn't sound optimal, you will want to use psql to drop the existing database(s) before importing from the backup (see the example below).
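For example, with a hypothetical database name:
sudo -u postgres psql -c 'DROP DATABASE [database_name];'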
Once you are ready, the command to restore from your pg_dumpall archive is:
sudo -u postgres psql -f [your_decrypted_archive.sql]
If you used pg_dump, you need to specify the target database:
sudo -u postgres psql -d [database_name] -f [your_decrypted_archive.sql]
Always remember to clean up the decrypted and extracted files after you are finished!
And that's it! You now have a secure, encrypted, and properly permissioned backup of your entire PostgreSQL cluster.
Addendum: Hey I'm Using Docker!
So you're using a Postgres Docker image. No problem. Because pg_dumpall writes its dump to standard output, docker exec passes it straight through to the rest of the pipeline running on the host. Here's what it would look like:
echo "Exporting Postgres DBs"
tmpfile=$(mktemp)
docker exec [container name] pg_dumpall -U postgres | openssl enc -aes-256-cbc -salt -pbkdf2 -pass file:/[your key file path]/.backup.key > "$tmpfile"
install -m 600 "$tmpfile" /[your backup path]/db_backup.sql.enc
rm "$tmpfile"
echo "Postgres DB Export Complete."
This allows you to run your backup script from the host without any additional hassle.
Your restore will require a little more work, as you will have to place the decrypted archive in a volume mapped to the container from the host. This will make the decrypted archive file on the host visible inside the container. Once you have done that, you can use:
docker exec -it [container name] psql -U postgres -f [/path_to_archive/your_decrypted_archive.sql]
This restores from your backup file. Don't forget to specify the database name if you used pg_dump. Remember, path_to_archive is the path from within your container, not on your host machine.
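If you don't have a convenient volume mapping, docker cp can stage the file instead (assuming the container has a writable /tmp):
docker cp your_decrypted_archive.sql [container name]:/tmp/your_decrypted_archive.sql
docker exec -it [container name] psql -U postgres -f /tmp/your_decrypted_archive.sql
Clean up the copy inside the container when you're done as well.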
Need some help, have a question or suggestion? Feel free to leave a comment below!