How to back up PostgreSQL on Ubuntu to the cloud of your choice

PostgreSQL is a very popular backend for web projects (and there are plenty of other reasons to use it as a data store). I recently wrote a short note about PostgreSQL dumps, and here I'd like to extend that approach by offloading the dumps to separate storage, rather than keeping them on the production compute instance. Essentially, this article describes how to back up PostgreSQL databases to public cloud storage. I will cover two options: Amazon Simple Storage Service (S3) and Google Cloud Platform (GCP) with the Nearline storage class (or even Coldline to cut costs).
My strategy is based on three steps:

  • Dump the database locally;
  • Copy the dump files to object storage;
  • Clean up the local dump folder.
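Sketched as a small shell function, the three steps might look like this (the function name, date-stamped filename, and bucket URL are my own placeholders, not from the original note):

```shell
#!/bin/sh
# Hypothetical sketch of the three-step strategy: dump, upload, clean up.
# Usage: backup_db <database> <local-dir> <s3-bucket-url>
backup_db() {
    db="$1"; dir="$2"; bucket="$3"
    dump="$dir/${db}_$(date +%Y-%m-%d).sql"

    pg_dump "$db" > "$dump" || return 1       # 1. local database dump
    aws s3 cp "$dump" "$bucket/" || return 1  # 2. copy to object storage
    rm -f "$dump"                             # 3. clean up the local folder
}

# Example call (placeholder bucket):
# backup_db hr /home/hr s3://my-backup-bucket/postgres
```

The `|| return 1` guards matter: the local dump is only deleted after the upload reports success.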

Backup PostgreSQL

PostgreSQL, like any other database engine, ships with a built-in tool for making database dumps. The dump file contains SQL statements (CREATE, INSERT) that recreate the database, its tables, and its data, and producing such a dump is what we are going to do first. Let's start by listing the available databases (in case you don't remember them) through the psql client.

postgres=# \list
List of databases
  Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
hr        | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |

[ ... skipped ... ]

(4 rows)

We have some default databases plus hr, owned by postgres. I am going to dump hr to /home/hr.

pg_dump hr > hr.sql
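A plain-SQL dump is easy to read but can get large. As alternatives (standard pg_dump/pg_dumpall options, shown here for reference rather than what this article uses), there is a compressed custom format, and pg_dumpall captures every database in the cluster:

```shell
# Compressed custom-format dump (restore with pg_restore instead of psql)
pg_dump -Fc hr > hr.dump

# Dump every database in the cluster, roles included
pg_dumpall > all_databases.sql
```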

Now that the dump is created, we can check its contents by simply typing:

cat /home/hr/hr.sql

We can see that it contains plain SQL (INSERT, CREATE, and so on). This will be our backup file, and there are many ways to move it off the server. For instance, we can use the AWS CLI, since I plan to use S3 as my storage, or we can automate the whole thing with a third-party backup tool.

PostgreSQL backup to the cloud

Having this file on the local disk protects us against software (PostgreSQL) failures or user typos, but it does not protect us against a general server (VPS) failure. There are many backup strategies, but essentially every admin tries to follow the 3-2-1 rule: keep 3 copies of your data, on 2 different media, with at least 1 off-site. So we are going to cover the off-site copy (cloud storage); the rest is easy to implement. There are several ways to offload data to the cloud, depending on the admin's skills and the time available for setup and configuration. I describe two options below: semi-automated via the AWS CLI and automated via CloudBerry Backup for Linux.

Semi-automate backup to Amazon S3

I am going to install the AWS CLI on my Ubuntu box and configure access to S3 using an IAM access key and secret key. This gives me practically unlimited object storage "attached" to my Linux VM, which I can use for assets and backups. The procedure is described here.
Make sure you replace "AWS Access Key ID [None]" and "AWS Secret Access Key [None]" with your own values, which you can generate in your AWS Console. If the AWS CLI fails to install, refer to the installation page; you may be missing some required dependencies. Also make sure your IAM user has appropriate permissions (at a minimum, full access to S3). A good tool for managing IAM permissions:
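With the keys in place, the upload is a single `aws s3 cp`; wrapped in a tiny helper it could look like this (the helper name and the bucket are my own placeholders; S3 bucket names are globally unique, so pick your own):

```shell
# Hypothetical helper: push a dump file into an S3 bucket/prefix.
# Assumes `aws configure` has already been run with the IAM keys.
upload_dump() {
    file="$1"; bucket="$2"
    aws s3 cp "$file" "$bucket/$(basename "$file")"
}

# Example (placeholder bucket name):
# upload_dump /home/hr/hr.sql s3://my-backups/postgres
```

You can verify the object landed afterwards with `aws s3 ls s3://my-backups/postgres/`.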

AWS IAM Permissions management

Cloud backup tool

Another good option is to use a third-party tool to automate the backup. In this case I want to use CloudBerry Backup for Linux, as it has a very rich CLI and can use virtually any storage backend (AWS, GCP, Azure, CTL, etc.).

cbb addBackupPlan -n "PostgreSQL Backup" -a "AWS_S3" -f "/home/hr/hr.sql" -c yes -every week -at "23:00" -weekday "mo, tu, we, th, fr" -notification on

With this one CLI command we set up a backup plan for hr.sql targeting AWS_S3, with compression enabled, running at 23:00 (11pm) on weekdays only.

Wrap it up

Now we have two things working independently: we execute the dump manually, and CloudBerry offloads it at 11pm. The missing brick here is cron. I made a simple bash script with two lines: pg_dump and cbb plan -r "PostgreSQL Backup". I also disabled the cbb plan's scheduler, so it never runs on its own schedule, since I want cron to handle this. The cron schedule depends on your RPO; I set it to daily in my case. My cron example is below:

crontab -e
00 23 * * * /home/hr/
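For reference, the two-line wrapper script the cron entry points at could be created like this (the filename pg_backup.sh is my assumption; the exact script path is left unspecified above):

```shell
# Write the hypothetical wrapper script and make it executable
cat > pg_backup.sh <<'EOF'
#!/bin/sh
pg_dump hr > /home/hr/hr.sql
cbb plan -r "PostgreSQL Backup"
EOF
chmod +x pg_backup.sh
```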

Here is a nice cheat sheet for the cron fields (the asterisks / numbers in the line):
Crontab example
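As a quick refresher, the five fields in a crontab line are minute, hour, day of month, month, and day of week (the script path below is a generic placeholder, not the one from this post):

```shell
# minute  hour  day-of-month  month  day-of-week  command
# 00      23    *             *      *            -> every day at 23:00
00 23 * * * /path/to/backup-script.sh
```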

That's it!

