Simplifying CockroachDB Backups to S3: A Guided Approach

Last edited on April 7, 2025

    Picture this: You own a bakery, and every evening, you need to store your secret recipes in a secure vault. You have two options: manually carrying them to a safe deposit box, or using a trusted courier service that ensures their safety without exposing your key. 

    Similarly, when backing up your CockroachDB cluster, you can either handle the process manually or leverage built-in external connection capabilities for a more seamless approach. In this guide, we’ll explore both methods — 1) manual backups using AWS CLI and 2) automated external connections to Amazon S3 — to help you choose the right strategy for your environment.

    Managed and Customer-Managed Backups in CockroachDB

    CockroachDB takes managed backups by default in all CockroachDB Cloud (Managed Service) offerings. These backups are viewable in the CockroachDB UI Console under the Backup and Restore page. More details on managed backups can be found here.

    However, for those who prefer control over their backups, CockroachDB also supports customer-managed backups using S3, Amazon’s high-speed web-based cloud storage service. This approach allows you to configure and maintain backups on your own terms, leveraging CockroachDB’s EXTERNAL CONNECTION capabilities.

    Learn more about external connections here.

    Choosing Between External Connection and Specified Authentication

    When setting up backups to S3, you have two options:

    1. External Connection: Ideal when delegating authentication management to an external service, such as a cloud storage gateway. This ensures backups run smoothly without manually handling credentials.

    2. Specified Authentication: Opt for this method when you need direct control over S3 authentication settings and permissions within CockroachDB itself.

    In general, choose external connection for abstraction and ease, and specified authentication for full control over the backup process.

    More details on authentication methods can be found here.

    Setting Up Manual Backups Using AWS CLI on Mac

    For those preferring a hands-on approach, follow these steps to manually back up a CockroachDB cluster to S3:

    1) Install AWS CLI

    brew install awscli
    brew upgrade awscli
    aws --version

    2) Configure AWS Profile

    • Set up your AWS IAM user with an access key.

    • View, manage, and delete access keys if necessary. Note: Each AWS IAM user can have up to two access keys.

    Configure a named profile, then create an access key and copy it from the output:

    aws configure --profile $username
    aws iam create-access-key --user-name $username

    View access keys by user:

    aws iam list-access-keys --user-name $username

    Get details of your AWS profile:

    grep -A 2 "\[$username\]" ~/.aws/credentials
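Because `grep -A 2` prints a fixed number of lines, it can cut off a profile that also stores a session token. As a minimal sketch, assuming the standard INI layout of the credentials file, the hypothetical helper `show_profile` below prints one whole profile section with awk. The function name and its arguments are illustrative and not part of the AWS CLI.

```shell
# Hypothetical helper: print one profile section from an AWS credentials file.
# $1 = profile name, $2 = credentials file (assumed default: ~/.aws/credentials)
show_profile() {
  awk -v want="[$1]" '
    /^\[/ { in_section = ($0 == want) }  # every "[name]" header switches sections
    in_section { print }                 # print lines of the requested section only
  ' "${2:-$HOME/.aws/credentials}"
}
```

Calling `show_profile "$username"` prints the header and every key under it. Like the grep command, it shows secrets in the terminal, so use it with care.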

    Delete a user’s access key:

    aws iam delete-access-key --access-key-id $access_key_id --user-name $username

    3) Set AWS Environment Variables

    • Configure AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN in your terminal.

    export AWS_ACCESS_KEY_ID=$access_key_id
    export AWS_SECRET_ACCESS_KEY=$secret_access_key
    export AWS_SESSION_TOKEN=$session_token
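A misspelled or missing variable name tends to surface only later as an authentication error. The sketch below, assuming a POSIX shell, checks that the two variables the AWS CLI reads for long-term keys are set. `check_aws_env` is a hypothetical helper name. AWS_SESSION_TOKEN is left out of the check because it only applies to temporary credentials.

```shell
# Hypothetical helper: report required AWS variables that are unset or empty.
# Returns 0 when both are present and 1 otherwise.
check_aws_env() {
  missing=0
  for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
    eval "value=\${$var:-}"          # read the variable whose name is in $var
    if [ -z "$value" ]; then
      echo "missing: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_aws_env && echo "AWS variables look complete"` before the next step gives a quick sanity check. It does not verify that the keys are valid.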

    4) Create External Connection

    • A CockroachDB feature that simplifies backups without exposing credentials.

    • If an external connection name already exists, drop it before creating a new one.

    • Verify the external connection is successfully created and stored in S3.

    DROP EXTERNAL CONNECTION $name;
    CREATE EXTERNAL CONNECTION $name AS 's3://{bucket name}?AWS_ACCESS_KEY_ID={access key}&AWS_SECRET_ACCESS_KEY={secret access key}';
    SHOW CREATE ALL EXTERNAL CONNECTIONS;

    A file storing our external connection location will appear in our S3 bucket, along with the timestamp of creation.
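Because the keys travel as URI query parameters, characters such as + and / in a secret key generally need to be percent-encoded before they go into the connection string. The following is a minimal sketch of building that URI in a POSIX shell. `urlencode` and `s3_uri` are hypothetical helper names, and the sketch assumes ASCII-only keys.

```shell
# Hypothetical helper: percent-encode everything except RFC 3986 unreserved characters.
urlencode() {
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}                 # string without its first character
    c=${s%"$rest"}              # the first character itself
    case $c in
      [A-Za-z0-9._~-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;   # e.g. "/" becomes %2F
    esac
    s=$rest
  done
  printf '%s\n' "$out"
}

# Hypothetical helper: $1 = bucket name, $2 = access key ID, $3 = secret access key
s3_uri() {
  printf 's3://%s?AWS_ACCESS_KEY_ID=%s&AWS_SECRET_ACCESS_KEY=%s\n' \
    "$1" "$(urlencode "$2")" "$(urlencode "$3")"
}
```

The output of `s3_uri "$bucket" "$AWS_ACCESS_KEY_ID" "$AWS_SECRET_ACCESS_KEY"` can then be pasted between the quotes of the CREATE EXTERNAL CONNECTION statement. `$bucket` is a placeholder for your bucket name.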

    5) Perform Cluster Backups

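    • Run a full backup. The statement below backs up a single database; to back up the entire cluster, omit the target and run BACKUP INTO 'external://$name'; instead.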

    • Confirm backup files in the S3 bucket.

    BACKUP DATABASE $database_name INTO 'external://$name';

    6) View and Query Backups

    • List all backups.

    • Query backups for specific databases.

    SHOW BACKUPS IN 'external://$name';
    SELECT * FROM [SHOW BACKUP FROM LATEST IN 'external://$name'] WHERE database_name = '$database_name';

    7) Incremental and Revision History Backups

    • Take an incremental backup.

    • Take a backup with revision history, which records every change made to the cluster within the garbage collection period leading up to and including the given timestamp.

    BACKUP INTO LATEST IN 'external://$name' AS OF SYSTEM TIME '-10s';
    BACKUP INTO LATEST IN 'external://$name' WITH revision_history;

    8) Schedule Automated Backups

    • Set up a daily backup with revision history.

    • Monitor running backup schedules.

    • Pause or drop schedules as needed.

    CREATE SCHEDULE [IF NOT EXISTS] <label>
      FOR BACKUP [<targets>] INTO <location>
      [WITH <backup_options>[=<value>] [, ...]]
      RECURRING [crontab] [FULL BACKUP <crontab|ALWAYS>]
      [WITH SCHEDULE OPTIONS <schedule_option>[= <value>] [, ...]];
    SHOW SCHEDULES;
    PAUSE SCHEDULE $schedule_id;
    SHOW PAUSED SCHEDULES;
    DROP SCHEDULE $schedule_id;
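The syntax synopsis above can be filled in for the daily backup with revision history mentioned in this step. The sketch below prints one possible statement and makes several assumptions: the helper name `backup_schedule_sql` is hypothetical, the backup covers the whole cluster, and full backups run weekly with daily incrementals in between.

```shell
# Hypothetical helper: print a CREATE SCHEDULE statement for a daily backup.
# $1 = schedule label, $2 = external connection name
backup_schedule_sql() {
  cat <<EOF
CREATE SCHEDULE IF NOT EXISTS $1
  FOR BACKUP INTO 'external://$2'
  WITH revision_history
  RECURRING '@daily'
  FULL BACKUP '@weekly'
  WITH SCHEDULE OPTIONS first_run = 'now';
EOF
}
```

One way to run it is to pipe the output into a SQL shell, for example `backup_schedule_sql nightly_backup "$name" | cockroach sql --url "$CRDB_URL"`, where `$CRDB_URL` stands in for your cluster's connection string.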

    Using Implicit Authentication for S3 Backups

    Implicit authentication allows CockroachDB to automatically use environment-based credentials or IAM roles without explicitly setting access keys. This is particularly useful when running CockroachDB on AWS EC2 instances.

    1) Create an IAM Role

    Assign AmazonS3FullAccess or specific permissions for backup, restore, import, and export. More on permissions here.

    2) SSH into an EC2 Instance

    chmod 400 path/to/private/key.pem
    ssh -i path/to/private/key.pem user@<instance_public_dns/ip>

    3) Configure IAM Role for EC2

    • List the instance profiles for the role, or create a new one.

    • Add the IAM role to the instance profile.

    • Associate the instance profile with the EC2 instance.

    aws iam list-instance-profiles-for-role --role-name $role_name
    aws iam create-instance-profile --instance-profile-name $instance_profile_name
    aws iam add-role-to-instance-profile --role-name $role_name --instance-profile-name $instance_profile_name
    aws ec2 associate-iam-instance-profile --iam-instance-profile Name=$instance_profile_name --region $region --instance-id $instance_id

    4) Use Implicit Authentication

    • If you rely on environment-based credentials rather than the instance's IAM role, ensure the AWS environment variables are set in the terminal that starts the CockroachDB node.

    • Verify CockroachDB can authenticate to S3 without manual credentials.

    BACKUP DATABASE $database_name INTO 's3://{bucket name}?AUTH=implicit';

    Just like securing your bakery’s secret recipes for future use, backing up your CockroachDB cluster requires a reliable and secure approach! Whether you manually transport your data using AWS CLI or leverage CockroachDB’s EXTERNAL CONNECTION for a hands-free experience, both methods have their place depending on your needs.

    By choosing the right authentication and backup strategy, you ensure that your data remains safe, accessible, and well-organized.

    Ready to try these CockroachDB backup methods for yourself? You can get hands-on with CockroachDB’s free cloud offering today.

    Ben Sherrill is a Sales Engineer for Cockroach Labs.