This article is co-authored by David Joy, Senior Sales Staff Engineer for Cockroach Labs, Harsh Shah, Staff Sales Engineer for Cockroach Labs, and Krishnaswamy Venkataraman, Technical Specialist – Azure for Microsoft.
CockroachDB is a SQL database with a distributed, cloud-native architecture that offers high availability, scalability, and strong consistency. Deploying CockroachDB on Microsoft Azure allows you to leverage Azure’s robust cloud infrastructure to build resilient and scalable applications.
This is the second article in the series, Unlocking Enterprise Scale with CockroachDB on Microsoft Azure, where we help you utilize the best that CockroachDB on Azure has to offer. Visit our first post for an overview of how CockroachDB and Azure combine to unlock new levels of innovation and scale, for multiple use cases. Please visit our third post to learn about production options, deployment strategies for single-region and multi-region, plus critical aspects of survivability, security, network configurations, and more.
In this blog, we will explore different production options, deployment strategies, and key considerations for single-region and multi-region setups, while introducing key aspects of the Azure ecosystem.
Production Options for Deploying CockroachDB on Azure
When it comes to deploying CockroachDB on Azure, you have several production options:
Self-hosted on Azure Virtual Machines (VMs): Deploy CockroachDB self-hosted on Azure VMs for complete control over your database environment. This option is ideal for those who need customized configurations and direct management of their infrastructure.
Self-hosted Azure Kubernetes Service (AKS): Deploy CockroachDB self-hosted on AKS for containerized environments, benefiting from Kubernetes orchestration. This is suitable for organizations already using Kubernetes or looking to modernize their infrastructure with containerization.
Dedicated: CockroachDB Dedicated is a fully managed service that takes care of the operational aspects of running CockroachDB. This option allows you to focus on your applications while the service provider handles the database management.
There are benefits to each of our production options. When deploying CockroachDB on Microsoft Azure, the option you choose depends on your use case and business needs.
Why CockroachDB Self-Hosted on Azure VMs?
Full Control: This strategy offers full control over the server environment, allowing you to customize hardware, OS, and other configurations.
Security: Ideal for organizations with stringent security and compliance requirements, where data needs to be closely managed.
Scalability: Lets you fine-tune performance parameters and optimize resource utilization, and gives you the option to deploy across multiple cloud providers or in a hybrid configuration.
Why CockroachDB Self-Hosted on AKS?
Containerization: Leverages Kubernetes for orchestration, which is ideal for containerized applications.
Scalability and Automation: Offers automated scaling, updates, and management of containerized workloads.
Integration: Suitable for organizations already using Kubernetes, allowing seamless integration with existing workflows and tools.
Why CockroachDB Dedicated?
Ease of Management: A managed service that takes care of operational aspects such as backups, upgrades, and scaling. CockroachDB Dedicated clusters can meet various compliance requirements as well.
Focus on Development: Allows teams to focus on developing and optimizing applications without worrying about database operations.
Scalability: Easily scalable according to your needs, with the provider handling infrastructure changes.
Single Region vs. Multi-Region
For each of the above production options, you can run a single-region or multi-region deployment.
A single-region deployment of CockroachDB is ideal for applications that primarily serve users within a specific geographical area. It allows you to store data in a single cloud region, providing a straightforward setup and operation.
A multi-region deployment of CockroachDB allows you to extend your application across multiple geographic regions while still functioning as a single logical database. This is particularly beneficial for applications with a global user base or those that need to comply with data sovereignty regulations.
Benefits of Single-Region CockroachDB
Simplicity: It's easier to manage and operate as all nodes are located within the same region.
Cost-Effective: It can be more cost-effective as it avoids the data transfer costs associated with multi-region deployments.
Performance: Since all data is located in one region, it can provide low latency access for users within that region.
Benefits of Multi-Region CockroachDB
Reduced Latency: By tying data to specific locations closer to where users access it, you can significantly reduce latency, leading to improved application performance.
Regulatory Compliance: It aids in meeting data sovereignty requirements, as it allows storing and processing of data within defined geographic boundaries.
Surviving Outages: It ensures that your database can survive availability zone or regional outages, providing an additional layer of data safety.
Scalability: It allows you to scale your application to accommodate a growing user base across different regions.
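To make the multi-region benefits above more concrete, here is a minimal sketch of the SQL that declares a database's regions and its survival goal. The database name app_db and the three Azure region names are assumptions for illustration; the region names must match the region values your nodes were started with via the --locality flag, and surviving a region failure requires at least three regions.

```shell
# Sketch: print the SQL statements that configure a multi-region database.
# Arguments: database name, primary region, then any additional regions.
multi_region_sql() {
  db="$1"
  primary="$2"
  shift 2
  printf 'ALTER DATABASE %s SET PRIMARY REGION "%s";\n' "$db" "$primary"
  for region in "$@"; do
    printf 'ALTER DATABASE %s ADD REGION "%s";\n' "$db" "$region"
  done
  # Ask CockroachDB to keep the database available if a whole region is lost.
  printf 'ALTER DATABASE %s SURVIVE REGION FAILURE;\n' "$db"
}

# Hypothetical database and regions, shown only as an example.
multi_region_sql app_db eastus westus2 northeurope
```

The printed statements can be reviewed and then passed to the SQL client, for example with cockroach sql --execute.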
Key Integrations with the Azure Ecosystem
Deploying CockroachDB on Azure allows seamless integration with Azure’s extensive ecosystem, providing several advantages:
Scalability: Azure VMs and AKS allow you to scale your CockroachDB cluster up or down based on demand, ensuring optimal performance and cost efficiency.
Security: Leverage Azure’s robust security features, including virtual networks, security groups, and managed identities, to secure your CockroachDB deployment.
Integration: Integrate CockroachDB with other Azure services such as Azure Blob Storage for backups, Azure Monitor for logging and monitoring, and Azure Active Directory for identity management.
Global Reach: Azure’s global data centers enable you to deploy CockroachDB in multiple regions, ensuring low latency and high availability for users worldwide.
Setting Up CockroachDB on Azure
Now that we have shared the different deployment options and their benefits, we will provide short walkthroughs for each.
Self-Hosted on Azure VMs
We recommend that you follow the official documentation, which includes the necessary requirements for deploying on Azure. The steps below show how you can use the Azure CLI to semi-automate the process.
1. Create Resource Group and VMs: Use the Azure CLI to create a resource group and multiple VMs for your CockroachDB nodes.
# Sign in and create a resource group for the cluster
az login
az group create --name cockroachdb-resource-group --location eastus
# Create three VMs and open the SQL (26257) and Admin UI (8080) ports on each
for i in 1 2 3; do
az vm create \
--resource-group cockroachdb-resource-group \
--name cockroachdb-vm$i \
--image Ubuntu2204 \
--admin-username azureuser \
--generate-ssh-keys \
--size Standard_D4_v5
# Each rule in a network security group needs a unique priority (open-port defaults to 900)
az vm open-port --port 26257 --priority 900 --resource-group cockroachdb-resource-group --name cockroachdb-vm$i
az vm open-port --port 8080 --priority 901 --resource-group cockroachdb-resource-group --name cockroachdb-vm$i
done
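Later steps need each VM's IP address. As a rough sketch, the private addresses can be read back with the Azure CLI rather than copied from the portal. The helper names vm_name, vm_private_ip, and print_vm_ips are made up for this example, and the query assumes each VM has a single network interface.

```shell
# Sketch: look up the private IP of each VM created in step 1.
RESOURCE_GROUP="cockroachdb-resource-group"

# Print the VM name for node number $1, matching the naming used in step 1.
vm_name() {
  printf 'cockroachdb-vm%s\n' "$1"
}

# Print the first private IP of the VM named $1 (needs a logged-in Azure CLI).
vm_private_ip() {
  az vm list-ip-addresses \
    --resource-group "$RESOURCE_GROUP" \
    --name "$1" \
    --query "[0].virtualMachine.network.privateIpAddresses[0]" \
    --output tsv
}

# Print "name ip" for all three nodes.
print_vm_ips() {
  for i in 1 2 3; do
    printf '%s %s\n' "$(vm_name "$i")" "$(vm_private_ip "$(vm_name "$i")")"
  done
}

# Usage (after az login):
#   print_vm_ips
```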
2. Set Up Load Balancer: To distribute traffic across the CockroachDB nodes and provide high availability, set up an Azure Load Balancer.
3. Create a Load Balancer:
az network lb create \
--resource-group cockroachdb-resource-group \
--name cockroachdb-lb \
--frontend-ip-name cockroachdbFrontEnd \
--backend-pool-name cockroachdbBackEndPool \
--location eastus
4. Create a Health Probe: Configure a health probe to monitor the status of the CockroachDB nodes.
az network lb probe create \
--resource-group cockroachdb-resource-group \
--lb-name cockroachdb-lb \
--name cockroachdbHealthProbe \
--protocol tcp \
--port 26257 \
--interval 5 \
--threshold 2
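A TCP probe only confirms that the port accepts connections. CockroachDB nodes also serve an HTTP endpoint, /health?ready=1, on the HTTP port, which returns 200 only when the node is ready to accept SQL traffic. The sketch below checks that endpoint by hand once the nodes are running; the helper names health_url and node_is_ready are assumptions for this example, and it assumes an insecure cluster served over plain HTTP.

```shell
# Sketch: check a node's readiness endpoint manually.
HTTP_PORT=8080

# Build the readiness URL for the node at address $1.
health_url() {
  printf 'http://%s:%s/health?ready=1\n' "$1" "$HTTP_PORT"
}

# Succeed (exit status 0) only if the node at $1 answers with HTTP 200.
node_is_ready() {
  code=$(curl --silent --output /dev/null --max-time 5 \
    --write-out '%{http_code}' "$(health_url "$1")")
  [ "$code" = "200" ]
}

# Usage, once the cluster has been started:
#   node_is_ready <VM1-IP-ADDRESS> && echo "node is ready"
```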
5. Create a Load Balancer Rule: Define a rule to distribute traffic for CockroachDB's SQL and Admin UI ports.
# For SQL traffic
az network lb rule create \
--resource-group cockroachdb-resource-group \
--lb-name cockroachdb-lb \
--name cockroachdbSQLRule \
--protocol tcp \
--frontend-port 26257 \
--backend-port 26257 \
--frontend-ip-name cockroachdbFrontEnd \
--backend-pool-name cockroachdbBackEndPool \
--probe-name cockroachdbHealthProbe
# For Admin UI traffic
az network lb rule create \
--resource-group cockroachdb-resource-group \
--lb-name cockroachdb-lb \
--name cockroachdbAdminUIRule \
--protocol tcp \
--frontend-port 8080 \
--backend-port 8080 \
--frontend-ip-name cockroachdbFrontEnd \
--backend-pool-name cockroachdbBackEndPool \
--probe-name cockroachdbHealthProbe
6. Associate VMs with the Load Balancer: Add the VMs to the load balancer's backend pool.
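for i in 1 2 3; do
# Look up the NIC's IP configuration name instead of assuming it is "ipconfig1"
ipconfig=$(az network nic ip-config list \
--resource-group cockroachdb-resource-group \
--nic-name cockroachdb-vm${i}VMNic \
--query "[0].name" --output tsv)
az network nic ip-config update \
--resource-group cockroachdb-resource-group \
--nic-name cockroachdb-vm${i}VMNic \
--name "$ipconfig" \
--lb-name cockroachdb-lb \
--lb-address-pools cockroachdbBackEndPool
done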
7. Install CockroachDB: SSH into each of the 3 VMs that were created to download and install CockroachDB, and configure the data directories.
ssh azureuser@<VM-IP-ADDRESS>
curl https://binaries.cockroachdb.com/cockroach-v24.1.2.linux-amd64.tgz | tar -xz
sudo cp -i cockroach-v24.1.2.linux-amd64/cockroach /usr/local/bin/
sudo mkdir -p /var/lib/cockroach
sudo chown azureuser /var/lib/cockroach
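Because the same commands run on all three VMs, it can help to wrap them in a small script with the version in one place. This is only a sketch of the install steps above: the function names are hypothetical, and it assumes the Linux amd64 tarball naming shown in the download URL.

```shell
# Sketch: the install steps above, parameterized by version.
COCKROACH_VERSION="v24.1.2"

# Build the download URL for a given release version.
cockroach_tarball_url() {
  printf 'https://binaries.cockroachdb.com/cockroach-%s.linux-amd64.tgz\n' "$1"
}

# Download, install, and prepare the data directory for the current user.
install_cockroach() {
  version="$1"
  curl --fail --silent --show-error "$(cockroach_tarball_url "$version")" | tar -xz
  sudo cp "cockroach-${version}.linux-amd64/cockroach" /usr/local/bin/
  sudo mkdir -p /var/lib/cockroach
  sudo chown "$(whoami)" /var/lib/cockroach
  # Confirm the binary is on the PATH and reports the expected version.
  cockroach version
}

# Usage, on each VM:
#   install_cockroach "$COCKROACH_VERSION"
```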
8. Start the Cluster: Start a CockroachDB node on each VM, pointing every node at the same join list, then initialize the cluster once so the nodes form a cohesive distributed database. The --insecure flag used here disables encryption and authentication, so use it for testing only.
# Run on VM1
cockroach start \
--insecure \
--listen-addr=<VM1-IP-ADDRESS>:26257 \
--http-addr=<VM1-IP-ADDRESS>:8080 \
--store=/var/lib/cockroach \
--join=<VM1-IP-ADDRESS>:26257,<VM2-IP-ADDRESS>:26257,<VM3-IP-ADDRESS>:26257 \
--background
# Run on VM2
cockroach start \
--insecure \
--listen-addr=<VM2-IP-ADDRESS>:26257 \
--http-addr=<VM2-IP-ADDRESS>:8080 \
--store=/var/lib/cockroach \
--join=<VM1-IP-ADDRESS>:26257,<VM2-IP-ADDRESS>:26257,<VM3-IP-ADDRESS>:26257 \
--background
# Run on VM3
cockroach start \
--insecure \
--listen-addr=<VM3-IP-ADDRESS>:26257 \
--http-addr=<VM3-IP-ADDRESS>:8080 \
--store=/var/lib/cockroach \
--join=<VM1-IP-ADDRESS>:26257,<VM2-IP-ADDRESS>:26257,<VM3-IP-ADDRESS>:26257 \
--background
# Initialize the cluster (run once, from any node)
cockroach init --insecure --host=<VM1-IP-ADDRESS>:26257
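The three start commands differ only in the node's own address, while the --join list is identical everywhere. A small sketch like the following can generate them from a list of IPs and reduce copy-paste mistakes. The function names and the 10.0.0.x addresses are placeholders assumed for illustration.

```shell
# Sketch: generate the per-node start commands from a list of node IPs.
SQL_PORT=26257
HTTP_PORT=8080

# Join every IP passed as an argument into "ip:port,ip:port,...".
build_join_list() {
  list=""
  for ip in "$@"; do
    list="${list:+$list,}${ip}:${SQL_PORT}"
  done
  printf '%s\n' "$list"
}

# Print the start command for the node at $1; the remaining arguments are all node IPs.
print_start_command() {
  node_ip="$1"
  shift
  printf 'cockroach start --insecure --listen-addr=%s:%s --http-addr=%s:%s --store=/var/lib/cockroach --join=%s --background\n' \
    "$node_ip" "$SQL_PORT" "$node_ip" "$HTTP_PORT" "$(build_join_list "$@")"
}

# Placeholder private IPs; substitute the addresses of your own VMs.
for node in 10.0.0.4 10.0.0.5 10.0.0.6; do
  print_start_command "$node" 10.0.0.4 10.0.0.5 10.0.0.6
done
```

Each printed line is the command to run on the corresponding VM; cockroach init is still run once afterwards.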
9. Access and Manage: Access the CockroachDB Admin UI by navigating to http://<VM1-IP-ADDRESS>:8080 in your browser. Use the CockroachDB SQL client to manage your database:
cockroach sql --insecure --host=<address of load balancer>
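As a quick sanity check after connecting, the sketch below finds the load balancer's public address, lists node status, and runs a trivial query. The helper names are hypothetical, and the connection URL assumes the insecure test setup from this walkthrough (root user, defaultdb, no TLS).

```shell
# Sketch: find the load balancer address and verify the cluster through it.
RESOURCE_GROUP="cockroachdb-resource-group"

# List public IPs in the resource group to find the load balancer's frontend address.
list_public_ips() {
  az network public-ip list \
    --resource-group "$RESOURCE_GROUP" \
    --query "[].{name:name, address:ipAddress}" \
    --output table
}

# Build a connection URL for an insecure cluster reachable at address $1.
sql_url() {
  printf 'postgresql://root@%s:26257/defaultdb?sslmode=disable\n' "$1"
}

# Show the status of every node, then run a trivial query via the address in $1.
verify_cluster() {
  cockroach node status --insecure --host="$1:26257"
  cockroach sql --url "$(sql_url "$1")" --execute "SELECT 1;"
}

# Usage (after az login, using an address printed by list_public_ips):
#   verify_cluster <address of load balancer>
```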
Self-Hosted on AKS (Azure Kubernetes Service)
This walkthrough largely follows our general Kubernetes documentation. We highlight the steps that differ on Azure and point out where to pick up the documentation.
1. Create an AKS Cluster: Use the Azure CLI to create an AKS cluster, setting the desired number of nodes and enabling monitoring.
# Create a resource group in the desired region
az group create --name cockroachdb-resource-group --location eastus
# Create an AKS cluster in the same region
az aks create --resource-group cockroachdb-resource-group --name cockroachdb-aks-cluster --node-count 3 --location eastus --enable-addons monitoring --generate-ssh-keys
2. Get the Credentials: Merge the cluster's credentials into your local kubeconfig so that kubectl can connect to it.
az aks get-credentials --resource-group cockroachdb-resource-group --name cockroachdb-aks-cluster
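Before moving on to the Operator install, it is worth confirming that kubectl can reach the new cluster and that all three nodes are Ready. A minimal sketch, assuming the default kubectl get nodes column layout (NAME, STATUS, ...); the function name count_ready_nodes is made up for this example.

```shell
# Sketch: count AKS nodes whose STATUS column is exactly "Ready".
# Reads the output of "kubectl get nodes --no-headers" on standard input.
count_ready_nodes() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Usage, after az aks get-credentials:
#   kubectl get nodes --no-headers | count_ready_nodes
```

With the three-node cluster created above, the count should reach 3 once provisioning finishes. Cordoned nodes report a status such as Ready,SchedulingDisabled and are not counted by this check.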
Note: From this point, you can follow the official CockroachDB documentation on how to set up a CockroachDB cluster on Kubernetes. Specifically, pick up at the steps that install the cluster using the Kubernetes Operator.
CockroachDB Dedicated on Azure
You can provision your own CockroachDB Dedicated cluster on Azure using the CockroachDB Cloud Console. If you don't have an account, create one first. Once logged in, follow the steps here to provision your CockroachDB Dedicated cluster on Azure.
By following these steps, you can set up CockroachDB on Azure, whether on traditional VMs, using AKS for containerized deployments or a fully managed CockroachDB cluster on Azure. Each deployment strategy offers unique benefits and considerations, enabling you to choose the best fit for your organizational needs and application requirements.