Historically, putting the concepts of “cloud” and “SQL” together has been a bit of a challenge. SQL, after all, dates back to the mid-1970s, and the traditional SQL databases that are most popular today – MySQL, PostgreSQL, SQL Server, SQLite, etc. – were all developed more than two decades ago, well before most developers were thinking about putting things in the cloud.
These days, though, modern applications increasingly require high availability, elastic scalability, survivability, geolocation, and other features that simply aren’t possible with the old architectural pattern: a monolithic application tied to a single-instance SQL database. And while NoSQL databases emerged to meet the growing need for cloud-native, fast-scaling databases, they don’t offer some important features that those decades-old SQL databases do, like ACID transactions for guaranteed data correctness and consistency.
So, there is a need for cloud SQL.
What is cloud SQL?
The term “cloud SQL” can mean two different things.
One usage refers to a specific product: Google Cloud SQL, often simply called Cloud SQL, is a Google-specific product that allows developers to run PostgreSQL, MySQL, and SQL Server databases as fully managed services on Google Cloud Platform.
The other, more general usage refers to any database that offers SQL in the cloud. CockroachDB, for example, could be called a cloud SQL database, as it was built to be cloud native and can be run on any of the three major public clouds (AWS, GCP, and Azure), or any combination of them in hybrid cloud and multicloud deployments. (CockroachDB can also be run on-prem, so technically it isn’t always a cloud SQL database, but it can be one.)
Anecdotally, the former usage is probably more common than the latter. But as Google Cloud SQL is limited to users of Google Cloud Platform, there are many applications that require cloud SQL but can’t make use of Google Cloud SQL because they run on another cloud or across multiple clouds.
Cloud SQL vs. distributed SQL
Being locked into GCP isn’t the only reason that a company in need of cloud SQL as a concept might want to avoid Google Cloud SQL as a product. Adapting decades-old databases to the cloud produces something suitable for many applications, but it still comes with downsides. Particularly at scale, some of the reasons companies need cloud SQL databases to begin with – high availability, elastic scale, geolocation – are also reasons why companies might not want something like Google Cloud SQL (or the similar solutions offered by other cloud providers).
This is because there are a variety of ways to configure and run SQL databases in the cloud, and not all of them offer the same advantages. Distributed SQL databases, a specific subtype of cloud SQL databases, offer a number of advantages that aren’t available in other cloud SQL database solutions. Distributed SQL databases were built explicitly for distributed, cloud-based systems, whereas the traditional RDBMSs that form the heart of Google Cloud SQL (MySQL, Postgres, SQL Server) were not.
To see why, let’s get specific. We’ll look briefly at the goals mentioned above, comparing Google Cloud SQL with CockroachDB in the context of an enterprise-scale application to illustrate some of the differences between Google’s form of cloud SQL and the distributed SQL of CockroachDB.
(Please note that this is a quick summary, and to an extent the capabilities of any database are dependent on its configuration and how/where it is deployed.)
High availability
Google Cloud SQL offers an optional high availability configuration for all the databases it supports (Postgres, MySQL, and SQL Server). This configuration creates two instances of your database, a “primary instance” and a “standby instance,” located in two different availability zones within a single GCP region. In the event of a failure of the primary instance (due to an AZ outage, hardware failure, etc.), the system will fail over to the standby instance, which becomes your new primary database.
However, this approach has some downsides. Chief among them is that, by default, this configuration cannot survive a region outage, because both the primary and standby instances are in the same region. Cloud SQL does allow for cross-region read replicas, but failing over to one of them during a region outage is a manual process that, per Google’s public-facing information, still results in minutes of downtime and a small amount of lost data.
CockroachDB, in contrast, maintains at least three instances (replicas) of your data, all of which can serve reads and writes. These replicas can be deployed across multiple AZs and, if desired, across multiple regions, allowing the database to survive a cloud region failure with no downtime or lost data. Being cloud agnostic, a CockroachDB cluster can even be deployed across multiple clouds to survive a whole-cloud outage. This type of deployment comes with additional complexity and cloud costs, however, and typically isn’t required for enterprises to meet their availability goals – see our recent report on multicloud deployments for more details and best practices.
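To make the difference between surviving an AZ outage and surviving a region outage concrete, here is a minimal sketch of the majority-quorum arithmetic that consensus-replicated databases generally rely on. It assumes that data stays available only while a strict majority of its replicas is reachable. The region names, placements, and function names are hypothetical and purely illustrative; this is not CockroachDB’s API or its internal logic.

```python
from typing import Sequence


def has_quorum(total_replicas: int, live_replicas: int) -> bool:
    """Return True if a strict majority of replicas is still reachable."""
    return live_replicas > total_replicas // 2


def survives_region_outage(placement: Sequence[str], failed_region: str) -> bool:
    """Check whether a majority of replicas remains after one region fails.

    `placement` lists the region that hosts each replica.
    """
    live = sum(1 for region in placement if region != failed_region)
    return has_quorum(len(placement), live)


def survives_any_single_region_outage(placement: Sequence[str]) -> bool:
    """True only if losing any one region still leaves a majority."""
    return all(survives_region_outage(placement, r) for r in set(placement))


# Hypothetical placements (region names are made up for illustration).
one_region = ["us-east", "us-east", "us-east"]     # three AZs, one region
three_regions = ["us-east", "us-west", "eu-west"]  # one replica per region
lopsided = ["us-east", "us-east", "eu-west"]       # two of three in one region

print(survives_any_single_region_outage(one_region))     # False
print(survives_any_single_region_outage(three_regions))  # True
print(survives_any_single_region_outage(lopsided))       # False
```

The third placement shows that simply having replicas in more than one region isn’t enough under this assumption: no single region can hold a majority of the replicas.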
Updating database versions and schema can also require downtime in Google Cloud SQL, whereas CockroachDB can support online schema changes and version updates without downtime.
Elastic scale
Google Cloud SQL can scale, but only up to some hard limits, one of which is 96 processor cores, per the product documentation. There are other limitations as well. Chief among them: only the primary instance can serve writes in a Cloud SQL deployment, so while reads can be scaled horizontally with read replicas, writes can only be scaled vertically.
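As a rough illustration of that asymmetry, the toy model below estimates capacity for a single-primary topology with read replicas. The 96-core cap is the figure cited above; the per-core throughput number and the function name are hypothetical placeholders, and real throughput depends heavily on workload, replication lag, and configuration.

```python
MAX_CORES_PER_INSTANCE = 96  # hard limit cited in the text above


def primary_replica_capacity(cores_per_instance: int,
                             read_replicas: int,
                             ops_per_core: int) -> dict:
    """Toy capacity estimate for one primary plus N read replicas.

    `ops_per_core` is a made-up throughput figure used only for illustration.
    """
    cores = min(cores_per_instance, MAX_CORES_PER_INSTANCE)
    per_instance = cores * ops_per_core
    return {
        # Only the primary accepts writes, so replicas add nothing here.
        "write_ops": per_instance,
        # The primary and every replica can serve reads.
        "read_ops": per_instance * (1 + read_replicas),
    }


no_replicas = primary_replica_capacity(96, read_replicas=0, ops_per_core=1000)
four_replicas = primary_replica_capacity(96, read_replicas=4, ops_per_core=1000)
oversized = primary_replica_capacity(128, read_replicas=4, ops_per_core=1000)

print(no_replicas)
print(four_replicas)
print(oversized)
```

In this model, adding replicas multiplies read capacity but leaves write capacity unchanged, and asking for more cores than the cap has no effect.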
CockroachDB, in contrast, doesn’t have those limits: you can add and remove as many nodes as you want, in as many regions as you want, and all nodes can serve both reads and writes. Scaling both up and down can be easily automated (and it is handled automatically when using CockroachDB serverless), which helps prevent “success disasters.”
Geolocation
While Google Cloud SQL does allow cross-region read replicas, those replicas cannot serve writes. Google Cloud SQL also does not allow for specific rows or tables within a database to be tied to specific regions – a practice that can reduce latency, and that is also often required for global businesses that must comply with data privacy and data sovereignty regulations.
CockroachDB, as previously mentioned, can serve writes from any node. It also allows for easy data homing; you can limit tables or even rows to a specific region using simple DDL statements. This can allow for a much lower-latency experience with global applications, and it also makes compliance with data privacy laws much easier by preventing any multi-region databases from accidentally replicating data outside of the locations where that data is legally required to stay.
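As a sketch of what those DDL statements can look like, the snippet below assembles a few of them as strings. The statement shapes follow CockroachDB’s multi-region SQL syntax as we understand it (ALTER DATABASE ... SET PRIMARY REGION / ADD REGION and ALTER TABLE ... SET LOCALITY); check the current documentation before relying on them. The database, table, and region names are hypothetical, as is the helper function itself, and the code only prints the statements rather than running them against a cluster.

```python
from typing import Dict, List, Sequence


def quote_ident(name: str) -> str:
    """Double-quote a SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def multi_region_ddl(
    database: str,
    primary_region: str,
    other_regions: Sequence[str],
    row_homed_tables: Sequence[str],
    table_homes: Dict[str, str],
) -> List[str]:
    """Build DDL that makes a database multi-region and homes its tables."""
    db = quote_ident(database)
    stmts = [f"ALTER DATABASE {db} SET PRIMARY REGION {quote_ident(primary_region)};"]
    for region in other_regions:
        stmts.append(f"ALTER DATABASE {db} ADD REGION {quote_ident(region)};")
    # Row-level homing: each row of these tables lives in its own home region.
    for table in row_homed_tables:
        stmts.append(f"ALTER TABLE {quote_ident(table)} SET LOCALITY REGIONAL BY ROW;")
    # Table-level homing: the whole table is pinned to one region.
    for table, region in table_homes.items():
        stmts.append(
            f"ALTER TABLE {quote_ident(table)} "
            f"SET LOCALITY REGIONAL BY TABLE IN {quote_ident(region)};"
        )
    return stmts


# Hypothetical names, for illustration only.
statements = multi_region_ddl(
    database="app",
    primary_region="us-east1",
    other_regions=["europe-west1"],
    row_homed_tables=["users"],
    table_homes={"eu_invoices": "europe-west1"},
)
for stmt in statements:
    print(stmt)
```

Running statements like these for real assumes a cluster whose nodes were started with matching region localities.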
What do you really need in a cloud SQL database?
The examples discussed above are just a few samples of the differences between cloud SQL in the context of services like Google Cloud SQL (i.e., traditional RDBMSs run as managed cloud databases) and cloud SQL in the context of distributed SQL databases like CockroachDB (i.e., modern RDBMSs that were built to be distributed and cloud-native from the beginning).
This is not to say that CockroachDB is better than Google Cloud SQL. The truth is that there are valid use cases for both systems. CockroachDB is designed and best suited for mission-critical database applications at enterprise scale – that’s why you see it in the architectures of companies like DoorDash and Netflix. Cloud SQL is best suited for relational workloads that don’t require that level of scale or availability, and for those workloads it can be the right choice.
Not sure what’s right for your use case? Book a call to find out how CockroachDB can help your application scale when others fail.