Going into 2024, we know two things to be true: the only constant is change. And change is coming at us with greater speed and intensity than ever before.
CTOs, architects and platform engineers at large companies know this better than anyone. Global data volumes are increasing by a staggering 60 to 70% each year. Regulations dictating exactly how this data must be protected, stored, and shared are also proliferating globally (at the risk of hefty fines for violating them). Companies are straining to keep up with skyrocketing data workloads — especially those for whom the data IS the business.
Meanwhile, enterprises rely more than ever on cloud services to keep their mission-critical applications alive and available for users. Unfortunately, public cloud outages are becoming more frequent: over 60% of companies reported losses in 2022 due to infrastructure outages. Which are also becoming more expensive: the average cost per hour of downtime has risen to $365,000. It’s no wonder that many IT leaders identify infrastructure outages as the most likely source of serious disruption for their organization.
To meet these risks head on, architects, operators, and technical decision makers are improving the reliability and resilience of existing infrastructure, and/or modernizing their enterprise architectures. CockroachDB 23.2 introduces new tools and capabilities to expedite both strategies: a live database migration service, enhanced disaster recovery capabilities, and increased performance through smarter data distribution.
Other new 23.2 features include simplified migrations with read committed isolation; physical replication for fast recovery from cloud failures; and added support for Stored Procedures/UDFs.
Here are three ways this latest version of CockroachDB helps make day-to-day life a little better and easier for users — and also gives companies on legacy databases a clear and supported path for getting off them.
Flexible data movement for real-world database migrations
Database migrations are notoriously difficult. Gartner compares the process to a heart transplant, due to both complexity and the database’s fundamentally interconnected role at the center of IT infrastructure. A mistake at any step can be costly, requiring either a complex rollback process or post-migration fixes.
Just as no two heart surgeries are the same, different applications and different business requirements demand different migration strategies. Because building the tooling to conquer the tradeoff between downtime, latency impact, and data integrity is beyond the skill set (and budget) of many teams, CockroachDB 23.2 now offers a full suite of migration tooling called MOLT: Migrating Off Legacy Technologies.
MOLT is designed to help you de-risk, execute, and validate your migration — in other words, to make your database transitions as smooth as possible. Our migration tooling supports the common elements of migration workflow, like schema conversion with MOLT SCT, and post-migration verification with MOLT VERIFY. CockroachDB 23.2 now introduces MOLT Fetch for preview support for data migration, and a powerful tool for doing live migrations: the MOLT Live Migration Service.
MOLT Live Migration Service
MOLT Live Migration Service is a locally hosted, horizontally scalable proxy that routes traffic between an application, a source database, and a target CockroachDB database to configure, test, and validate a migration before going live. Live migration offers three cutover strategies to suit your specific needs:
Offline Cutover.
Ideal for scenarios where application downtime is acceptable; a consistent and complete copy of the source database is ready to handle application workload.
Consistent Cutover.
This method keeps applications functional during the cutover, but comes at the cost of some latency increase.
Immediate Cutover.
Achieve a zero downtime migration with no latency increase, though some post-migration consistency adjustments may be required.
In this current release we support MySQL, Postgres, and CockroachDB as sources. The CockroachDB source simplifies self-hosted to dedicated/CockroachDBaaS migrations, as well as CockroachDB dedicated to CockroachDB serverless migrations — or vice versa!
To see MOLT Live Migration Service in action, check out this video demonstration of a live database cutover using each of the three different approaches.
Streamline application migrations with enhanced SQL compatibility
Architects often grapple with two challenges when migrating to CockroachDB: stored procedures and serializable isolation.
Stored procedures
In CockroachDB 23.2, we rebuilt stored procedures from the ground up to be significantly more powerful — and make it even easier for developers to migrate existing code. In addition to support for user-defined functions (UDFs), CockroachDB now supports stored procedures using SQL or PL/pgSQL, harnessing our distributed execution engine to process large amounts of transactional data in parallel.
Read committed isolation (preview)
23.2 also now provides preview support for read committed, the default isolation level provided by Postgres (including Aurora), Oracle, SQL Server, etc. This compatibility in transaction isolation allows developers to migrate high-concurrency Postgres applications without having to add additional error checking to handle the retries that are more common in higher isolation levels.
Read committed isolation also provides more predictable performance at high contention. Since reads no longer need to block on pending writes — so migrating applications achieve the expected performance.
Other Postgres compatibility improvements
We’ve also added more support for Postgres compatibility, including entity framework and column-level encryption.
Improve the resilience of your existing infrastructure
Enterprises with exacting resilience requirements often demand regional survivability. CockroachDB easily handles regional failures with zero RPO (Recovery Point Objective) and near-zero RTO (Recovery Time Objective) by replicating data across 3 regions, while consistently serving traffic from each location.
However, earlier generations of databases were fundamentally built around a legacy failover strategy. We have found that many organizations only have two regions in their environment and, until now, applications with limited regional distribution were not able to get the full benefit of CRDB’s high resilience features.
Even customers running CockroachDB in three or more multi-region environments, though, gain strategic benefit from a defense-in-depth approach: avoiding potential human errors that can take down an entire cluster.
New in 23.2, CockroachDB physical cluster replication solves for both of these scenarios by continuously sending all data at the byte level from a primary cluster to an independent standby cluster.
Physical cluster replication
For organizations that are built around two data centers (with no access to a third on-prem or cloud region), physical cluster replication allows them to survive full cluster failures — with RPO and RTO measured in seconds, rather than minutes.
This works by creating an exact, constantly updating replica of a primary cluster, while providing granular observability and control over data replication and cutover times. Data integrity checks ensure your recovery will support your business continuity goals. Because this replication is asynchronous, you can significantly improve your resilience without incurring a cross-region write-latency hit, which can push certain applications beyond their latency budgets.
Physical cluster replication is extremely flexible, and works across different hardware configurations and can extend to multiple standby clusters, and even multiple clouds.
Enhanced performance and resilience
The 23.2 release also improves CockroachDB’s resilience with improved load balancing, faster recovery from failures, and new security features. Enhanced performance and expanded CDC sinks for Confluent and Azure make it an all-encompassing update.
Ultimately CockroachDB 23.2 is all about resilience. Resilience in the technical sense, of course, including physical cluster replication, fast recovery from cloud failures, and enhanced disaster recovery capabilities. But 23.2 is just as much about operational and business resilience.
Global data volumes, customer expectations, and competitive pressures all continue to relentlessly escalate. Zero downtime for mission-critical systems is more crucial than ever — which is why the business case for modernizing enterprise application architectures has never been stronger. That’s why 23.2 zeroes in on key capabilities, such as stored procedures, MOLT Live Migration Service, and read committed isolation, that support and streamline the labor of infrastructure modernization.
This latest release of CockroachDB makes it easier for companies to move off legacy databases, feeling confident that migrating has been de-risked and their mission-critical workloads are safe and fully supported. For a deep dive into how this works, as well as all of CockroachDB 23.2’s many other new capabilities, be sure to check out the CockroachDB 23.2 release notes.