Blog
Product
Vectorizing the merge joiner in CockroachDB
Everybody loves a fast query. So how can we make the best use of the existing information to make joins on sorted data faster? The answer lies in vectorizing the merge join operator. Today we’ll be looking into what a merge joiner is (or what it used to be), followed by what vectorization means and how it changes the problem, and ending with how we decided to make the merge join operator faster and what this means for your queries.
George Utsin
June 18, 2019
Performance
Automatic table statistics in CockroachDB
Last year, we rebuilt our cost-based optimizer from scratch for CockroachDB’s 2.1 release. We’ve been continuing to improve the optimizer since then, and we’ve added a number of new features for the CockroachDB 19.1 release. One of the new features is automatic collection of table statistics. Automatic statistics enables the optimizer to make better decisions when choosing query plans.
Rebecca Taft
May 9, 2019
Product
Introducing CockroachDB 19.1
It’s been a little over four years since we started our mission to deliver an enterprise-ready distributed SQL database. Today, we’re excited to release CockroachDB 19.1. With this release, we enhanced distributed SQL capabilities, and expanded upon enterprise-grade features for security and data integrations. 19.1 continues to solve the challenge of complex, distributed SQL while meeting all the “enterprise” requirements that are expected of a database. Here’s Nate Stewart, our VP of Product, with a quick intro on what you can expect in CockroachDB 19.1. And for a deeper tutorial with Nate, register for our CockroachDB 19.1 webinar.
Performance
Why are my Go executable files so large?
This blog post was originally published on the author's personal blog. I built some tooling to extract details about the contents of a Go executable file, and a small D3 application to visualize this information interactively as zoomable tree maps. Here’s a static screenshot of how the app illustrates the size of the compiled code, in this example for a group of modules in CockroachDB:
Raphael Kena Poss
April 18, 2019
GDPR & Data Regulations
Where is data regulatory compliance worth the cost?
In 2016, LinkedIn chose not to comply with Russia's requirement for data to be stored locally. As a result, they were kindly blocked from doing business in the country. Facebook and Twitter, on the other hand, both decided that compliance in Russia is worth the effort. Neither has fully met Russia's requirements but they have shown enough progress to avoid being blocked.
Dan Kelly
March 19, 2019
System
The future of data protection law
GDPR went into effect less than a year ago. And still, the era of conducting global business with limited legislative obstructions already feels like some free-spirited, far-away past. Right now the global landscape of data protection law is littered with obstacles and exceptions. GDPR has been the loudest, but there are plenty of other regions and countries with regulations in place. Even within the E.U., countries like Germany and Switzerland have their own unique protection regulations. Russia and China have draconian laws, and they're changing quickly. There are around 120 countries now with data protection laws in place.
Spencer Kimball
February 26, 2019
Product
Why we're switching to calendar versioning
One small step for Cockroach Labs, one giant leap for our release numbering. Since our initial launch, Cockroach Labs has used semantic versioning in our release cycle guidelines. Two years, one major release, and n-patch fixes later, we're making the switch to Calendar Versioning. This means subscribers to our release notes will see quite the jump in today's version numbering, from last week's 2.1.5 to today's 19.1 beta.
Peter Mattis
February 25, 2019
Testing
Lessons learned from 2+ years of nightly Jepsen tests
Since the pre-1.0 betas of CockroachDB, we've been using Jepsen to test the correctness of the database in the presence of failures. We have re-run these tests every night as a part of our nightly test suite. Last fall, these tests found their first post-release bug. This blog post is a more digestible walkthrough of that discovery (many of the links here point to specific comments in that issue's thread to highlight the most important moments).
Ben Darnell
February 21, 2019
System
Introducing the High Availability Architecture Guide (CockroachDB vs. Oracle)
Which is worse: one of your users goes to check her bank balance in your app and the service is down, or one of your users goes to check her bank balance in your app and there's a data inconsistency? Engineers are frequently faced with this false tradeoff: do you place a higher premium on data correctness, or high availability? This problem only becomes more complicated when you begin dealing with users distributed across broad geographies. When IT experts consider high availability infrastructure for mission-critical services, their minds often leap to Oracle as the preeminent service provider. But Oracle's database was designed in a pre-cloud world, and the means by which it achieves high availability on geo-distributed workloads are complex. Oracle requires a staggering number of technologies that must be implemented, and still, their solutions can allow potentially costly anomalies into your data. As a cloud-native database, CockroachDB introduces a new way of providing always-on availability, strong data consistency, and distributed performance. Today, we're releasing a side-by-side comparison of CockroachDB and Oracle to help you get a better understanding of the architecture (and cost) of setting up a highly available distributed service.
Charlotte Dillon
February 12, 2019