In exploring ways to improve transaction throughput performance for CockroachDB we have introduced buffered writes, in preview. Before buffered writes, each step that included a read or write would read or write from the leaseholder. Since the leaseholder is not always local, this communication could incur network round trips and serial execution of each statement.
Buffered writes introduces a new step in the transaction flow, which temporarily stores transaction writes on the KV-client side until the transaction commits. This approach minimizes the number of round-trips to the leaseholders, reduces pipeline stalls, and allows for the passive use of the 1-phase commit fast-path. By deferring writes until commit time, the system can reduce redundant writes and serve read-your-writes locally, resulting in performance gains.
Buffered writes benefits
Buffering writes client-side until commit has four main benefits:
It allows for batching of writes. Instead of sending writes one-at-a-time, we can batch them up at the KV-client layer and send them all at once. This is a win even if writes would otherwise be pipelined through consensus.
It allows for the elimination of redundant writes. If a client writes to the same key multiple times in a transaction, only the last write needs to be written to the key-value layer.
It allows the client to serve read-your-writes locally, which can be much faster and cheaper than sending them to the leaseholder. This is especially true when the leaseholder is not collocated with the client. By serving read-your-writes locally from the gateway, write buffering also avoids the problem of pipeline stalls that can occur when a client reads a pipelined intent write before the write has completed consensus.
It allows clients to passively hit the 1-phase commit fast-path. This is better than requiring clients to carefully construct “auto-commit” BatchRequests to make use of the optimization. By buffering writes on the client before commit, we avoid immediately disabling the fast-path when the client issues their first write. Then, at commit time, we flush the buffer and hit the 1-phase commit fast path if all writes go to the same range.
Buffered writes thus improve communications while keeping transactional ACID compliance.
How buffered writes works
The initial read
reads from the leaseholder, while subsequent writes
write to a local buffer on the “gateway node.” Writing to the local buffer avoids all network round trips after the first. Additionally, any subsequent reads can be read from the buffer, again reducing the number of round trips and improving latency. We’ll look at an example later to see how this works.
At COMMIT time, writes are flushed from the buffer and the transaction is only considered committed if all those writes are ack-ed.
Buffered writes help for particular situations where there would otherwise be lots of network hops to complete the SQL statement. In particular, they help:
For multi-statement transactions; the more statements, the more network traffic that gets eliminated by staying local
When you read your own writes; each read is served from the local buffer completely eliminating the network back-and-forth to serve information that started out locally
When rows are modified multiple times in a single transaction; the row modifications are essentially done all at once and then sent over the network in a final version
A schema that can take advantage of parallel commits
An example
Let’s take a look at an example.
Look at the SQL statement:
BEGIN;
Select from key1 on NodeB (check balance)
Update to key1 on NodeB (draw down the balance)
Update to key2 on NodeC (add to the balance)
Write to key3 on NodeA
Select final balance from key1 on NodeB
COMMIT;
Before buffered writes, each step above that included a read or write would read or write from the leaseholder (NodeB), calling for network round trips and serial execution. You can see in the SQL statement - steps 1 and 2 require communication to NodeB, then to NodeC, then to NodeB again. In total, 4 sequential round trips.
After buffered writes, steps 2, 3, and 4 (the writes to key1, key2, and key3) get put in a buffer on the gateway node (NodeA). The read in step 5 can be read locally from the buffer, only the read in step 1 reads from the leaseholder on NodeB, until the transaction is ready to be committed.
These transactions remain fully ACID compliant, but with dramatically reduced network chatter and latency.
Next Steps
While in Preview, buffered writes are default off. To turn buffered writes on, set cluster setting kv.transaction.write_buffering.enabled = true
. There are some things to watch out for, and definitely work with support or your field team to make sure that your application is appropriate as there are a few caveats you’d want to keep an eye out for. Read more here for further information on buffered writes.
And if you’re testing performance on CockroachDB, we strongly encourage you to use 25.2 with buffered writes turned on to see the best results.