I finally got around to watching the SyncFree consortium’s talk about
their work on CRDTs at
RICON 2014. Or at least half of it, rather, and I was really struck
by their Bounded CRDTshttp://christophermeiklejohn.com/crdt/2014/07/22/readings-in-crdts.html
idea. The idea of a bounded CRDT is
this: using a predefined invariant, you allow a certain count of
“interactions” per replica. They are likened to leases or
reservations in the talk. I like reservations better:
(edit 2023: I don’t have these drawings anymore, sorry)
Here, in an over-simplified example, we have a counter. This counter
changes with increments of 1, i.e. counter++
or counter--
. We
have set an invariant of cannot be < 0
, and we use this invariant to
make a guarantee. If each replica can only make 1 operation against
the current known value (4), then we will never violate the invariant.
However, should a second operation be performed on any replica, we are
no longer convergent: the result of this would have counter’s value
equal to -1:
(edit 2023: I don’t have these drawings anymore, sorry)
This is my primitive understanding, or my assumption, after watching 20 minutes of the RICON talk on this today. What I’m most interested in is the bound that comes from the reservation count. This is the third time I’ll bring this paper up on this blog, which I can accept – maybe I’m a bit obsessed. The Peter Balis, et al. paper on Coordination Avoidance. This paper, from a 10,000ft perspective, says only coordinate when you have to. Define invariants, coordinate when one is violated.
It looks to me that a bounded CRDT is using this same concept, but using the invariant to avoid conflicts.
As a background process, the “Reservation Manager” above doles out reservations, allowing each replica to operate on an object N times. When the Nth operation is reached, consensus is needed. If you are clever enough, and your strongly consistent Reservation Manager service can stay far enough ahead, you never have to sync.
Here comes some math
Mathematically, CRDTs are all about a join semilattice, and so I’m about to attempt to surmise what the lattice is in the context of a bounded CRDT.
Reservations is a semilattice \((R,\ \lor)\). The current operation, if operations are \(O \subseteq R\) is \(R \lor O\). That is, the Least Upper Bound of \(R\) is the current Operation Count. This meets semilattice requirements of commutativity, associativity, and idempotency in that with each operation \(⊤O\) will always be equal to some element in \(R\) so long as \(R\) remains unbounded. Therefore \(R \lor O\) holds for any \(O \subseteq R\). Like I said, if you are clever enough you can keep your Reservation Manager far enough ahead.
This CRDT concept is a bit more complicated than the basic commutative
or monotonic models, but is still farily easy to understand
conceptually. I’m very interested in what comes next from the
SyncFree humans, namely the adaptive stuff, which is used to reduce
replicas – which decreases the LUB, which makes pre-determining
reservations easier, which is :thumbs_up:
.