openEngiadina wiki

Commutative Replicated Data Types

Data types such as containers have operations to access and change the content data type. These operations however do not always commute - the order in which the operations are applied is important for the final state of the data type.

For example consider a set from which we can add and remove elements. These operations do not commute:

(xvxx! = (x x)vx

This is a problem in distributed settings as the order in which operations are applied can not be guaranteed to be same everywhere.

The solution is to use data types whose operations commute. These data types are called Commutative Replicated Data Types.

An excellent overview of CRDTs and rigorous formalization can be found here: A comprehensive study of Convergent and Commutative Replicated Data Types (2011).

Some useful CRDTs

Last-Writer-Win Register

Creates a total order on updates by associating a time stamp with every update.

For multiple agents this requires somewhat coordinated clocks which might be seen as a form of weak coordination.

A malicious agent can impose a value by associating a fake high time stamp to an update.

Nevertheless this seems like a very useful data type for things that can be changed by a single agent or a group of trusted and cooperating agents (e.g user profile or description of group).

Observed-Remove Set

Generate unique ids for every add operation and store elements of set together with id of the add operation. Delete operations removes element only for a specific id.

SU-Set

Extension of OR-Set to be able to handle all operations defined in the SPARQL-Update specification.

Allows addition of entire sets of elements (instead of a single element per operations) and handles the SPARQL insert-delete operation.

See the paper: Synchronizing semantic stores with commutative replicated data types (2012)

Irmin

Irmin is an OCaml library for building data stores on CRDTs.

See also Mergeable persistent data structures (2015).