This post announces the new mergeful library. In this second part, we describe how mergeful can help a server and its clients agree on zero or one value with safe merge conflicts.
Why
Extending the mergeful approach from agreeing on a single value to agreeing on a collection of values is not straightforward. Instead of making that jump immediately, we take the smaller step of agreeing on zero or one value. This step is important because a deleted item looks like the 'zero' case. More about this in the next blog post.
How
Pieces of the puzzle
The workflow for syncing an item is very similar to the workflow for syncing a value. The only difference is that there is no separate initial request necessary. For reference, here is the entire flow again:
A central server and one or more clients want to cooperatively agree on zero or one value of type a. Clients store a ClientItem a value and the server stores a ServerItem a value. Clients regularly produce an ItemSyncRequest a value from their ClientItem a value using a makeItemSyncRequest :: ClientItem a -> ItemSyncRequest a function.
When the server receives an ItemSyncRequest a value, it uses its ServerItem a and a processServerItemSync :: ServerItem a -> ItemSyncRequest a -> (ItemSyncResponse a, ServerItem a) function to produce an ItemSyncResponse a and a new ServerItem a. It then stores the new ServerItem a and sends the ItemSyncResponse a back to the client.
When the client receives the ItemSyncResponse a, it uses a mergeItemSyncResponse :: ClientItem a -> ItemSyncResponse a -> ClientItem a function to update its ClientItem a to reflect the new synchronised state.
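The flow described above can be sketched as a single function. This is only an illustration: syncRoundTrip is a hypothetical name that is not part of mergeful, and the three functions are passed in as parameters so that the sketch does not assume anything about how they are implemented.

```haskell
-- One full synchronisation round trip, parameterised over the three
-- functions from the text. 'syncRoundTrip' is a hypothetical helper,
-- not a function exported by mergeful.
syncRoundTrip
  :: (client -> request)                        -- e.g. makeItemSyncRequest
  -> (server -> request -> (response, server))  -- e.g. processServerItemSync
  -> (client -> response -> client)             -- e.g. mergeItemSyncResponse
  -> client
  -> server
  -> (client, server)
syncRoundTrip makeRequest processRequest mergeResponse client server =
  let -- The client produces a request from its current state.
      request = makeRequest client
      -- The server answers and computes the state it will store.
      (response, server') = processRequest server request
      -- The client merges the response into its own state.
      client' = mergeResponse client response
   in (client', server')
```

In a real deployment the request and response travel over the network and the server persists its new state between the second and third step; the pure function only shows the order in which the pieces fit together.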
The following diagram should help:
More about the particulars of these types and functions later.
Rejected ideas
One could be tempted to implement the 'item' case in terms of the 'value' case. You could argue that a ClientItem a should just be the same as a ClientValue (Maybe a) and that does make sense. However, there are two unfortunate consequences of such an approach.
The first is that the type system over-estimates the possible cases. For example, MergeConflict Nothing Nothing would be a valid value according to the types, while we already statically know that it should not be possible.
The second problem is that, in this approach, there is no difference between an added item and a modified item. Indeed, by a Just v value alone, you cannot tell whether this value changed from Just u to Just v or from Nothing to Just v. We would like to be able to make this distinction.
The mergeful solution
We will keep the overall approach of value synchronisation, and the ServerTime and Timed types. They were useful.
The first change we need to make is that the ClientItem type gets more constructors:
data ClientItem a
  -- | There is no item on the client side
  = ClientEmpty
  -- | There is an item but the server is not aware of it yet.
  | ClientAdded !a
  -- | There is an item and it has been synced with the server.
  | ClientItemSynced !(Timed a)
  -- | There is an item and it has been synced with the server, but it has since been modified.
  | ClientItemSyncedButChanged !(Timed a)
  -- | There was an item, and it has been deleted locally, but the server has not been made aware of this.
  | ClientDeleted !ServerTime
Note that, because of the ClientAdded constructor, we can see the difference between an added and a modified item.
The ItemSyncRequest type is expanded in the same way. Again, the only difference is that there is no need to send over a synced value if it has not been modified.
data ItemSyncRequest a
  -- | There is no item locally
  = ItemSyncRequestPoll
  -- | There is an item locally that hasn't been synced to the server yet.
  | ItemSyncRequestNew !a
  -- | There is an item locally that was synced at the given 'ServerTime'
  | ItemSyncRequestKnown !ServerTime
  -- | There is an item locally that was synced at the given 'ServerTime'
  -- but it has been changed since then.
  | ItemSyncRequestKnownButChanged !(Timed a)
  -- | There was an item locally that has been deleted but the
  -- deletion wasn't synced to the server yet.
  | ItemSyncRequestDeletedLocally !ServerTime
The ServerItem type also needs to be a tad bigger:
data ServerItem a
  = ServerEmpty
  | ServerFull !(Timed a)
Note that ServerItem a is very similar to ServerValue (Maybe a).
So far so good, but now comes the complex part. There are 10 or 11 possible situations when it comes to an ItemSyncResponse. The server and the client could be in sync, and you can split this situation up into whether they were in sync on an empty value or on a full value. Those are one or two scenarios, depending on whether you split them up. Both the server and the client separately could have caused one of these three transitions: an addition, a change or a deletion. Those are six more scenarios. Lastly, there are three possible conflicts: the client and the server could have made conflicting modifications, or one of them could have deleted the item while the other changed it.
data ItemSyncResponse a
  -- | The client and server are fully in sync, and both empty
  --
  -- Nothing needs to be done at the client side.
  = ItemSyncResponseInSyncEmpty
  -- | The client and server are fully in sync.
  --
  -- Nothing needs to be done at the client side.
  | ItemSyncResponseInSyncFull
  -- | The client added an item and the server has successfully been made aware of that.
  --
  -- The client needs to update its server time
  | ItemSyncResponseClientAdded !ServerTime
  -- | The client changed an item and the server has successfully been made aware of that.
  --
  -- The client needs to update its server time
  | ItemSyncResponseClientChanged !ServerTime
  -- | The client deleted an item and the server has successfully been made aware of that.
  --
  -- Nothing needs to be done at the client side.
  | ItemSyncResponseClientDeleted
  -- | This item has been added on the server side
  --
  -- The client should add it too.
  | ItemSyncResponseServerAdded !(Timed a)
  -- | This item has been modified on the server side.
  --
  -- The client should modify it too.
  | ItemSyncResponseServerChanged !(Timed a)
  -- | The item was deleted on the server side
  --
  -- The client should delete it too.
  | ItemSyncResponseServerDeleted
  -- | A conflict occurred.
  --
  -- The server and the client both have an item, but it is different.
  -- The server kept its part, the client can either take whatever the server gave them
  -- or deal with the conflict somehow, and then try to re-sync.
  | ItemSyncResponseConflict !(Timed a) -- ^ The item at the server side
  -- | A conflict occurred.
  --
  -- The server has an item but the client does not.
  -- The server kept its part, the client can either take whatever the server gave them
  -- or deal with the conflict somehow, and then try to re-sync.
  | ItemSyncResponseConflictClientDeleted !(Timed a) -- ^ The item at the server side
  -- | A conflict occurred.
  --
  -- The client has a (modified) item but the server does not have any item.
  -- The server left its item deleted, the client can either delete its item too
  -- or deal with the conflict somehow, and then try to re-sync.
  | ItemSyncResponseConflictServerDeleted
Actual synchronization
It is important to realise that when I was first writing this library, I did not work through the problem in the same order as you are now reading through the solution. But once the types are firmly in place, the following synchronisation functions should be relatively straightforward to write.
The function to make a sync request is relatively simple.
makeItemSyncRequest :: ClientItem a -> ItemSyncRequest a
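For readers who want a starting point, here is a minimal sketch of what such a function could look like. It is not necessarily the library's implementation. The ServerTime and Timed definitions below are simplified stand-ins whose shapes are assumed, included only so that the snippet compiles on its own.

```haskell
import Data.Word (Word64)

-- Simplified stand-ins for the library's 'ServerTime' and 'Timed' types
-- (assumed shapes, for illustration only).
newtype ServerTime = ServerTime Word64 deriving (Show, Eq)
data Timed a = Timed a ServerTime deriving (Show, Eq)

data ClientItem a
  = ClientEmpty
  | ClientAdded !a
  | ClientItemSynced !(Timed a)
  | ClientItemSyncedButChanged !(Timed a)
  | ClientDeleted !ServerTime
  deriving (Show, Eq)

data ItemSyncRequest a
  = ItemSyncRequestPoll
  | ItemSyncRequestNew !a
  | ItemSyncRequestKnown !ServerTime
  | ItemSyncRequestKnownButChanged !(Timed a)
  | ItemSyncRequestDeletedLocally !ServerTime
  deriving (Show, Eq)

-- Each client constructor maps onto exactly one request constructor.
makeItemSyncRequest :: ClientItem a -> ItemSyncRequest a
makeItemSyncRequest ci = case ci of
  ClientEmpty -> ItemSyncRequestPoll
  ClientAdded a -> ItemSyncRequestNew a
  -- An unmodified synced item only sends its server time, not its value.
  ClientItemSynced (Timed _ st) -> ItemSyncRequestKnown st
  ClientItemSyncedButChanged t -> ItemSyncRequestKnownButChanged t
  ClientDeleted st -> ItemSyncRequestDeletedLocally st
```

The only case that drops information is ClientItemSynced, which matches the earlier remark that a synced value does not need to be sent again if it has not been modified.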
The function to process a sync request is inherently complex, but using literate programming, it is straightforward to work through.
processServerItemSync :: ServerItem a
-> ItemSyncRequest a
-> (ItemSyncResponse a, ServerItem a)
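As a rough illustration of the case analysis involved, the following sketch handles every combination of server state and request. It is written from the descriptions in this post and is not necessarily how the library does it. ServerTime, Timed and initialServerTime are simplified stand-ins with assumed shapes, nextServerTime is a hypothetical helper, and treating any mismatch in server times as "the server has moved on" is a simplifying assumption.

```haskell
import Data.Word (Word64)

-- Simplified stand-ins for the library's types (assumed shapes).
newtype ServerTime = ServerTime Word64 deriving (Show, Eq)
data Timed a = Timed a ServerTime deriving (Show, Eq)

initialServerTime :: ServerTime
initialServerTime = ServerTime 0

-- Hypothetical helper: the server time to use after a change.
nextServerTime :: ServerTime -> ServerTime
nextServerTime (ServerTime w) = ServerTime (w + 1)

data ServerItem a = ServerEmpty | ServerFull !(Timed a) deriving (Show, Eq)

data ItemSyncRequest a
  = ItemSyncRequestPoll
  | ItemSyncRequestNew !a
  | ItemSyncRequestKnown !ServerTime
  | ItemSyncRequestKnownButChanged !(Timed a)
  | ItemSyncRequestDeletedLocally !ServerTime
  deriving (Show, Eq)

data ItemSyncResponse a
  = ItemSyncResponseInSyncEmpty
  | ItemSyncResponseInSyncFull
  | ItemSyncResponseClientAdded !ServerTime
  | ItemSyncResponseClientChanged !ServerTime
  | ItemSyncResponseClientDeleted
  | ItemSyncResponseServerAdded !(Timed a)
  | ItemSyncResponseServerChanged !(Timed a)
  | ItemSyncResponseServerDeleted
  | ItemSyncResponseConflict !(Timed a)
  | ItemSyncResponseConflictClientDeleted !(Timed a)
  | ItemSyncResponseConflictServerDeleted
  deriving (Show, Eq)

processServerItemSync :: ServerItem a -> ItemSyncRequest a -> (ItemSyncResponse a, ServerItem a)
processServerItemSync store request = case store of
  ServerEmpty -> case request of
    ItemSyncRequestPoll -> (ItemSyncResponseInSyncEmpty, store)
    ItemSyncRequestNew a ->
      (ItemSyncResponseClientAdded initialServerTime, ServerFull (Timed a initialServerTime))
    -- The client knows an item that the server no longer has: it was deleted.
    ItemSyncRequestKnown _ -> (ItemSyncResponseServerDeleted, store)
    ItemSyncRequestKnownButChanged _ -> (ItemSyncResponseConflictServerDeleted, store)
    -- Both sides agree that the item is gone.
    ItemSyncRequestDeletedLocally _ -> (ItemSyncResponseClientDeleted, store)
  ServerFull current@(Timed _ serverTime) -> case request of
    ItemSyncRequestPoll -> (ItemSyncResponseServerAdded current, store)
    ItemSyncRequestNew _ -> (ItemSyncResponseConflict current, store)
    ItemSyncRequestKnown clientTime
      | clientTime == serverTime -> (ItemSyncResponseInSyncFull, store)
      | otherwise -> (ItemSyncResponseServerChanged current, store)
    ItemSyncRequestKnownButChanged (Timed a clientTime)
      | clientTime == serverTime ->
          let t = nextServerTime serverTime
           in (ItemSyncResponseClientChanged t, ServerFull (Timed a t))
      | otherwise -> (ItemSyncResponseConflict current, store)
    ItemSyncRequestDeletedLocally clientTime
      | clientTime == serverTime -> (ItemSyncResponseClientDeleted, ServerEmpty)
      | otherwise -> (ItemSyncResponseConflictClientDeleted current, store)
```

Note that in this sketch the server state only changes in three places: a client addition, a client change and a client deletion. In every conflict the server keeps what it has, as described in the comments on the ItemSyncResponse constructors.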
The mergeItemSyncResponse function, similar to the mergeValueSyncResponse function, is not complex per se, but again requires the programmer to either make a choice as to what will happen in specific situations or to make it a bit more complex and general.
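To illustrate one possible choice, the sketch below merges a response into the client state and simply keeps the client's state untouched whenever there is a conflict or the response does not match the client's state. The name mergeIgnoringConflicts is hypothetical, the ServerTime and Timed definitions are simplified stand-ins with assumed shapes, and the library's own merge functions may behave differently.

```haskell
import Data.Word (Word64)

-- Simplified stand-ins for the library's types (assumed shapes).
newtype ServerTime = ServerTime Word64 deriving (Show, Eq)
data Timed a = Timed a ServerTime deriving (Show, Eq)

data ClientItem a
  = ClientEmpty
  | ClientAdded !a
  | ClientItemSynced !(Timed a)
  | ClientItemSyncedButChanged !(Timed a)
  | ClientDeleted !ServerTime
  deriving (Show, Eq)

data ItemSyncResponse a
  = ItemSyncResponseInSyncEmpty
  | ItemSyncResponseInSyncFull
  | ItemSyncResponseClientAdded !ServerTime
  | ItemSyncResponseClientChanged !ServerTime
  | ItemSyncResponseClientDeleted
  | ItemSyncResponseServerAdded !(Timed a)
  | ItemSyncResponseServerChanged !(Timed a)
  | ItemSyncResponseServerDeleted
  | ItemSyncResponseConflict !(Timed a)
  | ItemSyncResponseConflictClientDeleted !(Timed a)
  | ItemSyncResponseConflictServerDeleted
  deriving (Show, Eq)

-- Hypothetical merge function: apply the response when it fits the
-- client's state, and leave the client as it was otherwise.
mergeIgnoringConflicts :: ClientItem a -> ItemSyncResponse a -> ClientItem a
mergeIgnoringConflicts client response = case (client, response) of
  (ClientEmpty, ItemSyncResponseInSyncEmpty) -> client
  (ClientItemSynced _, ItemSyncResponseInSyncFull) -> client
  -- The server acknowledged a client-side change: record the new server time.
  (ClientAdded a, ItemSyncResponseClientAdded st) -> ClientItemSynced (Timed a st)
  (ClientItemSyncedButChanged (Timed a _), ItemSyncResponseClientChanged st) ->
    ClientItemSynced (Timed a st)
  (ClientDeleted _, ItemSyncResponseClientDeleted) -> ClientEmpty
  -- The server reported its own change: take over the server's state.
  (ClientEmpty, ItemSyncResponseServerAdded t) -> ClientItemSynced t
  (ClientItemSynced _, ItemSyncResponseServerChanged t) -> ClientItemSynced t
  (ClientItemSynced _, ItemSyncResponseServerDeleted) -> ClientEmpty
  -- Conflicts and mismatched responses: keep the client's state unchanged.
  _ -> client
```

A more general variant could return a result type that tells the caller which conflict occurred, so that the caller can decide whether to take the server's item or keep the local one and re-sync.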
These three functions are left as an exercise to the reader. The solutions can be found in the source code and are well-documented.
Testing the implementation
This module would never have been possible without thorough testing, and in this blogpost I would like to focus more on the testing than on the implementation.
All of the tests are property tests, written using validity-based testing. We let all the generators be derived automatically; see for example the generators for ServerTime:
instance GenUnchecked ServerTime
instance GenValid ServerTime
There is no omitted code here. This is the full code for the generators. The GenUnchecked instance has a default implementation using the Generic instance of ServerTime, and the GenValid instance has a default implementation using the GenUnchecked and Validity instances. The shrinking functions are also automatically generated in a similar way.
For the actual testing, the standard producesValidsOnValids property combinator came in very handy:
spec :: Spec
spec = do
  describe "makeValueSyncRequest" $
    it "produces valid requests" $ producesValidsOnValids (makeValueSyncRequest @Int)
  describe "processServerValueSync" $
    it "produces valid responses and stores" $ producesValidsOnValids2 (processServerValueSync @Int)
Again, no code is omitted here. This is the entire definition of the test.
As with value syncing, we again used a nice idempotency property:
it "is idempotent with one client" $
  forAllValid $ \cstore1 ->
    forAllValid $ \sstore1 -> do
      let req1 = makeItemSyncRequest (cstore1 :: ClientItem Int)
          (resp1, sstore2) = processServerItemSync sstore1 req1
          cstore2 = mergeItemSyncResponseIgnoreProblems cstore1 resp1
          req2 = makeItemSyncRequest cstore2
          (resp2, sstore3) = processServerItemSync sstore2 req2
          cstore3 = mergeItemSyncResponseIgnoreProblems cstore2 resp2
      cstore2 `shouldBe` cstore3
      sstore2 `shouldBe` sstore3
Lastly, some custom tests came in handy for the specific cases for which we want to use the library. Here is one such example:
describe "syncing" $ do
  it "successfully syncs an addition across to a second client" $
    forAllValid $ \i -> do
      -- Client A has added an item 'i'.
      let cAstore1 = ClientAdded i
      -- Client B is empty.
      let cBstore1 = ClientEmpty
      -- The server is empty.
      let sstore1 = ServerEmpty
      -- Client A makes sync request 1.
      let req1 = makeItemSyncRequest cAstore1
      -- The server processes sync request 1.
      let (resp1, sstore2) = processServerItemSync @Int sstore1 req1
      let time = initialServerTime
      resp1 `shouldBe` ItemSyncResponseClientAdded time
      sstore2 `shouldBe` ServerFull (Timed i time)
      -- Client A merges the response.
      let cAstore2 = mergeItemSyncResponseIgnoreProblems cAstore1 resp1
      cAstore2 `shouldBe` ClientItemSynced (Timed i time)
      -- Client B makes sync request 2.
      let req2 = makeItemSyncRequest cBstore1
      -- The server processes sync request 2.
      let (resp2, sstore3) = processServerItemSync sstore2 req2
      resp2 `shouldBe` ItemSyncResponseServerAdded (Timed i time)
      sstore3 `shouldBe` ServerFull (Timed i time)
      -- Client B merges the response.
      let cBstore2 = mergeItemSyncResponseIgnoreProblems cBstore1 resp2
      cBstore2 `shouldBe` ClientItemSynced (Timed i time)
      -- Client A and Client B now have the same store.
      cAstore2 `shouldBe` cBstore2
References
The mergeful library is available on Hackage. Mergeful originated in the work on Smos, Intray and Tickler. This post is part of an effort to encourage contributions to Smos. The simplest contribution could be to just try out Smos and provide feedback on the experience. Smos is a purely functional semantic forest editor of a subset of YAML that is intended to replace Emacs' Org-mode for Getting Things Done.