Replication Control

2021-01-15


Concurrency control is about how to coordinate operations issued concurrently by multiple clients. Replication control is about how to handle operations when data is stored on multiple servers, with or without replication.

The effect of replication

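Replication greatly increases the availability of an object, from $(1 - f)$ to $(1 - f^k)$, where $f$ is the probability that a single server fails and $k$ is the number of replicas. This assumes that servers fail independently, so the object is unavailable only when all $k$ replicas are down at once.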
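To get a feel for the formula, here is a minimal sketch that evaluates it for a few replica counts. The function name `availability` and the example value `f = 0.1` are assumptions chosen for illustration, and the calculation assumes independent server failures.

```python
def availability(f: float, k: int) -> float:
    """Probability that at least one of k replicas is up.

    f is the failure probability of a single server; failures are
    assumed to be independent, so all k replicas are down with
    probability f ** k.
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be a probability in [0, 1]")
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1.0 - f ** k


if __name__ == "__main__":
    # Illustrative value only: each server is down 10% of the time.
    for k in (1, 2, 3):
        print(f"k={k}: availability={availability(0.1, k):.4f}")
```

Each extra replica multiplies the probability of total unavailability by $f$, which is why a small number of replicas already helps a lot.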

Challenges brought by replication

Replication transparency: clients should not be aware that multiple replicas exist.
Replication consistency: all clients see a single consistent copy of the data.

To provide replication transparency, clients communicate through an intermediate layer of front ends, so a client never deals with the replicas directly.

To provide replication consistency, there are two approaches:

Active replication: treats all replicas identically. When a client issues an update, it is multicast (more details to be discussed) to all replicas, and every replica applies it.
Passive replication: uses a primary replica. An update is applied only to the primary replica, which then multicasts it to the other replicas so that they update too.
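The difference between the two approaches can be sketched in a few lines. This is a toy in-memory model, not a real system: the class names (`Replica`, `ActiveFrontEnd`, `Primary`, `PassiveFrontEnd`) are hypothetical, and a plain loop stands in for the multicast, so ordering guarantees and failures are not modeled.

```python
class Replica:
    """A replica holding a simple key-value store."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def apply(self, key, value):
        self.store[key] = value


class ActiveFrontEnd:
    """Active replication: every update goes to all replicas alike."""

    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value):
        # Stand-in for a multicast to all replicas.
        for replica in self.replicas:
            replica.apply(key, value)


class Primary(Replica):
    """Passive replication: the primary applies, then forwards."""

    def __init__(self, name, backups):
        super().__init__(name)
        self.backups = backups

    def write(self, key, value):
        self.apply(key, value)
        # Stand-in for the primary's multicast to the other replicas.
        for backup in self.backups:
            backup.apply(key, value)


class PassiveFrontEnd:
    """The front end only ever talks to the primary."""

    def __init__(self, primary):
        self.primary = primary

    def write(self, key, value):
        self.primary.write(key, value)
```

In both cases the client calls `write` on a front end and never sees the replicas, which is the transparency property described above.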

A replicated database should guarantee one-copy serializability: the effect of transactions performed by clients on replicated objects should be the same as if they had been performed one at a time on non-replicated objects.

Also, since the data touched by a transaction may be stored on different servers, we need a strategy for committing updates across them.

Two-phase commit

We select one server as the coordinator, which sends the update requests made by a transaction to the servers involved. The stages are:

First phase: the coordinator sends the update request to the servers. Each server responds with yes or no, depending on whether it can apply the update while preserving serializability and one-copy serializability.
Second phase: if any server responds no, or a response times out, the transaction fails and is aborted. The coordinator sends “abort” to the servers so that they roll back any updates already made. If no server responds no, the coordinator sends “commit” so that the servers actually commit.
On the server side: a server that responds no can abort right away. A server that responds yes must wait for the coordinator's “commit” or “abort” message. A server that has not yet voted can also abort on its own if it times out while waiting for the coordinator's request.
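The two phases above can be sketched as a small in-memory simulation. This is an illustration under simplifying assumptions, not a production protocol: `Participant`, `Vote`, and `two_phase_commit` are hypothetical names, message passing is replaced by method calls, and an unreachable server is modeled by raising `TimeoutError`.

```python
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"


class Participant:
    """A server holding part of the transaction's data."""

    def __init__(self, name, can_commit=True, reachable=True):
        self.name = name
        self.can_commit = can_commit
        self.reachable = reachable
        self.state = "init"  # init -> prepared -> committed / aborted

    def prepare(self):
        if not self.reachable:
            raise TimeoutError(f"{self.name} did not respond")
        if self.can_commit:
            self.state = "prepared"  # voted yes: must wait for the decision
            return Vote.YES
        self.state = "aborted"  # voted no: can abort right away
        return Vote.NO

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Run both phases and return the decision, "commit" or "abort"."""
    # Phase 1: collect votes; a timeout is treated like a "no".
    votes = []
    for p in participants:
        try:
            votes.append(p.prepare())
        except TimeoutError:
            votes.append(Vote.NO)

    # Phase 2: commit only if every server voted yes.
    decision = "commit" if all(v is Vote.YES for v in votes) else "abort"
    for p in participants:
        # Simplification: the decision is delivered to everyone, even to
        # a server that was unreachable during phase 1.
        if decision == "commit":
            p.commit()
        else:
            p.abort()
    return decision


if __name__ == "__main__":
    servers = [Participant("s1"), Participant("s2", can_commit=False)]
    print(two_phase_commit(servers))
```

A single no vote or a single timeout is enough to abort the whole transaction, which is what makes the commit atomic across servers.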

Dealing with failures:

During the stages above, all intermediate messages and tentative (not yet committed) updates are written to persistent disk storage. If a server or the coordinator crashes, it can retrieve them from disk after recovery.
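One way to picture this logging is an append-only file that is flushed to disk on every write and replayed after a restart. The sketch below is an assumption about how such a log might look, not the format any particular system uses: `DurableLog`, `recover_state`, and the record fields `txn` and `state` are all hypothetical.

```python
import json
import os
import tempfile


class DurableLog:
    """Append-only log; each record is one JSON line forced to disk."""

    def __init__(self, path):
        self.path = path

    def append(self, record):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to disk

    def replay(self):
        """Return all records written so far, oldest first."""
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]


def recover_state(records):
    """Map each transaction id to the last state logged for it."""
    state = {}
    for record in records:
        state[record["txn"]] = record["state"]
    return state


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        log = DurableLog(os.path.join(directory, "txn.log"))
        log.append({"txn": "t1", "state": "prepared"})
        log.append({"txn": "t1", "state": "committed"})
        # After a simulated restart, the state is rebuilt from the log.
        print(recover_state(log.replay()))
```

After a crash, a transaction whose last logged state is still `prepared` is one whose outcome the server has to learn from the coordinator.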

If a message is dropped, the transaction is aborted after a timeout.

Besides relying on timeouts, a server can also poll the coordinator repeatedly to recover from the loss of a “commit” or “abort” message.
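The polling idea can be sketched as a simple retry loop. The names here (`Coordinator`, `get_decision`, `poll_for_decision`, `max_polls`, `on_wait`) are hypothetical, and a dictionary stands in for the coordinator's recorded decisions. In a real system `on_wait` would sleep or back off between attempts.

```python
class Coordinator:
    """Keeps the final decision for each transaction."""

    def __init__(self):
        self.decisions = {}  # txn_id -> "commit" or "abort"

    def decide(self, txn_id, decision):
        self.decisions[txn_id] = decision

    def get_decision(self, txn_id):
        # None means the coordinator has not decided yet.
        return self.decisions.get(txn_id)


def poll_for_decision(coordinator, txn_id, max_polls=5, on_wait=lambda: None):
    """Ask the coordinator repeatedly until it reports a decision.

    Returns "commit" or "abort", or None if no decision was available
    within max_polls attempts.
    """
    for _ in range(max_polls):
        decision = coordinator.get_decision(txn_id)
        if decision is not None:
            return decision
        on_wait()  # placeholder for waiting between polls
    return None


if __name__ == "__main__":
    coordinator = Coordinator()
    coordinator.decide("t1", "commit")
    print(poll_for_decision(coordinator, "t1"))
```

Because the server pulls the decision instead of waiting for it to be pushed again, a lost “commit” or “abort” message only delays the outcome rather than changing it.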

Author: Yu Wan
Link: https://cyanh1ll.github.io/2021/01/15/replication/
Copyright: CYANH1LL