In this tutorial you are going to learn about Consensus Problem in Distributed System.
Consensus is a task of getting all the processes in a group to agree on some specific value based on the votes of each process.
Places where consensus has come in useful are:
- Synchronizing replicated state machines and making sure all replicas have the same (consistent) view of the system state.
- Electing a leader (eg: for mutual exclusion).
- Deciding to commit or abort for distributed transactions.
System Assumption
- Failure models: Among the n processors in the system, at most f processes can faulty
- Synchronous/Asynchronous communication:
- Failure prone process in synchronous communication
- Failure prone process in Asynchronous communications.
- Network connectivity – The system has logical connectivity.
- Sender identification- A process that receives a message always knows the sender id.
- Channel reliability- The channels are reliable, and only the processes may fail.
- Authenticated vs non-authenticated messages – we will be dealing only with unauthenticated messages.
- Agreement variable- The agreement variable may be Boolean or multivalued and need not be an integer.
In reality the system components are never perfect, they are prone to hardware failures, packet drops, slow network, clock skews, etc. These components should agree on some data value that is needed during computation to act as a single entity.
Scenarios that require consensus
Let’s assume a distributed database where the data is replicated to all nodes:
- Ordering of the updates [Reliable multicast]
- Detection of suspected failure [ Failure detection]
- Exclusive access to a resource [Mutual exclusion]
- Leader election process [Leader election]
Consensus Protocol Constraints
- Agreement: Every component must agree on the same value.
- Validity: If all components propose the same value v, then all processes decide v
- Termination: Every non-faulty component decides some value. If the protocol never terminates, then the components are agreeing on the same thing, which is not deciding.
Consensus Algorithms:
- PAXOS (Google chubby)
- RAFT(Hashicorp Consul, etcd)
- ZAP (Apache Zookeeper)
Different Forms of Consensus Problem:
The Byzantine agreement problem: The byzantine agreement problem requires source process with an initial value, to reach agreement subject to the following conditions:
- Agreement
- Validity
- Termination
The consensus problem differs from the byzantine agreement that each process has an initial value and all the correct processes must agree on a single value.
- Agreement: All non faulty processes must agree on the same single value.
- Validity: If all the non faulty processes have the same initial values, then the agreed upon value by all non-faulty processes must be the same value.
- Termination: Each non-faulty process must eventually decide on a value.
The Interactive Consistency Problem:
- Each process has an initial value, and all the correct processes must agree upon a set of value with one value for each process.
- The formal specification is as-follows
- Agreement: All non-faulty processes must agree on the same array of value A[v]… A[Vn]
- Validity: If the process is non-faulty and its initial value is vi, then all non faulty processes agree on vi as ith element of the array A. If process j is faulty, then non-faulty processes can agree on any value for Aj.
- Termination: Each non-faulty process should decide on the array A.
This article on Consensus Problem in Distributed System is contributed by Hemalatha P. If you like TheCode11 and would like to contribute, you can also write your article and mail to thecode11info@gmail.com