Compare Finality and Liveness of various consensus algorithms

Due to the FLP Imposibility principle,

No completely asynchronous consensus protocol can tolerate even a single unannounced process death

So don't waste time designing consensus algorithms for purely asynchronous networks. The solution is to either strengthen the requirements of the network, require the network to be Synchronous or Partially Synchronous (from asynchronous to synchronous or semi-synchronous, the network status is better), or relax the requirements for finality, do not require Deterministic finality, only require Probabilistic Finality can be. It is also possible to do both at the same time.

Table 1 Comparison of various consensus algorithms

The contents of the above table are explained in detail below.

Finality

Finality looks very similar to Safety, Consistency, and seems to be different, very confusing.

For simple understanding, you can think of Finality, Safety, Consistency as synonym, namely Finality=Safety=Consistency.

For a deeper understanding, I think Finality is a complex, Finality>Safety>Consistency. Consistency applies to all distributed systems, including trusted environments such as Hadoop and untrusted environments such as Bitcoin; Safety applies to the Byzantine environment; Finality is the term in the blockchain scenario, and the blockchain is a Byzantine scenario. Subsets (Byzantine issues in addition to blockchains and non-blockchain solutions). Consistency is C in CAP theory, more general, requiring no more than Finality. Finality contains more meaning than Consistency. Finality contains all of the following meanings:

  • The data for all nodes should be consistent. The client issues a read operation to any node and the result should be the same. This is the Consistency in CAP. The term is applicable to all distributed systems, including trusted environments such as Hadoop and untrusted environments such as Bitcoin.
  • All nodes must ensure that they do not enter conflicting, split states, ie safety. This term is applicable to the field of Byzantine fault tolerance. In the open environment of Byzantine, a malicious node broadcasts two conflicting messages (such as a double-flower transaction), causing all nodes in the entire network to be in a split state. This destroys the nature of Safety. All consensus algorithms under Byzantine need to deal with the problem of malicious nodes and ensure the safety of the whole network.
  • Once the transaction enters the block, it should be irrevocable, ie Immutability. In the blockchain scenario, to ensure the safety, not only must the malicious node be dealt with by the malicious node, but also the malicious node should be prevented from revoking the transaction that has been packaged into the block. Because the blockchain is a single-linked list, it is a linear structure. The malicious node can theoretically start from the old one, split a longer new chain, and revoke all the transactions that have already been sent out. Spend the money again (is another form of double flower)

In summary, Finality=Consistency + Safety + Immutability.

Liveness

Liveness can be considered equivalent to Availability in CAP. When the partition occurs in the network, such as the break of the submarine cable, the global Internet is divided into two parts, can the entire blockchain system be able to write new transactions? Like Finality's consensus algorithm, this time I choose to wait indefinitely, the new transaction can't be written until the submarine cable repair, the Internet communication on both sides; like Availabilty's consensus algorithm, when the two networks will operate independently, the data is separated, both sides The data in the entire node becomes inconsistent.

For example, Tendermint is such a kind of sacrifice, and Liveness pursues Deterministic Finality. Suppose the submarine cable breaks the network into two sides, then each side has half of the validator, so in the vote and commit stages, all validators on each side, 100% vote for a proposal block, up to 50% Voting, less than 2/3, so the entire blockchain network will wait indefinitely until 2/3 of the votes are collected. During this waiting period, the next block cannot be made, the new transaction cannot be written, and the entire network is paralyzed.

When Bitcoin encounters this kind of network segmentation, the two parts of the bitcoin system will continue to move forward, and can still write new transactions to generate new blocks. When the submarine cable is repaired, the two sides of the Internet are connected, and then Choose to merge.

Before the submarine cable is broken, the status of all the nodes in the world is the same, as shown below:

When the submarine cable breaks, the global network is divided into two parts, and the two parts will be out of the block independently. At this time, the two sides are inconsistent, but the two sides are not perceptible, thinking that they are still a linear blockchain, as follows Figure:

On the left and right, although a new block is still dug every 10 minutes, the block hash on the left and block07 on the right are not equal. At this time, the Bitcoin network is still available, but the Finality is destroyed.

When the submarine cable is repaired, at this time, the two sides are synchronized with each other and will realize that there is a fork, as shown below:

Now the state of all the whole nodes in the world becomes the above picture, there is a fork, because the height of both sides is 8, it is impossible to decide which fork is correct. At this time, it depends on which side the miner supports, which side counts Ligao, which side first out of the new block, then which side wins, the short chain will be abandoned, for example, if the right side grabs a new block, then the right side wins, the left side fork is discarded, all the nodes The data in it has become a linear blockchain, and the agreement is reached, as shown below:

In fact, even if the submarine cable is continuous and the network has no partition, it will often happen that two miner each dig a new block at the same time. At this time, it is better to be lucky, and which one will win the next new block. In other words, Bitcoin theoretically never has a deterministic consistency state, and the fork will appear at any height at any time, so Bitcoin sacrifices a little Finality for a stronger Liveness.

Network Assumption

All distributed consensus algorithms have an implicit assumption about the network. Let me talk about the classification of the network:

Synchronization: The message must be delivered within a certain time T. The upper bound value T is a known constant and all nodes are aware of it. If the message is not delivered within T time, the message is not expected. The node believes that the message has been lost and will not continue to wait for it. All nodes are methodical, and each time T is passed, it goes further and is very neat.

Semi-synchronization: The message must be delivered within a certain time T, but the value of T is not a fixed value, but is dynamically changed, for example, dynamically calculated based on network conditions. All nodes

Asynchronous: Messages arrive at any time, either quickly or slowly. In short, there is no explicit bound or even indefinite delay. No matter how many nights arrive, the node accepts and processes the message. It cannot be simple. Drop message because of timeout

Bitcoin's assumption about the network is that the network is synchronized. The time limit is about 10 minutes. After Miner digs up a block, it broadcasts to the whole network. At this time, the entire bitcoin system expects that this block will be all in 10 minutes. The online full node is received, meaning that every 10 minutes, all the nodes will go one step forward, that is, add a block to the end of their blockchain. Even if the network speed is fast, for example, less than 3 minutes, the new block has been received by all the nodes, and Bitcoin will go one step at a time every 10 minutes (a new block). Ethereum is similar, but the time limit is 15 seconds.

Tendermint assumes that the network is semi-synchronous during the pose phase, because there will be a timeout in this step. If a new proposal has not been received for more than the time, then the other validator will think that the proposal node has been hung, so an empty block is generated. Until round-robin to the next proposal. Tendermint needs to collect more than 2/3 of the votes in both prevote and precommit, waiting indefinitely, that is, in these two phases it is assumed that the network is asynchronous. In the end, Tendermint's network requirements are semi-synchronous.

pBFT is asynchronous in all three stages of pre-prepare, prepare, commit. Since it is asynchronous and has no timeout mechanism, how can it progress forward? After collecting more than 2/3, you can continue to move forward. However, all nodes will start a timer when they receive a request from a client. If the request has not been executed within a certain time, View Change will be triggered. View Change This section is semi-synchronous.

Here you can appreciate the simplicity of Tendermint compared to pBFT. Tendermint moves the timeout mechanism to the propose phase (equivalent to the pre-prepare phase of pBFT). If the proposer broadcasts a proposal block within the specified time, then proceed to the next step. If it times out, proceed to the next step. However, this is a proposal block is an empty block (the empty block will also go through the pre-vote and pre-commit processes completely, but the block height will not increase by 1, which is equivalent to the network idling, until round-robin to the next The new proposal node re-broadcasts the new proposal block). In any case, the pose phase will move on to the next step. But Tendermint's pre-prepare is asynchronous and it is possible to always be the owner. pBFT moved the timeout mechanism to the View Change section, so pBFT added a View Change step that was more complicated than Tendermint. Tendermint replaces the proposal node by submitting empty blocks and round-robin, while pBFT replaces the primary node with View Change. Tendermint eliminates the complex Step View step.

In addition to eliminating View Change, Tendermint is simplified in another place, and all Tendermint information is stored in the blockchain. The pBFT was proposed in 1999. At that time, there was no blockchain. Therefore, all nodes of the pBFT have consistent data, but the data is distributed. The data for each node of pBFT includes:

The state of each replica includes the state of the service, a message log containing messages the replica has accepted, and an integer denoting the replica's current view.

Blockchain is a distributed database. For example, before the DBMS database such as MySQL did not appear, people wrote data to the file and then stored it on the hard disk, inventing various strange file formats and organization methods. With MySQL, managing data is much more convenient. Similarly, Tendermint stores all data in the blockchain. pBFT does not have a distributed database such as blockchain. All nodes need to manage data on the hard disk themselves, for example, to compress message logs, discard old messages, save hard disk space, and introduce the concept of checkpoint. .

The relationship between Tendermint and pBFT is similar to the relationship between Raft and Paxos. Tendermint is a simplified version of pBFT. It is a simplified version of pBFT for the scene of blockchain.

The following figure is the algorithm flow chart of Tendermint:

The following figure is the algorithm flow chart of pBFT:

To be continued. . .

Reference material

  1. 1999.Castro. Practical Byzantine Fault Tolerance
  2. Tendermint: Byzantine Fault Tolerance in the Age of Blockchains
  3. Consensus Compare: Casper vs. Tendermint
  4. A Proof of Stake overview
  5. Compared with traditional PBFT, what advantage does Tendermint algorithm has?
  6. Synchronous, partially synchronous and biological consensus algorithms
  7. GRANDPA Block Finality in Polkadot: An Introduction (Part 1)
  8. Finality in Blockchain Consensus

Author: microblogging @ soul machine