Technical Primer | Explore the key details of BFT and Libra's Consensus components

Libra involves many things. We introduce the design and implementation of Libra from three lines:

  1. By analyzing the process of Node startup and joining the Libra network, the design and implementation of Network components are introduced;
  2. Around the life cycle of the transaction, analyze the process of receiving transactions, packaging blocks, and running on the chain, and introduce Libra's core components such as Mempool, Executor, and Storage, VM;
  3. Focusing on LibraBFT, introduce Consensus components and the process of reaching consensus on blocks.

Earlier we described the second main line of Libra-the life cycle of transactions, and learned the approximate design and implementation of Libra core components. Among them, the Consensus component is just a brief introduction. In practical scenarios, Consensus components need to ensure that consensus is reached in many Validator nodes distributed in different regions of the world. In the distributed case, it is guaranteed that the order of blocks or transactions is eventually consistent. It can be said that this is the soul of the entire blockchain. Therefore we introduce the Consensus process separately:

  1. Why do you need Consensus?
  2. What is the current mainstream consensus?
  3. How does BFT reach consensus?
  4. Libra's consensus component

Why do you need Consensus?

When we introduced the account model earlier, we mentioned "record the change process of each Address in the order approved by most people". Here, "the order approved by most people" is to reach consensus. However, in reality, it is very challenging to make many, many distrustful nodes all over the world reach a consensus on the order of transactions of all people in the world, and this is where all public chains are working.

consensus-1

Mainstream consensus

In order to solve the problem of global consensus, many talented people in the industry have been exploring for a long time. At present, there are about 3 types of representative consensus:

consensus-2

These three types of consensus have derived various kinds of consensus:

  1. POW, POW-DAG, NC-Max, etc.
  2. Pos, PoA, DPos, etc.
  3. PBFT, LibraBFT, etc.

These consensuses each have their own characteristics, and at the same time, there may be some correlations between them. Consensus is a very broad topic, and those who are interested can learn about it by themselves. Since BFT itself is more complicated, let's talk about BFT in depth, and approach our subject step by step-LibraBFT.

How BFT reached consensus

BFT is more complex and has many concepts. Therefore, we will explain it in multiple steps, starting with simple scenarios and gradually expanding:

  1. BFT safety and activity
  2. Tolerant number of Byzantine nodes
  3. Synchronous and asynchronous
  4. PBFT and two-phase confirmation
  5. Three-stage confirmation of Hotstuff
  6. Chained Hotstuff

BFT safety and activity

Many articles about BFT or Paxos will tell the story of Byzantium generals, with different versions and similar core ideas. Here we quote Baidu Encyclopedia: Byzantium is located in Istanbul, Turkey today, and is the capital of the Eastern Roman Empire. Because the Byzantine Roman Empire had a vast territory at that time, each army was separated far away for the purpose of defense, and the general and the general had to rely on messengers to spread the news. During the war, all generals and deputies in the Byzantine army must reach an unanimous consensus to decide whether they have a chance to win before attacking the enemy's camp. However, there may be traitors and enemy spies in the army, and the decisions of the generals will disturb the order of the entire army. In consensus, the results do not represent the opinions of the majority. At this time, with the members' rebellion known, how the remaining loyal generals reached an agreement without being influenced by the traitors, and the Byzantine problem was thus formed.

The above is the story of the Byzantine generals extracted by Baidu Encyclopedia. In a sentence, it is to make all loyal generals act in the same way, either with the strongest strength or the strongest combat effectiveness. In other words, the loyal general acts in concert with the highest safety factor. If some loyal generals attack and some loyal generals retreat, the consequences will be unimaginable.

Byzantine Fault Tolerance (BFT) is designed to solve this problem. Here are two key indicators:

  1. Security: all correct nodes must agree on the same value, which means that all loyal generals agree
  2. Activity: all nodes must eventually decide on an output value. It can be understood that voting must produce results, that is, all nodes reach agreement;

Security is the goal, and activity is a holistic overview of all the anomalies that make voting impossible. To ensure both safety and activity, it is easy to ask questions:

  1. What is the maximum number of Byzantine nodes that can be tolerated in a certain number of clusters?
  2. What can I do if messages are delayed in a distributed environment?

Tolerant number of Byzantine nodes

Regarding the maximum number of Byzantine nodes that can be tolerated, the God of Lamport has mathematical derivation. If you are interested, you can check it out, but I saw a more accessible version.

Let's simplify the problem: Suppose there is a department of n people who is preparing for a spring outing and choose from two places, A and B. Wherever there are the most votes, go there. Among them, there are f people who don't want to go anywhere (dishonest nodes). And all the remaining h individuals want to travel (honest nodes). Here, no matter whether it is an honest or dishonest node, it is possible not to vote. Then there may be such a result. The number of votes for A and B is the same, and the department administrator does not know what to do and is stuck. Dishonest nodes just want to be stuck, for which dishonest nodes may vote as appropriate:

  1. If A and B have as many votes, then dishonest nodes will not vote
  2. If the votes of A and B are similar, then the dishonest node will, according to its own interests, invariably choose the lesser party, and eventually let A and B have as many votes

In short, the dishonesty node hopes to have the embarrassing situation of "no conclusion". In order to avoid such a situation where "a consensus cannot be reached", at most the votes must reach at least x in order to form an absolute advantage and win.

Returning to the question of Byzantine generals, whether it is an offensive or a retreat, the loyal general can only receive most of the orders passed from the general (note that because of the existence of traitors, it can only be the majority and cannot wait to collect all the orders, Otherwise, the vote will get stuck), count the command with the most votes, and execute this command. In order for all loyal generals to have the same order, the winning order should reach at least x. The loyal general can safely execute this command boldly, because he knows that this order has reached x. Other loyal generals also execute the same command. .

The example of Byzantium generals is a bit more complicated than the example of departmental tourism above: the votes for departmental tourism are counted by one person in the departmental administration and published uniformly; and the example of Byzantium generals is that all generals send messages to other generals, and each generale counts himself Received message. Then there will be situations where the traitor will attack General A and retreat to General B. So when making a decision, x> n / 2 + 1 is not enough. In this case, the following expression 3 will be reflected.

We use the important information hidden:

1. 总人数2. 最少票数不应该比诚实节点数多,否则不诚实节点只要全部不投票,投票就将进行不下去3. 如果一个结果要代表所有诚实节点,那么起码有一半以上的诚实节点投了这个结果4. 对于不诚实节点,可能给不同的人的投票信息不一样 

We turn this information into expressions:

 1 => n = f + h 2 => h >= x 3+4=> x > h/2 + f 

Derive according to the above 3 inequality

 => h > h/2 + f => 1.5h > h + f => 1.5h > n => h > 2/3 * n => f < 1/3 * n 

Although the above derivation revolves around the number of winning votes x, the conclusion is that the number of Byzantine nodes f that can be tolerated at most. That is, to reach consensus, the number of Byzantine nodes f must be less than n / 3 of the total number of nodes, n = 3f + 1 and x = 2f + 1. Why count this? Because it will be used later. At the same time, we also know the possible operations of the Byzantine node:

  1. Do not vote
  2. Vote for different nodes

For the second operation, it can be avoided by message signing. That is only the case where the Byzantine node does not vote or the leader does not initiate a vote (leader itself is a Byzantine node). This situation is called the Byzantine general problem under weak suspension conditions.

Synchronous and asynchronous

Earlier we mentioned the issue of network latency. For distributed systems, abnormal conditions such as network congestion may cause the network delay to be very large, even without an upper limit. According to the delay dependence of the protocol, the protocols are divided into three categories:

  1. Synchronization: There is an upper limit on network latency and the upper limit is known;
  2. Asynchronous: There is no upper limit for message delay;
  3. Partially asynchronous: there is an upper limit on network latency but the upper limit is unknown;

Synchronous models are suitable for scenarios that are particularly sensitive to network delays; some asynchronous models can be understood as covering network anomalies under normal circumstances, which are closer to daily general scenarios and are most practical.

In some asynchronous models, voting is usually initiated by the leader. Since the leader may be a Byzantine node, in order to ensure the activity, multiple nodes will be sorted and the leader will take turns. Once the situation where the leader is a Byzantine node results in a certain delay and no agreement can be reached, the next leader is used to maintain the voting process. The LibraBFT consensus implemented by Libra uses Hotstuff as a Byzantine fault-tolerant algorithm and belongs to a partially asynchronous model.

PBFT and two-phase confirmation

BFT revolves around voting, with PBFT (Practical Byzantine Fault Tolerance Algorithm) being the most common.

The following is the general flow of the PBFT algorithm. Let's first look at the meaning of each stage:

  1. request: trigger the leader to initiate a proposal
  2. pre-prepare: The leader prepares the proposal and broadcasts the proposal to all nodes
  3. prepare: The node broadcasts its own vote to other nodes, so the message complexity is O (N ^ 2), and statistics are collected on all received votes.
  4. commit: When the proposal reaches a 2f + 1 vote, the node will consider the proposal approved. At this time, the current node will notify all other nodes that it intends to commit the proposal. The commit message must not only indicate that it has accepted the proposal, You must also include 2f + 1 votes that you collected. If the current node receives 2f + 1 commits for this proposal, then it means that the result has reached agreement. This phase is more complicated, and will be highlighted below.

bft-1

The above is equivalent to initiating two rounds of voting. Why should two rounds of voting be conducted to reach an agreement?

Let's imagine a scenario where there is only one round of voting:

Just at that moment, after 3 nodes sent voting messages to 1 node, they became Byzantine nodes. Although node 2 is a non-Byzantine node, it has not yet initiated a vote. At this time, 1 node received 3 votes, 0, 1, 3 respectively, so 1 node has reason to think that all honest nodes have reached a consensus. But in fact, no agreement was reached. At this time, 2 nodes may initiate a request to re-vote due to timeout, and 0 and 3 may agree to this request. Therefore, there may be no consensus in only one round of voting.

bft-2

In order to solve the above problem, another round of voting was conducted in the design of the PBFT protocol, and the situation in which the first round of voting could not reach an agreement was called the commit phase. But if you think about it, the second round of voting will not be consistent:

bft-3

Although the problem of the first round of voting was solved, it seems that the same problem occurred in the second round of voting? In fact, PBFT has optimized the second round of voting:

When all nodes send a confirmation message, they not only need to tell other nodes about their status, but also need to bring a certificate, that is, they need to bring a 2f + 1 vote signed message sent by other nodes to themselves (this is O (N ^ 3) message complexity).

In the application scenario of the blockchain, the latter block is based on the previous block. If BFT is used as a consensus, the order of block generation is determined. The nodes that produce blocks later must not only build new blocks, but also need to give proof of the previous block in the proposal, either a 2f + 1 signed commit, or 2f +1 timeout signature (this is also the message complexity of O (N ^ 3)), otherwise, the block producer is a Byzantine node and will initiate a timeout vote to the next block producer.

The above is a general description of PBFT and the two-phase confirmation. Non-Byzantine nodes reach consensus through two rounds of voting, and guarantee the activity of the protocol through mechanisms such as multiple leaders and timeouts, but they require O (N ^ 3) message complexity.

Three-stage confirmation of Hotstuff

PBFT is a very classic Byzantine fault tolerance algorithm. In the commit phase of the two-phase confirmation, since the vote messages signed by other nodes are required to prove that their status is not lying, this leads to the message complexity of O (N ^ 3), and therefore there are obvious bottlenecks. Is there an algorithm to solve this problem? The Hotstuff Byzantine fault-tolerant algorithm selected by Libra's LibraBFT consensus protocol solves this problem ingeniously through "threshold signature + three-phase confirmation".

The first author of Hotstuff is Mr. Yin Maofan, a PhD student at Cornell University. Comparing the previous two stages of confirmation, we see that Hotstuff has an additional pre-commit stage between prepare and commit. Why does one more round of voting solve the problem of message complexity?

First, we briefly explain the role of threshold signatures. Those who are interested can study it by themselves. The n nodes generated a private key for each node in some way, but only one public public key. Next, all voting information is signed by this private key (k, n). For the same message, only after collecting the signatures of k nodes, can we construct a total signature that can be successfully verified by the public public key. In this case, if the node's proposal is to reach a consensus, it must collect the signatures of 2f + 1 nodes for the same "agree with the proposal" message, in order to construct a general signature that can be verified successfully using the public public key, otherwise enter Timeout process.

Next, let's look at the approximate process of three-phase confirmation after using threshold signature. Let's focus on the four messages initiated by the leader:

① prepare phase: the leader broadcasts the message msg1 containing its own "proposal + previous commitQC" to all nodes

②Pre-commit stage: The leader receives 2f + 1 nodes' "passed msg1 proposal" signature messages, then uses these signatures to construct a "prepareQC general signature" message msg2, and broadcasts msg2 to all nodes, letting them construct themselves PrepareQC for validation

③ Commit phase: The leader receives 2f + 1 nodes' msg2 prepareQC verification passed signature messages, and then uses these signatures to construct a "pre-commitQC general signature + submit proposal" message msg3, and broadcasts to all nodes pre -commitQC for verification

④decide stage: the leader receives 2f + 1 nodes "msg3 pre-commitQC verification passed" signature messages, this time equal to the leader received a consensus agreement, and then use these signatures to form a commitQC general signature message msg4, Broadcast to all nodes

hotstuff-1

The above is the approximate process of the three-phase confirmation, which is a bit ridiculous. As can be seen from the figure, there are two main differences from the two-phase comparison:

  1. Three-phase confirmation has one more pre-commit phase than two-phase confirmation. In fact, the pre-commit phase + commit phase of the three-phase confirmation is equal to the commit phase of the two-phase confirmation. In other words, the commit phase of the two-phase confirmation includes a vote of 2f + 1 nodes to prove that he did not lie. This certificate was independently taken out in the three-phase confirmation for a round of voting, which is the pre in the figure above. -commit phase. This is the main difference between the two-stage confirmation model and the three-stage confirmation model. In this understanding, the above process is not spared.
  2. All nodes only deal with the leader: The three-phase confirmation cleverly passes the threshold signature, and optimizes the message that should have been collected by all nodes into a "leader unified collection, other nodes only need to verify the overall signature" process. The message complexity is reduced to O (N). Of course, the timeout mechanism is similar. You need to collect the 2f + 1 timeout signature to construct a general signature and replace the commitQC.

The above is what I think is the main difference between three-phase confirmation and two-phase confirmation. Among them, QC (quorum certificate) is a statutory node number certificate, which can be understood as the total signature.

Chained Hotstuff

Earlier we described that the three-phase confirmation is actually Basic HotStuff. In the application scenario of the blockchain, the overall process can be summarized as follows:

hotstuff-2

In general, the three threshold QCs “prepareQC-> pre-commitQC-> commitQC” are continuously converted. Based on the three-phase confirmation, the hotstuff authors have further optimized the algorithm. This is Chained HotStuff :

hotstuff-3

The voting rounds and network news have been well optimized. The three rounds of voting that were originally required have been merged into one round. The end result is this:

hotstuff-4

This is where the chained hotstuff is cleverly designed.

Libra's consensus component

Earlier we introduced the background knowledge of BFT, including the story of the Byzantine generals, the number of Byzantine nodes that the Byzantine fault tolerance algorithm can tolerate, and a partial synchronization model. Then, we detail the two-stage confirmed Byzantine fault tolerance algorithm. Finally, we talk A clever combination of threshold signatures and three-stage confirmation of Hotstuff, as well as further optimized chained Hotstuff.

LibraBFT consensus is implemented based on Hotstuff. Let's first look at the block structure of Libra:

libra-bft-1

Is it similar to chained Hotstuff? In each round of voting, Libra will not only verify the current Proposal Block, but also reach a consensus on the Grandpa Block. In this way, Grandpa Block will be committed and store the transactions included in the Block and the user status involved in the DB.

libra-bft-2

Therefore, Libra's Ledger storage always seems to be two heights lower than Block storage, because the two blocks of height have not yet reached a consensus, and they are in the pre-commitQC phase and prepareQC phase. The consensus process implemented by Libra looks like this:

libra-bft-3

There are two things to note in the picture above:

  1. Round represents a round of voting, round events are maintained by Pacemaker (pacer), and Pacemaker components are mainly responsible for algorithm activity and maintaining timeouts;
  2. Green indicates the current round leader, which is responsible for generating blocks and initiates proposals; yellow indicates other Validator nodes, which are responsible for verification and voting; red indicates the next round leader, which is responsible for collecting statistical votes, processing commits, and then constructing blocks in the next round and initiating proposals ;

There are several key issues that are not reflected in the process:

  • How to determine the proposal
  • How to update a group of proposer

Let's discuss them one by one.

How to determine the proposal

There are three types of proposal strategies in Libra's implementation: FixedProposer, MultipleOrderedProposers, RotatingProposer.

  1. FixedProposer: indicates that a fixed node is designated as a Proposer, which is generally used for testing;
  2. RotatingProposer: indicates that a batch of nodes take turns as Proposer, and each round returns a Proposer;
  3. MultipleOrderedProposers: a bit more complicated, see the figure below, which also uses a random number VRF algorithm (you can study it if you are interested), to ensure that all nodes in each round get a set of Proposers in the same order, but the order of Proposers between each round different;

libra-bft-4

Therefore, in the case of using MultipleOrderedProposers, each round of voting has a set of Proposers. Proposers have a priority. Non-Byzantine nodes will vote for the highest-level Proposer according to the priority of Proposer. This reduces the risk that the Proposer is a Byzantine node. If a group of Proposers are Byzantine nodes, then the Validator casts a timeout TC.

How to update a group of proposer

Earlier we described the approximate determination process of Proposer. Although the order of multiple round Proposer groups is different, it is always the same Proposer is constantly changing the order. How about replacing these Proposers? In particular, it is necessary to reach consensus on the same result between so many nodes at the same time.

libra-bft-5

In fact, updating the Proposer group requires calling the add_validator or remove_validator contract through the transaction (the contract is stored under a special account). The transaction will be executed when it is packaged. If there is a validator update, the update will be placed in the block's block_info At the same time, the transaction will be packaged into the Block. Finally, as the block is committed, all validators will update the local proposer group based on the block_info information. In this way, all nodes update the proposal group in the same round. The entire process is called Reconfiguration in libra.

to sum up

In the third main line of Libra, there are many concepts and contents. We have introduced these contents one after another:

  1. In order to ensure that the data of all accounts is correct, it is necessary to quickly reach a consensus on the order of transactions globally;
  2. Current mainstream consensus protocols, such as Pow, Pos, BFT, etc .;
  3. Libra uses the Hotstuff algorithm, which is a type of BFT, so we have learned a lot about BFT-related background knowledge, including two-phase confirmation, three-phase confirmation, and chained Hotstuff;
  4. Finally, we learned about Libra's consensus components, including the voting process, the process of determining the proposer, the reconfiguration process, etc., which basically covered the main process of the LIbraBFT consensus protocol.

This article is written by Deng Qiming, a technical expert at Westar Laboratory. This is the official website of Westar Labs. Welcome everyone to pay attention to http://westar.io/