Technical Introduction | Analyzing Libra's core components through the life cycle of a Transaction

Libra covers a lot of ground. We introduce its design and implementation along three main lines:

  1. By analyzing how a Node starts up and joins the Libra network, we introduce the design and implementation of the Network component;
  2. Following the life cycle of a transaction, we analyze how transactions are received, packed into blocks, and committed to the chain, and introduce Libra's core components: Mempool, Executor, Storage, and VM;
  3. Focusing on LibraBFT, we introduce the Consensus component and the process of reaching consensus on blocks.

Earlier we covered the first main line: how a Node starts up and joins the network, including the design and implementation of the Network component in detail. Here we turn to the second main line, the life cycle of a Transaction, and use it to walk through the design and implementation of each core component of Libra in turn. Before describing the life cycle, let's first look at the account model and the relationship between Transactions and Move contracts.

Account model

A blockchain can be understood simply as follows: it uses Transactions as the carrier to record the changes to each Address, in an order that the majority of participants agree on. To achieve this, blockchains have so far settled on two account models: the UTXO model, represented by BTC, and the Account model, represented by ETH. Each has its own advantages and disadvantages; a simple comparison:

libra_account

UTXO stands for Unspent Transaction Output. Under the UTXO model, the current state of an Address is a list of UTXOs. When spending (constructing a transaction), one or more UTXOs are taken as the inputs of the transaction and one or more new UTXOs are generated as its outputs; the total value of the inputs equals the total value of the outputs (setting fees aside). At some point in the future, these outputs will in turn be used as inputs of other transactions, much like paper money changing hands. In the Account model, each Address usually holds a balance and a SequenceNumber counter. Each time a transaction is constructed, the amount spent is subtracted from the balance of the sending Address and added to the balance of the receiving Address. At the same time, the SequenceNumber is incremented, which fixes the order of all transactions constructed by that Address and so keeps the account state correct.
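To make the contrast concrete, here is a minimal sketch of both models. All names (`Utxo`, `Account`, `spend_utxo`, `transfer`) are hypothetical and chosen for illustration; fees and signatures are ignored, and the UTXO spender simply consumes every output the sender owns.

```python
from dataclasses import dataclass

# --- UTXO model: the state of an address is a list of unspent outputs ---

@dataclass(frozen=True)
class Utxo:
    owner: str
    amount: int

def spend_utxo(utxos, sender, receiver, amount):
    """Consume all of the sender's UTXOs and create new ones (payment + change)."""
    inputs = [u for u in utxos if u.owner == sender]
    total_in = sum(u.amount for u in inputs)
    if total_in < amount:
        raise ValueError("insufficient funds")
    outputs = [Utxo(receiver, amount)]
    change = total_in - amount
    if change > 0:
        outputs.append(Utxo(sender, change))
    # Inputs and outputs carry the same total value (fees are ignored here).
    assert total_in == sum(u.amount for u in outputs)
    untouched = [u for u in utxos if u.owner != sender]
    return untouched + outputs

# --- Account model: each address holds a balance and a sequence number ---

@dataclass
class Account:
    balance: int = 0
    sequence_number: int = 0

def transfer(accounts, sender, receiver, amount, sequence_number):
    """Move funds between balances; the sequence number enforces ordering."""
    src = accounts[sender]
    if sequence_number != src.sequence_number:
        raise ValueError("unexpected sequence number")
    if src.balance < amount:
        raise ValueError("insufficient funds")
    src.balance -= amount
    accounts.setdefault(receiver, Account()).balance += amount
    src.sequence_number += 1

# The same payment of 10 from alice to bob, expressed in each model.
utxos = spend_utxo([Utxo("alice", 5), Utxo("alice", 7)], "alice", "bob", 10)
accounts = {"alice": Account(balance=12)}
transfer(accounts, "alice", "bob", 10, sequence_number=0)
```

In the UTXO version the old outputs disappear and new ones are created, while in the Account version two balances are updated in place and replaying the same sequence number is rejected.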

Libra uses the Account model to represent ledger data, so the transactions of an account are strictly ordered. We will come back to this later.

Transaction and Move contracts

Above we explained the account model by analogy with payments, which may give the impression that a Transaction merely adds to and subtracts from a number, as when Alice transfers funds to Bob. Can't it be applied to more, and more complex, scenarios, such as games? Early blockchains had relatively limited expressive power. As blockchains spread, users' needs grew richer and the original designs struggled to meet them. People wanted a language in which to express those needs on chain, and so virtual machines, smart contracts, and contract languages emerged. This is a very broad topic. Libra introduced Move as its contract language, which we will not discuss in detail here. So what is the relationship between a Transaction, the chain, and Move?

libra-tx-1

Suppose the figure above shows the account data stored on the chain at a certain moment. Alice has published a contract written in Move, and its code is stored under her account. In step ① of the figure, Bob constructs a Transaction that names a method of the contract under Alice's account, takes data that the method understands from his own account as the method's arguments, then signs and broadcasts the Transaction. In step ②, a miner receives Bob's transaction, packs it into a Block, executes it, and writes the result to Bob's account. Broadly speaking, Move defines a piece of logic, the Transaction supplies the data that the logic runs on, and the chain records the final state after the logic has run.
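The division of roles can be sketched with a toy model. This is not Move and not Libra's API: the state layout, the `add_points` method, and the `execute` function are all hypothetical stand-ins, and signing and broadcasting are left out.

```python
# Hypothetical on-chain state: address -> published code and stored data.
state = {
    "alice": {"modules": {}, "resources": {}},
    "bob": {"modules": {}, "resources": {"score": 1}},
}

def add_points(resources, points):
    """Stand-in for a contract method published under Alice's account."""
    updated = dict(resources)
    updated["score"] = updated.get("score", 0) + points
    return updated

# The contract code lives under Alice's account.
state["alice"]["modules"]["add_points"] = add_points

def execute(state, tx):
    """Run the method named by the transaction and record the resulting state."""
    method = state[tx["module_owner"]]["modules"][tx["method"]]
    sender = tx["sender"]
    state[sender]["resources"] = method(state[sender]["resources"], *tx["args"])

# Step 1: Bob builds a transaction that names Alice's method and its arguments.
tx = {"sender": "bob", "module_owner": "alice", "method": "add_points", "args": [5]}
# Step 2: the transaction is executed and the result is written to Bob's account.
execute(state, tx)
```

The contract supplies the logic, the transaction supplies the call and its data, and `state` plays the part of the chain that records the outcome.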

Transaction life cycle

With this background in place, let's take an overall look at the life cycle of a transaction:

libra-tx-2

This figure comes from Libra's technical white paper. It resembles the figure that introduced Libra's core components, but with numbers on the arrows. It shows the complete life cycle of a Transaction, from creation to being packed into a Block, and from execution to being committed on chain. Let's go through the meaning of each number in turn:

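  1. The user submits the transaction to AdmissionControl through a wallet or the CLI.
  2. AdmissionControl runs the VM to pre-validate the Transaction (for example, verifying its signature) and filters out invalid transactions.
  3. Once the Transaction passes pre-validation, it is submitted to Mempool.
  4. The Transaction is set to the Ready state and waits to be packed into a Block.
  5. After the Transaction is set to Ready, it is broadcast to other Mempools.
  6. The Consensus component of a Validator node pulls from its Mempool component to obtain a batch of Ready Transactions for creating a Block.
  7. The newly created Block is broadcast to the other Validator nodes, which vote on it.
  8. After a node receives the new Block, the Block is submitted to the Executor component for execution.
  9. All transactions in the new Block are submitted to the VirtualMachine component, and the VM executes them in order.
  10. The Block that wins consensus is committed.
  11. The Block that wins consensus is broadcast.
  12. All Transactions marked KEEP in the winning Block are stored, together with the final state of each address.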

We now have an intuitive picture of the Transaction life cycle. Next, we go deeper into each component to look at more design and implementation details.

AC service

After the user submits a transaction, it first reaches the AC (AdmissionControl) service.

When discussing the first main line, we mentioned that AC is a gRPC service that acts as a gateway for the Node. A Node contains multiple gRPC services and many RPC interfaces, but only two kinds of user-facing interfaces need to be exposed to a wallet or the CLI:

  1. Interface for submitting transactions
  2. User status related interface

So AC contains little logic of its own: it wraps the Node's internal gRPC interfaces and exposes them to users. In addition, AC performs simple filtering on submitted transactions.

Mempool service

The transaction is then submitted to the Mempool service via AC.

From the first main line, we know that Mempool stores transactions that have not yet been committed on chain. Let's take a look at the overall design of Mempool:

libra-tx-3

Mempool mainly contains two modules:

  1. Mempool Service: a gRPC service that receives transactions submitted from AC
  2. Shared Mempool: it has two main functions. One is to synchronize transactions between the Mempools of different nodes through the Mempool protocol (mentioned in the first main line); the other is to store and process transactions.

We now have a general picture of Mempool, but some questions remain. What exactly does Mempool do with transactions? Under what conditions is a transaction packed into a block? When is a transaction broadcast to other Mempools? We answer these questions next.

Transaction status transition in Mempool

After a transaction is submitted to Mempool, it is first marked with a state that depends on its source:

  1. Unready: a transaction submitted by a user to the Mempool Service
  2. NonQualified: a transaction synchronized from another node

These transactions are sorted in a certain order and wait to be marked Ready. As mentioned earlier, Libra uses the Account model and orders the transactions initiated by a user through their SequenceNumber. When Mempool finds that all earlier transactions from the same account are either already on chain or in the Ready state, the transaction can be marked Ready, meaning that it is eligible to be packed into a block. If a transaction that becomes Ready was previously Unready (that is, the user submitted it to the current Mempool through AC), it is then forwarded to other Mempools.
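The promotion rule can be sketched as follows. This is a simplified illustration, not Libra's Mempool code: `PoolTx`, `promote_ready`, and the `committed_seq` map (sender to the next sequence number expected on chain) are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class PoolTx:
    sender: str
    sequence_number: int
    state: str  # "Unready" (submitted via AC) or "NonQualified" (synced from a peer)

def promote_ready(pool, committed_seq):
    """Mark a transaction Ready when every earlier transaction of the same
    sender is on chain or already Ready. Returns the transactions that were
    Unready before promotion, i.e. the ones to forward to other Mempools."""
    to_broadcast = []
    for sender in sorted({tx.sender for tx in pool}):
        expected = committed_seq.get(sender, 0)
        own = sorted((t for t in pool if t.sender == sender),
                     key=lambda t: t.sequence_number)
        for tx in own:
            if tx.sequence_number < expected:
                continue  # already on chain
            if tx.sequence_number != expected:
                break  # gap: an earlier transaction is still missing
            if tx.state == "Unready":
                to_broadcast.append(tx)  # came from a local user via AC
            tx.state = "Ready"
            expected += 1
    return to_broadcast

pool = [
    PoolTx("alice", 0, "Unready"),
    PoolTx("alice", 1, "NonQualified"),
    PoolTx("alice", 3, "Unready"),  # sequence number 2 has not arrived yet
]
forwarded = promote_ready(pool, {"alice": 0})
```

Here the first two transactions become Ready, only the locally submitted one is forwarded, and the transaction with sequence number 3 keeps waiting until number 2 shows up.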

libra-tx-4

The figure above shows the approximate state transitions of a Transaction in Mempool. The general ordering rule for Transactions compares fields in decreasing priority: gas_price > expiration_time > address > sequence_number
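A multi-field ordering like this maps naturally onto a tuple sort key. The sketch below assumes a direction for each field (higher gas price first, then earlier expiration, then address, then lower sequence number); the text only gives the field precedence, so those directions and the `ReadyTx` type are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ReadyTx:
    gas_price: int
    expiration_time: int
    address: str
    sequence_number: int

def priority_key(tx):
    # Tuples compare field by field, left to right. gas_price is negated so
    # that higher-paying transactions sort first (an assumed direction).
    return (-tx.gas_price, tx.expiration_time, tx.address, tx.sequence_number)

txs = [
    ReadyTx(gas_price=1, expiration_time=100, address="bob", sequence_number=0),
    ReadyTx(gas_price=5, expiration_time=200, address="alice", sequence_number=1),
    ReadyTx(gas_price=5, expiration_time=200, address="alice", sequence_number=0),
    ReadyTx(gas_price=5, expiration_time=150, address="carol", sequence_number=0),
]
ordered = sorted(txs, key=priority_key)
```

Later fields only break ties left by earlier ones, so bob's cheap transaction sorts last even though it expires soonest.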

Consensus component

The previous section described the state transitions in Mempool; at this point the transaction submitted by the user is in the Ready state, waiting to be packed into a block. Because Consensus is complex and this main line is about the transaction life cycle, we give only a brief introduction to the on-chain process here (how multiple nodes reach consensus will be described in detail in the third main line). It is roughly as follows:

libra-tx-5

Of these steps, compute -> execute and commit -> store are described later, and vote will be explained in detail in the third main line. For now, only two points need attention:

  1. The Consensus component actively pulls a batch of Ready transactions from Mempool and packs them into a Block.
  2. After the block wins consensus and is committed, the Consensus component actively removes the committed transactions from Mempool.
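These two interactions can be sketched as a pair of calls on Mempool. The class and the method names `get_block` and `remove_committed` are hypothetical and only illustrate the pull-then-clean-up pattern.

```python
class Mempool:
    def __init__(self):
        self.ready = []  # Ready transactions, assumed already in priority order

    def get_block(self, max_size):
        """Called by Consensus: return up to max_size Ready transactions.
        They stay in the pool until the block is actually committed."""
        return list(self.ready[:max_size])

    def remove_committed(self, committed):
        """Called by Consensus after a block has been committed."""
        committed_set = set(committed)
        self.ready = [tx for tx in self.ready if tx not in committed_set]

mempool = Mempool()
mempool.ready = ["tx-a", "tx-b", "tx-c"]

block = mempool.get_block(max_size=2)   # point 1: pull a batch for a new Block
pending_during_vote = list(mempool.ready)
mempool.remove_committed(block)         # point 2: clean up after the commit
```

Keeping the transactions in the pool until the commit means that a block which fails to win consensus does not lose its transactions.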

Executor & VM components

Since the Executor is only the entry point for running the VM, the two are introduced together here. In the Consensus flow above, once a Block is built it is submitted to the Executor to compute, and then enters the VM to execute. This is where transactions are actually executed: the compute -> execute process. Some details are worth noting:

libra-tx-6

The light-colored steps are initiated by the Executor, and the dark-colored ones are Move contracts executed in the VM. After the Consensus component submits a new block to the Executor, the Executor provides the block with a runtime environment, initializes the VM, and runs the block's Coinbase transaction followed by the user transactions in the VM. The VM therefore executes the Coinbase transaction first, which also runs the block prologue in the LibraAccount contract. It then executes the transactions packed in the block in order, and finally the post-execution state is returned to the Consensus component.
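The execution order can be sketched like this. `execute_block` and `toy_vm` are hypothetical stand-ins: the toy VM only understands a block prologue and a plain transfer, and the state is a flat dictionary.

```python
def execute_block(state, block, vm_execute):
    """Executor entry point: run the block prologue first, then every user
    transaction in block order, and return the resulting state."""
    working = dict(state)  # work on a copy; the committed state is untouched
    prologue = {"kind": "block_prologue", "height": block["height"]}
    working = vm_execute(working, prologue)
    for tx in block["transactions"]:
        working = vm_execute(working, tx)
    return working

def toy_vm(state, tx):
    """Stand-in for the VM: applies one transaction and returns a new state."""
    new_state = dict(state)
    if tx["kind"] == "block_prologue":
        new_state["height"] = tx["height"]
    elif tx["kind"] == "transfer":
        new_state[tx["from"]] = new_state.get(tx["from"], 0) - tx["amount"]
        new_state[tx["to"]] = new_state.get(tx["to"], 0) + tx["amount"]
    return new_state

committed = {"alice": 10, "bob": 0, "height": 0}
block = {
    "height": 1,
    "transactions": [{"kind": "transfer", "from": "alice", "to": "bob", "amount": 3}],
}
result = execute_block(committed, block, toy_vm)
```

Returning a new state instead of mutating the committed one mirrors the split between compute -> execute and the later commit -> store step: nothing is persisted until the block is committed.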

Storage service

When introducing the Consensus component, we mentioned that a Block is committed and its data is eventually written to the Storage service. This is the commit -> store process. By this point, the transaction submitted by the user has been accepted by consensus. Two questions about the Storage service naturally arise:

  1. What modules are included in the Storage service?
  2. What data does Storage ultimately store?

Storage module

Storage is a gRPC service that stores all on-chain data; information such as a user's ledger state is read from Storage. Libra chose RocksDB as the underlying database. On top of RocksDB, SchemaDB wraps unified CRUD operations and the serialization and deserialization of keys and values. LibraDB, built around Libra's ledger data and its characteristics, defines a series of data structures and the database operations on them. All of these operations are wrapped into the Storage service and provided to components such as Executor and AC.

libra-tx-7
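The layering can be sketched in a few lines. This is a toy illustration only: a Python dict stands in for RocksDB, JSON stands in for the real serialization format, and the class and method names are hypothetical rather than Libra's actual interfaces.

```python
import json

class SchemaDB:
    """Toy SchemaDB layer: uniform CRUD plus key/value (de)serialization
    on top of a raw byte store. A dict stands in for RocksDB here."""

    def __init__(self):
        self._raw = {}

    @staticmethod
    def _encode_key(schema, key):
        return f"{schema}:{key}".encode()

    def put(self, schema, key, value):
        self._raw[self._encode_key(schema, key)] = json.dumps(value).encode()

    def get(self, schema, key):
        raw = self._raw.get(self._encode_key(schema, key))
        return None if raw is None else json.loads(raw.decode())

    def delete(self, schema, key):
        self._raw.pop(self._encode_key(schema, key), None)

class LedgerDB:
    """Toy LibraDB layer: ledger-specific operations built on SchemaDB."""

    def __init__(self, db):
        self._db = db

    def save_transaction(self, version, txn):
        self._db.put("transaction", version, txn)

    def get_transaction(self, version):
        return self._db.get("transaction", version)

ledger = LedgerDB(SchemaDB())
ledger.save_transaction(0, {"sender": "alice", "amount": 3})
loaded = ledger.get_transaction(0)
missing = ledger.get_transaction(1)
```

Each layer only talks to the one below it, so the ledger code never deals with raw bytes and the byte store never needs to know about ledger concepts.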

Ledger data

The previous section described the modules that make up the Storage service, and we saw that LibraDB defines data structures around the characteristics of the Libra ledger. What are those characteristics, and what are the core data structures?

  1. Ledger features: Libra uses the Account model, so it needs to store the global user state, all historical transactions that led to the current state, and the order of those transactions. In other words, the main data that Storage holds is user state, transactions, and transaction order. Most public chains record the order of blocks (the transactions within a block are also ordered) and thereby record all transactions and their order. Libra, by contrast, stores transactions directly and uses a Merkle Accumulator to record their order.

libra-tx-8

  2. Core data structures: Libra uses a Sparse Merkle Tree to store user state, and a Merkle Accumulator to store transactions and their order.

libra-tx-9

SparseMerkleTree stores user state under 256-bit keys, so in theory there can be 2^256 accounts. The figure above shows a 4-bit SparseMerkleTree as an example. Each orange leaf node represents a user; a square box is a placeholder indicating that there is no account under that branch, which reduces the storage needed for accounts.
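The placeholder idea can be illustrated with a small root-hash computation over 4-bit keys. This is a simplified sketch: the all-zero placeholder value, the `leaf`/`node` hash prefixes, and the use of SHA-256 are assumptions for illustration, and real implementations apply further optimizations.

```python
import hashlib

DEPTH = 4                     # 4-bit keys, as in the figure; the real tree uses 256 bits
PLACEHOLDER = b"\x00" * 32    # assumed hash value standing for an empty subtree

def h(data):
    return hashlib.sha256(data).digest()

def subtree_root(leaves, depth, prefix=""):
    """Root hash of the subtree holding all keys that start with `prefix`.

    `leaves` maps a DEPTH-bit key string (e.g. "0100") to an account blob."""
    inside = {k: v for k, v in leaves.items() if k.startswith(prefix)}
    if not inside:
        return PLACEHOLDER  # empty branch: nothing stored, nothing hashed
    if len(prefix) == depth:
        return h(b"leaf" + inside[prefix])
    left = subtree_root(inside, depth, prefix + "0")
    right = subtree_root(inside, depth, prefix + "1")
    return h(b"node" + left + right)

accounts = {"0100": b"alice-state", "1011": b"bob-state"}
root = subtree_root(accounts, DEPTH)
empty_root = subtree_root({}, DEPTH)
```

With two accounts out of sixteen possible keys, most branches collapse straight to the placeholder, which is what keeps a 256-bit key space practical.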

libra-tx-10

MerkleAccumulator stores transactions and their order. In the figure above, each dark-colored leaf node represents a transaction, and the square boxes are placeholders. Newly committed transactions are appended one after another, in order.
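An append-only accumulator can be sketched as a list of leaf hashes whose root is recomputed on demand. This is a simplified illustration, not Libra's implementation: the class, the all-zero placeholder, and the padding to the next power of two are assumptions.

```python
import hashlib

PLACEHOLDER = b"\x00" * 32  # assumed hash for leaf positions not yet filled

def h(data):
    return hashlib.sha256(data).digest()

class MerkleAccumulator:
    def __init__(self):
        self.leaves = []  # leaf hashes, in append order

    def append(self, txn_bytes):
        """Append a transaction; its index is its position in the ledger."""
        self.leaves.append(h(txn_bytes))
        return len(self.leaves) - 1

    def root(self):
        """Root hash over all leaves, padding with placeholders on the right."""
        if not self.leaves:
            return PLACEHOLDER
        level = list(self.leaves)
        width = 1
        while width < len(level):
            width *= 2
        level += [PLACEHOLDER] * (width - len(level))
        while len(level) > 1:
            level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        return level[0]

acc = MerkleAccumulator()
first_index = acc.append(b"tx0")
second_index = acc.append(b"tx1")
root_two = acc.root()
acc.append(b"tx2")
root_three = acc.root()

# Appending the same transactions in a different order gives a different root.
swapped = MerkleAccumulator()
swapped.append(b"tx1")
swapped.append(b"tx0")
```

Because the root depends on leaf positions, it commits to both the set of transactions and their order.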

These are the two core data structures of Storage; the whole Storage service is built and optimized around them. We will not go into further detail here.

Summary

The above is the entire life cycle of a transaction: it is processed by AC, Mempool, Consensus, Executor, and VM in order, and finally stored in Storage. Along the way we dived into each component or service, introducing parts of its design and implementation and the core details of how transactions are processed.

This article was written by Deng Qiming, a technical expert at Westar Laboratory. For more, visit the official Westar Labs website: http://westar.io/