A Hardcore Interpretation of Libra's Move Language

Facebook recently announced Libra, a consortium blockchain project whose biggest highlight is the Move language. Below we interpret the white paper "Move: A Language With Programmable Resources" from a technical perspective, for your reference.

To make things easier to understand, we compare Bitcoin, Ethereum, and Libra throughout.

Programmable currency, programmable applications and programmable resources

In fact, the titles of the white papers alone give a rough idea of how the design goals of the three projects differ.

The goal of Bitcoin is programmable money, so its white paper is titled "Bitcoin: A Peer-to-Peer Electronic Cash System".

Ethereum's goal is programmable dApps: it starts from currency and expands to more general domains. Its white paper is titled "Ethereum: A Next-Generation Smart Contract and Decentralized Application Platform", and its yellow paper is titled "Ethereum: A Secure Decentralised Generalised Transaction Ledger".

Libra's design goal sits right between the two: programmable resources, or programmable assets.

Facebook's technical route is more pragmatic. It does not attempt more subversive innovations; instead it focuses on "assets", which sit between "currency" and "general-purpose applications", and on solving practical problems in a way that is feasible to engineer. From this point of view, Libra is neither blockchain 3.0 nor 4.0, but blockchain 1.5. This does not mean that Libra's goal is unchallenging. In fact, building a system that guarantees the security of assets while still providing enough flexibility is harder than dreaming up a perpetual motion machine that solves the "impossible triangle".

So, what is the difference between “programmable currency”, “programmable application” and “programmable resource”?

Since they are all "programmable X" phrases, they differ mainly in two respects: 1) what is programmed, and 2) how it is programmed.

What is programmed?

What is programmed is what the system describes or abstracts, that is, the real-world thing it corresponds to.

The Bitcoin system abstracts the concept of "currency", or a "ledger". Currency can be described by a number, namely the "balance" of an account. Users transfer money to others through "transactions". When the Bitcoin network receives a transaction, each node checks whether it is legal, for example whether you are spending your own money and whether the balance is sufficient (Bitcoin does not allow overdrafts). When these checks pass, the node does a simple addition and subtraction: it deducts the transferred amount from your account and adds the same amount to the other party's account. The only function of Bitcoin is therefore bookkeeping, ensuring that the total amount of money does not inexplicably increase or decrease as accounts transfer to one another (leaving aside special cases such as mining rewards and burn addresses).
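The bookkeeping described above can be sketched in a few lines of Python. This is only a toy model: the Ledger class and the account names are invented for illustration, signature checking is omitted, and real Bitcoin tracks unspent outputs rather than account balances.

```python
class Ledger:
    """Toy account-balance ledger (hypothetical; Bitcoin itself tracks UTXOs)."""

    def __init__(self, balances):
        self.balances = dict(balances)

    def total(self):
        return sum(self.balances.values())

    def transfer(self, sender, receiver, amount):
        # Legality checks: a positive amount and no overdraft.
        if amount <= 0:
            raise ValueError("amount must be positive")
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient balance")
        # Plain subtraction and addition; the total supply is unchanged.
        self.balances[sender] -= amount
        self.balances[receiver] = self.balances.get(receiver, 0) + amount


ledger = Ledger({"alice": 50, "bob": 10})
supply_before = ledger.total()
ledger.transfer("alice", "bob", 30)
supply_after = ledger.total()
```

The point of the sketch is the invariant: whatever transfers are made, total() stays the same.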

The Ethereum system abstracts the "application", and the range of applications is all-encompassing: games, lending systems, e-commerce systems, exchanges, and so on. In theory, any traditional computer program can be ported to Ethereum. Ethereum therefore records the internal data of all these applications (the "contract state"), such as the inventory, orders, and settlement information of an e-commerce system. This information cannot be described by a simple number. The system must let users define very complex data structures and perform arbitrary operations on the data through code (smart contracts). Of course, these applications also include "currency ledgers". In fact, this type of application (the "ERC20 smart contract") is currently the most widely used on Ethereum. Because Ethereum treats such applications as just one of the many kinds the platform supports, no different from any other, it gives them no extra security protection and only provides interface specifications such as ERC20. When a new currency is issued on Ethereum, the correctness of its transfer logic is entirely the developer's responsibility.

In Ethereum's storage structure, an ERC20 token is a "second-class object", stored in a different place from the native ETH balance. For example, as shown in the figure above, 0x0, 0x1, and 0x2 are three Ethereum addresses, where 0x0 and 0x2 are ordinary accounts and 0x1 is a contract account. Each account stores an ETH balance, which is a first-class object. The contract address 0x1 also stores the code of a smart contract, MyCoin, which is an ERC20 token application. The entire MyCoin ledger is stored in the storage space of 0x1, and how it may be modified is determined by the contract code at 0x1.

Whether through malice or carelessness, ERC20 token contracts are prone to security vulnerabilities. In other words, in the Ethereum system the native token ETH and user-issued tokens do not enjoy the same level of security.

So, is there a middle path: abstract asset types that are more complex than simple numbers, without pursuing all-encompassing generality? This is Libra's starting point. Libra can define asset types more complex than currencies, such as a basket of currencies or financial derivatives, together with the operations allowed on them. Such assets are called "resources". Move improves asset security by restricting operations on resources to prevent inappropriate modifications. Whatever the operational logic of a resource, two constraints must be met:

  • Scarcity. The total amount of an asset must be controlled, and users may not copy resources at will. In layman's terms, banks are allowed to print money, but users are not allowed to "make" new money with a photocopier;
  • Permission control. Simply put, operations on a resource must satisfy some predefined rules. For example, Zhang San can only spend his own money, not Li Si's.

The figure above shows the world state of Move. Unlike Ethereum, it treats all assets as "first-class resources", whether they are Libra's native token or assets issued by users. The balance of any "currency" is stored in the space belonging to the user's address, and operations on it are strictly restricted. Such an object, called a resource, can only be moved within a transaction, can only be moved once, and cannot be copied or destroyed. The rules are strict enough that even assigning a resource to a local variable and then never using it is not allowed.

This way of storing assets was not invented by Libra; some earlier public chains already use it. For example, in the Vite public chain, the balance of a user-issued currency is also a top-level object. However, Move can support more complex asset types and provides additional protection, which is Libra's main contribution.
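To make the difference between the two storage layouts concrete, here is a rough Python sketch. The dictionaries, the balances, and the resource key "0x1.MyCoin.Coin" are all assumptions made up for illustration; neither system stores its state in this form.

```python
# Ethereum-style layout: a user-issued token lives inside the contract's storage.
ethereum_state = {
    "0x0": {"eth": 10},
    "0x1": {"eth": 0, "code": "MyCoin", "storage": {"0x0": 100, "0x2": 5}},
    "0x2": {"eth": 3},
}

# Move-style layout: every asset sits under its owner's address,
# keyed by the resource type (module address + module name + type name).
move_state = {
    "0x0": {"0x0.Currency.Coin": 10, "0x1.MyCoin.Coin": 100},
    "0x1": {"modules": ["MyCoin"]},
    "0x2": {"0x0.Currency.Coin": 3, "0x1.MyCoin.Coin": 5},
}


def mycoin_balance_ethereum(state, owner):
    # The balance is looked up in the ledger kept by the contract at 0x1.
    return state["0x1"]["storage"].get(owner, 0)


def mycoin_balance_move(state, owner):
    # The balance is looked up in the owner's own account space.
    return state[owner].get("0x1.MyCoin.Coin", 0)
```

In the first layout the native balance and the MyCoin balance of 0x0 live in different places; in the second they sit side by side under 0x0.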

How is it programmed?

Let's look at how each of the three projects is programmed to achieve rich extensibility.

Bitcoin defines a "Bitcoin script" to describe the rules for spending a sum of money. Bitcoin is based on the UTXO model, and a UTXO can only be spent if the predefined script rules are satisfied. Complex logic such as "multi-signature" can be implemented with Bitcoin scripts. Bitcoin script is a very simple stack-based bytecode that does not support constructs such as loops and is not Turing-complete. Although it can be used to issue new tokens on the Bitcoin network, its expressive power is very limited, it is unfriendly to developers, and it cannot be applied to more complex scenarios.
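As a rough illustration of what "stack-based and without loops" means, here is a toy interpreter in Python for a hash-lock style script. The opcode names are borrowed from Bitcoin script, but the run_script function and its true/false convention are simplifications invented for this sketch and are not Bitcoin's actual rules.

```python
import hashlib


def run_script(script):
    """Evaluate a list of items: bytes are pushed, strings are opcodes."""
    stack = []
    for item in script:          # a single forward pass: no jumps, no loops
        if isinstance(item, bytes):
            stack.append(item)
        elif item == "OP_DUP":
            stack.append(stack[-1])
        elif item == "OP_SHA256":
            stack.append(hashlib.sha256(stack.pop()).digest())
        elif item == "OP_EQUAL":
            a, b = stack.pop(), stack.pop()
            stack.append(b"\x01" if a == b else b"")
        else:
            raise ValueError(f"unknown opcode: {item}")
    # The spend is accepted if a non-empty (true) value is left on top.
    return bool(stack) and stack[-1] != b""


secret = b"open sesame"
locking_script = ["OP_SHA256", hashlib.sha256(secret).digest(), "OP_EQUAL"]
can_spend = run_script([secret] + locking_script)
cannot_spend = run_script([b"wrong guess"] + locking_script)
```

Because every item is visited exactly once, the running time is bounded by the script length, which is exactly what the lack of Turing-completeness buys.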

Ethereum defines the Solidity programming language, which is used to develop "smart contracts". Smart contract code is compiled into a stack-based bytecode, EVM code, and executed in the Ethereum virtual machine (EVM). Solidity is a high-level language whose syntax draws on C++, Python, and JavaScript. It is a statically typed, Turing-complete language that supports inheritance and lets users define complex types. Solidity is more like a general-purpose programming language and can, in theory, be used to develop any type of program. It has no dedicated asset or currency data type, and it imposes no restrictions or protections on such data in its syntax or semantics. For example, when it is used to develop a new token contract, the token balance is usually declared as a uint. If the logic that increases and decreases the balance is not handled carefully, the balance variable can overflow or underflow, leading to serious errors such as excess minting and arbitrary additional issuance.

Libra, for its part, defines a new programming language, Move, which is primarily oriented toward asset-like data and is built on the "first-class resources" structure set out by Libra. Its main design goals are flexibility, security, and verifiability. At present, the syntax of the Move high-level language has not been finalized. The white paper only defines the Move intermediate representation (IR) and the Move bytecode. We therefore cannot yet judge whether the final Move language will be friendly to developers, but the design of Move IR already conveys its security and verifiability characteristics.
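The overflow problem is easy to reproduce. The Python sketch below imitates wrap-around arithmetic on a 256-bit unsigned integer (uint is an alias for uint256 in Solidity), which is how unchecked arithmetic behaves; the helper names are invented for this example.

```python
UINT256_MAX = 2**256 - 1


def unchecked_add(a, b):
    """Wrap-around addition, as unchecked 256-bit unsigned arithmetic behaves."""
    return (a + b) & UINT256_MAX


def unchecked_sub(a, b):
    """Wrap-around subtraction on 256-bit unsigned values."""
    return (a - b) & UINT256_MAX


def checked_add(a, b):
    """The careful version: refuse to produce a wrapped result."""
    result = a + b
    if result > UINT256_MAX:
        raise OverflowError("uint256 addition overflow")
    return result


overflowed = unchecked_add(UINT256_MAX, 1)   # wraps around to 0
underflowed = unchecked_sub(0, 1)            # wraps around to the maximum value
```

An account with a zero balance that "spends" one token through unchecked_sub ends up with the largest possible balance, which is the kind of bug the paragraph above warns about.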

Move syntax

Let's take a brief look at the syntax of Move. The basic packaging unit of Move is the "module", which is somewhat similar to a "smart contract" in Ethereum or a "class" in object-oriented languages. A module can define "resources" and "procedures", similar to the members and methods of a class. All modules deployed on Libra are global and are referenced in a way similar to package name + class name in Java. In 0x001.MyModule, for example, 0x001 is a Libra address and MyModule is a module name. The procedures in a module have public or private visibility. A public procedure can be called by other modules, while a private procedure can only be called from within the same module. The resources in a module are private and can only be accessed by other modules through public procedures. Moreover, modification of a module's resources by external modules or procedures is strictly restricted. The only allowed operation is "move", and resources cannot be assigned at will. For example, Move does not allow an interface like MyCoin.setBalance(), which would give other users the opportunity to modify a currency balance at will.

In addition to restricted resource types, a Move module can also define unrestricted members, called unrestricted types, which include the primitive types (boolean, 64-bit unsigned integer, address, bytes) and non-resource structures (struct). These unrestricted types have less stringent access restrictions and can be used to describe application data that is unrelated to assets. From this perspective, the Move language should in theory have the same expressive power as Solidity. In practice, however, decentralized applications almost always involve asset data, and any structure that references a resource type is itself restricted, so there are few opportunities to really escape the strict restrictions of the Move language. When actually developing in Move, the programmer will feel like dancing in shackles, and the code is more likely to fail at compile time and at runtime. In layman's terms, writing code in Move does not feel "very cool", which is the price of security and verifiability. Imagine using C and controlling the allocation and release of memory yourself. There is a feeling that "I am God", but you always worry about potential risks such as buffer overflows and memory leaks. When developing in Java, you cannot control memory as you please, but you also do not have to worry about these memory safety issues. Freedom and security often cannot both be had.

A Libra transaction can also embed a piece of Move code called a transaction script. This code does not belong to any module, is executed once, and cannot be called by other code. A script can contain multiple procedures, with the main procedure as the entry point, and it can call procedures in other modules. This design is a bit like Bitcoin and completely different from Ethereum. In Ethereum, a transaction cannot itself contain a piece of executable code; it can only deploy a new contract or call a deployed one. I don't like this part of Libra's design very much. Since any Move code must be rigorously checked by the bytecode verifier before it is released to the chain, the marginal cost of such one-time code is much higher than that of reusable modules, which will slow down transaction confirmation and reduce system throughput. Transaction scripts are not necessary, since most real-world scenarios can be covered by modules. Moreover, their existence makes Libra wallets harder to develop and use. When I have the opportunity, I will propose to Libra's development team that this design be dropped.

Let's look at the sample code snippet from the white paper to get a concrete feel for the Move language. Please note that this code is written in the Move intermediate representation (IR). In the future, the Move high-level language will surely provide syntactic sugar to make the code more concise and elegant.

  public main(payee: address, amount: u64) {
     let coin: 0x0.Currency.Coin = 0x0.Currency.withdraw_from_sender(copy(amount)); // Deduct amount of Coin from the sender's balance
     0x0.Currency.deposit(copy(payee), move(coin)); // Add the coin to the payee's Coin balance
  }

This code is a transaction script. It contains only a main procedure, which implements the transfer logic for a token called Coin. It accepts a destination address and a transfer amount as parameters. The expected result is that amount of Coin is transferred from the account of the transaction sender to the payee address.

The procedure body has only two lines. The second line of the script declares a variable coin of type 0x0.Currency.Coin. 0x0 is the Libra address at which the Currency module is deployed, and Coin is a resource type belonging to the Currency module. This is an assignment statement: the value of coin is obtained by calling the withdraw_from_sender() procedure of the 0x0.Currency module. When this procedure executes, amount of Coin is deducted from the sender's balance.

Line 3 calls another procedure of the 0x0.Currency module, deposit(), to add the resource obtained above to the balance of the payee address.

What is special about this code is that every place that reads the value of a variable is wrapped in copy() or move(). This is the most distinctive part of the Move language. It borrows the move semantics of C++11 and Rust and requires that, whenever the value of a variable is read, you specify how it is taken: by copy or by move. Taking the value with copy is equivalent to cloning the variable; the original variable is unchanged and can still be used. Taking the value with move transfers the original variable's reference, or ownership, to the new variable, and the original variable becomes invalid. C++ introduced move semantics to reduce unnecessary object copying and the construction and destruction of temporaries, improving execution efficiency. The Move language uses them for a different purpose: to improve the security of "resource" variables through stricter syntactic and semantic restrictions. In Move, a value of a resource type can only be moved, not copied, and can only be moved once.

Suppose the programmer has run out of coffee and is not at their best. While writing this code they introduce a bug, writing the move(coin) on the third line as copy(coin). What happens?
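The copy/move rule can be modeled at runtime with a small Python class. This is only a sketch of the semantics: the Locals class and its method names are invented here, and in Move itself these rules are enforced statically by the bytecode verifier rather than by runtime exceptions.

```python
class Locals:
    """Toy model of a procedure's local variables with explicit copy/move."""

    def __init__(self):
        self._values = {}
        self._resources = set()

    def store(self, name, value, is_resource=False):
        self._values[name] = value
        if is_resource:
            self._resources.add(name)

    def copy(self, name):
        if name in self._resources:
            raise TypeError(f"cannot copy resource '{name}'")
        return self._values[name]          # the variable stays usable

    def move(self, name):
        if name not in self._values:
            raise NameError(f"'{name}' has already been moved")
        self._resources.discard(name)
        return self._values.pop(name)      # the variable becomes unavailable


frame = Locals()
frame.store("amount", 10)
frame.store("coin", {"value": 10}, is_resource=True)

first = frame.copy("amount")     # copying an unrestricted value is fine
second = frame.copy("amount")    # ... as many times as you like
moved_coin = frame.move("coin")  # a resource can only be moved, and only once
```

Calling frame.copy("coin") before the move would raise TypeError, and calling frame.move("coin") a second time raises NameError.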

  public main(payee: address, amount: u64) {
     let coin: 0x0.Currency.Coin = 0x0.Currency.withdraw_from_sender(copy(amount));
     0x0.Currency.deposit(copy(payee), copy(coin)); // move(coin) -> copy(coin)
  }

Since coin is of a resource type and copying is not allowed, Move's bytecode verifier reports an error on line 3.

Now suppose that, while the programmer is writing the code, their cat walks across the keyboard and steps on the Command and D keys, so that the third line of code is duplicated (as line 4). What happens this time?

  public main(payee: address, amount: u64) {
     let coin: 0x0.Currency.Coin = 0x0.Currency.withdraw_from_sender(copy(amount));
     0x0.Currency.deposit(copy(payee), move(coin));
     0x0.Currency.deposit(copy(payee), move(coin)); // Cat did it!
  }

This time the bug is more serious: if it went undetected, the source address would be debited once while the target address would be credited twice. This is where Move's static checking really pays off. Since the coin variable is no longer available after the first move, the second move(coin) causes the bytecode verifier to report an error.

Ethereum is not so lucky. Consider the following code:
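The kind of static check that catches both bugs can be sketched in Python for straight-line code. The check_resources function and its statement encoding are made up for this illustration; the real verifier works on bytecode and also handles branches and references.

```python
def check_resources(params, statements, resource_vars):
    """Return a list of errors for a straight-line script.

    Each statement is (defined_variable_or_None, [(mode, variable), ...]),
    where mode is "copy" or "move". Line 1 is the procedure signature.
    """
    errors = []
    available = set(params)     # variables that currently hold a value
    for line_no, (defined, uses) in enumerate(statements, start=2):
        for mode, var in uses:
            if var not in available:
                errors.append(f"line {line_no}: '{var}' used after move")
            elif mode == "copy" and var in resource_vars:
                errors.append(f"line {line_no}: cannot copy resource '{var}'")
            elif mode == "move":
                available.discard(var)
        if defined is not None:
            available.add(defined)
    for var in sorted(available & resource_vars):
        errors.append(f"resource '{var}' is left unconsumed")
    return errors


params = ["payee", "amount"]
withdraw = ("coin", [("copy", "amount")])
deposit = (None, [("copy", "payee"), ("move", "coin")])
bad_deposit = (None, [("copy", "payee"), ("copy", "coin")])

ok_errors = check_resources(params, [withdraw, deposit], {"coin"})
copy_errors = check_resources(params, [withdraw, bad_deposit], {"coin"})
cat_errors = check_resources(params, [withdraw, deposit, deposit], {"coin"})
```

The correct script yields no errors, the copy(coin) variant is rejected on line 3, and the duplicated deposit is rejected on line 4, matching the line numbers used in the text.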

  pragma solidity >=0.5.0 <0.7.0;

  contract Coin {
      mapping (address => uint) public balances;
      event Sent(address from, address to, uint amount);

      function send(address receiver, uint amount) public {
          require(amount <= balances[msg.sender], "Insufficient balance.");
          balances[msg.sender] -= amount;
          balances[receiver] += amount;
          balances[receiver] += amount; // Cat did it again!
          emit Sent(msg.sender, receiver, amount);
      }
  }

Ethereum cannot detect the extra balances[receiver] += amount; line (line 11) in this code. Each time send() is called, the total supply of Coin tokens is inflated.
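To see the inflation in numbers, the buggy send() can be transcribed into Python and run once. The account names and starting balances are invented for the demonstration.

```python
def send(balances, sender, receiver, amount):
    """Python transcription of the buggy send(), duplicated line included."""
    if amount > balances.get(sender, 0):
        raise ValueError("Insufficient balance.")
    balances[sender] -= amount
    balances[receiver] = balances.get(receiver, 0) + amount
    balances[receiver] = balances.get(receiver, 0) + amount  # the cat's line


balances = {"alice": 100, "bob": 0}
total_before = sum(balances.values())
send(balances, "alice", "bob", 40)
total_after = sum(balances.values())
inflation = total_after - total_before
```

Alice is debited 40 once, Bob is credited 40 twice, and the total supply grows by exactly the transferred amount on every call.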

Move bytecode verifier

Having read this far, you should realize that the core component of Move is the bytecode verifier. Let's see how it verifies a piece of Move bytecode. The verification process usually involves the following steps:

  • Control flow graph construction: This step breaks the bytecode into basic blocks and builds the jump relationships between them;
  • Stack height check: This step mainly prevents out-of-bounds access to the stack;
  • Type checking: This step type-checks the code using a "type stack" model;
  • Resource check: This step focuses on security checks for resource types, preventing resources from being copied or destroyed and ensuring that resource variables are used by subsequent code. The bugs in the examples above are caught at this step;
  • Reference check: This step draws on Rust's type system and performs static and dynamic checks on references. The check is done at the bytecode level, ensuring that there are no dangling references (references to unallocated memory) and that read and write access through references is safe;
  • Linking with global state: This step mainly checks the signatures of struct types and procedures, ensuring that a module's private procedures are not called from outside and that the arguments of each call conform to the procedure's declaration.
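As one example of these steps, here is a rough Python sketch of a stack height check on a single basic block. It assumes the rule is that a block may never pop below the height it started at and must leave the stack at that same height when it ends; the instruction names and their stack effects are made up for the sketch and are not Move's actual instruction set.

```python
# (pops, pushes) per instruction; the names are invented for this sketch.
STACK_EFFECT = {
    "LoadConst": (0, 1),
    "CopyLocal": (0, 1),
    "MoveLocal": (0, 1),
    "StoreLocal": (1, 0),
    "Add": (2, 1),
    "Pop": (1, 0),
}


def check_stack_height(block):
    """Return True if the block never pops an empty stack and ends balanced."""
    height = 0
    for instruction in block:
        pops, pushes = STACK_EFFECT[instruction]
        if height < pops:
            return False        # would read below the block's starting height
        height += pushes - pops
    return height == 0


balanced = check_stack_height(["LoadConst", "CopyLocal", "Add", "StoreLocal"])
pops_too_much = check_stack_height(["Add"])
leaves_garbage = check_stack_height(["LoadConst"])
```

A check like this needs only one linear pass over the block and no knowledge of the values involved, which is why it can run before any transaction is executed.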

Move virtual machine

Move's virtual machine has much in common with the EVM. It is also a stack-based virtual machine. Its instruction set contains six types of instructions: data loading and moving, stack/arithmetic/logic operations, module member and resource operations, reference-related operations, control flow operations, and blockchain-related operations.

As in the EVM, each instruction is charged gas, and execution stops once the gas is used up. In Move, the code execution of a transaction is atomic: either everything executes successfully, or nothing is executed. Interestingly, although Libra uses a standard blockchain ledger structure in which all transactions are globally ordered, the Move language itself supports parallel execution, which means Libra could later be evolved into a Vite-like DAG ledger to improve the efficiency of parallel transaction processing.
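Gas metering and atomic execution can be sketched together in Python. Everything here is an assumption for illustration: the two-instruction "language", the gas prices, and the execute function are invented, and the sketch ignores how fees are actually charged for failed transactions.

```python
import copy


class OutOfGas(Exception):
    pass


GAS_COST = {"debit": 2, "credit": 2}   # hypothetical per-instruction prices


def execute(state, instructions, gas_limit):
    """Run all instructions against a copy of state; commit only on success."""
    working = copy.deepcopy(state)
    gas_left = gas_limit
    for op, account, amount in instructions:
        gas_left -= GAS_COST[op]
        if gas_left < 0:
            raise OutOfGas("gas exhausted; no state change is committed")
        if op == "debit":
            if working[account] < amount:
                raise ValueError("insufficient balance")
            working[account] -= amount
        else:
            working[account] += amount
    # Only reached when every instruction succeeded: commit atomically.
    state.clear()
    state.update(working)
    return gas_left


state = {"a": 10, "b": 0}
transfer = [("debit", "a", 4), ("credit", "b", 4)]
remaining = execute(state, transfer, gas_limit=10)

try:
    execute(state, transfer, gas_limit=3)   # enough gas for the debit only
    ran_out = False
except OutOfGas:
    ran_out = True
```

The second call runs out of gas after the debit, yet the state still shows the balances left by the first call, because the half-finished working copy is simply thrown away.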

Future work

Currently Move is still in an early development phase, and the follow-up work includes:

  • Implementing the basic functions of the Libra chain, including accounts, Libra tokens, reserve management, adding and removing validator nodes, transaction fee management, cold wallets, etc.;
  • New language features, including generics, collections, events, contract upgrades, etc.;
  • Improving the developer experience, including designing a human-friendly high-level language;
  • Formal modeling and verification tools;
  • Support for third-party Move modules.

If there are any errors in this article, readers are welcome to point them out. For more details, read the white paper or the open source code. By the way, the white paper is quite well written: the concepts are precise and it is easy to understand. It does not rely on special formal notation or complex mathematics, so a reader with some understanding of blockchain technology can read it in one sitting. This also reflects the professional and pragmatic style of the Facebook team.

Author: Liu Chunming, founder of Vite Labs, blockchain technology expert, and executive director of the China Blockchain Application Research Center. Please credit the source when reposting.