An interpretation of Move, the Libra blockchain programming language

On June 18th, Libra, a cryptocurrency project initiated by Facebook, was officially unveiled. Several technical documents were released alongside the Libra white paper, highlighting the new blockchain programming language Move and the consensus protocol LibraBFT. This article explains the Move language from a technical perspective and gives readers a preliminary understanding of it.

Digital asset management

The most eye-catching feature of Move is its complete programming model for digital assets. Compared with existing blockchain programming languages, Move gives digital assets a stronger status. With Move, developers can define and manage on-chain digital assets more flexibly and securely.

Challenge

Defining digital assets on a blockchain is challenging, because properties that physical assets have by nature are difficult to express in the digital world:

  • Scarcity: The supply of assets should be controlled. Assets cannot be copied, and creating new assets is a privileged act.
  • Access control: The system must ensure that each participant's assets are protected by access control policies.

Existing solutions

Existing blockchain programming languages can define digital assets on a blockchain, but they still have shortcomings and limitations. Bitcoin and Ethereum serve as two brief examples.

Bitcoin encodes its assets as unspent transaction outputs (UTXOs) and uses Bitcoin Script to define asset transfer rules that ensure scarcity and access control. Developers can use Bitcoin Script to define a variety of access control policies.

However, Bitcoin Script has considerable limitations and poor extensibility. It is not a Turing-complete language, and developers cannot define custom data types or procedures. Therefore, defining a new digital asset on the Bitcoin blockchain, or implementing a more complex access control policy, requires resorting to external mechanisms, which is unfriendly to developers.

Ethereum represents its native currency, Ether, as numerical values and guarantees the scarcity and access control of Ether at the system level. Developers can write smart contracts in EVM bytecode that interact with digital assets on the chain. EVM bytecode is a Turing-complete language and highly extensible: developers can use it not only to operate on Ether but also to define custom digital assets and write complex access control policies. Smart contracts can also be written in the higher-level Solidity language, whose code is compiled into EVM bytecode before it runs.

However, in Ethereum, custom digital assets have a lower status than Ether. The security of Ether is protected at the system level, whereas the security of custom digital assets can be guaranteed only by their developers. Whether in low-level EVM bytecode or in higher-level Solidity, developers can represent digital assets only indirectly, as numerical values. Because the programming language offers no support for scarcity and access control, developers are prone to coding errors that lead to serious consequences such as asset duplication, reuse, and loss.
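To make this risk concrete, the following minimal Python sketch models a custom token as integers in a mapping, which is roughly how such assets are represented when the language has no notion of a resource. The names `IntegerLedger` and `buggy_transfer` are hypothetical, and this is an illustration rather than real contract code. Because the language sees only numbers, nothing stops a careless transfer from crediting the recipient without debiting the sender.

```python
class IntegerLedger:
    """Hypothetical token ledger: balances are plain integers in a dict."""

    def __init__(self, initial_balances):
        self.balances = dict(initial_balances)

    def total_supply(self):
        return sum(self.balances.values())

    def transfer(self, sender, recipient, amount):
        """Correct transfer: the debit and the credit are both hand-written."""
        if amount < 0 or self.balances.get(sender, 0) < amount:
            raise ValueError("invalid amount or insufficient balance")
        self.balances[sender] -= amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount

    def buggy_transfer(self, sender, recipient, amount):
        """Buggy transfer: the developer forgot the debit line."""
        if amount < 0 or self.balances.get(sender, 0) < amount:
            raise ValueError("invalid amount or insufficient balance")
        # Missing: self.balances[sender] -= amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount


ledger = IntegerLedger({"alice": 100, "bob": 0})
ledger.transfer("alice", "bob", 30)
supply_after_correct = ledger.total_supply()  # supply is conserved
ledger.buggy_transfer("alice", "bob", 30)
supply_after_bug = ledger.total_supply()      # supply has silently grown
```

In this sketch the second call increases the total supply, and the language raises no objection, since an integer may be copied freely.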

First-class resources

In response to these problems, Move introduces the concept of first-class resources. Developers can use first-class resources not only to implement secure digital assets but also to write correct business logic for them.

Move distinguishes digital assets from other data types (such as integers and booleans) and defines them as resource types. The semantics of resource types are inspired by linear logic: a resource cannot be copied or implicitly discarded (that is, left without an owner when the program ends); it can only be moved. Apart from these restrictions, resource values can be stored in data structures and passed as arguments to procedures just like ordinary values.
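The two rules can be approximated in Python as a rough sketch. Move enforces them statically, before a program runs, whereas this simulation can only detect violations at run time. All names here (`Coin`, `ResourceError`, `run_program`, `is_rejected`) are hypothetical and chosen for illustration only.

```python
import copy


class ResourceError(Exception):
    """Raised when a rule for resource values is violated."""


class Coin:
    """Hypothetical resource value; the rules are checked at run time here."""

    def __init__(self, value):
        self.value = value
        self.moved = False

    def __copy__(self):
        raise ResourceError("a resource cannot be copied")

    def __deepcopy__(self, memo):
        raise ResourceError("a resource cannot be copied")

    def move(self):
        """Transfer ownership: the old handle becomes unusable."""
        if self.moved:
            raise ResourceError("resource was already moved")
        self.moved = True
        return Coin(self.value)


def run_program(body):
    """Run body(coin) and reject it if the coin is left unowned at the end."""
    coin = Coin(10)
    body(coin)
    if not coin.moved:
        raise ResourceError("resource implicitly discarded at end of program")


def is_rejected(body):
    """Return True if running the program violates a resource rule."""
    try:
        run_program(body)
    except ResourceError:
        return True
    return False


account = {}


def good_program(coin):
    account["coin"] = coin.move()      # ownership moves into the account


def leaky_program(coin):
    pass                               # the coin is simply forgotten


def copying_program(coin):
    account["copy"] = copy.copy(coin)  # attempted duplication
```

Under these assumptions, `is_rejected(good_program)` is false, while the leaky and copying programs are both rejected.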

Move uses modules to manage resources. First, the type of each resource must be defined in a module. Second, the module also defines the procedures that operate on the resource, such as creating, modifying, and destroying it. Developers can operate on a resource only through the public procedures that its module provides and cannot modify the resource directly. In other words, when a resource is used to define a digital asset, all of its business logic, including its access control policy, is already defined in the module. Any operation on the digital asset that bypasses the module is illegal and is not allowed.

In the Libra blockchain, Libra tokens are also defined as a resource type, consistent with user-defined resources. This means that both custom digital assets and Libra tokens are equally protected.

However, it should be noted that the security guarantees Move provides for digital assets apply at the module boundary. The correctness of the logic inside a module still has to be ensured by its developer.

Flexibility

Another feature of the Move language is the greater flexibility it gives Libra, in the two respects described below. For ease of understanding, Ethereum and the Solidity language are used for comparison.

Transaction script

An Ethereum transaction contains the address of the target smart contract and the input data passed to it. When the contract is written in Solidity, the input data identifies a function by its signature and supplies an argument list. In other words, an Ethereum transaction is essentially a single call to one interface of a contract (setting aside calls between contracts).

In a Libra transaction, this is replaced by a transaction script. A transaction script is a complete Move program with arbitrary content. Within it, procedures of modules already published in the ledger state can be called multiple times, and additional local processing, such as simple control flow, can be performed on the results of those calls.

Transaction scripts give Libra great flexibility. Users can not only invoke a procedure of a module, as in Ethereum, but also perform one-off behavior very easily. For example, a token can be transferred to several recipients at once even if the token's module does not provide a batch transfer procedure.
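The batch transfer example can be sketched in Python as follows. This is an analogy, not Move code: `CoinModule` stands in for a published module that offers only a single-recipient `transfer`, and `batch_transfer_script` stands in for a one-off transaction script. Both names, and the use of integer balances, are assumptions made to keep the sketch short.

```python
class CoinModule:
    """Hypothetical published module exposing only a single-recipient transfer."""

    def __init__(self, balances):
        self._balances = dict(balances)

    def balance(self, account):
        return self._balances.get(account, 0)

    def transfer(self, sender, recipient, amount):
        if amount <= 0 or self.balance(sender) < amount:
            raise ValueError("invalid amount or insufficient balance")
        self._balances[sender] -= amount
        self._balances[recipient] = self.balance(recipient) + amount


def batch_transfer_script(module, sender, payments):
    """Hypothetical one-off script: call the module's procedure in a loop."""
    total = sum(amount for _, amount in payments)
    if module.balance(sender) < total:   # local control flow inside the script
        raise ValueError("sender cannot cover the whole batch")
    for recipient, amount in payments:
        module.transfer(sender, recipient, amount)


coins = CoinModule({"alice": 100})
batch_transfer_script(coins, "alice", [("bob", 10), ("carol", 25), ("dave", 5)])
```

The module itself is unchanged; the batching logic lives entirely in the script that the user submits.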

Modular system

Both Ethereum and Libra use an account-based ledger state.

In Ethereum, accounts are divided into user accounts and contract accounts. A user account holds no data other than its Ether balance, whereas a contract account holds both code and data. From an object-oriented perspective, an Ethereum smart contract resembles a singleton object. Because data cannot be stored directly in a user account, most digital assets on Ethereum are implemented as a separate ERC20 token contract that holds the asset information of all users.

Libra uses Move modules to implement functionality similar to smart contracts. However, a Move module differs from an Ethereum smart contract and is more flexible. A module contains only code (resource struct definitions and procedures), while the data lives in resources. Although resources must be operated on through the procedures of their module, the two are not bundled together: many resources of the same type can exist, each published under a different account. In Libra, therefore, digital assets are kept under the user's own account rather than stored centrally as in Ethereum. In addition, a single Move module can define multiple resource types.
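The difference in where asset data lives can be pictured with a small Python sketch. The class names below are hypothetical, and keying resources by a type-name string is a simplification chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Erc20StyleContract:
    """Ethereum style: one contract account holds everyone's balances."""
    balances: dict = field(default_factory=dict)   # holder -> amount


@dataclass
class Coin:
    """Move style: a resource value whose type is declared by a module."""
    value: int


@dataclass
class Account:
    """Move style: each account stores the resources published under it."""
    resources: dict = field(default_factory=dict)  # resource type name -> value


# Ethereum style: all holdings live inside the token contract's storage.
token_contract = Erc20StyleContract(balances={"alice": 70, "bob": 30})

# Move style: the same holdings, each kept under its owner's account.
accounts = {
    "alice": Account(resources={"Coin": Coin(70)}),
    "bob": Account(resources={"Coin": Coin(30)}),
}

erc20_total = sum(token_contract.balances.values())
move_total = sum(acct.resources["Coin"].value for acct in accounts.values())
```

Both layouts describe the same total holdings; what differs is whether the data sits in one central contract or under each owner's account.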

Although the relationship among modules, resources, and procedures in Move resembles that among classes, objects, and methods in object-oriented programming, the two differ considerably. The design of Move modules is closer in style to functional programming.

Safety

Move is designed with security in mind, and Move programs that do not meet security requirements will be rejected.

Typed bytecode

Move must reject programs that violate critical safety properties (such as resource safety, type safety, and memory safety), so programs need to be verified on-chain before they are executed. To this end, Move uses typed bytecode as its executable format.

Typed bytecode sits between a high-level language and assembly language. Take Ethereum as an example: Solidity is the high-level language, and EVM bytecode can be regarded as the assembly language. The Solidity compiler performs various checks and can guarantee that the EVM bytecode it produces has certain safety properties. However, because the EVM executes EVM bytecode, these guarantees are established only at compile time. For them to be enforced on-chain, compilation itself would have to run on-chain; otherwise, an attacker could bypass Solidity entirely and write vulnerable EVM bytecode by hand. On the one hand, on-chain compilation is bound to hurt performance. On the other hand, verifying EVM bytecode directly is difficult because it lacks data type information. In summary, typed bytecode provides safety while avoiding the performance cost of on-chain compilation.
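Why type information makes bytecode checkable without running it can be shown with a toy Python sketch. The opcode names and the `SIGNATURES` table below are hypothetical and far simpler than the real Move instruction set; the point is only that each instruction declares the operand types it consumes and produces, so a verifier can simulate the stack using types alone.

```python
SIGNATURES = {
    # opcode: (operand types popped, result types pushed)
    "PushU64": ([], ["u64"]),
    "PushBool": ([], ["bool"]),
    "Add": (["u64", "u64"], ["u64"]),
    "Lt": (["u64", "u64"], ["bool"]),
    "And": (["bool", "bool"], ["bool"]),
}


def check_types(code):
    """Return the final type stack, or raise TypeError, without running code."""
    stack = []
    for position, (opcode, *_operands) in enumerate(code):
        if opcode not in SIGNATURES:
            raise TypeError(f"unknown opcode {opcode!r} at {position}")
        popped, pushed = SIGNATURES[opcode]
        start = len(stack) - len(popped)
        if start < 0 or stack[start:] != popped:
            raise TypeError(f"{opcode} at {position} expects {popped}, got {stack}")
        del stack[start:]
        stack.extend(pushed)
    return stack


def is_well_typed(code):
    """Return True if the code passes the type check."""
    try:
        check_types(code)
    except TypeError:
        return False
    return True


well_typed = [("PushU64", 1), ("PushU64", 2), ("Add",), ("PushU64", 5), ("Lt",)]
ill_typed = [("PushU64", 1), ("PushBool", True), ("Add",)]  # u64 + bool
```

The check inspects only types and never looks at the operand values, which is what makes it cheap enough to run before execution.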

A design more conducive to static verification

To limit computational cost, Move's on-chain verification covers only a set of key safety properties. Beyond on-chain verification, Move is also designed to support more advanced off-chain static verification. To this end, the following design choices make Move better suited to static verification than most general-purpose languages.

First, Move has no dynamic dispatch. The target of every call can be determined statically, so verification tools can reason precisely about the effect of a call without performing a complex call graph construction analysis.

Second, limited mutability. Move uses a borrow checking scheme similar to Rust's to ensure that there is at most one mutable reference to a value at any time.

Third, modularity. Move modules enforce data abstraction and localize critical operations on resources. To code outside the module, each resource is a black box: external code cannot see a resource's internals and can operate on it only through the module's public procedures.
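As a rough illustration of the first point, when every call names a fixed target, the set of procedures a program can reach follows from a plain traversal of the code. The sketch below uses a hypothetical instruction encoding (`("Call", name)` and `("Ret",)`) and hypothetical procedure names; it is not the format of real Move bytecode.

```python
def reachable_procedures(procedures, entry):
    """Return every procedure that entry can reach through static calls."""
    seen = set()
    pending = [entry]
    while pending:
        name = pending.pop()
        if name in seen:
            continue
        seen.add(name)
        for opcode, *operands in procedures[name]:
            if opcode == "Call":        # the callee is a fixed name in the code
                pending.append(operands[0])
    return seen


procedures = {
    "main": [("Call", "pay"), ("Ret",)],
    "pay": [("Call", "withdraw"), ("Call", "deposit"), ("Ret",)],
    "withdraw": [("Ret",)],
    "deposit": [("Ret",)],
    "mint": [("Ret",)],                 # never called from main
}
reachable = reachable_procedures(procedures, "main")
```

With dynamic dispatch, the callee at a call site would depend on run-time values, and a traversal this simple would no longer be sufficient.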

Virtual machine

Bytecode interpreter

The Move bytecode interpreter is stack-based, similar to the CLR and the JVM. An instruction takes its operands from the stack and pushes its result back onto the stack after executing.

Move supports six broad categories of bytecode instructions:

  • Operations for copying and moving data from local variables to the stack, such as CopyLoc, MoveLoc, etc., and instructions for moving data from the stack to local variables, such as StoreLoc.
  • Operations on typed stack values, such as pushing constants onto the stack and performing arithmetic and logical operations on stack operands.
  • Module-related built-in instructions, such as Pack and Unpack, which create and destroy values of a module's declared types; MoveToSender and MoveFrom, which publish and unpublish values of a module's types under an account; and BorrowField, which obtains a reference to a field of a module's type.
  • Reference-related instructions, such as ReadRef for reading through a reference, WriteRef for writing through a reference, ReleaseRef for releasing a reference, and FreezeRef for converting a mutable reference into an immutable one.
  • Control flow operations, such as conditional branches, as well as procedure calls and returns.
  • Blockchain specific built-in operations, such as getting the sender address of a transaction script, creating a new account, and so on.

In addition, Move also provides cryptographic primitives such as sha3. These primitives are implemented by modules of the standard library, not bytecode instructions.
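The stack-based execution model and the difference between copying and moving a local can be sketched with a toy interpreter in Python. `CopyLoc`, `MoveLoc`, and `StoreLoc` are the instruction names mentioned above; `PushU64`, `Add`, the tuple encoding, and the `Moved` marker are assumptions made for this illustration, and the sketch ignores types entirely.

```python
class Moved:
    """Marker for a local variable that currently holds no value."""


def run(code, local_count):
    """Execute code on a fresh operand stack; return (stack, locals)."""
    stack = []
    local_vars = [Moved] * local_count
    for opcode, *operands in code:
        if opcode == "PushU64":
            stack.append(operands[0])
        elif opcode == "StoreLoc":      # pop the stack into a local
            local_vars[operands[0]] = stack.pop()
        elif opcode == "CopyLoc":       # push a copy; the local stays usable
            value = local_vars[operands[0]]
            if value is Moved:
                raise RuntimeError("local is unavailable")
            stack.append(value)
        elif opcode == "MoveLoc":       # push the value; the local becomes unusable
            value = local_vars[operands[0]]
            if value is Moved:
                raise RuntimeError("local is unavailable")
            stack.append(value)
            local_vars[operands[0]] = Moved
        elif opcode == "Add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return stack, local_vars


program = [
    ("PushU64", 7), ("StoreLoc", 0),    # local 0 = 7
    ("CopyLoc", 0), ("MoveLoc", 0),     # stack: [7, 7]; local 0 is now moved
    ("Add",),                           # stack: [14]
]
final_stack, final_locals = run(program, local_count=1)
```

In this toy model any value may be copied; in Move, copying is restricted to non-resource values, which is why a separate move instruction is needed for resources.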

Bytecode verifier

The bytecode verifier enforces safety properties for modules and transaction scripts. A module cannot be published, and a transaction script cannot be executed, unless it passes bytecode verification.

The bytecode verifier performs three main kinds of checks.

  • Structural checks, which ensure that the bytecode tables are well-formed. They detect errors such as illegal table indexes, duplicate table entries, and illegal type signatures.
  • Semantic checks, which detect errors such as type mismatches in procedure arguments, dangling references, and resource duplication.
  • Linking. By matching each struct type and procedure signature that is used against its declaring module, this phase detects errors such as illegally calling an internal procedure or calling a procedure in a way that does not match its declaration. The global ledger state is accessed during this phase.
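The structural and linking phases can be sketched in Python as follows. The data layout (a `procedure_names` table, `("Call", index)` instructions, and a `published` dictionary standing in for the global ledger state) is entirely hypothetical and much simpler than the real Move binary format; the semantic checks are omitted.

```python
def structural_errors(module):
    """Check table well-formedness: no duplicates, no out-of-range indexes."""
    errors = []
    names = module["procedure_names"]
    if len(set(names)) != len(names):
        errors.append("duplicate table entry")
    for opcode, *operands in module["code"]:
        if opcode == "Call" and not 0 <= operands[0] < len(names):
            errors.append(f"illegal table index {operands[0]}")
    return errors


def link_errors(imports, published):
    """Compare each imported procedure with its declaration in the ledger state."""
    errors = []
    for (module_name, procedure), expected_signature in imports.items():
        declared = published.get(module_name, {}).get(procedure)
        if declared is None:
            errors.append(f"{module_name}.{procedure} is not declared")
        elif not declared["public"]:
            errors.append(f"{module_name}.{procedure} is internal")
        elif declared["signature"] != expected_signature:
            errors.append(f"{module_name}.{procedure} signature mismatch")
    return errors


# Hypothetical ledger state: one published module with two procedures.
published = {
    "Coin": {
        "deposit": {"public": True, "signature": ("address", "Coin")},
        "mint": {"public": False, "signature": ("u64",)},
    }
}
module = {
    "procedure_names": ["Coin.deposit", "Coin.mint"],
    "code": [("Call", 0), ("Call", 5)],           # index 5 is out of range
}
imports = {
    ("Coin", "deposit"): ("address", "u64"),      # wrong argument type
    ("Coin", "mint"): ("u64",),                   # internal procedure
}
found_structural = structural_errors(module)
found_link = link_errors(imports, published)
```

Note that only `link_errors` needs the `published` dictionary, which mirrors the remark that linking is the phase that consults the global ledger state.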

Summary

This article has briefly explained the Move language, in the hope of giving readers a basic understanding of its main features and design. If there are any errors in this article, readers are welcome to point them out. For more details on the Move language, please read its official technical documentation.

Source: China Banknote Blockchain Technology Research Institute

Author: Wang Nan