Popular Science | Use the overlay to change the format of the Ethereum state tree

Author: Guillaume Ballet

Translation & Proofreading: Pei Qi & A Jian

Source: Ethereum enthusiasts

The way that accounts and contracts store data is one of the many issues affecting Ethereum. The Ethereum protocol chose Merkle Patricia Tree (MPT, Merkel Patricia Tree) to organize account and contract data. Although this data structure works well in theory, in practical applications, it brings more problems than it can solve. The core developers have been discussing for many years and want to change this data structure into a binary tree. In this article, I will explain my views on this issue and how to achieve this transformation.

The processing method I proposed includes a period of transition, during which time the network must maintain two tree structures at the same time. The advantage of this is that the process of converting the tree structure will not affect the operation of the chain, and can ensure that all accounts are converted into a binary format.


Currently, Ethereum's state tree is made in hexadecimal. Hexadecimal means that each node has 16 children nodes. In theory, this approach is fine, because having more child nodes means that fewer "layers" (ie tree heights) are needed to store all the data.

For example, the following figure is a key-value pair (170, v) represented by a hexadecimal tree. In hexadecimal, 170 is recorded as 0xaa, so you only need two layers: the first layer records the first a, and the second layer records the second a.

-Figure 1. An example of a hexadecimal tree showing how the value v is stored at the corresponding key 0xaa. The key length of this tree is only 2 bytes, and only the subtree along 0xaa is shown. For brevity, unrelated subtrees are replaced with "…"

It can be seen that the tree above is very short and wide. Given the same key-value pair, the following figure shows the case of binary tree storage. 170 is represented as 10101010 in the binary tree.

-Figure 2. The same key-value pairs as in Figure 1, stored in a binary tree. For brevity, unrelated subtrees are represented as "…"

It can be seen from the figure that the binary tree is much deeper and narrower.

In Ethereum, each block contains a stateRoot field, which is the root hash value of the MPT that represents the global state of Ethereum after the block is processed. In general, this hash value is obtained by hashing a list of 16 child nodes of the root node. The hash values ​​of these child nodes are obtained by hashing the list of 16 child node hash values, and so on.

Every time a new block is generated from a packaged transaction, the miner updates the account tree and recalculates the root hash. The root hash is stored in the stateRoot field of the new block, and then the new block is agreed.

-Figure 3. The state root field in the block header, pointing to the root of the hexadecimal tree-

The problem is: if you want to hash all the nodes, the time to recalculate the root hash is too long. Therefore, in order to calculate the hash of the root node, the miner will retrieve the sibling hash value of the nodes at the same level from the database ( sibling hashes ). Although the latter (miners retrieve sibling hashes from the database) takes less time than the former (miners get all the leaf nodes from the database and do hashes), this operation is still time-consuming. Because each hash must be taken from the database.

In a hexadecimal tree, you usually need to extract 15 sibling hashes for each layer. In the example I constructed above (Figure 1), (recalculating the root hash) requires 30 hash values.

Although the binary tree level is a little deeper, only one sibling hash value is needed at each level. In the above example (Figure 2), only 8 hashes are needed! This is why the binary tree is better in practice.

Overlay transformation method


Unfortunately, converting to a binary tree is not simple. There is too much data to be converted, and it will take more than 15 seconds to generate the block.

In addition, suppose you want to translate a 5000-page book. The author keeps telling you that they have made some changes to the story, and these changes will affect the pages you have translated … Then the process will be endless Too. The format of the conversion status tree is also the same problem: maybe you just completed the format conversion of an address, and the user used the address (so the status of the address is updated), then you have to convert it again from the beginning (because the status of an address Updates will also affect the entire state tree).

The solution to this problem is to add a transition period. During the transition period, build an overlay tree on the base of the hexadecimal tree . This overlay tree is in the form of a binary tree, and its role is to save all changes that occur in the state until the underlying hexadecimal tree is completely converted to a binary tree. The conversion is carried out in 3 steps.

Step 1-Conversion

Under this method, when the block height is H1 , there must be two state roots: one is the "base" hexadecimal tree state root, and the other is the "overlay" binary tree state root.

-Figure 4. During the conversion process, the block has two state roots: one is the read-only root of the traditional hexadecimal tree, and the other is a read-write root that covers the binary tree-

The hexadecimal tree is set to read-only, so any updates to the status will be made on the overlay tree.

When a transaction reads or updates an account, the system first searches the overlay tree. If the account is not found in the overlay tree, then the value will be searched in the old hexadecimal tree.

At the same time, the hexadecimal tree is converted in the background. There is no need to worry about value insertion at this time, because all changes will be stored in the overlay tree above.

Step 2-Primary tree switch

When the background conversion process is complete, the miners announce that they are ready to replace the read-only hexadecimal root of the root tree with the conversion result (read-only binary root). The reading and writing of the status is the same as in step 1. (Translator's Note: The read-only binary base tree at this time is obtained from the original hexadecimal state tree)

-Figure 5. In the second phase of the conversion, the miner uses the root of the converted binary tree in the block header to replace the root of the hexadecimal tree, indicating to the network that they are ready-

When enough series of blocks give the same value to the converted binary root of the root tree, it means that most miners have completed the conversion and approved the converted tree. The merge process begins. (Translator's Note: The merge at this time is for read-only binary base trees and read-write binary overlay trees)

Step 3-Merge two trees

The merge process continues to advance: each time a new block is generated, n keys are deleted from the overlay tree, and they are reinserted into the binary base tree. This process continues until all keys are removed from the overlay tree. When this step is reached, the block header no longer retains the root of the overlay state tree.

There is only one core of the whole step: if the key to be written when the transaction is executed exists in the overlay tree, the key will be deleted from the overlay tree, and the write operation is directly performed on the binary base tree.


Next step

In order to estimate the time required to complete the conversion, I have made a prototype system with a low conversion rate. We are sure that the time spent in the whole process will not be too ridiculous, that is to say, a few days is enough. We will announce more details as the algorithm improves.


This proposal benefited from the valuable comments of Alexey Akhunov, Vitalik Buterin, Anna George, Sina Mahmoodi, Tomasz Stanczak and Martin H. Swende.

Original link: https://medium.com/@gballet/ethereum-state-tree-format-change-using-an-overlay-e0862d1bf201