Dry Goods | Ethereum Stateless Client R & D Progress and Difficulties

Written in front: Due to the current data structure adopted by Ethereum 1.0, the state explosion problem of Ethereum makes the operation requirements of full nodes increasingly high. To solve this dilemma, researchers have proposed two solutions One is state leasing, the other is the so-called "stateless client", and the recent eth1-> eth2 merger proposed by God V requires the use of "stateless client". What exactly does this concept mean What is the status of its R & D? A new article published on the Ethereum Foundation's official website, " The Development of Stateless Ethereum, " will give the answer. The original author is Griffin Ichiba Hotchkiss.


Here is the translation:

In a previous article on Ethereum 1.x, we gave a quick introduction to the origins of Eth 1.x research projects, the risks they face, and some possible solutions. We end with the concept of stateless Ethereum and leave this article with more details about stateless clients.

Stateless is a new direction in Eth 1.x research, so we will do a fairly in-depth study to really understand the challenges and possibilities of this direction in the future. For those readers who want to dig deeper, I will do my best to put out links to related resources.

Stateless Ethereum

To know where we are going, we must first understand the concept of "state". When we say "state", it means "state of the transaction."

Ethereum's complete "state" describes the current state of all accounts and balances, as well as the collective memory of all smart contracts deployed and running in EVM. Every final block in the chain has one and only one state, which is mutually agreed by all participants in the network. This state will change and update with each new block added to the chain.

In the context of Eth 1.x research, we must not only know what the state is, but also that it is in the protocol (as defined in the Yellow Book) and most client implementations (such as geth, parity, trinity, besu, etc.) How to express it.

What is trie?

The data structure used by Ethereum is called MTP (Merkle Patricia Trie). The interesting fact is: "Trie was originally taken from the word 'retrieval', but most people pronounce it as' try when they read it. 'To distinguish from' tree '. "Well, I digress. Regarding MTP data structure, what we need to know is:

At the end of the trie, there are all specific pieces of data describing the state (value nodes). This can be the balance of a particular account or a variable stored in a smart contract (such as the total supply of ERC-20 tokens). In the middle is the branch node, which links all values ​​together through hashing. A branch node is an array containing a hash of its children. Each branch node is then hashed and placed in an array of its parent. This continuous hash will eventually reach a state root node at the other end of the trie.


In the simplified diagram above, we can see each value and a path describing how to get those values. For example, to get V-2, we traverse the paths 1,3,3,4. Similarly, V-3 can be achieved by traversing paths 3,2,3,3. Note that the path length in this example is always 4 characters, and usually only one path is available to get the value.

This structure has the important characteristics of determinism and cryptographic verification: the only way to generate the state root is to calculate it from each individual part of the state. By comparing the root hash and the hash that caused it ( Merkle proof ), Easily prove two identical states. In contrast, two different states cannot be created with the same root hash, and any attempt to modify the state with different values ​​will result in different state root hashes.

Ethereum optimizes the trie structure by introducing some new node types (expansion nodes and leaf nodes) that increase efficiency. They encode parts of the path as nodes, making the trie more compact.


In this improved version of the MTP structure, each node will result in a choice between a compressed portion or value of a path shared by multiple subsequent nodes. It is the same data and the same organization, but this trie structure only needs 9 nodes instead of 18 nodes. This may seem more efficient, but in reality it is not optimal. We will explore the reasons in the next section.

To reach a specific part of the state (such as the current ETH balance of the account), you need to start from the state root and climb from node to node along the trie until the desired value is reached. On each node, the characters in the path are used to decide which node to move to next (like a diviner, but it is used to navigate the hash data structure).

In the "real" version of the data structure used by Ethereum, the path is a 64-character (256-bit) address hash, and the value is RLP-encoded data . Branch nodes are an array of 17 elements (of which 16 are each possible hexadecimal character and the remaining one is value), while leaf nodes and extended nodes contain 2 elements (one is a partial path and the other Is the value or hash of the next child node). Regarding this, the Ethereum wiki page may be the best choice, or if you like to dig into the weeds, then this article did a great DIY-trie exercise using Python (unfortunately it Was rejected).

Put it in the database

At this point we should remind ourselves that the trie structure is just an abstract concept. This is a way to package the entire state of Ethereum into a unified structure. However, this structure needs to be implemented in client code and stored on disk (or thousands of disks scattered around the world). This means taking a multidimensional trie and populating it into an ordinary database that only understands [key,value] pairs.

In most Ethereum clients (except turbo-geth), MPT is implemented by creating different [key, value] pairs for each node, where value is the node itself and key is the node's hash.


Therefore, the process of traversing the trie is more or less the same as the theoretical process described earlier. To find the account balance, we will start with the root hash and look up its value in the database to get the first branch node. Using the first character of the hash address, we find the hash of the first node. We look up the hash in the database and get the second node. Using the next character of the hash address, we find the hash of the third node. If we are lucky, we may find an expansion node or leaf node without having to traverse all 64 nibble, but in the end, we will find the required account and be able to retrieve its balance from the database.

Calculating the hash of each new block has the same process to a large extent, but it is the other way round: starting from all edge nodes (accounts), building a trie through continuous hashing, until finally building a new root Hash and compare it with the last agreed block in the chain.

This is the apparent efficiency of the state trie: rebuilding the entire trie on disk is very dense, and the improved version of the MPT structure used by Ethereum improves the efficiency of the protocol at the cost of efficiency.

These additional node types (leaf nodes and expanded nodes) theoretically save the memory needed to store the trie, but they complicate the algorithm of modifying the state in a regular database. Of course, a powerful computer can perform this process extremely fast. However, there are also bottlenecks to pure processing power.


What we have discussed so far is what happened in a single computer running on an Ethereum client such as geth. But Ethereum is a network.The key to all this is to achieve a unified state consensus among thousands of computers and different clients around the world.

The constantly shuffled #Defi, cryptokitty auction or cheeze wizard war tokens, as well as ordinary ETH transmission transactions, will be combined to create a rapidly changing state for the Ethereum client, and they all need to be kept in sync (the more difficult it becomes, the more difficult it becomes ) As Ethereum becomes more popular, the state trie structure will become deeper and deeper.

"Turbo-geth is an implementation that solves the root problem: it flattens the trie database and uses the path of the node (instead of its hash) as the [key, value] pair. This effectively makes the depth of the tree independent of the lookup , And enabled a variety of beautiful features that can improve performance and reduce disk load when running full nodes. "

The state of Ethereum is very large, and it changes with each block. How big is its state? How big is each change? We can roughly locate the current state of Ethereum on about 400 million nodes in the state trie, of which about 3000 (sometimes as many as 6,000) nodes need to be added or modified about every 15 seconds. Keeping pace with the Ethereum blockchain is actually constantly building new versions of the state trie.

"This multi-step process with state trie data is the reason for the huge disk I / O and memory footprint of the Ethereum client. Even the" quick sync "mode may take up to 6 hours. Done. And to run a full Ethereum node, you need to use a fast SSD (not a cheap HDD), because the processing of state changes requires very high disk read / write requirements! "

The important point here is that setting up a new node and then synchronizing it is very different from using an existing node for synchronization. And when we implement stateless Ethereum, (hopefully) the distinction between them will be blurred.

The simple way to synchronize nodes is to use the "full sync" method: starting with the founding block, retrieving a list of every transaction in each block, and building a state trie. For each subsequent block, the state trie is modified, and nodes are added and modified with the replay of the complete history of the blockchain. It takes a full week for the synchronizer to download and execute the state change of each block from the beginning.

The other method, appropriately named "fast-sync", is faster but more complicated: new clients can start from a recently trusted 'checkpoint' ( checkpoint) blocks request status entries instead of requesting transactions from scratch. The total amount of information downloaded this way is much less, but there is still a lot of information to be processed (synchronization is currently not limited by bandwidth, but is limited by disk performance).

Fast synchronizing nodes is essentially competing with the top of the chain. It needs to get all the states in a 'checkpoint', then the state becomes stale and stops being served by the full node (if this happens, it can 'turn' to the new checkpoint).

Once the fast-synchronization node overcomes the obstacles and makes its state fully checkpoint, it can switch to full synchronization, which generates and updates its own copy of the state from the transactions contained in each block.

What is the content of block verification?

Now we can begin to dissect the concept of stateless Ethereum. One of the main goals is to reduce the pain of the synchronization process for new nodes. Considering that only 0.1% of the state is switching from one block to another, it seems like there should be a way to reduce all the extra "stuff" that needs to be downloaded before a fully synchronous switch.

But this is one of the challenges brought by Ethereum 's encrypted secure data structure: In the trie data structure, changing only one value results in a completely different root hash. This is a feature, not a bug! It gives everyone the confidence that they are on the same page (in the same state) as others on the network .

In order to take shortcuts, we need a new piece of information about the state: the block witness content.

Assume that only one value in this trie structure has changed recently (the green block in the image below):


Full nodes that synchronize states (including this transaction) will take the traditional approach: get all state fragments and hash them together to create a new root hash. They can then easily verify that their state is the same as everyone else's (because they have the same hash and the same transaction history).

What about the person who just received the message? What is the minimum amount of information required for a new node to verify to ensure that its observations are consistent with other nodes (at least during the time it observes)?

A new "rookie" node will require older, more sensible full nodes to provide evidence that the observed transactions are in line with everything they have seen so far about state.


In very abstract terms, a witness proof provides all the lost hashes in the state trie and combines some "structural" information about where those has belonged in the trie. This allows "Mengxin" nodes to include new transactions in their state and compute new root hashes locally without requiring them to download the entire copy of the state trie.

In short, this is the idea behind beam sync. The Beam synchronization scheme no longer waits to collect each node in the checkpoint trie, but starts to monitor and try to execute transactions as they occur, requesting the content of each block's witness from the full node for information it does not have .

As more and more states are "touched" by the new transaction, the client can gradually rely more on its own copy of the state, which is gradually filled through beam synchronization until it finally switches to full synchronization.

Pedigree of stateless clients

With the introduction of the content of block verification, the concept of "completely stateless" has begun to be more defined. At the same time, this is where we start to encounter open issues and problems with no obvious solution.

Compared with the beam synchronization scheme, a true stateless client never keeps a copy of the state, it only gets the latest transactions with the witness content and has everything it needs to execute the next block.

You may see that if the entire network is stateless, then this may actually continue-the verification content of the new block can be generated from the previous block, and then there will be verification along the way. )content! At the very least, until the last agreed "transaction state", and the first witness content generated from that state. This is a huge and dramatic change for Ethereum, and it is unlikely to win widespread support.

A less compelling method is to cater for varying degrees of "statelessness". In this network, there will be some nodes that hold a complete copy of the state and can provide new witness content for everyone else.

  1. Among them, the full state node will work as before, but it needs to calculate another witness content and attach it to the new block, or propagate it through the auxiliary network subprotocol;
  2. Partial state nodes can maintain a complete state for only a short period of time, or may only "observe" a state of interest for them, and then obtain the rest of the data they need to verify the block from the content of the witness, This will be of great help to dapp developers running their infrastructure;
  3. By definition, zero-state nodes want their clients to run as lightly as possible, and they can rely entirely on the content of the witness to validate new blocks;

For this scheme to work, it may require chunking and swarming behavior similar to that adopted by bittorrent, in which witness content fragments are propagated as needed and interact with other nodes with (complementary) partial states Establish the best connection. Alternatively, it may involve developing an alternative implementation of a state trie that is more suitable for witness content generation. And these are the things we need to study and experiment!

For a deeper analysis of the trade-offs between stateful nodes and stateless nodes, see the article " The shades of statefulness " by Alexey Akhunov.

An important feature of this semi-stateless approach is that these changes do not have to be made through a hard fork. And through small, testable and incremental improvements, Ethereum's stateless components can be built as a complementary subprotocol, or as a series of uncontested EIPs, rather than as a large "leap of confidence" upgrade.

Roadmap for stateless clients

The elephant in the lab is the size of the content of the witness. The ordinary block contains a block header and a transaction list, and their size is about 100 kB. This is small enough that the propagation of the block is fast relative to the network delay and 15 second block time.

However, witness content needs to include node hashes at the edges and deeper of the state trie. This means that they are much larger: early data shows about 1 MB . Therefore, compared to network latency and block time, simultaneous verification content is much slower, which can be a problem.

This dilemma is similar to the difference between downloading movies and streaming movies: if the network is too slow and the streaming experience is bad, downloading the full movie is the only viable option. If the network is much faster, you can play movies without any problems. And in the middle, you need more data to make a decision. Those sub-standard Internet service providers will recognize the limitations of low-speed networks during high-demand periods.

For the most part, this is the specific problem that the Eth 1x R & D team is trying to solve. At present, our understanding of the hypothetical verification network is not enough to determine whether it works properly or optimally. The main problem is the details (and the data) .

One approach is to consider compressing and reducing the amount of witness content by changing the structure of the trie itself (such as a binary trie) to make it more efficient at the implementation level. The other is a prototyped network primitive (bittorrent-style swarming) that allows witness content to be effectively passed between different nodes on the network. Both of these schemes will benefit from a formal content specification (this specification does not yet exist).

Developers will compile these directions (and more possibilities) into a roadmap and release them in the coming weeks. The main points highlighted in the roadmap will also be the subject of future in-depth analysis articles.

If you can understand the above, then you should have a good understanding of the meaning of "stateless Ethereum" and the background of some emerging Eth1x R & D projects.

As always, if you have any questions about Eth1x's efforts, themes, or if you want to contribute to this, please visit ethresear.ch and introduce yourself, or contact @gichiba and @JHancock on twitter.

Special thanks to Alexey Akhunov for the technical feedback and some trie structure diagrams.

Happy New Year and Happy Moore Glacier!