Introduction to Ethereum Technology: A Semi-Stateless Initial Sync Experiment

Raw data and scripts used in this experiment: https://github.com/mandrigin/ethereum-mainnet-resolver-witness-stats

Introduction

One way to potentially speed up the initial sync (syncing the blockchain from the genesis block) is to use block witness data to build a cache of the state tree in advance, which avoids slow state accesses. This costs additional disk space and network bandwidth, but may speed up synchronization considerably.

The idea is as follows. To execute a block, we need some of the data in the Merkle tree. Part of that data is already in the in-memory tree before the block is executed, but it may not be enough to execute the block. So normally we also have to fetch data from the state database (state db) and add it to the Merkle tree before we can verify the transactions. This can be slow, because disk accesses and database queries are slow.

Based on this description of the problem, we can distinguish three different schemes:

1) Normal process (that is, the scheme currently used in Ethereum nodes)

  1. Before block B is executed, we have a state tree T1.
  2. To execute B, we add the data that is missing from T1 step by step, forming T1', T1'', and so on. Every time we need data that is not in T1, we look it up in the database (slow).
  3. After executing B, we have a state tree T2 that contains all the account state needed to execute B.
  4. Keep T2 for later use.

2) Stateless process

  1. Before block B is executed, we have no state tree; instead, we receive witness data W, from which the part of the state tree needed to execute this block can be reconstructed.
  2. We process W and obtain the state tree T2.
  3. Block B is executed on T2 without searching the database.
  4. After the block is executed, T2 is discarded.

3) Semi-stateless flow (i.e., the scheme tested in this experiment)

  1. Before block B is executed, we have a state tree T1 and witness data W1, W2, …, which together are enough to turn T1 into T2.
  2. Apply W1, W2, … to T1 in order, obtaining T2 without querying the database.
  3. Block B is executed on T2 without querying the database.
  4. Keep T2 for future use.

Using the semi-stateless flow for the initial sync gives most of the benefits of the stateless flow without transferring as much data, because we reuse the state tree cache.
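The difference between the three flows can be illustrated with a toy model. The sketch below is not turbo-geth code: the state tree is reduced to a plain dictionary, a block is just the list of state keys it touches, and every name (`Database`, `make_witness`, `compare_flows`, and so on) is hypothetical. It also ignores cache eviction. It only counts how many slow database reads the normal flow needs and how many entries the witnesses have to carry in the two other flows.

```python
class Database:
    """Stand-in for the state db; every read counts as one slow lookup."""

    def __init__(self, data):
        self.data = dict(data)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.data[key]


def make_witness(state, tree, block):
    """Entries touched by `block` that are not yet present in `tree`."""
    return {key: state[key] for key in block if key not in tree}


def execute_normal(db, tree, block):
    # Normal flow: fill the gaps in the tree from the database (slow).
    for key in block:
        if key not in tree:
            tree[key] = db.get(key)
    return tree  # kept for the next block


def execute_stateless(witness, block):
    # Stateless flow: the tree is rebuilt from the witness alone.
    tree = dict(witness)
    assert all(key in tree for key in block)
    # The tree is discarded after the block has been executed.


def execute_semi_stateless(tree, witness, block):
    # Semi-stateless flow: the witness only carries what the cache lacks.
    tree.update(witness)
    assert all(key in tree for key in block)
    return tree  # kept for the next block


def compare_flows(state, blocks):
    db, tree = Database(state), {}
    for block in blocks:
        execute_normal(db, tree, block)

    stateless_entries = 0
    for block in blocks:
        witness = make_witness(state, {}, block)  # no cache to rely on
        execute_stateless(witness, block)
        stateless_entries += len(witness)

    semi_entries, tree = 0, {}
    for block in blocks:
        witness = make_witness(state, tree, block)
        execute_semi_stateless(tree, witness, block)
        semi_entries += len(witness)

    return {
        "normal_db_reads": db.reads,
        "stateless_entries": stateless_entries,
        "semi_stateless_entries": semi_entries,
    }


if __name__ == "__main__":
    state = {"a": 1, "b": 2, "c": 3, "d": 4}
    blocks = [["a", "b"], ["b", "c"], ["c", "d", "a"]]
    print(compare_flows(state, blocks))
```

In this toy run the semi-stateless witnesses carry fewer entries than the stateless ones, because keys that are already cached from earlier blocks are not sent again.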

In the semi-stateless scheme, parallel execution of blocks is more limited, because each block relies on the state tree left behind by the previous block.


So, to evaluate the semi-stateless approach, we need to measure two things:

  • How much additional disk space / bandwidth does this method require? Is it really better than the fully stateless approach?
  • How much faster is the initial synchronization?

In this article we focus on the disk space requirements.

Experiment setup

  • The maximum size of the state tree (Merkle tree): 1 million nodes. Once the number of nodes exceeds this value, we evict the least recently used (LRU) nodes to free up memory. In this way, we can control the memory usage of the state tree.
  • The partial witness data is stored in a database (we use boltdb). Each entry has the following structure:
      key:   [12]byte // block number + maximum number of nodes in the state tree
      value: []byte   // witness data, serialized as described in the documentation (https://github.com/ledgerwatch/turbo-geth/blob/master/docs/programmers_guide/witness_format.md)
  • We do not store contract code in witness data (this is a deficiency of our current architecture).
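The two mechanisms in this setup, the 12-byte database key and the LRU-bounded tree, can be sketched as follows. This is only an illustration under stated assumptions: the article says the key is 12 bytes built from the block number and the node limit, but not how the bytes are split, so the 8 + 4 byte big-endian layout in `witness_key` is a guess. `LRUNodeCache` is likewise a hypothetical flat cache; evicting nodes from a real Merkle tree is more involved, since parents have to keep the hashes of evicted children.

```python
import struct
from collections import OrderedDict


def witness_key(block_number, max_trie_nodes):
    # ASSUMPTION: 8-byte big-endian block number followed by a 4-byte
    # big-endian node limit. Only the 12-byte total comes from the article.
    return struct.pack(">QI", block_number, max_trie_nodes)


class LRUNodeCache:
    """Flat cache that holds at most `max_nodes` entries, evicting in LRU order."""

    def __init__(self, max_nodes):
        self.max_nodes = max_nodes
        self._nodes = OrderedDict()

    def __len__(self):
        return len(self._nodes)

    def get(self, path):
        node = self._nodes.get(path)
        if node is not None:
            self._nodes.move_to_end(path)  # mark as most recently used
        return node

    def put(self, path, node):
        """Insert or refresh a node; return the paths evicted to stay in bounds."""
        self._nodes[path] = node
        self._nodes.move_to_end(path)
        evicted = []
        while len(self._nodes) > self.max_nodes:
            old_path, _ = self._nodes.popitem(last=False)  # least recently used
            evicted.append(old_path)
        return evicted


if __name__ == "__main__":
    cache = LRUNodeCache(max_nodes=2)
    cache.put("a", "node-a")
    cache.put("b", "node-b")
    cache.get("a")                      # "a" becomes most recently used
    print(cache.put("c", "node-c"))     # evicts "b"
    print(witness_key(6169246, 1000000).hex())
```

Anything evicted this way has to come back through a later witness, which is why the node limit affects the witness size and is recorded as part of the key.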

The data is obtained as follows (this requires a fully synced turbo-geth node):

 (in the turbo-geth repository)
 make state
 ./build/bin/state stateless \
   --chaindata ~/nvme1/mainnet/mainnet/geth/chaindata \
   --statefile semi_stateless.statefile \
   --snapshotInterval 1000000 \
   --snapshotFrom 10000000 \
   --statsfile new_witness.stats.compressed.2.csv \
   --witnessDbFile semi_stateless_witnesses.db \
   --statelessResolver \
   --triesize 1000000

Experimental results

Storage

Starting from the genesis block, 6,169,246 (about 6.17 million) blocks were synced, and the witness database (boltdb) reached 99 GB.

Quantile analysis of witness data size

 python quantile-analysis.py cache_1_000_000/semi_stateless_witnesses.db.stats.1.csv 

 mean             0.038 MB
 median           0.028 MB
 90th percentile  0.085 MB
 95th percentile  0.102 MB
 99th percentile  0.146 MB
 maximum          2.350 MB
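For readers who want to reproduce this kind of summary on their own data, here is a minimal sketch of such an analysis. It is not the `quantile-analysis.py` script from the repository. The input is assumed to be a plain list of per-block witness sizes in bytes, 1 MB is assumed to mean 1024 * 1024 bytes, and the percentile is computed by linear interpolation, which may differ slightly from the method the original script uses.

```python
import statistics


def percentile(sorted_values, q):
    """Percentile q (0..100) of pre-sorted data, by linear interpolation."""
    if not sorted_values:
        raise ValueError("no data")
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def summarize(sizes_bytes):
    """Summary statistics in MB (ASSUMPTION: 1 MB = 1024 * 1024 bytes)."""
    sizes_mb = sorted(size / (1024 * 1024) for size in sizes_bytes)
    return {
        "mean": statistics.mean(sizes_mb),
        "median": statistics.median(sizes_mb),
        "p90": percentile(sizes_mb, 90),
        "p95": percentile(sizes_mb, 95),
        "p99": percentile(sizes_mb, 99),
        "max": sizes_mb[-1],
    }


if __name__ == "__main__":
    # Made-up sizes, for illustration only.
    example = [20_000, 35_000, 28_000, 90_000, 150_000, 31_000]
    for name, value in summarize(example).items():
        print(f"{name:>6} {value:.3f} MB")
```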

Data size

 python absolute_values_plot.py cache_1_000_000/semi_stateless_witnesses.db.stats.1.csv 

Witness data size from the genesis block up to a height of about 6.1 million blocks, truncated at 1 MB and shown as a moving average over 1024 blocks.
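The smoothing used for this plot can be sketched as below. This is an illustration, not the code of `absolute_values_plot.py`: the trailing window, the handling of the first blocks (averaging whatever is available so far), and the choice to clip after smoothing rather than before are all assumptions.

```python
from collections import deque


def moving_average(values, window=1024):
    """Trailing moving average; early points average the values seen so far."""
    averaged = []
    recent = deque()
    total = 0.0
    for value in values:
        recent.append(value)
        total += value
        if len(recent) > window:
            total -= recent.popleft()  # drop the value leaving the window
        averaged.append(total / len(recent))
    return averaged


def truncate(values, limit):
    """Cap every value at `limit` so outliers do not stretch the plot."""
    return [min(value, limit) for value in values]


if __name__ == "__main__":
    sizes_mb = [0.02, 0.03, 2.35, 0.04, 0.03]  # made-up sizes in MB
    print(truncate(moving_average(sizes_mb, window=2), limit=1.0))
```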

Data size under normal conditions (after the Shanghai DDoS attacks were resolved)

 python absolute_values_plot.py cache_1_000_000/semi_stateless_witnesses.db.stats.1.csv 3000000

Witness data size after the Shanghai DDoS attacks were resolved, shown as a moving average over 1024 blocks.

Zoom in on the witness data size during the DDoS attack

 python ddos_zoom.py cache_1_000_000/semi_stateless_witnesses.db.stats.1.csv 

Zoomed-in view of the impact of the DDoS attacks on witness data size (raw data).

The witness data size increased significantly between block heights 2.3 million and 2.5 million, and again between 2.65 million and 2.75 million.

Fully stateless vs. semi-stateless witness data size

 python full_vs_semi.py cache_1_000_000/semi_stateless_witnesses.db.stats.1.csv 

The fully stateless witness sizes are adjusted to account for the contract code that is missing from the semi-stateless witnesses.

The figure shows that the semi-stateless approach saves a lot of data compared with the fully stateless approach.

Conclusion

Adding a stateless resolver increases the amount of data that has to be transferred / stored by about 0.04 MB per block on average (the mean witness size measured above is 0.038 MB). That is far less than providing a full witness with every block, and the saving is also larger than what we could gain by changing the state tree format (for witness sizes with hexary and binary trees, see my previous article).

If the performance turns out to be acceptable, this is a clear way to speed up the initial sync, and it requires less data than the fully stateless approach.



Original link: https://medium.com/@mandrigin/semi-stateless-initial-sync-experiment-897cc9c330cb
Author: Igor Mandrigin
Translation & proofreading: A Sword