Vitalik: Eth2 Shard Chain Simplification Proposal

Author: Vitalik Buterin
Source: ECN Ethereum China  

Summary of key changes

  • Persistent shard chains no longer exist. Instead, each shard block is directly crosslinked: the proposer publishes a proposal, and the crosslink committee is responsible for approving it.
  • The shard count is reduced from 1024 to 64, and the shard block size is increased from (target 16, cap 64) kB to (target 128, cap 512) kB. Total shard capacity is 1.3-2.7 MB/s, depending on slot time. If needed, the shard count and block size can grow over time, eventually reaching 1024 shards and 1 MB blocks.
  • Many things are simplified at both layer 1 and layer 2: (i) less shard chain logic is needed, (ii) layer-2 cross-shard acceleration is simplified because "native" cross-shard communication now takes 1 slot, (iii) paying transaction fees across shards no longer requires a decentralized exchange, (iv) execution environments can be further simplified, (v) there is no need to mix serialization and hashing.
  • The main disadvantages are: (i) higher beacon chain costs, (ii) longer shard block times, (iii) higher "burst" bandwidth requirements, but lower "average" bandwidth requirements.
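A quick sanity check of the capacity figure above (a sketch; the 3-6 second slot-time range is taken from the introduction below, and the helper name is ours):

```python
# Sketch: sanity-check of the total shard data capacity quoted above.
# Assumes 64 shards, a 128 kB target block size, and slot times of
# 3-6 seconds (figures taken from this proposal).
SHARD_COUNT = 64
TARGET_BLOCK_KB = 128

def capacity_mb_per_s(slot_time_s: float) -> float:
    """Total data throughput across all shards, in MB/s."""
    total_kb_per_slot = SHARD_COUNT * TARGET_BLOCK_KB  # 8192 kB per slot
    return total_kb_per_slot / 1024 / slot_time_s

print(round(capacity_mb_per_s(6.0), 1))  # 1.3 MB/s with 6-second slots
print(round(capacity_mb_per_s(3.0), 1))  # 2.7 MB/s with 3-second slots
```

This reproduces the 1.3-2.7 MB/s range: the endpoints simply correspond to the slow and fast ends of the slot-time range.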

Introduction / concept

The current Ethereum 2.0 architecture is too complex, particularly in the fee market. The problem stems from layer 2 solutions designed around a major shortcoming of the 2.0 base layer: although block times within a shard are very short (3-6 s), base-layer communication time between shards is very long, taking 1-16 epochs (longer if more than 1/3 of validators are offline). This creates an urgent need for "optimistic" solutions: a subsystem uses a medium-security mechanism (such as light clients) to "guess" the state roots of other shards in advance, and processes transactions against these unconfirmed state roots to compute its own state. After a while, all shards go through a reconciliation process that checks which computations used the "correct" information about other shards' states, and discards all computations that did not.

While this approach works, and can effectively simulate ultra-fast communication times in many cases, the gap between "optimistic" ETH and "real" ETH creates other complications. Specifically, we cannot assume that block proposers "know about" optimistic ETH, so if a user on shard A sends ETH to a user on shard B, there is a delay before the user on shard B has protocol-layer ETH (the only ETH that can be used to pay transaction fees). Avoiding the delay requires either decentralized exchanges (which bring complexity and inefficiency) or relay markets (which raise the concern that a monopoly relayer could censor users).

In addition, the current crosslinking mechanism adds considerable complexity. It effectively requires an entire extra set of blockchain logic, including reward and penalty calculations, separate storage of in-shard reward state, and fork choice rules, to be included in the shard chains as part of Phase 1. This document presents a bold alternative that addresses all of these issues, letting Ethereum 2.0 become usable sooner while also reducing risk, at the cost of some compromises.

Proposal details

We reduce SHARD_COUNT from 1024 to 64 and increase the maximum number of shards crosslinked per slot from 16 to 64. This means that the "optimal" workflow is now that between every beacon chain block, every shard produces a crosslink (for clarity, we may stop using the word "crosslink", since there is no longer a shard chain to "link" to; it is more natural to speak of the shard block being included directly).

Note a key detail: any slot-n+1 block on any shard can now learn about all slot-n blocks of all shards through a single path. We therefore have first-class single-slot cross-shard communication (via Merkle receipts).

Current approach

New proposal

In this proposal we change the structure of the attested object: instead of a "crosslink" containing the "data root" of many shard blocks in a complex serialized form, it contains only the data root of a single block, representing the contents of that block (which are fully determined by the proposer). The shard block also includes the proposer's signature. To promote p2p network stability, the proposer is still computed with a persistent-committee-based algorithm. If no proposal is available, members of the crosslink committee may also vote for a "zero proposal".

We still store a map latest_shard_blocks: shard -> (block_hash, slot) in the state; the difference is that it now stores slots rather than epochs. In the "optimistic case", we expect this map to update every slot.

online_validators is defined as the subset of active validators that have had at least one attestation included during the past 8 epochs. If validators representing 2/3 of online_validators agree on a new block for a given shard, the map is updated.

Suppose the current slot is n, but for a given shard i, latest_shard_blocks[i].slot < n-1 (that is, the shard skipped a block in the previous slot). Then we require attestations for shard i to provide data roots for all slots in the range [latest_shard_blocks[i].slot + 1 … min(latest_shard_blocks[i].slot + 8, n-1)].
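The rule above can be sketched as follows (a hypothetical helper; the map name latest_shard_blocks and the 8-slot lookback are taken from the text):

```python
# Sketch of the attestation data-root requirement described above.
# latest_shard_blocks maps shard -> (block_hash, slot), as in the text;
# the helper function name is hypothetical.
from typing import Dict, List, Tuple

def required_data_root_slots(
    latest_shard_blocks: Dict[int, Tuple[bytes, int]],
    shard: int,
    current_slot: int,
) -> List[int]:
    """Slots whose data roots an attestation for `shard` must provide."""
    last_slot = latest_shard_blocks[shard][1]
    if last_slot >= current_slot - 1:
        return []  # no skipped slots: only the current proposal is needed
    # Cover the skipped slots, capped at an 8-slot lookback.
    return list(range(last_slot + 1, min(last_slot + 8, current_slot - 1) + 1))

# Example: shard 3 last updated at slot 90, current slot is 100
print(required_data_root_slots({3: (b"\x00" * 32, 90)}, 3, 100))
# -> [91, 92, 93, 94, 95, 96, 97, 98]
```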

Shard blocks still need to point to a "previous shard block", and we still enforce consistency, so the protocol requires these multi-slot attestations to be consistent. We recommend that committees adopt the following "fork choice rule":

  • For each valid and available shard block B (whose ancestors must also be valid and available), compute the total weight of validators whose most recent messages support B or a descendant of B; call this B's "score". Even an empty shard block can have a score.
  • For slot latest_shard_blocks[i].slot + 1, choose the highest-scoring block.
  • For slot latest_shard_blocks[i].slot + k (with k > 1), choose the highest-scoring block among the candidates that point to the block already chosen for slot latest_shard_blocks[i].slot + (k - 1).
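The fork-choice rule above can be sketched like this (a minimal illustration; the data structures and function names are hypothetical, and "weight" stands for a validator's effective balance):

```python
# Sketch of the committee fork-choice rule described above.
from typing import Dict, List, Optional

class ShardBlock:
    def __init__(self, block_hash: bytes, parent: Optional[bytes], slot: int):
        self.hash, self.parent, self.slot = block_hash, parent, slot

def score(block: ShardBlock,
          blocks: Dict[bytes, ShardBlock],
          latest_votes: Dict[int, bytes],   # validator index -> voted block hash
          weights: Dict[int, int]) -> int:
    """Total weight of validators whose latest message supports
    `block` or one of its descendants."""
    total = 0
    for validator, voted_hash in latest_votes.items():
        # Walk up from the voted block; count the vote if we reach `block`.
        cursor = blocks.get(voted_hash)
        while cursor is not None:
            if cursor.hash == block.hash:
                total += weights[validator]
                break
            cursor = blocks.get(cursor.parent) if cursor.parent else None
    return total

def choose_chain(head_slot: int, target_slot: int,
                 blocks: Dict[bytes, ShardBlock],
                 latest_votes: Dict[int, bytes],
                 weights: Dict[int, int],
                 head_hash: bytes) -> List[bytes]:
    """Greedily pick the highest-scoring block slot by slot; each choice
    must point at the block chosen for the previous slot."""
    chain = []
    parent = head_hash
    for slot in range(head_slot + 1, target_slot + 1):
        candidates = [b for b in blocks.values()
                      if b.slot == slot and b.parent == parent]
        if not candidates:
            break  # empty slot
        best = max(candidates,
                   key=lambda b: score(b, blocks, latest_votes, weights))
        chain.append(best.hash)
        parent = best.hash
    return chain
```

Note how a vote for a descendant also counts toward its ancestors' scores, which is what makes the greedy slot-by-slot selection consistent with the "B or a descendant of B" wording above.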

Overview

The process between beacon block N and beacon block N+1 is as follows:

  1. Beacon block N is published;
  2. For any given shard i, the proposer of shard i proposes a shard block. Execution of this block can see the roots of beacon block N and earlier blocks (if necessary, we could reduce visibility to block N-1 and older, which would allow beacon block and shard block proposals to be made at the same time);
  3. Attesters assigned to shard i publish attestations, including their opinion on the slot-N beacon block and on the shard block in shard i (in exceptional cases, also on earlier shard blocks in shard i);
  4. Beacon block N+1 is published, including these attestations for all shards. Block N+1's state transition function processes the attestations and updates the "latest state" of all shards.

Cost analysis

Note that participants do not need to be actively downloading shard data at all times. Instead, when a proposer makes a proposal, it only needs to upload at most 512 kB within 3 seconds (assuming 4 million validators, each validator proposes only once every 128,000 slots on average), and when committee members verify a proposal, each only needs to download at most 512 kB within 3 seconds (each validator must do this once per epoch, because we preserve the property that in each epoch, every validator is assigned to a specific crosslink in a specific slot).

Note that this requirement is lower than the current long-run load per validator, i.e. about 2 MB per epoch. However, it demands a higher "burst" load: where the previous cap was 64 kB within 3 seconds, it must now reach 512 kB within 3 seconds.

The beacon chain's data load from attestations changes as follows.

Each attestation carries roughly 300 bytes of fixed data, plus a bitfield of 4 million bits per epoch, or 8192 bytes per slot. So the maximum load under the current scheme is 128 * 300 + 8192 = 46592 bytes, and the average-case load is probably closer to 32 * 300 + 8192 = 17792 bytes, and compressing the redundant information in attestations could reduce this further.

In this proposal, attestations come in two forms (see the section on compressed attestations below for details):

  • Attestations of slot n included in slot n+1. We can allow inclusion of the two most popular shard-block/beacon-block combinations per shard, giving up to 128 uncompressed attestations.
  • Attestations of slots before n included in slot n+1, in compressed form (about 200 bytes per attestation), up to 128 of them.

So the maximum load is 128 * 300 + 128 * 200 + 8192 = 72192 bytes, and the average load is approximately 80 * 300 + 10 * 200 + 8192 = 34192 bytes.

Also note that the attestation cost per shard per slot is 65536 * 300 / 64 = 307200 bytes. This provides a natural floor on the system requirements for running a node, so there is little point in compressing shard block data further.
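The byte counts above can be checked in a few lines (a sketch using only the figures given in the text):

```python
# Sanity-check of the attestation load figures quoted above.
ATTESTATION_FIXED = 300  # ~fixed bytes per attestation (from the text)
COMPRESSED = 200         # ~bytes per compressed old attestation
BITFIELD = 8192          # per-slot bitfield bytes (from the text)

# Current scheme
assert 128 * ATTESTATION_FIXED + BITFIELD == 46592                     # maximum
assert 32 * ATTESTATION_FIXED + BITFIELD == 17792                      # average

# This proposal
assert 128 * ATTESTATION_FIXED + 128 * COMPRESSED + BITFIELD == 72192  # maximum
assert 80 * ATTESTATION_FIXED + 10 * COMPRESSED + BITFIELD == 34192    # average

# Per-shard per-slot cost
assert 65536 * ATTESTATION_FIXED // 64 == 307200
print("all load figures check out")
```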

At the computation level, the only significant increase in cost is the need for more pairings (more precisely, more Miller loops): the per-block cap rises from 128 to 192, which could extend block processing time by roughly 200 ms.

Shards as a "basic operating system"

Each shard has a state consisting of a map ExecEnvID -> (state_hash, balance). A shard block is divided into chunks, each chunk specifying an execution environment (EE). Executing a chunk takes the EE's state root and the block contents (i.e. part of the shard block data) as input, and outputs a list of [shard, EE_id, value, msg_hash] tuples, with at most one tuple per (shard, EE_id) pair (we add two "virtual" shards: a transfer to shard -1 represents a validator deposit to the beacon chain, and a transfer to shard -2 is a payment to the proposer). The total of the values is subtracted from the EE's balance.

In the shard block header, we place a "receipt root" containing a map shard -> [[EE_id, value, msg_hash] …] (at most 8 elements per shard; note that the vast majority of cross-shard EE transfers will be sent to the same EE, in which case the number of elements is even smaller).

A shard block on shard i must contain, for every other shard j, a Merkle branch of shard j's receipt for shard i, rooted in shard j's "receipt root" (shard i knows the receipt root of every shard j). The received values are credited to their EEs (execution environments), and the msg_hash is made accessible to EE execution.

This allows EEs on different shards to transfer ETH to each other immediately, at a cost per shard of (32 * log2(64) + 48) * 64 = 15360 bytes. The msg_hash can be used to shrink the cross-shard messages that need to be verified along with the ETH transfer, so in a highly active system these 15360 bytes are well worth it.
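The per-shard cost above can be reproduced directly. The Receipt class below is a hypothetical illustration of the [EE_id, value, msg_hash] tuples from the text, and the interpretation of the +48 term as per-branch leaf data is our assumption:

```python
# Sketch: receipt tuple and the Merkle-branch cost arithmetic from the text.
from dataclasses import dataclass
import math

@dataclass
class Receipt:
    # Hypothetical container for the [EE_id, value, msg_hash] tuple.
    ee_id: int
    value: int        # amount transferred
    msg_hash: bytes   # commitment to the cross-shard message body

SHARD_COUNT = 64
HASH_SIZE = 32       # bytes per Merkle branch node
LEAF_OVERHEAD = 48   # the +48 term in the text's formula (assumed leaf data)

# One branch per other shard: (32 * log2(64) + 48) bytes each, 64 branches.
branch_size = HASH_SIZE * int(math.log2(SHARD_COUNT)) + LEAF_OVERHEAD
total_per_shard = branch_size * SHARD_COUNT

r = Receipt(ee_id=1, value=10**9, msg_hash=b"\x00" * 32)  # illustrative only
assert branch_size == 240
assert total_per_shard == 15360  # matches the figure in the text
```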

Key benefit: a simpler fee market

We can then modify the execution environment (EE) system: each shard has a state containing the state root and balance of each execution environment. Execution environments can send receipts, transferring ETH directly to the same EE on other shards; this is done using the Merkle branch processing mechanism above. Each shard's EE state stores a nonce for each other shard to defend against replay attacks. EEs can also pay block proposers directly.

This is powerful enough to build EEs that let users deposit coins on shards and use them for transaction fees, transferring them across shards as easily as within a single shard, removing the urgent need for relay markets and letting EEs bear the burden of implementing optimistic cross-shard state.

Compressed attestations

For efficiency, we also make the following optimization. As mentioned earlier, attestations for slot n can be included in full in slot n+1. However, if such attestations are included in later slots, they must be included in a "reduced form" containing only the beacon chain data (head, target, source) without any crosslink data.

This is useful both for reducing data and, more importantly, because forcing "old attestations" to carry identical data reduces the number of pairings needed to verify them: in most cases, all old attestations from the same slot can be verified with a single pairing. If the chain does not fork, then even in the worst case the number of pairings needed to verify old attestations is bounded by twice the epoch length. If the chain does fork, including all attestations depends on a larger fraction of honest proposers (e.g. 1/32 instead of 1/64) including the older attestations.

Ensuring light client participation

Every day, we randomly select a committee of approximately 256 validators that can sign each block; validators whose signatures are included in block n+1 are rewarded. The purpose is to allow light clients with low computing power to participate.

Aside: data availability roots

Proving the availability of each 128 kB piece of data separately is wasteful and of little value. Instead, it makes sense to require a block to provide the root of the concatenation of all the shard block data it accepts (perhaps with the shard block data roots stored as a list, where each entry represents 64 kB of data, which makes concatenation easier). A single data availability root can then be created over this data (8 MB on average, 32 MB in the worst case). Note that creating these roots may take longer than one slot, so it is best to check the availability of data from one epoch earlier (i.e. sampling from these data availability roots would be an additional "deterministic" check).
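A minimal sketch of combining per-block data roots into a single availability root, under the 64 kB chunking suggested above (the plain binary Merkle tree and sha256 hashing here are illustrative assumptions, not the spec's hashing scheme):

```python
# Sketch: a single availability root over chunked shard block data.
from hashlib import sha256
from typing import List

CHUNK_SIZE = 64 * 1024  # 64 kB pieces, as suggested in the text

def chunk_roots(block_data: bytes) -> List[bytes]:
    """Hash each 64 kB piece of a shard block's data (zero-padded)."""
    padded = block_data + b"\x00" * (-len(block_data) % CHUNK_SIZE)
    return [sha256(padded[i:i + CHUNK_SIZE]).digest()
            for i in range(0, len(padded), CHUNK_SIZE)]

def merkle_root(leaves: List[bytes]) -> bytes:
    """Plain binary Merkle root (pad to a power of two with zero leaves)."""
    n = 1
    while n < len(leaves):
        n *= 2
    layer = leaves + [b"\x00" * 32] * (n - len(leaves))
    while len(layer) > 1:
        layer = [sha256(layer[i] + layer[i + 1]).digest()
                 for i in range(0, len(layer), 2)]
    return layer[0]

# One availability root over the concatenation of all accepted shard data.
all_leaves = []
for data in [b"shard-0 data", b"shard-1 data"]:
    all_leaves.extend(chunk_roots(data))
root = merkle_root(all_leaves)
assert len(root) == 32
```

Sampling for availability would then be done against `root` rather than against each shard block individually, matching the single-root idea above.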

Other possible options

  • The shard block at slot n could reference the beacon chain block of slot n-1 instead of slot n. This would let the work of each slot happen in parallel rather than in series, reducing slot time, at the cost of raising cross-shard communication time from 1 slot to 2 slots.
  • If a block proposer wants to grow a block beyond 64 kB (note: the target is 128 kB), they must first produce 64 kB of data and have the crosslink committee sign it; they can then append data referencing the first signed 64 kB, and so on. This encourages block producers to publish partially completed versions of their blocks every few seconds, creating a pre-confirmation mechanism.
  • Accelerate the development of secret leader election (for simplicity, even an anonymized ring-signature scheme over about 8 to 16 proposers would help).
  • Instead of a "forced inclusion" mechanism, we could pursue a simpler alternative: each shard maintains an "incoming nonce" and an "outgoing nonce" for each other shard (i.e. 8 * 64 * 2 = 1024 bytes of state), and receipts generated by a shard must be explicitly included and processed in order by the receiving shard. Receipt generation would be limited to one small receipt per target shard per block, to ensure that a shard can process all incoming receipts even if every shard sends receipts to it simultaneously.
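The nonce-based alternative in the last bullet can be sketched as follows (hypothetical data structures; 8-byte nonces per shard pair give the 8 * 64 * 2 = 1024 bytes of state mentioned above):

```python
# Sketch of the per-pair nonce scheme from the last bullet above.
from typing import Dict, List, Tuple

SHARD_COUNT = 64

class ShardState:
    """Per-shard nonce bookkeeping for the alternative receipt scheme."""
    def __init__(self) -> None:
        self.outgoing_nonce = {s: 0 for s in range(SHARD_COUNT)}
        self.incoming_nonce = {s: 0 for s in range(SHARD_COUNT)}
        # Receipts awaiting processing, keyed by source shard.
        self.inbox: Dict[int, List[Tuple[int, bytes]]] = {
            s: [] for s in range(SHARD_COUNT)
        }

def send_receipt(src: ShardState, dst: ShardState,
                 src_id: int, dst_id: int, payload: bytes) -> None:
    """Emit one receipt toward dst; the nonce enforces per-pair ordering."""
    nonce = src.outgoing_nonce[dst_id]
    dst.inbox[src_id].append((nonce, payload))
    src.outgoing_nonce[dst_id] = nonce + 1

def process_inbox(dst: ShardState, src_id: int) -> List[bytes]:
    """Consume receipts from src_id strictly in nonce order (replay-safe:
    a receipt with an already-consumed nonce is simply skipped)."""
    processed = []
    for nonce, payload in sorted(dst.inbox[src_id]):
        if nonce == dst.incoming_nonce[src_id]:
            processed.append(payload)
            dst.incoming_nonce[src_id] += 1
    dst.inbox[src_id] = []
    return processed
```

Limiting each block to one small receipt per target shard (as the bullet suggests) bounds the inbox growth, so a shard can keep up even when all 63 other shards send to it at once.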