Technical Interpretation | A Guide to On-chain Scaling Technology (1): Data Layer and Network Layer

On-chain scaling (also called Layer 1 scaling) targets the blockchain protocol layer itself: it modifies the underlying blockchain, including its data layer, network layer, consensus layer, and incentive layer, so that the chain itself becomes faster and gains capacity. This series introduces existing on-chain scaling technologies across the data layer, network layer, and consensus layer; this first part covers the data layer and the network layer.

Data layer

Directed acyclic graph

A DAG (Directed Acyclic Graph) is a data structure. It has long been common in computer science, and thanks to the useful properties of its topology it is often used in algorithmic scenarios such as dynamic programming, shortest-path search in navigation, and data compression.

As a newer competing technology for distributed ledgers, the DAG can be used to address the efficiency problem of traditional blockchains. The idea was first proposed as the "GHOST protocol" on the bitcointalk forum in 2013, as an attempt to solve the Bitcoin scaling problem of the time. The PoS consensus protocol Casper, described by Vitalik in the Ethereum Mauve Paper, is also a PoS variant based on the GHOST PoW protocol. Later, in the NXT community, someone proposed a DAG of blocks, which uses the DAG topology to store blocks and address performance problems.

A traditional blockchain has only a single chain, so packaged blocks cannot be written concurrently, whereas the mesh topology of a DAG allows concurrent writes. By moving from synchronous to asynchronous bookkeeping, the DAG is regarded as a way to solve the high-concurrency problem of traditional blockchains, an innovation that extends blockchains from capacity to speed.

What is a directed acyclic graph?

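In simple terms, a directed acyclic graph is a set of nodes joined by one-way edges in which no path leads back to its starting node. In a ledger built on a DAG, multiple branches can extend alongside the main chain; all of them grow in the same general direction and never form a loop.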

Figure: Directed Acyclic Graph Data Structure
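To make the "no loop" property concrete, here is a minimal Python sketch that checks whether a directed graph is acyclic using Kahn's topological sort. The function name `topological_order` and the two sample graphs are illustrative assumptions and are not taken from any DAG ledger project.

```python
from collections import deque


def topological_order(edges):
    """Return a topological order of the graph, or None if it has a cycle.

    `edges` maps each node to the list of nodes it points to.
    """
    # Collect every node, including ones that only appear as targets.
    nodes = set(edges)
    for targets in edges.values():
        nodes.update(targets)

    indegree = {n: 0 for n in nodes}
    for targets in edges.values():
        for t in targets:
            indegree[t] += 1

    # Start from nodes that nothing points to.
    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for t in edges.get(n, []):
            indegree[t] -= 1
            if indegree[t] == 0:
                queue.append(t)

    # Nodes that were never released sit on a cycle.
    return order if len(order) == len(nodes) else None


dag = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
cyclic = {"A": ["B"], "B": ["C"], "C": ["A"]}
```

In this sketch `topological_order(dag)` yields an ordering in which every edge points forward, while `topological_order(cyclic)` returns `None` because the three nodes form a loop.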

DAG vs Blockchain


  1. The basic unit of a blockchain is the block, which packages multiple transactions (setting aside the block size debate for now). The basic unit of a DAG is the tx (transaction).
  2. A blockchain is single-threaded, while a DAG is multi-threaded.
  3. In a blockchain, all transaction records are stored in blocks; in a DAG, each transaction is recorded individually as its own unit.
  4. In a blockchain, the latest block only needs to be appended to the longest chain; in a DAG, a new transaction needs to be attached to all of the preceding "chains".
  5. A blockchain requires miners; a DAG does not (transactions do not need to be packaged and verified by miners).
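The contrast above can be illustrated with a toy transaction DAG in which each new transaction references one or more earlier transactions. This is only a sketch under stated assumptions: the `DagLedger` class, its method names, the "approves" wording, and the transaction ids are hypothetical and do not reproduce IOTA, Byteball, or NANO.

```python
class DagLedger:
    """Toy transaction DAG: every new transaction approves earlier ones."""

    def __init__(self):
        self.parents = {"genesis": []}  # tx id -> list of approved tx ids
        self.approved = set()           # tx ids that have at least one child

    def add_tx(self, tx_id, parent_ids):
        # Parents must already exist, so edges always point to older
        # transactions and a cycle can never be formed.
        if tx_id in self.parents:
            raise ValueError("duplicate transaction")
        if not parent_ids or any(p not in self.parents for p in parent_ids):
            raise ValueError("unknown or missing parent")
        self.parents[tx_id] = list(parent_ids)
        self.approved.update(parent_ids)

    def tips(self):
        # Tips are transactions that nobody has approved yet.
        return sorted(t for t in self.parents if t not in self.approved)


ledger = DagLedger()
ledger.add_tx("tx1", ["genesis"])
ledger.add_tx("tx2", ["genesis"])     # written concurrently with tx1
ledger.add_tx("tx3", ["tx1", "tx2"])  # attaches to both branches
```

Unlike a chain, `tx1` and `tx2` can be added side by side, and `tx3` joins both branches, which is the concurrency property the comparison describes.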

Advantages:

  1. Faster transactions: a DAG can process transactions on multiple threads, whereas a blockchain can only process them on a single thread, and the more participants transact on a DAG, the faster it becomes.
  2. Lower cost: a DAG delegates transaction confirmation to the transactions themselves, so no fee is required, which helps increase transaction volume.
  3. Resource savings: there is no miner role in a DAG, so no resources are consumed on mining.

Disadvantages:

  1. Confirmation time is hard to determine: a DAG communicates asynchronously, and the biggest problem with asynchronous communication is that consistency is hard to control, so confirmation can take longer.
  2. Double spending is hard to prevent: because a DAG is a multi-chain structure, double spends are prone to occur, and this problem has not yet been solved well.
  3. Transaction redundancy: multiple chains can easily end up processing the same transaction, which sharply increases the load on the system.

DAG projects:

Figure: DAG-related projects

As representative projects, IOTA, Byteball, and NANO have each proposed their own innovations. For example, the Tangle proposed by IOTA is a distributed ledger structure based on a DAG. These projects are interesting in their own right; we leave them as an open topic here and will return to them when time permits.

Summary

In general, thanks to its speed and high throughput, the DAG is expected to bring breakthroughs at the blockchain data layer. However, as a newer data structure for ledgers, its security and consistency problems still need to be solved, and large-scale testing will take time. More and more DAG-based projects and DApps are emerging, and the DAG deserves continued attention.

Segregated Witness or Block Size Increase

The block time of BTC is about 10 minutes (600 seconds), each block is limited to 1 MB, and a transaction, including its transaction state and witness data, is about 250 B. The upper limit on BTC transactions per second is therefore approximately 1,000,000 / 250 / 600 ≈ 7 tps (transactions per second). What does 7 tps mean? It is dwarfed by the throughput of payment systems such as UnionPay and Alipay, which is orders of magnitude higher. Looking at this simple formula again, the throughput of BTC is determined by three factors: the block size in the numerator, and the size of a single transaction and the block interval in the denominator.

Throughput ≈ block size / (transaction data size * block interval)
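The formula can be checked with a few lines of Python. This is a back-of-the-envelope sketch: the function name `max_tps` is an assumption, the 250 B transaction size is the rough average used in this article, and the 4 MB call is a hypothetical larger block under the same assumptions.

```python
def max_tps(block_size_bytes, tx_size_bytes, block_interval_s):
    """Upper bound on transactions per second for a single chain."""
    txs_per_block = block_size_bytes / tx_size_bytes
    return txs_per_block / block_interval_s


# 1 MB blocks, 250 B transactions, 10-minute interval (values from the text).
btc_tps = max_tps(1_000_000, 250, 600)

# Hypothetical 4 MB block with the same transaction size and interval.
four_mb_tps = max_tps(4_000_000, 250, 600)
```

Here `btc_tps` comes out just under 7 and `four_mb_tps` just under 27, which shows how directly each of the three factors moves the result.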

Block interval

In the Bitcoin network, Satoshi Nakamoto uses a difficulty parameter that is adjusted every 2016 blocks to regulate the rate at which new blocks are generated, keeping the average block time at approximately 10 minutes. There is a reason for this. The block interval consists of verification time, propagation time, and consensus time (for BTC, the PoW time), and blindly shortening it increases the fork rate. The 10-minute block interval chosen by BTC is in fact a compromise between network efficiency and network security. For BTC, shortening the block time therefore offers limited improvement and is not workable for now.
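The periodic adjustment can be sketched as follows. This is a simplified model expressed in terms of difficulty rather than the compact target Bitcoin actually stores; the function and constant names are assumptions, and the factor-of-4 clamp per adjustment reflects Bitcoin's rule as commonly documented.

```python
RETARGET_BLOCKS = 2016
TARGET_SPACING_S = 600
EXPECTED_TIMESPAN_S = RETARGET_BLOCKS * TARGET_SPACING_S  # about two weeks


def retarget(old_difficulty, actual_timespan_s):
    """Scale difficulty so the next 2016 blocks take about two weeks."""
    # Clamp so one adjustment changes difficulty by at most a factor of 4.
    clamped = min(
        max(actual_timespan_s, EXPECTED_TIMESPAN_S // 4),
        EXPECTED_TIMESPAN_S * 4,
    )
    # Blocks arrived too fast -> timespan is short -> difficulty rises.
    return old_difficulty * EXPECTED_TIMESPAN_S / clamped


faster = retarget(100.0, EXPECTED_TIMESPAN_S // 2)   # blocks came twice as fast
slower = retarget(100.0, EXPECTED_TIMESPAN_S * 10)   # extreme slowdown, clamped
```

If the last 2016 blocks arrived twice as fast as intended, the difficulty doubles; an extreme slowdown is capped at a fourfold reduction.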

Single transaction size

A Bitcoin transaction mainly consists of two parts: the transaction state (the UTXO-based transaction details) and the witness data (used to verify the legitimacy of the transaction, mainly the signature). The two parts are about the same size. In practice, not all participants need both parts. In other words, if the witness data is separated from the transaction data, more transactions can be packed into a block even if the block size is unchanged, which increases Bitcoin's throughput. This technique is called Segregated Witness.

Segregated Witness (SegWit for short) was first proposed by Pieter Wuille (Bitcoin Core developer and Blockstream co-founder) in December 2015. "Witness" refers to the data that verifies the legitimacy of a transaction in Bitcoin. "Segregated" means that the witness data is extracted from the transaction information and stored separately.

Roughly, SegWit changes the underlying data as follows:

1) SegWit defines a new data structure for the block and adds a "witness" field.

2) The witness part of the original block data (mainly the script and signature data, scriptSig) is moved into the witness data structure, and this structure is committed to the block independently of the transaction data.

3) The data in the witness structure is used only to verify the validity of a transaction and has nothing to do with the transaction details themselves (it does not participate in the calculation of the transaction Merkle tree root).

4) SegWit introduces a new unit of measurement, the WU (weight unit). The maximum weight of each block is 4 million WU. Data in the original 1 MB area counts as 4 WU per byte, while data in the witness structure counts as 1 WU per byte.

5) Although Segregated Witness does not directly increase the original block size, extracting the witness data and moving signatures and related information to the end of the block raises the effective block capacity to about 1.8 MB.
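The weight rule in items 4) and 5) can be worked through numerically. The sketch below assumes a block filled with identical transactions; the function names and the byte splits between base and witness data are hypothetical, chosen only to show how the effective capacity depends on the witness share.

```python
MAX_BLOCK_WEIGHT = 4_000_000  # weight units (WU) per block


def tx_weight(base_bytes, witness_bytes):
    # Non-witness bytes cost 4 WU each, witness bytes cost 1 WU each.
    return base_bytes * 4 + witness_bytes


def max_block_bytes(base_bytes, witness_bytes):
    """Total bytes of identical transactions that fit in one block."""
    tx_count = MAX_BLOCK_WEIGHT // tx_weight(base_bytes, witness_bytes)
    return tx_count * (base_bytes + witness_bytes)


legacy = max_block_bytes(250, 0)          # no witness data at all
half_witness = max_block_bytes(125, 125)  # witness is half of each tx
mostly_witness = max_block_bytes(100, 150)  # witness is 60% of each tx
```

Under these assumptions a block without witness data still holds 1 MB, a 50% witness share gives 1.6 MB, and a witness share of about 60% gives roughly 1.8 MB, so the effective capacity depends on the transaction mix rather than being a fixed number.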

Beyond capacity, the main goals of Segregated Witness are to:

  1. Solve Bitcoin's transaction malleability problem. (Ref: Transaction Malleability Explained)
  2. Pave the way for the Lightning Network. (Ref: BTC Lightning Network)

Although Segregated Witness gives Bitcoin a "soft" capacity increase, the improvement from reducing the effective size of a single transaction is limited when Segregated Witness is used alone, and performance remains several orders of magnitude short of the ideal.

Block capacity

This approach simply enlarges the block. Block size adjustment has been debated endlessly, touching even on beliefs and underlying philosophy, and it led to the three-way split among BTC, BCH, and BSV. For the detailed history, search for the Bitcoin block size debate; it is not repeated here.

Put simply, this kind of scaling raises Bitcoin's block size limit from 1 MB to 2 MB, 4 MB, or even the TB level. The core points of contention in the debate can be summarized as follows:

  1. The main conflict comes from within the Bitcoin developer community. Differences in underlying philosophy and beliefs about Bitcoin are somewhat hard to pin down, but they are the root cause.
  2. Enlarging the block requires modifying the Bitcoin code. The direct consequence is a hard fork, which is how fork chains such as BCH and BSV later emerged. A hard fork, of course, cuts both ways.
  3. As blocks grow, full nodes, which must download the entire ledger, face ever higher hardware, network, and maintenance costs. This increasingly favors large mining farms with concentrated resources and pushes many nodes out, and fewer nodes harm the security and decentralization of the whole network.
  4. Larger blocks demand more network bandwidth from mining pools and increase network delay. For miners in China, who ran 4/5 of the large mining pools but had insufficient network bandwidth (and for all miners without a network advantage), this was unacceptable. Going further, unfair competition clearly runs contrary to the vision of the blockchain.

Let's take a look at Bitcoin and its two forks:

  • BTC, "electronic gold", follows the soft scaling path of Segregated Witness plus the Lightning Network, and its block size remains 1 MB.
  • BCH, "electronic cash", forked from BTC on August 1, 2017. It increased the block size from 1 MB to 8 MB (and eventually to 32 MB) and reduced transaction fees, claiming to be the "peer-to-peer electronic cash system" described by Satoshi Nakamoto.
  • BSV, "global ledger". On November 15, 2018, the BCH community split over the future direction of BCH, and a hard fork created nChain's BCHSV (later named BSV) with a block size of 128 MB. On July 24, 2019, Bitcoin SV upgraded its protocol and raised the block limit to 2 GB.

Let's take a look at the current status of these three chains. As the following figure shows, BTC is still the undisputed leader, while BCH and BSV have also made it into the top 5 and top 11 cryptocurrencies by market value, respectively.

  • BTC is far ahead of BCH and BSV in the number of online full nodes. This reflects the support of miners and the distribution of the network's total computing power, and it is also the result of miners voting with their feet after the hard forks.
  • From the perspective of network security (that is, resistance to 51% attacks), the computing power of BCH stays at about 2.13 EHash/s, roughly 1/40 of BTC, and that of BSV at about 903 PHash/s, roughly 1/80 of BTC.
  • Judging from the chart, the gap in the number of on-chain transactions does not look that large. However, another data report shows that most BCH and BSV transactions are currently data-recording outputs that begin with the OP_RETURN opcode, rather than real transfers.

If the locking script of a transaction output starts with the OP_RETURN opcode, the output is called an OP_RETURN output, or Null Data output. It is written to the ledger along with the transaction, but it is not treated as a UTXO and therefore does not bloat the UTXO set, so its amount is usually 0. BCH and BSV limit its data size to 220 B.

  • The transaction fees of BCH and BSV are much lower than those of BTC. This results from large blocks, which let miners include low-fee transactions, and from the design strategies of BCH and BSV.

Figure: Current status of BTC, BCH and BSV
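The OP_RETURN outputs described above can be sketched at the byte level. The opcode values (0x6a for OP_RETURN, 0x4c for OP_PUSHDATA1) are Bitcoin Script constants; the helper names are hypothetical, and the 220 B cap is simply the limit quoted in this article for BCH and BSV.

```python
OP_RETURN = 0x6A
OP_PUSHDATA1 = 0x4C
MAX_DATA_BYTES = 220  # limit quoted in the text for BCH and BSV


def op_return_script(data: bytes) -> bytes:
    """Build a locking script that carries `data` and can never be spent."""
    if len(data) > MAX_DATA_BYTES:
        raise ValueError("payload exceeds the data limit")
    if len(data) <= 75:
        push = bytes([len(data)])                # direct push of 0..75 bytes
    else:
        push = bytes([OP_PUSHDATA1, len(data)])  # one-byte length prefix
    return bytes([OP_RETURN]) + push + data


def is_op_return(script: bytes) -> bool:
    # Such outputs are provably unspendable, so nodes can skip them
    # when maintaining the UTXO set.
    return len(script) > 0 and script[0] == OP_RETURN


script = op_return_script(b"hello")
```

Because the script begins with OP_RETURN, a node can recognize the output as data-only from its first byte, which is why it does not need to enter the UTXO set.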

The scaling dispute among BTC, BCH, and BSV is, in simple terms, mainly a dispute over block size. Let's compare the actual block size of each chain with its current block size limit:

  • We simply use the formula block usage = actual block size / block size limit.
  • BTC's block usage stays at about 80%, which is almost fully loaded. Once transaction demand increases, the network can be expected to become congested. Note that the values in the figure reflect the situation after Segregated Witness and the Lightning Network went live.
  • The block usage of BCH and BSV is usually below 1%, especially after BSV raised its limit to 2 GB. Note that the transaction mix of BCH and BSV is completely different from that of BTC. As mentioned above, most BCH and BSV transactions are OP_RETURN transactions, whose data size is at most 220 B, so the comparison with BTC does not vary a single factor only.

Figure: Actual block usage of BTC, BCH, and BSV in 2019

In theory, from a design perspective, there is an upper limit on Bitcoin's block size. The paper "On Scaling Decentralized Blockchains" states:

[Throughput limit.] The block size should not exceed 4MB, given today's 10 min. average block interval (or a reduction in block-interval time). A 4MB block size corresponds to a maximum throughput of at most 27 transactions/sec.

That is, based on the analysis in the paper, with the current 10-minute block interval the block size should not exceed 4 MB, and the corresponding throughput is at most 27 transactions per second.

Summary

Above, we analyzed the different scaling strategies of BTC and of BCH and BSV. BTC adopts a "soft" scaling scheme based on Segregated Witness plus the Lightning Network, leaving the 1 MB block size unchanged, while BCH and BSV adopt a "hard" scaling strategy that enlarges blocks and adds opcodes. We also looked at the current status of the three chains and their actual block usage.

Personally, I believe BTC's current on-chain scaling scheme can improve throughput and reduce transaction costs, but its bottleneck is also obvious, and it will struggle to meet future demand for high transaction volumes. The large-block strategy of BCH, and especially of BSV, does improve transaction throughput and reduce transaction costs, but the centralization and security problems it causes remain a huge challenge. In particular, BSV's move to raise the block size from 128 MB to 2 GB before these two problems were solved seems somewhat blind and entirely unnecessary. The topic thus comes back to the blockchain "impossible triangle": finding a delicate balance among scalability, security, and decentralization is a matter of both philosophy and belief.

Network layer

Sharding

The concept of sharding originates from the database field. There, sharding refers to horizontally partitioning the data in a database (placing different rows of a table in different partitions), with each shard stored on a separate database server instance to spread the load. The basic idea of blockchain sharding is to divide the nodes of the blockchain network into several relatively independent shards. A single shard processes a smaller set of transactions and may even store only part of the network state. Because multiple shards process transactions in parallel, the throughput of the whole network should, in theory, increase.
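As a rough illustration of horizontal partitioning, the sketch below assigns each transaction to a shard by hashing the sender's account. The hash-modulo rule, the function names, and the sample accounts are assumptions made for the example; real sharding protocols use more elaborate assignment rules.

```python
import hashlib


def shard_of(account: str, num_shards: int) -> int:
    """Map an account name to a shard index in a deterministic way."""
    digest = hashlib.sha256(account.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(transactions, num_shards):
    """Group (sender, receiver, amount) tuples by the sender's shard."""
    shards = {i: [] for i in range(num_shards)}
    for tx in transactions:
        shards[shard_of(tx[0], num_shards)].append(tx)
    return shards


txs = [("alice", "bob", 5), ("carol", "dave", 2), ("alice", "erin", 1)]
by_shard = partition(txs, 4)
```

Every transaction from the same sender lands in the same shard, so each shard can process its own list independently of the others; a transfer whose receiver lives in a different shard is what later sections call a cross-shard transaction.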

A blog post on NEAR's official website introduces the ideas behind the mainstream sharding protocols and is well worth reading. Drawing on that post, we introduce the classification of sharding techniques and the challenges sharding faces, to build an understanding of where blockchains are heading.

Challenges facing sharding:

Network security

The most direct network security challenge is that sharding lowers the cost of an attack. In network sharding, the nodes are assigned to different shards according to predefined rules in order to scale capacity. This creates a problem: the computing power and the number of validator nodes within a single shard are much smaller than in the original network as a whole, so attacking a single shard costs far less than attacking the entire network. Typical attacks are:

  • 51% attacks on PoW consensus networks
  • Sybil attacks on non-PoW consensus networks

The attack cost before and after sharding compares as follows:

Figure: Attacking a non-sharded network versus attacking one shard of a sharded network

To address network security, current sharding designs focus mainly on the choice of consensus algorithm, the shard size, and random assignment of nodes to shards, in order to reduce the probability that a single shard is successfully attacked and to raise the cost of an attack.
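Two small calculations make the trade-off concrete. Both are idealized sketches with hypothetical function names: the first assumes computing power is split evenly across shards, and the second assumes validators are sampled independently at random (a binomial model), which is a simplification of real assignment schemes.

```python
from math import comb


def share_to_attack_one_shard(num_shards, threshold=0.51):
    """Fraction of total network power needed to control a single shard,
    assuming power is split evenly across shards."""
    return threshold / num_shards


def prob_shard_compromised(shard_size, adversary_share, threshold=0.5):
    """Probability that randomly sampled validators give the adversary
    more than `threshold` of one shard (binomial approximation)."""
    need = int(shard_size * threshold) + 1
    return sum(
        comb(shard_size, k)
        * adversary_share ** k
        * (1 - adversary_share) ** (shard_size - k)
        for k in range(need, shard_size + 1)
    )


small = prob_shard_compromised(10, 0.3)    # 10 validators per shard
large = prob_shard_compromised(100, 0.3)   # 100 validators per shard
```

With 10 shards, roughly 5% of the total power is enough to dominate one shard in the first model. In the second, an adversary holding 30% of all validators has a far better chance of capturing a 10-validator shard than a 100-validator shard, which is why shard size and random assignment matter.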

Data validity

Data validity is mainly about how to identify invalid blocks.

A typical scenario is shown in the following figure. Malicious nodes collude to produce an invalid block B on shard 1. Block B contains an invalid transaction that gives Alice 1 token. A valid block C is then built on top of block B to obscure the invalid block, and a cross-shard transaction initiated in C transfers this token to Bob. At this point, the token created by the invalid transaction resides in the otherwise correct shard 2, inside the valid block Y.

Figure: Example of a data validity problem in a cross-shard transaction

One possible solution is to arrange the shards in an undirected graph in which each shard is connected to several other shards and only neighboring shards may transact directly. Cross-shard transactions between non-adjacent shards must be routed through multiple shards, and the validators of each shard must validate all transactions of both their own shard and the adjacent shards. Kadena's Chainweb adopts this approach.

Figure: Sharding structure using an undirected graph
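Routing a cross-shard transaction through neighboring shards amounts to finding a path in the undirected shard graph. The breadth-first search below is a generic sketch; the `route` function and the three-shard layout are hypothetical and do not describe Chainweb's actual topology.

```python
from collections import deque


def route(neighbors, src, dst):
    """Shortest hop path between two shards, or None if unreachable."""
    previous = {src: None}
    queue = deque([src])
    while queue:
        shard = queue.popleft()
        if shard == dst:
            # Walk the predecessor links back to the source.
            path = []
            while shard is not None:
                path.append(shard)
                shard = previous[shard]
            return path[::-1]
        for nxt in neighbors[shard]:
            if nxt not in previous:
                previous[nxt] = shard
                queue.append(nxt)
    return None


# Hypothetical layout: shard 2 borders 1 and 3, but 1 and 3 are not adjacent.
neighbors = {1: [2], 2: [1, 3], 3: [2]}
```

In this layout a transfer from shard 1 to shard 3 has to pass through shard 2, so the validators of shard 2 sit on the path of every such transaction.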

Interestingly, although forcing validators to also validate adjacent shards mitigates the risk of a single shard being compromised, the problem still arises when several shards are compromised and collude. Consider the following figure: shard 2 is adjacent to shards 1 and 3, but shards 1 and 3 are not adjacent. Shards 1 and 2 are controlled by malicious nodes, and shard 3 is honest. If shards 1 and 2 collude, an invalid transaction produced on shard 1 can be confirmed and included on shard 2 through a cross-shard transaction. Note that block Y, produced on shard 2 at this point, is itself completely valid and passes validation by the validators of shard 3, yet shard 3 cannot validate the invalid transaction that occurred on shard 1.

Figure: A scenario in which the undirected-graph approach fails to ensure data validity

Two main directions are currently pursued to properly solve data validity: fishermen and cryptographic proofs of computation.



Fisherman

The idea of the fisherman approach is to set a challenge period whenever block header information is passed between shards. During this period, any honest node can submit a proof that the block is invalid. As long as the shard contains at least one honest validator, this method keeps the system secure, as shown in the following figure:

Figure: Schematic of the fisherman mechanism
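The challenge period can be modeled as a small state machine: a cross-shard header stays pending until the window closes, and a valid invalidity proof submitted inside the window rejects it. Everything here is a hypothetical sketch, including the `PendingHeader` class, the 10-block window, and the boolean stand-in for proof verification; no specific protocol is being reproduced.

```python
from dataclasses import dataclass

CHALLENGE_PERIOD = 10  # hypothetical window length, in blocks


@dataclass
class PendingHeader:
    block_hash: str
    received_at: int          # height at which the header arrived
    challenged: bool = False

    def challenge(self, now: int, proof_is_valid: bool) -> bool:
        """Record a challenge; only valid proofs inside the window count."""
        in_window = now < self.received_at + CHALLENGE_PERIOD
        if in_window and proof_is_valid:
            self.challenged = True
        return self.challenged

    def status(self, now: int) -> str:
        if self.challenged:
            return "rejected"
        if now >= self.received_at + CHALLENGE_PERIOD:
            return "final"
        return "pending"


header = PendingHeader("B", received_at=100)
```

A single honest node calling `challenge` with a valid proof before height 110 is enough to reject the header, which mirrors the one-honest-validator assumption; a proof that arrives after the window has no effect.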

Among the protocols proposed so far, this is the dominant method, but it has two major disadvantages:

  • The challenge period must be long enough for honest nodes to download a potentially invalid block, verify it, and raise a challenge, which reduces network efficiency.
  • Challenges themselves give attackers a new attack vector. An attacker can flood the network with invalid challenges, delaying valid challenges from honest nodes so that invalid transactions survive the challenge period.

The fisherman mechanism is also used in other scaling projects (such as Polkadot). Although it improves on directly identifying invalid blocks, I have not yet seen a good answer to the two problems above, and further developments are worth following.

Succinct Non-interactive Arguments of Knowledge (SNARKs)

The second approach uses a cryptographic construction to prove that a computation was performed correctly. For background on SNARKs and zk-SNARKs, see the introduction published by Zcash, "What are zk-SNARKs?".

The main challenges of zk-SNARKs are:

  • Performance: creating a proof takes time. For example, in the Coda protocol each transaction takes about 30 s to prove.
  • The technology is still in its infancy and has not stood the test of time.
  • "Toxic waste": SNARKs rely on a trusted setup, in which a group of participants perform some computations and then honestly discard the intermediate values. If all participants collude and keep those intermediate values, fraudulent proofs can be created.
  • Additional complexity is introduced into the system design.
  • Protocols with a Turing-complete smart contract language will not be able to use SNARKs to prove the validity of the chain.


Data availability

Data availability simply means whether the data recorded in a block can actually be obtained.

The nodes operating a given blockchain are generally divided into full nodes (which download every complete block and verify every transaction) and light nodes (which download only block headers and use Merkle proofs for the parts of the state and the transactions they are interested in). In a sharding scenario, the validators of each shard act as the full nodes of that shard, while the other participants in the system (including the beacon chain) act as light nodes.
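How a light node checks one transaction against a block header can be sketched with a small Merkle tree. This is a simplified illustration: it uses single SHA-256 and duplicates the last hash on odd levels, the function names are assumptions, and real chains differ in hashing and serialization details.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last hash on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_proof(leaves, index):
    """Sibling hashes from leaf to root, each with a sibling-is-left flag."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Light-node check: recompute the root from one leaf and its proof."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


leaves = [b"tx0", b"tx1", b"tx2"]
root = merkle_root(leaves)           # stored in the block header
proof_for_tx2 = merkle_proof(leaves, 2)
```

The light node only needs `root` from the header plus a proof whose length grows with the logarithm of the number of transactions. Note that the proof shows inclusion only; it says nothing about whether the rest of the block is available.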

Figure: Full nodes colluding to undermine data availability

Now, if a majority of the full nodes collude, they can produce a block, valid or invalid, and provide its hash to light nodes in order to profit (for example, by showing transfer information to a merchant and tricking the merchant into providing a service). The full nodes then stop distributing the content of this block. Even if the validators of the next block are honest, they cannot stop the colluding nodes from maliciously withholding or deleting the historical data and making the block unavailable. In a sharding scenario, where each shard has fewer full nodes, the cost of such an attack is even lower (see Network security).

Auxiliary methods for addressing this problem include Proof of Custody and erasure codes.

Figure: Schematic of Proof of Custody

The main idea of Proof of Custody is to have notaries rotate between shards more frequently than validators. The only thing a notary needs to do is attest that a block can be downloaded, that is, that its data is available. Because notaries do not need to download the entire state of a shard, they can rotate between shards and verify availability faster and more often.

Figure: Schematic of erasure coding

Another idea is to use erasure coding, so that the entire block can still be fully recovered even if part of it is unavailable. Polkadot and Ethereum Serenity are both designing around erasure codes to ensure that their light nodes can confirm that block data is available.
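The recovery property can be demonstrated with the simplest possible erasure code, a single XOR parity chunk. This is only a toy: it repairs exactly one lost chunk, the function names are hypothetical, and the projects mentioned above would need stronger codes than this one.

```python
def encode(chunks):
    """Append one XOR parity chunk to equally sized data chunks."""
    parity = bytes(len(chunks[0]))
    for chunk in chunks:
        parity = bytes(a ^ b for a, b in zip(parity, chunk))
    return list(chunks) + [parity]


def recover(coded):
    """Rebuild a single missing chunk (marked None) from the others."""
    missing = [i for i, c in enumerate(coded) if c is None]
    if len(missing) > 1:
        raise ValueError("single parity can only repair one lost chunk")
    if not missing:
        return list(coded)
    size = len(next(c for c in coded if c is not None))
    rebuilt = bytes(size)
    # XOR of all surviving chunks equals the missing one.
    for c in coded:
        if c is not None:
            rebuilt = bytes(a ^ b for a, b in zip(rebuilt, c))
    repaired = list(coded)
    repaired[missing[0]] = rebuilt
    return repaired


coded = encode([b"abcd", b"efgh", b"ijkl"])
damaged = list(coded)
damaged[1] = None            # one chunk of the block is withheld
restored = recover(damaged)
```

Even though one chunk was withheld, the block is rebuilt in full from the remaining chunks, which is the property that makes withholding part of a block ineffective.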

Summary

In this article, we briefly introduced part of the on-chain scaling landscape, covering the mainstream ideas and techniques of the data layer and the network layer. In the next article, we will continue the analysis of on-chain scaling, mainly from the consensus layer, and supplement this installment with further topics, including zero-knowledge proofs.


References

The Authoritative Guide to Blockchain Sharding, Part 1 - NEAR Protocol

Unsolved Problems in Blockchain Sharding - NEAR Protocol

In-depth introduction to the sharding of the expansion solution on the blockchain

Vernacular DAG: Comprehensive inventory of 3rd generation blockchain technology DAG

Next Generation Blockchain DAG Technology Lab

Report | The End of the Bitcoin Scaling