Let us show you in non-technical language how zero-knowledge proofs can change the blockchain

Written by: Ronald Mannak, co-founder and CEO of blockchain startup Starling Protocol

Compilation: Lu Jiangfei

Source: Chain News

There are many technical blog posts about zero-knowledge proofs (ZKP), and I recently wrote an article comparing various new general purpose zk-SNARKs. I find that there are very few articles in non-technical language about use cases for zero-knowledge proofs. In fact, zero-knowledge proofs can be used not only for privacy, but also for many other purposes. It is so feature-rich that it may even redefine the way blockchain works.

Streamline the blockchain, compress from GB to KB

Blockchain blocks can be large and their size is constantly increasing. This stems from its original design. We have gradually accepted this reality. However, the testnet recently released by the Coda project is different.

First, Coda's blockchain is fixed in size and will not grow; second, it is only 22KB! Even an 8-bit home computer, the Commodore 64 or ZX Spectrum, from the 1980s can still be plugged in. Moreover, compared to traditional blockchains, Coda's security is similar, or even higher.

There are more and more similar "thin blockchain" projects with more functions, such as Mir and Starling (I am also involved in the Starling project).

How does this work?

As long as you try to deploy a blockchain node, you will know how painful this process is: synchronizing a node can take hours or even days. The blockchain is so large that the disk space and bandwidth of most home computing devices fail to meet the basic requirements. The result is centralization. Even a popular blockchain like Ethereum has 10,000 nodes, most of which are hosted on Amazon AWS and owned by a few entities. Blockchain is not as decentralized as many people think.

Why does it take so long to synchronize a blockchain? There are two main reasons:

The first reason is obvious: downloading hundreds of gigabytes or more of data takes a long time;

Second, the blockchain must complete verification after downloading, because malicious nodes may send you incorrect data.

To verify a blockchain, the entire chain must be replayed from the genesis block: the first transaction is executed and the calculated state is equal to the state obtained by the download. Then go to the next transaction until you check all the transactions on the blockchain. This is not only time consuming but also a waste of resources. Before you, thousands of nodes were doing exactly the same calculations.

Why do you do that? Because in traditional computing, the only way to know whether a calculation is performed correctly is to redo the calculation once. If it's small-scale calculations, that's fine, but the "slow calculation" like replaying a blockchain is completely different.

Zero-knowledge proof that improves efficiency and bandwidth

In fact, there is a way to verify a calculation result at a low cost without redoing the calculation. That is Zero Knowledge Proof (ZKP), the most famous of which is probably zk-SNARK.

How does it work? We need to rewrite the blockchain's replay function as a zk-SNARK. This zk-SNARK will output two things: the original output (just like the original replay function); a small mathematical "proof" that proves that the result is calculated correctly. This "proof" can be as small as 200 bytes (yes, less than 1KB).

This way, we don't need all (or multiple) computers to run the replay function again. One computer creates this "certificate," and an unlimited number of other computers can verify it at a time they see fit. No matter how long the original calculation takes (even if it doesn't matter hours, days, or even years), verification takes only a few milliseconds. This "certificate" can be distributed online, stored on a USB flash drive, or even printed on a T-shirt.

If a malicious node changes the balance of a transaction, then this "proof" will be different from the result, and all validators will reject this state. If a malicious node changes the zk-SNARK code, the result will also be rejected. (There is a "third parameter" and a publicly shared string that will tie this "proof" to the zk-SNARK code. If the code is modified, then this "proof" and the shared string will not match, so The verifier will reject the result.)

We no longer need to redo expensive calculations or download the blockchain (because we already have a valid mathematical proof that the blockchain exists). All you need is the current state (such as the last block), plus a small amount of "proof" that can prove that the current state is part of a valid blockchain, and spend a few milliseconds to verify the result.

Recursive combination

Validating a "proof" is fast, but what about creating it? In fact, time is not fixed. Compared with traditional computing, it is much less efficient in terms of computing and memory. In fact, although the zk-SNARK version of a replay function sounds good, this solution is not good in practice. Compared with the previous non-zk-SNARK replay function, it requires more memory and is even slower.

However, there is another elegant solution. We found that with a little trick, you can actually use recursive zk-SNARKs. With recursion, we don't have to verify this blockchain from scratch, we can build on the previous state. Much faster.

It should be noted that the efficiency of recursive zk-SNARK is still not as good as that of non-recursive zk-SNARK. However, the construction of zk-SNARK has made great progress recently.

A recursive zk-SNARK program takes as input "proofs" and new transactions that belong to the "previous state". It will verify the previous state (with the provided proof) and check if the transaction in the new state is valid. If it is OK, it will output the new status and a "proof".

Once the new status and "proof" is distributed to the network, all nodes can directly discard the previous status without doing so without any negative impact. New nodes only need to download the latest status and "proof". This is the secret of projects such as Coda, Mir, and Starling that can achieve small, fixed-size blocks.

In the above example, only one node is needed to create a new block and "proof". Obviously, we don't need to have the same node to generate all the blocks. For example, a node can be randomly selected from many nodes (using a "verifiable random function", many nodes can even choose themselves randomly without deceptive behavior). We can even do better: divide the block generation logic into multiple zk-SNARKs.

The end result is that the block producer does not need a complete blockchain, it only needs the previous state. How much will this reduce the block size? A regular Coda node only needs 22 KB to store the "proof", the current state, and the Merkle path of a certain account balance. With only 22KB, a node can verify the entire blockchain, query balances and create transactions. However, if a block is to be generated, this node needs more: it requires a Merkle tree with the full balance of the previous state. The size of the Merkle tree depends on the number of wallets. If Coda has as many wallets as Ethereum, then Coda block producers need only about 1 GB of capacity. The smallest full node on Ethereum is 230 GB (data for December 2019). The gap is huge.

With zero-knowledge proof, the blockchain network will have more active nodes, which will increase the degree of decentralization and make various programs more likely to interact with the blockchain without the need for solutions like Infura or Metamask . Think about it, 99% of new users chose to give up when installing Metamask. So this change will have a huge impact.

Thanks to everyone for reviewing this article: Daniel Lubarov (Mir), Shane Vitarana, Stan van de Burgt, Taariq Lewis, and Dmitriy Berenzon. The author authorizes Lianwen to translate and publish the Chinese version of this article.