Babbitt Live | Guo Yu: Understand zero-knowledge proofs in 3 minutes, and why they are a double-edged sword

On December 22, the 2019 Digital Assets and Blockchain Annual Conference and Inauguration Conference of the China Investment Association Digital Asset Research Center was held in Beijing. Guo Yu, founder of Abe Laboratories and a member of the academic and technical committee of the Digital Asset Research Institute, gave a talk titled "Zero-knowledge proof, a missing link in blockchain technology."

The following is a compilation of the content of Guo Yu's speech, organized and released by Babbitt.

What is the problem with blockchains? Search the Internet for articles criticizing blockchains and one recurring complaint is low throughput. Bitcoin processes only about 7 transactions per second and Ethereum no more than 30, while the international credit card network Visa handles at least 2,000. The core reason for this low throughput is network bandwidth: when more than 10,000 nodes around the world take part in distributed consensus, throughput inevitably drops. Simply producing blocks faster does not help much. Ethereum is already at the limit, and faster block production causes frequent forks without substantially raising throughput. Reducing the number of nodes would raise throughput, but then the security of the system could no longer be guaranteed. So how can throughput be improved without reducing security?

Counter-intuitive zero-knowledge proofs

Zero-knowledge proofs are one of the solutions. In 2018, Ethereum founder Vitalik ran an experiment showing that introducing zero-knowledge proofs could raise Ethereum's throughput by dozens of times, to about 500 TPS. More recently, the Loopring protocol adopted ZK Rollup, a scaling scheme built on zero-knowledge proofs. According to their test data, it can support decentralized applications at up to 10,500 TPS on Ethereum.

Picture 1

In 1985, Shafi Goldwasser and Silvio Micali of MIT, together with Charles Rackoff, proposed the concepts of "interactive proof system" and "zero-knowledge proof". Goldwasser and Micali later won the 2012 ACM Turing Award for this work.

Picture 2

Before explaining zero-knowledge proofs, we must first ask: what is a "proof"? The notion dates back to ancient Greece, where a proof stood for "insight". By the 1920s, proof meant formal logic: Whitehead and Russell spent hundreds of pages in Principia Mathematica to prove 1 + 1 = 2, and proof stood for "symbolic reasoning". By the 1970s, proofs were found to be essentially indistinguishable from "programs". By 1985, the concept of proof had been extended to the broader notion of an "interactive system", and a zero-knowledge proof system is one such interactive system.

Picture 3

However, zero-knowledge proofs are counter-intuitive. Why? As shown in the figure, Bob on the right passes an input X to the hardware device on the left. The device runs a program Y = F(X, W), where W is secret data, and returns the result Y. How can Bob be sure that Y was computed correctly? This is where zero-knowledge proofs come in. If the device attaches a zero-knowledge proof to its output, Bob can be confident that Y really is the result of computing F, even though the hardware is outside his control and may even contain a backdoor: the result of the computation has not been maliciously tampered with. To be precise, zero-knowledge proofs can guarantee the integrity of remote computation, and that is what makes them counter-intuitive.

Picture 4

What are zero-knowledge proofs good for? Let me list a few uses briefly. The chain keeps growing? With zero-knowledge proof technology, you only need to download one block. Transactions on the chain can be traced? Zero-knowledge proofs can protect the anonymity of users' transactions. Identity authentication can be performed while protecting the user's identity. Protecting on-chain data privacy, sharing data securely, and linking on-chain and off-chain data can all be accomplished with zero-knowledge proofs.

Understanding zero-knowledge proofs through the three-coloring problem

The principle behind zero-knowledge proofs is the focus of my talk today, and I hope everyone can grasp it quickly. Consider the problem of finding a three-coloring of a map. Think of each region of the map as a city and connect adjacent cities with lines. Adjacent cities must be given different colors, using only three colors in total. This is equivalent to requiring that the two endpoints of every edge in a graph have different colors. Finding a three-coloring is an NP-complete problem, and therefore also an NP-hard problem.
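The requirement on the coloring can be written down as a short check. The following is a minimal Python sketch; the function name and the example graph (a square A-B-C-D with one diagonal) are illustrative assumptions, not taken from the talk. Note that checking a given coloring is easy; it is finding one that is hard.

```python
def is_valid_three_coloring(edges, coloring):
    """Return True if only colors 0, 1, 2 are used and no edge joins two equal colors."""
    allowed = {0, 1, 2}
    if any(color not in allowed for color in coloring.values()):
        return False
    return all(coloring[u] != coloring[v] for u, v in edges)


# Hypothetical example graph: a square A-B-C-D plus the diagonal A-C.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("A", "C")]
GOOD = {"A": 0, "B": 1, "C": 2, "D": 1}
BAD = {"A": 0, "B": 1, "C": 0, "D": 1}  # A and C share a color across edge A-C

print(is_valid_three_coloring(EDGES, GOOD))
print(is_valid_three_coloring(EDGES, BAD))
```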

Picture 5

Suppose Alice has a three-coloring and wants to prove this to Bob without letting him learn the color of any vertex, which is what makes the proof "zero-knowledge". What should she do? In the first step, Alice randomly permutes the three colors (for example, every red vertex becomes blue); the permuted map is still a valid three-coloring. She then covers each vertex with a piece of paper and shows the covered graph to Bob.

Picture 6

In the second step, Bob cannot see the color of any vertex, but he picks an edge at random and asks Alice to uncover the two vertices at its ends.

Picture 7

In the third step, Alice removes the paper and Bob sees that the two endpoints of the edge have different colors. Can he now believe that Alice holds a correct three-coloring? Not necessarily: the edge he happened to pick may be fine while some other edge is not.

Picture 8

So they repeat the three steps: Alice permutes the colors again, and Bob again picks a random edge to inspect. Can Bob believe her this time? Still not necessarily. Alice may simply have fooled him twice in a row; that is possible.

Picture 9

But Bob can keep going for N rounds. The probability that a cheating Alice survives every round decreases exponentially with N, so as long as N is large enough, successful cheating is almost impossible.
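To make the exponential decrease concrete: if Alice's coloring is wrong on at least one of the E edges, each round catches her with probability at least 1/E, so she survives N rounds with probability at most (1 - 1/E)^N. The Python sketch below computes this bound and the number of rounds needed for a target error; the function names are illustrative assumptions.

```python
import math


def cheat_survival_bound(num_edges, rounds):
    """Upper bound on the chance that a coloring with a bad edge passes every round."""
    return (1 - 1 / num_edges) ** rounds


def rounds_needed(num_edges, target):
    """Smallest N such that (1 - 1/E)^N <= target, for 0 < target < 1."""
    if num_edges == 1:
        return 1  # the only edge is checked in the first round
    return math.ceil(math.log(target) / math.log(1 - 1 / num_edges))


n = rounds_needed(5, 1e-6)
print(n, cheat_survival_bound(5, n))
```

The number of rounds grows with the number of edges, which is one reason practical systems replace this edge-by-edge game with the algebraic techniques described later in the talk.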

Picture 10

Through this scheme, Alice proves to Bob, in a zero-knowledge way, that she really does have a three-coloring. This is the most basic idea behind zero-knowledge proofs.
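The three steps can be simulated in a few lines of Python. This is only a sketch under stated assumptions: the "pieces of paper" are modeled as SHA-256 hash commitments with random nonces (the talk does not specify a commitment scheme), both parties run in one process, and the graph and function names are hypothetical.

```python
import hashlib
import random
import secrets


def commit(color, nonce):
    """Model the piece of paper: a hash commitment to one vertex color."""
    return hashlib.sha256(nonce + bytes([color])).hexdigest()


def run_round(edges, coloring, rng):
    # Step 1: Alice permutes the colors and commits to every vertex.
    perm = [0, 1, 2]
    rng.shuffle(perm)
    shuffled = {v: perm[c] for v, c in coloring.items()}
    nonces = {v: secrets.token_bytes(16) for v in shuffled}
    commitments = {v: commit(shuffled[v], nonces[v]) for v in shuffled}
    # Step 2: Bob picks one edge at random.
    u, v = rng.choice(edges)
    # Step 3: Alice opens the two endpoints. Bob recomputes the commitments
    # (always a match in this single-process sketch) and compares the colors.
    for w in (u, v):
        if commit(shuffled[w], nonces[w]) != commitments[w]:
            return False
    return shuffled[u] != shuffled[v]


def verify(edges, coloring, rounds, seed=None):
    """Bob accepts only if every round passes."""
    rng = random.Random(seed)
    return all(run_round(edges, coloring, rng) for _ in range(rounds))


# Hypothetical example graph: a square A-B-C-D plus the diagonal A-C.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("A", "C")]
GOOD = {"A": 0, "B": 1, "C": 2, "D": 1}
BAD = {"A": 0, "B": 1, "C": 0, "D": 1}  # edge A-C is monochromatic

print(verify(EDGES, GOOD, rounds=200, seed=1))
print(verify(EDGES, BAD, rounds=200, seed=1))
```

Because the colors are freshly permuted in every round, the two colors Bob sees are just a random pair of distinct colors and tell him nothing about the underlying coloring.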

How zero-knowledge proofs work

After all this, you may feel we are still a long way from the "counter-intuitive" claim made earlier: how can we trust a remote computer? Let me explain concretely how zero-knowledge proof technology does it.

Picture 12

Suppose we have a super camera that records every detail of the remote computation, including the full state of the CPU, the memory, and the program. If I play the video back from the beginning, I can in theory confirm that nothing went wrong in the computation, because cheating at any step could be spotted. In practice this cannot be done: the video is enormous, and checking it is essentially impossible.

Picture 13

Next we make a simple transformation and re-express the computation as an "arithmetic circuit". What is an arithmetic circuit? In theory, most computations that terminate can be converted into a circuit built from "+" and "×" gates. Feed X in at one end of the circuit and Y comes out at the other. So we no longer need a video camera; a single snapshot is enough, and I can then check each gate separately to see whether its computation is correct.
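A minimal Python sketch of this idea follows. The circuit for Y = (X + W) × X, the wire names, and the function names are made-up examples, not from the talk. The "snapshot" is the dictionary of all wire values, and each gate is checked on its own.

```python
# Hypothetical circuit for Y = (X + W) * X, as (operation, left, right, output).
GATES = [
    ("+", "X", "W", "t"),
    ("*", "t", "X", "Y"),
]


def apply_gate(op, a, b):
    return a + b if op == "+" else a * b


def evaluate(gates, inputs):
    """Run the circuit and return the snapshot: the value on every wire."""
    wires = dict(inputs)
    for op, left, right, out in gates:
        wires[out] = apply_gate(op, wires[left], wires[right])
    return wires


def check_gates(gates, wires):
    """Check every gate independently against the snapshot of wire values."""
    return all(
        wires[out] == apply_gate(op, wires[left], wires[right])
        for op, left, right, out in gates
    )


snapshot = evaluate(GATES, {"X": 3, "W": 4})
tampered = dict(snapshot, Y=snapshot["Y"] + 1)  # change a single wire value
print(check_gates(GATES, snapshot), check_gates(GATES, tampered))
```

The check costs one comparison per gate, which is exactly the cost the polynomial encoding described next is meant to avoid.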

However, there is still a problem: verification takes too long, because we have to check every gate. Checking a hash computation, for example, means checking the inputs and outputs of more than 20,000 gates, which is obviously very cumbersome. One improvement is to encode all the gates with polynomials and compress the N checks into a single check.

Picture 14

As shown on the right, the yellow curve encodes the correct computation. If Alice wants to cheat and changes the value on even a single wire, the curve becomes very different, so the change is amplified. In theory, we can use the Schwartz-Zippel lemma and check a randomly chosen point on the X axis to see whether the curve has been altered. This is how algebra turns N checks into one.
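The single-point check can be sketched in Python as follows. Two different polynomials of degree at most d over a field with p elements agree on at most d points, so a random point exposes a change with probability at least 1 - d/p. The modulus, the coefficient lists, and the function names below are illustrative assumptions; real protocols encode the circuit in a more structured way.

```python
import random

P = 2**61 - 1  # prime field modulus (a Mersenne prime), chosen for illustration


def eval_poly(coeffs, x, p=P):
    """Horner evaluation mod p; coeffs[i] is the coefficient of x**i."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def probably_equal(f, g, rng, p=P):
    """Compare two polynomials at one random point instead of coefficient by coefficient."""
    r = rng.randrange(p)
    return eval_poly(f, r, p) == eval_poly(g, r, p)


honest = [5, 0, 3, 7]    # 5 + 3x^2 + 7x^3
tampered = [5, 0, 4, 7]  # one coefficient changed

rng = random.Random(0)
print(probably_equal(honest, list(honest), rng))
print(probably_equal(honest, tampered, rng))
```

Here the two polynomials differ by x^2, which vanishes only at x = 0, so the random check fails to notice the change with probability 1/P.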

But we are not done yet. The check needs to examine data, yet we do not want Bob to learn any intermediate values or results of the computation; that is, the proof must be zero-knowledge with respect to Bob. The idea here is to map the polynomial operations homomorphically into an elliptic curve group. Ordinary arithmetic works on integers such as 0, 1, 2, 3, …, and these can be mapped one-to-one onto points of the elliptic curve group.
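The homomorphic mapping can be illustrated with a toy Python sketch. As a stand-in for an elliptic curve group, it uses exponentiation modulo a prime, x ↦ G^x mod P, which has the same key property: adding hidden integers corresponds to combining their images. The modulus, the base G, and the function names are assumptions for illustration only, and this toy group is not a secure replacement for the curves used in practice.

```python
P = 2**127 - 1  # prime modulus (a Mersenne prime); toy stand-in for a curve group
G = 3           # assumed base element


def hide(x):
    """Map an integer into the group; recovering x from hide(x) is a discrete log."""
    return pow(G, x, P)


def eval_hidden(coeffs, hidden_powers):
    """Compute hide(f(s)) from hide(s**i) alone, without ever seeing s.

    coeffs[i] is the (non-negative) coefficient of s**i in f.
    """
    acc = 1
    for c, h in zip(coeffs, hidden_powers):
        acc = (acc * pow(h, c, P)) % P
    return acc


# Addition of integers becomes multiplication of their hidden images.
print(hide(2 + 3) == (hide(2) * hide(3)) % P)

# A polynomial can be evaluated "in the exponent" at a secret point s.
s = 12345  # secret evaluation point, known only to whoever created the hidden powers
coeffs = [5, 0, 3, 7]
hidden_powers = [hide(s**i) for i in range(len(coeffs))]
f_s = sum(c * s**i for i, c in enumerate(coeffs))
print(eval_hidden(coeffs, hidden_powers) == hide(f_s))
```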

Picture 15

It is hard to go backwards from a point on the elliptic curve to the integer it came from, because that is the "discrete logarithm problem". Next we need to remove the interaction and turn the protocol into a non-interactive zero-knowledge proof. Instead of Bob choosing a random point on the X axis himself, a third party generates a cryptographic challenge in advance and uses it to challenge the curve. The result is a succinct scheme: the third party generates the cryptographic challenge and produces two keys, Bob hands the computation to Alice, and Alice performs the computation and produces the zero-knowledge proof. This is the Pinocchio protocol, introduced in 2013.

Picture 16

Zero-knowledge proofs are inseparable from formal verification

Are zero-knowledge proofs safe? Cryptography professor Matthew Green commented: "Using zero-knowledge proof technology is like taking a shortcut through Moria, the huge underground city of Middle-earth. It is a lot faster than going over the mountains, but you may end up fighting a Balrog." So zero-knowledge proofs are a sharp double-edged sword: once you use them, an attack also becomes "zero-knowledge", which means no one knows that the system has been hacked.

Picture 17

On March 1, 2018, Ariel Gabizon of the Zcash team discovered a fatal error in Appendix B of the paper [BCTV14] that could allow unlimited counterfeiting of coins. Interestingly, no other cryptographer had found this vulnerability in as long as four years. It was not until February 2019 that the authors of the BCTV14 paper and the Zcash team announced this serious vulnerability at the same time.

My conclusion is that if you use zero-knowledge proofs, be sure to perform rigorous formal verification. Finally, I want to emphasize three aspects of trust: 1. the blockchain provides trust in the consensus protocol; 2. zero-knowledge proofs provide trust in data and the integrity of computation; 3. formal verification ultimately guarantees that the computation logic itself is free of errors. The combination of these three can form a truly trustworthy mathematical foundation. Thank you all!
