Hardcore! 360 senior security expert Peng Brewing uses Zcash as an example to discuss the security and privacy issues of zero-knowledge proofs

On October 12th, the "2019 CCF Blockchain Technology Conference", hosted by the China Computer Federation (CCF), was held in Chengdu. 360 senior security expert Dr. Peng Brewing gave a talk titled "Security and Privacy Issues in Blockchain Applications", and the content is very technical.

The following is the full text of Dr. Peng's talk, as published by Babbitt.

Bitcoin privacy issues

Bitcoin is a decentralized digital currency that broadcasts transactions to a public ledger to prevent double spending. Bitcoin is not anonymous: personal transaction records, account balances, and merchant cash flows can all be tracked, which creates both a privacy problem and a fungibility problem. People want every unit of money to be interchangeable, but a bitcoin's history is public. Some coins may once have been dirty money, while others may have higher collectible value, so bitcoins are not truly fungible. To solve these two problems, people want to add privacy to Bitcoin.


A Bitcoin transaction is shown in the figure. If you want to add privacy protection to the public transaction history, the simplest idea is to encrypt everything, including the sender, the recipient, and the amount, which gives you a private transaction. But there is a problem. If the data is encrypted, other users on the network cannot decrypt it, which conflicts with the public auditability of Bitcoin and of blockchains in general. Researchers therefore proposed using zero-knowledge proofs to solve this problem.


A brief introduction to zero-knowledge proofs: there is a prover, a verifier, and a statement X to be proven. The prover holds a witness W showing that X belongs to a certain language defined by a relation R. To convince the verifier of X, the prover and the verifier exchange some messages. At the end of the interaction, the prover sends a proof to the verifier, and the verifier decides to accept or reject the statement.

A zero-knowledge proof must satisfy the following properties:

Completeness: if the statement is true and the prover is honest, the verifier will be convinced;

Soundness: if the statement is false, the prover cannot convince the verifier;

Zero knowledge: the verifier learns nothing other than that the statement is true.

It is often difficult to require the prover and the verifier to be online at the same time, so non-interactive zero-knowledge proofs (NIZK) are more valuable in practice. In a NIZK, the prover sends a single proof π to the verifier, who can verify the statement directly without further interaction. For NIZK there is a simple result: if integer factorization is hard, a non-interactive zero-knowledge proof exists for every NP language.
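To make the non-interactive flow concrete, here is a minimal sketch of a Schnorr proof of knowledge made non-interactive with the Fiat-Shamir transform. This is a different and much simpler construction than the factoring-based NIZK for NP mentioned in the talk. The group parameters P, Q, G and the function names are illustrative assumptions, and the group is far too small for real use.

```python
import hashlib
import secrets

# Toy parameters (assumption, insecure): G has prime order Q modulo P.
P, Q, G = 23, 11, 2

def challenge(*values: int) -> int:
    """Derive the challenge by hashing the transcript (Fiat-Shamir)."""
    data = ",".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove(w: int) -> tuple[int, tuple[int, int]]:
    """Prove knowledge of the witness w for the statement x = G^w mod P."""
    x = pow(G, w, P)
    k = secrets.randbelow(Q)      # one-time nonce
    t = pow(G, k, P)              # commitment
    c = challenge(G, x, t)        # hash replaces the verifier's message
    s = (k + c * w) % Q           # response
    return x, (t, s)

def verify(x: int, proof: tuple[int, int]) -> bool:
    """Check G^s == t * x^c without any interaction with the prover."""
    t, s = proof
    c = challenge(G, x, t)
    return pow(G, s, P) == (t * pow(x, c, P)) % P

statement, pi = prove(7)
print(verify(statement, pi))
```

The verifier only needs the statement and π, which is what allows the prover to be offline at verification time.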


With zero-knowledge proofs, it is easy to introduce privacy on the blockchain. I encrypt the whole transaction and attach a proof that the transaction is legal. "Legal" means that it satisfies the blockchain's transaction rules: the user really does have that much money, and none of the money in the transaction has been double-spent. This resolves the conflict between encryption and public verifiability.


General-purpose zero-knowledge proof schemes usually perform poorly, while blockchains place strong demands on transaction performance. In practice, zk-SNARKs are the popular choice because of their performance advantages.

As shown in the figure, the zk-SNARK parameter generation process produces two keys: the proving key PK and the verification key VK. A prover who holds a witness W uses PK to generate the proof π. The statement and π are sent to the verifier, who can verify the transaction directly using VK.


To illustrate how zk-SNARKs are applied in a blockchain, I drew this diagram. There is a specific problem to be verified, such as whether a transaction is legal. This problem can be equivalently converted into a QAP (quadratic arithmetic program). The public chain's project team then generates the public parameters for the chain, which are in effect the verification key VK and the proving key PK. When an ordinary user wants to send a transaction, they first encrypt it as C1, C2, C3 (in practice, an encoding), then use the proving key PK to prove that the transaction is legal, obtaining a proof π. The transaction and π are broadcast to the network together, and every user on the network can use the verification key VK to check that the transaction is legal. This is the flow of applying zk-SNARKs on a blockchain.
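As a rough illustration of what "converting a problem" looks like, the sketch below encodes the toy statement "I know x such that x^3 + x + 5 = 35" as a rank-1 constraint system (R1CS), the intermediate form that is usually interpolated into a QAP. The statement, the variable layout, and the function names are assumptions chosen for illustration. This is not Zcash's circuit, and real systems do the arithmetic over a finite field rather than the integers.

```python
# Witness layout (assumption): s = [1, x, out, sym1, y, sym2]
#   sym1 = x * x
#   y    = sym1 * x
#   sym2 = y + x
#   out  = sym2 + 5
A = [[0, 1, 0, 0, 0, 0],
     [0, 0, 0, 1, 0, 0],
     [0, 1, 0, 0, 1, 0],
     [5, 0, 0, 0, 0, 1]]
B = [[0, 1, 0, 0, 0, 0],
     [0, 1, 0, 0, 0, 0],
     [1, 0, 0, 0, 0, 0],
     [1, 0, 0, 0, 0, 0]]
C = [[0, 0, 0, 1, 0, 0],
     [0, 0, 0, 0, 1, 0],
     [0, 0, 0, 0, 0, 1],
     [0, 0, 1, 0, 0, 0]]

def dot(row: list[int], s: list[int]) -> int:
    return sum(a * b for a, b in zip(row, s))

def r1cs_satisfied(A, B, C, s: list[int]) -> bool:
    """Each constraint requires (A_i . s) * (B_i . s) == (C_i . s)."""
    return all(dot(a, s) * dot(b, s) == dot(c, s) for a, b, c in zip(A, B, C))

witness = [1, 3, 35, 9, 27, 30]   # x = 3
print(r1cs_satisfied(A, B, C, witness))
```

In a zk-SNARK the prover shows that such a witness exists without revealing it; here the witness is checked in the clear only to show the constraint structure.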


Zero-knowledge proofs are now applied very widely in the blockchain space, and many different projects can be found online. This is an overall map of those projects.

The above explains why zero-knowledge proofs are used. Next, I will talk about the security and privacy issues of zero-knowledge proofs in blockchain applications. Many projects have security issues, and I will summarize them from the following aspects.

First, implementation vulnerabilities

When it comes to security, the first thing to consider is implementation vulnerabilities. For zero-knowledge proofs, these fall into three categories:

1. Memory corruption vulnerabilities: many cryptography projects are now developed in memory-safe languages, mainly Rust, Java, and Go, so memory corruption problems are relatively rare. Some projects, such as libsnark, are written in C++, but in this setting memory corruption vulnerabilities are hard to exploit and generally only crash a node. So memory corruption is not a serious problem.

2. Logic vulnerabilities: mainly circuit design issues and application-layer logic issues.

3. Cryptographic implementation vulnerabilities: zero-knowledge proof schemes are relatively new, and implementations of new cryptographic schemes are likely to have new problems.

The first of the logic issues is circuit design. Circuits for zero-knowledge proofs in blockchains are usually very complex: they implement many cryptographic primitives, contain many constraints, and must be optimized for performance, all of which demands considerable skill.


Let's take Zcash's shielded transactions as an example. Here are the inputs and outputs of a Zcash shielded transaction, together with the binding signature, which binds the inputs and outputs of a transaction together.



These are the output circuit and the input circuit diagrams. These circuits are very complex and, for reasons of time, are not described in detail here. If you look at the Zcash specification, there are hundreds of pages of documentation on shielded transactions.

There are three common kinds of circuit problems. The first is circuit design vulnerabilities. Zcash was affected by the Faerie Gold attack, in which the attacker can choose the same rho for different notes, leaving the recipient unable to spend the money received. Monero has had a similar problem before.

The second is inconsistency between the circuit and the non-circuit implementation. The ultimate goal of a blockchain project is to keep all nodes consistent, and inconsistencies can cause serious risks such as forks and double spending. When using zk-SNARKs, the verification logic has to be implemented once in C++ or another high-level language, and once more in the circuit language of the zero-knowledge proof. These two implementations can easily diverge, although in my audits I did not find this problem in well-known projects. Those projects contain a lot of test code aimed at this problem, so they are clearly aware of the risk.

The third is inconsistency between the specification and the implementation. The Zcash specification runs to more than one hundred pages, and there are many inconsistencies between the specification and the implementation, which may bring additional security risks. We found some security issues in the specification itself, but did not find corresponding issues when auditing the actual project code. Even so, such inconsistencies add security risk and attack surface.

The second kind of logic issue is application logic. Application developers call a zero-knowledge proof library to build ZKP applications, but they often lack sufficient understanding of the underlying ZKP, so it is easy to introduce security holes when writing code.


Here is a double-spend problem in the open source project semaphore. A unique value, the nullifier, is supposed to be fed into the circuit, but the project did not restrict the nullifier's length. Adding p or 2p to the nullifier, where p is the modulus of the field the circuit works over, still satisfies the circuit. The same note can therefore be spent multiple times, which is a double-spend problem.
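The aliasing issue can be sketched as follows. The code is a simplified model, not semaphore's actual contract: the class names and the `circuit_accepts` stand-in are hypothetical, and the field modulus is assumed to be the BN254 scalar field order commonly used with Ethereum zk-SNARK tooling.

```python
# Assumption: circuit arithmetic is modulo the BN254 scalar field order.
FIELD_ORDER = 21888242871839275222246405745257275088548364400416034343698204186575808495617

def circuit_accepts(public_nullifier: int, note_nullifier: int) -> bool:
    """Stand-in for proof verification: the circuit only sees values mod p."""
    return public_nullifier % FIELD_ORDER == note_nullifier % FIELD_ORDER

class NaiveNullifierSet:
    """Tracks spent nullifiers as raw 256-bit integers, with no range check."""
    def __init__(self) -> None:
        self.seen: set[int] = set()

    def spend(self, nullifier: int) -> bool:
        if nullifier in self.seen:
            return False
        self.seen.add(nullifier)
        return True

class CheckedNullifierSet(NaiveNullifierSet):
    """Rejects any nullifier that is not a canonical field element."""
    def spend(self, nullifier: int) -> bool:
        if not 0 <= nullifier < FIELD_ORDER:
            return False
        return super().spend(nullifier)

n = 12345
alias = n + FIELD_ORDER               # still fits in 256 bits
print(circuit_accepts(alias, n))      # the circuit cannot tell them apart
naive = NaiveNullifierSet()
print(naive.spend(n), naive.spend(alias))
```

With the naive set, both spends are accepted even though they refer to the same note. The checked variant rejects the alias before it reaches the spent set.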


Tron completed the development of its anonymous coin this year. During development, I helped with some security issues as a community contributor. This is a problem I found during Tron's development that is similar to the semaphore double-spend vulnerability. Tron used the librustzcash library to validate anonymous transactions but did not restrict the length and content of the parameters passed in, which caused the double-spend problem described above.


Similarly, Tron also failed to verify that the nullifiers of multiple inputs in one transaction are different, causing a double-spend problem. But these problems appeared in the early stage of development and were fixed quickly.
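The missing check is small. A minimal sketch, assuming nullifiers are handled as byte strings and that `spent` is the set of nullifiers already recorded on chain (both names are hypothetical, not Tron's actual code):

```python
def nullifiers_valid(tx_nullifiers: list[bytes], spent: set[bytes]) -> bool:
    """Reject a transaction that repeats a nullifier or reuses a spent one."""
    # Two inputs of the same transaction must not reveal the same nullifier.
    if len(set(tx_nullifiers)) != len(tx_nullifiers):
        return False
    # No input may reveal a nullifier that is already on chain.
    return not any(nf in spent for nf in tx_nullifiers)

spent_on_chain = {b"\x01" * 32}
print(nullifiers_valid([b"\x02" * 32, b"\x03" * 32], spent_on_chain))
print(nullifiers_valid([b"\x02" * 32, b"\x02" * 32], spent_on_chain))
```

Checking only against the on-chain set is not enough, because both copies of a duplicated nullifier are unspent at the moment the transaction is validated.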

Next are cryptographic implementation issues. Implementing a new cryptographic scheme often brings additional risks, even in well-known projects. In 2015, a vulnerability was discovered in libsnark's implementation of the R1CS-to-QAP reduction: the implementation did not satisfy the linear independence requirement on the QAP polynomials, so soundness might not hold. The fix was to add redundancy. This is the related paper. ( https://eprint.iacr.org/2015/437.pdf )

Last month, researchers at Stanford University disclosed the PING and REJECT attacks on Zcash. They found that when a node processes a transaction addressed to itself, decryption leaks additional information, so an attacker can send a maliciously crafted transaction and determine which node an address belongs to. This breaks Zcash's unlinkability. The related paper is here. ( https://crypto.stanford.edu/timings/pingreject.pdf )

Second, trust risks


What is the basic idea that lets zk-SNARKs achieve high performance? The verifier's challenge X is generated in advance; the encrypted X is kept and the plaintext X is discarded. The whole proof can then be verified against the encrypted challenge, and verification amounts to checking that the equation above holds. If an attacker knows X, for example because they generated the project's parameters, they can construct arbitrary proofs directly, bypassing all the hard problems. Such a backdoor cannot be detected by others, because forged proofs are also zero-knowledge, so nobody can tell whether an attacker has used it to create extra money.
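Why knowledge of X is fatal can be seen in a toy model. The sketch below works in the clear over a small prime field, with no encryption and no pairings, so it is only an analogy for the real check. The polynomials, the field size, and the function names are assumptions for illustration. The verifier accepts if p(X) = t(X)·h(X) at the secret point X, which an honest prover can only arrange if t divides p.

```python
FIELD = 2**61 - 1                     # toy prime field (assumption)

def t(x: int) -> int:
    """Target polynomial t(X) = (X - 1)(X - 2)."""
    return (x - 1) * (x - 2) % FIELD

def p_bad(x: int) -> int:
    """p(X) = X^2 + 1, which is NOT divisible by t(X): a false statement."""
    return (x * x + 1) % FIELD

def verifier_check(p_val: int, h_val: int, x: int) -> bool:
    """Toy check at the secret point; real systems check this on encrypted values."""
    return p_val % FIELD == t(x) * h_val % FIELD

def forge(x: int) -> tuple[int, int]:
    """With the plaintext challenge, solve for h(x) instead of finding a real h."""
    p_val = p_bad(x)
    h_val = p_val * pow(t(x), -1, FIELD) % FIELD
    return p_val, h_val

secret_x = 123456789                  # the value that should have been discarded
# Without x, the only degree-compatible quotient is the constant 1, and it fails.
print(verifier_check(p_bad(secret_x), 1, secret_x))
# With x, the forged pair passes although the statement is false.
print(verifier_check(*forge(secret_x), secret_x))
```

The forged values are indistinguishable from honest ones to the verifier, which matches the point made in the talk that such forgeries leave no trace.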


The solution to the trust problem is to use secure multi-party computation (MPC) to generate the encrypted X, and other projects do the same. Take Zcash as an example: it has a dedicated parameter generation procedure divided into two phases. The first phase generates an encrypted X, which can also be reused by other projects. Zcash published the entire communication transcript of the MPC protocol, and the implementation code is public as well. Every message that each participant received and sent is put online. In this setting, as long as at least one participant is honest, we can consider the MPC result secure.
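The "one honest participant is enough" property can be sketched with a toy sequential ceremony. This is a simplified model, not Zcash's actual protocol: it tracks a single group element instead of the full list of powers, omits the proofs that each update was done correctly, and uses an assumed toy group (P and G below).

```python
import math
import secrets

P = 2**127 - 1        # toy prime modulus (assumption, not a pairing group)
G = 3                 # toy base element (assumption)

def contribute(acc: int) -> tuple[int, int]:
    """One participant mixes a fresh secret s into the accumulator: acc -> acc^s."""
    s = secrets.randbelow(P - 3) + 2
    return pow(acc, s, P), s

def run_ceremony(participants: int) -> tuple[int, list[int]]:
    acc, used = G, []
    for _ in range(participants):
        acc, s = contribute(acc)
        used.append(s)            # an honest participant would delete s instead
    return acc, used

encrypted_x, all_secrets = run_ceremony(5)
# The hidden exponent is the product of every contribution.
x = math.prod(all_secrets) % (P - 1)
print(pow(G, x, P) == encrypted_x)
```

Recovering the exponent needs every participant's secret. If even one of them destroys their value, nobody can compute x directly from the remaining contributions.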


The second phase generates parameters for the specific circuit. Both phases are indispensable. Some projects use the Powers of Tau parameters directly but run no MPC for the second phase of parameter generation. Such projects are also untrustworthy.

Let me share my views on MPC. Many projects, Ethereum among them, are now running their own MPC to generate zero-knowledge proof parameters, but there are also some projects that have no trusted setup ceremony at all. I can talk about my own experience of participating in an MPC. In early September, the Ethereum side contacted me and said the project team would invite some community members to take part in the parameter generation ceremony. They put me in 14th place and said they would contact me, but by October I had heard nothing at all. Although the MPC scheme is secure, the process is not like an internet chat room that you can join whenever you like. Who goes in which position, and when it is your turn, is completely controlled by the project team. If you do not participate in the MPC yourself, you cannot be 100% sure that it is secure.


To give an example, ZoKrates is a compiler for zero-knowledge proof smart contracts. If you have some logic to verify, you can use it to compile that logic into a zero-knowledge verification contract on Ethereum. But there is a problem: the entire parameter generation process is controlled by whoever generates the contract. If the contract generator is malicious, they can forge any proof.

Third, information leakage from unshielded transactions


I used a crawler to collect Zcash transaction data for September. 85% of transactions were transparent, with no encryption at all; 14% were partially shielded, or translucent; and only 1% were fully shielded. In other words, most transactions can be tracked.


In addition, I also looked at Zcash's current coin pools: 95% of coins are in the transparent pool, and only about 5% are in the shielded pool.

In fact, many researchers have analyzed the linkability and anonymity of these anonymous coins, as in this paper. Most transactions can be traced. For example, suppose address A sends a sum of money into the shielded pool and address B sends another sum into the shielded pool, and one day later the total of the two sums is sent from the shielded pool to address C. From this pattern we can infer that A, B, and C are related: the money went from A and B to C. An analyst can use related addresses, related amounts, and related times to perform matching analysis. Previous papers have shown that more than 90% of transactions can be analyzed this way.
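The A, B, C pattern above can be expressed as a simple amount-and-time heuristic. The sketch below is an illustrative assumption about how such matching could work, not the method of any particular paper. The `Transfer` type, the 48-hour window, and the group size limit are hypothetical, and fees are ignored.

```python
from itertools import combinations
from typing import NamedTuple

class Transfer(NamedTuple):
    address: str   # transparent address on the other side of the shielded pool
    amount: int    # amount in the smallest unit
    time: int      # hours since some reference point

def link_withdrawal(deposits: list[Transfer], withdrawal: Transfer,
                    window: int = 48, max_group: int = 3) -> list[list[str]]:
    """Find groups of recent deposits whose amounts sum to the withdrawal."""
    recent = [d for d in deposits if 0 <= withdrawal.time - d.time <= window]
    matches = []
    for size in range(1, max_group + 1):
        for group in combinations(recent, size):
            if sum(d.amount for d in group) == withdrawal.amount:
                matches.append([d.address for d in group])
    return matches

deposits = [Transfer("A", 100, 0), Transfer("B", 250, 2), Transfer("D", 70, 5)]
withdrawal = Transfer("C", 350, 26)
print(link_withdrawal(deposits, withdrawal))   # [['A', 'B']]
```

A unique match such as this one suggests that A and B funded C, even though the transfers passed through the shielded pool.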

What is the cause of the problem? Current ZKP solutions do not really solve the privacy problems of the majority of users. Most participants transact through light nodes such as mobile phones, and cannot synchronize all the data locally the way a full node does. A shielded transaction has to be decrypted to find out whether it belongs to you. A light node does not have all the data, so to find out whether a transaction belongs to it, it must hand its transaction key to a full node, and the light node's privacy is then poorly protected. Moreover, current zero-knowledge proof schemes are not friendly to light nodes because the computational overhead is large. In general, today's privacy protection is designed for full nodes, and there is no good solution for light nodes.

Light nodes have insufficient support for shielded transactions. In addition, for policy reasons, exchanges do not let you use shielded transactions; you must use transparent ones. As an ordinary user who wants to protect my privacy, I follow some rules of thumb: use shielded transactions whenever possible, use a new address for every transaction, vary the transaction amounts, and leave enough time between transactions. None of these guarantees 100% security. If you want cryptographic security, the best solution is to use only shielded transactions. Reality, however, is less accommodating: when you want to receive money from a counterparty, or send money to an exchange, the other side may not support shielded transactions.

Fourth, cryptographic scheme risks

Zero-knowledge proof technology is relatively new: the paper dates from 2016 and large-scale use began in 2017. I personally think these projects still need to stand the test of time. Many projects choose and optimize their parameters aggressively in order to show that their scheme performs well. Some rely on non-standard hardness assumptions, rely on too many security assumptions, or lack sufficient auditing.

Here is one such problem: Zcash's counterfeiting vulnerability, CVE-2019-7167, which was discovered in 2018 but only disclosed in 2019. Anyone could forge proofs and create Zcash out of thin air. The vulnerability also affected multiple Zcash fork projects. It took about 8 months to fix, because the project had to change its entire proof system and upgrade the whole network. No one knows whether the vulnerability was exploited; if it was, the exploit was itself zero-knowledge. Zcash officially stated that very few people have the skill needed to discover such a vulnerability, and that no anomaly was found in the total Zcash supply.

[BCTV14] has no provable security. The [BCTV14] parameter generation produces redundant elements that can be used to forge proofs; the principle is not complicated. Later came the Groth16 scheme. Nor was this the first problem of its kind: in 2015, researchers at Microsoft Research discovered another vulnerability. Will these vulnerabilities be the last, and will the next one have an even greater impact?


We can see that Zcash uses commitment and hash schemes, and its design is very complex. The commitments and hashes alone involve many schemes and many security proofs. Their security has to stand the test of time.

Fifth, other risks

Finally, let's talk about other risks. We can prove mathematically that Groth16 is a perfect zero-knowledge scheme, but in real applications there is additional information leakage. The most obvious is that transactions contain ciphertexts, and a ciphertext is not zero-knowledge. Perfect zero knowledge does not equal perfect privacy protection. My data and my transactions are all on the chain, and on-chain data may persist for 10 or 20 years. Perhaps 20 years from now the underlying hard problem will be broken and past private data will be exposed.

The unlinkability of Zcash shielded addresses has the same problem: being unlinkable today does not mean being unlinkable in the future.


Side-channel vulnerabilities are usually not taken seriously. Such attacks are hard to exploit and do not directly cause security problems, but they matter a great deal in privacy-related systems, because a side channel can compromise privacy directly. I looked at the Groth16 scheme, the most widely used zero-knowledge proof scheme. To compute a proof, the prover computes the elements A, B, and C. The lowercase a_i here are the secrets held by the user, and the computation of A, B, and C depends directly on them. The secrets can be recovered through a simple side-channel attack; there are also similar cache-based side-channel attacks. At present, these zero-knowledge schemes have no protection against side-channel attacks.

Today I briefly shared the security risks of ZKP applications. Zero-knowledge proofs are still a new technology, and there are still many problems in their application. Of course, these problems will gradually be solved as the technology matures.
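To show the kind of leak involved, here is a toy model of scalar multiplication by a secret value, using plain modular addition in place of elliptic curve point addition. It is an assumption-laden sketch, not the code of any real Groth16 prover, and the function names are hypothetical. The point is only that a textbook double-and-add loop performs a number of additions that depends on the secret bits, which timing or cache observations can reveal.

```python
def scalar_mul_leaky(k: int, point: int, modulus: int) -> tuple[int, int]:
    """Double-and-add; returns (k * point mod modulus, number of additions)."""
    acc, adds = 0, 0
    for bit in bin(k)[2:]:
        acc = (acc + acc) % modulus          # always double
        if bit == "1":                       # secret-dependent branch
            acc = (acc + point) % modulus
            adds += 1
    return acc, adds

def scalar_mul_regular(k: int, point: int, modulus: int, bits: int = 8) -> tuple[int, int]:
    """Same result, but with one addition per bit regardless of the secret.

    This only evens out the operation count. Python itself gives no
    constant-time guarantees, so this is not a real countermeasure.
    """
    acc, adds = 0, 0
    for i in reversed(range(bits)):
        acc = (acc + acc) % modulus
        candidate = (acc + point) % modulus
        adds += 1
        acc = candidate if (k >> i) & 1 else acc
    return acc, adds

M = 1000003
print(scalar_mul_leaky(0b1011, 5, M), scalar_mul_leaky(0b1000, 5, M))
print(scalar_mul_regular(0b1011, 5, M), scalar_mul_regular(0b1000, 5, M))
```

In the leaky version the two secrets of equal bit length cost 3 and 1 additions respectively, while the regular version always performs 8, so the operation count no longer depends on the secret.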