This article is intended for readers who have a certain cryptography basis or who are interested in cryptography. Although there are a large number of mathematical formulas in the text, they are relatively simple and easy to understand.
Introduction: What is zkPoD?
zkPoD implements a decentralized "zero-knowledge conditional payment" that supports zero-trust fair trade on GB data. For the concept of "zero knowledge conditional payment", please see this overview article "zkPoD: blockchain, zero-knowledge proof and formal verification, to achieve fair trade without intermediary, zero trust" . zkPoD is a new solution for achieving ZKCP goals. At present, zkPoD already supports GB data, supports low TPS public chain, and also supports high TPS alliance chain; supports both binary data and tabular data with internal rich types and structures. Compared with the traditional "trusted third parties", zkPoD uses the blockchain as a "Trustless Third Party" to achieve "zero trust fair trade".
zkPoD is also an underlying protocol that enables two-way circulation of data and value.
- Exploring Zero Knowledge Proof Series | Understanding Zero Knowledge from "Simulation": Parallel Universe and Time Back
- One article to understand the latest research and development progress of Ethereum, zero knowledge proof, sharding and other topics
- Privacy solution ZK² Rollup: how to achieve high-speed, cheap privacy transactions on Ethereum
- Year-end inventory | A complete overview of the coolest blockchain technology in 2019
- From GB to KB, how does zero-knowledge proof create a concise blockchain?
- Vitalik: Understanding the new universal zero-knowledge proof scheme PLONK
zkPoD open source code and more documentation can be found at: https://github.com/sec-bit/zkPoD-node
Proof-of Delivery (PoD) protocol
PoD is the core protocol for implementing the zkPoD system. The PoD protocol implements the borrowing of blockchain smart contracts to exchange atoms between "data" and "token" and guarantees the fairness of buyers and sellers. PoD does not use a single zkSNARK scheme to achieve atomic exchange like ZKCP[1], but instead uses the classic cryptography schemes such as Pedersen Commitment and Schnorr Protocol. This way PoD can be made more efficient and easy to expand. The PoD protocol will also use formal proof to build a solid "trust base."
This article introduces a minimalist PoD protocol, PoD-Tiny, which simplifies many details and is not practical, but can help readers quickly understand the principles and challenges of PoD.
Suppose I am a seller and you need to buy a data file from me. A rough flow of this agreement is:
- Step 1: I encrypt the "data" and pass it to you.
- Step 2: You hand over "Token" to the blockchain "smart contract"
- Step 3: I use the "key" to exchange the "Token" in the "smart contract", and then you can read the "key" from the smart contract to decrypt the data.
Is not it simple? Smart, you are now highly skeptical about whether this process is a problem.
Key issues in "fair trade"
- Key question (1) : The encrypted data you receive is indeed the data you want.
- Key question (2) : After you receive the encrypted data, what should you do if you don’t pay for it?
- Key question (3) : The key that I present to the smart contract must be a real key, otherwise I won't get the token.
- Key question (4) : After I show the real key, I must be able to get the Token.
Let's take a break and discuss how PoD-Tiny cleverly solves these problems .
The feature of locking data: Authenticator
For the key question (1) , we need an anchor and what is the data you want. For the sake of simplicity, let's assume that we have previously agreed on a unique label, or feature, of a data file. Then the data you purchase needs to be able to correspond to this tag one by one.
In general, everyone likes to use Hash to mark features on a string, such as calculations.
h = MD5("hello,zkPoD!")
The promise is also called Commitment, which can do one-to-one correspondence with the data, and can hide the value of the data. Here, A is called "authentication information" Authenticator in the zkPoD system. Where G is the generator of an elliptic curve cycle group.
A = m*G
We generate the "commitment" of this data by the following operation.
Let's take a look at the string "hello,zkPoD!" which has a total of 12 bytes, which is 96 bits. So we can convert these 12 bytes into an integer over a finite field (here we assume that the finite field size is close to 256 bits). So we can encode this string into an integer, let's use a symbol to represent the integer, assuming m.
The "Authentication Information" Authenticator can be made public to everyone, and we don't have to worry about leaking raw data. This is because it is difficult to calculate m by A. This inverse operation is a finite field "logarithm" operation. If the finite field is large, this logarithm operation is very, very difficult. This is the "discrete logarithm problem" hypothesis. Aside from these theoretical details, we only need to know that the Authenticator can be safely handed over to anyone without worrying that m is reversed.
Why does the "certification information" use this "commitment" form instead of the well-known hash operation? This is because "commitment" has additive homomorphism . The so-called homomorphic nature, we can understand this: the plaintext data has some kind of operation, which can be mapped to the ciphertext operation. Suppose there are three data plaintexts, m1, m2 and m3, where m1 = m2 + m3.
Their Authenticator is calculated as follows:
A1 = m1*G, A2 = m2*G, A3 = m3*G,
We can calculate A1 ?= A2 + A3
To verify m1 ?= m2 + m3 . Everyone can find that although a person who knows me knows A1 , he can't calculate m1 . But he still knows that m1 is equal to the sum of the other two numbers, although he has no idea what the specific values of these three numbers are.
Note : The addition here is a modular addition, and a + b is a shorthand for a + b mod p. For readability, subsequent addition, subtraction, multiplication, and division are all agreed to be operations on a finite field.
The rest is simple. In the PoD protocol, we can choose a random number k as a one-time key to encrypt m and calculate E(m) = k + m. E(m) is the encrypted data. I can send K = k*G to you too, so you have three things in your hand, A, K, and E(m). You can use the following formula to "simultaneously" verify that the encrypted E(m) is indeed the ciphertext of the data m:
E(m)*G ?= K + A
And, with the above formula, you can also know a key message: the key is a value associated with K. Although at this time you don't know m and key k at all. This information is also the key to solving the key problem (3) .
in conclusion:
Through homomorphism, buyers can verify that data meets certain conditions in the case of data encryption.
Recall the key questions (2):
- Key question (2) : After you receive the encrypted data, what should you do if you don’t pay for it?
The core method to solve this problem is "zero knowledge proof." If the buyer does not analyze any redundant information after receiving the encrypted data, it will not damage the seller's interests, but also solve the key problem (2). Simply put, if the encrypted data is zero-knowledge, it is not afraid that the buyer will take the encrypted data and run without giving money. The so-called "zero knowledge", everyone can understand this: the encrypted data that the buyer gets, just like getting a bunch of random numbers, there is no information. How to achieve zero knowledge? PoD-Tiny uses the idea of the classic Schnorr protocol.
Interpolation Science: Schnorr Protocol and "Zero Knowledge"
The Schnorr protocol is a very classic example of a textbook. I will take you through it quickly. One of the uses of the Schnorr protocol is for identity authentication. It is a two-party security protocol. One party "certifier" Alice proves to the other party "certifier" Bob that she has a private key corresponding to the public key.
First Alice generates a pair of "public and private keys", (PK, sk). Bob then holds Alice's public key PK. When Alice wants to prove identity to Bob, they will complete the proof through a "three-step interactive protocol": prove that Alice has the private key sk. If Bob accepts the proof, then Bob will think that the person who has the private key on the opposite side is Alice. The following briefly describes this protocol:
Sk = a, pk = a*G
Public input: PK = a*G
Alice private input: sk = a
Step 1: Alice chooses a random number r and sends r's "commitment" R = r*G to Bob.
Step 2: Bob sends back a random number c as a challenge number
Step 3: Alice calculates z = r + c * a, then sends z to Bob, which then verifies by the following formula:
Verification formula: z*G ?= R + c*PK = r*G + c*a*G
This Schnorr protocol has three properties:
- Completeness Completeness
- Reliability Special Soundness
- Special Honor Verifier Zero-Knowledge
The third property is "zero knowledge". This property guarantees that Bob will not get any information about the private key in any way during the protocol interaction.
Note : Strictly speaking, the Schnorr protocol is not "Full-ZK", only "HVZK" is guaranteed, which is a relatively weak zero-knowledge nature. But for the time being, you don't have to worry about this. The Schnorr protocol can be upgraded to "Full-ZK" with some tricks.
PoD-Tiny: A simple PoD protocol
If you have probably remembered the details of the Schnorr protocol, then I will show a protocol called PoD-Tiny.
Protocol Description: Suppose Alice has a plaintext m of data, then Bob owns the Authenticator(A) of this data. There is also a "Trustless Third Party". Let's call her Julia for now. Please remember: she is a smart contract.
protocol:
Props before the opening: m, a, G, a random number generator rand ()
Character:
- Alice: owns data m, one-time key k <-rand()
- Bob: owns A
- Julia: N/A
step:
The first step : Alice generates a random number, r <-rand() , and then sends it to Bob. The two numbers K=k*G and R=r*G
Step 2 : Bob generates a random number c <-rand() and sends it to Alice.
Step 3 : Alice calculates two numbers E(m) = k + m, z = r + c * k and sends it to Bob. These two numbers, the first E(m) is the "data ciphertext" encrypted with the one-time key k, and z is the "ciphertext of the key".
Note: What? The ciphertext of the key? That's right, here Alice encrypts k with the random number r generated in the first step, plus the number of challenges c provided by Bob in the second step, and gets z.
Step 4 : Bob verifies the received data ciphertext E(m) (formula (1)) and verifies the ciphertext of the key (formula (2)):
- Verify the formula (1): E(m) * G ?= A + K
- Verification formula (2): z*G ?= R + c*K
Note: In the fourth step, Bob needs to understand two things: first, the ciphertext data (E(m)) passed to him can correspond to the data anchor point (A); then the ciphertext data (E(m) Is it encrypted by an unknown key X, and the ciphertext of this "unknown key" should be equal to the "key ciphertext" (z) sent by Alice in the third step. If so, at some point in the future, if Bob gets the "key ciphertext key" (r), he can do two decryption actions to successfully get the data plaintext (m). The two decryption actions are as follows: first, Bob decrypts the key ciphertext z with the "key ciphertext key" r and the challenge number c, obtains the data key k, and then decrypts the data ciphertext E with the data key ( m), thereby obtaining the data plaintext m.
Remark: The first step to the fourth step of the above protocol, in fact, you can find very similar to the Schnorr protocol. Just replace a with a one-time key k. Then another difference is that K = k*G is equivalent to the public key in the original Schnorr protocol, not originally sent to Bob, but sent to Bob in the first step of the protocol together with R. Regardless, the four-step protocol is an extension of the Schnorr protocol as a whole. Of course, there is no end here, and then the blockchain will debut. Bob, Alice will start interacting with Julia.
Step 5 : If Bob verifies that Equation (1) and Equation (2) pass in the fourth step, then the encrypted data sent by Alice is correct. At this time, Bob will send Julia a "Delivery-Receipt", R. Bob needs to save "Token" to Julia in this step.
Note: This receipt is to tell Julia that Bob has received the encrypted data, but the key has not been received. The key needs Julia to receive and verify it for him. So what is the verification credentials? It is the "commitment" corresponding to the "key of the key ciphertext", is it a bit of a wrap, this receipt is the R of the first step of the agreement sent to Bob by Alice.
Step 6 : Alice shows Julia the "key of the key ciphertext", which is r. Julia checks this key formula below. If the validation passes, Julia can transfer the Token saved by Bob to Alice.
- Verification formula (3): R ?= r*G
Let's see how the key problem (3) is solved.
- Let's see how the key problem (3) is solved.
In the first step of the agreement, Alice sends Bob a "promise" R for the "key of the key"; then in the fifth step of the agreement, Bob hands over R to Julia; in the sixth step, Alice honors the promise, Reveal the corresponding r. If Alice shows an incorrect value, Julia will immediately find that the publicity (3) does not hold.
There is still one:
- Key question (4) : After Alice shows the real key, he must be able to get the token.
In the sixth step of the agreement, Julia checks the formula (3). In the case where Alice shows the correct r, if the equation does not hold, then there are only two cases: (1) Julia deliberately messed up, and (2) the value of R is incorrect. In the former case, it is necessary to ensure that Julia's contract code does not have a vulnerability and is functioning properly. This requires an additional "formal verification" method. In the latter case, Alice is asked here to check the internal state of Julia in the sixth step. In the fifth step, the R submitted by Bob is not a correct value. Note here: Julia is an open smart contract, any data she holds is publicly visible , and any internal state and calculations are publicly visible .
Analysis of the security and fairness of the agreement
If we do not consider multiple transactions, PoD-Tiny is a "fair" transaction agreement. We will next analyze why this agreement is fair.
We first consider what cheating Alice has:
- A1. Encrypt the fake data m' and pass it to Bob.
- A2. The key k used to encrypt the data, but another k' is used when encrypting the key, and k <> k'
- A3. Present a fake ciphertext key r' to Julia
Analysis: If Alice uses the cheating method A1, Bob can find it when checking formula (1); if Alice uses cheating method A2, then Bob can find it by formula (2); if Alice uses cheating method A3, then Julia passes Equation (3) can be found.
Let us consider what cheating methods Bob has:
- B1. After Bob gets the encrypted data E(m), he quits the protocol and tries to crack the ciphertext.
- B2. Bob presents Julia with a wrong "delivery receipt" after verifying the encrypted data.
- B3. Bob account does not have enough Token
Analysis: If Bob uses cheating means B1, then Bob is unable to get any information from the encrypted data, because the first three steps of the protocol are "zero knowledge" (accurately: Honest Verifier Zero-Knowledge). If Bob uses means B2, Alice can check in the sixth step that Julia's "data delivery receipt" R is the same as she sent to Bob in the first step. Once Bob submits the wrong receipt, Alice can directly quit the agreement. , refused to show the key. Similarly, if Bob uses means B3, Alice can check if the Token saved by Bob at Julia is sufficient in the sixth step, and if not, exit the agreement directly.
Finally, is Julia likely to cheat? Julia is a smart contract. Any behavior and internal state of her can be read by anyone. Then Julia is likely to cause information leakage, which is not good for Alice or Bob. However, please note that Julia does not actually touch any information related to the data, so it does not reveal m information from the chain. Julia has only two messages, R and r.
Compress to the simplest protocol
We count the interaction steps of the above protocol in a total of five steps, Alice interacts with Bob three times, Bob interacts with Julia once, and Alice interacts with Julia once. The security protocol has a tool called Fiat-Shamir Heuristic Transform that directly "compresses" the first three interactions in the PoD-Tiny protocol into an interaction.
Before compression:
After compression:
The main difference we found was that the number of challenges in the compressed PoD-Tiny is no longer generated by Bob, but by Alice. Everyone here may have questions. Will this be unfair to Bob? This is equivalent to a three-step Schnorr protocol that is directly compressed into one step. Here is the conclusion: the compressed protocol maintains the nature of zero knowledge and is still fair to both parties. The reason is that the pre-compression protocol can prove HVZK (Honest Verifier Zero-Knowledge); the compressed protocol can prove NIZK (Non-Interactive Zero-Knowledge). However, the comparison of security before and after compression will compare Subtle, which is no longer expanded.
After compression, the final agreement became incredibly concise:
The challenge towards practicality: safety and performance
The minimalist agreement PoD-Tiny is only the first step of the long march. When faced with the cumbersome real world, when you turn theory into code, you will face many problems. These problems will be intertwined, which in turn will affect the theoretical design of the agreement.
- How to support data longer than 1MB, even on GB
- How to effectively reduce the overhead of verification on the chain
- How to support Ethereum and immunize all kinds of security issues on Ethereum
- How to support complex homomorphic calculation of data
In terms of safety and performance, zkPoD has done a lot of engineering improvements and innovations. Interested friends please pay attention to our follow-up articles.
Written at the end
What can the blockchain do? I have seen a lot of "pessimistic" arguments in the past year, and I think the PoD agreement should give some inspiration to these skeptics. The blockchain played a key role in the "Trustless Third Party" in the PoD protocol, and made this protocol incredibly concise, which we didn't expect.
references
- [1] Maxwell, G. "Zero knowledge contingent payment. 2011." URl: https://en.bitcoin.it/wiki/Zero_Knowledge_Contingent_Payment (visited on 05/01/2016) (2016).
- [2] Greg Maxwell. “The first successful Zero-Knowledge Contingent Payment.” https://bitcoincore.org/en/2016/02/26/zero-knowledge-contingent-payments-announcement/
Author: Yu Guo