# Understanding Zk-stark of Zero Knowledge Proof Algorithm-Arithmetization

## Foreword

The first article in this series ( https://www.8btc.com/article/512859 ), compared with Zk-snark, gave a general introduction from the concept and algorithm flow. It is recommended to read the contents of the first article before reading this article. In this article, let us embark on a journey of exploring the mysteries of the Zk-stark algorithm from shallow to deep.

## review

As mentioned in the first article, the Zk-stark algorithm can be roughly divided into two parts: Arithmetization and Low Degree Testing. In this article, we first introduce Arithmetization in the first stage of the algorithm in detail.

The overall steps of Arithmetization are shown below: So what is Arithmetization? What is the specific process? With these questions in mind, let's savor the contents of the article carefully.

First, what is Arithmetization?

Arithmetization is the process of converting a CI statement into a formal Algebraic language. This step has two purposes: first, to present the CI statement in a concise and clear way; second, to embed the CI statement in the algebraic domain, which is a polynomial Convert to make the pavement. Arithmetization representation mainly consists of two parts: first, the execution trajectory (orange part in the figure); second, polynomial constraint (grey part in the figure). The execution trajectory is a table, and each row of the table represents a single-step operation; the construction of the polynomial constraint is complementary to the execution trajectory, that is, the polynomial constraint will satisfy each calculation of the execution trajectory only when the execution trajectory is correct. Finally, the execution trajectory and polynomial constraints are combined to form a certain polynomial, and then the polynomial is verified by LDT. So far, the problem of verifying the CI statement has been transformed into the problem of verifying the deterministic polynomial LDT.

## Arithmetization

Knowing the overall process of Arithmetization, next, we discuss the specific process. In order to facilitate understanding, we use a simple example to run through the entire Arithmetization process.

Everyone has been to the supermarket. The contents of a general supermarket receipt are as follows: Now, popular Hollywood actor Bob claims: "the total sum we should pay at the supermarket was computed correctly". How to verify it? In fact, it is very simple. At this time, another awesome actor Alice just needs to face the receipt and add up each item to complete the verification. Well, this is just a very simple example. In fact, Alice only needs 5 steps to complete the verification process. Imagine a scenario: After all, Bob was rich and bought 1,000,000 items in the supermarket. Similarly, he claimed: "the total sum we should pay at the supermarket was computed correctly". At this time, Alice was really angry. This How to verify, according to the previous method, it takes about 1,000,000 steps to make trouble? Who loves to do it. Bob also felt bad for Alice, after all those years. I thought, is there any sirloin way for Alice to be sure that I am right with very few steps? So Bob started the strongest brain mode.

Now, let's follow Bob's method to find this sirloin using the simple example above. Bob thought, don't you just verify the final sum, right? Then I will list the calculation process of the sum. I guarantee that each accumulation is correct, then my final result must be correct. So Bob added a new column to the receipt to save the intermediate value in the calculation of the sum (indicated by the orange-brown part in the figure), which is the execution trace (the orange part in Figure 1). A new column of values ​​needs to be met, the initial value is 0 (yellow part in Figure 2), the final value is equal to the sum to be paid (yellow part in Figure 2), each value in the middle must be equal to the previous value plus The unit price of the item on the previous line (the red line in Figure 2), which constitutes a polynomial constraint (the gray part in Figure 1, the lower left corner of Figure 2).

It can be seen from Figure 2:

• There are three polynomial constraints, two are boundary constraints (polynomial indexes 1 & 3) and one is cyclic constraint (polynomial index 2).
• The size of the polynomial has little to do with the answer to the execution trajectory, that is, even if the length of the table is expanded to 1,000,000, the final polynomial constraint is still these three. The only change is the value range of the variable x.

Here, borrow the words of God V to describe Zk-stark: Zk-Stark is not a deterministic algorithm. It is a large class of passwords and mathematical structures. It has different optimal settings for different applications. It can be understood that for different problems, there are different arithmeticization schemes (in this example, a column of values ​​is added, which may not be applicable in other cases), so specific problems must be analyzed. But a common goal is that no matter what the problem is, the execution trace obtained is best represented by a LOOP, and the polynomial constraint obtained in this way is the most concise. The number and form of the polynomial constraints directly affect the size of the proof and the performance of the Zk-stark algorithm. Therefore, finding an optimal setting is particularly important for the Zk-stark algorithm.

Back to the subject, now that Bob has got the polynomial constraints and execution trajectories, how do I turn them into a certain polynomial? Please see the following figure: (the blue arrow represents the main flow, the red arrow represents the branch) Bob first cuts the focus to the execution trajectory. You can see that the execution trajectory has two columns, one is the single price, and the other is the sum of the prices. We perform Lagrange interpolation on the elements of the two columns to obtain two functions f (x ), w (x), 0≤x≤5. The domain expansion of the two functions was evaluated at more points, that is, f (x), w (x), 0≤x≤10000 (from polynomial interpolation to domain expansion, which is actually Reed- Solomen's encoding process can be realized, even if there is a difference in the original data, the codewords obtained will be very different; the main purpose is to prevent the prover from doing evil, and adding the prover to do evil will make the verifier easy to find).

Then, Bob combines f (x), w (x) and polynomial constraint equations to get a set of exact polynomial constraints (shown in red circle 2 in the figure). Take the cyclic constraint polynomial as an example:

1 ≤ x ≤ 5 w (x)-f (x -1)-w (x -1) = 0 (1)

Let Q (x) = w (x)-f (x-1)-w (x-1), then Q (1) = 0, Q (2) = 0, Q (3) = 0, Q ( 4) = 0, Q (5) = 0.

According to the known facts, a polynomial H (x) of degree d is 0 at x = n, then there exists a polynomial H '(x) of degree d-1, which satisfies d (H` (x)) = d ( H (x))-1 && H (x) = H` (x) * (x-n)

So for Q (x), the degree is 5, and there is a polynomial Ψ (x), and the degree is 0, which is a constant, satisfying Q (x) = Ψ (x) * (x-1) (x-2) (x- 3) (x-4) (x-5), let the target polynomial T (x) = (x-1) (x-2) (x-3) (x-4) (x-5), degree 5 , Then:

Q (x) = Ψ (x) * T (x) (2)

The verifier Alice randomly selects a from 0≤x≤10000 and sends it to the prover Bob, asking Bob to return the corresponding value. Taking formula (2) as an example, Bob needs to return w (a), w (a-1), f (a-1), Ψ (a), and then Alice determines whether the equation holds, that is:

w (a)-f (a-1)-w (a-1) = Ψ (a) * T (a) (3)

If the equation holds, Alice has a high probability to believe that the execution trajectory is correct, so the original calculation holds. If the verifier Bob does evil, say that 4.98 in the table is changed to 5.98, then Q (1) = w (1)-w (0)-f (0) = 5.98-0-4.98 = 1, not equal to 0. In this case, looking at formula (2), the right side of the equation is Q (x), the degree is 5, and x = 1 is not the zero point; the right side of the equation Ψ (x) * T (x), let G (x) = Ψ (x) * T (x), degree is 5, because T (x) is zero at x = 1, so G (x) is also 0 at x = 1, so the two sides of the equation are actually Different polynomials with the same degree have a maximum of five intersections. Therefore, in the range of 0≤x≤10000, only 5 values ​​are equal, and the value of 9995 is not equal. Therefore, a value is randomly selected from 0≤x≤10000. The probability of failing the verification is 99.95%. If the range of the domain extension is larger, the probability of failing the verification will be closer to 1. According to the same logic, the boundary constraint polynomials are processed separately, and the results are shown in the figure (shown as red circle 3 in the figure).

Next, we talk about how to increase the zero-knowledge attribute.

For the prover Bob, the execution trajectory is not expected to be seen by the verifier Alice because it will contain some important information. Therefore, the verifier Alice can only choose a value randomly from the range of 6≤x≤10000. For verification, of course, both parties agree.

There is such a problem. When the verifier Alice receives the values ​​from the prover Bob, how to ensure that these values ​​are valid, they are indeed calculated in the form of polynomials, and these polynomials are less than a certain degree, instead of the prover Bob is only for verification, And the generated random value? For example, how to ensure that w (a), w (a-1), f (a-1), and Ψ (a) are polynomials w (x), f (x), and Ψ (x) respectively at x = a && x = What about the value of a-1 and the degree of the polynomials w (x), f (x), Ψ (x) is less than a certain fixed value? These questions will be answered in the next article. Before that, it is better to discuss why the degree of the polynomial is less than a fixed value to prove that the original execution trajectory is correct?

From the above example, it can be seen that Q (x) is equal to 0 when x takes the value of 1, 2, 3, 4, and 5 if and only if the execution trajectory is correct. Then Q (x) can be divisible by the target polynomial T (x), that is: Ψ (x) = Q (x) / T (x), d (Ψ (x)) = d (Q (x))-5 .

It can be seen from Figure 3 that the number of polynomials to be verified is 5 (shown in red circle 4). If LDT is performed for each polynomial, the consumption is very large. Therefore, you can linearize these polynomials by The combination (shown in red circle 5), if and only if each polynomial satisfies a certain degree, the polynomial after linear combination is also less than a certain degree. This condition is sufficient. For details, see the subsequent series. chapter.

Welcome criticism and correction, thank you! !!

## appendix

1. Official introduction document: https://medium.com/starkware/arithmetization-i-15c046390862

2. Official introduction document: https://medium.com/starkware/arithmetization-ii-403c3b3f4355