Science | What is Merkle Pollard?

In the field of cryptographic currency, the Merkel tree is a very efficient method for proving that a particular value exists in a large set of values ​​and that data storage is minimized . This article introduces the Merkel tree and shows how to greatly reduce duplicate proofs by storing multiple levels of Merkel tree branches instead of root nodes (ie, "Merkle pollard").

Hash function

The hash function can convert a piece of data of any length (ie, the name of the fruit in the image below) into a fixed-length value (ie, a "hash value"). For example, the hash values ​​for "Apple" and "Orange" in the image below are as follows:

– hash value –


There are many features of the hash function, the most notable of which are: (1) Even if the input value is only a fraction of a millimeter, the resulting hash value will be quite different; (2) mathematically difficult to base according to the hash value The input value is pushed out (usually, there is no faster method than repeated trial and error).

Merkel tree

A Merkel tree refers to a combination of multiple input values ​​and their hash values ​​compressed into a fixed length value.

At the top of the Merkel tree are various input values, called "leaf nodes." Each leaf node is hashed to get the upper layer branch, and the adjacent two branches are spliced ​​together and then hashed to get the middle branch. After layer by layer, finally get a hash value, the Merkel root node. An example diagram of the Merkel tree is as follows:

– Merkel tree –


The Merkel tree shown above has 8 inputs and is divided into 4 layers. The root node is the 0xd576…ffd9 at the very end.

As mentioned above, even if the input values ​​are very similar, the resulting hash values ​​are quite different. If the input value changes, it will affect the various levels of the Merkel tree, and finally get a completely different root node. For example, after changing "Peach" in the input value to "Pear", the Merkel tree will change, as shown in the following figure:

– The effect of a change on the entire Merkel tree (as indicated by the shaded gray part) –


The Merkel tree is reproducible: if the exact same input values ​​are arranged in the same order, the branches and root nodes of the Merkel tree will always get the same hash value.

Merkel path

The Merkel path refers to a collection of all hash values ​​between an input value and a Merkel root node. The figure below shows the Merkel path for the input value "Peach":

– Merkel path for "Peach" –

Merkel proof

Merkel's proof means that you don't need to know other values ​​in a data set to prove that a value belongs to this set.

– Merkel certification –

Merkel's proof requires three things: the input value (red mark), the middle branch hash (green mark), and the Merkel root node (blue mark). The set of intermediate branch values ​​corresponding to each input value is different.

Blockchain systems often use Merkel's proof to prove that there is an input value in a data set, so there is no need to store the entire data set in the blockchain . Suppose there is a whitelist in an Ethereum contract that only allows accounts in the list to purchase Ethereum. If each account information in the whitelist is stored in the blockchain, it is necessary to pay a high cost. In this case, you only need to create a Merkel tree and store the root node on the blockchain.

For example, if the root node is stored on a smart contract, the smart contract can easily prove that an account is included in the whitelist: the account needs to provide a middle hash value (the contract owner provides it through some sort of chain) For the account holder, the smart contract hashes the hash value of this account in turn with the intermediate hash value. If the final result is consistent with the Merkel root node, it proves that the account is indeed on the whitelist.

Note the relationship between the Merkel path and the hash value of the Merkel proof in the last two figures. In the same hierarchy of the same tree, the hash value of Merkel's proof is related to the hash of the Merkel path. Thus, Merkel proved to be able to reshape the Merkel path of the input value , which is why the end result is the Merkel root node.

At this point, it can be seen that the Merkel certificate has the following characteristics:

  • The space required to store the Merkel proof on the chain is much smaller than the space required to directly store the input values.
  • Publicly storing the Merkel certificate on the chain does not expose the entire set of input values.
  • To prove whether a value exists in a set of input values, verify that the cost of Merkel's proof is lower than the cost of checking the entire set of input values.

Duplicate proof

In the example above, each account only needs to send a Merkel certificate to verify that it is on the whitelist.

In addition, the Merkel tree can also be used as proof of probabilistic knowledge (often referred to as STARKs), and each proof of knowledge can make us (ie, “verifier, challenger”) more convinced: the creator of the Merkel tree (ie "certifier") knows all the constituent values. In this case, the certifier typically generates hundreds of proofs based on a Merkel tree containing dozens or even hundreds of input values. These certificates are sent to the verifier along with the Merkel root node to verify their validity.

Let us explore the duplicate proofs in the example above. The three figures below are three different proofs generated by the same Merkel tree:

– duplicate proof of the same Merkel root –

It can be seen that a total of one Merkel root node and three certificates are sent, adding up to a total of 10 hash values: one for the root node and three for each of the remaining three.

Is there a more efficient approach? It can be seen that the first level of the Merkel tree has only two values ​​c0b7…da30 and 6ff9…8e3d, but (when providing this level of hash value) three certificates have sent a total of 3 ha Greek value (one for each proof). So, if the part provided at the beginning includes not only the lowest level of hash value (ie Merkelgen), but also a higher level of hash value, will the efficiency be higher?

– Duplicate proof of extended Merkel root nodes –

(Proofreading Note: Comparing the two sets of graphs, we can find that the first proof method needs to send 10 hash values, but the second proof method only needs to send 9, so it really improves the efficiency)

Merkel tree truncation

Extending Merkel root can also be said to be the truncation of the Merkel tree, that is, only the Merkel root node and a few layers of intermediate branches. The order in which Merkel trees are truncated is determined by the number of intermediate supports above the root node (a 0th-order Merkel truncation is Merkel root). The 1st order Merkel truncation consists of a layer of intermediate branches, as shown in the following figure:

– 1st order Merkel truncation –

The 2nd-order Merkel truncation consists of two layers of intermediate branches, as shown in the following figure:

– 2nd order Merkel truncation –

If there are multiple duplicate proofs for the same Merkel tree, using the Merkel tree truncation will reduce the size of the proof (because the hash value contained in each proof will decrease) and the time required to verify the proof (because each time Verify that the hash of the required calculation is reduced). To get the best order for Merck's truncation, just take the logarithm of the proof number and round it down. The following image is a low-order Merkel tree truncation table showing the space and time saved by a Merkel tree with 4096 input values, as follows:

– Advantages of different stages of Merkel tree truncation –

Using Merkel tree truncation can save a lot of storage space. For example, a STARK test proves that if you are using a Merkel root node, you need 564 KB of storage, and if you use a Merkel tree truncation, you only need 346 KB of storage, a 40% reduction. The time required to transmit and verify the proof will also be reduced.

Implementation example


Https:// provides Merkel tree truncation in Go language.


Original link:

Author: Jim McDonald

Translation & Proofreading: Min Min & A Jian

(This article is from the EthFans of Ethereum fans, and it is strictly forbidden to reprint without the permission of the author.