Dry goods | Understanding BLS signature algorithm

Editor's Note: The BLS signature algorithm is an algorithm that implements signature aggregation and key aggregation (that is, it can aggregate multiple keys into one key and aggregate multiple signatures into one signature). In the future Casper implementation of Ethereum, there are many certifiers who need to sign the block. To ensure the security of the system and save storage space, this kind of signature aggregation algorithm is needed.

In the previous article, I introduced the Schnorr signature algorithm and its advantages. Now, let me introduce the BLS (Boneh-Lynn-Shacham) signature algorithm and its advantages over Schnorr.

To make a long story short, we already know:

The ECDSA signature algorithm is sufficient for its work, but it is limited to this. It can't do signature aggregation or key aggregation, so only one signature can be verified. When verifying multi-signature transactions, this is too cumbersome. We need to verify all signatures and their corresponding public keys one by one, which consumes a lot of block space and transaction fees.

The Schnorr signature algorithm is much better. It combines all the signatures and public keys in a transaction into a single signature and public key, and the merge process is invisible (no trace of whether the signature or public key was merged). In addition, the combined signature can be verified at one time, speeding up the block verification. Of course, the algorithm also has some shortcomings:

  • Multiple signatures require multiple (signer-to-signer) communications, which is too much trouble for a cold wallet.
  • The aggregate signature algorithm relies on a random number generator instead of the specified random point (R) like ECDSA.
  • The mn multi-signature mechanism is quite tricky and requires a Merkel tree to construct a public key. When m and n are large, the tree takes up a lot of space.
  • Unable to aggregate all signatures in a block into one signature

The BLS signature algorithm can solve the above problem. It does not require a random number generator, it can aggregate all the signatures in the block into one, easy to implement mn multi-signature, and avoid redundant communication between signers. In addition, the length of the BLS signature is shorter (the signature is a point on the elliptic curve instead of two) and is one-half of the Schnorr or ECDSA. Sounds perfect! So let's see how the BLS signature algorithm works.

The magic of the BLS signature algorithm

Before entering the topic, let's first understand two basic concepts, the hashing of the curve (or the "hash on the curve") and the curve pairing (curves pairing).

Curve hash

In the ECDSA and Schnorr signature algorithms, after we hash the message, the result (hash) is a number. The BLS signature algorithm is different. It slightly modifies the hash algorithm and the result corresponds to a point on the elliptic curve. The simplest modification is that the hash function remains the same and the resulting hash value is used as the x value of the point to find the corresponding point on the elliptic curve. In general (such as the curve used by Bitcoin), the elliptic curve has 2256 points, and the value of the SHA-256 hash algorithm is exactly 256 bits. However, a valid x coordinate will correspond to a positive, negative and negative y coordinate (since (x, y) and (x, -y) are the points on the curve y2=x3+ax+b ). In other words, the new hash algorithm has a 50% probability of finding 2 corresponding points on the curve, and another 50% probability is not found at one point ( proofreading note: because the elliptic curve has only 2^256 points, If you want each hash to find the corresponding point, the elliptic curve has 2^257 points.)

Take the elliptic curve y2=x3+7 defined on the finite field of the modulo 23 as an example. Only half of the x coordinates can find the corresponding point on the curve. In this case, we tried three times to successfully find the corresponding point.

When hashing a message, to ensure that the corresponding point can be found on the curve, a number can be appended to the message body. If the (search for the corresponding point) fails, the number is accumulated and recalculated. That is, if hash(m||0) does not find the corresponding point, it continues to try hash(m||1), hash(m||2), etc. until it is found. When the corresponding point is found, the one with the smaller y coordinate is selected as the result among the 2 points.

Curve matching

We also need a special function that maps two points P and Q on one (or two different) curves into one:

  e(P, Q) → n 

This function also has an important feature. That is, for the unknown number x and the two points P and Q, no matter which point is multiplied by x, the result is the same, ie

  e(x*P, Q) = e(P, x*Q) 

In this way, in addition to the multiplier exchange, the equation can be maintained. Further, all the following exchanges must maintain the equation:

  e(a*P, b*Q) = e(P, ab*Q)
             = e(ab*P, Q)
             = e(P, Q)^(ab) 

I am not going to explain how this function works. The math behind it is quite complicated. If you want to know all the "obsessive" details, I suggest you refer to this article. If you want to go deeper, you can study this paper, which tells the theory of pairings. Currently, we only assume that this function exists and does not expose any relevant information about x.

It is worth noting that any elliptic curve (especially the secp256k1 elliptic curve of Bitcoin) cannot be used in the pairing function. We must use very special curves (usually from clusters of curves that are easy to pair) to ensure the efficiency and safety of the function.

BLS signature scheme

Now, the basics of all building BLS signature algorithms are in place. We use pk for the private key, P = pk*G for the public key, and m for the message to be signed.

To calculate the signature, first curve the message hash H(m) , and then multiply the obtained result (curve coordinate point) by the private key: S = pk*H(m) . You're done! No random numbers are needed, no extra steps are required, just multiply the hash result by the private key. The result of the signature is a point on the curve, saved in a compressed serialized format, which occupies only 33 bytes.

– Generate a BLS signature. Multiply the hash result of the message by the private key –

We can use the public key P to verify the signature, ie e(P, H(m)) = e(G, S) . Why is this?

As mentioned earlier, the characteristics of the pairing function make the following equation true:

  e(P, H(m)) = e(pk*G, H(m))
             = e(G, pk*H(m))
             = e(G, S) 

– BLS signature verification. We only need to verify that the public key and the hash of the message (two points on the curve) and the curve generation point and the signature (the other two points on the curve) map to the same number (if it is a valid BLS signature) )-

This signature scheme is beautiful and simple. For Bitcoin, the program has the following more features.

Signature aggregation

Now let's get together the signatures in the block. Suppose there are 1000 transactions in a block, and each transaction consists of Si (signature), Pi (public key) and mi (message) ( i is the serial number here). If these signatures can be merged, why should they be saved separately? After all, we only care about whether all the signatures in the block (not each one) are correct. To get an aggregate signature, just add all the signatures in the block:

  S = S1 + S2 +...+ S1000 

In order to verify that the block is correct, we need to ensure that the following equation is true:

  e(G, S) = e(P1, H(m1)) * e(P2, H(m2)) *...* e(P1000, H(m1000)) 

If the signature is valid, then the equation is true:

  e(G, S) = e(G, S1+S2+...+S1000)
         = e(G, S1) × e(G, S2) *...* e(G, S1000)
         = e(G, pk1×H(m1)) *...* e(G, pk1000×H(m1000))
         = e(pk1×G, H(m1)) *...* e(pk1000×G, H(m1000))
         = e(P1, H(m1)) × e(P2, H(m2)) *...* e(P1000, H(m1000)) 

Here we still need to use all the public keys and calculate 1001 pairing functions. The advantage is that the signature in the block is only 33 bytes. Signature aggregation can be done by miners during mining, saving a lot of block space.

Key aggregation and nn multi-signature

 

When using a multi-signed address, we will sign the same transaction with a different key. In this case, the aggregation key can be used to aggregate all keys and signatures into a single public key and signature, just like the Schnorr algorithm. Let's take the 3-3 multi-signature scheme as an example (the same reason we can derive any number of multi-signature schemes).

A simple aggregation method is to add all the signatures and keys together. Thus, the result of the signature aggregation is S=S1+S2+S3 , and the key aggregation result is P=P1+P2+P3 . It can be verified that the following equation is still true:

  e(G, S) = e(P, H(m)) 

because:

  e(G, S) = e(G, S1+S2+S3)
         = e(G, (pk1+pk2+pk3)×H(m))
         = e((pk1+pk2+pk3)×G, H(m))
         = e(P1+P2+P3, H(m)) = e(P, H(m)) 

Like Schnorr, we also need to eliminate fake key attacks. One method is to require each signing participant to prove that it owns the private key corresponding to the public key (signing the public key with the private key). Another method is to add nonlinear coefficients so that the attack cannot be implemented. To do this, aggregation is no longer simply adding multiple keys and signatures, but multiplying them by a coefficient and adding them:

S = a1 × S1 + a2 × S2 + a3 × S3

P = a1 × P1 + a2 × P2 + a3 × P3

The coefficients of the signature and key in the formula can be calculated by the signer and the public key of everyone else. The formula is as follows:

  Ai = hash(Pi, {P1, P2, P3}) 

For example, you can simply splicing the signer's public key and the owner's public key together to calculate the coefficients:

  Ai = hash(Pi||P1||P2||P3) 

At this point, the above verification formula is still valid. Although the coefficient ai is more , the calculation logic has not changed.

The advantage of this solution is that there is no need to carry out multiple rounds of communication between devices, just know the corresponding information of other signers. It is much simpler than the multi-signature scheme of the Schnorr algorithm (which requires 3 rounds of communication). This scheme is also independent of randomness and is a signature algorithm with complete certainty.

Subgroup multi-signature scheme (mn multi-signature)

 

Nn Multi-signature is not common, we prefer to use mn multi-signature (such as 2-3 multi-signature). We don't want to have nothing because we lost one key (of n keys). Key aggregation is ideal for this scenario. In the Schnorr signature algorithm, we use a Merkel tree of public keys to implement key aggregation, which is not efficient, but will be useful. Unfortunately, as the values ​​of m and n become larger, the size of the Merkel tree grows exponentially.

BLS uses another method, but it's a bit more complicated. We need a normal hash function hash(x) (the result is a number) and a curve hash function H(x) . When you start multi-signature, you also need an initialization process. After that, the signers do not need to communicate anymore, just provide the transaction signature.

For a simple example, we want to create a 2-3 multi-signature, and 3 signatures are stored on different devices (this example can be extended to any mn multi-signature).

Initialization (generating member key)

Use i = 1, 2, 3 to represent the device at the corresponding position in the set, pki for the private key, and Pi = pki × G for the public key. We use the method described above to aggregate the public key:

P = a1 × P1 + a2 × P2 + a3 × P3, ai = hash(Pi, {P1, P2, P3})

Now, each device needs to sign each i to prove that i is a member of the aggregated public key. After the signature is aggregated, it is saved in the corresponding device:

  MKi = (a1*pk1)*H(P, i) + (a2*pk2)*H(P, i) + (a3*pk3)*H(P, i) 

This signature is called a "member key" and will be used later when signing. Each member key is a nn multi-signature to the message body H(P,i) , ie:

e(G, MKi)=e(P, H(P,i))

because:

  e(G,Mki) = e(G, (a1*pk1)*H(P, i) + (a2*pk2)*H(P, i)
               + (a3*pk3)*H(P, i))
          = e(G, (a1*pk1 + a2*pk2 + a3*pk3)*H(P, i))
          = e(G*(a1*pk1 + a2*pk2 + a3*pk3), H(P, i))
          = e((G*a1*pk1 + G*a2*pk2 + G*a3*pk3), H(P, i))
          = e(P, H(P,i)) 

Remember this equation, which will be used later. It is used to prove that a device is a legitimate participant in a multi-signature.

signature

Assuming that only the private key pk1 and pk3 are used to sign the transaction, we will generate 2 signatures S1 and S3 :

  S1 = pk1 × H(P, m) + MK1, S3 = pk3 × H(P, m) + MK3 

Add them together and aggregate them into a single signature and public key:

  (S', P') = (S1+S3, P1+P3) 

P' and S' are used to emphasize that they are only calculated by the partial signer (public key and signature), and not like P is the calculation of all the signers (public key). In order to verify 2-3 multi-signatures, it is necessary to prove that the following equation holds:

  e(G, S') = e(P', H(P, m)) * e(P, H(P, 1)+H(P, 3)) 

As stated above, the member keys MK1 and MK3 are signatures for the messages H(P, 1) and H(P, 3) (the message itself is signed by the aggregated public key P ), so there are:

  e(G, S') = e(G, S1+S3)
          = e(G, pk1*H(P, m)+pk3×H(P, m) + MK1 + MK3)
          = e(G, pk1*H(P, m)+pk3×H(P, m)) * e(G, MK1+MK3)
          = e(pk1*G+pk3*G, H(P, m)) * e(P, H(P, 1) + H(P, 3))
          = e(P', H(P, m)) * e(P, H(P, 1)+H(P, 3)) 

The proof is completed. Multiple signatures are more complicated than nn, but they are still understandable.

Implementation in Bitcoin

To deploy a BLS multi-signature wallet on Bitcoin, we need to make money to an address that is generated by the aggregated public key P. Suppose we want this to be a 2-n multi-signature contract, then we can use the bitcoin lock script to describe it as follows:

  OP_2 <P> OP_CHECK_BLS_MULTISIG 

OP_2 indicates that 2 keys are required for signing. There is no indication that there are a total of 3 signatures, so it can represent different multi-signature addresses such as 2-3 or 2-100. As such, the total number of signatures will never be exposed.

In order to spend money in the output address, you need to provide P' (public key), S' (signature), and the number of the signer (in this case, 1 and 3). The unlock script is as follows:

  OP_1 OP_3 <P'> <S'> 

This information in the script is enough to verify the transaction. Among them, OP_1 and OP_3 tell us that the curve hashes that need to be calculated are H(P, 1) and H(P, 3) .

Thus, for any multi-signature of m and n , we only need the following information:

  • An aggregate public key P used in the lock script
  • An aggregated public key P' generated by some participants ( m ) (used in unlocking)
  • a polymeric signature S'
  • Number of participants required to unlock m

Everything is simple and beautiful!

Since the bitcoin address is usually used only once, it is necessary to use BIP32, a key derivation algorithm, to generate a new private key and address. Therefore, each time we generate a new private key, we also need to generate the corresponding member private key, which I don't like very much. In order to avoid generating a member private key after each transaction, a batch (such as 1000) member private key can be generated at one time. After all, each private key only occupies 32 bytes. So, when 1000 addresses are used up, we need to initialize again.

Disadvantage

Given everyone's comments and twitter reminders, I realize that some places I think about owing weeks will make you mistakenly think that the BLS signature algorithm is perfect. It is not perfect. Thank you for your reminder!

I completely ignored the high complexity of pairing calculations. We used to think that the pairing function e(P, Q) is both efficient and safe, in fact it is not completely.

Pairing function is not so efficient

 

BLS signature verification is an order of magnitude more complex than ECDSA. When verifying the aggregated signature of 1000 transactions in a block, 1000 pairing calculations are still required, which may be slower than when using ECDSA (verification for 1000 individual signatures). The only benefit is that you can put more transactions in the block, after all, the aggregate signature is only 32 bytes.

Unlike BLS, the Schnorr verification algorithm is very efficient, it can verify signatures together, and the efficiency is three times that of the ECDSA algorithm. In this way, the question comes, which is more important to us for efficiency and safety?

Want to be safe, harder!

 

The pairing function is very complicated and will be countered if it is used improperly. On the one hand, we want to use efficient pairing functions to speed up signature verification. On the other hand, we want the key information to be exposed as little as possible. The two are contradictory, and we need to choose the curve that is suitable for pairing with extreme care.

There is an attack against the elliptic curve cryptosystem called the MOV attack (named after the attackers Menezes, Okamoto, and Vanstone), which can use the pairing function to compromise system security. Therefore, you must be careful when using the pairing function.

to sum up

 

The BLS signature algorithm is excellent – it aggregates the signatures in the block into a single signature; it can perform key aggregation and mn multi-signature (no additional communication required); it avoids the use of random number generators. These advantages make it so simple and elegant. Of course, there is still room for improvement, and standardization and optimization will take time. But I hope that one day, it will be good enough to be included in the Bitcoin protocol. In this way, we can not only get its excellent features, but also enjoy the benefits of a smaller, more aggregated signature algorithm.

Dan Boneh, the first author of the BLS algorithm, is currently working on cryptocurrency, which makes me very excited. He is an outstanding cryptographer and his cryptography course is also great. Here, I would also like to recommend his cryptographic work that he has not yet completed. I believe that in the near future, he and his team will bring more interesting solutions and protocol improvements for cryptocurrencies.

Original link: https://medium.com/cryptoadvance/bls-signatures-better-than-schnorr-5a7fe30ea716 Author: Stepan translation & proofreading: wuwei & A sword

(This article is from the EthFans of Ethereum fans, and it is strictly forbidden to reprint without the permission of the author.