Why choose a cryptographic algorithm for privacy protection? What magical mathematical theories are behind cryptographic algorithms? When is 3 greater than 9? How exactly is the illusion of computational reversibility broken in mathematics?
Here, we will start from the theoretical foundation of cryptographic trust and share some thoughts on the application of cryptographic technology in privacy protection technical solutions: how to understand the capability boundaries of cryptographic algorithms, and how to objectively compare the effectiveness of different cryptographic algorithms for privacy protection solutions. Sexual impact .
- Economic Daily Review Blockchain: Embracing the real economy
- Director of Monetary Authority of Singapore talks about Ubin project: more than 40 cooperative institutions, looking forward to working with China
- "Chain Shanghai South" plan announced! Hainan Free Trade Zone (Hong Kong) Blockchain Pilot Zone Releases "Six Chains"
- Babbitt Column | Stanford: 2019 Blockchain Social Impact Report
- Vitalik Buterin: Persuading the Ethereum Foundation to sell $ 100 million ETH, and it is basically at the "highest point"
- Opinion: Blockchain games are still an experiment until commercial success
It all starts with the magical "asymmetry" of cryptography.
The magical "asymmetry"
As early as BC, ancient civilizations such as ancient Egypt, ancient Rome, and ancient Greece have begun to use cryptography to protect the confidentiality of information. The earliest asymmetry in history is the use of special information encoding methods. If a third party does not know In a specific encoding method, it is difficult to decode the corresponding information.
After about 4,000 years of development, in the early 20th century, modern cryptography was formally introduced, introducing a more rigorous mathematical definition of asymmetry. More representative early papers include "Cryptography in an Algebraic Alphabet" published by Lester S. Hill in the American Mathematical Monthly in 1929.
At the end of the 20th century, with the popularization of the Internet, a large amount of sensitive data was transmitted on the network, resulting in a large number of data content protection needs, and the cryptographic technology has developed rapidly.
In modern cryptography, the most familiar concept of asymmetry is "public key" and "private key".
Take encrypted communication as an example. The protagonist Xiaohua wants to send an email to his friend Meili by encryption. You can find the beautiful public key first, use the public key to encrypt the message content, and get the encrypted ciphertext. Send to beautiful. After receiving the ciphertext of the email content, Meimei decrypted it with her private key, and finally got the plaintext of the email content.
- Why only beautiful can decrypt the message content?
- Why can't someone else infer her private key with a beautiful public key?
The answers to these questions must be attributed to the theory of computational difficulty in cryptography.
Computational difficulty theory
In the privacy protection scenario, the computational difficulty theory is specifically manifested in that the computational difficulty of obtaining the same information through different computation paths for the same privacy data subject is asymmetric. In asymmetry, relatively easy calculations are used to construct authorized data access, while difficult calculations are used to avoid unauthorized data leakage.
There are many ways to construct such asymmetry. One of the most classic ways is one of the seven major problems of the millennium-the P and NP problems.
The P problem is a deterministic Turing machine, that is, a general-purpose computer computing model, which can be calculated to obtain the answer in polynomial time (O (n ^ k)). The NP problem is a type of problem in which the deterministic Turing machine can verify the correctness of the answer in polynomial time, but it may not be able to calculate the answer.
Regarding the same answer, the verification process is much easier than the calculation process, so we can construct the asymmetry of the computational difficulty required by the cryptographic algorithm.
Can the NP problem be transformed into a P problem by an effective polynomial time algorithm, thereby cracking the asymmetry of computational difficulty? At present, there is no conclusion in the academic circle.
Theoretical research further shows that for the core problem in the NP problem set, that is, the NP complete problem, if an efficient polynomial time algorithm can be found to solve any NP complete problem, all other NP problems can be constructed based on this algorithm to effectively Polynomial time algorithm. As a result, the asymmetry in computational difficulty mentioned previously will no longer exist.
Fortunately, after nearly 70 years of scientific exploration, such an algorithm has not been discovered. In a limited time, it is difficult for modern computers to solve the answers to these problems, so modern cryptography can safely construct effective cryptographic algorithms based on these NP-complete problems.
The magical "calculation problem"
Visually speaking, the core of the theory of computational difficulty is to construct a maze. If you do not know the shortcut, it is difficult to reach the exit.
The effectiveness of various cryptographic algorithms that we use every day is closely related to this theory. Here we focus on the asymmetric cryptography algorithm as an example to introduce the classic blueprint of maze construction, which is the three major computational difficulties:
Difficulty in factoring large numbers
Discrete logarithm problem
Difficult logarithmic problems on elliptic curves
Difficulty in factoring large numbers
Given two large prime numbers p and q, it is easy to calculate n = p * q. However, given n, solving p and q is difficult.
The prime factorization of integers is one of the most famous problems in number theory. At present, the most effective method for solving the prime factorization is called the number field sieve method, that is, iterative calculation of the possible set of integers by constructing an algebraic number field.
At present, there is no more effective decomposition method for large integer factorization. Therefore, some cryptography schemes use the problem of large number factorization to construct corresponding protocols. For example, the RSA series algorithm reduces its difficulty to the problem of large number factorization. If the difficult problem of large integer decomposition is cracked, the private data protected by the RSA cryptographic scheme will also be cracked accordingly.
Discrete logarithm problem
In a finite field with modulus n and generator g, given the integer a, it is easy to calculate g ^ a = b. However, it is difficult to calculate a given b and g.
Many new readers of cryptography will have the illusion of computational reversibility on the discrete logarithm problem. It seems to be a log operation, but the truth is not the case.
In the real number field, elements have a very important property, total order relationship, so it is easy to compare sizes. For example, if 9> 2 and 3> 2 in the real number domain, 9> 2 must be derived.
When calculating log2 (9), the computer performs a binary search on the result of the function with element 9 as the input, first calculates (9/2) ^ 2 and 9 to compare, and then calculates ((0 + 9/2) / 2 ) ^ 2…. The end result of the log is calculated by constantly comparing the properties of the element sizes.
However, in finite fields, there is no total order relationship between elements. In a finite field with a modulus of 7, we can see that relations such as 9 equals 2, and 3 is greater than 9.
Therefore, a in the above process cannot be calculated by an effective algorithm. Many well-known cryptographic protocols are based on discrete logarithmic difficulties, such as the Diffie-Hellman key exchange protocol, ElGamal encryption, and DSA algorithms.
Difficult logarithmic problems on elliptic curves
At present, elliptic curve cryptography algorithms are the mainstream of current cryptographic applications. Each privacy data can be expressed as a point on the elliptic curve in the form of coordinates (x, y). Similar to the general discrete logarithmic difficulty problem, the discrete logarithm difficulty problem on an elliptic curve can be expressed as:
For an elliptic curve group over a finite field F, point P is a point on the curve. Given an integer a, it is easy to calculate a * P = Q. However, calculating a from P and Q is difficult.
Unlike ordinary algebra operations, point operations on elliptic curves are defined as follows:
It can be seen that the point operation on the elliptic curve is very different from the operation on the ordinary real number field. At present, there is no effective algorithm to decode the discrete logarithm problem of the elliptic curve. At present, the most commonly used public-key cryptosystems ECDSA, EdDSA, and State Secret SM2 are based on this difficult problem.
Objectively compare different cryptographic algorithms
Because different cryptographic algorithm constructions use different difficult problems, correspondingly, different difficult problems are bound to introduce different security assumptions.
Understanding these security assumptions is the key for companies to make technical selections and objectively judge the strength of weak privacy protection schemes based on different cryptographic algorithms.
Here, we need to further introduce the concept of "safety parameters".
A security parameter is a value that measures the strength of a cryptographic algorithm to protect private data. For the security parameter values at the "same level", the security levels of different cryptographic algorithms are basically the same, that is, in the face of the most effective attack method known, the probability that the algorithm is cracked will lead to the same leakage of private data.
In general, the size of the security parameter value directly reflects the length of the key. At the same level, there are large and small security parameter values, and the corresponding key length is also different.
Based on the minimum length of the cryptographic algorithm key for different difficult problems, the National Institute of Standards and Technology NIST recommends the following, where each cell represents the minimum number of bits required to use the key length.
From the above table, we can see that even if the key length is the same, the security level obtained by choosing different difficult problems is different. Generally speaking, the technical scheme constructed based on the same difficult problem, the longer the key length, the higher the security, and accordingly, the lower the system efficiency, which is often accompanied by trade-offs in other system designs.
Different scenarios should select the appropriate technical solution and key length according to business needs. The following points need special attention:
The security of a privacy protection technology solution depends on the lowest level of security parameters in the implementation of the cryptographic algorithm it uses.
Comparing the security of cryptographic algorithms without specifying security parameters has no practical significance.
If the value of the security parameter is small, and generally the corresponding key length is short, no matter how delicate the design of the cryptographic algorithm is, the actual effect may be insecure.
Due to the differences in the choice of difficult problems, the theoretical strength of cryptographic algorithms is not the strongest, only it is strong enough to meet specific security assumptions, and it is usually not practical to force the comparison of cryptographic algorithms based on different difficult problems.
In the final analysis, the difficulty of calculation is still a calculation problem. As the computer's computing power increases or the progress of algorithm theory advances, the security of these problems will change. For example, the RSA encryption algorithm and NIST key management guidelines believe that after 2010, 1024-bit keys are no longer secure and need to be increased to 2048-bit key lengths. It is expected that their security and effectiveness can be maintained until 2030.
For enterprises, the inspiration here is that we cannot simply think that the privacy protection technology solution is effective now, and it will be guaranteed to be effective 10 years later. No matter what kind of privacy protection technology scheme has its timeliness.
If an enterprise can update the existing system in a timely manner according to the security parameters and algorithm schemes recommended by authoritative technical organizations, the difficulty theory can effectively guarantee the effectiveness of privacy protection technology schemes as long as new.
Exactly: Cryptography is easy to defend and difficult to attack, and difficulty theory comes first!
As the cornerstone of cryptographic security, computationally difficult problems and related security parameters are the key points for enterprises to effectively select cryptographic algorithms. When an enterprise application is implemented, it is necessary to fully consider the validity period of privacy data privacy, select an appropriate cryptographic algorithm and key length, and make necessary measurements on data security and system efficiency.
In addition to the computational difficulties directly related to cryptographic algorithms, a complete privacy protection technology solution usually requires the construction of cryptographic protocols to combine multiple cryptographic algorithms. Cryptographic protocols introduce interactions between multiple parties, which also introduces other important security assumptions.
These security assumptions are important to evaluate the overall security, effectiveness, and practicality of the privacy protection technology solution. For specific analysis, please pay attention to the decomposition below.