Vitalik 4D Long Text: What will happen to the hard core puzzles that plague cryptocurrencies in five years?

Written in front: This article is a hardcore long post published by Ethereum co-founder Vitalik Buterin. The article lists the problems he thought cryptocurrencies should solve five years ago, and gives progress five years later. These issues are mainly divided into three aspects: cryptography, consensus theory, and economics. Finally, Vitalik also added some new issues that we need to pay attention to today, and pointed out that the problems at the base layer will definitely be less and less, and the problems at the application layer have just begun.

The following is the full text:


In 2014, I published an article and a speech that listed the problems in the fields of mathematics, computer science, and economics, which I think are important for the maturity of the cryptocurrency field. The situation has changed a lot in the past five years. But how much progress did we think was important? In what ways have we succeeded? In what ways did it fail? In what ways have we changed our importance? In this article, I will discuss the 16 issues listed in 2014 one by one and see how we are progressing on each of them today. Finally, I will list my new definition of the 2019 puzzle.

These problems can be divided into three categories: (i) cryptographic problems, so if it can be solved, it is expected to be solved purely through mathematical techniques; (ii) consensus theory, mainly improvements in PoW and PoS; (iii) economic, oriented Different actors create structures that involve incentives, and often involve more application-level issues than protocol-level issues. We have seen significant progress in all of these areas, albeit with mixed results.

Cryptography issues

Blockchain scalability

One of the biggest problems facing the current cryptocurrency field is scalability … (the larger capacity blockchain) The main concern is trust: if only a few entities can run a full node, then these entities can conspire and agree to give themselves A lot of extra bitcoins, other users who haven't processed the entire block have no way to see if the block is valid.

Problem: Create a blockchain design to maintain bitcoin-like security guarantees, but the presence of strong nodes is required to ensure that the network operation can adapt to the number of transactions. Status: Large theoretical progress requires more real-world assessments.

Scalability is a technical issue and great progress has been made in theory. Five years ago, few people considered sharding; now, sharding design is common. In addition to Ethereum 2.0, there are also OmniLedger, LazyLedger, Zilliqa (and other projects are using sharding), and related research papers will be published almost every month. In my opinion, progress in this area is gradual. Fundamentally, we already have some technologies that allow the group of validators to safely reach consensus while processing more data compared to the environment of a single validator, and even allow the client to survive a 51% attack Indirectly verify the validity and availability of the block.

These are probably the most important technologies:

  • Random sampling, allowing a randomly selected committee to replace the full validator set: -in-an-uncoordinated-majority-model
  • Proof of fraud, allowing individual nodes to broadcast information about their existence to others after they learn about the error:
  • Escrow proof, allowing validators to probabilistically prove that they downloaded and verified some data individually:
  • Data availability proves that when the block header is unavailable, clients are allowed to detect their block content: , see the updated encoding Merkle tree proposal.

There are other smaller advances, such as cross-shard communication through receipts, and "constant factor" enhancements, such as BLS signature aggregation.

Despite this, fully sharded blockchains have not yet appeared in actual operation (Zilliqa with partial sharding has recently started running). In terms of theory, what remains is mainly the debate over details and the challenges related to the stability of the sharded network, developer experience, and reduction of the risk of centralization; there seems to be no longer any doubt about the basic technical possibilities. But the challenge that still exists is one that can't be solved only by thinking; only by developing a system and witnessing Ethereum 2.0 or some similar working chain can this technology prove to be really feasible.

2. Timestamp

Problem: Create a distributed incentive compatible system, whether it is an overlay on the blockchain or on its own blockchain, it can maintain the high accuracy of the current time. All legitimate users' time is normally distributed over a certain "real" time, with a standard deviation of 20 seconds … The interval between two nodes is not allowed to exceed 20 seconds. The solution can rely on the existing "N nodes" concept; in practice, this will be enforced through PoS or non-sybil tokens (see Article 9). The system should continue to provide a time, which should be within 120 seconds of the internal time (may be less if possible), greater than 99% of the time of the honest nodes. External systems may ultimately depend on this system; therefore, it should remain secure regardless of the incentive mechanism, preventing an attacker from controlling less than 25% of the nodes.

Status: Some progress.

In fact, Ethereum's block generation time is only 13 seconds, and it runs smoothly without special advanced time stamping technology; it uses a simple technique in which the client does not accept the time stamp earlier than the client A block of local time. In other words, this has not been tested by serious attacks. Recent network adjustment timestamp proposals seek to improve the status quo, allowing clients to agree on time without knowing the current time; this has not been tested. But in general, timestamps are not currently at the forefront of research challenges; perhaps when the PoS chain (including Ethereum 2.0 and other chains) is really online, we can see the problem and this will change again.

3. Arbitrary calculation proof

Problem: Create programs POC_PROVE (P, I)-> (O, Q) and POC_VERIFY (POQ-> {0,1}, so that POC_PROVE runs program P when inputting I and returns program output O. Calculation proves that Q and POC_VERIFY are used P, O, and Q and the output (prove) whether Q and O are generated by P using the POC_PROVE algorithm.

Status: Large theoretical and practical progress.

This is basically the process of building a SNARK (or STARK, or SHARK, or …). We have done it! SNARKs are now understood by more and more people, and have even been used on multiple blockchains (including on Ethereum). As a privacy technology (see Zcash and and capacity expansion technology (see ZK Rollup, STARKDEX, and STARKing erasure coded data roots), SNARKs are very useful.

There are still challenges in terms of efficiency; creating algorithm-friendly hash functions is an important challenge, and effectively proving random memory access is another. In addition, there is an unresolved question, is whether the enlargement of O (n * log (n)) is a basic limitation in the proof time, or is there a way to make a concise proof using only linear overhead? , Just like bulletproofs (unfortunately, this requires linear time to verify). In addition, the risk of deficiencies in existing solutions has always existed. In general, the problem is the details, not the rationale.

4. Obfuscating the code

The key is to create a fuzzy processor O, so that in the face of any program P, the fuzzy processor can produce a second program O (P) = Q. If the same input is given, P and Q can return the same output. Importantly, Q does not publish any internal information about P anyway. You can hide a password, a piece of encrypted key inside Q, or you can simply use the unique work of the Q hiding algorithm itself.

Status: Progress is slow.

Simply put, we want a way to "encrypt" a program so that the encrypted program still provides the same output for the same input, but the "internal components" of the program will be hidden. A typical use case for obfuscation is a program containing a private key that only allows the private key to sign certain messages.

The code obfuscation solution is very useful for blockchain protocols. Its application is subtle because it has to deal with the possibility that an obfuscated program on a chain will be copied and run in a different environment than the chain itself, but there are many possibilities. For me, I'm interested in being able to remove the centralization operator in the anti-collision tool and replace it with a fuzzer, which contains some PoW. When trying to determine the behavior of each participant, use different inputs to do more. The cost of this run is very high.

Unfortunately, this remains a problem. The work to solve this problem is still ongoing. On the one hand, it is to carry out structural work, trying to reduce the number of hypotheses of mathematical objects that we do not know in actual situations (such as general cryptography multilinear mapping). One party tried to actually deploy the required mathematical objects. However, all of these approaches are still far from creating something that works and is safe. See for a more comprehensive description of the problem.

5. Hash cryptography

Problem: Creating a signature algorithm does not rely on security assumptions, but the random oracle hash attribute has optimal size and other attributes compared to traditional computers that maintain 160-bit security.

Status: Some progress.

Since 2014, two advances have been made in this regard. SPHINCS is a "stateless" (meaning that it doesn't need to remember information like a random number multiple times) signature scheme, it was released shortly after the release of this "difficulty" list and provides a size of about 41 kB pure hash signature scheme. In addition, STARKs have been developed, and people can create similar-sized signatures based on them. In fact, not only signatures, but universal zero-knowledge proofs can be implemented with hashing, which I never thought of five years ago; I'm glad to see this happen. Despite this, capacity is still an issue, and continuous progress (recent DEEP FRI) is continuing to reduce the size of proof.

The main unsolved problem of hash-based cryptography is the aggregation signature, which is similar to the function implemented by BLS aggregation. As we all know, we can STARK many Lamport signatures, but this is very inefficient; a more effective solution will be welcomed. (If you want to know if hash-based public key encryption is feasible, then the answer is no, beyond the cost of the second attack, you can do nothing)

Consensus theory problem

6. Anti-ASIC PoW

One way to solve this problem is to create a PoW algorithm, which is based on a type of calculation that is difficult to specialize … for a more in-depth discussion of anti-ASIC hardware, see here .

Status: Efforts have been made to resolve it.

About 6 months after this "difficult" list was released, Ethereum determined its ASIC-resistant PoW algorithm: Ethash. Ethash is a memory-hard (hard memory) algorithm. The principle is that the random access memory in ordinary computers has been well optimized, so it is difficult to improve for special applications. Ethash's goal is to make memory access a major part of running PoW calculations to achieve ASIC resistance. Ethash is not the first memory-hard algorithm, but it does add an innovation: it uses pseudo-random lookup on a two-level DAG, allowing two methods to evaluate the function. First, if you have the entire (about 2 GB) DAG, you can calculate it quickly; this is the "fast lane" of memory-hard. Second, if there is only the highest level of DAG, it can be calculated more slowly (still able to quickly check a single provided solution); this is used for block verification.

Ethash has proven to be very successful in resisting ASICs; after 3 years and billions of dollars in block rewards, ASICs do exist, but their power consumption and cost efficiency are at most 2-5 times that of GPUs. ProgPoW has been proposed as an alternative, but more and more people believe that ASIC-resistant algorithms will inevitably have a limited lifetime and that ASIC-resistant algorithms are flawed because it makes 51% attacks cheaper.

I think it is possible to create a PoW algorithm that provides a moderate level of resistance to ASICs, but this resistance is limited, and both ASIC and non-ASIC PoWs have disadvantages; in the long run, a better choice for blockchain consensus is PoS.

7. Useful PoW

Make PoW useful while it works; a common alternative is Folding @ home. Users can download this software to a computer to simulate protein folding, providing researchers with a wealth of data to help them treat diseases.

Status: May not be feasible, with one exception.

The challenge of a useful PoW is that the PoW algorithm requires many attributes:

  • Difficult to calculate
  • Easy to verify
  • Does not rely on large amounts of external data
  • Can efficiently perform calculations on small data blocks

Unfortunately, not many useful calculations retain all of these properties, and most calculations that have all of these properties and are "useful" can only be "useful" for too short a time to build a cryptocurrency based on them.

However, there is one possible exception: zero-knowledge proof generation. All aspects of the effectiveness of the blockchain proven by zero-knowledge are difficult to calculate and easy to verify. In addition, this difficult-to-calculate feature can be durable; if the proof of "highly structured" computing becomes too simple, you can simply switch to verify the entire state transition of the blockchain, as virtual machines and random memory accesses need to be modeled This will become very expensive.

The zero-knowledge proof of the effectiveness of the blockchain provides great value to the users of the blockchain, because they can replace the need to directly verify the chain; Coda has already done so, although it uses a simplified blockchain design, And a lot of optimizations for verifiability. Such proof can significantly improve the security and scalability of the blockchain. In other words, the total amount of calculations that actually need to be completed is still much less than the calculations currently completed by PoW miners, so this is at best an additional PoS blockchain, not a comprehensive consensus algorithm.

8. PoS

Another method to solve the problem of mining centralization is to completely cancel mining and use other mechanisms to calculate the weight of each node in the consensus. So far, the most popular alternative in discussion is "PoS"-that is, the consensus model is no longer considered as "one CPU, one vote", but "one coin, one vote".

Status: Large theoretical progress, waiting for more realistic assessments.

At the end of 2014, the PoS community clearly realized that some form of "weak subjectivity" was inevitable. In order to maintain economic security, the node needs to obtain the latest checkpoint additional protocol when it is synchronized for the first time. If it is offline for more than several months, it needs to obtain it again. This is a pill that is difficult to swallow; many supporters of PoW still insist on using PoW, because in the PoW chain, the unique data from trusted sources (the blockchain client itself) can be used to discover the " head". However, since the additional trust requirement is not high, PoS advocates are willing to swallow the pill. From there, the path to PoS through long-term secure deposits became clear.

Most interesting consensus algorithms today are basically similar to PBFT, but replace a fixed set of validators with a dynamic list, and anyone can withdraw funds by sending tokens to system-level smart contracts with time locks (in some In some cases, withdrawals may take up to 4 months to complete) to join the list. In many cases (including Ethereum 2.0), these algorithms achieve "economic termination" by punishing verifiers who violate the protocol.

Today we have:

  • Casper FFG:
  • Tendermint:
  • HotStuff:
  • Casper CBC:

There are refinements going on (eg here and here ). Eth2 Phase 0, where FFG will be deployed, is currently being implemented and great progress has been made. In addition, Tendermint has been running as a Cosmos chain for several months. In my opinion, the remaining debate about PoS is related to optimizing economic incentives and further developing strategies to deal with 51% attacks. In addition, the Casper CBC specification can still use specific efficiency improvements.

9. Proof of storage

The third way to solve this problem is to use rare computing resources other than computing power or currency. In this regard, the two main options proposed are storage and bandwidth. In principle, there is no way to provide a post-encryption proof that the bandwidth is used, so to be precise, the bandwidth proof should be a subset of the social proof, which will be mentioned later, but the storage proof can definitely be calculated. One advantage of the storage proof is that it is completely ASIC-resistant; our storage on hard disks is close to optimal.

Status: There are many theoretical advances, although there is still much to be done, and more realistic assessments.

Some blockchains plan to use proof-of-storage protocols, including Chia and Filecoin. In other words, these algorithms have not been tested in the real world. My main concern is centralization: are these algorithms actually dominated by small users using spare storage capacity or are they dominated by large mines?


10. Stable value crypto assets

One of Bitcoin's main problems is price volatility … Problem: Constructing a crypto asset with stable price.

Status: Some progress.

MakerDAO is online and has been stable for nearly two years. The value of its underlying backed assets (ETH) has fallen by 93%, but it has survived and now has issued more than $ 100 million in DAI. It has become the backbone of the Ethereum ecosystem, and many Ethereum projects have already or are deploying this project. Other synthetic token projects, such as UMA, are also rapidly developing.

However, although the MakerDAO system has withstood the severe economic situation in 2019, these conditions are by no means the most severe. In the past, Bitcoin has fallen by 75% in two days; the same could happen to Ethereum or any other mortgage asset. Attacks on the underlying blockchain are a greater untested risk, especially if prices fall at the same time. Another major and arguably bigger challenge is that the stability of systems such as MakerDAO depends on some underlying oracle machine solutions. Different attempts at oracle systems do exist (see Article 16), but how long they will last in the face of great economic pressure is inconclusive. So far, the collateral controlled by MakerDAO has been lower than the value of MKR tokens; if this relationship is reversed, MKR holders may have an incentive to "loot" the MakerDAO system. There are many ways to prevent this attack, but it has not been tested in real life.

11. Decentralized public product incentives

In general, one of the challenges facing economic systems is the issue of "public goods". For example, suppose there is a scientific research project that will cost $ 1 million to complete, and it is well known that if it is completed, it will save $ 5 per person for 1 million people. Overall, the social benefits are obvious … (but) from a personal perspective, it makes no sense to contribute … so far, most problems with public goods have involved additional assumptions about centralization and Requirement: There is a completely reliable oracle to determine if a public product task has been completed (actually this is wrong, but this is another problem area).

Status: Some progress.

Funding public goods can generally be understood as being divided into two issues: the issue of funding (where to get funding) and the problem of summing preferences (how to determine what is truly public interest, not personal projects). This issue focuses specifically on the former, assuming the latter has been resolved.

Overall, there are no major new breakthroughs here. There are two broad categories of solutions. First, we can try to inspire individual contributions and give people corresponding social returns. My own proposal for charity through marginal price discrimination is one example; the other is the anti-malaria donation badge on Peepeth. Secondly, we can raise funds from applications with network effects. There are several options in the blockchain space:

  • Issue currency
  • Charge part of the transaction fee at the agreement level (eg EIP 1559)
  • Charge some transaction fees from some second-tier applications (such as Uniswap, or some extensions, and even use state leases in the execution environment of Ethereum 2.0)
  • Charge part of other fees (such as domain registration)

Outside the blockchain space, this is just an age-old question: how do you collect taxes if you are a government? How much do you charge if you are a business or other organization?

12. Reputation system

Question: Design a formal reputation system, including a score rep (A, B)-> V, where V is the reputation of B from A's point of view. This mechanism can determine the possibility of a party being trusted, and based on special disclosure Or final interaction records to update reputation.

Status: Progress is slow.

Since 2014, there has been little research on reputation systems. Perhaps the best way is to use a registry managed by the token to create a managed list of trusted entities or objects; Kleros ERC20 TCR is an example, and there is even an alternative interface to Uniswap, which uses the former as a backend to obtain tokens and stocks. List of codes and logos. Subjective types of reputation systems have not really been tried, perhaps because there are not enough "social graphs" of people-to-people connections, and this information has been published in some form. If this information starts to exist for other reasons, the subjective reputation system may become more popular.

13. Certificate of Excellence

This is an interesting, but largely unexplored, ad hoc approach to token allocation (with reasons to explain why it cannot be used so easily for mining) is to use socially useful but need primitive human drivers Tasks for creative effort and talent. For example, a "proof of proof" currency can be proposed, rewarding participants for the mathematical proof of certain theorems.

Status: No progress, this issue is forgotten.

The main alternative method of token distribution is airdrop; usually, one token is allocated proportionally to the existing holdings of other tokens when it is launched, or based on other indicators. Direct verification of human creativity has not really been attempted, and with recent advances in artificial intelligence, the problem of creating a task that only humans can but computer verify can be too difficult.

14. Decentralized contribution indicators

Unfortunately, stimulating the production of public goods is not the only problem with centralized solutions. Another issue is deciding, first, which public goods are worth producing, and second, determining how much a particular achievement has actually completed the production of public goods. This challenge involves the latter issue.

Status: Some progress, some changes.

Recent research on determining the contribution value of public goods does not attempt to separate the determination of tasks from the determination of the quality of completion; the reason is that in practice the two are difficult to separate. The work done by a particular team is often irreplaceable and very subjective, so the most reasonable approach is to package the relevance of the task and the quality of the performance and use the same techniques to evaluate both.

Fortunately, much progress has been made in this regard, especially the discovery of secondary financing. Secondary financing is a mechanism under which individuals can donate to the project and then use a formula to calculate how much they should donate if they cooperate well with each other (Considering each other's interests, they have not become victims of the tragedy of the commons). The difference between the amount we donate and the amount actually donated will be provided to the project as a subsidy to a central funding pool (see Article 11 for funding sources of the central funding pool). Note that this mechanism focuses on meeting the value of a community, not meeting a set goal, whether or not someone cares about it. Due to the complexity of the value problem, this method may be more robust to the unknown.

Secondary financing has also been tried in real life, and has achieved considerable success in the recent gitcoin secondary financing. Some progress has also been made in improving secondary financing and similar mechanisms; in particular, paired bounded secondary financing to prevent collusion. There is also work to regulate and implement anti-bribery voting techniques to prevent users from proving to third parties who they voted for; this can prevent collusion and bribery attacks.

15. Anti-Witch Attack System

One issue related to the reputation system problem is the challenge of creating a "unique identity system"-a system that generates tokens to prove that an identity is not part of a witch attack … however, we want one more A better and more equal system; it can be said that one person, one vote is the most ideal.

Status: Some progress.

Many attempts have been made to solve this only human problem. The attempts that came to mind include (incomplete list!):

  • HumanityDAO:
  • Pseudonym parties:
  • POAP ("proof of attendance protocol"):
  • BrightID:

As people become more interested in technologies such as secondary voting and secondary financing, the demand for anti-witch attack systems is also growing. It is hoped that the continuous development of these technologies and new technologies can meet this requirement.

16. Decentralized Success Indicators

Problem: Propose and implement a decentralized method to quantify real-world variables … the system should be able to measure anything that humans can currently reach a general consensus on (such as asset prices, temperature, global carbon dioxide concentration).

Status: Some progress.

This is now commonly referred to as the "oracle machine problem." The largest known running instance of a decentralized oracle is Augur, which has processed multi-million dollar bet results. Token-managed registries (such as Kleros TCR) are another example. However, due to highly controversial issues or due to 51% attack attempts, these systems have not yet started actual testing. There are also studies of oracle machines that occur outside the blockchain domain in the form of "peer-to-peer prediction" literature; here you can see the latest developments in this field.

Another imminent challenge is that people want to rely on these systems to guide the transfer of asset quantities, and the scale of these asset transfers is greater than the economic value of the system's local token. In this case, the token holders theoretically have a collusion to give the wrong answer to steal funds. In this case, the system will fork, and the original system token may become worthless, but the original system token holders can still get a return from any asset transfer they misdirected. The stablecoin (see Article 10) is a special example. One solution would be a system that assumes that selfless honest data providers do exist and creates a mechanism to identify them. If someone starts a malicious vote, users who rely on the oracle can complete an orderly exit first. In any case, the further development of oracle technology is a very important issue.

New question

If I were to write a list of problems again in 2019, some would be a continuation of the above issues, but the key points would have major changes and major new issues. Here are some options:

  • Obfuscation of passwords: see point 4
  • Ongoing work on post-quantum cryptography: "structured" mathematical objects based on hashing and post-quantum security, such as elliptic curve homology, control curves …
  • Anti-collusion infrastructure: work in progress and detailed content here , including increasing privacy for operators, adding multi-party computing in the most practical way, and more
  • Oracle: Same as point 16 above, but removes the emphasis on "success indicators" and focuses on the issue of "getting real-world data"
  • Unique identity: See 15 points, but the point is an "absolute" solution: it's harder to get two identities, but getting multiple identities at the same time is neither possible nor harmful
  • Homomorphic encryption and multiparty computing: For practicality, continuous improvement is required
  • Decentralized governance mechanism: DAO is cool, but the current DAO is still very primitive; we can do better
  • Formal response to PoS 51% attack: work in progress and detail are here
  • More sources of funding for public goods: The ideal approach is to charge for crowded resources (such as transaction fees) within a system with network effects, but in a decentralized system, doing so requires public rationality; therefore, A social problem, but also a technical problem of finding possible sources
  • Reputation system: see Article 12

In short, although the problems at the base layer are slow but will definitely decrease, the problems at the application layer have just begun.