Vitalik: Three Transformations Ethereum Needs to Complete – L2, Wallets, Privacy
Vitalik believes Ethereum needs to complete three transformations: L2 scaling, wallet improvements, and privacy enhancements.
Original Title: The Three Transitions
Original Author: vitalik
Original Source: vitalik.eth
Translation: MarsBit, MK
Special thanks to Dan Finlay, Karl Floersch, David Hoffman, as well as the Scroll and SoulWallet teams for their feedback, review, and suggestions.
When Ethereum transitions from a young experimental technology to a mature technology stack that can truly bring open, global, and permissionless experiences to ordinary users, the stack needs to undergo three major technical transitions, roughly simultaneously:
1. L2 scaling transition – Everyone migrates to rollups
2. Wallet security transition – Everyone migrates to smart contract wallets
3. Privacy transition – Ensuring privacy-preserving fund transfers, and ensuring that all other tools being developed (social recovery, identity, reputation) also preserve privacy
The ecosystem transition triangle. You can only pick three out of three.
Without the first, Ethereum fails because every transaction costs $3.75 (or $82.48 if we get another bull market), and every mass-market product inevitably gives up on the chain and adopts centralized workarounds for everything.
Without the second, Ethereum fails because users are uncomfortable storing their funds (and non-financial assets) themselves, and everyone moves to centralized exchanges.
Without the third, Ethereum fails because having all transactions (and POAPs, etc.) publicly visible to anyone is far too high a privacy sacrifice for many users, and everyone moves to centralized solutions that hide at least some of their data.
For these reasons, all three transitions are crucial. But they are also challenging, because properly resolving them requires intense coordination. It is not only protocol features that need to improve; in some cases, the way we interact with Ethereum needs to change in fairly fundamental ways, requiring deep changes in applications and wallets.
These three transitions will fundamentally change the relationship between users and addresses
In the L2 scaling world, users will exist in many L2s. Are you a member of ExampleDAO, which is on Optimism? Then you have an account on Optimism! Do you hold a CDP in the stablecoin system on ZkSync? Then you have an account on ZkSync! Have you tried some applications that happen to be on Kakarot? Then you have an account on Kakarot! The days when users had only one address are gone forever.
According to my Brave Wallet view, I have ETH in four places. Yes, Arbitrum and Arbitrum Nova are different. Don’t worry, this will become more complicated over time!
Smart contract wallets add further complexity by making it much harder to have the same address across L1 and the various L2s. Today, most users use externally owned accounts (EOAs), whose address is literally a hash of the public key used to verify signatures, so nothing changes between L1 and L2. With smart contract wallets, however, keeping one address becomes more difficult. Although a lot of work has been done to make addresses be hashes of code that can be equivalent across networks, most notably CREATE2 and the ERC-2470 singleton factory, it is difficult to make this work perfectly. Some L2s (such as “type 4 ZK-EVMs”) are not quite EVM-equivalent, often compiling from Solidity or an intermediate assembly instead, which prevents hash equivalence. And even when you can have hash equivalence, the possibility of wallets changing ownership through key changes creates other unintuitive consequences.
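The dependence of a CREATE2-style address on the deployed code can be sketched as follows. This is a minimal illustration, not a way to compute real Ethereum addresses: the EVM uses Keccak-256, which the Python standard library does not include, so SHA3-256 is used as a stand-in. The factory address, salt, and "bytecode" strings are placeholders invented for the example.

```python
import hashlib


def stand_in_hash(data: bytes) -> bytes:
    # ASSUMPTION: the EVM uses Keccak-256. hashlib.sha3_256 is the NIST
    # variant (different padding), used here only so the sketch runs
    # without third-party packages. Real addresses will differ.
    return hashlib.sha3_256(data).digest()


def create2_style_address(factory: bytes, salt: bytes, init_code: bytes) -> bytes:
    """Last 20 bytes of hash(0xff ++ factory ++ salt ++ hash(init_code))."""
    if len(factory) != 20 or len(salt) != 32:
        raise ValueError("factory must be 20 bytes and salt 32 bytes")
    preimage = b"\xff" + factory + salt + stand_in_hash(init_code)
    return stand_in_hash(preimage)[12:]


# Hypothetical inputs: the same factory address and salt on every chain.
factory = bytes.fromhex("11" * 20)
salt = (1).to_bytes(32, "big")
evm_wallet_code = b"wallet bytecode compiled for the EVM"
other_vm_wallet_code = b"same wallet compiled for a different VM"

addr_l1 = create2_style_address(factory, salt, evm_wallet_code)
addr_evm_l2 = create2_style_address(factory, salt, evm_wallet_code)
addr_other_l2 = create2_style_address(factory, salt, other_vm_wallet_code)

# Identical code gives an identical address; different code does not.
print(addr_l1 == addr_evm_l2)
print(addr_l1 == addr_other_l2)
```

Because the code hash is part of the preimage, an L2 that compiles the same wallet to different bytecode cannot reproduce the L1 address, which is the hash-equivalence problem described above.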
Privacy requires each user to have even more addresses, and may even change the kinds of addresses we deal with. If stealth address proposals are widely adopted, then instead of having just a few addresses, or one address per L2, each user may have one address per transaction. Other privacy schemes, including existing ones such as Tornado Cash, change how assets are stored in a different way: many users’ funds are stored in the same smart contract (and hence at the same address). To send funds to a specific user, users will need to rely on the privacy scheme’s own internal addressing system.
As we have seen, these three transitions have weakened the “one user ~= one address” mental model in different ways, and some of their effects are fed back into the complexity of executing the transitions. Two particular points of complexity are:
1. How do you get information on who to pay if you want to pay someone?
2. If a user stores many assets in different places on different chains, how do they do key rotation and social recovery?
The three transitions and on-chain payments (and identity)
I have coins on Scroll and want to pay for coffee (if “I” literally means me, the author of this article, then “coffee” is of course a metaphor for “green tea”). You are selling me coffee, but you only accept coins on Taiko. What do I do?
Basically, there are two solutions:
1. The recipient’s wallet (which could belong to a merchant or just an individual) makes an effort to support every L2 and has some automated functionality for consolidating funds asynchronously.
2. The recipient provides their L2 and their address, and the sender’s wallet automatically routes the funds to the target L2 through some kind of cross-L2 bridging system.
Of course, these solutions can be combined: the recipient provides a list of L2s they are willing to accept, and the sender’s wallet calculates the payment, which may involve a direct send (if they are lucky) or a cross-L2 bridging path.
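The combined approach can be sketched as a small routing function in the sender's wallet. This is only an illustration under simplifying assumptions: balances and fees are plain integers in one token, and the `Route` type, the `plan_payment` function, and the fee table are hypothetical names, not part of any existing wallet or bridge API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Route:
    kind: str          # "direct" or "bridge"
    source: str        # L2 the funds leave from
    destination: str   # L2 the recipient is paid on
    bridge_fee: int


def plan_payment(
    balances: Dict[str, int],
    accepted_l2s: List[str],
    amount: int,
    bridge_fees: Dict[Tuple[str, str], int],
) -> Optional[Route]:
    # Lucky case: the sender already holds enough on an accepted L2.
    for l2 in accepted_l2s:
        if balances.get(l2, 0) >= amount:
            return Route("direct", l2, l2, 0)
    # Otherwise pick the cheapest bridge into any accepted L2.
    best: Optional[Route] = None
    for (src, dst), fee in bridge_fees.items():
        if dst not in accepted_l2s:
            continue
        if balances.get(src, 0) < amount + fee:
            continue
        if best is None or fee < best.bridge_fee:
            best = Route("bridge", src, dst, fee)
    return best


balances = {"Scroll": 100}
fees = {("Scroll", "Taiko"): 2, ("Scroll", "Optimism"): 1}

print(plan_payment(balances, ["Taiko"], 50, fees))
print(plan_payment(balances, ["Taiko", "Scroll"], 50, fees))
```

The first call has to bridge from Scroll to Taiko; the second finds a direct send because the recipient also accepts Scroll.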
But this is just one example of a key challenge introduced by the three transitions: simple actions like paying someone start to require much more information than just a 20-byte address.
The transition to smart contract wallets fortunately places little burden on the address system, but there are still some technical issues to work through in other parts of the application stack. Wallets need to be updated so that they do not send only 21,000 gas along with a transaction, and, more importantly, so that the payment-receiving side of a wallet tracks ETH transfers not just from EOAs but also ETH sent by smart contract code. Applications that rely on the assumption that address ownership is immutable (e.g., NFTs that ban smart contracts in order to enforce royalties) will have to find other ways to achieve their goals. Smart contract wallets will also make some things easier: in particular, if someone receives only a non-ETH ERC20 token, they will be able to use ERC-4337 paymasters to pay for gas with that token.
On the other hand, privacy once again poses major challenges that we have not yet really solved. The original Tornado Cash did not introduce these issues because it did not support internal transfers: users could only deposit into the system and withdraw from it. Once internal transfers are possible, users need to use the privacy system’s internal address scheme. In practice, a user’s “payment information” needs to include (i) some kind of “spending public key”, a commitment to a secret that the recipient can use to spend, and (ii) a way for the sender to send encrypted information that only the recipient can decrypt, to help the recipient discover the payment.
Stealth address protocols rely on the concept of a meta-address, which works as follows: one part of the meta-address is a blinded version of the recipient’s spending key, and another part is the recipient’s encryption key (though a minimal implementation can set these two keys to be the same).
The key lesson here is that in a privacy-focused ecosystem, users will have spending public keys and encryption public keys, and a user’s “payment information” will need to include both of these keys. There are other good reasons to extend in this direction beyond payments as well. For example, if we want to have Ethereum-based encrypted email, users will need to publicly provide some form of encryption key. In an “EOA world”, we can reuse account keys to accomplish this, but in a secure smart contract wallet world, we might want more explicit functionality to accomplish this. This would also help make Ethereum-based identity more compatible with non-Ethereum decentralized privacy ecosystems, the most prominent example being PGP keys.
Three Transitions and Key Recovery
In a world where users may have multiple addresses, the default way to implement key changes and social recovery is to have users go through a recovery process for each address individually. This can be done with a one-click process: wallets can include software to execute the recovery process on all of a user’s addresses simultaneously. However, even with such a user experience simplification, naive multi-address recovery has three problems:
1. Impractical gas costs: This one is self-explanatory.
2. Counterfactual addresses: Addresses whose smart contract has not yet been deployed (in practice, this means an account that you have not yet sent funds from). As a user, you potentially have an infinite number of counterfactual addresses: one or more on every L2, including L2s that don’t yet exist, and a completely separate infinite set of counterfactual addresses stemming from stealth address schemes.
3. Privacy: If users intentionally have many addresses to avoid linking them together, they certainly do not want to publicly link all of their addresses by recovering them at the same time or around the same time!
Solving these problems is difficult. Fortunately, there is a fairly elegant solution that performs quite well: an architecture that separates verification logic from asset holding.
Each user has a keyring contract, which exists in one location (either mainnet or a specific L2). The user then has addresses on different L2s, where the verification logic of each address is a pointer to the keyring contract. Spending from these addresses requires a proof into the keyring contract showing the current (or, more realistically, very recent) spending public key.
Proofs can be achieved in several ways:
1. Direct read-only L1 access inside the L2. L2s can be modified to give them a way to read L1 state directly. If the keyring contract is on L1, this would mean that contracts inside the L2 can access the keyring “for free”.
2. Merkle branches. Merkle branches can prove L1 state to an L2, or L2 state to L1, or the two can be combined to prove part of the state of one L2 to another L2. The main weakness of Merkle proofs is high gas cost due to proof length: a proof may require 5 kB, although this should fall to less than 1 kB in the future thanks to Verkle trees.
3. ZK-SNARKs. You can reduce data costs by using a ZK-SNARK of a Merkle branch instead of the branch itself. Off-chain aggregation techniques can be built (e.g. on top of EIP-4337) so that a single ZK-SNARK verifies all cross-chain state proofs in a block.
4. KZG commitment. L2 or schemes built on it can introduce a sequential addressing system, allowing state proofs within the system to be only 48 bytes long. Like ZK-SNARKs, a multi-proof scheme can merge all these proofs into a single proof for each block.
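To make the Merkle-branch option concrete, here is a minimal sketch of building and verifying a branch in a plain binary Merkle tree. It is a simplification: Ethereum's state is a hexary Merkle Patricia trie hashed with Keccak-256, whereas this example assumes a power-of-two number of leaves and uses SHA-256 from the standard library. The leaf contents are made up.

```python
import hashlib
from typing import List


def hash_pair(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(left + right).digest()


def build_layers(leaves: List[bytes]) -> List[List[bytes]]:
    """Return all tree layers, from hashed leaves up to the root."""
    if not leaves or len(leaves) & (len(leaves) - 1) != 0:
        raise ValueError("this sketch needs a power-of-two number of leaves")
    layers = [[hashlib.sha256(leaf).digest() for leaf in leaves]]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        layers.append([hash_pair(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)])
    return layers


def make_branch(layers: List[List[bytes]], index: int) -> List[bytes]:
    """Collect the sibling hash at each level on the path to the root."""
    branch = []
    for layer in layers[:-1]:
        branch.append(layer[index ^ 1])
        index //= 2
    return branch


def verify_branch(leaf: bytes, index: int, branch: List[bytes], root: bytes) -> bool:
    node = hashlib.sha256(leaf).digest()
    for sibling in branch:
        node = hash_pair(node, sibling) if index % 2 == 0 else hash_pair(sibling, node)
        index //= 2
    return node == root


# Hypothetical state: eight slots, one of which holds a spending public key.
leaves = [f"slot-{i}".encode() for i in range(8)]
leaves[5] = b"spending-pubkey-of-alice"
layers = build_layers(leaves)
root = layers[-1][0]
branch = make_branch(layers, 5)

print(verify_branch(b"spending-pubkey-of-alice", 5, branch, root))
print(verify_branch(b"attacker-pubkey", 5, branch, root))
print(sum(len(sibling) for sibling in branch), "bytes of proof data")
```

The proof grows by one 32-byte sibling per tree level here; real state tries are deeper and wider, which is why branch length dominates the cost in the Merkle option.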
If we want to avoid doing a proof for every transaction, we can implement a lighter-weight scheme that only requires a cross-L2 proof for recovery. Spending from an account would depend on a spending key whose corresponding public key is stored in that account, but recovery would require a transaction that copies over the current spending public key from the keyring. Funds in counterfactual addresses are safe even if your old keys are compromised: “activating” a counterfactual address, turning it into a working contract, requires a cross-L2 proof that copies over the current spending public key. A thread on the Safe forum describes how a similar architecture might work.
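The lighter-weight scheme can be modeled in a few lines. This is a toy simulation, not contract code: the `Keyring` and `L2Account` classes are hypothetical, keys are plain strings, and the cross-L2 proof is replaced by reading the keyring object directly.

```python
from typing import Optional


class Keyring:
    """Single place that records the user's current spending public key."""

    def __init__(self, spending_pubkey: str) -> None:
        self.spending_pubkey = spending_pubkey

    def rotate(self, new_pubkey: str) -> None:
        self.spending_pubkey = new_pubkey


class L2Account:
    def __init__(self, keyring: Keyring) -> None:
        self.keyring = keyring                    # stands in for the pointer
        self.local_pubkey: Optional[str] = None   # None: still counterfactual

    def sync_from_keyring(self, claimed_pubkey: str) -> None:
        # A real account would verify a cross-L2 proof of the keyring state.
        # ASSUMPTION: this sketch reads the keyring directly instead.
        if claimed_pubkey != self.keyring.spending_pubkey:
            raise ValueError("claimed key does not match keyring state")
        self.local_pubkey = claimed_pubkey

    def can_spend(self, pubkey: str) -> bool:
        # Ordinary spending checks only the locally stored key: no proof.
        return self.local_pubkey is not None and pubkey == self.local_pubkey


keyring = Keyring("old-key")
active = L2Account(keyring)
active.sync_from_keyring("old-key")   # activated before the key change
dormant = L2Account(keyring)          # counterfactual, never activated

keyring.rotate("new-key")             # one key change, in one place

try:
    dormant.sync_from_keyring("old-key")
    old_key_activated = True
except ValueError:
    old_key_activated = False
dormant.sync_from_keyring("new-key")

active_before_recovery = active.can_spend("old-key")
active.sync_from_keyring("new-key")   # the recovery transaction

print(old_key_activated, active_before_recovery, active.can_spend("old-key"))
```

The simulation shows the trade-off: the dormant account can never be activated with the old key, while an already-active account keeps honoring its stored key until a recovery transaction copies the new one over.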
To add privacy to such a scheme, we only need to encrypt the pointer and do all the proofs inside ZK-SNARKs.
With more work (e.g. using this work as a starting point), we could also strip out much of the complexity of ZK-SNARKs and build a simpler KZG-based scheme.
These schemes can get complicated. On the plus side, there are many potential synergies between them. For example, the concept of a “keyring contract” could also be a solution to the “address” challenge mentioned in the previous section: if we want users to have persistent addresses that do not change every time they update a key, we could put stealth meta-addresses, encryption keys, and other information into the keyring contract and use the address of the keyring contract as the user’s “address”.
A lot of secondary infrastructure needs to be updated
Using ENS is expensive. Today, in June 2023, the situation is not too bad: the transaction fee is significant, but it is still comparable to the ENS domain fee. Registering zuzalu.eth cost me about $27, of which $11 was transaction fees. But if we have another bull market, fees will skyrocket. Even without an ETH price increase, gas fees returning to 200 gwei would raise the transaction fee of a domain registration to $104. Therefore, if we want people to actually use ENS, especially for use cases such as decentralized social media where users demand nearly free registration (and the ENS domain fee is not an issue, because these platforms offer their users subdomains), we need ENS to work on L2.
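The fee projection above is a linear scaling: with the same gas used and the same ETH price, the dollar fee is proportional to the gas price. A short sketch makes the arithmetic explicit. The implied gas price at registration time is inferred here from the two dollar figures in the text and is not stated in the original.

```python
def scale_fee(fee_usd: float, old_gas_price_gwei: float, new_gas_price_gwei: float) -> float:
    """Fee after a gas price change, holding gas used and ETH price fixed."""
    return fee_usd * new_gas_price_gwei / old_gas_price_gwei


paid_fee_usd = 11.0          # transaction fee paid for zuzalu.eth (from the text)
projected_fee_usd = 104.0    # projected fee at 200 gwei (from the text)
high_gas_price_gwei = 200.0

# INFERRED, not from the source: the gas price that makes both figures consistent.
implied_gas_price_gwei = high_gas_price_gwei * paid_fee_usd / projected_fee_usd

print(round(implied_gas_price_gwei, 1))  # 21.2
print(round(scale_fee(paid_fee_usd, implied_gas_price_gwei, high_gas_price_gwei), 2))  # 104.0
```

Under that assumption, the registration happened at roughly 21 gwei, so a return to 200 gwei would multiply the fee by a little under ten.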
Fortunately, the ENS team has already started taking action, and ENS is actually happening on L2! ERC-3668 (also known as the “CCIP standard”), together with ENSIP-10, provides a way to automatically verify ENS subdomains on any L2. The CCIP standard requires setting up a smart contract that describes how to verify L2 data proofs, and the domain (e.g. Optinames uses ecc.eth) can be placed under the control of such a contract. Once the CCIP contract controls ecc.eth on L1, accessing some subdomain.ecc.eth will automatically involve looking up and verifying the proof (e.g. Merkle branch) of the L2 state that actually stores that particular subdomain.
Actually obtaining the proof involves accessing a series of URLs stored in the contract, which admittedly feels centralized, although I would argue that it is not: it is a 1-of-N trust model (invalid proofs are captured by the verification logic in the CCIP contract’s callback function, and as long as one URL returns a valid proof, there is no problem). This list of URLs may contain dozens of URLs.
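The 1-of-N trust model can be sketched as a client loop that tries each gateway and accepts the first response that passes verification. This is a simplified model and not the ERC-3668 client flow itself: gateways are plain Python callables rather than HTTP endpoints, and the `verify` function is a placeholder for the checks done in the contract's callback.

```python
from typing import Callable, Iterable, Optional


def first_valid_proof(
    gateways: Iterable[Callable[[], bytes]],
    verify: Callable[[bytes], bool],
) -> Optional[bytes]:
    """Return the first gateway response that verifies, or None."""
    for fetch in gateways:
        try:
            proof = fetch()
        except Exception:
            continue  # an unreachable gateway only costs time
        if verify(proof):
            return proof  # one honest gateway is enough
    return None


# Hypothetical gateways for the demonstration.
def offline_gateway() -> bytes:
    raise ConnectionError("gateway unreachable")


def lying_gateway() -> bytes:
    return b"forged proof"


def honest_gateway() -> bytes:
    return b"valid proof"


def verify(proof: bytes) -> bool:
    # Placeholder for real proof verification against on-chain state.
    return proof == b"valid proof"


result = first_valid_proof([offline_gateway, lying_gateway, honest_gateway], verify)
print(result)
```

A dishonest or offline gateway can delay the lookup but cannot make the client accept wrong data, because every response is checked before it is used.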
The work of ENS CCIP is a successful example and should be seen as a sign that the kind of radical reform we need is possible. But more application-level reforms are needed. Some examples include:
- Many dapps rely on users providing off-chain signatures. This is easy for externally owned accounts (EOAs). ERC-1271 provides a standardized way to do this for smart contract wallets. However, many dapps still do not support ERC-1271; they need to.
- Dapps that use “is this an EOA?” to discriminate between users and contracts (e.g. to prevent transfers or enforce royalties) will break. In general, I advise against trying to find a purely technical solution here; figuring out whether a particular transfer of cryptographic control is a transfer of beneficial ownership is a difficult problem, and probably not solvable without some off-chain community-driven mechanism. Most likely, applications will have to rely less on preventing transfers and more on techniques like Harberger taxes.
- How wallets interact with spending and encryption keys will need to improve. Currently, wallets often use deterministic signatures to generate application-specific keys: signing a standard nonce (e.g. the hash of the application’s name) with an EOA’s private key produces a deterministic value that cannot be generated without the private key, so it is secure in a purely technical sense. However, these techniques are “opaque” to the wallet, preventing it from implementing user-interface-level security checks. In a more mature ecosystem, signing, encryption, and related functionality will need to be handled more explicitly by wallets.
- Light clients (e.g. Helios) will need to verify L2s, not just L1. Today, light clients focus on checking the validity of L1 headers (using the light client sync protocol) and verifying Merkle branches of L1 state and transactions rooted in the L1 header. Tomorrow, they will also need to verify proofs of L2 state rooted in the state root stored on L1 (a more advanced version would actually look at L2 pre-confirmations).
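The deterministic-signature pattern for application-specific keys, mentioned in the wallet item above, can be sketched as follows. This is an illustration under a loud assumption: real wallets sign with the account's ECDSA key, which is not available in the Python standard library, so HMAC-SHA256 is used as a stand-in for "a deterministic value that cannot be produced without the private key". The function names and application names are hypothetical.

```python
import hashlib
import hmac


def deterministic_sign(private_key: bytes, message: bytes) -> bytes:
    # ASSUMPTION: HMAC-SHA256 stands in for a deterministic signature made
    # with the account's private key. It is not an Ethereum signature.
    return hmac.new(private_key, message, hashlib.sha256).digest()


def derive_app_key(private_key: bytes, app_name: str) -> bytes:
    # The "standard nonce" is the hash of the application's name.
    nonce = hashlib.sha256(app_name.encode("utf-8")).digest()
    signature = deterministic_sign(private_key, nonce)
    # Hash the signature down to a 32-byte application-specific key.
    return hashlib.sha256(signature).digest()


account_key = bytes.fromhex("42" * 32)   # hypothetical private key
mail_key = derive_app_key(account_key, "example-mail")
chat_key = derive_app_key(account_key, "example-chat")

print(mail_key == derive_app_key(account_key, "example-mail"))
print(mail_key == chat_key)
```

From the wallet's point of view the request is just "sign these 32 bytes", which is why it cannot tell that the result will be used as an encryption key or warn the user accordingly.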
Wallets need to protect both assets and data
Today, wallets’ business is protecting assets. Everything exists on-chain, and the only thing wallets need to protect is the private key that currently protects those assets. If you change keys, you can safely publish your previous private key on the internet the next day. However, in a zero-knowledge proof world, things are no longer that simple: wallets are protecting not just authentication credentials, but also your data.
We saw the first signs of such a world with Zupass, the ZK-SNARK-based identity system used at Zuzalu. Users have a private key that they use to authenticate to the system, and which can be used to make basic proofs such as “prove that I am a Zuzalu resident, without revealing which one”. However, the Zupass system has also begun to have other applications built on top of it, most notably stamps (Zupass’s version of POAPs).
One of my many Zupass stamps, proving that I am a proud member of Team Cat.
The key feature that stamps offer over POAPs is that stamps are private: you hold the data locally, and only prove the stamp (or some computation on the stamp) when you want someone to have that information about you. But this increases the risk: if you lose that information, you lose your stamp.
Of course, the problem of holding data can be reduced to the problem of holding a single encryption key: a third party (or even the chain) can hold an encrypted copy of the data. This has the convenient advantage that the actions you take do not change the encryption key, so they do not require any interaction with the system that keeps your encryption key safe. But even so, if you lose your encryption key, you lose everything. Conversely, if someone sees your encryption key, they can see everything that was encrypted with that key.
Zupass’s practical solution is to encourage people to store their key on multiple devices (e.g. laptop and phone), since the chance of losing all of them at the same time is small. We can go further and use secret sharing to store the key, splitting it among multiple guardians.
This kind of social recovery via MPC is not a sufficient solution for wallets, because it means that not only current guardians but also previous guardians could collude to steal your assets, which is an unacceptably high risk. However, a privacy leak is generally a lower risk than a total loss of assets, and someone with a use case that requires highly protected privacy can always accept a higher risk of loss by not backing up the key associated with those privacy-sensitive actions.
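Splitting a key among guardians can be illustrated with Shamir's secret sharing, one standard way to do threshold secret sharing. The sketch below is an assumption about how such a split could look, not a description of what Zupass or any particular wallet does; the prime modulus, threshold, and guardian count are chosen for the example.

```python
import secrets
from typing import List, Tuple

# 2**521 - 1 is a Mersenne prime, comfortably larger than a 256-bit key.
PRIME = 2**521 - 1


def split_secret(secret: int, threshold: int, num_shares: int) -> List[Tuple[int, int]]:
    """Split `secret` so that any `threshold` shares can rebuild it."""
    if not 0 <= secret < PRIME or not 1 <= threshold <= num_shares:
        raise ValueError("invalid secret or threshold")
    # Random polynomial of degree threshold-1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):      # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: List[Tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


encryption_key = secrets.randbits(256)        # hypothetical key to protect
shares = split_secret(encryption_key, threshold=3, num_shares=5)

print(recover_secret(shares[:3]) == encryption_key)
print(recover_secret([shares[0], shares[2], shares[4]]) == encryption_key)
```

Any three of the five guardians can rebuild the key. Note that shares handed to former guardians stay valid, which is the collusion risk described above.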
To avoid overwhelming users with a complex multi-recovery-path system, wallets supporting social recovery may need to manage asset recovery and encryption key recovery simultaneously.
Returning to Identity
One common theme of these changes is that the concept of an “address”, a cryptographic identifier that you use to represent “you” on-chain, must change radically. “Instructions for how to interact with me” would no longer be just an ETH address; they would have to be, in some form, a combination of multiple addresses on multiple L2s, stealth meta-addresses, encryption keys, and other data.
One way to accomplish this is to make ENS your identity: your ENS record can contain all this information, and if you send bob.eth (or bob.ecc.eth, or…) to someone, they can look up and know everything about how to pay and interact with you, including in more complex, cross-domain, and privacy-preserving ways.
However, this ENS-centric approach has two weaknesses:
- It ties too many things to your name. Your name is not you; your name is just one of your many attributes. You should be able to change your name without moving your entire identity profile and updating a whole set of records across many applications.
- You can’t have trustless counterfactual names. One key UX feature of any blockchain is the ability to send coins to people who have not yet interacted with the chain. Without this ability, there is a chicken-and-egg problem: interacting with the chain requires paying transaction fees, and paying fees requires… already having coins. ETH addresses, including smart contract addresses with CREATE2, have this feature. ENS names don’t, because if two Bobs both decide off-chain that they are bob.ecc.eth, there is no way to choose which one of them gets the name.
One possible solution is to put more things into the keyring contract from the architecture described earlier in this article. Keyring contracts can contain various pieces of information about you and how to interact with you (and with CCIP, some of that information can be off-chain), and users would use their keyring contract as their primary identifier. The actual assets they receive, however, would be stored in all kinds of different places. Keyring contracts are not tied to a name, and they are counterfactual-friendly: you can generate an address that can provably only be initialized by a keyring contract with certain fixed initial parameters.
Another category of solutions involves abandoning the concept of user-facing addresses altogether, in a spirit similar to the Bitcoin payment protocol. One idea is to rely more heavily on direct communication channels between sender and recipient; for example, the sender could send a claim link (as an explicit URL or a QR code), which the recipient could use to accept the payment however they wish.
Whether the action is taken by the sender or the recipient, relying more on the wallet to generate the latest payment information in real time can reduce friction. However, persistent identifiers are convenient (especially when combined with ENS), and in practice, the assumption of direct communication between sender and recipient is a very tricky issue, so we may see a combination of different technologies.
In all of these designs, it is crucial to keep things both decentralized and understandable to users. We need to ensure that users can easily access an up-to-date view of their current assets and of the messages that have been published for them. These views should rely on open tools rather than proprietary solutions. It will take hard work to keep the more complex payment infrastructure from turning into an opaque “abstraction tower” where developers struggle to understand what is going on and to adapt it to new contexts. Despite the challenges, achieving scalability, wallet security, and privacy for ordinary users is essential to Ethereum’s future. This is not just about technical feasibility but about actual accessibility for regular users. We need to rise to this challenge.