Solving the interoperability trust problem: an in-depth analysis of Any Message Protocols

Authors: Shi Khai Wei, Raghav Agarwal

Compiled by: Kxp, BlockBeats

Introduction

Multi-chain is the direction the ecosystem is heading. The pursuit of scalability has led Ethereum toward rollups, and the shift to modular blockchains has renewed interest in application chains. Recently there has been growing talk of application-specific rollups, L3s, and sovereign chains. All of this comes at the cost of fragmentation, however, and the cross-chain bridges available today are typically limited in functionality and rely on trusted signers for security.

So what will the interconnected Web3 ultimately look like? We believe that cross-chain bridges will eventually evolve into cross-chain messaging protocols, or “Any Message Protocols” (AMPs), unlocking new use cases and allowing applications to transmit arbitrary messages between source and target chains. We will also see a spectrum of trust mechanisms emerge, in which builders make various trade-offs between availability, complexity, and security.

Every AMP solution needs to implement two key functions:

  • Verification: ability to verify the validity of messages from the source chain on the target chain
  • Liveness: ability to propagate information from the source chain to the target chain

Unfortunately, 100% trustless verification is not practical. Users must choose to trust code, game theory, humans (or entities), or some combination of these, depending on whether verification happens on-chain or off-chain.

In this article, we divide the interoperability landscape along two dimensions: trust mechanism and integration architecture.

Trust Mechanism:

1. Trusting Code and Mathematics: For these solutions, there are on-chain proofs that anyone can verify. These solutions typically rely on light clients to verify, on the target chain, either the consensus of the source chain or the validity of its state transitions. Light client verification can be made more efficient with zero-knowledge proofs, which move arbitrarily long computations off-chain and leave only a simple on-chain verification of the result.

2. Trusting Game Theory: When users or applications need to trust a third party or a third-party network to guarantee the authenticity of transactions, additional trust assumptions are involved. The security of these mechanisms can be improved by adopting permissionless networks and game-theoretic mechanisms such as economic incentives and optimistic security.

3. Trusting Humans: These solutions rely on the honesty of a majority of validators, or on the independence of entities that relay different pieces of information. In addition to trusting the consensus of the two interacting chains, users must also trust a third party. What is at stake for the participating entities is only their reputation. If enough of them agree that a transaction is valid, it is deemed valid.

It is worth noting that all solutions require some degree of trust in both code and humans. Any solution with faulty code may be vulnerable to hacking, and there is always some human factor in the setup, upgrading, or maintenance of code repositories.

Integration architecture:

1. Peer-to-peer model: requires a dedicated communication channel to be established between each source chain and target chain.

2. Centralized hub model: requires a communication channel to be established with a central hub to enable interconnectivity with all other blockchains connected to that hub.

The peer-to-peer model is relatively difficult to scale, because every pair of connected blockchains needs its own communication channel. Developing these channels can be challenging for blockchains with different consensus mechanisms and frameworks. Where needed, however, pairwise bridges offer more flexibility to customize configurations. Hybrid approaches are also possible, such as using the Inter-Blockchain Communication (IBC) protocol for multi-hop routing via relays, which removes the need for direct pairwise channels but introduces more complexity in security, latency, and cost.
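To make the scaling difference concrete, here is a minimal sketch that counts the channels each architecture needs. It assumes one channel per pair of chains in the peer-to-peer model and one channel per chain in the hub model, with the hub itself not counted among the connected chains. The function names are ours, for illustration only.

```python
def peer_to_peer_channels(n_chains: int) -> int:
    """Channels needed when every pair of chains has a dedicated channel."""
    return n_chains * (n_chains - 1) // 2


def hub_channels(n_chains: int) -> int:
    """Channels needed when every chain connects only to a central hub."""
    return n_chains


if __name__ == "__main__":
    for n in (5, 20, 50):
        print(n, peer_to_peer_channels(n), hub_channels(n))
    # 5 chains:  10 pairwise channels vs 5 hub channels
    # 20 chains: 190 pairwise channels vs 20 hub channels
    # 50 chains: 1225 pairwise channels vs 50 hub channels
```

The pairwise count grows quadratically while the hub count grows linearly, which is the scaling pressure described above.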

Trusting code and math

To rely solely on code and math for trust assumptions, a light client can be used to verify the consensus of the source chain on the target chain. A light client (or light node) is software that connects to full nodes to interact with a blockchain. The light client on the target chain typically stores an ordered history of source chain block headers, which is sufficient to verify transactions. Off-chain agents (such as relays) monitor events on the source chain, generate cryptographic proofs, and forward them along with the block header to the light client on the target chain. Because the light client stores block headers in order, and each header contains a Merkle root hash that can be used to prove the state, it is able to verify transactions. The following is an overview of the main characteristics of this approach:
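The verification step can be sketched as follows. This is a simplified illustration, not the scheme of any particular chain: it assumes a plain binary SHA-256 Merkle tree with the last node duplicated on odd levels, whereas real chains use their own tree formats and also verify validator signatures before accepting a header. The `LightClient` class and all function names are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level: List[bytes]) -> List[bytes]:
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: List[bytes]) -> bytes:
    """Root of a simple binary Merkle tree (assumes at least one leaf)."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves: List[bytes], index: int) -> List[Tuple[bytes, bool]]:
    """Sibling hashes from leaf to root; the bool is True if the sibling is on the right."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling > index))
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf: bytes, proof: List[Tuple[bytes, bool]], root: bytes) -> bool:
    node = h(leaf)
    for sibling, sibling_is_right in proof:
        node = h(node + sibling) if sibling_is_right else h(sibling + node)
    return node == root


class LightClient:
    """Hypothetical light client that stores one Merkle root per header height."""

    def __init__(self, trusted_height: int, trusted_root: bytes):
        # The initial header is the trust assumption made at setup time.
        self.roots: Dict[int, bytes] = {trusted_height: trusted_root}

    def update(self, height: int, root: bytes) -> None:
        # A real light client would verify the source chain's consensus here.
        self.roots[height] = root

    def verify_membership(self, height: int, leaf: bytes, proof) -> bool:
        root = self.roots.get(height)
        return root is not None and verify(leaf, proof, root)


if __name__ == "__main__":
    txs = [b"tx0", b"tx1", b"tx2"]
    client = LightClient(100, merkle_root(txs))
    proof = merkle_proof(txs, 2)  # produced off-chain by a relay
    print(client.verify_membership(100, b"tx2", proof))     # True
    print(client.verify_membership(100, b"forged", proof))  # False
```

The relay does the expensive work of building the proof, while the light client only hashes along one path from leaf to root.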

Security

A trust assumption is introduced when the light client is initialized. A new light client is initialized with a block header from a specific height on the other chain, and if that header is forged, the light client is deceived from the start. This is a comparatively weak trust assumption, however, because anyone can verify the initial header. Once the light client is initialized, no further trust assumptions are introduced. There is also a liveness assumption: relays must keep transmitting information.

Implementation

The implementation of a light client depends on the availability of the cryptographic primitives required for verification. If the two connected chains are of the same type, meaning they share the same application framework and consensus algorithm, the light client implementations on both ends will be the same. For example, all Cosmos SDK-based chains use the Inter-Blockchain Communication (IBC) protocol. If the chains are of different types, with different application frameworks or consensus types, the light client implementations will differ. An example is Composable Finance, which is working on connecting Cosmos SDK chains to the Substrate application framework in the Polkadot ecosystem via IBC. This requires a Tendermint light client on the Substrate chain and a BEEFY light client on the Cosmos SDK chain. Recently, they established the first connection between Polkadot and Kusama via IBC.

Challenges

Resource intensiveness is a major challenge. Running pairwise light clients on all chains can be expensive, because writes to a blockchain are costly. For chains with dynamic validator sets, running a light client on every chain may also be impractical. Scalability is another challenge: light client implementations vary with the architecture of the chain, which makes it hard to scale and to connect different ecosystems. Code vulnerabilities are a further risk, since errors in the code can be exploited. For example, the BNB Chain exploit in October 2022 revealed a critical security vulnerability affecting all IBC-enabled chains.

To address the cost and practicality issues of running paired light clients on all chains, alternative solutions such as zero-knowledge (ZK) proofs can be adopted to eliminate the need for third-party trust.

Zero-Knowledge Proofs as a Solution for Third-Party Trust

Zero-knowledge proofs can be used to verify the validity of state transitions from the source chain on the target chain. Instead of executing the entire computation on-chain, only the verification of the proof is executed on-chain, while the actual computation takes place off-chain. Verifying a proof is faster and cheaper than re-running the original computation. Examples include Polymer Labs’ Polymer ZK-IBC and Succinct Labs’ Telepathy. Polymer is developing multi-hop IBC to enhance connectivity and reduce the number of pairwise connections required.

Key aspects of the mechanism include:

Security

The security of zk-SNARKs relies on elliptic curves, while zk-STARKs depend on hash functions. zk-SNARKs may require a trusted setup, in which the initial keys used to generate and verify proofs are created. The secrets produced during the setup must be destroyed; otherwise they could be used to forge proofs that pass verification and thereby fake transactions. Once the trusted setup is complete, no further trust assumptions are introduced. In addition, newer ZK frameworks such as Halo and Halo2 eliminate the need for a trusted setup entirely.

Implementation

There are multiple ZK proving schemes, such as SNARK, STARK, VPD, and SNARG, with SNARK being the most widely adopted. Different SNARK proving frameworks, such as Groth16, Plonk, Marlin, Halo, and Halo2, offer trade-offs in proof size, proof time, verification time, memory requirement, and trusted setup requirement. Recursive ZK proofs have also emerged, allowing proof workloads to be distributed across multiple computers instead of a single one. To generate validity proofs, the following core primitives must be implemented: the signature scheme used by the validator, the proof that the validator’s public key is included in the commitment of the validator set stored on chain, and the ability to track validator sets, which may change frequently.

Challenges

Implementing various signature schemes inside zk-SNARKs requires out-of-field arithmetic and complex elliptic curve operations. This is not trivial, and it may require a different implementation for each chain depending on its framework and consensus. Auditing ZK circuits is a challenging and error-prone task. Developers need to be familiar with domain-specific languages such as Circom, Cairo, and Noir, or implement circuits directly; both options are demanding and may slow adoption. If proving time and workload are very high, only specialized teams with dedicated hardware may be able to handle them, which could lead to centralization. Longer proof generation times also add latency. Techniques such as Incrementally Verifiable Computation (IVC) can reduce proving time, but many of them are still at the research stage and await implementation. Longer verification times and heavier verification workloads increase on-chain costs.

Trusting game theory

Interoperability protocols based on game theory can be broadly divided into two categories, depending on how they incentivize honest behavior by participating entities:

The first type is the economic security mechanism, in which multiple external participants (such as validators) collaborate to reach consensus on the updated state of the source chain. To become a validator, a participant must stake a certain amount of tokens, which can be slashed if the participant acts maliciously. In permissionless settings, anyone can accumulate stake and become a validator. Validators that follow the protocol also receive economic rewards, such as block rewards, which gives them an economic motivation to behave honestly. However, if the amount that could be stolen exceeds the amount staked, participants may collude to steal funds. Examples of protocols that use economic security mechanisms include Axelar and Celer IM.

The second type is the optimistic security mechanism, in which solutions rely on the assumption that at least a small number of participants are honest and follow the protocol rules. In this approach, a single honest participant is enough to act as a guarantor. For example, in the ideal design anyone can submit fraud proofs. Even with economic incentives in place, however, an honest observer may miss a fraudulent transaction. Optimistic rollups use the same mechanism. Nomad and Chainlink CCIP are examples of protocols that use optimistic security mechanisms. In Nomad's case, observers can prove fraud, although at the time of writing they are whitelisted. Chainlink CCIP plans to use a fraud detection network composed of distributed oracle networks to monitor for malicious activity, although how this network will be implemented is not yet known.

Security

In terms of security, both mechanisms rely on permissionless participation of validators and observers for the game theory to hold. In the economic security mechanism, funds are more vulnerable if the staked amount is lower than the amount that could be stolen. In the optimistic security mechanism, the honest-minority assumption can be broken if no one submits a fraud proof, or if permissioned observers are compromised or removed. By comparison, economic security mechanisms depend less on liveness to remain secure.
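The economic-security condition can be written as a back-of-the-envelope check. This is a sketch under our own simplifying assumptions: the attackers must control a fixed fraction of the total stake to forge a message, and the colluding stake is slashed by a fixed fraction. The parameter names and default values are illustrative and do not describe any specific protocol.

```python
def cost_of_attack(total_stake: float,
                   collusion_fraction: float = 2 / 3,
                   slash_fraction: float = 1.0) -> float:
    """Stake the colluding validators stand to lose (hypothetical parameters)."""
    return total_stake * collusion_fraction * slash_fraction


def attack_is_profitable(value_at_risk: float,
                         total_stake: float,
                         collusion_fraction: float = 2 / 3,
                         slash_fraction: float = 1.0) -> bool:
    """True when the funds that could be stolen exceed the stake that would be slashed."""
    return value_at_risk > cost_of_attack(total_stake, collusion_fraction, slash_fraction)


if __name__ == "__main__":
    # Illustrative figures only, in millions of an arbitrary currency unit.
    print(attack_is_profitable(value_at_risk=100, total_stake=90))  # True
    print(attack_is_profitable(value_at_risk=50, total_stake=90))   # False
```

A protocol can shift this inequality by attracting more stake or by limiting the value it secures at any one time.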

Implementation

In terms of implementation, one approach involves an intermediate chain with its own validators. In this setup, a group of external validators monitors the source chain and, when invoked, reaches consensus on the validity of transactions. Once consensus is reached, they provide a proof on the target chain. Validators typically need to stake a certain amount of tokens, which can be slashed if malicious activity is detected. Examples of protocols that use this implementation method include Axelar Network and Celer IM.

Another implementation method uses off-chain agents to build solutions similar to optimistic rollups. During a predefined time window, these off-chain agents can submit fraud proofs and revert transactions when necessary. For example, Nomad relies on independent off-chain agents to relay headers and cryptographic proofs. Chainlink CCIP, on the other hand, plans to leverage its existing oracle network to monitor and attest to cross-chain transactions.
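The fraud-window logic can be sketched roughly as follows. This is a hypothetical model, not the design of Nomad or CCIP: the class and method names are ours, time is an abstract integer, and a challenge is assumed to carry a valid fraud proof (the check of the proof itself is omitted).

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PendingMessage:
    payload: bytes
    submitted_at: int
    challenged: bool = False


class OptimisticInbox:
    """Hypothetical target-chain inbox that executes messages after a fraud window."""

    def __init__(self, fraud_window: int):
        self.fraud_window = fraud_window
        self.pending: Dict[str, PendingMessage] = {}

    def submit(self, msg_id: str, payload: bytes, now: int) -> None:
        self.pending[msg_id] = PendingMessage(payload, submitted_at=now)

    def challenge(self, msg_id: str, now: int) -> bool:
        """Flag a message as fraudulent; only possible while the window is open."""
        msg = self.pending[msg_id]
        if now < msg.submitted_at + self.fraud_window:
            msg.challenged = True  # assumes the fraud proof was checked and is valid
            return True
        return False

    def can_execute(self, msg_id: str, now: int) -> bool:
        """A message is executable once the window has passed without a challenge."""
        msg = self.pending[msg_id]
        return not msg.challenged and now >= msg.submitted_at + self.fraud_window


if __name__ == "__main__":
    inbox = OptimisticInbox(fraud_window=30)
    inbox.submit("good", b"transfer 1", now=0)
    inbox.submit("bad", b"forged transfer", now=0)
    inbox.challenge("bad", now=10)
    print(inbox.can_execute("good", now=10))  # False, window still open
    print(inbox.can_execute("good", now=30))  # True
    print(inbox.can_execute("bad", now=30))   # False, it was challenged
```

The sketch also shows the latency cost of this design: even an honest message cannot execute before the window closes.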

Advantages and Challenges

One key advantage of AMP solutions based on game theory is resource optimization, as the verification process typically does not occur on-chain, thus reducing resource requirements. In addition, these mechanisms are scalable as the consensus mechanism remains unchanged for various types of chains and can be easily extended to heterogeneous blockchains.

There are also several challenges associated with these mechanisms. If a majority of validators collude, the trust assumption can be exploited to steal funds, which calls for countermeasures such as quadratic voting and fraud proofs. In addition, solutions based on optimistic security introduce complexity around finality and liveness, because users and applications must wait for the fraud window to pass before a transaction can be considered valid.

Trusting humans

Solutions that rely on trusting human entities can also be broadly divided into two categories:

1. Reputation Security: These solutions rely on multisig implementations, where multiple entities verify and sign transactions. Once the minimum threshold of signatures is reached, the transaction is considered valid. The assumption is that most entities are honest, so a transaction signed by a majority of them is valid. What the participating entities put at stake is only their reputation. Examples include Multichain (Anycall V6) and Wormhole. Vulnerabilities may still exist due to smart contract bugs, as demonstrated by the Wormhole hack in early 2022.

2. Separation: These solutions divide the entire message passing process into two parts and rely on different independent entities to manage these two processes. The assumption here is that these two entities are independent of each other and will not collude. LayerZero is an example of this. Block headers are transmitted on-demand through a distributed oracle, and transaction proofs are sent through a relay. If the proof matches the header, the transaction is considered valid. Although proof matching relies on code/mathematics, participants need to trust that these entities remain independent and have no malicious intent. Applications built on LayerZero can choose their oracles and relays (or host their own oracles/relays), thus limiting risk to individual oracles/relays. End users need to trust that LayerZero, third parties, or the application itself are running oracles and relays independently and without malicious intent.
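The threshold check behind the reputation security approach in item 1 can be sketched as follows. This is an illustration only: HMAC-SHA256 with per-signer keys stands in for real public-key signatures (production systems would use a scheme such as ECDSA or Ed25519), and all names here are hypothetical.

```python
import hashlib
import hmac
from typing import Dict


def sign(key: bytes, message: bytes) -> bytes:
    """Stand-in for a real signature; HMAC is used only to keep the sketch self-contained."""
    return hmac.new(key, message, hashlib.sha256).digest()


def is_valid(message: bytes,
             signatures: Dict[str, bytes],
             signer_keys: Dict[str, bytes],
             threshold: int) -> bool:
    """Accept the message once at least `threshold` known signers have signed it."""
    approvals = 0
    for signer, signature in signatures.items():  # dict keys keep signers distinct
        key = signer_keys.get(signer)
        if key is not None and hmac.compare_digest(signature, sign(key, message)):
            approvals += 1
    return approvals >= threshold


if __name__ == "__main__":
    keys = {"alice": b"k1", "bob": b"k2", "carol": b"k3"}
    msg = b"mint 10 tokens to 0xabc on target chain"
    sigs = {"alice": sign(keys["alice"], msg), "bob": sign(keys["bob"], msg)}
    print(is_valid(msg, sigs, keys, threshold=2))  # True, 2 of 3 signed
    print(is_valid(msg, sigs, keys, threshold=3))  # False
```

Nothing in the check itself deters collusion; the deterrent is external, which is why the reputation of the signers matters.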

In both of these approaches, the reputation of the third-party entities involved acts as a deterrent against malicious behavior. These are typically respected entities in the validator and oracle communities who would risk damage to their reputations and negative impacts on their other business activities if they were to act maliciously.

Other Considerations for AMP Solutions

Beyond the basic mechanisms, there are additional details to consider when evaluating the security and usability of an AMP solution. Because these are components that can change over time, we have not included them in the overall comparison.

Code Integrity

Recent hacks have exploited coding errors, highlighting the need for reliable audits, bug bounties, and diverse client implementations. If all validators (in economic, optimistic, or reputation security) run the same client (the software used for validation), reliance on a single codebase increases and client diversity decreases. Ethereum, for example, relies on multiple execution clients such as geth, nethermind, erigon, besu, and akula. Multiple implementations in various languages can increase diversity so that no single client dominates the network, which removes a potential single point of failure. Having multiple clients can also help maintain liveness if a small number of validators, signers, or light clients fail because of a vulnerability or attack in one particular implementation.

Setup and Upgradability

Users and developers need to know whether validators and observers can join the network in a permissionless way; otherwise trust is hidden behind a selected set of permissioned entities. Smart contract upgrades may also introduce vulnerabilities that lead to attacks, or may even alter the trust assumptions. Different solutions can be implemented to mitigate these risks. For example, in its current instantiation the Axelar gateway is upgradeable but requires approval from an offline committee (4/8 threshold). In the near future, Axelar plans to require all validators to collectively approve any upgrade to the gateway. The core contracts of Wormhole are upgradeable and managed through Wormhole’s on-chain governance system. LayerZero relies on immutable smart contracts and libraries to avoid upgrades, but new libraries can be pushed: dapps with default settings will get the updated version, while dapps that set the version manually will need to set it to the new version.

Maximum Extractable Value (MEV)

Different blockchains are not synchronized by a common clock and have different finality times. As a result, execution order and timing on the target chain may vary from chain to chain, and MEV is difficult to define precisely in a cross-chain setting. This introduces a trade-off between liveness and execution order. An ordered channel guarantees in-order message delivery, but if one message times out, the channel closes. Other applications may prefer unordered delivery, in which a message is not affected by other messages.
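The trade-off between ordered and unordered delivery can be illustrated with a small model. It is loosely inspired by the behavior described above and is not the implementation of any real protocol: the class names, the integer clock, and the timeout rule are assumptions made for the sketch.

```python
from typing import List, Set


class OrderedChannel:
    """Delivers messages strictly by sequence number; a timeout closes the channel."""

    def __init__(self) -> None:
        self.next_seq = 0
        self.closed = False
        self.delivered: List[str] = []

    def receive(self, seq: int, payload: str, timeout_at: int, now: int) -> bool:
        if self.closed or seq != self.next_seq:
            return False  # closed, or an earlier message must arrive first
        if now >= timeout_at:
            self.closed = True  # one timed-out message halts the whole channel
            return False
        self.delivered.append(payload)
        self.next_seq += 1
        return True


class UnorderedChannel:
    """Delivers messages in any order; a timeout affects only that message."""

    def __init__(self) -> None:
        self.seen: Set[int] = set()
        self.delivered: List[str] = []

    def receive(self, seq: int, payload: str, timeout_at: int, now: int) -> bool:
        if seq in self.seen or now >= timeout_at:
            return False  # replayed or expired message is dropped
        self.seen.add(seq)
        self.delivered.append(payload)
        return True


if __name__ == "__main__":
    ordered, unordered = OrderedChannel(), UnorderedChannel()
    messages = [(0, "a", 10, 5), (1, "b", 10, 12), (2, "c", 30, 13)]
    for seq, payload, timeout_at, now in messages:
        ordered.receive(seq, payload, timeout_at, now)
        unordered.receive(seq, payload, timeout_at, now)
    print(ordered.delivered, ordered.closed)  # ['a'] True
    print(unordered.delivered)                # ['a', 'c']
```

The ordered channel preserves execution order at the cost of liveness, while the unordered channel keeps delivering but gives no ordering guarantee.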

Source Chain Finality

Ideally, an AMP solution should wait for the source chain to reach finality before transmitting the state of the source chain to one or more target chains. This ensures that the source chain blocks it relies on are almost never reverted or altered. However, to provide the best user experience, many solutions offer instant message delivery and rely on assumptions about finality. In that case, if the source chain rolls back its state after a message has been delivered and assets have been bridged, problems such as double-spending of bridged funds can follow. AMP solutions can manage this risk in various ways, such as setting different finality assumptions for different chains based on how decentralized they are, or by balancing speed against security. They can also cap the amount that can be bridged until the source chain achieves finality.
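One way to picture these risk controls is a per-chain policy that combines a confirmation requirement with a pre-finality cap. This is a minimal sketch: the chain names, confirmation counts, and caps below are made-up placeholders, not real parameters of any chain or protocol.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class FinalityPolicy:
    required_confirmations: int  # blocks to wait before treating a message as final
    pre_finality_cap: float      # maximum amount bridgeable before that point


# Placeholder values for illustration only.
POLICIES: Dict[str, FinalityPolicy] = {
    "chain_a": FinalityPolicy(required_confirmations=64, pre_finality_cap=1_000.0),
    "chain_b": FinalityPolicy(required_confirmations=12, pre_finality_cap=100.0),
}


def max_bridgeable(chain: str, confirmations: int, requested: float) -> float:
    """Amount that may be bridged now, given how many confirmations the source tx has."""
    policy = POLICIES[chain]
    if confirmations >= policy.required_confirmations:
        return requested  # source chain is treated as final, no cap applies
    return min(requested, policy.pre_finality_cap)


if __name__ == "__main__":
    print(max_bridgeable("chain_a", confirmations=3, requested=5_000.0))   # 1000.0
    print(max_bridgeable("chain_a", confirmations=64, requested=5_000.0))  # 5000.0
    print(max_bridgeable("chain_b", confirmations=3, requested=50.0))      # 50.0
```

Small transfers go through immediately, while large ones wait for finality, which bounds the loss from a source chain rollback.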

Trends and Future Outlook

Customizable and Attachable Security

To better serve diverse use cases, AMP solutions are incentivized to offer developers more flexibility. Axelar introduced a method for achieving scalable message delivery and verification without the need to change application-level logic. HyperLane V2 introduced modules that let developers choose among options such as economic security, optimistic security, dynamic security, and hybrid security. CelerIM offers optimistic security in addition to economic security. Many solutions wait for a predefined minimum number of block confirmations on the source chain before delivering messages, and LayerZero allows developers to update these parameters. We anticipate that some AMP solutions will continue to offer more flexibility, but these design choices deserve discussion. Should applications be able to configure their own security, and to what extent? What happens if an application adopts a suboptimal design? Awareness of the fundamental concepts behind security may become increasingly important for users. Ultimately, we envision AMP solutions being aggregated and abstracted, potentially in the form of combined or attachable security.

Maturation of Trust Minimization with Code and Math Mechanisms

In the ideal endgame, all cross-chain messages will be trust-minimized through the use of zero-knowledge (ZK) proofs. We have already seen projects of this kind emerge, such as Polymer Labs and Succinct Labs. Multichain also published a zkRouter whitepaper about achieving interoperability through ZK proofs. With the recently announced Axelar virtual machine, developers can use the Interchain Amplifier to build new connections to the Axelar network permissionlessly. For example, once strong light clients and ZK proofs are developed for Ethereum’s state, developers can easily integrate them into the Axelar network to replace or augment existing connections. Celer Network announced Brevis, a ZK cross-chain data proof platform that enables dApps and smart contracts to access, compute on, and leverage arbitrary data across multiple blockchains. Celer uses a ZK light client circuit to implement a user-facing asset zkBridge between the Ethereum Goerli testnet and the BNB Chain testnet. LayerZero discusses the possibility of adding new optimized proof message libraries in its documentation. New projects like Lagrange are exploring the aggregation of multiple proofs from multiple source chains, while Herodotus makes storage proofs possible through ZK proofs. However, this transition takes time, because the approach is difficult to scale across blockchains that rely on different consensus mechanisms and frameworks.

ZK is a relatively new and complex technology that is difficult to audit, and the current cost of verification and proof generation is suboptimal. We believe that, in the long run, many AMP solutions that aim to support highly scalable cross-chain applications are likely to complement trusted humans and entities with verifiable software because:

1. The possibility of code exploitation can be minimized through auditing and bug bounties. Over time, as the history of these systems becomes proof of their security, trusting these systems will become easier.

2. The cost of generating ZK proofs will decrease. With more research and development into ZKP, recursive ZKP, proof aggregation, folding schemes, and specialized hardware, we expect the time cost of proof generation and validation to significantly decrease, making it a more cost-effective method.

3. Blockchains will become more supportive of ZK. In the future, zkEVMs will be able to provide succinct validity proofs of execution, and light client-based solutions will be able to easily verify both the execution and the consensus of source chains. Ethereum's endgame also includes plans to “zk-SNARK everything,” including consensus.

Proof of Humanity, Reputation, and Identity

The security of complex systems such as AMP solutions cannot be captured by a single framework and requires multi-layered solutions. For example, in addition to economic incentives, Axelar implements a quadratic voting mechanism to prevent voting power from concentrating in a subset of nodes and to promote decentralization. Other proofs of humanity, reputation, and identity can also complement setup and permissioning mechanisms.

Conclusion

In the spirit of Web3 openness, we may see a diverse future in which multiple approaches coexist. Indeed, applications can choose to use multiple interoperability solutions, either redundantly or by letting users choose combinations based on their trade-offs. Peer-to-peer solutions may be prioritized on “high-traffic” routes, while hub-and-spoke (centralized hub) models may dominate the long tail of chains. Ultimately, we as a community of users, builders, and contributors will shape what the Web3 internet looks like.
