In-depth technical analysis of the latest Sui vulnerability, “Hamster Wheel”


The CertiK team previously discovered a series of denial-of-service vulnerabilities in the Sui blockchain. One of them stood out as both novel and especially severe: it could leave Sui network nodes unable to process new transactions, effectively shutting down the entire network.

Just last Monday, CertiK received a $500,000 bug bounty from Sui for discovering this major security vulnerability. CoinDesk reported the award, and other major media outlets followed with their own coverage.

The vulnerability is aptly named “Hamster Wheel”. Its attack method differs from previously known attacks: an attacker only needs to submit a payload of about 100 bytes to trigger an infinite loop in a Sui validator node, leaving the node unable to respond to new transactions.

In addition, the damage persists after a node restarts and can propagate automatically through the Sui network, leaving every node unable to process new transactions, like hamsters running endlessly on a wheel. We therefore call this unique attack type the “hamster wheel” attack.

After discovering the vulnerability, CertiK reported it to Sui through Sui’s bug bounty program. Sui also responded promptly, confirming the severity of the vulnerability and actively taking measures to fix the problem before the mainnet launch. In addition to fixing this specific vulnerability, Sui also implemented preventative mitigation measures to reduce the potential damage this vulnerability could cause.

To thank the CertiK team for responsibly disclosing the vulnerability, Sui awarded the team a $500,000 bug bounty.

The following section will reveal the details of this critical vulnerability from a technical perspective, explaining its root cause and potential impact.

Vulnerability details

The key role of verifiers in Sui

For blockchains based on the Move language, such as Sui and Aptos, the main safeguard against malicious payloads is static verification. With static verification, Sui can check the validity of a user-submitted payload before a contract is published or upgraded. The verifier provides a series of checkers that ensure structural and semantic correctness. Only a contract that passes these checks is handed to the Move virtual machine for execution.

Figure: Threats of malicious payloads on the Move blockchain

Sui provides a new storage model and interface on top of the original Move virtual machine, so Sui runs a custom version of the Move virtual machine. To support the new storage primitives, Sui introduces a series of additional custom checks for the security verification of untrusted payloads, covering properties such as object security and global storage access. Because these custom checks are tailored to Sui’s unique features, we refer to them as the Sui verifier.

Figure: Sui’s check order for payloads

As shown in the figure above, most of the checks in the verifier perform structural security verification on the CompiledModule (the runtime representation of the user-provided contract payload). For example, the “Duplicate Checker” ensures that there are no duplicate entries in the runtime payload, and the “Limit Checker” ensures that the length of each field in the runtime payload stays within the allowed entry limit.
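To give a rough feel for what structural checks of this kind do, the sketch below models a module as a plain dictionary of tables. It is an illustration only: the table names, the `MAX_ENTRIES` limit, and both function names are hypothetical and are not taken from Sui’s verifier.

```python
from typing import Dict, List

# Hypothetical limit chosen for the example; not a real Sui value.
MAX_ENTRIES = 4


def check_duplicates(tables: Dict[str, List[str]]) -> List[str]:
    """Return the names of tables that contain a repeated entry."""
    return [name for name, entries in tables.items()
            if len(entries) != len(set(entries))]


def check_limits(tables: Dict[str, List[str]], limit: int = MAX_ENTRIES) -> List[str]:
    """Return the names of tables whose entry count exceeds the limit."""
    return [name for name, entries in tables.items() if len(entries) > limit]


# A toy "module" with one duplicate entry and one oversized table.
module = {
    "identifiers": ["coin", "mint", "coin"],      # repeated entry
    "struct_defs": ["A", "B", "C", "D", "E"],     # more than MAX_ENTRIES
    "function_defs": ["mint"],
}

print(check_duplicates(module))  # ['identifiers']
print(check_limits(module))      # ['struct_defs']
```

Both checks only inspect the shape of the payload. They say nothing about what the code does, which is why the semantic analysis described next is needed.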

In addition to structural checks, the verifier’s static checks also require more complex analysis to ensure that untrusted payloads are sound at the semantic level.

Understanding Move’s Abstract Interpreter: Linear and Iterative Analysis

Move’s abstract interpreter is a framework for performing complex security analysis on bytecode through abstract interpretation. It makes the verification process more fine-grained and accurate, and each verifier built on it can define its own abstract state for the analysis.

When analysis begins, the abstract interpreter constructs a control flow graph (CFG) from the compiled module. Each basic block in the CFG maintains two states, a “pre-state” and a “post-state”. The “pre-state” is a snapshot of the program state before the basic block is executed, while the “post-state” describes the program state after the basic block is executed.

When the abstract interpreter does not encounter a jump (or loop) in the control flow graph, it follows a simple linear execution principle: each basic block is analyzed in turn, and the pre-state and post-state are calculated based on the semantics of each instruction in the block. The result is an accurate snapshot of the state of each basic block during program execution, which helps verify the security properties of the program.

Figure: Move abstract interpreter workflow

However, when the control flow contains a loop, the process becomes more complicated. A loop means that the control flow graph has a back edge. The source of the back edge carries the post-state of the current basic block, while the target of the back edge (the loop header) is a basic block that has already been analyzed and already has a pre-state. The abstract interpreter therefore needs to carefully merge the states of the two basic blocks connected by the back edge.

If it is found that the merged state is different from the existing pre-state of the loop header basic block, the abstract interpreter will update the state of the loop header basic block and restart the analysis from this basic block. This iterative analysis process will continue until the pre-state of the loop header basic block stabilizes. In other words, this process is repeated until the pre-state of the loop header basic block no longer changes between iterations. Reaching a fixed point indicates that loop analysis has been completed.
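The iteration described above can be sketched with a generic worklist algorithm. This is a minimal model under stated assumptions, not Move’s implementation: states are sets of facts, the join is set union, and the names `analyze`, `cfg`, and `transfer` are invented for this example.

```python
from typing import Callable, Dict, FrozenSet, List

State = FrozenSet[str]


def analyze(cfg: Dict[str, List[str]],
            transfer: Dict[str, Callable[[State], State]],
            entry: str) -> Dict[str, State]:
    """Iterate to a fixed point; returns the pre-state of every reached block."""
    pre: Dict[str, State] = {entry: frozenset()}
    worklist = [entry]
    while worklist:
        block = worklist.pop()
        post = transfer[block](pre[block])      # post-state of this block
        for succ in cfg[block]:
            if succ not in pre:
                # First visit: the incoming post-state becomes the pre-state.
                pre[succ] = post
                worklist.append(succ)
            else:
                merged = pre[succ] | post       # join = set union in this model
                if merged != pre[succ]:         # re-analyze only on a real change
                    pre[succ] = merged
                    worklist.append(succ)
    return pre


# BB1 -> BB2 -> BB3, with a back edge BB3 -> BB2 forming a loop.
cfg = {"BB1": ["BB2"], "BB2": ["BB3"], "BB3": ["BB2"]}
transfer = {
    "BB1": lambda s: s | {"a"},
    "BB2": lambda s: s,
    "BB3": lambda s: s | {"b"},
}
result = analyze(cfg, transfer, "BB1")
print(sorted(result["BB2"]))  # ['a', 'b']
```

Termination depends entirely on the check `merged != pre[succ]`: a block is queued again only when its pre-state really changed. Once the loop header’s pre-state stops growing, nothing is queued and the analysis ends.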

Sui IDLeak Verifier: Customized Abstract Interpretation Analysis

Unlike the original Move design, Sui introduces a unique object-centric global storage model. One notable feature of this model is that any data structure with the key ability (stored on-chain as an indexed object) must have a field of the ID type as its first field. The ID field is immutable and cannot be transferred to other objects, because each object must have a globally unique ID. To enforce these properties, Sui has built a set of custom analysis logic on top of the abstract interpreter.

The IDLeak verifier, also known as id_leak_verifier, works together with the abstract interpreter. It has its own AbstractDomain, called AbstractState. Each AbstractState consists of multiple AbstractValues, one per local variable. An AbstractValue tracks the state of its local variable in order to determine whether an ID variable is brand new.

During struct packing, the IDLeak verifier only allows a brand new ID to be packed into a struct. Through abstract interpretation analysis, the IDLeak verifier can meticulously track the local data flow state to ensure that no existing ID is transferred to other struct objects.
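The packing rule can be pictured with a small Python model. The two variants mirror the AbstractValue::Fresh and AbstractValue::Other values discussed in this article, while `check_pack` and `try_pack` are hypothetical helper names used only for illustration.

```python
from enum import Enum


class AbstractValue(Enum):
    FRESH = "Fresh"   # an ID that was just created and has not been used yet
    OTHER = "Other"   # anything else, including IDs that may already be in use


def check_pack(id_field: AbstractValue) -> None:
    """Reject packing an object struct unless its ID field is brand new."""
    if id_field is not AbstractValue.FRESH:
        raise ValueError("ID leak: only a fresh ID may be packed into an object")


def try_pack(id_field: AbstractValue) -> str:
    """Run the check and report the outcome as a string."""
    try:
        check_pack(id_field)
        return "accepted"
    except ValueError as err:
        return f"rejected ({err})"


print(try_pack(AbstractValue.FRESH))  # accepted
print(try_pack(AbstractValue.OTHER))  # rejected (...)
```

The real analysis has to know, at every pack instruction, whether the ID in question is still fresh, which is why it needs per-variable state that survives branches and loops.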

Sui IDLeak Verifier State Maintenance Inconsistency Problem

The IDLeak verifier integrates with the Move abstract interpreter by implementing the AbstractState::join function. This function plays an indispensable role in state management, especially in merging and updating state values.

A closer look at AbstractState::join and AbstractValue::join shows how they operate:

In AbstractState::join, the function takes another AbstractState as input and attempts to merge its local state with the local state of the current object. For each local variable in the input state, it compares the value of that variable with its current value in the local state (defaulting to AbstractValue::Other if not found). If these two values are different, it sets a “changed” flag as a basis for determining whether the final state merge result has changed, and updates the local variable value in the local state by calling AbstractValue::join.

In AbstractValue::join, the function compares its value with another AbstractValue. If they are equal, it returns the incoming value. If they are not equal, it returns AbstractValue::Other.

However, this state maintenance logic contains a hidden inconsistency problem. Although AbstractState::join returns a result indicating that the merged state has changed (JoinResult::Changed) based on the difference between the old and new values, the updated state value after merging may still remain the same.

This inconsistency problem is caused by the sequence of operations: the judgment of whether the state has changed in AbstractState::join occurs before the state update (AbstractValue::join), and this judgment does not reflect the true result of the state update.

In addition, AbstractValue::Other plays a decisive role in the result of AbstractValue::join. For example, if the old value is AbstractValue::Other and the new value is AbstractValue::Fresh, the updated value is still AbstractValue::Other: the new and old values differ, yet the state itself is unchanged after the update.
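The two join functions and the resulting mismatch can be reproduced in a few lines of Python. This is a model written from the prose description above, not Sui’s Rust source; the function names `value_join` and `state_join_buggy` and the use of integer indices for locals are assumptions of the sketch.

```python
from enum import Enum
from typing import Dict, Tuple


class AbstractValue(Enum):
    FRESH = "Fresh"
    OTHER = "Other"


State = Dict[int, AbstractValue]


def value_join(old: AbstractValue, new: AbstractValue) -> AbstractValue:
    """Equal values are kept; any disagreement collapses to OTHER."""
    return new if old == new else AbstractValue.OTHER


def state_join_buggy(old: State, new: State) -> Tuple[bool, State]:
    """Model of the pre-fix logic: 'changed' is decided from the inputs."""
    merged = dict(old)
    changed = False
    for local, new_value in new.items():
        old_value = old.get(local, AbstractValue.OTHER)
        if old_value != new_value:
            changed = True                                  # decided before the merge
        merged[local] = value_join(old_value, new_value)    # result is never compared
    return changed, merged


old_state = {0: AbstractValue.OTHER}
new_state = {0: AbstractValue.FRESH}
changed, merged = state_join_buggy(old_state, new_state)
print(changed)              # True
print(merged == old_state)  # True: reported as changed, yet nothing changed
```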

Figure: Example of inconsistency in state merging

This introduces an inconsistency: the result of merging the basic block states is reported as “changed”, but the merged state value itself has not changed. During abstract interpretation analysis, this inconsistency can have serious consequences. Recall how the abstract interpreter behaves when a loop appears in the control flow graph (CFG):

When encountering a loop, the abstract interpreter uses an iterative analysis method to merge the states of the loopback target basic block and the current basic block. If the merged state changes, the abstract interpreter will start the analysis again from the jump target.

However, if the merge operation incorrectly marks the result as “changed” while the values of the variables in the state have in fact not changed, the analysis is restarted endlessly, resulting in an infinite loop.

Exploiting the Inconsistency: Triggering an Infinite Loop in the Sui IDLeak Verifier

By exploiting this inconsistency, an attacker can construct a malicious control flow graph that lures the IDLeak verifier into an infinite loop. The crafted control flow graph consists of three basic blocks: BB1, BB2, and BB3. Note that we intentionally introduce a back edge from BB3 to BB2 to form a loop.

Figure: Malicious CFG and state, leading to an internal infinite loop in the IDLeak verifier

The process starts from BB2, where the AbstractValue of a specific local variable is set to ::Other. After BB2 is executed, control flow transfers to BB3, where the same variable is set to ::Fresh. At the end of BB3, a back edge jumps back to BB2.

During abstract interpretation of this example, the inconsistency described above plays a key role. When the back edge is processed, the abstract interpreter tries to join the post-state of BB3 (where the variable is “::Fresh”) with the pre-state of BB2 (where the variable is “::Other”). The AbstractState::join function notices the difference between the new and old values and sets the “changed” flag, indicating that BB2 needs to be re-analyzed.

However, the dominant behavior of “::Other” in AbstractValue::join means that after the AbstractValue is merged, the actual value of the BB2 state variable is still “::Other”, and the result of the state merge remains unchanged.

Therefore, once this loop starts, the verifier re-analyzes BB2 and all of its successor basic blocks (BB3 in this example) indefinitely. The infinite loop consumes all available CPU cycles and leaves the node unable to process new transactions, and the condition persists after the node is restarted.
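The runaway re-analysis of BB2 and BB3 can be simulated safely by capping the number of rounds. The sketch below is a simplified model of the loop, not the verifier itself: it tracks a single local variable, and `analyze_loop` and `max_rounds` are names invented for the example (the vulnerable code had no such cap).

```python
from enum import Enum
from typing import Tuple


class AbstractValue(Enum):
    FRESH = "Fresh"
    OTHER = "Other"


def value_join(old: AbstractValue, new: AbstractValue) -> AbstractValue:
    """Equal values are kept; any disagreement collapses to OTHER."""
    return new if old == new else AbstractValue.OTHER


def analyze_loop(max_rounds: int) -> Tuple[int, bool]:
    """Re-analyze BB2 -> BB3 -> BB2 with the pre-fix 'changed' logic.

    Returns (rounds_executed, reached_fixed_point).
    """
    bb2_pre = AbstractValue.OTHER                  # the local is OTHER on entry to BB2
    for round_no in range(1, max_rounds + 1):
        bb3_post = AbstractValue.FRESH             # BB3 sets the same local to FRESH
        changed = bb2_pre != bb3_post              # flag computed from the inputs
        bb2_pre = value_join(bb2_pre, bb3_post)    # still OTHER after the merge
        if not changed:
            return round_no, True                  # fixed point reached
    return max_rounds, False                       # cap hit without a fixed point


print(analyze_loop(max_rounds=10_000))  # (10000, False)
```

However large the cap, the result is the same: every round sees Other joined with Fresh, reports a change, and leaves the state exactly as it was.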

By exploiting this vulnerability, an attacker sends the validator node into an endless loop, like a hamster running on a wheel, unable to process new transactions. Therefore, we call this unique attack type the “hamster wheel” attack.

The “hamster wheel” attack can effectively bring Sui validators to a standstill, which in turn paralyzes the entire Sui network.

After understanding the cause and triggering process of the vulnerability, we constructed a concrete example in Move bytecode and used it to successfully trigger the vulnerability in a simulation of a real environment.

This example demonstrates how the vulnerability can be triggered in a real environment through carefully constructed bytecode. Specifically, an attacker can trigger an infinite loop in the IDLeak verifier with a payload of only about 100 bytes, consuming all CPU cycles of Sui nodes, effectively blocking new transaction processing and causing a denial of service on the Sui network.

The Persistent Harm of the “Hamster Wheel” Attack on the Sui Network

In Sui’s bug bounty program, the severity of a vulnerability is determined by the level of harm it poses to the entire network. To be rated “critical”, a vulnerability must be able to shut down the entire network and effectively block new transaction confirmations, so that a hard fork is required to fix the problem. If a vulnerability only takes down some of the network’s nodes, it is rated “medium” or “high” at most.

The “Hamster Wheel” vulnerability discovered by the CertiK Skyfall team can shut down the entire Sui network and requires an official new release to repair. Based on this level of harm, Sui ultimately rated it as “critical”. To understand why the impact of the “Hamster Wheel” attack is so severe, it is necessary to understand the architecture of Sui’s backend system, especially the full process by which a transaction that publishes or upgrades a contract reaches the chain.

Figure: Interaction overview of submitting transactions in Sui

Initially, user transactions are submitted through the front-end RPC, and after basic validation, they are passed to the backend service. The Sui backend service is responsible for further verifying the incoming transaction payload. After successfully verifying the user’s signature, the transaction is transformed into a transaction certificate (containing transaction information and Sui’s signature).

These transaction certificates are the basic building blocks of the Sui network and can be propagated between the validator nodes in the network. For contract publish or upgrade transactions, before they can be put on the chain, the validator nodes call the Sui verifier to check the structure and semantics of the contracts carried by these certificates. It is precisely at this critical verification stage that the “infinite loop” vulnerability can be triggered and exploited.

When this vulnerability is triggered, the verification process hangs indefinitely, effectively blocking the system’s ability to process new transactions and causing the network to shut down completely. To make matters worse, the condition persists after the node is restarted, which means that traditional mitigation measures are far from enough. Once triggered, this vulnerability causes “persistent harm” and leaves a lasting impact on the entire Sui network.

Sui’s Fix

After receiving feedback from CertiK, Sui promptly acknowledged the vulnerability and released a patch to address the critical flaw. The fix ensures consistency between state changes and changed flags, eliminating the critical impact caused by the “hamster wheel” attack.

To eliminate the inconsistency described above, Sui’s fix makes a small but critical adjustment to the AbstractState::join function. The patch removes the logic that decides the merge result before AbstractValue::join is executed. Instead, it executes AbstractValue::join first to merge the state, and then sets the changed flag by comparing the final updated result with the original state value (old_value).

This way, the result of the state merge will be consistent with the actually updated result, and there will be no infinite loop in the analysis process.
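The corrected ordering can be sketched in Python as follows. As before, this is a model of the described behavior rather than Sui’s actual patch; `state_join_fixed` and the integer keys for local variables are assumptions of the sketch.

```python
from enum import Enum
from typing import Dict, Tuple


class AbstractValue(Enum):
    FRESH = "Fresh"
    OTHER = "Other"


State = Dict[int, AbstractValue]


def value_join(old: AbstractValue, new: AbstractValue) -> AbstractValue:
    """Equal values are kept; any disagreement collapses to OTHER."""
    return new if old == new else AbstractValue.OTHER


def state_join_fixed(old: State, new: State) -> Tuple[bool, State]:
    """Merge first, then report 'changed' only if the stored value differs."""
    merged = dict(old)
    changed = False
    for local, new_value in new.items():
        old_value = old.get(local, AbstractValue.OTHER)
        joined = value_join(old_value, new_value)   # merge first
        if joined != old_value:                     # then compare with old_value
            changed = True
        merged[local] = joined
    return changed, merged


# The case that used to loop forever is now reported as unchanged.
print(state_join_fixed({0: AbstractValue.OTHER}, {0: AbstractValue.FRESH})[0])  # False
# A merge that really alters the state is still reported.
print(state_join_fixed({0: AbstractValue.FRESH}, {0: AbstractValue.OTHER})[0])  # True
```

Because a value can only move from Fresh to Other and never back, each local can change at most once, so the iteration is guaranteed to reach a fixed point in this model.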

In addition to fixing this specific vulnerability, Sui has deployed mitigations to reduce the impact of future verifier vulnerabilities. According to Sui’s response in the bug report, the mitigation involves a feature called the denylist.

“However, validators have a node configuration file that allows them to temporarily reject certain categories of transactions. This configuration can be used to temporarily disable the processing of publishes and package upgrades. Since this bug occurs when the Sui verifier runs before a publish or package upgrade tx is signed, and the denylist stops the verifier from running and discards the malicious tx, temporarily denylisting these tx types is a 100% effective mitigation (although it will temporarily interrupt service for those trying to publish or upgrade code).

By the way, we have had this TX denylist configuration file for some time, but we have also added a similar mechanism for certificates as a follow-up mitigation to the “validator infinite loop” vulnerability you reported earlier. With this mechanism, we will have greater flexibility in dealing with this type of attack: we will use the certificate denylist configuration to make validators forget bad certificates (breaking the infinite loop) and use the TX denylist configuration to prohibit publishing and upgrading, thus preventing the creation of new malicious attack transactions. Thank you for making us think about this issue!

Validators have a limited number of “ticks” (different from gas) to spend on bytecode verification before signing a transaction. If all the bytecode published in a transaction cannot be verified within that many ticks, the validator will refuse to sign the transaction, preventing it from being executed on the network. Previously, metering only applied to a selected set of complex verifier passes. To address this issue, we extended metering to every verifier to ensure that the work a validator performs during verification is bounded. We also fixed a potential infinite loop bug in the ID leak verifier.”

— Explanation from Sui developers about vulnerability fixes
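The tick-based metering mentioned in the quote can be pictured as a simple work budget. The sketch below is a hypothetical illustration: `TickMeter`, `MeterExceeded`, `verify_with_meter`, and the budget of 100 ticks are all invented for the example and do not reflect Sui’s real metering interface or costs.

```python
import itertools
from typing import Iterable


class MeterExceeded(Exception):
    """Raised when verification uses more ticks than its budget allows."""


class TickMeter:
    def __init__(self, budget: int) -> None:
        self.remaining = budget

    def charge(self, ticks: int = 1) -> None:
        """Deduct ticks and fail once the budget is overdrawn."""
        self.remaining -= ticks
        if self.remaining < 0:
            raise MeterExceeded("verification budget exhausted")


def verify_with_meter(blocks_to_analyze: Iterable[str], meter: TickMeter) -> bool:
    """Return True if analysis finished, False if the node should refuse to sign."""
    try:
        for _block in blocks_to_analyze:
            meter.charge()          # one tick per analyzed basic block
    except MeterExceeded:
        return False
    return True


# A well-behaved module finishes within its budget.
print(verify_with_meter(["BB1", "BB2", "BB3"], TickMeter(100)))              # True
# An analysis that never stabilizes is cut off instead of spinning forever.
print(verify_with_meter(itertools.cycle(["BB2", "BB3"]), TickMeter(100)))    # False
```

The point of the design is defense in depth: even if another verifier pass has an undiscovered non-termination bug, a budget of this kind turns an endless loop into a rejected transaction.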

In summary, the denylist allows validators to temporarily mitigate exploits of vulnerabilities in the verifier by disabling the publish and upgrade process, effectively preventing potential harm from malicious transactions. While the denylist mitigation is in effect, nodes give up their ability to publish or upgrade contracts in order to keep operating.

Summary

In this article, we shared the details of the “hamster wheel” attack technique discovered by the CertiK Skyfall team and explained how this new attack exploits a critical vulnerability to shut down the Sui network completely. We also examined Sui’s timely response to this critical issue and described both the fix for the vulnerability and the mitigations for similar vulnerabilities in the future.
