In-depth analysis of “Hamster Wheel,” the latest vulnerability in the Sui blockchain


The CertiK team previously discovered a series of denial-of-service vulnerabilities in the Sui blockchain. One of them is new and particularly impactful: it can leave Sui network nodes unable to process new transactions, effectively shutting down the entire network.

Last Monday, CertiK received a $500,000 bug bounty from Sui for discovering this major security vulnerability. CoinDesk, a leading U.S. crypto-industry media outlet, reported the event, and other major outlets followed with related coverage.

The vulnerability has been nicknamed the “hamster wheel.” Its attack method differs from previously known attacks: an attacker only needs to submit a payload of about 100 bytes to trigger an infinite loop in a Sui validator node, leaving it unable to respond to new transactions.

In addition, the damage caused by the attack persists after the network is restarted and can propagate automatically within the Sui network, leaving all nodes unable to process new transactions, like hamsters running endlessly on a wheel. We therefore call this unique type of attack the “hamster wheel” attack.

After discovering the vulnerability, CertiK reported it to Sui through Sui’s bug bounty program. Sui also responded promptly, confirming the severity of the vulnerability and taking appropriate measures to fix the problem before the mainnet launch. In addition to fixing this particular vulnerability, Sui has implemented preventive mitigation measures to reduce the potential damage that this vulnerability may cause.

To thank the CertiK team for its responsible disclosure, Sui awarded the team a $500,000 bounty.

The following section will reveal the details of this critical vulnerability from a technical perspective, elucidating the root cause and potential impact of the vulnerability.

Vulnerability details

The key role of verifiers in Sui

For Move-based blockchains such as Sui and Aptos, the defense against malicious payloads relies mainly on static verification. Static verification lets Sui check the validity of a user-submitted payload before a contract is published or upgraded. The verifier provides a series of checkers that ensure structural and semantic correctness, and only a contract that passes verification enters the Move virtual machine for execution.

Figure: Threat of malicious payloads on Move chains

Sui provides a new storage model and interface on top of the original Move virtual machine, so Sui runs a customized version of the Move virtual machine. To support the new storage primitives, Sui has introduced a series of additional, customized checks for the security verification of untrusted payloads, covering properties such as object safety and global storage access. Because these customized checks are tailored to Sui’s unique features, we call them Sui verifiers.

Figure: Sui’s check order for payloads

As shown in the figure above, most of the checks in the verifier perform structural-level security verification of the CompiledModule (which represents the user-provided contract payload). For example, the “duplication checker” ensures that there are no duplicate entries in the runtime payload, and the “limits checker” ensures that the length of each field in the runtime payload stays within the allowed limit.
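The two structural checks mentioned above can be pictured with a small Python sketch. It is only an illustration: `ToyModule`, the two limit constants, and the checker functions are hypothetical names invented here, not part of the Move or Sui code base.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModule:
    """Hypothetical stand-in for a compiled module; only two tables are modeled."""
    identifiers: list = field(default_factory=list)
    function_names: list = field(default_factory=list)


MAX_IDENTIFIER_LEN = 128  # assumed limit, for illustration only
MAX_FUNCTIONS = 1000      # assumed limit, for illustration only


def check_limits(module):
    """Return an error string if any table or entry exceeds its limit."""
    if len(module.function_names) > MAX_FUNCTIONS:
        return "too many functions"
    for name in module.identifiers:
        if len(name) > MAX_IDENTIFIER_LEN:
            return f"identifier too long: {name[:16]}..."
    return None


def check_duplicates(module):
    """Return an error string if the identifier table repeats an entry."""
    seen = set()
    for name in module.identifiers:
        if name in seen:
            return f"duplicate identifier: {name}"
        seen.add(name)
    return None


def verify_structure(module):
    """Run the checkers in order; the first error rejects the payload."""
    for checker in (check_limits, check_duplicates):
        error = checker(module)
        if error is not None:
            return error
    return None


print(verify_structure(ToyModule(["a", "b"], ["a"])))  # None (accepted)
print(verify_structure(ToyModule(["a", "a"], [])))     # duplicate identifier: a
```

Each checker is a cheap, single pass over the payload, which is why these structural checks are not where the problem described later arises.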

Beyond structural-level checks, the verifier’s static checks require more sophisticated analysis techniques to ensure the semantic robustness of untrusted payloads.

Understanding Move’s Abstract Interpreter: Linear and Iterative Analysis

Move’s abstract interpreter is a framework designed specifically for performing complex security analysis on bytecode through abstract interpretation. This mechanism makes the verification process more fine-grained and accurate, and it allows each verifier to define its own abstract state for analysis.

When it starts, the abstract interpreter builds a control flow graph (CFG) from the compiled module. Each basic block in these CFGs maintains a pair of states, the “pre-state” and the “post-state.” The pre-state is a snapshot of the program state before the basic block is executed, while the post-state describes the program state after the basic block is executed.

When the abstract interpreter does not encounter a jump (or loop) in the control flow graph, it follows a simple linear execution principle: each basic block is analyzed in turn, and the pre-state and post-state are calculated based on the semantics of each instruction in the block. The result is an accurate snapshot of the state of the program at the basic block level during execution, which helps verify the program’s security properties.

Figure: Move abstract interpreter workflow

However, the process becomes more complicated when the control flow contains loops. A loop means that the control flow graph contains a back edge. The source of the back edge is the post-state of the current basic block, while its target (the loop head) is the pre-state of a basic block that has already been analyzed, so the abstract interpreter must carefully merge the states of the two basic blocks connected by the back edge.

If the merged state differs from the existing pre-state of the loop head, the abstract interpreter updates the loop head’s state and restarts the analysis from that basic block. This iterative analysis continues until the pre-state of the loop head no longer changes between iterations. Reaching this fixed point indicates that the loop analysis is complete.
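The iteration described above can be sketched as a generic worklist algorithm in Python. This is a textbook formulation under simplifying assumptions, not Move’s actual implementation: the function `analyze`, the toy CFG, and the set-union domain are all made up for illustration.

```python
def analyze(cfg, entry, entry_state, transfer, join):
    """Iterate to a fixed point over a control flow graph.

    cfg:      dict mapping each block to its list of successors
    transfer: (block, pre_state) -> post_state
    join:     (old_pre_state, incoming) -> (merged_state, changed)
    Returns the stable pre-state of every reachable block.
    """
    pre = {entry: entry_state}
    worklist = [entry]
    while worklist:
        block = worklist.pop()
        post = transfer(block, pre[block])
        for succ in cfg[block]:
            if succ not in pre:
                # First visit: the incoming post-state becomes the pre-state.
                pre[succ] = post
                worklist.append(succ)
            else:
                # Revisit (for example via a back edge): merge, and
                # re-analyze the successor only if its pre-state changed.
                merged, changed = join(pre[succ], post)
                if changed:
                    pre[succ] = merged
                    worklist.append(succ)
    return pre


# Toy domain: the set of variables that may have been assigned so far.
cfg = {"B1": ["B2"], "B2": ["B3"], "B3": ["B2", "B4"], "B4": []}
assigns = {"B1": {"x"}, "B2": set(), "B3": {"y"}, "B4": set()}


def transfer(block, state):
    return state | frozenset(assigns[block])


def join(old, incoming):
    merged = old | incoming
    return merged, merged != old  # "changed" is computed from the merged result


result = analyze(cfg, "B1", frozenset(), transfer, join)
print(sorted(result["B2"]))  # ['x', 'y']
```

Termination depends on `join` reporting `changed` only when the merged state really differs from the old one; the back edge from B3 to B2 is re-processed once here and then the analysis stops.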

Sui IDLeak Verifier: Custom Abstract Interpretation Analysis

Unlike the original Move design, Sui introduces a unique object-centric global storage model. One significant feature of this model is that any data structure with the key ability (meaning it is stored on-chain under an index) must have an ID type as its first field. The ID field is immutable and cannot be transferred to other objects, because each object must have a globally unique ID. To enforce these properties, Sui has built custom analysis logic on top of the abstract interpreter.

The IDLeak verifier, also known as id_leak_verifier, works in conjunction with the abstract interpreter for analysis. It has its own unique AbstractDomain, called AbstractState. Each AbstractState is composed of multiple AbstractValues corresponding to local variables. The state of each ID variable is monitored through AbstractValue to track whether an ID variable is brand new.

During the process of struct packing, IDLeak verifier only allows a brand new ID to be packed into a struct. Through abstract interpretation analysis, IDLeak verifier can track the local data flow state in detail to ensure that no existing ID is transferred to other struct objects.
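As a rough illustration of this rule, the following Python sketch tracks abstract values on a stack for a straight-line instruction sequence. The instruction names (`new_id`, `copy_existing_id`, `pack_object`) are hypothetical and chosen only for this example; the real verifier works on Move bytecode and handles many more cases.

```python
from enum import Enum


class AbstractValue(Enum):
    FRESH = "Fresh"  # an ID that was just created and never used
    OTHER = "Other"  # anything else, including an ID taken from elsewhere


def check_id_leak(instructions):
    """Return True if every packed object receives a brand-new ID."""
    stack = []
    for op in instructions:
        if op == "new_id":
            stack.append(AbstractValue.FRESH)
        elif op == "copy_existing_id":
            stack.append(AbstractValue.OTHER)
        elif op == "pack_object":
            # The ID field must be fresh; reusing an existing ID is rejected.
            if not stack or stack.pop() is not AbstractValue.FRESH:
                return False
            stack.append(AbstractValue.OTHER)  # the packed object is not an ID
        else:
            raise ValueError(f"unknown toy instruction: {op}")
    return True


print(check_id_leak(["new_id", "pack_object"]))            # True
print(check_id_leak(["copy_existing_id", "pack_object"]))  # False
```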

Inconsistency in Sui IDLeak Verifier State Maintenance

The IDLeak verifier integrates with the Move abstract interpreter by implementing the AbstractState::join function. This function plays an essential role in state management, especially in merging and updating state values.

Let us examine AbstractState::join and AbstractValue::join in detail to understand how they operate:

In AbstractState::join, the function takes another AbstractState as input and attempts to merge its local state into the local state of the current object. For each local variable in the input state, it compares the incoming value with the variable’s current value in the local state (defaulting to AbstractValue::Other if not found). If the two values are not equal, it sets a “changed” flag, which later determines whether the overall merge is reported as changed, and then updates the local variable in the local state by calling AbstractValue::join.

In AbstractValue::join, the function compares its value with another AbstractValue. If they are equal, it returns the passed-in value. If not, it returns AbstractValue::Other.

However, this state maintenance logic contains a hidden inconsistency problem. Although AbstractState::join returns a result indicating that the merged state has changed (JoinResult::Changed) based on the differences between old and new values, the merged updated state value may still be unchanged.

This inconsistency problem is caused by the order of operations: the judgment of whether the state has changed in AbstractState::join occurs before the state update (AbstractValue::join), and this judgment does not reflect the true result of the state update.

In addition, AbstractValue::Other dominates the result of AbstractValue::join. For example, if the old value is AbstractValue::Other and the new value is AbstractValue::Fresh, the updated state value is still AbstractValue::Other. The old and new values differ, yet the updated state itself has not changed.
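The behavior described in the preceding paragraphs can be modeled in a few lines of Python. This is a model written from the prose description above, not a transcription of Sui’s Rust source; `join_value` and `buggy_join_state` are names chosen for this sketch.

```python
from enum import Enum


class AbstractValue(Enum):
    FRESH = "Fresh"
    OTHER = "Other"


def join_value(old, new):
    # Equal values are kept; any disagreement collapses to OTHER.
    return old if old == new else AbstractValue.OTHER


def buggy_join_state(local_state, incoming):
    """Model of the pre-fix join. Mutates local_state, returns the flag."""
    changed = False
    for var, new_value in incoming.items():
        old_value = local_state.get(var, AbstractValue.OTHER)
        # The flag is decided from the inputs, before the merge happens.
        changed |= new_value != old_value
        local_state[var] = join_value(old_value, new_value)
    return changed


state = {0: AbstractValue.OTHER}
before = dict(state)
reported = buggy_join_state(state, {0: AbstractValue.FRESH})
print(reported)         # True: the join claims the state changed
print(state == before)  # True: but the state is exactly what it was
```

The two printed lines are the whole inconsistency: a join that reports a change while leaving the state untouched.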

Figure: Example of the inconsistency in state join

This introduces an inconsistency: the result of merging basic block states is judged to have “changed,” but the merged state value itself has not changed. This inconsistency can have serious consequences during abstract interpretation. Recall how the abstract interpreter behaves when loops appear in the control flow graph (CFG):

When encountering a loop, the abstract interpreter uses an iterative analysis method to merge the state of the jump target basic block and the current basic block. If the merged state changes, the abstract interpreter will reanalyze from the jump target.

However, if the join operation is incorrectly reported as “changed” when the values of the variables inside the state have not actually changed, the abstract interpreter re-analyzes the loop endlessly, producing an infinite loop.

Exploiting the inconsistency to trigger an infinite loop in the Sui IDLeak verifier

With this inconsistency, an attacker can construct a malicious control flow graph that causes the IDLeak verifier to enter an infinite loop. This carefully constructed control flow graph consists of three basic blocks: BB1, BB2, and BB3. Note that we deliberately introduced a back edge from BB3 to BB2 to form a loop.

Figure: Malicious CFG and state that cause an infinite loop inside the IDLeak verifier

The process starts with BB2, where the AbstractValue of a specific local variable is set to ::Other. After BB2 executes, control flows to BB3, where the same variable is set to ::Fresh. At the end of BB3, a back edge jumps to BB2.

During abstract interpretation of this example, the inconsistency described earlier plays a key role. When the back edge is processed, the abstract interpreter tries to join the post-state of BB3 (variable is “::Fresh”) with the pre-state of BB2 (variable is “::Other”). The AbstractState::join function notices the difference between the new and old values and sets the “changed” flag, indicating that BB2 needs to be re-analyzed.

However, the dominant behavior of “::Other” in AbstractValue::join means that after the AbstractValue is merged, the actual value of the BB2 state variable is still “::Other”, and the result of the state merge has not changed.

Therefore, once this process starts, the verifier re-analyzes BB2 and all of its successor basic blocks (BB3 in this example) again and again, without end. The infinite loop consumes all available CPU cycles, leaving the node unable to process new transactions, and the situation persists even after the validator node is restarted.
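The runaway re-analysis can be reproduced with a small Python model of the BB2/BB3 loop. The join functions are modeled from the description earlier in this article, and `count_bb2_analyses` and its `max_rounds` cap are hypothetical helpers added so that the demonstration terminates; the real verifier had no such cap on this pass.

```python
from enum import Enum


class AbstractValue(Enum):
    FRESH = "Fresh"
    OTHER = "Other"


def join_value(old, new):
    # Equal values are kept; any disagreement collapses to OTHER.
    return old if old == new else AbstractValue.OTHER


def buggy_join_state(pre_state, incoming):
    """Model of the pre-fix join: 'changed' is decided before merging."""
    changed = False
    for var, new_value in incoming.items():
        old_value = pre_state.get(var, AbstractValue.OTHER)
        changed |= new_value != old_value
        pre_state[var] = join_value(old_value, new_value)
    return changed


def count_bb2_analyses(join_state, max_rounds=1000):
    """Analyze the loop BB2 -> BB3 -> BB2 and count how often BB2 is analyzed.

    The cap exists only so this demo stops; it is not part of the model.
    """
    pre_bb2 = {0: AbstractValue.OTHER}  # BB2 starts with the local as OTHER
    rounds = 0
    while rounds < max_rounds:
        rounds += 1
        post_bb3 = dict(pre_bb2)
        post_bb3[0] = AbstractValue.FRESH  # BB3 sets the local to FRESH
        # Back edge BB3 -> BB2: join BB3's post-state into BB2's pre-state.
        if not join_state(pre_bb2, post_bb3):
            break  # fixed point reached
    return rounds


print(count_bb2_analyses(buggy_join_state))  # 1000: only the cap stops it
```

However large the cap is made, the count always equals the cap, because BB2’s pre-state never moves away from `OTHER` while the join keeps reporting a change.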

Once the vulnerability is exploited, the validator node runs endlessly like a hamster on a wheel, unable to process new transactions. This is why we call this unique type of attack the “hamster wheel” attack.

The “hamster wheel” attack can effectively cause the Sui validator to stall, leading to the entire Sui network being paralyzed.

After understanding the cause and triggering process of the vulnerability, we built a concrete proof of concept in Move bytecode and successfully triggered the vulnerability in an environment that simulates production.

This example demonstrates how carefully constructed bytecode can trigger the vulnerability in a real environment. Specifically, an attacker can trigger an infinite loop in the IDLeak verifier and consume all CPU cycles of a Sui node with a payload of only about 100 bytes, effectively preventing new transactions from being processed and causing a denial of service on the Sui network.

The Persistent Harm of the “Hamster Wheel” Attack in the Sui Network

Sui’s vulnerability bounty program has strict rules for rating the severity of vulnerabilities, mainly based on the degree of harm to the entire network. A vulnerability that meets the “critical” rating must shut down the entire network and effectively block new transaction confirmations, and a hard fork is required to fix the problem; if the vulnerability can only cause some network nodes to deny service, it can be rated as a “medium” or “high” risk vulnerability at most.

The “hamster wheel” vulnerability discovered by the CertiK Skyfall team can shut down the entire Sui network and can only be fixed by an official release of a new version. Based on the degree of harm it causes, Sui ultimately rated the vulnerability as “critical.” To understand why the “hamster wheel” attack is so serious, it is necessary to understand the architecture of Sui’s backend system, especially the full process of handling on-chain publish or upgrade transactions.

Figure: Overview of the interactions involved in submitting a transaction in Sui

Initially, user transactions are submitted through frontend RPC, and after basic validation, they are passed to the backend service. The Sui backend service is responsible for further verifying the incoming transaction payload. After successfully verifying the user’s signature, the transaction is converted into a transaction certificate (containing transaction information and Sui’s signature).

These transaction certificates are the basic building blocks of the Sui network and can be propagated between the validator nodes in the network. For contract creation or upgrade transactions, before they go on-chain, validator nodes call the Sui verifier to check the validity of the contract structure and semantics carried by these certificates. It is precisely at this critical verification stage that the “infinite loop” vulnerability can be triggered and exploited.

When triggered, this vulnerability causes the verification process to hang indefinitely, effectively preventing the system from processing new transactions and shutting the network down completely. To make matters worse, the situation persists even after nodes restart, which means that traditional mitigation measures are far from enough. Once triggered, the vulnerability results in a “persistent disruption” that leaves a lasting impact on the entire Sui network.

Sui’s solution

After receiving CertiK’s report, Sui promptly confirmed the vulnerability and released a fix for this critical flaw. The fix ensures that the “changed” flag is consistent with the actual state change, eliminating the critical impact of the “hamster wheel” attack.

To eliminate the inconsistency described above, Sui’s fix makes a small but critical adjustment to the AbstractState::join function. The patch removes the logic that judged the merge result before executing AbstractValue::join. Instead, it first executes AbstractValue::join to merge the states, and then sets the “changed” flag by comparing the final updated result with the original state value (old_value).

By doing so, the result of the state merge will be consistent with the actual updated result, and no infinite loop will occur during the analysis process.
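The corrected ordering can be sketched in Python as follows. As with the other sketches, this is a model written from the description of the patch, not Sui’s actual Rust code; only `old_value` is a name taken from the text, and the rest are assumptions.

```python
from enum import Enum


class AbstractValue(Enum):
    FRESH = "Fresh"
    OTHER = "Other"


def join_value(old, new):
    # Equal values are kept; any disagreement collapses to OTHER.
    return old if old == new else AbstractValue.OTHER


def fixed_join_state(local_state, incoming):
    """Model of the patched join: merge first, then compare with old_value."""
    changed = False
    for var, new_value in incoming.items():
        old_value = local_state.get(var, AbstractValue.OTHER)
        merged = join_value(old_value, new_value)
        # The flag now reflects what the merge actually did to the state.
        changed |= merged != old_value
        local_state[var] = merged
    return changed


# The malicious case: OTHER joined with FRESH stays OTHER, so no change.
loop_head = {0: AbstractValue.OTHER}
print(fixed_join_state(loop_head, {0: AbstractValue.FRESH}))  # False

# A genuine change is still reported once, then the state is stable.
other_case = {0: AbstractValue.FRESH}
print(fixed_join_state(other_case, {0: AbstractValue.OTHER}))  # True
print(fixed_join_state(other_case, {0: AbstractValue.OTHER}))  # False
```

Because a value can only move from `FRESH` to `OTHER` and never back, each variable can cause at most one reported change, so the fixed-point iteration is guaranteed to stop in this model.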

In addition to fixing this specific vulnerability, Sui has deployed mitigation measures to reduce the impact of future verifier vulnerabilities. According to Sui’s response in the bug report, the mitigation involves a feature called the denylist.

“However, validators have a node configuration file that allows them to temporarily reject certain categories of transactions. This configuration can be used to temporarily disable the processing of publishes and package upgrades. Since this bug occurs when running the Sui verifier before signing a publish or upgrade transaction, and the denylist stops the verifier from running and drops the malicious transaction, temporarily denylisting these transaction types is a 100% effective mitigation (although it will temporarily disrupt service for those attempting to publish or upgrade code).”

“By the way, we’ve had this TX denylist configuration file for a while, but we’ve also added a similar mechanism for certificates as a follow-up mitigation for the “validator infinite loop” vulnerability you reported earlier. With this mechanism, we’ll have more flexibility in dealing with this type of attack: we’ll use the certificate denylist configuration to make validators forget about bad certificates (breaking the infinite loop), and use the TX denylist configuration to prevent the creation of new malicious attack transactions. Thank you for making us think about this problem!”

“Validators have a limited number of “ticks” (different from gas) for bytecode verification before signing transactions. If all the bytecode submitted in a transaction cannot be verified within this number of ticks, the validator will refuse to sign the transaction, preventing it from being executed on the network. Previously, this metering only applied to a selected group of complex verifiers. To address this issue, we extended the metering to each verifier, ensuring that the work a validator performs during verification is bounded by the tick budget. We also fixed a potential infinite loop error in the ID leak verifier.”

— From Sui Developer’s Notes on Bug Fixing
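The “ticks” idea in the note above amounts to giving every verification pass a work budget. The Python sketch below illustrates that idea only; `Meter`, `OutOfTicks`, `metered_fixed_point`, and `verify_or_reject` are hypothetical names, and the budget value is arbitrary rather than Sui’s real configuration.

```python
class OutOfTicks(Exception):
    """Raised when a verification pass exceeds its work budget."""


class Meter:
    def __init__(self, budget):
        self.remaining = budget

    def charge(self, ticks=1):
        self.remaining -= ticks
        if self.remaining < 0:
            raise OutOfTicks("verification budget exhausted")


def metered_fixed_point(step, meter):
    """Call step() until it reports no change, charging one tick per round."""
    rounds = 0
    while True:
        meter.charge()
        rounds += 1
        if not step():
            return rounds


def verify_or_reject(step, budget):
    """Sign only if verification finishes within the budget."""
    try:
        metered_fixed_point(step, Meter(budget))
        return "sign"
    except OutOfTicks:
        return "reject"


# A pass that keeps reporting "changed" forever is cut off and rejected.
print(verify_or_reject(lambda: True, budget=100))  # reject

# A pass that converges after three rounds is accepted.
rounds = iter([True, True, False])
print(verify_or_reject(lambda: next(rounds), budget=100))  # sign
```

With a budget in place, a future bug of the same kind would cost a bounded amount of work per malicious transaction instead of hanging the node.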

In summary, the denylist allows validators to temporarily mitigate the exploitation of verifier vulnerabilities by disabling the publish and upgrade process, effectively preventing the damage that malicious transactions could cause. While the denylist mitigation is in effect, nodes sacrifice their ability to publish or upgrade contracts in order to keep operating.

Summary

In this article, we shared the details of the “hamster wheel” attack technique discovered by the CertiK Skyfall team and explained how this new type of attack exploits a critical vulnerability to shut down the Sui network completely. We also examined Sui’s timely response to this critical issue and described the methods used to fix the vulnerability and to mitigate similar vulnerabilities in the future.
