Interpretation of the Ethereum Yellow Paper (4): An in-depth look at transaction execution on Ethereum

This time we look at how transactions are executed on Ethereum. In this article, we'll learn about the transaction validation rules and why they exist, and then take an in-depth look at transaction execution and each step a node takes to process a transaction.

This is the fourth article in the Ethereum Yellow Paper series. The purpose of the series is to give everyone a clearer understanding of the Ethereum Yellow Paper and to help more people learn about Ethereum. If you missed the previous articles, here are the relevant links:

  • Interpretation of the Ethereum Yellow Paper (1/7)
  • Interpretation of the Ethereum Yellow Paper (2/7)
  • Interpretation of the Ethereum Yellow Paper (3/7)

(Disclaimer: This article is based on the Byzantium version 7e819ec of the Yellow Paper, dated October 20, 2019)

Introduction

In this series of articles, we have discussed how Ethereum works as a distributed computer and how users interact with Ethereum by sending transactions to the system (we also covered the concept of transaction fees).

In the first blog post, we learned about Ethereum's state transition function, and how Ethereum implements computer functions through continuous state transitions.

In simple terms, the state transition function uses the current state and transaction as input to calculate the next state.

– Ethereum state transition function –
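To make the idea concrete, here is a minimal sketch of a state transition function in Python. It assumes a toy state that maps account names to balances in Wei and a hypothetical `Transfer` transaction type; real Ethereum state and transactions carry much more information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """Hypothetical, heavily simplified transaction: a plain value transfer."""
    sender: str
    recipient: str
    value: int  # amount in Wei


def state_transition(state: dict, tx: Transfer) -> dict:
    """Take the current state and a transaction, and return the next state.

    The input state is left untouched; a new dictionary is returned.
    """
    if state.get(tx.sender, 0) < tx.value:
        raise ValueError("insufficient balance")
    next_state = dict(state)
    next_state[tx.sender] = next_state.get(tx.sender, 0) - tx.value
    next_state[tx.recipient] = next_state.get(tx.recipient, 0) + tx.value
    return next_state


state0 = {"alice": 100, "bob": 0}
state1 = state_transition(state0, Transfer("alice", "bob", 30))
```

Applying a list of transactions one after another with this function is all that "continuous state transitions" means in this simplified picture.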

Before we dive into how nodes in Ethereum execute transactions, let's talk about how a transaction is verified.

Transaction verification

Before executing a transaction, a node first checks that the transaction satisfies some basic (intrinsic) rules. If the transaction fails any of them, the node will not execute it.

These intrinsic validity rules are as follows:

  1. It is well-formed RLP
  2. It has a valid signature
  3. It has a valid nonce (equal to the current nonce of the sender's account)
  4. Its intrinsic gas cost is no greater than the gas limit set by the transaction
  5. The sender's account balance is greater than or equal to the prepayment required for the transaction

There is one more rule that is not an intrinsic rule of the transaction: if adding this transaction to the transactions already selected for a block would push their total gas limit above the block's gas limit, then the transaction cannot be included in that block alongside them.

Let us go through each rule to explain how it works and why it exists.

Transactions must be well-formed RLP

This rule is probably the most intuitive. RLP (Recursive Length Prefix) is the encoding method used to serialize objects in Ethereum. As with any other encoding, if an object is not encoded according to the RLP rules, it cannot be decoded, and the information of the original object cannot be recovered from the encoded data.

The purpose of this rule is to ensure that an Ethereum client can successfully decode a transaction after receiving it, so that it can be executed.
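For readers who want to see what RLP looks like in practice, here is a minimal encoder sketch in Python. It covers only the two kinds of item RLP knows about, byte strings and (nested) lists of items; the function names are my own and it is not taken from any client.

```python
def encode_length(length: int, offset: int) -> bytes:
    """Encode a payload length with the given prefix offset (0x80 or 0xC0)."""
    if length < 56:
        return bytes([offset + length])
    # Long payloads: the prefix records how many bytes the length itself takes.
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item) -> bytes:
    """RLP-encode a byte string or a (possibly nested) list of byte strings."""
    if isinstance(item, (bytes, bytearray)):
        # A single byte below 0x80 is its own encoding.
        if len(item) == 1 and item[0] < 0x80:
            return bytes(item)
        return encode_length(len(item), 0x80) + bytes(item)
    if isinstance(item, list):
        payload = b"".join(rlp_encode(element) for element in item)
        return encode_length(len(payload), 0xC0) + payload
    raise TypeError("RLP encodes only byte strings and lists")
```

For example, `rlp_encode(b"dog")` gives the four bytes `0x83 'd' 'o' 'g'`, and `rlp_encode([b"cat", b"dog"])` gives `0xc8` followed by the encodings of the two strings. A node that receives bytes which cannot be parsed back under these rules has nothing it can execute.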

The transaction must have a valid signature

Suppose you have a lot of ether in your Ethereum account, and someone tries to initiate a transaction that transfers money out of your account to themselves. How would you feel? You definitely don't want anyone to be able to impersonate you and steal your money, which is why transactions need signatures.

Ethereum uses asymmetric cryptography to ensure that only the actual controller of an account can initiate a transaction from it. At the same time, this cryptographic tool lets everyone else verify that the transaction was indeed initiated by the account's controller.

I won't discuss the details of ECDSA (the asymmetric cryptographic algorithm chosen by Ethereum), because we only need the most basic concepts.

In asymmetric cryptography, public and private keys exist in pairs. The private key must be kept completely confidential, while the public key can be shared with anyone; the private key is used to sign, and the signature can be verified with the corresponding public key. Signing a transaction you initiate on Ethereum is like signing a letter you wrote, the difference being that cryptographic signatures are much harder to forge than handwritten ones!

In Ethereum, an account address is derived from the account's public key. When sending a transaction, the sender signs it with the private key (remember v, r, and s, the values contained in the transaction?), and every node can then determine whether the transaction was really signed by the owner of the private key associated with the account.

There is no point in executing a transaction that lacks a valid signature, which is why having a valid signature is one of the intrinsic rules.

The transaction nonce and the account nonce must match

In Ethereum, an account's nonce represents the number of transactions sent from that account (for a contract account, the nonce is the number of contracts created by the account). Without a nonce, the same transaction could mistakenly be executed more than once (also known as a "replay attack"). Given the distributed nature of Ethereum, different nodes might try to include the same transaction in different blocks, resulting in duplicate transactions. Suppose a transfer you made to someone were mistakenly included twice, so that you paid twice. You would not be happy about it.

Whenever a user creates a new transaction, they must set a transaction nonce value that matches the current account's nonce value. When the transaction is executed, the node checks to see if the transaction nonce matches the account nonce.

If, for some reason, the same transaction is submitted to a node more than once, the repeated submissions will be considered invalid because the account nonce has already increased.

By requiring the transaction nonce to match the account nonce, Ethereum prevents replay attacks and ensures that a transaction executes and changes the state only once.
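The nonce check can be sketched in a few lines of Python. The dictionary of account nonces and the function name are illustrative assumptions, not part of any client API; the sketch also folds the check and the later increment into one step for brevity.

```python
def check_and_bump_nonce(account_nonces: dict, sender: str, tx_nonce: int) -> bool:
    """Accept the transaction only if its nonce equals the sender's account nonce.

    On success the account nonce is incremented, so the same transaction
    cannot be accepted a second time.
    """
    if tx_nonce != account_nonces.get(sender, 0):
        return False
    account_nonces[sender] = tx_nonce + 1
    return True


nonces = {"alice": 5}
first = check_and_bump_nonce(nonces, "alice", 5)   # accepted
replay = check_and_bump_nonce(nonces, "alice", 5)  # same transaction again: rejected
```

The second call fails because the account nonce has moved on to 6 while the replayed transaction still carries 5.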

The intrinsic cost of the transaction must not exceed the gas limit set by the transaction

In the previous blog post, we explained why using Ethereum requires payment and introduced the concept of gas. In general, every transaction has a gas cost associated with it, and that cost consists of two parts: the intrinsic cost and the execution cost.

The execution cost depends on how many Ethereum Virtual Machine (EVM) resources are used to process the transaction. The more operations required to carry out a transaction, the higher the execution cost.

The intrinsic cost is determined by the payload of the transaction, which falls into one of the following three cases:

  • If the transaction creates a smart contract, the payload is the EVM code that creates the smart contract.
  • If the transaction calls a function of a smart contract, the payload is the input data of the message call.
  • If the transaction simply transfers money between two accounts, the payload is empty.

Let Nzeros be the number of bytes in the transaction payload that are equal to zero, and Nnonzeros the number of bytes in the payload that are not zero. The intrinsic cost of the transaction can then be calculated with the following formula (Chapter 6.2, Equations 54, 55 and 56), where Gtxcreate is added only if the transaction creates a contract:

Intrinsic cost = Gtransaction + Gtxdatazero * Nzeros + Gtxdatanonzero * Nnonzeros + Gtxcreate

Appendix G of the Yellow Paper contains a fee schedule with the costs associated with creating and executing a transaction. The entries related to the intrinsic cost are as follows (all in units of gas, not Wei):

  • Gtransaction = 21,000 gas
  • Gtxcreate = 32,000 gas
  • Gtxdatazero = 4 gas
  • Gtxdatanonzero = 68 gas (will change to 16 gas with the Istanbul upgrade)

Once we understand what the intrinsic cost is, it is clear why a transaction is considered invalid when its intrinsic cost is higher than its gas limit. The gas limit specifies the maximum amount of gas that can be consumed when the transaction is executed; if we know before starting that the intrinsic cost alone is higher than the gas limit, there is no reason to execute the transaction.
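The intrinsic cost formula translates directly into code. The sketch below uses the pre-Istanbul fee schedule values quoted above, expressed in units of gas; the function name and the `is_contract_creation` flag are my own choices for illustration.

```python
# Fee schedule values (units of gas), as listed in Appendix G of the
# Byzantium version of the Yellow Paper.
G_TRANSACTION = 21_000
G_TXCREATE = 32_000
G_TXDATAZERO = 4
G_TXDATANONZERO = 68  # becomes 16 with the Istanbul upgrade


def intrinsic_gas(data: bytes, is_contract_creation: bool) -> int:
    """Compute the intrinsic gas of a transaction from its payload."""
    n_zeros = data.count(0)            # bytes equal to zero
    n_nonzeros = len(data) - n_zeros   # all other bytes
    gas = G_TRANSACTION + G_TXDATAZERO * n_zeros + G_TXDATANONZERO * n_nonzeros
    if is_contract_creation:
        gas += G_TXCREATE
    return gas


def passes_intrinsic_gas_check(gas_limit: int, data: bytes, is_contract_creation: bool) -> bool:
    """The transaction is only worth executing if its gas limit covers the intrinsic gas."""
    return gas_limit >= intrinsic_gas(data, is_contract_creation)
```

A plain transfer has an empty payload, so its intrinsic gas is exactly 21,000. A payload of one zero byte and one non-zero byte costs 21,000 + 4 + 68 = 21,072 gas.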

The sender's account balance must be greater than or equal to the prepayment required for the transaction

The transaction prepayment is the amount, in Wei, that the sender's account must be able to cover before the transaction is executed: the maximum gas fee plus the value being transferred.

We can calculate the transaction prepayment with the following formula:

Prepayment = gasLimit * gasPrice + value

The gas limit of a transaction is the maximum amount of gas the sender is willing to spend on the transaction; the gas price is the price, in Wei, of each unit of gas; and the value of the transaction is the amount of Wei to be transferred to the recipient of the message (for example, the amount of a transfer) or, for a contract creation, the endowment of the contract being created. If you want to know more about what gas is and why executing a transaction costs gas, check out our previous blog post.

Because this amount must be available before the transaction is executed, there is no point in executing a transaction whose sender's balance is less than the prepayment.
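As a worked example of the formula above, here is a small Python sketch. The function names are illustrative; all amounts are in Wei, except the gas limit, which is in units of gas.

```python
def prepayment(gas_limit: int, gas_price: int, value: int) -> int:
    """Prepayment = gasLimit * gasPrice + value (result in Wei)."""
    return gas_limit * gas_price + value


def can_afford(balance: int, gas_limit: int, gas_price: int, value: int) -> bool:
    """The sender's balance must cover the whole prepayment."""
    return balance >= prepayment(gas_limit, gas_price, value)


GWEI = 10**9
ETHER = 10**18

# Sending 1 ether with a gas limit of 21,000 and a gas price of 20 Gwei.
required = prepayment(gas_limit=21_000, gas_price=20 * GWEI, value=1 * ETHER)
```

Here `required` is 1,000,420,000,000,000,000 Wei, that is 1 ether for the transfer plus 0.00042 ether as the maximum gas fee. An account holding exactly 1 ether would fail this check.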

The gas limit of the transaction must be less than or equal to the gas limit of the block

This rule is not an intrinsic rule, but it is a basic requirement that a node must follow when selecting transactions to include in a block. The block gas limit is the upper bound on the total gas of all the transactions that can be "placed" in a block.

When selecting transactions to include, a node must ensure that the total gas of the transactions in the block still does not exceed the block gas limit after the new transaction is added. For a transaction to be included, its gas limit plus the sum of the gas limits of the other transactions must be less than or equal to the block gas limit. Of course, a transaction that cannot fit in the current block still has a chance to be included in a later block.
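The selection rule described above can be sketched as a simple greedy loop in Python. This follows the rule as stated in this article (summing gas limits); the function name is hypothetical, and real clients use more elaborate strategies, such as ordering pending transactions by gas price.

```python
def select_transactions(pending_gas_limits: list, block_gas_limit: int) -> list:
    """Greedily pick pending transactions (by index) that still fit in the block.

    A transaction is included only if its gas limit plus the gas limits of
    the transactions already selected stays within the block gas limit.
    """
    selected = []
    reserved = 0  # sum of the gas limits of the selected transactions
    for index, gas_limit in enumerate(pending_gas_limits):
        if reserved + gas_limit <= block_gas_limit:
            selected.append(index)
            reserved += gas_limit
        # Otherwise the transaction is skipped; a later block may include it.
    return selected


picked = select_transactions([6_000_000, 5_000_000, 3_000_000], 10_000_000)
```

With a block gas limit of 10,000,000, the second transaction does not fit after the first one, but the third one does, so `picked` is `[0, 2]`.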

Transaction execution

After validating the transaction, it is time to execute it. In Ethereum, executing transactions changes the state: several transactions are packaged into a block, and each block contains a list of transactions; when those transactions are executed in order, a new valid state is produced.

A transaction is executed as follows:

  1. Increment the nonce of the sender's account by 1
  2. Deduct the transaction prepayment from the sender's account (gasLimit * gasPrice)
  3. Determine the gas available for executing the transaction (gasLimit – intrinsic cost)
  4. Execute the actions contained in the transaction (transfer, message call, or contract creation)
  5. Refund the sender for SELFDESTRUCT and SSTORE operations
  6. Refund the sender for any unused gas
  7. Pay the transaction fee to the beneficiary's account (usually the miner who mined the block containing the transaction)

Increase the nonce value of the sender's account

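Whenever a transaction is executed, the nonce of the sender's account is incremented. This happens at the very beginning of the execution. According to the Yellow Paper (Section 6.2), this change is irrevocable: even if the execution later fails (for example, by running out of gas), the nonce stays incremented. Only a transaction that fails validation, and is therefore never executed, leaves the nonce untouched.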

Deduct the transaction prepayment from the sender's account

The prepaid amount is deducted from the sender's account balance. The mechanism is simple: the sender pays up front for the maximum gas cost they have agreed to (gasLimit * gasPrice). The value itself is transferred later, as part of executing the transaction.

Calculate the gas that can be used to execute the transaction

The gas available for executing the transaction is what remains after the intrinsic cost is subtracted from the transaction's gas limit.

Execute the actions included in the transaction

Executing a transaction generally involves running a series of EVM operations; the only case that requires no EVM operations at all is a plain transfer.

Each EVM operation has a corresponding gas cost; during the execution of the transaction, every time an EVM operation is performed, its gas cost is deducted from the available gas. Execution continues until one of the following two conditions occurs:

  • The available gas is exhausted and execution fails
  • Execution finishes with some gas left over, or with exactly zero gas remaining
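The gas accounting described above can be sketched as a small loop in Python. This is only an illustration of the bookkeeping, not an EVM: the operations do nothing, the `OP_COSTS` table lists just three example opcodes, and the costs shown should be checked against Appendix G for the version of the Yellow Paper you are reading.

```python
class OutOfGas(Exception):
    """Raised when the available gas cannot pay for the next operation."""


# Example gas costs for a few opcodes (illustrative subset).
OP_COSTS = {"ADD": 3, "MUL": 5, "SLOAD": 200}


def run(ops: list, available_gas: int) -> int:
    """Charge gas for each operation in turn.

    Returns the gas left when execution finishes (possibly zero),
    or raises OutOfGas if the gas runs out before the end.
    """
    for op in ops:
        cost = OP_COSTS[op]
        if cost > available_gas:
            raise OutOfGas(f"not enough gas for {op}")
        available_gas -= cost
    return available_gas


gas_left = run(["ADD", "MUL"], available_gas=10)
```

In this example `gas_left` is 2. Running the same operations with only 8 units of gas finishes with exactly zero left, and anything less raises `OutOfGas`.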

Refund the sender for SELFDESTRUCT and SSTORE operations

In Ethereum, the SELFDESTRUCT opcode is used to destroy smart contracts that are no longer needed. Each time a contract is destroyed, a refund of 24,000 gas is credited.

Similarly, when the SSTORE opcode is used to set a non-zero storage value to 0 (effectively deleting the value), a refund of 15,000 gas is credited for every value cleared.

One interesting thing about refunds is that they are capped: in the Yellow Paper, the refund cannot exceed half of the gas used by the transaction. This cap ensures that miners can compute an upper bound on the computation time required to execute a transaction. (More details on gas fees and refunds can be found in Ethereum's Design Rationale document.)

Another important point is that refunds are applied only after the operations contained in the transaction have finished executing. Refunded gas therefore cannot be spent during execution, which rules out transactions that never run out of gas.
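A minimal sketch of the refund cap in Python, assuming the rule from the Yellow Paper version this article is based on (the refund is limited to half of the gas used, rounded down). The function and parameter names are my own.

```python
def capped_refund(gas_used: int, refund_counter: int) -> int:
    """Return the refund actually granted, in units of gas.

    refund_counter is the total refund accumulated during execution
    (for example from SELFDESTRUCT and SSTORE); it is capped at half
    of the gas used by the transaction, rounded down.
    """
    return min(refund_counter, gas_used // 2)
```

For example, a transaction that used 100,000 gas and accumulated a 24,000 gas refund receives all of it, whereas one that used only 30,000 gas receives just 15,000.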

Refund the sender of the transaction for any unused gas

If the gas prepaid for the transaction exceeds the gas actually used, the sender gets the remaining gas back after the transaction is executed.

Pay the transaction fee to the beneficiary's account

All the gas used to execute the transaction counts as the transaction fee and is earned by the miner. This mechanism gives miners an incentive to keep producing blocks and to keep contributing to the security of the network.
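The last three steps (refunds, return of unused gas, and payment of the fee) can be put together in one short Python sketch. It assumes the refund cap of half the gas used from the Yellow Paper version this article is based on; the function name and its parameters are hypothetical.

```python
def settle(gas_limit: int, gas_price: int, gas_remaining: int, refund_counter: int):
    """Split the prepaid gas fee between the sender and the beneficiary.

    gas_remaining is the gas left when execution ended; refund_counter is
    the refund accumulated during execution. Returns a tuple
    (Wei returned to the sender, Wei paid to the beneficiary).
    """
    gas_used = gas_limit - gas_remaining
    refund = min(refund_counter, gas_used // 2)  # capped refund
    gas_returned = gas_remaining + refund        # unused gas plus refund
    sender_refund_wei = gas_returned * gas_price
    beneficiary_fee_wei = (gas_limit - gas_returned) * gas_price
    return sender_refund_wei, beneficiary_fee_wei


to_sender, to_beneficiary = settle(
    gas_limit=100_000, gas_price=2, gas_remaining=40_000, refund_counter=15_000
)
```

In this example the sender prepaid 200,000 Wei, gets 110,000 Wei back (40,000 unused gas plus a 15,000 gas refund, at 2 Wei per gas), and the beneficiary receives the remaining 90,000 Wei. The two amounts always add up to the prepaid gas fee.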

Conclusion

In this article, we discussed transaction validation and execution in detail (Chapter 6 of the Yellow Paper). The specific types of transactions (contract creation and message call) are covered in Chapters 7 and 8, and I will keep writing blog posts on those chapters.

I believe that the best way to thoroughly understand the details of transaction validation and execution is to read the source code of an Ethereum client that implements the protocol. As a contributor to Besu, I am familiar with its implementation, so I suggest looking at its source code even if you are not very proficient in Java. These two files are a good place to start:

MainnetTransactionValidation.java and MainnetTransactionProcessor.java .

Again, if you find any errors or improvements in the text, or have questions, please let me know in the comments as usual.

See you next time!

Original link: https://www.lucassaldanha.com/transaction-execution-ethereum-yellow-paper-walkthrough-4-7/
Author: Lucas Saldanha
Translation & proofreading: IAN LIU & A sword

This article was translated and republished by EthFans with the authorization of the original author.

(This article is from EthFans, the Ethereum fan community; reprinting without the author's permission is strictly forbidden.)