Analysis | 3 key indicators to measure the performance of the blockchain network

Original author: MixBytes

Compilation: First Class (First.VIP)

Key indicators for measuring blockchain performance include:

1. Blockchain node indicators (number of blocks produced, number of transactions processed, processing time, completion time, etc.)

2. P2P subsystem indicators (number of hit / miss requests, number of active peers, amount and structure of P2P traffic, etc.)

3. System node indicators (CPU, memory, storage, network, etc.)

When everything is working fine, you usually don't worry about blockchain testing. We'll explain why performance evaluation is best not put off, which metrics to use, and how to get the most out of them. Let's find out.

TPS (transactions per second)

In the context of distributed systems, TPS is a very vague and capricious indicator.

The TPS indicator comes from distributed databases. It is usually measured with standardized transaction types or transaction sets (for example, a fixed mix of INSERTs, UPDATEs, and DELETEs alongside a constant number of SELECTs) and is tuned for a specific cluster or an individual machine. Such "synthetic" indicators do not reflect the true performance of the database or blockchain in question, because transaction processing times in such systems vary.

A consistency-oriented database (see the "CAP theorem") commits a transaction only after it has received a sufficient number of confirmations from other nodes, which is slow.

Note: An availability-oriented database considers a transaction successful as soon as it is written to disk. It serves the updated data immediately and is very fast (though the transaction may be rolled back later).
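The difference between the two commit rules can be illustrated with a small sketch. It is not taken from any particular database: the function name, the latency figures, and the assumption that the local write and the replica acknowledgements proceed in parallel are all hypothetical.

```python
def commit_latency_ms(local_write_ms, replica_ack_ms, quorum):
    """Time until a write is reported as committed.

    quorum is the number of replica acknowledgements required.
    quorum=0 models an availability-oriented store that reports
    success right after the local write (an illustrative assumption).
    """
    if quorum == 0:
        return local_write_ms
    if quorum > len(replica_ack_ms):
        raise ValueError("quorum is larger than the number of replicas")
    # The commit completes when the quorum-th fastest acknowledgement arrives.
    kth_ack = sorted(replica_ack_ms)[quorum - 1]
    return max(local_write_ms, kth_ack)


acks = [12.0, 85.0, 140.0, 310.0]  # hypothetical acknowledgement times, ms
ap_latency = commit_latency_ms(2.0, acks, quorum=0)
cp_latency = commit_latency_ms(2.0, acks, quorum=3)
print(ap_latency, cp_latency)
```

With the same hardware and the same transaction, the reported latency differs by almost two orders of magnitude depending only on the commit rule, which is why a bare TPS figure says little on its own.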

If a transaction updates only one data unit, TPS will be higher. If transactions update many data units (rows, indexes, files), they block each other. This is why we don't see any "TPS competition" between Oracle, MSSQL, PostgreSQL on one side and MongoDB, Redis, Tarantool on the other: their internal mechanisms and tasks are very different.

From our perspective, “Measuring Blockchain TPS” means performing a full range of performance measurements:

1) Under repeatable conditions

2) With a number of block-validating nodes close to the real one

3) Using various types of transactions:

- Typical for the blockchain under study (for example, transfer() for major cryptocurrencies)

- Loading the storage subsystem (each transaction makes substantial changes)

- Loading the network bandwidth (large transactions)

- Loading the CPU (heavy cryptographic transformations or calculations)

To talk about the coveted "TPS", we need to describe all network conditions, parameters, and benchmarking logic. In a blockchain, applying a transaction to the internal database does not mean that consensus will accept it.

Note: In PoW consensus, transactions are never finalized. If a transaction is included in a block on one machine, it does not mean that the entire network has accepted it (for example, another fork may win).

If the blockchain has an additional algorithm to ensure finality (e.g. EOS, Ethereum 2.0, Polkadot parachains using the GRANDPA finality gadget), then the processing time can be taken as the time between the moment the node "sees" the transaction and the next finalized block. Such "TPS" figures are very useful, but they are rarely published because they are lower than expected.
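As a rough sketch of that measurement, the function below computes the delay between a node first seeing a transaction and the first finalization event that covers the block containing it. The data layout (a list of `(height, timestamp)` pairs sorted by height) and the function name are assumptions made for illustration, not part of any node API.

```python
from bisect import bisect_left


def time_to_finality(seen_at, included_in, finalized):
    """Seconds from seeing a transaction to its block being finalized.

    seen_at      -- timestamp at which the node first saw the transaction
    included_in  -- height of the block that contains the transaction
    finalized    -- list of (height, timestamp) finalization events,
                    sorted by height; finalizing a block is assumed to
                    finalize all of its ancestors as well
    Returns None if no finalization event covers the block yet.
    """
    heights = [height for height, _ in finalized]
    i = bisect_left(heights, included_in)
    if i == len(finalized):
        return None
    return finalized[i][1] - seen_at


finalized = [(10, 100.0), (20, 130.0)]  # hypothetical events
delays = [time_to_finality(95.0, h, finalized) for h in (10, 12, 21)]
print(delays)
```

Dividing the number of transactions that reached finality by the length of the observation window then gives a "finalized TPS" figure that is comparable across runs.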

"TPS" involves many things. Please be skeptical and ask for all details.

Blockchain-specific indicators

Local TPS

The number of transactions processed and the maximum / average / minimum processing time (on the local node) are very convenient to measure, because the functions that perform these operations are usually clearly identifiable in the code. The transaction processing time equals the time required to update the state database. In an "optimistic" blockchain, for example, processed transactions may have been verified but not yet accepted by consensus. In this case, the node serves the updated data to the client right away (assuming there will be no fork).

This indicator is not very reliable: if another fork is selected as the main chain, the transaction data is rolled back, and the measured statistics must be rolled back as well. This is often overlooked during testing.

"Yesterday our blockchain reached 8000 TPS." Such numbers often appear in short project reports because they are easy to measure: all you need is a running node and a load script. In this setup there is no network latency to slow down consensus across the whole network.

Note: This indicator reflects the performance of the state database without any influence from the network. The number does not show the actual network throughput; it shows the limit the network would approach if consensus and the network were fast enough.
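A minimal sketch of local measurement that also honors the rollback caveat is shown below. The class name and its methods are hypothetical; durations are kept as integer microseconds, and measurements are grouped by block height so that a lost fork can be discarded.

```python
class LocalTxStats:
    """Per-block transaction processing times on a single node."""

    def __init__(self):
        self._by_block = {}  # block height -> list of durations, microseconds

    def record(self, height, duration_us):
        self._by_block.setdefault(height, []).append(duration_us)

    def rollback(self, from_height):
        """Drop measurements for blocks at or above from_height (lost fork)."""
        for height in [h for h in self._by_block if h >= from_height]:
            del self._by_block[height]

    def summary(self):
        durations = [d for ds in self._by_block.values() for d in ds]
        if not durations:
            return {"count": 0}
        return {
            "count": len(durations),
            "min": min(durations),
            "avg": sum(durations) / len(durations),
            "max": max(durations),
        }


stats = LocalTxStats()
stats.record(1, 2000)
stats.record(1, 4000)
stats.record(2, 10000)
before = stats.summary()
stats.rollback(2)  # block 2 turned out to be on a losing fork
after = stats.summary()
print(before, after)
```

Keeping the statistics keyed by block is the design choice that matters here: a flat running counter cannot be corrected once a fork is abandoned.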

Any blockchain transaction results in several atomic storage writes. For example, a Bitcoin payment transaction removes several old UTXOs (deletes) and adds new UTXOs (inserts). In Ethereum, a transaction executes a small piece of smart contract code and updates several key-value pairs.

Atomic storage writes are a very good indicator for finding storage subsystem bottlenecks and for telling low-level storage problems apart from internal logic problems.
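For a UTXO-style chain, counting atomic writes can be sketched as follows. The transaction layout (a dict with `inputs` and `outputs` lists) is a simplification assumed for the example and does not match the real Bitcoin serialization.

```python
from collections import Counter


def utxo_write_ops(tx):
    """Atomic storage writes caused by one UTXO-style transaction.

    Each spent input removes a UTXO (delete) and each output
    creates a new one (insert).
    """
    return Counter(delete=len(tx["inputs"]), insert=len(tx["outputs"]))


def total_write_ops(txs):
    total = Counter()
    for tx in txs:
        total.update(utxo_write_ops(tx))
    return total


txs = [  # hypothetical transactions
    {"inputs": ["a", "b"], "outputs": ["c"]},
    {"inputs": ["c"], "outputs": ["d", "e"]},
]
ops = total_write_ops(txs)
print(ops["delete"], ops["insert"])
```

Comparing writes per second with transactions per second shows whether a slowdown comes from the storage layer or from the logic above it.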

Blockchain nodes can be implemented in several programming languages, which makes the network more reliable. For example, Ethereum nodes have Rust and Go implementations. Keep this in mind when testing network performance.

Number of local blocks generated

This simple indicator shows the number of blocks produced by a particular validator. It depends on the consensus and is critical for assessing the "usefulness" of a single validator to the network.

Since validators earn money on each block, they make sure that their machines run stably and securely. With this indicator you can determine which validator candidates are the most qualified, protected, and ready to work in a public network with real user assets. It can be checked publicly: just download the blockchain and count the blocks.
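Counting blocks per producer over a downloaded chain takes only a few lines. The block layout below (dicts with a `producer` field) is an assumption for illustration; real chains expose the producer under different names.

```python
from collections import Counter


def blocks_per_producer(blocks):
    """Number of blocks produced by each validator."""
    return Counter(block["producer"] for block in blocks)


def production_share(blocks):
    """Fraction of all blocks produced by each validator."""
    counts = blocks_per_producer(blocks)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {producer: n / total for producer, n in counts.items()}


chain = [  # hypothetical chain fragment
    {"height": 1, "producer": "alice"},
    {"height": 2, "producer": "bob"},
    {"height": 3, "producer": "alice"},
    {"height": 4, "producer": "alice"},
]
counts = blocks_per_producer(chain)
shares = production_share(chain)
print(counts, shares)
```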

Finality & last irreversible block

Finality ensures that the transactions included in the blockchain will not be rolled back or replaced by another fork. This is how PoS networks prevent double-spend attacks and confirm cryptocurrency transactions for users.

A user can consider a transaction final when a finalized block on the chain includes it, not when a node has merely accepted it. To finalize a block, validators must receive the block over the P2P network and exchange signatures with each other. This is where the true speed of a blockchain shows, because the moment a transaction becomes final is what matters most to the user.

Finality algorithms also differ from one another, overlap, and are combined with the main consensus (see Casper in Ethereum, Last Irreversible Block in EOS, GRANDPA in Parity Polkadot, and their modifications, e.g. MixBytes RANDPA).

For networks where not every block is finalized, a useful indicator is the lag between the last finalized block and the current latest block. It shows how far validators fall behind in agreeing on the correct chain. If this gap is large, the finality algorithm needs more analysis and optimization.
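The gap itself is a simple subtraction, but it is worth tracking over time. The sketch below assumes periodic samples of `(head_height, finalized_height)` taken from a node; the function names and the alert threshold are hypothetical.

```python
def finality_lag(head_height, finalized_height):
    """Blocks between the latest block and the last finalized block."""
    if finalized_height > head_height:
        raise ValueError("finalized block cannot be ahead of the head")
    return head_height - finalized_height


def lag_report(samples, alert_threshold):
    """Summarize lag over (head_height, finalized_height) samples."""
    if not samples:
        raise ValueError("no samples")
    lags = [finality_lag(head, fin) for head, fin in samples]
    return {
        "max": max(lags),
        "avg": sum(lags) / len(lags),
        "alerts": sum(1 for lag in lags if lag > alert_threshold),
    }


samples = [(100, 98), (101, 98), (102, 98), (103, 102)]  # hypothetical
report = lag_report(samples, alert_threshold=3)
print(report)
```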

P2P layer

The peer-to-peer subsystem is the middle layer of a blockchain network and is often overlooked. It is the source of hard-to-explain delays in delivering blocks and transactions between validators.

When validators are few and located close together, and the peer list is hard-coded, everything works well and fast. However, once validators are geographically distributed and packet loss is simulated, we face a serious drop in "TPS".

For example, when EOS consensus was tested with an additional finality algorithm, increasing the number of validators to 80-100 and distributing them across four continents had little impact on finality.

At the same time, increased packet loss severely affected finality, which shows that additional P2P layer configuration is needed to better withstand packet loss (rather than high latency). Unfortunately, there are many different settings and factors, and only benchmarks can tell us how many validators a network can have while keeping a reasonably comfortable blockchain speed.

The configuration of the P2P subsystem is well covered in documentation; see, for example, [libp2p], the [Kademlia] protocol, or [BitTorrent].

Important P2P indicators include:

1) Inbound and outbound traffic

2) The number of successful / failed connections to peers

3) The number of times a previously cached data block was returned versus the number of times the request had to be forwarded further to find the required block (by analogy with cache hits / misses)

For example, a large number of misses when accessing data means that only a few nodes hold the requested data and they do not have time to distribute it to every node. The volume of received / sent P2P traffic makes it possible to identify nodes with network configuration or channel problems.
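A hit / miss counter for block requests can be sketched as below. The class and field names are hypothetical and not tied to libp2p or any other P2P library; a "hit" means the node answered from its own cache, a "miss" means the request had to be forwarded.

```python
from dataclasses import dataclass


@dataclass
class BlockRequestStats:
    """Counts block requests served locally versus forwarded to peers."""

    hits: int = 0
    misses: int = 0

    def record(self, served_locally):
        if served_locally:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def miss_ratio(self):
        total = self.hits + self.misses
        return self.misses / total if total else 0.0


stats = BlockRequestStats()
for served_locally in (True, False, False, True, False):
    stats.record(served_locally)
print(stats.hits, stats.misses, stats.miss_ratio)
```

A miss ratio that stays high across most nodes suggests that data is not propagating fast enough, as described above.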

System indicators of blockchain nodes

The standard system indicators of blockchain nodes are described in many sources, so we will cover them only briefly. They help find logical bottlenecks and errors.

CPU

CPU load shows the amount of computation performed by the processor. If the CPU load is high, the node is actively computing, using logic or the FPU (which is almost never used in blockchains). This can happen because nodes are checking electronic signatures, processing transactions with heavy cryptography, or performing complex calculations.

CPU load can be broken down into further metrics that point to bottlenecks in the code. For example: system time (time spent in kernel code), user time (time spent in user processes), io wait (time spent waiting for I/O from slow external devices such as disk or network), and so on.
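On Linux, this breakdown can be derived from two snapshots of the aggregate `cpu` line in `/proc/stat`. The sketch below parses such lines from strings so that it runs anywhere; the sample values are made up, and only the first eight counters are used.

```python
# Order of the first eight counters on the "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of ticks."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_shares(before, after):
    """Share of CPU time spent in each state between two snapshots."""
    delta = {name: after[name] - before[name] for name in FIELDS}
    total = sum(delta.values())
    if total == 0:
        return {name: 0.0 for name in FIELDS}
    return {name: ticks / total for name, ticks in delta.items()}


# Hypothetical snapshots taken some seconds apart.
before = "cpu  100 0 50 800 50 0 0 0 0 0"
after = "cpu  160 0 70 850 70 0 0 0 0 0"
shares = cpu_shares(parse_cpu_line(before), parse_cpu_line(after))
print(shares["user"], shares["system"], shares["iowait"])
```

A high `iowait` share points at disk or network, while a high `system` share points at kernel work such as syscalls or networking.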

RAM

Modern blockchains use key-value databases (LevelDB, RocksDB) that constantly keep "hot" data in memory. Any loaded service can suffer from memory leaks caused by bugs or by attacks on the node code. If memory consumption grows steadily or rises sharply, the most likely cause is a large number of state database keys, a large transaction queue, or an increase in the volume of messages between node subsystems.
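One simple way to tell steady growth from noise is to fit a straight line through periodic memory samples and look at its slope. The sketch below uses an ordinary least-squares fit; the sample format `(seconds, bytes)` and the function name are assumptions for illustration.

```python
def growth_rate(samples):
    """Least-squares slope of memory usage, in bytes per second.

    samples -- list of (timestamp_seconds, resident_bytes) pairs
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("all samples share the same timestamp")
    cov = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    return cov / var_t


samples = [(0, 100), (10, 200), (20, 300)]  # hypothetical readings
rate = growth_rate(samples)
print(rate)
```

A slope that stays positive over long runs with a constant load is a hint to look for a leak or an unbounded queue.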

Low memory usage indicates that block data limits or maximum transaction complexity could be increased.

A full node that responds to web clients depends on file cache metrics. When clients access various parts of the state database and transaction log, old blocks read from disk can displace newer blocks from the cache. This in turn reduces the response speed for clients.

Network

The main network indicators are traffic volume (in bytes), the number of network packets sent and received, and the packet loss rate. These indicators are often underestimated because blockchains cannot yet process transactions at 1 Gbit/s.

Currently, some blockchain projects let users share WiFi or provide services for storing and sending files or messages. When testing such networks, the volume and quality of network interface traffic become very important, because a congested network channel affects all other services on the machine.

Storage

The disk subsystem is the slowest component of any service and often causes serious performance problems. Excessive logging, an unexpected backup, inconvenient read / write patterns, and the sheer total size of the blockchain can all significantly slow a node down or drive up hardware requirements.

The way a blockchain transaction log works with the disk is similar to how various DBMSs use a write-ahead log (WAL). Technically, the transaction log can be viewed as the WAL of the state database.

Therefore, these storage metrics are important because they can identify bottlenecks in modern key-value databases. The number of read / write IOPS, maximum / minimum / average latency, and many other metrics can help optimize disk operations.
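As an illustration, the sketch below summarizes a batch of per-operation disk latencies collected over a fixed interval. The function names are hypothetical, and the nearest-rank method is used for the percentile, which is only one of several common definitions.

```python
import math


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = math.ceil(p / 100 * len(sorted_values))
    return sorted_values[max(rank, 1) - 1]


def disk_summary(latencies_ms, interval_s):
    """IOPS and latency statistics for operations observed in interval_s."""
    if not latencies_ms or interval_s <= 0:
        raise ValueError("need at least one sample and a positive interval")
    ordered = sorted(latencies_ms)
    return {
        "iops": len(ordered) / interval_s,
        "min_ms": ordered[0],
        "avg_ms": sum(ordered) / len(ordered),
        "p99_ms": percentile(ordered, 99),
        "max_ms": ordered[-1],
    }


# Hypothetical latencies: four fast operations and one slow outlier.
summary = disk_summary([1.0, 2.0, 3.0, 4.0, 50.0], interval_s=0.5)
print(summary)
```

The average hides the outlier far less well than the minimum does, which is why the maximum and a high percentile are reported next to it.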

Conclusion

In summary, we can group indicators into:

1) Blockchain node indicators (number of blocks produced, number of transactions processed, processing time, finalization time, etc.)

2) P2P subsystem indicators (number of hit / miss requests, number of active peers, amount and structure of P2P traffic, etc.)

3) System node indicators (CPU, memory, storage, network, etc.)

Each group is important because there may be subsystem errors that limit the operation of other components. The slowdown of even a small number of validating nodes can seriously affect the entire network.

In consensus and finality algorithms, the trickiest errors only show up under heavy transaction flow or when consensus parameters change. Analyzing them requires repeatable test conditions and complex load scenarios.

Original: The Key Metrics to Measure Blockchain Network Performance, https://hackernoon.com/how-to-measure-blockchain-network-performance-key-metrics-en1234u4