Technical Guide: How to implement off-chain storage on Hyperledger Fabric

The author of this article is Deeptiman Pattnaik, a development engineer who has worked on a range of software projects involving Android, Go, Node.js, MongoDB, PHP, JavaScript, Beacon, Virtual Reality and Augmented Reality.

In this article, I explain why off-chain storage matters in Hyperledger Fabric and introduce offchaindata, an application written in Go that demonstrates off-chain storage with Hyperledger Fabric.


On-chain and off-chain transactions

Transaction flows in any blockchain platform are executed in two different layers. Transactions recorded on the distributed ledger of a blockchain network are considered on-chain transactions, while transactions executed outside the blockchain and stored in a separate database (such as CouchDB or StateDB) are considered off-chain transactions.

Blockchain is more than a storage solution

The purpose of blockchain technology is not simply to store large amounts of data, but to provide the current state of each transaction. The blockchain network maintains a transaction history log for every change made to the distributed ledger data. This is what distinguishes a blockchain from traditional database storage technologies, which are designed only to store data in an organized manner.

On-chain transaction issues

Generally, on-chain transactions take longer to complete. As large numbers of transactions queue up for execution in the network, blockchain performance declines and each transaction takes longer. On-chain transactions also carry significant costs, both per transaction and for storage space.

On-chain storage calculations

IBM has performed analysis to determine the significant costs involved with storage space. (Source: https://www.ibm.com/downloads/cas/RXOVXAPM)

Key figures

  • Bitcoin stores 1,400 transactions per block.
  • A Hyperledger block is 1 MB in size and holds 1,000 transactions.
  • Each blockchain transaction is 5 KB in size, and the network can generate 205 TPS (transactions per second).

Calculation of storage per TPS

The calculation assumes a company that operates an average of 8 hours a day, 240 days a year.

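With a 1 MB (1024 KB) block holding 1000 transactions, each transaction occupies 1.024 KB, so a sustained rate of 1 TPS requires:

(1 TPS / 1000 transactions per block) * 1024 KB per block * 3600 seconds / hour * 8 hours / day * 240 days / year = 7,077,888 KB per year = 6,912 MB = 6.75 GB = 0.00659 TB per year for each TPS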
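The arithmetic above can be checked with a few lines of Go. This is only a sketch: the function and parameter names are illustrative, and the inputs are the figures quoted in this section (1024 KB per block, 1000 transactions per block, 8 hours a day, 240 days a year).

```go
package main

import "fmt"

// storagePerTPSKB returns the ledger storage, in KB per year, consumed by a
// sustained rate of one transaction per second. The names are illustrative;
// the values passed in main are the figures quoted in the text.
func storagePerTPSKB(blockKB, txPerBlock, hoursPerDay, daysPerYear int64) int64 {
	const secondsPerHour = 3600
	secondsPerYear := secondsPerHour * hoursPerDay * daysPerYear
	// Multiply before dividing so the integer arithmetic stays exact.
	return blockKB * secondsPerYear / txPerBlock
}

func main() {
	kb := storagePerTPSKB(1024, 1000, 8, 240)
	mb := float64(kb) / 1024
	gb := mb / 1024
	tb := gb / 1024
	// prints "7077888 KB = 6912 MB = 6.75 GB = 0.00659 TB per TPS per year"
	fmt.Printf("%d KB = %.0f MB = %.2f GB = %.5f TB per TPS per year\n", kb, mb, gb, tb)
}
```

Changing the working hours or the block size in the call shows how quickly the yearly storage grows with the sustained transaction rate.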

On-chain financial costs of the blockchain

IBM also provides average enterprise-level costs for permissioned and permissionless blockchains such as Hyperledger and Ethereum.

IBM Hyperledger costs $1000 per month, and each active node costs an extra $1000, giving a total monthly cost of $6000.

The cost of each transaction is $1.30 for Bitcoin and $0.25 for Ethereum.

In a permissionless blockchain, the cost per transaction varies with the current value of the cryptocurrency.

The cost of a permissioned blockchain such as Hyperledger changes as the number of nodes increases.

Therefore, given these transaction and storage costs, non-transactional data (such as pictures, videos, PDFs and other documents) should not be stored in the blockchain ledger.

Off-chain transaction solutions

Off-chain transactions are not stored on every node. A party that wants to keep a particular transaction can store it in off-chain storage. Off-chain transactions also improve computing efficiency, because the computation is performed off-chain and is deterministic rather than consensus-based.

Design and implementation of off-chain storage

There are many off-chain databases that can be integrated with Hyperledger Fabric to store transaction details. The offchaindata application I built uses CouchDB as off-chain storage. It runs an event listener that connects to the peer as a gRPC client.

The event listener writes the KVWriteSet values of each block to the off-chain storage (CouchDB). MapReduce is then used to query the off-chain data from CouchDB.
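To make the write path more concrete, here is a minimal sketch of how a single write-set entry could be stored in CouchDB over its HTTP API. It is not the code from the offchaindata repository: the `KVWrite` type, the `docURL` and `saveWrite` helpers, the `offchaindb` database name and the sample value are all assumptions made for illustration. Only the `PUT /{db}/{docid}` request shape and the default port 5984 come from CouchDB itself.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// KVWrite is a simplified, hypothetical stand-in for one entry of a
// block's KVWriteSet: a ledger key and the JSON value written to it.
type KVWrite struct {
	Key   string
	Value []byte
}

// docURL builds the CouchDB document URL for a key: {base}/{db}/{docid}.
func docURL(base, db, key string) string {
	return base + "/" + url.PathEscape(db) + "/" + url.PathEscape(key)
}

// saveWrite stores one write as a CouchDB document with an HTTP PUT.
// CouchDB expects the body to be a JSON object. Updating an existing
// document would also need its current _rev, which this sketch leaves out.
func saveWrite(client *http.Client, base, db string, w KVWrite) error {
	if !json.Valid(w.Value) {
		return fmt.Errorf("value for key %q is not valid JSON", w.Key)
	}
	req, err := http.NewRequest(http.MethodPut, docURL(base, db, w.Key), bytes.NewReader(w.Value))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	// CouchDB answers 201 Created (or 202 Accepted) when the document is stored.
	if resp.StatusCode != http.StatusCreated && resp.StatusCode != http.StatusAccepted {
		return fmt.Errorf("couchdb returned %s for key %q", resp.Status, w.Key)
	}
	return nil
}

func main() {
	w := KVWrite{Key: "user 1", Value: []byte(`{"email":"alice@example.com"}`)}
	// Only the target URL is printed here; calling saveWrite requires a
	// reachable CouchDB instance.
	fmt.Println(docURL("http://localhost:5984", "offchaindb", w.Key))
}
```

In a real listener, `saveWrite` would be called once per entry of each block's write set, with error handling and retries suited to the deployment.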

GitHub: https://github.com/Deeptiman/offchaindata

What is MapReduce?

MapReduce is a programming model designed to process large amounts of data in parallel on large clusters.

MapReduce has two functions:

  • Map: produces a list of key-value pairs from a collection of documents.
  • Reduce: takes the key-value pairs emitted by the map step and combines them into a smaller set of key-value pairs.
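The two steps can be illustrated with a small in-memory sketch in Go. It runs sequentially in a single process, so it shows only the shape of the map and reduce functions, not the parallel execution a real MapReduce system provides. The `User` and `KV` types and the sample data are hypothetical.

```go
package main

import "fmt"

// User is a hypothetical document type used only for this illustration.
type User struct {
	Name  string
	Email string
}

// KV is one key-value pair emitted by the map step.
type KV struct {
	Key   string
	Value int
}

// mapEmails is the map step: it emits one (email, 1) pair for every
// document that has an email address.
func mapEmails(users []User) []KV {
	var out []KV
	for _, u := range users {
		if u.Email != "" {
			out = append(out, KV{Key: u.Email, Value: 1})
		}
	}
	return out
}

// reduceCount is the reduce step: it folds the emitted pairs into one total.
func reduceCount(pairs []KV) int {
	total := 0
	for _, p := range pairs {
		total += p.Value
	}
	return total
}

func main() {
	users := []User{
		{Name: "a", Email: "a@example.com"},
		{Name: "b", Email: "b@example.com"},
		{Name: "c"}, // no email, so the map step emits nothing for it
	}
	pairs := mapEmails(users)
	fmt.Println(len(pairs), reduceCount(pairs)) // prints "2 2"
}
```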

CouchDB uses MapReduce to filter the documents in a collection. The following example shows how MapReduce can be applied to a User model.

User model

[Figure: User model definition]

Document collection in CouchDB

[Figure: User details stored in CouchDB]

Next, we create a MapReduce function to query the emails in the collection.

Configure MapReduce for email

[Figure: MapReduce configuration for the email view]

Output: 3
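Since the original screenshot of the configuration is not reproduced here, the following Go sketch shows what such a view definition could look like as a CouchDB design document. The map function body and the view name `emailview` are assumptions for illustration. Only the design view name `emailviewdesign` comes from this article, and `_count` is one of CouchDB's built-in reduce functions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// view holds the JavaScript map function and the reduce function of one
// CouchDB view.
type view struct {
	Map    string `json:"map"`
	Reduce string `json:"reduce,omitempty"`
}

// designDoc mirrors the JSON layout of a CouchDB design document.
type designDoc struct {
	ID    string          `json:"_id"`
	Views map[string]view `json:"views"`
}

// emailDesignDoc builds a design document with a single view that emits
// every email and counts the rows with the built-in _count reducer.
// viewName is a hypothetical parameter; the article names only the design view.
func emailDesignDoc(designName, viewName string) designDoc {
	return designDoc{
		ID: "_design/" + designName,
		Views: map[string]view{
			viewName: {
				// Assumed map function: emit one row per document with an email field.
				Map:    "function (doc) { if (doc.email) { emit(doc.email, 1); } }",
				Reduce: "_count",
			},
		},
	}
}

func main() {
	doc := emailDesignDoc("emailviewdesign", "emailview")
	out, err := json.MarshalIndent(doc, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The printed JSON is the body that would be stored in the database under the document ID `_design/emailviewdesign`. With three user documents that each have an email field, the reduced view would return 3.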

Using parameters

Design view: emailviewdesign

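MapReduce view:

[Figure: MapReduce view]

Query the reduce function to count the total number of emails

[Figure: reduce query]

Output

[Figure: output of the reduce query]

Query the map function to list all emails

[Figure: map query]

Output

[Figure: output of the map query]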

This is how MapReduce works. We can create MapReduce functions for other fields in the same way and use them to query CouchDB.
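As a rough sketch of the two queries described above, the Go helper below builds the CouchDB view URLs. The host, the `offchaindb` database name and the `emailview` view name are hypothetical. The path layout `/{db}/_design/{design}/_view/{view}` and the `reduce` query parameter are part of CouchDB's HTTP API.

```go
package main

import (
	"fmt"
	"net/url"
)

// viewURL builds the URL of a CouchDB view query:
// {base}/{db}/_design/{design}/_view/{view}?reduce=true|false
// With reduce=true CouchDB returns the reduced value (here, the email count);
// with reduce=false it returns the rows emitted by the map function.
func viewURL(base, db, design, view string, reduce bool) string {
	q := url.Values{}
	q.Set("reduce", fmt.Sprintf("%t", reduce))
	return fmt.Sprintf("%s/%s/_design/%s/_view/%s?%s",
		base, url.PathEscape(db), url.PathEscape(design), url.PathEscape(view), q.Encode())
}

func main() {
	// Hypothetical host and database name.
	base, db := "http://localhost:5984", "offchaindb"
	fmt.Println(viewURL(base, db, "emailviewdesign", "emailview", true))
	fmt.Println(viewURL(base, db, "emailviewdesign", "emailview", false))
}
```

Sending a GET request to the first URL returns the total count, and the second returns one row per email.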

Conclusion

All queries are executed against the off-chain storage, and the on-chain ledger is bypassed entirely. This improves computational efficiency when querying large amounts of data. Off-chain queries involve no transaction costs, whereas similar on-chain queries incur higher transaction costs. In the case of a public blockchain, off-chain storage can also be used to hold sensitive private data, because not all participants in the blockchain network are aware of the additional, independent storage layer used to store it.

This has been an overview of the main use cases for off-chain storage in a blockchain network. Please check out the offchaindata application on GitHub and share your feedback.