Babbitt column | Thinking about blockchain from the perspective of a database

I. Introduction

Many newcomers do not understand what a blockchain is. They do not understand why it can be tied to Bitcoin, to invoices, to banks, to supply chain finance, and even to deposit certificates. And if you look up the definition of blockchain in a book, it is hard to make sense of.

Here I offer one way to understand it: look at the blockchain from the perspective of a database. I think it is a good angle.

II. Bitcoin

Take Bitcoin as an example. One of the questions newcomers ask me most often is: what exactly is Bitcoin? Is it tangible or intangible? If it is intangible, how can it carry value in trading and transfers?

If we explain Bitcoin in terms of consensus, social recognition, and so on, the answer sounds grand but empty. If we look at it from the perspective of the database, it becomes much more concrete.

Whenever a friend asks me this question, I answer like this: a Bitcoin full node currently stores several hundred gigabytes of data. Those hundreds of gigabytes are the carrier of Bitcoin. You can simply think of this data as Bitcoin itself: every Bitcoin transaction or transfer is, in essence, a change to this data.

III. Understanding the blockchain from the database

The term blockchain grew out of Bitcoin. From a technical perspective, a blockchain is simply a block + chain data structure.

A block is roughly composed of the following parts: transactions, the block structure, and a random number (the nonce). The boundary between one block and the next is not set by elapsed time, by the number of transactions in the block, or by the size of the block. It is set by finding the random number through proof of work (PoW). If the random number is found within 1 minute, the block is produced in 1 minute; if the random number for the next block takes 30 minutes to find, the next block arrives 30 minutes later. In other words, blocks are delimited by proof of work. With PoW, there is a block! Without PoW, there is no block! Without blocks, there is naturally no blockchain!
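To make the "random number" concrete, here is a minimal sketch of proof of work in Python. It is an illustration under simplifying assumptions, not Bitcoin's actual format: the field names (`prev_hash`, `transactions`, `nonce`), the JSON encoding, and the difficulty rule "the hash must start with three zero hex digits" are all hypothetical choices made for this example.

```python
import hashlib
import json


def block_hash(prev_hash, transactions, nonce):
    """Hash the block contents together with a candidate random number."""
    payload = json.dumps(
        {"prev_hash": prev_hash, "transactions": transactions, "nonce": nonce},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def mine_block(prev_hash, transactions, difficulty=3):
    """Try nonces one by one until the hash meets the (toy) difficulty rule."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = block_hash(prev_hash, transactions, nonce)
        if digest.startswith(prefix):
            return {
                "prev_hash": prev_hash,
                "transactions": transactions,
                "nonce": nonce,
                "hash": digest,
            }
        nonce += 1


def is_valid_block(block, difficulty=3):
    """Anyone can re-check the work with a single hash computation."""
    expected = block_hash(block["prev_hash"], block["transactions"], block["nonce"])
    return expected == block["hash"] and expected.startswith("0" * difficulty)


# Two toy blocks: the second one points at the hash of the first.
genesis = mine_block("0" * 64, ["coinbase -> alice: 50"])
second = mine_block(genesis["hash"], ["alice -> bob: 10"])
```

How long `mine_block` runs depends only on how quickly a suitable nonce turns up, which is why the interval between blocks is not fixed. Changing any transaction afterwards changes the hash, so `is_valid_block` fails until the work is redone.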

That is the definition of the block. Now let us look at the definition of the chain: blocks are joined one after another, according to specific rules, to form a chain. In general, forming the chain involves the following steps:

1. Selection of the new block: generally, whichever node finds the random number first has its block adopted; but if several nodes find one at the same time, a choice has to be made.
2. Network communication: the newly generated block is broadcast quickly so that it reaches as many nodes as possible, as soon as possible.
3. Formation of the longest chain: the longest chain is established, and the next block is built on top of it.

Of course, there are finer-grained components, but these three are the main steps. In fact, forming the chain is the process of converging the differing data held by different nodes across the network into a single, identical set of data.
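The convergence described above can be sketched as a "longest valid chain wins" rule. This is a simplified illustration: the helper names (`make_block`, `extend`, `is_linked`, `choose_chain`) are hypothetical, proof-of-work checking is left out to keep the example short, and real Bitcoin nodes compare accumulated work rather than the raw number of blocks.

```python
import hashlib


def make_block(prev_hash, data):
    """Build a toy block whose hash commits to its predecessor and its data."""
    digest = hashlib.sha256((prev_hash + data).encode("utf-8")).hexdigest()
    return {"prev_hash": prev_hash, "data": data, "hash": digest}


def extend(chain, data):
    """Return a new chain with one more block appended."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    return chain + [make_block(prev_hash, data)]


def is_linked(chain):
    """Check that every hash is correct and points at the previous block."""
    for i, block in enumerate(chain):
        if block["hash"] != make_block(block["prev_hash"], block["data"])["hash"]:
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True


def choose_chain(local_chain, received_chains):
    """Keep the longest well-formed chain among our own and those we received."""
    best = local_chain
    for candidate in received_chains:
        if is_linked(candidate) and len(candidate) > len(best):
            best = candidate
    return best


# Two nodes start from the same block and briefly disagree.
common = extend([], "genesis")
node_a = extend(common, "block found by A")
node_b = extend(extend(common, "block found by B"), "next block on B")

# After exchanging chains, both nodes hold the same data.
view_of_a = choose_chain(node_a, [node_b])
view_of_b = choose_chain(node_b, [node_a])
```

Both `view_of_a` and `view_of_b` end up as the three-block chain, which is the point of step 3: once one branch is longer, every node that sees it builds on it.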

The above is the most typical blockchain data structure. So, most intuitively, what is the blockchain? The blockchain is not thin air: it is a database of several hundred gigabytes built from this special data structure.

Of course, the blockchain is definitely not just a database. Treating the blockchain as a database is only a simplified way of understanding it, but this simplification makes many problems easy to figure out. The blockchain means much more than a database, but it is first of all a database.

IV. The problems of Internet databases

I recently read a speech by NEO founder Da Hongfei. He said that although today's Internet looks very good, it has three problems. First, online systems are highly fragmented and cannot interconnect. Second, there are many gaps between the physical world and the online world. Third, platforms monopolize both the market and the data.

He put this rather formally. Let me give a few concrete examples that everyone will understand: one is an offline problem and the other is an online problem.

The most direct sign of the offline problem is that there are still many places where we have to queue!

I was hospitalized once, and when I was discharged I had to queue for nearly 30 minutes to settle the bill and claim medical insurance reimbursement. This puzzled me. The Internet is so developed; can this business really not be done online? Must we queue to complete it?

In fact, after queuing for so long, I did only two things: I submitted my personal identity, and I stated what I needed. Put simply, I handed over my ID card and said "medical insurance reimbursement, please". Everything else was the settlement center downloading my data from the inpatient department and then processing the medical insurance settlement.

In the same way, there are still many places in life where we have to queue, for example to transfer real estate or to register a business. In short, every place where we still queue faces the same question: why does this business require us to be present? Can it not be handled without us being there? In fact it comes down to two things: first, confirming the identity of the applicant; second, obtaining the relevant data.

The data step is the crux. Because different departments keep different databases, a department must first confirm the applicant's identity and then request the relevant data before it can complete the business. The root cause is still the lack of data connectivity between different business centers.

This process cannot be done online because the databases of different departments are not connected. For example, the inpatient department and the discharge settlement desk do not share the same accounts, so the data has to be requested and then downloaded. Likewise, the real estate department and the bank hold much data that is not shared; each has its own database, and you have to apply for access. In addition, today's account-and-password systems are not secure enough to prove the applicant's identity. If your account is cracked, someone else can conduct business in your name. Such a system is suitable for simple information transfer, not for settling large sums.

However, these problems should no longer exist in the blockchain era. In the blockchain era, you can confirm the applicant's identity directly with a private key. A private key belongs to one person, so a private key combined with some biometric technology is enough to prove "I am me". Once I can prove "I am me", the subsequent data requests and data transfers become very simple.
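As a rough illustration of proving "I am me" with a private key, the sketch below signs a request and lets anyone verify it against the matching public key. To stay within Python's standard library it uses a Lamport one-time signature built from SHA-256, which is a stand-in chosen for this example and not the scheme real blockchains use (Bitcoin, for instance, uses elliptic-curve signatures). Note that the private key is never handed over; only the signature is. A Lamport key pair must sign one message only.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def generate_keypair():
    """Private key: 256 pairs of random secrets. Public key: their hashes."""
    private_key = [
        (secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)
    ]
    public_key = [(_h(a), _h(b)) for a, b in private_key]
    return private_key, public_key


def _bits(message: bytes):
    """The 256 bits of the message digest, most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(private_key, message: bytes):
    """Reveal one secret per digest bit; only the key holder can do this."""
    return [private_key[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public_key, message: bytes, signature) -> bool:
    """Hash each revealed secret and compare it with the public key."""
    if len(signature) != 256:
        return False
    return all(
        _h(secret) == public_key[i][bit]
        for i, (secret, bit) in enumerate(zip(signature, _bits(message)))
    )


# Hypothetical request: the applicant signs it instead of showing an ID card.
private_key, public_key = generate_keypair()
request = b"applicant=alice; business=medical insurance reimbursement"
signature = sign(private_key, request)
accepted = verify(public_key, request, signature)
```

A department that already knows the applicant's public key can accept the request without any queue, and a request altered in transit or signed with someone else's key fails verification.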

That is the offline problem; there are online problems too. The most direct one is that you cannot transfer money from WeChat to Alipay, nor from Alipay to Jingdong. You cannot even send messages across them. Why does this still happen on today's highly developed Internet?

Yet for the same kind of transfer, banks can send money to each other: as long as you provide the account number, you can transfer directly, even across banks.

The reason, once again, is the database. Jingdong, Alipay and WeChat are centralized organizations, and each center has its own database. Data does not flow between these databases, so neither data nor money can be transferred. Behind the banks, however, sits UnionPay's clearing system. The banks have a common database, so these transfers can be handled.

Some of today's inconveniences on the Internet are caused by centralized databases, and the blockchain solves this problem. You can simply think of the blockchain as a common database. For example, if Jingdong, Alipay, and WeChat shared one clearing database for payments, you could transfer funds directly between them, just as banks share UnionPay's database.

Moreover, generally only specific data needs to go into a common database. For example, the major banks share a database only for clearing; their specific business data is not shared. Commercial organizations such as WeChat and Alipay are the same: they need a common database (blockchain) only where they have to cooperate, which is generally at the level of settlement, clearing, and credit.
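The idea of sharing only the clearing data can be sketched as follows. This is a toy model, not how any of these companies actually settles payments: the class names `Platform` and `SharedClearingLedger`, the user names, and the amounts are all hypothetical. Each platform keeps its own private balances, and only what the platforms owe each other goes into the common database.

```python
class Platform:
    """A payment platform with its own private database of user balances."""

    def __init__(self, name, balances):
        self.name = name
        self.balances = dict(balances)  # business data, never shared


class SharedClearingLedger:
    """The common database: it records only inter-platform obligations."""

    def __init__(self):
        self.entries = []  # (paying platform, receiving platform, amount)

    def record(self, payer, payee, amount):
        self.entries.append((payer, payee, amount))

    def net_position(self, platform_name):
        """Positive: the platform is owed money. Negative: it owes money."""
        received = sum(a for _, payee, a in self.entries if payee == platform_name)
        paid = sum(a for payer, _, a in self.entries if payer == platform_name)
        return received - paid


def cross_platform_transfer(ledger, src, sender, dst, receiver, amount):
    """Move money between users of two platforms via the shared ledger."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    if src.balances.get(sender, 0) < amount:
        raise ValueError("insufficient funds")
    src.balances[sender] -= amount
    dst.balances[receiver] = dst.balances.get(receiver, 0) + amount
    ledger.record(src.name, dst.name, amount)


wechat = Platform("WeChat", {"alice": 100})
alipay = Platform("Alipay", {"bob": 0})
ledger = SharedClearingLedger()
cross_platform_transfer(ledger, wechat, "alice", alipay, "bob", 30)
```

After the transfer, the shared ledger shows that WeChat owes Alipay 30, while neither platform has seen the other's user balances. In a blockchain setting the `entries` list would be the part that every participant stores and verifies.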

V. The ownership of the database

Sharing a database sounds great, but the most immediate problem is who owns the database.

Maintaining a database takes a great deal of manpower and material resources, and the cost is high. Who bears this cost? This is a huge problem. Suppose Jingdong, Ali, and Tencent build a clearing system together: who maintains it? How are the costs split? Who holds which share? Who has how much say? This raises many troublesome issues, and it is a problem every alliance chain has to face. If one company has the final say over the data, then the system is no different from a centralized database. This problem is also one of the obstacles that has kept the blockchain from being put into practice.

Traditional public chains solve this problem with a coin. The coin creates an economic incentive, so that bookkeeping becomes profitable; many people are then willing to maintain the ledger spontaneously, and even compete to do so. That solves the problem, but issuing coins is currently not permitted domestically, so this road is closed for now.

Let us think again. Why is it the government that is currently embracing the blockchain most actively? And why are the government's blockchain businesses moving fastest? For example, the tax authorities and Tencent have cooperated to launch a blockchain invoice business; the Supreme People's Court has recognized the legal validity of blockchain deposit certificates; and, according to the latest news, the Hangzhou government is developing its own government blockchain.

Because the government can easily solve the database's ownership and incentive problems: the administration pushes from above, and all departments cooperate. The relevant business departments contribute their data and put it on the chain, so adoption is much easier. Starting the blockchain at the government level is the most efficient path and the one with the least resistance.


The blockchain is a very rich concept with many derived meanings, such as the evidence economy and production relations. Understanding the blockchain purely as a database is certainly not fully accurate, but the database is indeed the technical foundation of the blockchain.

Why can Bitcoin be transferred directly anywhere in the world? Because everyone shares a common Bitcoin database. Why can WeChat and Alipay not communicate or transfer money directly? Because they belong to different databases. Why can a blockchain not be tampered with? Because the blockchain database does not belong to any single party. What is a public chain? A public chain is an open third-party database. What is an alliance chain? An alliance chain is a common database maintained jointly by several business stakeholders.

Understanding blockchain principles and applications through the database is convenient because it is very concrete. In fact, in most cases, whenever a project on the market claims to be a blockchain application but you cannot tell what it actually does, replace the word "blockchain" with "third-party database" and you will understand immediately.