On July 10, according to Coinmarketcap, Bitcoin rose above 12,955 USD, its highest price in a year. Bitcoin's market value currently stands at $16.9 billion, up 76.48% over the past three months. In this ever-changing market, the cryptocurrency's giant whales are also waiting for their chance to act again.


In June, a report by the research firm Diar showed that since 2019, the number of bitcoins accumulated in large-holder addresses has exceeded 100,000, an increase of 10%. The latest Bitcoin Rich List data from July shows the same trend among whale addresses. According to Coinhills, the exchanges with the largest BTC trading volume over the past 24 hours are BitMEX, bitFlyer, OKEx, and COINBIG. On the Bitcoin rich list, apart from the four largest wallets, which belong to Binance, Bitstamp, Bitfinex and Huobi, the identities of the address holders are still unknown. How can these whale users be tracked and uncovered? How can their Bitcoin trading activity be followed in time? This article details the Bitcoin address mining method and the mathematical principles behind it.



Bitcoin is a well-known cryptocurrency. Although every transaction is recorded on the chain and the data is searchable, people still do not know which person or organization an address belongs to. At present there is no effective way to find an individual's addresses, but an organization's addresses can be found through data mining.

Some websites have already compiled lists of published addresses. One such website covers the most actively traded and most licensed websites in four major categories. It divides Bitcoin addresses into the following categories:


1. Exchange

2. Mining pool

3. Service organization

4. Gambling website

However, these organizations change addresses frequently. How to find, or dig out, these addresses is the main topic of this article.


Technical principle

Bitcoin address mining relies mainly on certain characteristics of Bitcoin transactions.

1. Multiple input merge

If multiple input addresses appear in a transaction, they belong to the same subject. In other words, when an address appears on the input side of a transaction, it and all the other addresses on the input side of that transaction can be considered to belong to the same subject (such as an exchange).

Condition to satisfy: the number of input addresses is not 1.

The mathematical relationship behind this rule will be detailed in subsequent articles.

For example, in the transaction shown in the figure below, there are five addresses on the input side (left side in the figure). In general, the five addresses can be considered to belong to the same subject.
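The multiple-input rule can be sketched with a union-find (disjoint-set) structure: every transaction with more than one input address merges those addresses into one group. This is a minimal illustration, not the article's actual implementation. The transaction layout (a dict with an "inputs" list of address strings), the names UnionFind and cluster_by_inputs, and the sample data are all assumptions made for this example.

```python
from collections import defaultdict


class UnionFind:
    """Minimal disjoint-set structure keyed by address string."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        # Register unseen addresses as their own root.
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def cluster_by_inputs(transactions):
    """Group addresses that appear together on the input side."""
    uf = UnionFind()
    for tx in transactions:
        # Drop duplicate addresses while keeping their order.
        inputs = list(dict.fromkeys(tx["inputs"]))
        if len(inputs) > 1:  # the rule only applies to multi-input transactions
            for addr in inputs[1:]:
                uf.union(inputs[0], addr)
    clusters = defaultdict(set)
    for addr in list(uf.parent):
        clusters[uf.find(addr)].add(addr)
    return list(clusters.values())


# Hypothetical sample data: "C" links the first two transactions.
sample_txs = [
    {"inputs": ["A", "B", "C"], "outputs": ["X"]},
    {"inputs": ["C", "D"], "outputs": ["Y"]},
    {"inputs": ["E"], "outputs": ["Z"]},
    {"inputs": ["F", "G"], "outputs": ["W"]},
]
clusters = cluster_by_inputs(sample_txs)
```

With this sample data, A, B, C and D end up in one group because C appears in two multi-input transactions, F and G form a second group, and E is left out because its transaction has a single input.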


2. Transfer and change

If a transaction has exactly two output addresses, and neither of them is an input address, then one of the addresses receives the transfer and the other is the change address. The change address should belong to the same subject as the input side.

The logic behind this reasoning is Bitcoin's change mechanism. By default, the change is sent to a new address.

Conditions to satisfy:

1. The number of output addresses is 2

2. The number of input addresses is not 2

3. The input address and output address cannot be the same

4. The BTC amount of one of the output addresses must have more than 4 decimal places.

5. The other output address is not in the address set collected so far (by the multiple-input rule or the transfer-and-change rule)

For example, in the transaction shown below, there are only 2 addresses on the output side (the right side of the figure), and there are 85 addresses on the input side. From the previous rule, we already know that the 85 addresses on the input side belong to the same subject. With this rule, the output address whose amount has more than 4 decimal places also belongs to the same subject as those 85 addresses.
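The five conditions above can be sketched as a single check on one transaction. This is an illustrative reading, not the article's code: it assumes amounts are stored as integer satoshis (1 BTC = 100,000,000 satoshis, so "more than 4 decimal places" means the amount is not a multiple of 10,000 satoshis), that exactly one of the two outputs has such an amount, and that this output is the change address. The transaction layout and the names find_change_address and known_addresses are hypothetical.

```python
SATOSHI_PER_BTC = 100_000_000
# 0.0001 BTC expressed in satoshis; amounts that are not a multiple
# of this have more than 4 decimal places when written in BTC.
FOUR_DECIMALS_SAT = SATOSHI_PER_BTC // 10_000


def has_more_than_4_decimals(value_sat):
    return value_sat % FOUR_DECIMALS_SAT != 0


def find_change_address(tx, known_addresses):
    """Return the likely change address of tx, or None if the rule does not apply.

    tx["inputs"]  : list of input address strings
    tx["outputs"] : list of (address, value_in_satoshis) pairs
    known_addresses : addresses already collected by earlier rules
    """
    inputs = set(tx["inputs"])
    outputs = tx["outputs"]
    if not inputs:
        return None  # nothing to attach a change address to
    if len(outputs) != 2:  # condition 1
        return None
    if len(inputs) == 2:  # condition 2
        return None
    out_addrs = [addr for addr, _ in outputs]
    if out_addrs[0] == out_addrs[1] or inputs & set(out_addrs):  # condition 3
        return None
    # Condition 4 (assumed reading: exactly one output has a "fine" amount).
    fine = [i for i, (_, value) in enumerate(outputs)
            if has_more_than_4_decimals(value)]
    if len(fine) != 1:
        return None
    change_addr = out_addrs[fine[0]]
    other_addr = out_addrs[1 - fine[0]]
    if other_addr in known_addresses:  # condition 5 (assumed reading)
        return None
    return change_addr


# Hypothetical transaction: 1.5 BTC is paid to "P", 0.12345678 BTC goes to "Q".
sample_tx = {
    "inputs": ["A", "B", "C"],
    "outputs": [("P", 150_000_000), ("Q", 12_345_678)],
}
change = find_change_address(sample_tx, known_addresses=set())
```

In this sample, "Q" is returned as the change address, so it would be merged into the same group as A, B and C. Using integer satoshis avoids floating-point rounding when testing decimal places.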


3. Mathematical principles

Reference [1] proposes a probabilistic hypothesis to represent probabilistic models of different data sources. Consider the different types of models (we treat them as independent so that the computation is tractable):




Bitcoin address mining has the following uses:

1. Count the assets of each exchange. This gives a better picture of each exchange's holdings and of Bitcoin's liquidity.

2. Predict market changes. When the overall market moves, large amounts of capital always flow into or out of the exchanges. Market trends can be better predicted by monitoring the large inflows and outflows of individual exchanges.

3. Help individual users understand an organization's asset status, making it easier for them to make the right investment decisions.
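The inflow and outflow monitoring described in point 2 could be sketched as follows, once an exchange's addresses have been grouped. This is a rough illustration under stated assumptions: each transaction is a dict whose "inputs" and "outputs" are lists of (address, value_in_satoshis) pairs, and the names net_flow, large_flows, exchange_cluster and the sample data are hypothetical.

```python
def net_flow(tx, cluster):
    """Net change in the cluster's balance caused by one transaction.

    Positive means funds flowed into the cluster, negative means out.
    """
    spent = sum(value for addr, value in tx["inputs"] if addr in cluster)
    received = sum(value for addr, value in tx["outputs"] if addr in cluster)
    return received - spent


def large_flows(transactions, cluster, threshold_sat):
    """List (txid, flow) pairs whose absolute flow reaches the threshold."""
    flagged = []
    for tx in transactions:
        flow = net_flow(tx, cluster)
        if abs(flow) >= threshold_sat:
            flagged.append((tx["txid"], flow))
    return flagged


# Hypothetical exchange cluster and transactions (values in satoshis).
exchange_cluster = {"A", "B"}
sample_txs = [
    # A user deposits 4.8 BTC to the exchange.
    {"txid": "tx1",
     "inputs": [("U", 500_000_000)],
     "outputs": [("A", 480_000_000), ("V", 19_990_000)]},
    # The exchange pays out 4.5 BTC and keeps the change.
    {"txid": "tx2",
     "inputs": [("A", 300_000_000), ("B", 200_000_000)],
     "outputs": [("W", 450_000_000), ("A", 49_990_000)]},
    # A small withdrawal that stays below the threshold.
    {"txid": "tx3",
     "inputs": [("B", 1_000_000)],
     "outputs": [("X", 990_000)]},
]
alerts = large_flows(sample_txs, exchange_cluster, threshold_sat=450_000_000)
```

Summing net_flow over all transactions gives the change in the exchange's holdings, which relates to point 1, while large_flows singles out the big movements mentioned in point 2. The threshold is an arbitrary value chosen for the example.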


[1] Automatic Bitcoin Address Clustering


