This report was produced by the Fire Coin Blockchain Research Institute and published on April 10, 2019. Authors: Yuan Yuming, Hu Zhiwei, Weng Yi-ming, Qiao Xiaofeng.

## **Summary**

As providers of market liquidity, market makers face one main risk in their day-to-day operations: adverse selection caused by the counterparty's information advantage. VPIN (Volume-Synchronized Probability of Informed Trading) is a widely used measure of this risk in traditional markets. Depending on how trading volume is classified as buyer- or seller-initiated, VPIN can be divided into TR-VPIN (Tick Rule VPIN) and BV-VPIN (Bulk Volume VPIN). The underlying principle is that the arrival of informed traders shifts an otherwise stable distribution of orders.

**The Fire Coin Research Institute analyzed the VPIN model, implemented it in code, and backtested it on real trading data from multiple exchanges, using the data and backtesting platform provided by 1token.** The main test scenarios are the rapid rise in the Bitcoin price in early April 2019 and the rapid decline in the EOS price in January 2019.

The results show that **VPIN often rises sharply whether digital asset prices surge or plunge. It therefore has some predictive power and can be regarded as a leading indicator of volatility, offering guidance for options trading, for market makers providing liquidity, and for exchange risk management.**

## **Report body**

**1. VPIN**

As providers of market liquidity, market makers face one main risk in their day-to-day operations: adverse selection caused by the counterparty's information advantage. To measure this risk, the industry has adopted various indicators, of which price volatility is one of the most popular.

However, in high-frequency trading, volatility is not the most reliable predictor. For large short-term price moves in a high-frequency setting, a more popular indicator in traditional financial markets is VPIN.

VPIN grew out of PIN (probability of informed trading), which captures the main source of market risk that market makers care about. Put simply, when a market maker quotes in the market, a core question is the probability that the counterparty is an informed trader, because the expected return on a trade with an informed trader, or with any trader who has an informational advantage, is negative.

VPIN (Volume-Synchronized Probability of Informed Trading) is an improvement on the PIN indicator. Because PIN is difficult to calculate directly, Easley (2015) proposes measuring trading in transaction time rather than physical time, so that the toxicity of order flow can be estimated in real time. Depending on how trading volume is classified, VPIN can be further divided into TR-VPIN (Tick Rule VPIN) and BV-VPIN (Bulk Volume VPIN). Easley (2015) notes in a later article that BV-VPIN is the better algorithm, so this report uses the BV-VPIN algorithm.

Because the secondary market for digital assets shows frequent short-term fluctuations, the Fire Coin Research Institute analyzed the VPIN model, implemented the algorithm in Python, and backtested it on real data from multiple exchanges provided by 1token. Tests on historical trade data for Bitcoin and EOS show that **VPIN has some predictive power in digital asset trading and can be regarded as a leading indicator of volatility, offering guidance for options trading, for market makers providing liquidity, and for exchange risk control.**

**2. Model principle**

Before starting the calculation, let's review the logic of the VPIN model. One of the core quantities market makers watch in daily trading is the probability that a counterparty has an information advantage, that is, PIN (probability of informed trading). The models below describe how PIN affects market makers' quotes. Their derivations are fairly involved, so we mainly state the core assumptions and conclusions.

**The Glosten-Milgrom Model**

**The Easley-O'Hara Model**

This model generates PIN, an indicator of the toxicity of order flow. In a series of articles by Easley and co-authors, trading is treated as a game between market makers and traders, played over a sequence of trading periods i = 1, …, I. At the beginning of each period, an event that affects the price of the asset occurs with probability α. If it occurs, it is either good news or bad news for the asset. At the end of period i, the asset is worth $\bar{S}_i$ if the news was good and $\underline{S}_i$ if the news was bad. Good news occurs with probability (1-δ) and bad news with probability δ, where δ can be treated as a prior probability. After the news arrives, orders reach the exchange according to a Poisson process. Informed traders know the quality of the information: they buy on good news and sell on bad news. In the model, the arrival rate of informed traders is μ, and the arrival rate of uninformed traders is ε.

The model implies that the market maker's quoted spread is $A - B = [\bar{S}_i - \underline{S}_i] \cdot \mathrm{PIN}$ (A is the ask price, B is the bid price). The larger the share of informed trading, the wider the market maker's spread, which also matches intuition.
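To make the link between the arrival rates and the spread concrete, here is a minimal sketch. It assumes the closed form PIN = αμ / (αμ + 2ε) from the Easley-O'Hara line of work, which the text above does not spell out. The function names, argument names, and example values are illustrative only.

```python
def pin(alpha: float, mu: float, epsilon: float) -> float:
    """Share of expected order flow that comes from informed traders."""
    informed = alpha * mu        # expected informed arrivals per period
    uninformed = 2.0 * epsilon   # uninformed buyers plus uninformed sellers
    return informed / (informed + uninformed)


def quoted_spread(value_good: float, value_bad: float, pin_value: float) -> float:
    """Spread A - B implied by the model: the value range scaled by PIN."""
    return (value_good - value_bad) * pin_value


# Hypothetical parameter values, chosen only to keep the arithmetic simple.
example_pin = pin(alpha=0.5, mu=2.0, epsilon=0.5)
example_spread = quoted_spread(value_good=110.0, value_bad=100.0, pin_value=example_pin)
```

With these made-up inputs, half of the expected flow is informed, so the spread is half of the gap between the good-news and bad-news values.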

The two models above show how PIN is constructed in theory. To apply the indicator in trading, the model parameters must be estimated. The standard approach to solving the model and calculating PIN requires estimating the parameters (α, δ, μ, ε), which makes real-time calculation very difficult. We use the VPIN method to approximate PIN, which makes it possible to calculate the indicator in real time. The idea is also intuitive in the secondary market for digital assets: trades are usually highly correlated with the arrival of information and only weakly correlated with clock time, which supports modeling events in transaction time rather than physical time.

In the VPIN model, trading volume is classified as buyer-initiated or seller-initiated. Trades are first aggregated over a short interval, and the aggregated volume is then split according to the distribution of the price change between the start and the end of that interval. The specific formula is as follows:
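For reference, bulk volume classification as it is commonly stated in the VPIN literature can be written as below. This is a sketch under assumptions rather than the report's own notation: $V_\tau^B$ and $V_\tau^S$ are the buy and sell volume of bucket $\tau$, $t(\tau)$ is the index of the last observation in bucket $\tau$, $Z$ is the standard normal CDF, $\sigma_{\Delta P}$ is the standard deviation of price changes, $V$ is the bucket size, and $n$ is the number of buckets.

```latex
V_\tau^B = \sum_{i=t(\tau-1)+1}^{t(\tau)} V_i \, Z\!\left(\frac{P_i - P_{i-1}}{\sigma_{\Delta P}}\right),
\qquad
V_\tau^S = V - V_\tau^B,
\qquad
\mathrm{VPIN} = \frac{\sum_{\tau=1}^{n} \left| V_\tau^S - V_\tau^B \right|}{nV}
```

A large positive price change pushes $Z$ toward 1, so most of that volume counts as buys. A large negative change pushes it toward 0, so most counts as sells.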

The literature considers more complex order flow models; for simplicity, we use the simplest one.

Does this indicator match the theoretical model above? The answer is yes. Interested readers can check this with a Monte Carlo simulation: choose values for the model parameters (α, δ, μ, ε) and use them to generate order flows. The PIN implied by the parameters is very close to the VPIN calculated from the simulated order flow.

The full derivation involves a number of theoretical proofs, and interested readers can go directly to the original papers listed in the references. The idea behind the model is not complicated, and it is worth borrowing when developing everyday trading strategies. The model assumes that in equilibrium the market has a relatively stable distribution of orders, **and that the arrival of informed traders shifts this otherwise stable distribution.** **VPIN is an attempt to capture this shift.**

**3. Algorithm steps**

Having covered the principle behind VPIN, we can now turn it into an algorithm. See Prado (2012) for details.

**A. Input**

1. Time series of trade data for a given currency

T: time of trade i, T_i

P: price of trade i, P_i

V: volume of trade i, V_i

2. V: the volume of each bucket (bucket size)

3. n: the number of buckets used to estimate VPIN

**B. Cut the transaction volume into buckets of equal size**

1. Sort the trades in chronological order

2. Calculate the price change ∆P_i = P_i − P_(i−1)

3. Expand ∆P_i by repeating each ∆P_i as many times as its volume V_i, so that after expansion there are I = ∑_i V_i observations of ∆P_i

4. Re-index the expanded ∆P_i as i = 1, …, I

5. Set τ = 0

6. Set τ = τ + 1

7. If I < τV, jump to step 11

8. For i in [(τ-1)V+1, τV], split the volume into buyer-driven volume and seller-driven volume

9. Assign these buy and sell volumes to bucket τ

10. Go back to step 6

11. Set L = τ - 1 (the number of complete buckets)

**C. Use the VPIN calculation formula**

If L is at least n, there is enough information to calculate VPIN from the most recent n buckets
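The steps above can be sketched in Python as follows. This is a minimal illustration, not the Institute's implementation. It assumes the bulk volume rule in which the share of a trade counted as buys is the standard normal CDF of its standardized price change. It splits trades across buckets in proportion to volume instead of literally repeating ∆P_i once per unit of volume, and it uses an unweighted population standard deviation. The function names are hypothetical.

```python
from statistics import NormalDist, pstdev


def bulk_volume_buckets(prices, volumes, bucket_size):
    """Return a list of (buy_volume, sell_volume), one pair per complete bucket."""
    # Step B.2: price change carried by each trade (the first trade has none).
    changes = [curr - prev for prev, curr in zip(prices, prices[1:])]
    sizes = list(volumes[1:])
    sigma = pstdev(changes)  # assumption: unweighted population std of the changes
    if sigma == 0:
        raise ValueError("price changes have zero variance; cannot standardize")
    normal = NormalDist()

    buckets = []
    buy = 0.0      # buy volume accumulated in the current bucket
    filled = 0.0   # total volume accumulated in the current bucket
    for change, size in zip(changes, sizes):
        buy_share = normal.cdf(change / sigma)  # share of this trade counted as buys
        remaining = float(size)
        while remaining > 0:
            space = bucket_size - filled
            if remaining >= space:
                # The trade fills the bucket: close it and carry the rest over.
                buy += space * buy_share
                remaining -= space
                buckets.append((buy, bucket_size - buy))
                buy, filled = 0.0, 0.0
            else:
                buy += remaining * buy_share
                filled += remaining
                remaining = 0.0
    return buckets  # a trailing incomplete bucket is discarded


def vpin(buckets, bucket_size, n):
    """VPIN over the most recent n buckets, or None if fewer than n are complete."""
    if len(buckets) < n:
        return None
    recent = buckets[-n:]
    imbalance = sum(abs(sell - buy) for buy, sell in recent)
    return imbalance / (n * bucket_size)
```

On a toy series whose price alternates up and down by the same amount, buy and sell volume cancel within each bucket and VPIN is close to 0. On a series that only moves up, almost all volume is classified as buys and VPIN approaches 1.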

**D. VPIN accuracy verification**

A Monte Carlo simulation can be used to verify the validity of VPIN. Assume that the order flow is generated by the parameters (α, μ, ε) and simulate it. The VPIN calculated from the simulated order flow will be very close to the true PIN.
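A rough version of this check can be sketched as below. It is a simplification under stated assumptions: each trading period is treated as one bucket rather than cutting the flow into equal-volume buckets, the true PIN is taken to be αμ / (αμ + 2ε), and the parameter values are arbitrary. The standard library has no Poisson sampler, so the helper `poisson` (a hypothetical name) counts arrivals of a unit-rate process.

```python
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) count as arrivals of a unit-rate process in [0, lam]."""
    count = 0
    t = rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_imbalance_ratio(alpha, delta, mu, epsilon, periods, seed=0):
    """Simulate buy/sell counts per period and return sum|B - S| / sum(B + S)."""
    rng = random.Random(seed)
    imbalance = 0
    total = 0
    for _ in range(periods):
        buy_rate = sell_rate = epsilon      # uninformed flow on both sides
        if rng.random() < alpha:            # an information event occurs
            if rng.random() < delta:        # bad news: informed traders sell
                sell_rate += mu
            else:                           # good news: informed traders buy
                buy_rate += mu
        buys = poisson(rng, buy_rate)
        sells = poisson(rng, sell_rate)
        imbalance += abs(buys - sells)
        total += buys + sells
    return imbalance / total


# Arbitrary illustrative parameters, not values taken from the report.
alpha, delta, mu, epsilon = 0.4, 0.5, 600.0, 300.0
theoretical_pin = alpha * mu / (alpha * mu + 2 * epsilon)
simulated = simulate_imbalance_ratio(alpha, delta, mu, epsilon, periods=1000)
```

The simulated ratio tends to sit slightly above the theoretical PIN, because random imbalance between uninformed buyers and sellers also contributes to |B - S|. The gap shrinks as μ grows relative to the noise in the uninformed flow.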

**4. Empirical test**

In this report, we use real-time trade data from multiple exchanges provided by 1token ( https://1token.trade/ ) to test and compare how VPIN behaves when asset prices move sharply. 1token is one of the few high-quality data providers on the market. Its data is timely and accurate, which suits this kind of high-frequency backtesting. 1token's backtesting platform also reduces the cost of developing backtesting code.

**Test 1**

The price of Bitcoin quickly rose from around $4,100 in early April 2019 to $5,000 and above. In this test scenario, we selected the BTC/USDT data from multiple exchanges for testing.

The test parameters are: the bucket size is set to 1/50 of the average daily trading volume, and VPIN is calculated over the most recent 25 buckets.
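As an illustration of the bucket-size rule, the sketch below derives it from a trade history. It assumes timestamps are Unix seconds and that days are cut at UTC midnight. Neither detail is stated in the report, and the function name is hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone


def bucket_size_from_trades(timestamps, volumes, buckets_per_day=50):
    """Average daily traded volume divided by the number of buckets per day."""
    daily = defaultdict(float)
    for ts, volume in zip(timestamps, volumes):
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date()
        daily[day] += volume
    average_daily_volume = sum(daily.values()) / len(daily)
    return average_daily_volume / buckets_per_day


# Two toy days with 300 and 700 units traded: the average is 500 units per day.
toy_bucket_size = bucket_size_from_trades(
    timestamps=[0, 3600, 86400, 90000],
    volumes=[100, 200, 300, 400],
)
```

The resulting value is the bucket size V, and n = 25 is the number of most recent buckets used in the VPIN calculation.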

Results on Bitfinex's real trading data:

Results on Kraken's real trading data:

Results on Binance's real trading data:

In each of the figures above, the blue line is the BTC/USDT trade price, read against the left vertical axis. The red line is the value of the cumulative distribution function (CDF) of VPIN, read against the right vertical axis.
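The report does not say how the CDF of VPIN is computed. One plausible reading, sketched here as an assumption, is an empirical CDF: the fraction of past VPIN values that do not exceed the current one. The function names are hypothetical.

```python
from bisect import bisect_right, insort


def empirical_cdf(history, value):
    """Fraction of historical VPIN values that are less than or equal to value."""
    ordered = sorted(history)
    return bisect_right(ordered, value) / len(ordered)


def rolling_cdf(vpin_series):
    """CDF of each VPIN value relative to all values seen up to and including it."""
    ordered = []
    result = []
    for value in vpin_series:
        insort(ordered, value)  # keep the history sorted as it grows
        result.append(bisect_right(ordered, value) / len(ordered))
    return result
```

A reading close to 1 means the current VPIN is near the top of its own history, which is the condition the charts highlight. `rolling_cdf` uses only past and current values, so it introduces no look-ahead in a backtest.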

The CDF value of VPIN rose rapidly along with the Bitcoin price between 4:30 and 5:00 (UTC) on April 2 and, shortly after 5:00, remained at a very high level. The Bitcoin price then climbed to its recent high of around $5,000 at about 5:30.

VPIN therefore gave some advance warning of the price increase on each exchange.

**Test 2**

The price of EOS has also fluctuated significantly in recent months. EOS/USDT fell sharply and quickly on January 10, 2019, dropping from around 2.9 to around 2.4. In this test scenario, we test VPIN on this EOS/USDT price drop. The test data is as follows:

Test data: EOS/USDT trade data for January from the Binance and Bitfinex exchanges

Test parameters: the bucket size is set to 1/50 of the average daily trading volume, and VPIN is calculated over the most recent 25 buckets

Results on Binance:

The blue line is the EOS/USDT trade price, read against the left vertical axis. The green line is the value of the VPIN indicator, and the red line is the value of the cumulative distribution function (CDF) of VPIN, both read against the right vertical axis.

Results on Bitfinex:

The charts show that VPIN, and especially its CDF, stays at a very high level (CDF>0.8) before the EOS price falls sharply. This indicates that order imbalance shows clear statistical anomalies around periods of sharp price moves in digital assets, and that VPIN and its CDF are useful signals of them. The level of VPIN on Bitfinex is lower than on Binance. We conjecture that VPIN is more effective on exchanges that lead price moves and weaker on exchanges that lag, and we will discuss these differences in later articles.

**5. Summary**

The results above show that VPIN and its cumulative distribution function (CDF) often rise sharply when asset prices surge or plunge, and that they have some predictive power. VPIN can therefore be regarded as a leading indicator of volatility in digital asset trading. A large, rapid increase in VPIN is a prompt for traders to watch for a sharp move in the asset. Since the move can be in either direction, VPIN is most instructive for options trading. It can also be applied to market making and to exchange risk management.

Because of limited space, this report covers only a few recent, representative market periods and trading pairs. A more rigorous test of VPIN would need to examine the correlation between VPIN and price fluctuations, and the conditional probability of price fluctuations given VPIN. These tests will be presented in future research reports. It is worth noting that the effectiveness of VPIN is also debated in academia: Andersen (2014) questioned the validity of VPIN, and Easley (2014) responded with a rejoinder. Readers interested in further study can consult those articles.

For the implementation of the VPIN algorithm and the Python code described in this report, please contact the Fire Coin Research Institute for further discussion.

**References**

[1] Marcos M. Lopez de Prado. Advances in High Frequency Strategies (2012), 76-80

[2] Glosten, LR and P. Milgrom (1985): “Bid, ask and transaction prices in a specialist market with heterogeneously informed traders”, Journal of Financial Economics, 14, 71-100

[3] Easley, D. and M. O'Hara (1992b): “Time and the process of security price adjustment”, Journal of Finance, 47, 576-605

[4] Andersen TG, and Bondarenko O. VPIN and the Flash Crash [J]. Journal of Financial Markets, 2014.17:1-46

[5] Easley D, de Prado MML, and O'Hara M. VPIN and the flash crash: A rejoinder[J]. Journal of Financial Markets, 2014. 17:47-52

[6] Easley, D., R. F. Engle, M. O'Hara and L. Wu (2008): "Time-Varying Arrival Rates of Informed and Uninformed Traders", Journal of Financial Econometrics