Note
The following implementations and documentation closely follow the publication by Bogomolov, T. (2013): Pairs trading based on statistical variability of the spread process. Quantitative Finance, 13(9): 1411–1430.
H-Strategy
In this paper, the author proposes a new non-parametric approach to pairs trading based on the idea of Renko and Kagi charts. This approach exploits statistical information about the variability of the tradable process. Unlike other pairs trading methods, it does not try to find a long-run mean of the process and trade towards it. Instead, it addresses the question of how far the process should move in one direction before trading in the opposite direction potentially becomes profitable, and it answers that question by measuring the variability of the process.
H-construction
Suppose \(P(t)\) is a continuous time series on the time interval \([0, T]\).
Renko construction
Step 1: Generate the Renko Process
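The Renko process \(X(i)\) is defined as
\[X(i) = P(\tau_i), \quad i = 0, 1, ..., N,\]
where \(\tau_i\), \(i = 0, 1, ..., N\), is an increasing sequence of time moments such that, for some arbitrary \(H > 0\), \(\tau_0 = 0\), \(P(\tau_0) = P(0)\), and
\[|P(\tau_i) - P(\tau_{i-1})| = H, \quad i = 1, ..., N.\]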
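As a rough illustration of Step 1 on sampled data, the sketch below records a new time moment whenever the price has moved by at least \(H\) from the last recorded price. The function name `renko_process` and the `>=` comparison are assumptions made for discrete data (a sampled series rarely moves by exactly \(H\)); this is not the library's implementation.

```python
def renko_process(prices, h):
    """Return the indices tau_i at which a new Renko value is recorded.

    Discrete-data sketch: an index is recorded when the price has moved by
    at least h from the last recorded price (the continuous-time rule uses
    a move of exactly h).
    """
    if h <= 0:
        raise ValueError("h must be positive")
    taus = [0]            # tau_0 = 0
    last = prices[0]      # P(tau_0) = P(0)
    for t in range(1, len(prices)):
        if abs(prices[t] - last) >= h:
            taus.append(t)
            last = prices[t]
    return taus


prices = [10.0, 10.4, 11.1, 12.3, 11.9, 11.0, 10.1, 10.9, 12.2]
taus = renko_process(prices, h=1.0)
x = [prices[t] for t in taus]   # Renko process X(i) = P(tau_i)
print(taus)  # [0, 2, 3, 5, 8]
print(x)     # [10.0, 11.1, 12.3, 11.0, 12.2]
```

Note that a single large jump produces only one recorded value here, whereas a charting tool would usually draw several bricks for it.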
Step 2: Determine Turning Points
We create another sequence of time moments \(\{(\tau^a_n, \tau^b_n), n = 0, 1, ..., M\}\) based on the sequence \(\{\tau_i\}\). The sequence \(\{\tau^a_n\}\) defines the time moments when the Renko process \(X(i)\) has a local maximum or minimum, that is, when the process \(X(i) = P(\tau_i)\) changes its direction, and the sequence \(\{\tau^b_n\}\) defines the time moments when that local maximum or minimum is detected.
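More precisely, if we take \(\tau^a_0 = \tau_0\) and \(\tau^b_0 = \tau_1\), then for \(n \geq 1\)
\[\tau^b_n = \min\{\tau_i > \tau^b_{n-1} : (P(\tau_i) - P(\tau_{i-1}))(P(\tau_{i-1}) - P(\tau_{i-2})) < 0\},\]
\[\tau^a_n = \{\tau_{i-1} : \tau^b_n = \tau_i\}.\]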
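The turning-point rule for the Renko construction can be sketched as follows: a change of direction between two consecutive Renko moves marks the earlier sample as the extremum (\(\tau^a_n\)) and the later one as its detection time (\(\tau^b_n\)). The helper name `renko_turning_points` and the sample values are hypothetical.

```python
def renko_turning_points(x, taus):
    """Return (tau_a, tau_b) pairs for a Renko process x sampled at taus.

    The first pair is (tau_0, tau_1) by convention. After that, a sign
    change between consecutive Renko moves means x[i - 1] was a local
    extremum (tau_a) that is detected at the next sample (tau_b).
    """
    if len(x) != len(taus) or len(x) < 2:
        raise ValueError("need at least two Renko samples")
    pairs = [(taus[0], taus[1])]
    for i in range(2, len(x)):
        if (x[i] - x[i - 1]) * (x[i - 1] - x[i - 2]) < 0:
            pairs.append((taus[i - 1], taus[i]))
    return pairs


x = [10.0, 11.1, 12.3, 11.0, 12.2]   # Renko values X(i)
taus = [0, 2, 3, 5, 8]               # their time indices tau_i
pairs = renko_turning_points(x, taus)
print(pairs)  # [(0, 2), (3, 5), (5, 8)]
```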
Kagi construction
The Kagi construction is similar to the Renko construction with the only difference being that to create the sequence of time moments \(\{(\tau^a_n, \tau^b_n), n = 0, 1, ..., M\}\) for the Kagi construction we use local maximums and minimums of the process \(P(t)\) rather than the process \(X(i)\) derived from it. The sequence \(\{\tau^a_n\}\) then defines the time moments when the price process \(P(t)\) has a local maximum or minimum and the sequence \(\{\tau^b_n\}\) defines the time moments when that local maximum or minimum is recognized, that is, the time when the process \(P(t)\) moves away from its last local maximum or minimum by a distance equal to \(H\).
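More precisely, \(\tau^a_0\), \(\tau^b_0\) and \(S_0\) are defined as
\[\tau^b_0 = \inf\{u \in [0, T] : \max_{t \in [0, u]} P(t) - \min_{t \in [0, u]} P(t) = H\},\]
\[\tau^a_0 = \inf\{u < \tau^b_0 : |P(u) - P(\tau^b_0)| = H\},\]
\[S_0 = \operatorname{sign}(P(\tau^a_0) - P(\tau^b_0)),\]
where \(S_0\) can take two values: \(1\) for a local maximum and \(-1\) for a local minimum.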
Then we define \((\tau^a_n, \tau^b_n)\), \(n > 0\) recursively. The construction of the full sequence \(\{(\tau^a_n, \tau^b_n), n = 0, 1, ..., M\}\) is done inductively by alternating the following cases.
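Case 1: \(S_{n-1} = -1\)
If \(S_{n-1} = -1\), that is, the previous turning point is a local minimum, then \(\tau^a_n\), \(\tau^b_n\) and \(S_n\) are defined as
\[\tau^b_n = \inf\{u \in [\tau^a_{n-1}, T] : \max_{t \in [\tau^a_{n-1}, u]} P(t) - P(u) = H\},\]
\[\tau^a_n = \inf\{u < \tau^b_n : P(u) = \max_{t \in [\tau^a_{n-1}, \tau^b_n]} P(t)\},\]
\[S_n = 1.\]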
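Case 2: \(S_{n-1} = 1\)
If \(S_{n-1} = 1\), that is, the previous turning point is a local maximum, then \(\tau^a_n\), \(\tau^b_n\) and \(S_n\) are defined as
\[\tau^b_n = \inf\{u \in [\tau^a_{n-1}, T] : P(u) - \min_{t \in [\tau^a_{n-1}, u]} P(t) = H\},\]
\[\tau^a_n = \inf\{u < \tau^b_n : P(u) = \min_{t \in [\tau^a_{n-1}, \tau^b_n]} P(t)\},\]
\[S_n = -1.\]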
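A minimal sketch of the Kagi construction on a sampled series, assuming that a move of at least \(H\) away from the running extremum counts as recognition (the continuous-time definition uses a move of exactly \(H\)). The function name `kagi_turning_points` and the returned triples `(tau_a, tau_b, S)` are illustrative choices, not the library's API.

```python
def kagi_turning_points(prices, h):
    """Return a list of (tau_a, tau_b, s) triples for a sampled series.

    s = 1 marks a local maximum at tau_a, s = -1 a local minimum;
    tau_b is the index at which that extremum is recognized.
    """
    if h <= 0:
        raise ValueError("h must be positive")
    # Initial step: wait until the range of the series reaches h.
    hi = lo = 0
    start = None
    for t in range(1, len(prices)):
        if prices[t] > prices[hi]:
            hi = t
        if prices[t] < prices[lo]:
            lo = t
        if prices[hi] - prices[lo] >= h:
            start = t
            break
    if start is None:
        return []
    # The range was completed by a new high (first extremum is the low)
    # or by a new low (first extremum is the high).
    s = -1 if start == hi else 1
    result = [(lo if s == -1 else hi, start, s)]
    ext = start  # index of the running extremum since the last recognition
    for t in range(start + 1, len(prices)):
        if s == -1:  # Case 1: last extremum was a minimum, look for a maximum
            if prices[t] > prices[ext]:
                ext = t
            elif prices[ext] - prices[t] >= h:
                s = 1
                result.append((ext, t, s))
                ext = t
        else:        # Case 2: last extremum was a maximum, look for a minimum
            if prices[t] < prices[ext]:
                ext = t
            elif prices[t] - prices[ext] >= h:
                s = -1
                result.append((ext, t, s))
                ext = t
    return result


prices = [10.0, 10.4, 11.1, 12.3, 11.9, 11.0, 10.1, 10.9, 12.2]
points = kagi_turning_points(prices, h=1.0)
print(points)  # [(0, 2, -1), (3, 5, 1), (6, 8, -1)]
```

Unlike the Renko sketch, the extrema here are taken from the price series itself, so the local minimum at index 6 is captured even though no Renko sample falls on it.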
H-statistics
H-inversion
H-inversion counts the number of times the process \(P(t)\) changes its direction for selected \(H\), \(T\) and \(P(t)\). It is given by
\[N_T(H, P) = \max\{n : \tau^b_n \leq T\},\]
where \(H\) denotes the threshold of the H-construction, and \(P\) denotes the process \(P(t)\).
H-distances
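H-distances sums the vertical distances between consecutive local maxima and minima, each raised to the power \(p\). It is given by
\[V^p_T(H, P) = \sum_{n=1}^{N_T(H, P)} |P(\tau^a_n) - P(\tau^a_{n-1})|^p,\]
where \(N_T(H, P)\) is the H-inversion.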
H-volatility
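H-volatility of order \(p\) measures the variability of the process \(P(t)\) for selected \(H\) and \(T\). It is given by
\[\xi^p_T = \frac{V^p_T(H, P)}{N_T(H, P)},\]
that is, the H-distances of order \(p\) divided by the H-inversion.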
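To make the three statistics concrete, the sketch below computes them from a list of turning-point prices \(P(\tau^a_0), ..., P(\tau^a_M)\), taking the H-inversion to be the number of direction changes after the first turning point. The function name `h_statistics` and the sample values are hypothetical, and the library's own counting convention may differ.

```python
def h_statistics(extrema, p=1):
    """Return (H-inversion, H-distances, H-volatility) of order p.

    `extrema` holds the prices at consecutive turning points tau^a_n.
    """
    if len(extrema) < 2:
        raise ValueError("need at least two turning points")
    inversion = len(extrema) - 1
    distances = sum(abs(b - a) ** p for a, b in zip(extrema, extrema[1:]))
    volatility = distances / inversion
    return inversion, distances, volatility


extrema = [10.0, 12.5, 10.5, 13.0]
inversion, distances, volatility = h_statistics(extrema, p=1)
print(inversion, distances)  # 3 7.0
```

With \(p = 1\) the H-volatility is simply the average swing between consecutive turning points, which is the quantity compared against \(2H\) when choosing a strategy.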
Strategies
Momentum Strategy
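The investor buys (sells) an asset at a stopping time \(\tau^b_n\) when he or she recognizes that the process has passed its previous local minimum (maximum) and expects a continuation of the movement. The signal \(s_t\) is given by
\[s_t = \begin{cases} +1, & \text{if } t = \tau^b_n \text{ and } P(\tau^b_n) - P(\tau^a_n) > 0, \\ -1, & \text{if } t = \tau^b_n \text{ and } P(\tau^b_n) - P(\tau^a_n) < 0, \\ 0, & \text{otherwise,} \end{cases}\]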
where \(+1\) indicates opening a long trade or closing a short trade, \(-1\) indicates opening a short trade or closing a long trade and \(0\) indicates holding the previous position.
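The profit from one trade according to the momentum H-strategy over time from \(\tau^b_{n-1}\) to \(\tau^b_{n}\) is
\[Y_{\tau^b_n} = \left(P(\tau^b_n) - P(\tau^b_{n-1})\right) \cdot \operatorname{sign}\left(P(\tau^a_n) - P(\tau^a_{n-1})\right),\]
and the total profit from time \(0\) until time \(T\) is
\[Y_T(H, P) = \left(\xi^1_T - 2H\right) \cdot N_T(H, P),\]
where \(N_T(H, P)\) is the H-inversion and \(\xi^1_T\) is the H-volatility of order 1.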
Contrarian Strategy
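The investor sells (buys) an asset at a stopping time \(\tau^b_n\) when he or she decides that the process has moved far enough from its previous local minimum (maximum) and expects the movement to revert. The signal \(s_t\) is given by
\[s_t = \begin{cases} +1, & \text{if } t = \tau^b_n \text{ and } P(\tau^b_n) - P(\tau^a_n) < 0, \\ -1, & \text{if } t = \tau^b_n \text{ and } P(\tau^b_n) - P(\tau^a_n) > 0, \\ 0, & \text{otherwise,} \end{cases}\]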
where \(+1\) indicates opening a long trade or closing a short trade, \(-1\) indicates opening a short trade or closing a long trade and \(0\) indicates holding the previous position.
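The profit from one trade according to the contrarian H-strategy over time from \(\tau^b_{n-1}\) to \(\tau^b_{n}\) is
\[Y_{\tau^b_n} = \left(P(\tau^b_n) - P(\tau^b_{n-1})\right) \cdot \operatorname{sign}\left(P(\tau^a_{n-1}) - P(\tau^a_n)\right),\]
and the total profit from time \(0\) until time \(T\) is
\[Y_T(H, P) = \left(2H - \xi^1_T\right) \cdot N_T(H, P),\]
where \(N_T(H, P)\) is the H-inversion and \(\xi^1_T\) is the H-volatility of order 1.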
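The two strategies mirror each other, which a small numerical sketch makes visible. It assumes every extremum is recognized exactly \(H\) away from it (as in continuous time), so each trade earns the swing between turning points minus \(2H\) under momentum and the negative of that under contrarian. The names `signals` and `trade_profits` and the sample prices are hypothetical.

```python
def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def signals(a_prices, b_prices, strategy="momentum"):
    """Signal emitted at each recognition time tau^b_n (+1 buy, -1 sell)."""
    direction = {"momentum": 1, "contrarian": -1}[strategy]
    return [direction * sign(b - a) for a, b in zip(a_prices, b_prices)]


def trade_profits(a_prices, b_prices, strategy="momentum"):
    """Profit of each trade held from tau^b_{n-1} to tau^b_n."""
    direction = {"momentum": 1, "contrarian": -1}[strategy]
    return [
        direction
        * sign(a_prices[n] - a_prices[n - 1])
        * (b_prices[n] - b_prices[n - 1])
        for n in range(1, len(a_prices))
    ]


h = 1.0
a_prices = [10.0, 13.0, 11.0, 15.0]   # prices at the extrema tau^a_n
b_prices = [11.0, 12.0, 12.0, 14.0]   # prices at recognition tau^b_n (h away)

momentum_total = sum(trade_profits(a_prices, b_prices, "momentum"))
contrarian_total = sum(trade_profits(a_prices, b_prices, "contrarian"))

# Cross-check against the closed form: (H-volatility - 2H) * H-inversion.
n_inv = len(a_prices) - 1
xi = sum(abs(b - a) for a, b in zip(a_prices, a_prices[1:])) / n_inv
print(signals(a_prices, b_prices, "momentum"))  # [1, -1, 1, -1]
print(momentum_total, (xi - 2 * h) * n_inv)     # 3.0 3.0
print(contrarian_total)                         # -3.0
```

Here the average swing is 3, which exceeds \(2H = 2\), so the momentum strategy gains exactly what the contrarian strategy loses.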
Properties
It is clear that the choice of H-strategy depends on the value of H-volatility. If \(\xi^1_T > 2H\), then to achieve a positive profit the investor should employ a momentum H-strategy. If, on the other hand, \(\xi^1_T < 2H\) then the investor should use a contrarian H-strategy.
If \(P(t)\) follows a Wiener process, the H-volatility is \(\xi^1_T = 2H\). As a result, it is impossible to profit by trading on the process \(P(t)\). More generally, H-volatility \(\xi^1_T = 2H\) is a property of a martingale. Likewise, \(\xi^1_T > 2H\) could be a property of a sub-martingale, a super-martingale, or a process that regularly switches back and forth over time between the two.
In this paper, the author proposes that for any mean-reverting process, regardless of its distribution, the H-volatility is less than \(2H\). Hence, theoretically, trading the mean-reverting process by the contrarian H-strategy is profitable for any choice of \(H\).
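The \(2H\) benchmark for a driftless process can be checked numerically. The sketch below uses a simple symmetric random walk with unit steps as a discrete stand-in for a Wiener process (an assumption of this example, chosen because the walk hits the threshold exactly when \(H\) is an integer) and estimates the H-volatility of order 1 with a Kagi construction. The helper name `kagi_extrema`, the seed, and the sample size are arbitrary.

```python
import random


def kagi_extrema(prices, h):
    """Prices at the Kagi turning points tau^a_n of a sampled series."""
    hi = lo = prices[0]
    extrema, s, ext = [], 0, None
    for p in prices[1:]:
        if s == 0:
            # Initial step: wait until the range of the series reaches h.
            hi, lo = max(hi, p), min(lo, p)
            if hi - lo >= h:
                if p == hi:          # range completed by a new high
                    s = -1
                    extrema.append(lo)
                else:                # range completed by a new low
                    s = 1
                    extrema.append(hi)
                ext = p
        elif s == -1:                # last extremum was a minimum
            if p > ext:
                ext = p
            elif ext - p >= h:
                extrema.append(ext)
                s, ext = 1, p
        else:                        # last extremum was a maximum
            if p < ext:
                ext = p
            elif p - ext >= h:
                extrema.append(ext)
                s, ext = -1, p
    return extrema


random.seed(0)
walk = [0]
for _ in range(200_000):
    walk.append(walk[-1] + random.choice((-1, 1)))

h = 5
extrema = kagi_extrema(walk, h)
n_inv = len(extrema) - 1
xi = sum(abs(b - a) for a, b in zip(extrema, extrema[1:])) / n_inv
print(f"H-volatility / 2H = {xi / (2 * h):.3f}")  # expected to be close to 1
```

On sampled data with larger jumps the threshold is usually overshot, so an estimate of this kind should be read as approximate.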
Pairs Selection
Purpose: Select trading pairs from the assets pool by using the properties of the H-construction.
Algorithm:
1. Determine the assets pool and the length of historical data.
2. Take log-prices of all assets over the history, combine them in all possible pairs and build a spread process for each pair: \(\text{spread}_{ij} = \log(P_i) - \log(P_j)\).
3. For each spread process, calculate its standard deviation and set it as the threshold of the H-construction.
4. Determine the construction type of the H-construction. It can be either Renko or Kagi.
5. Build the H-construction on the spread series formed by each possible pair.
6. Use the top N pairs with the highest or lowest H-inversion for pairs trading. A mean-reverting process tends to have a higher H-inversion.
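The selection algorithm can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it uses a Kagi construction on sampled data, the sample standard deviation as the threshold, and hypothetical names (`count_inversions`, `select_pairs`) with synthetic prices. It is not the `HSelection` implementation.

```python
import itertools
import math
import statistics


def count_inversions(series, h):
    """Kagi H-inversion of a sampled series (direction changes after the first)."""
    hi = lo = ext = series[0]
    s, turns = 0, 0
    for p in series[1:]:
        if s == 0:                    # wait until the range reaches h
            hi, lo = max(hi, p), min(lo, p)
            if hi - lo >= h:
                s, ext, turns = (-1 if p == hi else 1), p, 1
        elif s == -1:                 # last extremum was a minimum
            if p > ext:
                ext = p
            elif ext - p >= h:
                s, ext, turns = 1, p, turns + 1
        else:                         # last extremum was a maximum
            if p < ext:
                ext = p
            elif p - ext >= h:
                s, ext, turns = -1, p, turns + 1
    return max(turns - 1, 0)


def select_pairs(prices, top_n, kind="highest"):
    """Rank all asset pairs by the H-inversion of their log-price spread."""
    results = []
    for (name_i, p_i), (name_j, p_j) in itertools.combinations(prices.items(), 2):
        spread = [math.log(a) - math.log(b) for a, b in zip(p_i, p_j)]
        h = statistics.stdev(spread)  # threshold = spread standard deviation
        if h == 0:
            continue                  # a constant spread cannot be traded
        results.append((count_inversions(spread, h), h, (name_i, name_j)))
    results.sort(key=lambda r: r[0], reverse=(kind == "highest"))
    return results[:top_n]


# Synthetic prices: A oscillates around B, while C trends away from both.
prices = {
    "A": [100.0, 110.0] * 6,
    "B": [100.0] * 12,
    "C": [100.0 * 1.1 ** t for t in range(12)],
}
top = select_pairs(prices, top_n=1, kind="highest")
print(top[0][0], top[0][2])  # 10 ('A', 'B')
```

The oscillating spread between A and B changes direction at almost every step, while the spreads involving the trending asset C never reverse by a full threshold, so A and B rank first under the "highest" criterion.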
Implementation
HConstruction
HSelection
Examples
HConstruction
>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> import yfinance as yf
>>> from arbitragelab.time_series_approach.h_strategy import HConstruction
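>>> # Fetch data (auto_adjust=False keeps the "Adj Close" column in recent yfinance versions)
>>> data = yf.download(
...     "KO PEP", start="2019-01-01", end="2020-12-31", auto_adjust=False, progress=False
... )["Adj Close"]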
>>> # Construct spread series
>>> series = np.log(data["KO"]) - np.log(data["PEP"])
>>> threshold = series["2019"].std()
>>> hc = HConstruction(series["2020"], threshold, "Kagi")
>>> # Get H-statistics
>>> hc.h_inversion()
19
>>> hc.h_distances()
1.475...
>>> hc.h_volatility()
0.0776...
>>> # Extract signals
>>> signals = hc.get_signals("contrarian")
>>> signals
Date
2020-01-02 0.0...
>>> # A quick backtest
>>> positions = signals.replace(0, np.nan).ffill()
>>> returns = data["KO"]["2020"].pct_change() - data["PEP"]["2020"].pct_change()
>>> total_returns = ((positions.shift(1) * returns).dropna() + 1).cumprod()
>>> fig = total_returns.plot()
>>> fig
<Axes:...>
HSelection
>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> import yfinance as yf
>>> from arbitragelab.time_series_approach.h_strategy import HSelection
>>> # Fetch data
>>> tickers = "AAPL MSFT AMZN META GOOGL GOOG TSLA NVDA JPM"
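>>> # auto_adjust=False keeps the "Adj Close" column in recent yfinance versions
>>> data = yf.download(
...     tickers, start="2019-01-01", end="2020-12-31", auto_adjust=False, progress=False
... )["Adj Close"]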
>>> hs = HSelection(data)
>>> hs.select()  # Calculate H-inversion statistic
>>> pairs = hs.get_pairs(5, "highest", False)
>>> # Inspect the first pair
>>> # Each pair contains [H-inversion statistic, H-construction threshold, Asset pair]
>>> pairs[0]
[34, 0.0034..., ('GOOG', 'GOOGL')]
>>> # Inspect another pair
>>> pairs[1]
[12, 0.132..., ('AAPL', 'NVDA')]
Research Notebooks
The following research notebook can be used to better understand the method described above.