Note

The following implementations and documentation closely follow the publication by Bogomolov, T. (2013). Pairs trading based on statistical variability of the spread process. Quantitative Finance, 13(9): 1411–1430.

H-Strategy

In this paper, the author proposes a new non-parametric approach to pairs trading based on the idea of Renko and Kagi charts. This approach exploits statistical information about the variability of the tradable process. Unlike other pairs trading methods, it does not try to find a long-run mean of the process and trade towards it. Instead, it addresses the question of how far the process should move in one direction before trading in the opposite direction potentially becomes profitable, and it answers that question by measuring the variability of the process.

H-construction

Suppose $P(t)$ is a continuous time series on the time interval $[0, T]$.

Renko construction

Step 1: Generate the Renko Process

The Renko process $X(i)$ is defined as,

\[X(i) : X(i) = P(\tau_i), i = 0, 1, ..., N,\]

where $\tau_i$, $i = 0, 1, ..., N$ is an increasing sequence of time moments such that

for some arbitrary $H > 0$, $\tau_0 = 0$ and $P(\tau_0) = P(0)$,

\[H \leq \max \limits_{t \in [0,T]} P(t) - \min \limits_{t \in [0,T]} P(t),\]

\[\tau_i = \inf\{u \in [\tau_{i - 1}, T] : |P(u) - P(\tau_{i - 1})| = H\}.\]
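On sampled data the price rarely moves by exactly $H$, so a discrete version has to relax the equality. The following is a minimal sketch, not the library's implementation: the function name `renko_times` is hypothetical, and it assumes a new point is recorded at the first observation that lies at least $H$ away from the previous recorded point.

```python
def renko_times(prices, h):
    """Return the indices tau_i of a sampled Renko process with threshold h."""
    if h <= 0:
        raise ValueError("h must be positive")
    taus = [0]  # tau_0 = 0
    for u in range(1, len(prices)):
        # Assumption: use >= h because sampled data seldom hits h exactly.
        if abs(prices[u] - prices[taus[-1]]) >= h:
            taus.append(u)
    return taus


prices = [0.0, 0.5, 1.0, 1.5, 2.0, 1.5, 1.0, 0.5, 1.0, 2.0]
taus = renko_times(prices, h=1.0)
renko = [prices[i] for i in taus]  # X(i) = P(tau_i)
print(taus)   # [0, 2, 4, 6, 9]
print(renko)  # [0.0, 1.0, 2.0, 1.0, 2.0]
```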

Step 2: Determine Turning Points

We create another sequence of time moments $\{(\tau^a_n, \tau^b_n), n = 0, 1, ..., M\}$ based on the sequence $\{\tau_i\}$. The sequence $\{\tau^a_n\}$ defines the time moments when the Renko process $X(i)$ has a local maximum or minimum, that is, when the process $X(i) = P(\tau_i)$ changes its direction, and the sequence $\{\tau^b_n\}$ defines the time moments when that local maximum or minimum is detected.

More precisely, we take $\tau^a_0 = \tau_0$ and $\tau^b_0 = \tau_1$, and then

\[\tau^b_n = \min\{\tau_i > \tau^b_{n-1}: (P(\tau_i) - P(\tau_{i-1}))(P(\tau_{i-1}) - P(\tau_{i-2})) < 0\},\]

\[\tau^a_n = \{\tau_{i - 1} : \tau^b_n = \tau_i\}.\]
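The rule above amounts to checking whether two consecutive Renko steps have opposite signs. A minimal sketch, assuming the Renko values $X(i)$ are already available as a list (the function name `renko_turning_points` is hypothetical, and the returned indices refer to positions in the Renko sequence, not to the original time axis):

```python
def renko_turning_points(renko):
    """Return (a_n, b_n) index pairs into the Renko sequence X(i)."""
    if len(renko) < 2:
        return []
    pairs = [(0, 1)]  # tau^a_0 = tau_0, tau^b_0 = tau_1
    for i in range(2, len(renko)):
        step_now = renko[i] - renko[i - 1]
        step_prev = renko[i - 1] - renko[i - 2]
        if step_now * step_prev < 0:
            # The direction changed at i - 1 and is detected at i.
            pairs.append((i - 1, i))
    return pairs


renko = [0.0, 1.0, 2.0, 1.0, 2.0, 3.0, 2.0]
print(renko_turning_points(renko))  # [(0, 1), (2, 3), (3, 4), (5, 6)]
```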

Kagi construction

The Kagi construction is similar to the Renko construction. The only difference is that the sequence of time moments $\{(\tau^a_n, \tau^b_n), n = 0, 1, ..., M\}$ is built from the local maxima and minima of the process $P(t)$ itself rather than from the process $X(i)$ derived from it. The sequence $\{\tau^a_n\}$ then defines the time moments when the price process $P(t)$ has a local maximum or minimum, and the sequence $\{\tau^b_n\}$ defines the time moments when that local maximum or minimum is recognized, that is, the time when the process $P(t)$ moves away from its last local maximum or minimum by a distance equal to $H$.

More precisely, $\tau^a_0$, $\tau^b_0$ and $S_0$ are defined as

\[\tau^b_0 = \inf\{u \in [0, T] : \max \limits_{t \in [0,u]} P(t) - \min \limits_{t \in [0,u]} P(t) = H\},\]

\[\tau^a_0 = \inf\{u < \tau^b_0: |P(u) - P(\tau^b_0)| = H\},\]

\[S_0 = \operatorname{sign}(P(\tau^a_0) - P(\tau^b_0)),\]

where $S_0$ can take two values: $1$ for a local maximum and $-1$ for a local minimum.

Then we define $(\tau^a_n, \tau^b_n)$, $n > 0$ recursively. The construction of the full sequence $\{(\tau^a_n, \tau^b_n), n = 0, 1, ..., M\}$ is done inductively by alternating the following cases.

$Case\ 1: \ \ S_{n-1} = -1$

If $S_{n-1} = -1$, the last turning point $\tau^a_{n-1}$ is a local minimum, so the next turning point is a local maximum, and $\tau^a_n$, $\tau^b_n$ and $S_n$ are defined as

\[\tau^b_n = \inf\{u \in [\tau^a_{n-1}, T] : \max \limits_{t \in [\tau^a_{n-1},\ u]} P(t) - P(u) = H\},\]

\[\tau^a_n = \inf\{u < \tau^b_n: P(u) = \max \limits_{t \in [\tau^a_{n-1},\ \tau^b_n]} P(t)\},\]

\[S_n = 1.\]

$Case\ 2: \ \ S_{n-1} = 1$

If $S_{n-1} = 1$, the last turning point $\tau^a_{n-1}$ is a local maximum, so the next turning point is a local minimum, and $\tau^a_n$, $\tau^b_n$ and $S_n$ are defined as

\[\tau^b_n = \inf\{u \in [\tau^a_{n-1}, T] : P(u) - \min \limits_{t \in [\tau^a_{n-1},\ u]} P(t) = H\},\]

\[\tau^a_n = \inf\{u < \tau^b_n: P(u) = \min \limits_{t \in [\tau^a_{n-1},\ \tau^b_n]} P(t)\},\]

\[S_n = -1.\]
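The Kagi recursion can be traced with a single pass that tracks the running maximum and minimum. The following is a minimal sketch rather than the library's implementation: the function name `kagi_turning_points` is hypothetical, it uses $S = 1$ for a local maximum and $S = -1$ for a local minimum as in the definition of $S_0$, and it assumes a turning point is recognized at the first observation that is at least $H$ away from it, since sampled data rarely hits $H$ exactly.

```python
def kagi_turning_points(prices, h):
    """Return (a, b, s) triples: extremum index, detection index, extremum type."""
    points = []
    hi = lo = 0  # indices of the running maximum and minimum
    s = 0        # type of the last recognized extremum (0 = none yet)
    for u in range(1, len(prices)):
        if prices[u] > prices[hi]:
            hi = u
        if prices[u] < prices[lo]:
            lo = u
        if s != -1 and prices[u] - prices[lo] >= h:
            # Price rose h above the running minimum: local minimum at lo.
            points.append((lo, u, -1))
            s, hi = -1, u  # start tracking a new running maximum
        elif s != 1 and prices[hi] - prices[u] >= h:
            # Price fell h below the running maximum: local maximum at hi.
            points.append((hi, u, 1))
            s, lo = 1, u   # start tracking a new running minimum
    return points


prices = [0.0, 0.5, 1.0, 1.5, 2.0, 1.5, 1.0, 0.5, 1.0, 2.0]
print(kagi_turning_points(prices, h=1.0))
# [(0, 2, -1), (4, 6, 1), (7, 9, -1)]
```

Note that the last turning point is recognized at index 9, where the price is already 1.5 above the minimum. This overshoot is a side effect of sampling and does not occur for a continuous path.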

H-statistics

H-inversion

H-inversion counts the number of times the process $P(t)$ changes its direction for selected $H$, $T$ and $P(t)$. It is given by

\[N_T (H, P) = \max \{n : \tau^{b}_{n} \leq T\} = N,\]

where $H$ denotes the threshold of the H-construction, and $P$ denotes the process $P(t)$.

H-distances

H-distances is the sum of the vertical distances between consecutive local maxima and minima, each raised to the power $p$. It is given by

\[V^p_T (H, P) = \sum_{n = 1}^{N}|P(\tau^a_n) - P(\tau^a_{n-1})|^p.\]

H-volatility

H-volatility of order $p$ measures the variability of the process $P(t)$ for selected $H$ and $T$. It is given by

\[\xi^p_T (H, P) = \frac{V^p_T (H, P)}{N_T (H, P)}.\]
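The three statistics can be computed directly from the prices at the turning points. A minimal sketch under stated assumptions: the function name `h_statistics` is hypothetical, the input is the list $P(\tau^a_0), ..., P(\tau^a_N)$, and $N$ is taken as the number of intervals between consecutive turning points, matching the sum from $n = 1$ to $N$ above. The library's own counting convention may differ.

```python
def h_statistics(extremes, p=1):
    """Return (H-inversion, H-distances, H-volatility) of order p."""
    n = len(extremes) - 1  # N: intervals between consecutive turning points
    if n < 1:
        raise ValueError("need at least two turning points")
    # Sum of |P(tau^a_n) - P(tau^a_{n-1})|^p over n = 1..N
    v = sum(abs(b - a) ** p for a, b in zip(extremes, extremes[1:]))
    return n, v, v / n


n_inv, v, xi = h_statistics([0.0, 2.0, 0.5], p=1)
print(n_inv, v, xi)  # 2 3.5 1.75
```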

Strategies

Momentum Strategy

The investor buys (sells) an asset at a stopping time $\tau^b_n$ when he or she recognizes that the process has passed its previous local minimum (maximum) and expects the movement to continue. The signal $s_t$ is given by

\[\begin{split}s_t = \left\{\begin{array}{ll} +1, & \text{if } t = \tau^b_n \text{ and } P(\tau^b_n) - P(\tau^a_n) > 0,\\ -1, & \text{if } t = \tau^b_n \text{ and } P(\tau^b_n) - P(\tau^a_n) < 0,\\ 0, & \text{otherwise,} \end{array}\right.\end{split}\]

where $+1$ indicates opening a long trade or closing a short trade, $-1$ indicates opening a short trade or closing a long trade and $0$ indicates holding the previous position.

The profit from one trade according to the momentum H-strategy over time from $\tau^b_{n-1}$ to $\tau^b_{n}$ is

\[Y_{\tau^b_n} = (P(\tau^b_n) - P(\tau^b_{n-1})) \cdot \operatorname{sign}(P(\tau^a_n) - P(\tau^a_{n-1})),\]

and the total profit from time $0$ till time $T$ is

\[Y_T(H, P) = (\xi^1_T (H, P) - 2H) \cdot N_T (H, P).\]
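The total follows from the per-trade profit in a few lines. As a sketch, assume the path is continuous, so that every turning point is recognized exactly $H$ away from it, and write $\sigma_n = \operatorname{sign}(P(\tau^a_n) - P(\tau^a_{n-1}))$ (a shorthand introduced here for brevity). Because local maxima and minima alternate,

\[P(\tau^b_n) = P(\tau^a_n) - \sigma_n H, \qquad P(\tau^b_{n-1}) = P(\tau^a_{n-1}) + \sigma_n H,\]

so each trade earns

\[Y_{\tau^b_n} = \sigma_n \left(P(\tau^b_n) - P(\tau^b_{n-1})\right) = |P(\tau^a_n) - P(\tau^a_{n-1})| - 2H,\]

and summing over $n = 1, ..., N$ gives

\[Y_T(H, P) = V^1_T (H, P) - 2H \cdot N_T (H, P) = (\xi^1_T (H, P) - 2H) \cdot N_T (H, P).\]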

Contrarian Strategy

The investor sells (buys) an asset at a stopping time $\tau^b_n$ when he or she decides that the process has passed far enough from its previous local minimum (maximum), and the investor expects a movement reversion. The signal $s_t$ is given by

\[\begin{split}s_t = \left\{\begin{array}{ll} +1, & \text{if } t = \tau^b_n \text{ and } P(\tau^b_n) - P(\tau^a_n) < 0,\\ -1, & \text{if } t = \tau^b_n \text{ and } P(\tau^b_n) - P(\tau^a_n) > 0,\\ 0, & \text{otherwise,} \end{array}\right.\end{split}\]

where $+1$ indicates opening a long trade or closing a short trade, $-1$ indicates opening a short trade or closing a long trade and $0$ indicates holding the previous position.
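The two signal definitions differ only in sign, which a short sketch makes explicit. The function name `h_signals` and its arguments are hypothetical: `turning_points` is assumed to be a list of index pairs $(\tau^a_n, \tau^b_n)$ into `prices`, produced by either construction.

```python
def h_signals(prices, turning_points, strategy="momentum"):
    """Return a list s_t with +1, -1 or 0 for every time step."""
    if strategy not in ("momentum", "contrarian"):
        raise ValueError("strategy must be 'momentum' or 'contrarian'")
    flip = 1 if strategy == "momentum" else -1
    signals = [0] * len(prices)  # 0 = hold the previous position
    for a, b in turning_points:
        move = prices[b] - prices[a]  # P(tau^b_n) - P(tau^a_n)
        if move > 0:
            signals[b] = flip
        elif move < 0:
            signals[b] = -flip
    return signals


prices = [0.0, 0.5, 1.0, 1.5, 2.0, 1.5, 1.0, 0.5, 1.0, 2.0]
points = [(0, 2), (4, 6), (7, 9)]  # illustrative turning points for H = 1
print(h_signals(prices, points, "momentum"))
# [0, 0, 1, 0, 0, 0, -1, 0, 0, 1]
print(h_signals(prices, points, "contrarian"))
# [0, 0, -1, 0, 0, 0, 1, 0, 0, -1]
```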

The profit from one trade according to the contrarian H-strategy over time from $\tau^b_{n-1}$ to $\tau^b_{n}$ is

\[Y_{\tau^b_n} = (P(\tau^b_n) - P(\tau^b_{n-1})) \cdot \operatorname{sign}(P(\tau^a_{n-1}) - P(\tau^a_n)),\]

and the total profit from time $0$ till time $T$ is

\[Y_T(H, P) = (2H - \xi^1_T (H, P)) \cdot N_T (H, P).\]
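The per-trade and total profit formulas can be checked numerically on a small hand-made example. This is a sketch under stated assumptions: `trade_profits` is a hypothetical helper, and the prices are chosen so that every turning point is recognized exactly $H$ away from it. On sampled data the recognition price usually overshoots $H$, so the closed-form totals hold only approximately.

```python
def trade_profits(p_a, p_b, strategy="momentum"):
    """Per-trade profits Y_{tau^b_n} for n = 1..N.

    p_a[n] = P(tau^a_n) are the prices at the turning points and
    p_b[n] = P(tau^b_n) the prices when they are recognized.
    """
    if strategy not in ("momentum", "contrarian"):
        raise ValueError("strategy must be 'momentum' or 'contrarian'")
    flip = 1 if strategy == "momentum" else -1
    profits = []
    for n in range(1, len(p_b)):
        # sign(P(tau^a_n) - P(tau^a_{n-1})); extremes alternate, so never 0
        direction = 1 if p_a[n] > p_a[n - 1] else -1
        profits.append(flip * direction * (p_b[n] - p_b[n - 1]))
    return profits


H = 1.0
p_a = [0.0, 3.0, 0.5, 3.0]  # alternating local minima and maxima
p_b = [1.0, 2.0, 1.5, 2.0]  # each recognized exactly H away from its extremum

momentum = trade_profits(p_a, p_b, "momentum")
contrarian = trade_profits(p_a, p_b, "contrarian")
n_inv = len(p_a) - 1
v = sum(abs(b - a) for a, b in zip(p_a, p_a[1:]))  # H-distances of order 1

print(momentum)                           # [1.0, 0.5, 0.5]
print(sum(momentum), v - 2 * H * n_inv)   # 2.0 2.0
print(sum(contrarian), 2 * H * n_inv - v) # -2.0 -2.0
```

Here the H-volatility is $8/3 > 2H$, so the momentum H-strategy earns a profit and the contrarian H-strategy loses the same amount.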

Properties

It is clear that the choice of H-strategy depends on the value of H-volatility. If $\xi^1_T > 2H$, then to achieve a positive profit the investor should employ a momentum H-strategy. If, on the other hand, $\xi^1_T < 2H$ then the investor should use a contrarian H-strategy.

If $P(t)$ follows a Wiener process, the H-volatility is $\xi^1_T = 2H$. Both total profit formulas then give zero, so it is impossible to profit from trading the process $P(t)$ with either H-strategy. We can also see that $\xi^1_T = 2H$ is a property of a martingale. Likewise, $\xi^1_T > 2H$ could be a property of a sub-martingale, a super-martingale, or a process that regularly switches back and forth between the two.

In this paper, the author proposes that for any mean-reverting process, regardless of its distribution, the H-volatility is less than $2H$. Hence, theoretically, trading the mean-reverting process by the contrarian H-strategy is profitable for any choice of $H$.

Pairs Selection

Purpose: Select trading pairs from the asset pool by using the properties of the H-construction.
Algorithm:
1. Determine the asset pool and the length of historical data.
2. Take the log-prices of all assets over the history, combine them into all possible pairs and build a spread process for each pair.
  $spread_{ij} = \log(P_i) - \log(P_j)$
3. For each spread process, calculate its standard deviation and set it as the threshold of the H-construction.
4. Determine the construction type of the H-construction.
  It could be either Renko or Kagi.
5. Build the H-construction on the spread series formed by each possible pair.
6. Use the top N pairs with the highest or lowest H-inversion for pairs trading.
  A mean-reverting process tends to have a higher H-inversion.
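The selection steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the `HSelection` implementation: the helper names `count_turning_points` and `select_pairs` are hypothetical, the construction type is fixed to Kagi, the H-inversion is taken as the number of recognized turning points, and a turning point is recognized once the spread is at least $H$ away from it.

```python
import itertools
import math
import statistics


def count_turning_points(series, h):
    """Count Kagi turning points recognized in a sampled series."""
    count, hi, lo, s = 0, 0, 0, 0
    for u in range(1, len(series)):
        if series[u] > series[hi]:
            hi = u
        if series[u] < series[lo]:
            lo = u
        if s != -1 and series[u] - series[lo] >= h:   # local minimum recognized
            count, s, hi = count + 1, -1, u
        elif s != 1 and series[hi] - series[u] >= h:  # local maximum recognized
            count, s, lo = count + 1, 1, u
    return count


def select_pairs(prices, top_n, method="highest"):
    """prices maps ticker -> list of prices; returns (inversions, threshold, pair)."""
    results = []
    for a, b in itertools.combinations(sorted(prices), 2):
        # Step 2: spread of log-prices
        spread = [math.log(x) - math.log(y) for x, y in zip(prices[a], prices[b])]
        h = statistics.stdev(spread)  # step 3: threshold = standard deviation
        if h > 0:                     # skip degenerate constant spreads
            results.append((count_turning_points(spread, h), h, (a, b)))
    # Step 6: rank by H-inversion
    results.sort(key=lambda r: r[0], reverse=(method == "highest"))
    return results[:top_n]


prices = {
    "A": [100, 102, 100, 102, 100, 102, 100, 102],  # oscillates around B
    "B": [100, 100, 100, 100, 100, 100, 100, 100],
    "C": [100, 101, 102, 103, 104, 105, 106, 107],  # trends away from A and B
}
top = select_pairs(prices, top_n=1)
print(top[0][0], top[0][2])  # 7 ('A', 'B')
```

The oscillating spread between A and B changes direction at every step and therefore ranks first, which matches the remark that a mean-reverting spread tends to have a higher H-inversion.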

Implementation

HConstruction

HSelection

Examples

HConstruction

>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> import yfinance as yf
>>> from arbitragelab.time_series_approach.h_strategy import HConstruction
>>> # Fetch data
>>> data = yf.download("KO PEP", start="2019-01-01", end="2020-12-31", progress=False)[
...     "Adj Close"
... ]
>>> # Construct spread series
>>> series = np.log(data["KO"]) - np.log(data["PEP"])
>>> # Use the 2019 standard deviation as the threshold and build the construction on 2020
>>> threshold = series["2019"].std()
>>> hc = HConstruction(series["2020"], threshold, "Kagi")
>>> # Get H-statistics
>>> hc.h_inversion()  
19
>>> hc.h_distances()  
1.475...
>>> hc.h_volatility()  
0.0776...
>>> # Extract signals
>>> signals = hc.get_signals("contrarian")
>>> signals  
Date
2020-01-02 0.0...
>>> # A quick backtest
>>> positions = signals.replace(0, np.nan).ffill()
>>> returns = data["KO"]["2020"].pct_change() - data["PEP"]["2020"].pct_change()
>>> total_returns = ((positions.shift(1) * returns).dropna() + 1).cumprod()
>>> ax = total_returns.plot()
>>> ax
<Axes:...>

HSelection

>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> import yfinance as yf
>>> from arbitragelab.time_series_approach.h_strategy import HSelection
>>> # Fetch data
>>> tickers = "AAPL MSFT AMZN META GOOGL GOOG TSLA NVDA JPM"
>>> data = yf.download(tickers, start="2019-01-01", end="2020-12-31", progress=False)[
...     "Adj Close"
... ]
>>> hs = HSelection(data)
>>> hs.select()  # Calculate H-inversion statistic
>>> pairs = hs.get_pairs(5, "highest", False)
>>> # Inspect the first pair
>>> # Each pair contains [H-inversion statistic, H-construction threshold, Asset pair]
>>> pairs[0]  
[34, 0.0034..., ('GOOG', 'GOOGL')]
>>> # Inspect another pair
>>> pairs[1]  
[12, 0.132..., ('AAPL', 'NVDA')]

Research Notebooks

The following research notebook can be used to better understand the method described above.

H-Strategy

References

Bogomolov, T. (2013). Pairs trading based on statistical variability of the spread process. Quantitative Finance, 13(9): 1411–1430.