Multivariate Cointegration Strategy

Introduction

This trading strategy takes new price values one at a time and, with each new timestamp and value provided, checks whether the conditions to open a position are fulfilled. This makes it easier to integrate the strategy into an existing data pipeline. The strategy object also keeps track of open and closed trades and the supporting information related to them.

Multivariate Cointegration Strategy

The trading strategy logic is described in more detail in the Multivariate Cointegration Framework section of the documentation.

The trading strategy itself works as follows:

  1. Estimate the cointegration vector \(\hat{\mathbf{b}}\) with the Johansen test using training data. This step is done by the MultivariateCointegration class.

  2. Construct the realization \(\hat{Y}_t\) of the process \(Y_t\) by calculating \(\hat{\mathbf{b}}^T \ln P_t\), and calculate \(\hat{Z}_t = \hat{Y}_t - \hat{Y}_{t-1}\).

  3. Compute the finite sum \(\sum_{p=1}^P \hat{Z}_{t-p}\), where the lag \(P\) is the number of past observations included in the sum (the nlags parameter of the trading rule).

  4. Partition the assets into two sets \(L\) and \(S\) according to the sign of the element in the cointegration vector \(\hat{\mathbf{b}}\).

  5. Following the formulae below, calculate the number of shares of each asset to trade so that the notional of the positions equals \(C\).

\[ \begin{align}\begin{aligned}\Bigg \lfloor \frac{-b^i C \text{ sgn} \bigg( \sum_{p=1}^{P} Z_{t-p} \bigg)}{P_t^i \sum_{j \in L} b^j} \Bigg \rfloor, \: i \in L\\\Bigg \lfloor \frac{b^i C \text{ sgn} \bigg( \sum_{p=1}^{P} Z_{t-p} \bigg)}{P_t^i \sum_{j \in S} b^j} \Bigg \rfloor, \: i \in S\end{aligned}\end{align} \]

Note

The trading signal is determined by \(\sum_{p=1}^{P} Z_{t-p}\), which only uses data up to time period \(t-1\). The price used to convert the notional into the number of shares or contracts to trade is the closing price at time \(t\). This ensures that no look-ahead bias is introduced.

  6. Open the positions at time \(t\) and close them at time \(t+1\).

  7. Periodically, for example once per month (22 trading days), re-estimate the cointegration vector. If it is time to re-estimate, go to step 1; otherwise, go to step 2.
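The share calculation in steps 2 to 5 can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the function name shares_to_trade and its arguments are hypothetical, the short leg is normalised by the sum of the coefficients in \(S\) (an assumption made so that each leg has a notional of roughly \(C\)), and negative share counts are read as short positions.

```python
import math
from typing import List, Sequence


def sign(x: float) -> int:
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def shares_to_trade(
    coint_vec: Sequence[float],
    price_history: Sequence[Sequence[float]],
    prices_t: Sequence[float],
    nlags: int = 30,
    dollar_invest: float = 10_000_000.0,
) -> List[int]:
    """Hypothetical helper: price_history holds closing prices up to t-1."""
    # Step 2: Y = b^T ln(P) for every historical row, then first differences Z.
    y = [sum(b * math.log(p) for b, p in zip(coint_vec, row)) for row in price_history]
    z = [y[k] - y[k - 1] for k in range(1, len(y))]

    # Step 3: sign of the sum of the most recent nlags values of Z
    # (fewer values are used if the history is shorter than nlags).
    signal = sign(sum(z[-nlags:]))

    # Step 4: normalising sums over L (positive coefficients) and S (negative).
    sum_long = sum(b for b in coint_vec if b > 0)
    sum_short = sum(b for b in coint_vec if b < 0)

    # Step 5: floor of the signed notional divided by the price at time t.
    shares = []
    for b, price in zip(coint_vec, prices_t):
        if b > 0:
            shares.append(math.floor(-b * dollar_invest * signal / (price * sum_long)))
        elif b < 0:
            shares.append(math.floor(b * dollar_invest * signal / (price * sum_short)))
        else:
            shares.append(0)
    return shares


# Toy data: the first asset rises, so the cointegrated return sum is positive.
history = [[100.0, 50.0, 25.0], [110.0, 50.0, 25.0]]
example = shares_to_trade(
    [2.0, -1.0, -1.0], history, [100.0, 50.0, 25.0], nlags=30, dollar_invest=10_000.0
)
print(example)
```

With a positive signal, the asset in \(L\) receives a negative share count and the assets in \(S\) receive positive ones, which reflects the mean-reverting bet on the spread.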

The strategy trades at daily frequency and is always in the market.
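The periodic re-estimation can be pictured as a walk-forward schedule. The sketch below is only an illustration under stated assumptions: walk_forward_schedule is a hypothetical helper, train_window (the number of initial days reserved for the first estimate) is not a parameter of the library, and 22 trading days is the example interval mentioned above.

```python
from typing import Iterator, Tuple


def walk_forward_schedule(
    n_days: int, train_window: int, reestimate_every: int = 22
) -> Iterator[Tuple[int, bool]]:
    """Yield (day index, re-estimate flag) for every tradable day."""
    for day in range(train_window, n_days):
        # Re-estimate on the first tradable day and every reestimate_every days after.
        yield day, (day - train_window) % reestimate_every == 0


# 300 days of data, the first 252 of which are used for the initial estimate.
schedule = list(walk_forward_schedule(300, 252))
reestimate_days = [day for day, flag in schedule if flag]
print(reestimate_days)
```

On days where the flag is set, the cointegration vector would be re-estimated before a new signal is generated; on all other days the existing vector is reused.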

The strategy object is initialized with the cointegration vector.

The update_price_values method adds new price values one at a time as they become available. At each stage, the get_signal method generates the number of shares to trade per asset according to the logic described above. A new trade can be added to the internal dictionary using the add_trade method.

The update_trades method closes the previously opened trade. When a trade is closed, the internal dictionaries are updated, and the list of trades closed at this stage is returned.
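To make the bookkeeping concrete, here is a toy stand-in for the open and closed trade dictionaries. It is not the library's code: the class ToyTradeBook, the dictionary layout, and the field names are assumptions chosen for illustration. It only mirrors the behaviour described above, where each update closes the previous trade before a new one is added.

```python
import uuid
from typing import Any, Dict, List, Optional, Sequence


class ToyTradeBook:
    """Hypothetical tracker that allows a single open trade at a time."""

    def __init__(self) -> None:
        self.open_trades: Dict[Any, dict] = {}
        self.closed_trades: Dict[Any, dict] = {}

    def add_trade(
        self,
        start_timestamp: Any,
        pos_shares: Sequence[int],
        neg_shares: Sequence[int],
        trade_id: Optional[uuid.UUID] = None,
    ) -> None:
        # Store the new trade keyed by its opening timestamp.
        self.open_trades[start_timestamp] = {
            "uuid": trade_id or uuid.uuid4(),
            "pos_shares": list(pos_shares),
            "neg_shares": list(neg_shares),
        }

    def update_trades(self, update_timestamp: Any) -> List[Any]:
        # Move every open trade to the closed dictionary and report which ones moved.
        closed = []
        for start in list(self.open_trades):
            trade = self.open_trades.pop(start)
            trade["end_timestamp"] = update_timestamp
            self.closed_trades[start] = trade
            closed.append(start)
        return closed


book = ToyTradeBook()
for day in ["2024-01-02", "2024-01-03", "2024-01-04"]:
    book.update_trades(day)           # close the trade opened on the previous day
    book.add_trade(day, [10], [-5])   # open the trade for the current day
```

After the loop, one trade remains open (the one from the last day) and the two earlier trades sit in the closed dictionary with their closing timestamps.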

Implementation

class MultivariateCointegrationTradingRule(coint_vec: array, nlags: int = 30, dollar_invest: float = 10000000.0)

This class implements the trading strategy of the Multivariate Cointegration method from the paper “Trading in the presence of cointegration” by Galenko, A., Popova, E. and Popova, I.

The strategy generates a signal, the number of shares to go long and short for each asset, based on the cointegration vector from the MultivariateCointegration class.

It is advised to re-estimate the cointegration vector (i.e. re-run MultivariateCointegration) each month, or more frequently if the data has finer than daily granularity.

The strategy rebalances the portfolio of assets with each new entry, meaning that the position opened at time t should be closed at time t+1, and a new trade should then be opened.

This strategy allows only one open trade at a time.

__init__(coint_vec: array, nlags: int = 30, dollar_invest: float = 10000000.0)

Class constructor.

Parameters:
  • coint_vec – (np.array) Cointegration vector, b.

  • nlags – (int) Number of lags for the sum of cointegrated returns, corresponding to the parameter P in the paper.

  • dollar_invest – (float) The value of long/short positions, corresponding to the parameter C in the paper.

add_trade(start_timestamp: Timestamp, pos_shares: array, neg_shares: array, uuid: UUID | None = None)

Adds a new trade to track.

Parameters:
  • start_timestamp – (pd.Timestamp) Timestamp of the future label.

  • pos_shares – (np.array) Number of shares bought per asset.

  • neg_shares – (np.array) Number of shares sold per asset.

  • uuid – (UUID) Unique identifier used to link the label to a tradelog action.

static calc_log_price(price_df: DataFrame) DataFrame

Calculate the log price of each asset for position size calculation.

Parameters:

price_df – (pd.DataFrame) Dataframe that contains the raw asset price.

Returns:

(pd.DataFrame) Log prices of the assets.

get_signal() tuple

Calculates the number of shares to trade at the current timestamp based on the price changes, the dollar investment, and the cointegration vector from the MultivariateCointegration class.

Returns:

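(np.array, np.array, np.array, np.array) The number of shares to trade, followed by the notional values of the positions; the Example section unpacks them as pos_shares, neg_shares, pos_notional, neg_notional.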
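As a rough illustration of how share counts relate to notional values, the sketch below multiplies share counts by prices and sums them per group. The helper leg_notionals, the grouping by the sign of the cointegration coefficient, and the convention that negative counts are short positions are all assumptions for illustration; the library may organise its return values differently.

```python
from typing import Sequence, Tuple


def leg_notionals(
    shares: Sequence[int], prices_t: Sequence[float], coint_vec: Sequence[float]
) -> Tuple[float, float]:
    """Hypothetical helper: signed notional of the L group and of the S group."""
    # Group L: assets with a positive cointegration coefficient.
    notional_l = sum(n * p for n, p, b in zip(shares, prices_t, coint_vec) if b > 0)
    # Group S: assets with a negative cointegration coefficient.
    notional_s = sum(n * p for n, p, b in zip(shares, prices_t, coint_vec) if b < 0)
    return notional_l, notional_s


# Toy numbers: one asset in L, two assets in S, target notional of 10,000 per leg.
notional_l, notional_s = leg_notionals(
    [-100, 100, 200], [100.0, 50.0, 25.0], [2.0, -1.0, -1.0]
)
print(notional_l, notional_s)
```

Because share counts are rounded down to whole numbers, the realised notionals will in general differ slightly from the target \(C\); in this toy case the numbers divide evenly.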

update_price_values(latest_price_values: Series)

Adds latest price values of assets to self.price_series.

Parameters:

latest_price_values – (pd.Series) Latest price values.

update_trades(update_timestamp: Timestamp) list

Closes previously opened trade and updates list of closed trades.

Parameters:

update_timestamp – (pd.Timestamp) New timestamp to check vertical threshold.

Returns:

(list) List of closed trades.

Example

# Importing packages
import pandas as pd
import numpy as np

# Importing ArbitrageLab tools
from arbitragelab.cointegration_approach.multi_coint import MultivariateCointegration
from arbitragelab.trading.multi_coint import MultivariateCointegrationTradingRule

# Using MultivariateCointegration as optimizer ...

# Generating the cointegration vector to later use in a trading strategy
coint_vec = optimizer.get_coint_vec()

# Creating a strategy
strategy = MultivariateCointegrationTradingRule(coint_vec)

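# Adding initial price values
# (data is the pd.DataFrame of asset prices prepared in the elided step above)
strategy.update_price_values(data.iloc[0])

# Feeding the remaining price values to the strategy one by one
# (row 0 was already added above, so the loop starts at row 1)
for ind in range(1, data.shape[0]):

    time = data.index[ind]
    value = data.iloc[ind]

    strategy.update_price_values(value)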

    # Getting signal - number of shares to trade per asset
    pos_shares, neg_shares, pos_notional, neg_notional = strategy.get_signal()

    # Close previous trade
    strategy.update_trades(update_timestamp=time)

    # Add a new trade
    strategy.add_trade(start_timestamp=time, pos_shares=pos_shares, neg_shares=neg_shares)

# Checking currently open trades
open_trades = strategy.open_trades

# Checking all closed trades
closed_trades = strategy.closed_trades

Research Notebooks

The following research notebook can be used to better understand the Strategy described above.

References