Note

The following documentation closely follows the book by Ernest P. Chan: Algorithmic Trading: Winning Strategies and Their Rationale.

Kalman Filter



Introduction

For truly cointegrating price series we can use the tools described in the cointegration approach (the Johansen and Engle-Granger tests). For real price series, however, we might want to use other tools to estimate the hedge ratio, because cointegration can be hard to establish when the hedge ratio changes through time.

Estimating the parameters of a model over a look-back period, as in the cointegration approach, has its disadvantages: a short period discards part of the available information. We might improve these methods by weighting observations exponentially, but it is not obvious that this weighting is optimal either.
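For comparison, the look-back approach can be sketched as a rolling ordinary least squares fit. This is only an illustration of the baseline that the Kalman filter is meant to improve on: the function name rolling_hedge_ratio and the lookback argument are hypothetical and are not part of the library.

```python
import numpy as np


def rolling_hedge_ratio(x, y, lookback=20):
    """Hypothetical helper: OLS slope of y on x over a rolling look-back window."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # The first (lookback - 1) values are undefined because the window is incomplete
    slopes = np.full(len(x), np.nan)

    for t in range(lookback - 1, len(x)):
        x_window = x[t - lookback + 1:t + 1]
        y_window = y[t - lookback + 1:t + 1]

        # np.polyfit with degree 1 returns (slope, intercept)
        slope, _intercept = np.polyfit(x_window, y_window, 1)
        slopes[t] = slope

    return slopes


# Synthetic prices with a constant hedge ratio of 2
x_prices = np.linspace(10.0, 20.0, 50)
y_prices = 2.0 * x_prices + 1.0
slopes = rolling_hedge_ratio(x_prices, y_prices, lookback=20)
```

Every observation inside the window gets the same weight and everything older is ignored, which is the limitation discussed above.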

../_images/kalman_cumulative_returns.png

Cumulative returns of the Kalman Filter Strategy on an EWC-EWA pair. An example from “Algorithmic Trading: Winning Strategies and Their Rationale” by Ernest P. Chan.

This module describes a scheme that allows using the Kalman filter for hedge ratio updating, as presented in the book by Ernest P. Chan “Algorithmic Trading: Winning Strategies and Their Rationale”.

One of the advantages of this approach is that we don’t have to pick a weighting scheme for observations in the look-back period. Based on this scheme, a Kalman Filter Mean Reversion Strategy can be created, which is also described in this module.

Kalman Filter

Following the descriptions by Ernest P. Chan:

The “Kalman filter is an optimal linear algorithm that updates the expected value of a hidden variable based on the latest value of an observable variable.

It is linear because it assumes that the observable variable is a linear function of the hidden variable with noise. It also assumes the hidden variable at time \(t\) is a linear function of itself at time \(t − 1\) with noise, and that the noises present in these functions have Gaussian distributions (and hence can be specified with an evolving covariance matrix, assuming their means to be zero.) Because of all these linear relations, the expected value of the hidden variable at time \(t\) is also a linear function of its expected value prior to the observation at \(t\), as well as a linear function of the value of the observed variable at \(t\).

The Kalman filter is optimal in the sense that it is the best estimator available if we assume that the noises are Gaussian, and it minimizes the mean square error of the estimated variables.”

As we’re searching for the hedge ratio, we’re using the following linear function:

\[y(t) = x(t) \beta(t) + \epsilon(t)\]

where \(y\) and \(x\) are the price series of the first and the second asset, \(\beta\) is the hedge ratio that we are searching for, and \(\epsilon\) is Gaussian noise with variance \(V_{\epsilon}\).

To allow the spread between \(x\) and \(y\) to have a nonzero mean, \(\beta\) is a vector of size \((2, 1)\) holding both the intercept and the slope of the linear relation between \(x\) and \(y\). For this purpose, \(x(t)\) is augmented with a vector of ones to create an array of size \((N, 2)\).
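The augmentation can be illustrated with a few lines of NumPy. The prices and the coefficient values below are made up, and the column order (slope first, intercept second) is an assumption chosen for the example.

```python
import numpy as np

# Hypothetical prices of the x asset (N = 4 observations)
x = np.array([20.1, 20.4, 19.9, 20.7])

# Append a column of ones, giving an array of shape (N, 2)
x_aug = np.column_stack([x, np.ones_like(x)])

# With this column order, beta holds (slope, intercept)
beta = np.array([1.2, 0.5])  # illustrative values only

# Noise-free observation equation: y(t) = x_aug(t) @ beta
y_hat = x_aug @ beta
```

Each row of x_aug multiplies the slope by the price and the intercept by one, so a single matrix product covers both terms.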

Next, an assumption is made that the regression coefficient changes in the following way:

\[\beta(t) = \beta(t-1) + \omega(t-1)\]

where \(\omega\) is a Gaussian noise with covariance \(V_{\omega}\). So the regression coefficient at time \(t\) is equal to the regression coefficient at time \(t-1\) plus noise.

With this specification, the Kalman filter can generate the expected value of the hedge ratio \(\beta\) at each observation \(t\).

The Kalman filter also generates an estimate of the standard deviation of the forecast error of the observable variable. It can be used as the moving standard deviation of a Bollinger band.
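The recursion implied by the observation and transition equations above can be sketched in NumPy as follows. This is a minimal illustration, not the library's implementation: the function name kalman_hedge_ratio, the argument names v_eps and v_omega, the zero initial state, and the choice of a diagonal \(V_{\omega}\) are all assumptions made for the example.

```python
import numpy as np


def kalman_hedge_ratio(x, y, v_eps=1e-3, v_omega=1e-4):
    """Sketch of a Kalman filter for y(t) = x(t) beta(t) + eps(t).

    Returns the filtered (slope, intercept) pairs, the forecast errors e(t)
    and the forecast error standard deviations sqrt(Q(t)).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)

    # Observation matrix: price of x plus a column of ones for the intercept
    x_aug = np.column_stack([x, np.ones(n)])

    # Assumed transition covariance: independent noise on slope and intercept
    v_omega_matrix = v_omega * np.eye(2)

    beta = np.zeros(2)        # assumed initial state estimate
    p_cov = np.zeros((2, 2))  # assumed initial state covariance

    betas = np.zeros((n, 2))
    errors = np.zeros(n)
    q_var = np.zeros(n)

    for t in range(n):
        h = x_aug[t]

        # Prediction: beta(t) = beta(t-1) + omega(t-1), so only the covariance grows
        r_cov = p_cov + v_omega_matrix

        # Forecast error of the observable and its variance Q(t)
        errors[t] = y[t] - h @ beta
        q_var[t] = h @ r_cov @ h + v_eps

        # Update: Kalman gain, state estimate, state covariance
        gain = r_cov @ h / q_var[t]
        beta = beta + gain * errors[t]
        p_cov = r_cov - np.outer(gain, h) @ r_cov

        betas[t] = beta

    return betas, errors, np.sqrt(q_var)


# Synthetic example: y is an exact linear function of x
x_prices = 20.0 + np.sin(np.linspace(0.0, 20.0, 500))
y_prices = 1.5 * x_prices + 2.0
betas, errors, error_std = kalman_hedge_ratio(x_prices, y_prices)
```

The arrays errors and error_std correspond to the forecast error and its standard deviation discussed above, and they are the inputs needed for a Bollinger band style rule.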

../_images/kalman_slope.png

Slope estimated between EWC(y) and EWA(x) using the Kalman Filter. An example from “Algorithmic Trading: Winning Strategies and Their Rationale” by Ernest P. Chan.

../_images/kalman_intercept.png

Intercept estimated between EWC(y) and EWA(x) using the Kalman Filter. An example from “Algorithmic Trading: Winning Strategies and Their Rationale” by Ernest P. Chan.

Implementation

class KalmanFilterStrategy(observation_covariance: float = 0.001, transition_covariance: float = 0.0001)

KalmanFilterStrategy implements a dynamic hedge ratio estimation between two assets using the Kalman filter. The Kalman filter is a state space model that assumes the system state evolves following some hidden and unobservable pattern. The goal of the state space model is to infer information about the states, given the observations, as new information arrives. The strategy has two important values to fit: observation covariance and transition covariance.

There are two ways to fit them: using a cross-validation technique or applying the Autocovariance Least Squares (ALS) algorithm. The Kalman filter approach generalizes a rolling linear regression estimate.

This class implements the Kalman Filter from the book by E.P. Chan: “Algorithmic Trading: Winning Strategies and Their Rationale”.

__init__(observation_covariance: float = 0.001, transition_covariance: float = 0.0001)

Init Kalman Filter strategy.

The Kalman filter has two important parameters that need to be set in advance or optimized: observation covariance and transition covariance.

Parameters:
  • observation_covariance – (float) Observation covariance value.

  • transition_covariance – (float) Transition covariance value.

update(x: float, y: float)

Update the hedge ratio based on the recent observation of two assets.

By default, y is the observed variable and x is the hidden one. That is, the hedge ratio for y is 1 and the hedge ratio for x is estimated by the Kalman filter.

The mean-reverting portfolio series is formed by:

y - self.hedge_ratios * x

The spread series and its standard deviation are available in self.spread_series and self.spread_std_series and can be used to trade the Bollinger Bands strategy.

Parameters:
  • x – (float) X variable value (hidden).

  • y – (float) Y variable value.

Kalman Filter Strategy

Quantities that were computed using the Kalman filter can be utilized to generate trading signals.

The forecast error \(e(t)\) can be interpreted as the deviation of a pair spread from the predicted value. This spread can be bought when it has high negative values and sold when it has high positive values.

As a threshold for the \(e(t)\), its standard deviation \(\sqrt{Q(t)}\) is used:

  • If \(e(t) < - entry\_std\_score * \sqrt{Q(t)}\), a long position on the spread should be taken: Long \(N\) units of the \(y\) asset and short \(N*\beta\) units of the \(x\) asset.

  • If \(e(t) \ge - exit\_std\_score * \sqrt{Q(t)}\), a long position on the spread should be closed.

  • If \(e(t) > entry\_std\_score * \sqrt{Q(t)}\), a short position on the spread should be taken: Short \(N\) units of the \(y\) asset and long \(N*\beta\) units of the \(x\) asset.

  • If \(e(t) \le exit\_std\_score * \sqrt{Q(t)}\), a short position on the spread should be closed.

So it’s the same logic as in the Bollinger Band Strategy from the Mean Reversion module.
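The four rules above can be sketched as a small state machine over the forecast errors. This is an illustrative sketch only: the function name bollinger_positions and the encoding of positions as +1 (long the spread), -1 (short the spread) and 0 (flat) are assumptions, and the output format of the library method may differ.

```python
import numpy as np


def bollinger_positions(errors, error_std, entry_std_score=3.0, exit_std_score=-3.0):
    """Hypothetical helper: +1 long spread, -1 short spread, 0 flat, per observation."""
    errors = np.asarray(errors, dtype=float)
    error_std = np.asarray(error_std, dtype=float)

    positions = np.zeros(len(errors), dtype=int)
    position = 0

    for t in range(len(errors)):
        # Close an open position first, if its exit condition is met
        if position == 1 and errors[t] >= -exit_std_score * error_std[t]:
            position = 0
        elif position == -1 and errors[t] <= exit_std_score * error_std[t]:
            position = 0

        # When flat, check both entry conditions
        if position == 0:
            if errors[t] < -entry_std_score * error_std[t]:
                position = 1
            elif errors[t] > entry_std_score * error_std[t]:
                position = -1

        positions[t] = position

    return positions


# Illustrative forecast errors with a constant standard deviation of 1
sample_errors = np.array([0.0, -4.0, -1.0, 1.0, 4.0, 0.0, -4.0])
sample_std = np.ones_like(sample_errors)
sample_positions = bollinger_positions(sample_errors, sample_std)
```

With the exit score set to the negative of the entry score, the exit threshold of a long position coincides with the entry threshold of a short one, so the sketch can close one position and open the opposite one on the same observation.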

Implementation

class KalmanFilterStrategy(observation_covariance: float = 0.001, transition_covariance: float = 0.0001)

trading_signals(entry_std_score: float = 3, exit_std_score: float = -3) → DataFrame

Generate trading signals based on existing data.

This method uses recorded forecast errors and standard deviations of forecast errors to generate trading signals, as described in the book by E.P Chan “Algorithmic Trading: Winning Strategies and Their Rationale”.

The logic is to hold a long position from e(t) < -entry_std_score * sqrt(Q(t)) until e(t) >= -exit_std_score * sqrt(Q(t)), and a short position from e(t) > entry_std_score * sqrt(Q(t)) until e(t) <= exit_std_score * sqrt(Q(t)),

where e(t) is the forecast error at time t, and sqrt(Q(t)) is the standard deviation of the forecast error at time t.

Parameters:
  • entry_std_score – (float) Number of standard deviations at which to enter a (long or short) position.

  • exit_std_score – (float) Number of standard deviations at which to exit a (long or short) position.

Returns:

(pd.DataFrame) DataFrame with forecast errors and target allocation for each observation.

Examples

# Importing packages
import pandas as pd
from arbitragelab.other_approaches.kalman_filter import KalmanFilterStrategy

# Getting the dataframe with time series of asset prices
data = pd.read_csv('X_FILE_PATH.csv', index_col=0, parse_dates=[0])

# Running the Kalman Filter to find the slope, forecast error, etc.
filter_strategy = KalmanFilterStrategy()

# We assume the first column is X and the second is Y
for observations in data.values:
    filter_strategy.update(observations[0], observations[1])

# Getting a list of the hedge ratios
hedge_ratios = filter_strategy.hedge_ratios

# Getting a list of intercepts
intercepts = filter_strategy.intercepts

# Getting a list of forecast errors
forecast_errors = filter_strategy.spread_series

# Getting a list of forecast error standard deviations
error_st_dev = filter_strategy.spread_std_series

# Getting a DataFrame with trading signals
target_quantities = filter_strategy.trading_signals(entry_std_score=3,
                                                    exit_std_score=-3)

Research Notebooks

The following research notebook can be used to better understand the Kalman Filter approach and strategy described above.
