Sparse Mean-reverting Portfolio Selection



Introduction

Assets that exhibit significant mean-reversion are difficult to find in efficient markets. As a result, investors focus on creating long-short asset baskets to form a mean-reverting portfolio whose aggregate value shows mean-reversion. Classic solutions, including cointegration or canonical correlation analysis, can only construct dense mean-reverting portfolios, i.e. they include every asset in the investing universe. These portfolios have shown significant such as higher transaction costs, worse P&L interpretability, and inability to capture meaningful statistical arbitrage opportunities. On the other hand, sparse mean-reverting portfolios, which require trading as few assets as possible, can mitigate these shortcomings.

This module provides all the tools to construct sparse mean-reverting portfolios, including:

  • Covariance selection and penalized regression techniques to narrow down the investing universe

  • Greedy search to construct sparse mean-reverting portfolios

  • Semidefinite programming (SDP) approach to construct sparse mean-reverting portfolios above a volatility threshold

In addition, the module also includes Box-Tiao canonical decomposition for constructing dense mean-reverting portfolios and Ornstein-Uhlenbeck (OU) model fitting to directly compare mean-reversion strength of the portfolios.

Mean-reversion Strength Metrics and Proxies

Suppose the investing universe consists of \(n\)-assets, and the sparse mean-reverting portfolio selection problem can be summarized as follows:

Note

Maximize the mean-reversion strength, while trading \(k\)-assets (\(k \ll n\)) in the portfolio.

One solution is to assume the portfolio value follows an Ornstein-Uhlenbeck (OU) process and use the mean-reversion speed parameter \(\lambda\) to measure the mean-reversion strength.

\begin{gather*} dP_t = \lambda (\bar{P} - P_t)dt + \sigma dZ_t \\ P_t = \sum_{i=1}^n x_i S_{ti} \\ \lambda > 0 \end{gather*}

However, it is hard to express the OU mean-reversion speed \(\lambda\) as a function of the portfolio weight vector \(\mathbf{x}\). Instead of optimizing \(\lambda\), this module will only use \(\lambda\) to evaluate the sparse portfolios generated and employ three other mean-reversion strength proxies to solve the sparse mean-reverting portfolio selection problem.

  1. Predictability based on Box-Tiao canonical decomposition.

  2. Portmanteau statistic.

  3. Crossing statistic.

Predictability and Box-Tiao Canonical Decomposition

Univariate Case

Assume that the portfolio value \(P\) follows the univariate recursion below:

\[P_t = \hat{P}_{t-1} + \varepsilon_t\]

where \(\hat{P}_{t-1}\) is a predictor of \(x_t\) built upon all past portfolio values recorded up to \(t-1\), and \(\varepsilon_t\) is a vector i.i.d. Gaussian noise, where \(\varepsilon_t \sim N(0, \Sigma)\), independent of all past portfolio values \(P_0, P_1, \ldots, P_{t-1}\).

Now calculate the variance of the both sides of the equation:

\[\mathbf{E}[x_t^2] = \mathbf{E}[\hat{x}_{t-1}^2] + \mathbf{E}[\varepsilon_t^2]\]

Denote the variance of \(x_t\) by \(\sigma^2\), the variance of \(\hat{x}_{t-1}\) by \(\hat{\sigma}^2\), then we can write the above equation as:

\begin{align*} \sigma^2 & = \hat{\sigma}^2 + \Sigma \\ 1 & = \frac{\hat{\sigma}^2}{\sigma^2} + \frac{\Sigma}{\sigma^2} \end{align*}

The predictability is thus defined by the ratio:

\[\nu \equiv \frac{\hat{\sigma}^2}{\sigma^2}\]

and its meaning is straightforward. When \(\nu\) is small, the variance of the Gaussian noise \(\Sigma\) dominates and the portfolio value process will look like noise and is more strongly mean-reverting. Otherwise, the variance of the predicted value \(\hat{\sigma}^2\) dominates and the portfolio value process can be accurately predicted on average. Figure 1 shows a comparison between a series with high predictability and a series with low predictability.

../_images/MR_strength_box_tiao.png

Figure 1. Two series with different levels of predictability.

Multivariate Case

The portfolio value is a linear combination of the asset prices, and can be explicitly expressed as:

\[\mathbf{x}^T S_t = \mathbf{x}^T \hat{S}_{t-1} + \mathbf{x}^T \varepsilon_t\]

Without loss of generality, the price of each asset can be assumed to have a zero mean, and the predictability can now be written as:

\[\nu = \frac{\mathbf{x}^T \hat{\Gamma_0} \mathbf{x}}{\mathbf{x}^T \Gamma_0 \mathbf{x}}\]

where \(\hat{\Gamma_0}\) and \(\Gamma_0\) are the covariance matrices of \(\hat{S}_{t-1}\) and \(S_t\), respectively. The machinery of Box-Tiao canonical decomposition requires a model assumption to form \(\hat{S}_{t-1}\), the conditional expectation of \(S_t\) given previous observations, and here the vector autoregressive model of order one - the VAR(1) model - is chosen. The process \(S_t\) can be now written as:

\[S_t = S_{t-1} A + Z_t\]

where \(A\) is a \(n\) by \(n\) square matrix and \(Z_t\) is a vector of i.i.d. Gaussian noise with \(Z_t \sim N(0, \Sigma)\), independent of \(S_{t-1}\).

There are two ways to get the estimate of \(A\). Either use the ordinary least square (OLS) estimate:

\[\hat{A} = (S_{t-1}^T S_{t-1})^{-1} S_{t-1}^T S_t\]

or the Yule-Walker estimate:

\[\hat{A} = \gamma_0^{-1} \gamma_1\]

where \(\gamma_k\) is the sample lag-\(k\) autocovariance matrix, defined as:

\begin{align*} \gamma_k & \equiv \frac{1}{T-k-1} \sum_{t=1}^{T-k}\tilde{S}_t \tilde{S}_{t+k}^T \\ \tilde{S}_t & \equiv S_t - \frac{1}{T} \sum_{t=1}^T S_t \end{align*}

The module follows closely to the d’Aspremont (2011) and Cuturi (2015) papers as to which estimate of \(A\) is used in the portfolio selection optimization.

The predictability of the time series under the VAR(1) model assumption can be now written as:

\[\nu = \frac{\mathbf{x}^T A^T \Gamma_0 A \mathbf{x}}{\mathbf{x}^T \Gamma_0 \mathbf{x}}\]

Finding a strongly mean-reverting portfolio is equivalent to minimizing the predictability of the portfolio value, and the solution is given by the minimum generalized eigenvalue \(\lambda_{\text{min}}\) by solving

\[\det (\lambda_{\text{min}} \Gamma_0 - A^T \Gamma_0 A) = 0\]

For most financial series, \(\Gamma_0\) is positive definite, and the portfolio weight is the eigenvector corresponding to the smallest eigenvalue \(\lambda_{\text{min}}\) of the matrix \(\Gamma_0^{-1} A \Gamma_0 A^T\). The Box-Tiao canonical decomposition in this module will solve all eigenvectors of this matrix and sort them according to their corresponding eigenvalues in descending order, which represents \(n\) portfolios ranked by decreasing predictability.

Portmanteau Statistic

Portmanteau statistic of order \(p\) (Ljung and Box, 1978) tests if a process is a white noise. By definition, the portmanteau statistic is 0 if a process is a white noise. Therefore, maximizing mean-reversion strength is equivalent to minimizing the portmanteau statistic.

The advantage of the portmanteau statistic over the Box-Tiao predictability is that this statistic requires no modeling assumptions. The disadvantage, on the other hand, is higher computational complexity. The estimate of the portmanteau statistic of order \(p\) is given as follows:

\[\hat{\phi}_p(y) = \frac{1}{p} \sum_{i=1}^p \Big( \frac{\mathbf{x}^T \gamma_i \mathbf{x}}{\mathbf{x}^T \gamma_0 \mathbf{x}} \Big)^2\]

The definition of \(\gamma_i\) has already been given in the previous subsection.

Crossing Statistic

Kedem and Yakowitz (1994) define the crossing statistic of a univariate process \(x_t\) as the expected number of crosses around 0 per unit of time:

\[\xi(x_t) = \mathbf{E} \Bigg[ \frac{\sum_{t=2}^T \unicode{x1D7D9}_{\{x_t x_{t-1} \leq 0 \}}}{T-1} \Bigg]\]

For a stationary AR(1) process, the crossing statistic can be reformulated with the cosine formula:

\[\xi(x_t) = \frac{\arccos (a)}{\pi}\]

where \(a\) is the first-order autocorrelation, or the AR(1) coefficient of the stationary AR(1) process, where \(\lvert a \rvert < 1\). The function \(y = \arccos (a)\) is monotonic decreasing with respect to \(a\) when \(\lvert a \rvert < 1\). Therefore, stronger mean-reversion strength implies a greater crossing statistic, which in turn implies a smaller first-order autocorrelation. To extend this result to the multivariate case, Cuturi (2015) proposed to minimize \(\mathbf{x}^T \gamma_1 \mathbf{x}\) and ensure that all absolute higher-order autocorrelations \(\lvert \mathbf{x}^T \gamma_k \mathbf{x} \rvert, \, k > 1\) are small.

Covariance Selection via Graphical LASSO and Structured VAR(1) Estimate via Penalized Regression

The Box-Tiao canonical decomposition relies on estimates of both the covariance matrix \(\Gamma_0\) and the VAR(1) coefficient matrix \(A\) of the asset prices. Using an \(\ell_1\)-penalty, as shown in d’Aspremont (2011), is able to simultaneously obtain numerically stable estimates and isolate key idiosyncratic dependencies in the asset prices. The penalized estimates of \(\Gamma_0\) and \(A\) provides a different perspective on the conditional dependencies and their graphical representations help cluster the assets into several smaller groups. Therefore, both covariance selection and structured VAR(1) estimation can be regarded as a preprocessing step for the sparse canonical decomposition techniques, i.e. the greedy search and the SDP approach.

Covariance Selection

Covariance selection is a process where the maximum likelihood estimation of the covariance matrix \(\Gamma_0\) is penalized by setting a certain number of coefficients in the inverse covariance matrix \(\Gamma_0^{-1}\) to zero. Zeroes in \(\Gamma_0^{-1}\) corresponds to conditionally independent assets in the model, and the penalized, or sparse, estimate of \(\Gamma_0^{-1}\) is both numerically robust and indicative of the underlying structure of the asset price dynamics.

The sparse estimate of \(\Gamma_0^{-1}\) is obtained by solving the following optimization problem:

\[\max_X \log \det X - \mathbf{Tr} (\Sigma X) - \alpha \lVert X \rVert_1\]

where \(\Sigma = \gamma_0\) is the sample covariance matrix, \(\alpha> 0\) is the \(\ell_1\)-regularization parameter, and \(\lVert X \rVert_1\) is the sum of the absolute value of all the matrix elements. Figure 2 demonstrates an example where the graph structure of the inverse covariance matrix becomes sparser as \(\alpha\) increases.

../_images/cov_select_demo-opt.gif

Figure 2. Covariance selection at work.

Structured VAR(1) Model Estimate

Recall that under a VAR(1) model assumption, the asset prices \(S_t\) follow the following process:

\[S_t = S_{t-1} A + Z_t\]

For most financial time series, the noise terms are correlated such that \(Z_t \sim N(0, \Sigma)\), where the noise covariance is \(\Sigma\). In this case, the VAR(1) coefficient matrix \(A\) has to be directly estimated from the data. A structured (penalized), or sparse, estimate of \(A\) can be obtained column-wise via a LASSO regression by minimizing the following objective function:

\[\DeclareMathOperator*{\argmin}{arg\,min} a_i = \argmin_x \lVert S_{it} - S_{t-1}x \rVert^2 + \lambda \lVert x \rVert_1\]

where \(a_i\) is a column of the matrix \(A\), and \(S_{it}\) is the price of asset \(i\).

The sparse estimate of \(A\) can be also obtained by applying a more aggressive penalty under a multi-task LASSO model. The objective function being minimized is:

\[\DeclareMathOperator*{\argmin}{arg\,min} \argmin_A \lVert S_{t} - S_{t-1}A \rVert^2 + \alpha \sum_i \sqrt{\sum_j a_{ij}^2}\]

where \(a_{ij}\) is the element of the matrix \(A\).

The multi-task LASSO model will suppress the coefficients in an entire column to zero, but its estimate is less robust than the column-wise LASSO regression in terms of numerical stability. The module provides the option to choose which LASSO model to use for the sparse estimate of \(A\). Figure 3 and 4 demonstrate the difference between the estimate generated by the column-wise LASSO regression and the one by the multi-task LASSO.

../_images/column_lasso_demo-opt.gif

Figure 3. Column-wise LASSO at work.

../_images/multitask_lasso_demo-opt.gif

Figure 4. Multi-task LASSO at work.

Pinpoint the Clusters

If the Gaussian noise in the VAR(1) model is uncorrelated:

\[S_t = S_{t-1} A + Z_t, \; Z_t \sim N(0, \sigma \mathbf{I}), \, \sigma > 0\]

then Gilbert (1994) has shown that the graph of the inverse covariance matrix \(\Gamma_0^{-1}\) and the graph of \(A^T A\) share the same structure, i.e. the graph of \(\Gamma_0^{-1}\) and \(A^T A\) are disconnected along the same clusters of assets.

When the Gaussian noise \(Z_t\) is correlated, however, the above relation is no longer valid, but it is still possible to find common clusters between the graph of penalized estimate of \(\Gamma_0^{-1}\) and penalized estimate of \(A^T A\). This will help find a much smaller investing universe for sparse mean-reverting portfolios. Figure 5 demonstrates an example of the clustering process.

../_images/cluster.gif

Figure 5. The clustering process via penalized estimates of the inverse covariance matrix and VAR(1) coefficient matrix.

Semidefinite Programming (SDP) Approach

An alternative to greedy search is to relax the cardinality constraint and reformulate the original non-convex optimization problem

\begin{align*} \text{minimize } & \frac{\mathbf{x}^T \mathbf{A} \mathbf{x}}{\mathbf{x}^T \mathbf{B} \mathbf{x}} \\ \text{subject to } & \lVert \mathbf{x} \rVert_0 \leq k \\ & \lVert \mathbf{x} \rVert_2 = 1 \end{align*}

into a convex one. The concept “convex” means “when an optimal solution is found, then it is guaranteed to be the best solution”. The convex optimization problem is formed in terms of the symmetric matrix \(X = \mathbf{x}\mathbf{x}^T\):

\begin{align*} \text{minimize } & \frac{\mathbf{Tr} (\mathbf{A} X)}{\mathbf{Tr} (\mathbf{B} X)} \\ \text{subject to } & \mathbf{1}^T \lvert X \rvert \mathbf{1} \leq k \\ & \mathbf{Tr} (X) = 1 \\ & X \succeq 0 \end{align*}

The objective function is the quotient of the traces of two matrices, which is only quasi-convex. The following change of variables can make it convex:

\begin{align*} Y & = \frac{X}{\mathbf{Tr}(\mathbf{B} X)} \\ z & = \frac{1}{\mathbf{Tr}(\mathbf{B} X)} \end{align*}

and the optimization problem can be rewritten as:

\begin{align*} \text{minimize } & \mathbf{Tr}(\mathbf{A} Y) \\ \text{subject to } & \mathbf{1}^T \lvert Y \rvert \mathbf{1} \leq k \, \mathbf{Tr}(Y) \\ & \mathbf{Tr} (\mathbf{B} Y) = 1 \\ & \mathbf{Tr} (Y) \geq 0 \\ & Y \succeq 0 \end{align*}

This is now a convex optimization problem that can be solved with SDP. However, this convex relaxation formulation suffers from a few drawbacks.

  1. It has numerical stability issues;

  2. It cannot properly handle volatility constraints;

  3. It cannot optimize mean-reversion strength proxies other than predictability.

Instead of implementing this SDP formulation with a relaxed cardinality constraint, this module followed the regularizer form of the SDP proposed by Cuturi (2015) to mitigate the above drawbacks.

The predictability optimization SDP is as follows:

\begin{align*} \text{minimize } & \mathbf{Tr} (\gamma_1 \gamma_0^{-1} \gamma_1^T X) + \rho \lVert X \rVert_1 \\ \text{subject to } & \mathbf{Tr} (\gamma_0 X) \geq V \\ & \mathbf{Tr} (X) = 1 \\ & X \succeq 0 \end{align*}

The portmanteau statistic optimization SDP is as follows:

\begin{align*} \text{minimize } & \sum_{i=1}^p \mathbf{Tr} (\gamma_i X)^2 + \rho \lVert X \rVert_1 \\ \text{subject to } & \mathbf{Tr} (\gamma_0 X) \geq V \\ & \mathbf{Tr} (X) = 1 \\ & X \succeq 0 \end{align*}

The crossing statistic optimization SDP is as follows:

\begin{align*} \text{minimize } & \mathbf{Tr}(\gamma_1 X) + \mu \sum_{i=2}^p \mathbf{Tr} (\gamma_i X)^2 + \rho \lVert X \rVert_1 \\ \text{subject to } & \mathbf{Tr} (\gamma_0 X) \geq V \\ & \mathbf{Tr} (X) = 1 \\ & X \succeq 0 \end{align*}

where \(\rho>0\) is the \(\ell_1\)-regularization parameter, \(\mu>0\) is a specific regularization parameter for crossing statistic optimization, and \(V > 0\) is the portfolio variance lower threshold.

In some restricted cases, the convex relaxation are tight, which means the optimal solution of the SDP is exactly the optimal solution to the original non-convex problem. However, in most cases this correspondence is not guaranteed and the optimal solution of these SDPs has to be deflated into a rank one matrix \(xx^T\) where \(x\) can be considered as a good candidate for portfolio weights with the designated cardinality \(k\). This module uses Truncated Power method (Yuan and Zhang, 2013) as the deflation method to retrieve the leading sparse vector of the optimal solution \(X^*\) that has \(k\) non-zero elements.

Implementation

This module selects sparse mean-reverting portfolios out of an asset universe. The methods implemented in this module include the following:

  1. Box-Tiao canonical decomposition.

  2. Greedy search.

  3. Semidefinite relaxation.

  4. Graphical LASSO regression for sparse covariance selection.

  5. Column-wise LASSO and multi-task LASSO regression for sparse VAR(1) coefficient matrix estimation.

  6. Semidefinite programming approach to predictability optimization under a minimum volatility constraint.

  7. Semidefinite programming approach to portmanteau statistics optimization under a minimum volatility constraint.

  8. Semidefinite programming approach to crossing statistics optimization under a minimum volatility constraint.

class SparseMeanReversionPortfolio(assets: DataFrame)

Module for sparse mean reversion portfolio selection.

__init__(assets: DataFrame)

Constructor of the small mean-reverting portfolio identification module. The constructor will subtract the mean price of each asset from the original price such that the price processes have zero mean.

Parameters:

assets – (pd.DataFrame) The price history of each asset.

LASSO_VAR_fit(alpha: float, multi_task_lasso: bool = True, max_iter: int = 1000, threshold: int = 10, use_standardized: bool = True) array

Fit the LASSO model with the designated alpha for a sparse VAR(1) coefficient matrix estimate.

Parameters:
  • alpha – (float) Optimized l1-regularization coefficient.

  • multi_task_lasso – (bool) If True, use multi-task LASSO for sparse estimate, where the LASSO will yield full columns of zeros; otherwise, do LASSO column-wise.

  • max_iter – (int) Maximum number of iterations of LASSO regression.

  • threshold – (int) Round precision cutoff threshold. For example, a threshold of n means that a number less than \(10^{-n}\) will be treated as zero.

  • use_standardized – (bool) If True, use standardized data for optimization; otherwise, use de-meaned data.

Returns:

(np.array) Sparse estimate of VAR(1) matrix.

LASSO_VAR_tuning(sparsity: float, multi_task_lasso: bool = False, alpha_min: float = -5.0, alpha_max: float = 0.0, n_alphas: int = 100, max_iter: int = 1000, use_standardized: bool = True) float

Tune the l1-regularization coefficient (alpha) of LASSO regression for a sparse estimate of the VAR(1) matrix.

Parameters:
  • sparsity – (float) Percentage of zeros required in the VAR(1) matrix.

  • multi_task_lasso – (bool) If True, use multi-task LASSO for sparse estimate, where the LASSO will yield full columns of zeros; otherwise, do LASSO column-wise.

  • alpha_min – (float) Minimum l1-regularization coefficient.

  • alpha_max – (float) Maximum l1-regularization coefficient.

  • n_alphas – (int) Number of l1-regularization coefficient for the parameter search.

  • max_iter – (int) Maximum number of iterations for LASSO regression.

  • use_standardized – (bool) If True, use standardized data for optimization; otherwise, use de-meaned data.

Returns:

(float) Minimum alpha that satisfies the sparsity requirement.

property assets: DataFrame

Getter for the asset price data.

Returns:

(pd.DataFrame) The price history of each asset.

autocov(nlags: int, symmetrize: bool = True, use_standardized: bool = True) array

Calculate the autocovariance matrix.

Parameters:
  • nlags – Lag of autocovariance. If nlags = 0, return the covariance matrix.

  • symmetrize – (bool) If True, symmetrize the autocovariance matrix \(\frac{A^T + A}{2}\); otherwise, return the original autocovariance matrix.

  • use_standardized – (bool) If True, use standardized data; otherwise, use demeaned data.

Returns:

(np.array) Autocovariance or covariance matrix.

box_tiao(threshold: int = 7) array

Perform Box-Tiao canonical decomposition on the assets dataframe.

Parameters:

threshold – (int) Round precision cutoff threshold. For example, a threshold of n means that a number less than \(10^{-n}\) will be rounded to zero.

Returns:

(np.array) The weighting of each asset in the portfolio. There will be N decompositions for N assets, where each column vector corresponds to one portfolio. The order of the weightings correspond to the descending order of the eigenvalue.

static check_symmetric(matrix: array, rtol: float = 1e-05, atol: float = 1e-08) bool

Check if a matrix is symmetric.

Parameters:
  • matrix – (np.array) The matrix under inspection.

  • rtol – (float) Relative tolerance for np.allclose.

  • atol – (float) Absolute tolerance for np.allclose.

Returns:

(bool) True if the matrix symmetric, False otherwise.

covar_sparse_fit(alpha: float, max_iter: int = 1000, threshold: int = 10, use_standardized: bool = True) Tuple[array, array]

Fit the graphical LASSO model using the optimized alpha for a sparse covariance matrix estimate.

Parameters:
  • alpha – (float) Optimized regularization coefficient of graphical LASSO.

  • max_iter – (int) Maximum number of iterations for graphical LASSO fit.

  • threshold – (int) Round precision cutoff threshold. For example, a threshold of n means that a number less than \(10^{-n}\) will be treated as zero.

  • use_standardized – (bool) If True, use standardized data for optimization; otherwise, use de-meaned data.

Returns:

(np.array, np.array) Sparse estimate of covariance matrix; inverse of the sparse covariance matrix, i.e. precision matrix as graph representation.

covar_sparse_tuning(max_iter: int = 1000, alpha_min: float = 0.0, alpha_max: float = 1.0, n_alphas: int = 100, clusters: int = 3, use_standardized: bool = True) float

Tune the regularization parameter (alpha) of the graphical LASSO model for a sparse estimate of the covariance matrix.

Parameters:
  • max_iter – (int) Maximum number of iterations for graphical LASSO fit.

  • alpha_min – (float) Minimum regularization parameter.

  • alpha_max – (float) Maximum regularization parameter.

  • n_alphas – (int) Number of regularization parameter for parameter search.

  • clusters – (int) Number of smaller clusters desired from the precision matrix. The higher the number, the larger the best alpha will be. This parameter cannot exceed the number of assets.

  • use_standardized – (bool) If True, use standardized data for optimization; otherwise, use de-meaned data.

Returns:

(float) Optimal alpha to split the graph representation of the inverse covariance matrix into designated number of clusters.

property demeaned: DataFrame

Getter for the demeaned price data.

Returns:

(pd.DataFrame) The processed price history of each asset with zero mean.

find_clusters(precision_matrix: array, var_estimate: array) Graph

Use the intersection of the graph \(\Gamma^{-1}\) and the graph \(A^T A\) to pinpoint the clusters of assets to perform greedy search or semidefinite relaxation on.

Parameters:
  • precision_matrix – (np.array) The inverse of the estimated sparse covariance matrix.

  • var_estimate – (np.array) The sparse estimate of VAR(1) coefficient matrix.

Returns:

(networkx.Graph) A graph representation of the clusters.

Greedy search algorithm for sparse decomposition.

Parameters:
  • cardinality – (int) Number of assets to include in the portfolio.

  • var_est – (np.array) Estimated VAR(1) coefficient matrix.

  • cov_est – (np.array) Estimated covariance matrix.

  • threshold – (int) Round precision cutoff threshold. For example, a threshold of n means that a number less than \(10^{-n}\) will be treated as zero.

  • maximize – (bool) If True, maximize predictability; otherwise, minimize predictability.

Returns:

(np.array) Weight of each selected assets.

static is_semi_pos_def(matrix: array) bool

Check if a matrix is positive definite.

Parameters:

matrix – (np.array) The matrix under inspection.

Returns:

(bool) True if the matrix is positive definite, False otherwise.

least_square_VAR_fit(use_standardized: bool = False) array

Calculate the least square estimate of the VAR(1) matrix.

Parameters:

use_standardized – (bool) If True, use standardized data; otherwise, use demeaned data.

Returns:

(np.array) Least square estimate of VAR(1) matrix.

static mean_rev_coeff(weights: array, assets: DataFrame, interval: str = 'D') Tuple[float, float]

Calculate the Ornstein-Uhlenbeck model mean reversion speed and half-life.

Parameters:
  • weights – (np.array) The weightings for each asset.

  • assets – (pd.DataFrame) The price history of each asset.

  • interval – (str) The time interval, or the frequency, of the price data. Options are [‘D’, ‘M’, ‘Y’].

Returns:

(float, float) Mean reversion coefficient; half-life of the OU process.

sdp_crossing_vol(rho: float, mu: float, variance: float, nlags: int = 3, use_standardized: bool = True, verbose: bool = True, max_iter: int = 10000) array

Semidefinite relaxation optimization of crossing statistic with a volatility threshold following the formulation of Cuturi and d’Aspremont (2015).

\begin{align*} \text{minimize } & \mathbf{Tr}(\gamma_1 Y) + \mu \sum_{i=2}^p \mathbf{Tr}(\gamma_i Y)^2 + \rho \lVert Y \rVert_1 \\ \text{subject to } & \mathbf{Tr}(\gamma_0 Y) >= V \\ & \mathbf{Tr}(Y) = 1 \\ & Y \succeq 0 \end{align*}

where \(\gamma_i\) is the lag-\(k\) sample autocovariance (when \(k=0\), it is the sample covariance). \(V\) is the variance lower bound of the portfolio.

Parameters:
  • rho – (float) Regularization parameter of the \(l_1\)-norm in the objective function.

  • mu – (float) Regularization parameter of higher-order autocovariance.

  • variance – (float) Variance lower bound for the portfolio.

  • nlags – (int) Order of portmanteau statistic \(p\).

  • verbose – (bool) If True, print the SDP solver iteration details for debugging; otherwise, suppress the debug output.

  • use_standardized – (bool) If True, use standardized data for optimization; otherwise, use de-meaned data.

  • max_iter – (int) Set number of iterations for the SDP solver.

Returns:

(np.array) The optimized matrix \(Y\).

sdp_portmanteau_vol(rho: float, variance: float, nlags: int = 3, use_standardized: bool = True, verbose: bool = True, max_iter: int = 10000) array

Semidefinite relaxation optimization of portmanteau statistic with a volatility threshold following the formulation of Cuturi and d’Aspremont (2015).

\begin{align*} \text{minimize } & \sum_{i=1}^p \mathbf{Tr}(\gamma_i Y)^2 + \rho \lVert Y \rVert_1 \\ \text{subject to } & \mathbf{Tr}(\gamma_0 Y) >= V \\ & \mathbf{Tr}(Y) = 1 \\ & Y \succeq 0 \end{align*}

where \(\gamma_i\) is the lag-\(k\) sample autocovariance (when \(k=0\), it is the sample covariance). \(V\) is the variance lower bound of the portfolio.

Parameters:
  • rho – (float) Regularization parameter of the \(l_1\)-norm in the objective function.

  • variance – (float) Variance lower bound for the portfolio.

  • nlags – (int) Order of portmanteau statistic \(p\).

  • verbose – (bool) If True, print the SDP solver iteration details for debugging; otherwise, suppress the debug output.

  • use_standardized – (bool) If True, use standardized data for optimization; otherwise, use de-meaned data.

  • max_iter – (int) Set number of iterations for the SDP solver.

Returns:

(np.array) The optimized matrix \(Y\).

sdp_predictability_vol(rho: float, variance: float, use_standardized: bool = True, verbose: bool = True, max_iter: int = 10000) array

Semidefinite relaxation optimization of predictability with a volatility threshold following the formulation of Cuturi and d’Aspremont (2015).

\begin{align*} \text{minimize } & \mathbf{Tr}(\gamma_1 \gamma_0^{-1} \gamma_1^T Y) + \rho \lVert Y \rVert_1 \\ \text{subject to } & \mathbf{Tr}(\gamma_0 Y) >= V \\ & \mathbf{Tr}(Y) = 1 \\ & Y \succeq 0 \end{align*}

where \(\gamma_i\) is the lag-\(k\) sample autocovariance (when \(k=0\), it is the sample covariance). \(V\) is the variance lower bound of the portfolio.

Parameters:
  • rho – (float) Regularization parameter of the \(l_1\)-norm in the objective function.

  • variance – (float) Variance lower bound for the portfolio.

  • verbose – (bool) If True, print the SDP solver iteration details for debugging; otherwise, suppress the debug output.

  • use_standardized – (bool) If True, use standardized data for optimization; otherwise, use de-meaned data.

  • max_iter – (int) Set number of iterations for the SDP solver.

Returns:

(np.array) The optimized matrix \(Y\).

sparse_eigen_deflate(sdp_result: array, cardinality: int, tol: float = 1e-06, max_iter: int = 100, verbose: bool = True) array

Calculate the leading sparse eigenvector of the SDP result. Deflate the original leading eigenvector to the input cardinality using Truncated Power method (Yuan and Zhang, 2013).

The Truncated Power method is ported from the Matlab code provided by the original authors.

Parameters:
  • sdp_result – (np.array) The optimization result from semidefinite relaxation.

  • cardinality – (int) Desired cardinality of the sparse eigenvector.

  • tol – (float) Convergence tolerance of the Truncated Power method.

  • max_iter – (int) Maximum number of iterations for Truncated Power method.

  • verbose – (bool) If True, print the Truncated Power method iteration details; otherwise, suppress the debug output.

Returns:

(np.array) Leading sparse eigenvector of the SDP result.

property standardized: DataFrame

Getter for the standardized price data.

Returns:

(pd.DataFrame) The standardized price history of each asset with zero mean and unit variance.

Examples

Tip

  • The module provides options to use either standardized data or zero-centered mean data. It is recommended to try both options to obtain the best results.

  • Once a smaller cluster is obtained via covariance selection and structured VAR estimate, it is recommended to create a new SparseMeanReversionPortfolio class using only the asset prices within the cluster and perform greedy search or SDP optimization.

  • Always check the number in the deflated eigenvector obtained from SDP optimization. For example, if the deflated eigenvector has several weights that are significantly smaller than the others, then this suggests that the regularization parameter \(\rho\) is too large. Try a smaller \(\rho\) to achieve the desired cardinality.

Load Data

# Importing packages
import pandas as pd
import networkx as nx
import numpy as np
import matplotlib.pyplot as plt
from arbitragelab.cointegration_approach.sparse_mr_portfolio import SparseMeanReversionPortfolio

# Read price series data, set date as index
data = pd.read_csv('X_FILE_PATH.csv', parse_dates=['Date'])
data.set_index('Date', inplace=True)

# Initialize the sparse mean-reverting portfolio searching class
sparse_portf = SparseMeanReversionPortfolio(data)

Box-Tiao Canonical Decomposition

# Perform Box-Tiao canonical decomposition
bt_weights = sparse_portf.box_tiao()

# Check the weight that corresponds to the largest eigenvalue and the one to the smallest
# Retrieve the weights by column vectors
bt_least_mr = bt_weights[:, 0]
bt_most_mr = bt_weights[:, -1]

Covariance Selection and Structured VAR Estimate

# Covariance selection: Tune the graphical LASSO regularization parameter such that
# the graph of the inverse covariance matrix is split into 4 clusters
best_covar_alpha = sparse_portf.covar_sparse_tuning(alpha_min=0.6, alpha_max=0.9,
                                                    n_alphas=200, clusters=4)

# Fit the model using the optimal parameter
covar, precision = sparse_portf.covar_sparse_fit(best_cover_alpha)

# Structured VAR estimate: Tune the column-wise LASSO regularization parameter such that
# 50% of the VAR(1) coefficient matrix is 0
best_VAR_alpha = sparse_portf.LASSO_VAR_tuning(0.5, alpha_min=3e-4, alpha_max=7e-4,
                                               n_alphas=200, max_iter=5000)

# Fit the model using the optimal parameter
VAR_matrix = sparse_portf.LASSO_VAR_fit(best_VAR_alpha, max_iter=5000)

# Narrow down the clusters
cluster = etf_sparse_portf.find_clusters(precision, VAR_matrix)

# Print the largest cluster (the small ones are usually single nodes)
print(max(nx.connected_components(cluster), key=len))

Greedy Search

# Get sample covariance estimate and least-square VAR(1) coefficient matrix estimate
full_VAR_est = sparse_portf.least_square_VAR_fit()
full_cov_est = sparse_portf.autocov(0)

# Construct the sparse mean-reverting portfolio with the smallest predictability
# with a designated number of assets
cardinality = 8
greedy_weight = sparse_portf.greedy_search(cardinality, full_VAR_est, full_cov_est, maximize=False)

# Check the OU mean-reversion coefficient and half-life
mr_speed, half_life = sparse_portf.mean_rev_coeff(greedy_weight.squeeze(), sparse_portf.assets)

SDP with Convex Relaxation

Predictability with variance lower bound

# Use the covariance estimate, VAR(1) coefficient matrix estimate,
# and the designated cardinality from greedy search

# Calculate the median of the variance of each assets
variance_median = np.median(sparse_portf.autocov(0, use_standardized=False))

# Use a fraction of the median variance as the lower threshold and solve the SDP
sdp_pred_vol_result = sparse_portf.sdp_predictability_vol(rho=0.001,
                                                          variance=0.3*variance_median,
                                                          max_iter=5000, use_standardized=False)

# Deflate the SDP solution into the portfolio vector
sdp_pred_vol_weights = sparse_portf.sparse_eigen_deflate(sdp_pred_vol_result, cardinality)

# Check the OU mean-reversion coefficient and half-life
mr_speed, half_life = sparse_portf.mean_rev_coeff(sdp_pred_vol_weights.squeeze(),
                                                  sparse_portf.assets)

Portmanteau statistic with variance lower bound

# Use a fraction of the median variance as the lower threshold and solve the SDP
sdp_portmanteau_vol_result = sparse_portf.sdp_portmanteau_vol(rho=0.001,
                                                              variance=0.3*variance_median,
                                                              nlags=3, max_iter=10000,
                                                              use_standardized=False)

# Deflate the SDP solution into the portfolio vector
sdp_portmanteau_vol_weights = sparse_portf.sparse_eigen_deflate(sdp_portmanteau_vol_result,
                                                                cardinality)

# Check the OU mean-reversion coefficient and half-life
mr_speed, half_life = sparse_portf.mean_rev_coeff(sdp_pred_portmanteau_vol_weights.squeeze(),
                                                  sparse_portf.assets)

Crossing statistic with variance lower bound

# Use a fraction of the median variance as the lower threshold and solve the SDP
sdp_crossing_vol_result = sparse_portf.sdp_crossing_vol(rho=0.001, mu=0.01,
                                                        variance=0.3*variance_median,
                                                        nlags=3, max_iter=10000,
                                                        use_standardized=False)

# Deflate the SDP solution into the portfolio vector
sdp_crossing_vol_weights = sparse_portf.sparse_eigen_deflate(sdp_crossing_vol_result, cardinality)

# Check the OU mean-reversion coefficient and half-life
mr_speed, half_life = sparse_portf.mean_rev_coeff(sdp_pred_crossing_vol_weights.squeeze(),
                                                  sparse_portf.assets)

Research Notebook

The following research notebook can be used to better understand how to use the sparse mean-reverting portfolio selection module.

Research Article


Presentation Slides

../_images/sparse_mr_slides.png

References