arbitragelab.codependence.codependence_matrix

This module lets users generate dependence and distance matrices using the information codependence methods described in the Cornell lecture notes on Codependence: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3512994&download=yes

Module Contents

Functions

get_dependence_matrix(→ pandas.DataFrame)

This function returns a dependence matrix for elements given in the dataframe using the chosen dependence method.

get_distance_matrix(→ pandas.DataFrame)

Applies distance operator to a dependence matrix.

get_dependence_matrix(df: pandas.DataFrame, dependence_method: str, theta: float = 0.5, n_bins: int = None, normalize: bool = True, estimator: str = 'standard', target_dependence: str = 'comonotonicity', gaussian_corr: float = 0.7, var_threshold: float = 0.2) → pandas.DataFrame

This function returns a dependence matrix for elements given in the dataframe using the chosen dependence method.

List of supported algorithms to use for generating the dependence matrix: information_variation, mutual_information, distance_correlation, spearmans_rho, gpr_distance, gnpr_distance, optimal_transport.

Parameters:
  • df – (pd.DataFrame) Features.

  • dependence_method – (str) Algorithm to be used for generating the dependence matrix.

  • theta – (float) Type of information being tested in the GPR and GNPR distances. Falls in range [0, 1]. (0.5 by default)

  • n_bins – (int) Number of bins for discretization in information_variation and mutual_information. If None, the optimal number is calculated. (None by default)

  • normalize – (bool) Flag used to normalize the result to [0, 1] in information_variation and mutual_information. (True by default)

  • estimator – (str) Estimator to be used for calculation in mutual_information. [standard, standard_copula, copula_entropy] (standard by default)

  • target_dependence – (str) Type of target dependence to use in optimal_transport. [comonotonicity, countermonotonicity, gaussian, positive_negative, different_variations, small_variations] (comonotonicity by default)

  • gaussian_corr – (float) Correlation coefficient to use when creating gaussian and small_variations copulas. [from 0 to 1] (0.7 by default)

  • var_threshold – (float) Variation threshold to use in small_variations. Sets the relative area of correlation in a copula. [from 0 to 1] (0.2 by default)

Returns:

(pd.DataFrame) Dependence matrix.
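
To illustrate what a dependence matrix looks like, the sketch below computes pairwise Spearman's rho between columns using plain pandas. It is not the library's implementation: the helper name `spearman_dependence_sketch` and the sample data are hypothetical, and the library's `spearmans_rho` method may differ in details such as NaN handling. Going by the signature above, the corresponding library call would be `get_dependence_matrix(df, dependence_method='spearmans_rho')`.

```python
import numpy as np
import pandas as pd


def spearman_dependence_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: pairwise Spearman's rho between the columns of df."""
    # Spearman's rho equals the Pearson correlation of the column ranks.
    return df.rank().corr(method="pearson")


# Hypothetical sample data: column C is a strictly increasing transform of A,
# column B is independent noise.
rng = np.random.default_rng(seed=0)
x = rng.normal(size=200)
df = pd.DataFrame({"A": x, "B": rng.normal(size=200), "C": np.exp(x)})

dep = spearman_dependence_sketch(df)
print(dep.round(3))
```

The result is a square, symmetric DataFrame indexed by the column names on both axes. Because rank correlation is invariant under monotone transforms, the A/C entry is 1 even though the relationship between the two columns is nonlinear.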

get_distance_matrix(X: pandas.DataFrame, distance_metric: str = 'angular') → pandas.DataFrame

Applies distance operator to a dependence matrix.

This turns a correlation matrix into a distance matrix. The distances used are true metrics.

List of supported distance metrics to use for generating the distance matrix: angular, squared_angular, and absolute_angular.

Parameters:
  • X – (pd.DataFrame) Dataframe to which the distance operator is applied.

  • distance_metric – (str) The distance metric to be used for generating the distance matrix.

Returns:

(pd.DataFrame) Distance matrix.
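
The sketch below shows one way the three distance operators could be written, assuming they follow the usual definitions from the cited lecture notes: angular as sqrt((1 - rho) / 2), absolute angular as sqrt(1 - |rho|), and squared angular as sqrt(1 - rho^2). The helper name `distance_matrix_sketch` and the sample matrix are hypothetical, and the library's own code may differ.

```python
import numpy as np
import pandas as pd


def distance_matrix_sketch(corr: pd.DataFrame, distance_metric: str = "angular") -> pd.DataFrame:
    """Hypothetical helper: map a correlation matrix to a distance matrix."""
    # Clip to [-1, 1] so floating-point noise cannot produce a negative radicand.
    rho = corr.clip(lower=-1.0, upper=1.0)
    if distance_metric == "angular":
        radicand = (1.0 - rho) / 2.0
    elif distance_metric == "absolute_angular":
        radicand = 1.0 - rho.abs()
    elif distance_metric == "squared_angular":
        radicand = 1.0 - rho ** 2
    else:
        raise ValueError(f"Unknown distance metric: {distance_metric}")
    return np.sqrt(radicand)


# Hypothetical 2x2 correlation matrix.
corr = pd.DataFrame([[1.0, -0.5], [-0.5, 1.0]], index=["A", "B"], columns=["A", "B"])

angular = distance_matrix_sketch(corr, "angular")
absolute = distance_matrix_sketch(corr, "absolute_angular")
squared = distance_matrix_sketch(corr, "squared_angular")
print(angular.round(4))
```

Under these definitions the angular distance treats negatively correlated series as far apart, while the absolute and squared variants depend only on the magnitude of the correlation, so strongly negative and strongly positive correlations both give small distances.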