arbitragelab.codependence.codependence_matrix
This implementation lets the user generate dependence and distance matrices based on the various methods of Information Codependence described in the Cornell lecture notes on Codependence: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3512994&download=yes
Module Contents
Functions
get_dependence_matrix | This function returns a dependence matrix for elements given in the dataframe using the chosen dependence method.
get_distance_matrix | Applies distance operator to a dependence matrix.
- get_dependence_matrix(df: pandas.DataFrame, dependence_method: str, theta: float = 0.5, n_bins: int = None, normalize: bool = True, estimator: str = 'standard', target_dependence: str = 'comonotonicity', gaussian_corr: float = 0.7, var_threshold: float = 0.2) pandas.DataFrame
This function returns a dependence matrix for elements given in the dataframe using the chosen dependence method.
List of supported algorithms to use for generating the dependence matrix: information_variation, mutual_information, distance_correlation, spearmans_rho, gpr_distance, gnpr_distance, optimal_transport.
- Parameters:
df – (pd.DataFrame) Features.
dependence_method – (str) Algorithm to be used for generating the dependence matrix.
theta – (float) Type of information being tested in the GPR and GNPR distances. Falls in range [0, 1]. (0.5 by default)
n_bins – (int) Number of bins for discretization in information_variation and mutual_information; if None, the optimal number will be calculated. (None by default)
normalize – (bool) Flag used to normalize the result to [0, 1] in information_variation and mutual_information. (True by default)
estimator – (str) Estimator to be used for calculation in mutual_information. [standard, standard_copula, copula_entropy] (standard by default)
target_dependence – (str) Type of target dependence to use in optimal_transport. [comonotonicity, countermonotonicity, gaussian, positive_negative, different_variations, small_variations] (comonotonicity by default)
gaussian_corr – (float) Correlation coefficient to use when creating gaussian and small_variations copulas. [from 0 to 1] (0.7 by default)
var_threshold – (float) Variation threshold to use for the coefficient in small_variations. Sets the relative area of correlation in a copula. [from 0 to 1] (0.2 by default)
- Returns:
(pd.DataFrame) Dependence matrix.
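To illustrate what a dependence matrix looks like, the following sketch builds a Spearman's rho matrix directly with pandas. It is not the library's implementation: the helper name spearman_dependence_matrix and the synthetic data are assumptions made for this example. With the library installed, the documented counterpart would be a call such as get_dependence_matrix(df, dependence_method='spearmans_rho').

```python
import numpy as np
import pandas as pd


def spearman_dependence_matrix(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper (not part of the library): pairwise Spearman's rho."""
    # Spearman's rho is the Pearson correlation of the ranked observations.
    return df.rank().corr(method="pearson")


# Synthetic features, assumed purely for illustration.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
features = pd.DataFrame({
    "a": x,
    "b": np.exp(x),             # monotone transform of "a"
    "c": rng.normal(size=200),  # independent noise
})

# Square, symmetric matrix indexed by the column names of `features`.
dep = spearman_dependence_matrix(features)
print(dep.round(3))
```

Because "b" is a monotone transform of "a", their rank correlation is 1 even though the relationship is nonlinear, which is the kind of dependence a rank-based method is meant to capture.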
- get_distance_matrix(X: pandas.DataFrame, distance_metric: str = 'angular') pandas.DataFrame
Applies distance operator to a dependence matrix.
This turns a correlation matrix into a distance matrix. The distances used are true metrics.
List of supported distance metrics to use for generating the distance matrix: angular, squared_angular, and absolute_angular.
- Parameters:
X – (pd.DataFrame) Dataframe to which the distance operator is to be applied.
distance_metric – (str) The distance metric to be used for generating the distance matrix.
- Returns:
(pd.DataFrame) Distance matrix.
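The three metrics can be sketched with the textbook correlation-based distance formulas: sqrt(0.5 * (1 - rho)) for angular, sqrt(1 - |rho|) for absolute angular, and sqrt(1 - rho^2) for squared angular. This is a minimal sketch under that assumption, not the library's code. The helper name angular_distances is hypothetical, and the library's exact scaling or rounding may differ.

```python
import numpy as np
import pandas as pd


def angular_distances(corr: pd.DataFrame, metric: str = "angular") -> pd.DataFrame:
    """Hypothetical helper (not the library function): correlation -> distance."""
    if metric == "angular":
        inner = 0.5 * (1.0 - corr)
    elif metric == "absolute_angular":
        inner = 1.0 - corr.abs()
    elif metric == "squared_angular":
        inner = 1.0 - corr ** 2
    else:
        raise ValueError(f"Unknown distance metric: {metric}")
    # Clip tiny negative values caused by floating-point error before the root.
    return np.sqrt(inner.clip(lower=0.0))


# Small example correlation matrix (assumed data): "c" is perfectly
# anti-correlated with "a".
corr = pd.DataFrame(
    [[1.0, 0.5, -1.0],
     [0.5, 1.0, -0.5],
     [-1.0, -0.5, 1.0]],
    index=list("abc"), columns=list("abc"),
)

dist = angular_distances(corr, "angular")
abs_dist = angular_distances(corr, "absolute_angular")
print(dist)
```

Under the angular metric, perfectly anti-correlated series sit at the maximum distance of 1, while the absolute and squared variants treat them as identical (distance 0). The variants are the natural choice when the sign of the correlation does not matter.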