treeple.experimental.mutual_info_ksg

treeple.experimental.mutual_info_ksg(X, Y, Z=None, k=0.2, metric='forest', algorithm='kd_tree', n_jobs=-1, transform='rank', random_seed=None)

Compute the generalized (conditional) mutual information KSG estimate.

Parameters:
X : ArrayLike of shape (n_samples, n_features_x)

The X covariate space.

Y : ArrayLike of shape (n_samples, n_features_y)

The Y covariate space.

Z : ArrayLike of shape (n_samples, n_features_z), optional

The Z covariate space, by default None. If None, the mutual information (MI) between X and Y is computed. If Z is given, the conditional mutual information (CMI) of X and Y given Z is computed.

k : float, optional

The number of neighbors to use in defining the radius, by default 0.2.

metric : str

Any distance metric accepted by sklearn.neighbors.NearestNeighbors. If ‘forest’ (default), then uses a treeple.UnsupervisedObliqueRandomForest to compute geodesic distances.

algorithm : str, optional

Nearest-neighbor search method to use, by default ‘kd_tree’. One of ‘ball_tree’, ‘kd_tree’, or ‘brute’.

n_jobs : int, optional

Number of parallel jobs, by default -1.

transform : one of {‘rank’, ‘standardize’, ‘uniform’}

Preprocessing transform applied to the data, by default ‘rank’.

random_seed : int, optional

Random seed, by default None.

Returns:
val : float

The estimated MI or CMI value.

Notes

Given a dataset with n samples, the KSG estimator proceeds as follows (for MI, where Z is None, the XYZ, XZ, and YZ subspaces reduce to XY, X, and Y, and step 4 is not needed):

  1. For fixed k, get the distance to the k-th nearest neighbor in the XYZ subspace; call it ‘r’

  2. Count the nearest neighbors (NN) in the XZ subspace within radius ‘r’

  3. Count the NN in the YZ subspace within radius ‘r’

  4. Count the NN in the Z subspace within radius ‘r’

  5. Apply the analytical solution for the KSG estimate

For MI, the analytical solution is:

\[\psi(k) - E[(\psi(n_x) + \psi(n_y))] + \psi(n)\]
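The steps and the MI formula above can be illustrated with a small pure-Python sketch. This is not treeple's implementation: the helper names (`digamma`, `chebyshev`, `ksg_mi_sketch`) are hypothetical, the max-norm distance stands in for the configurable metric, `k` is taken as an integer neighbor count, and points at distance exactly ‘r’ are counted so that every neighbor count is at least `k` and the digamma function is never evaluated at zero.

```python
import math
import random


def digamma(x):
    """Approximate psi(x) for x > 0 via recurrence plus an asymptotic series."""
    if x <= 0:
        raise ValueError("this sketch only handles x > 0")
    acc = 0.0
    while x < 6.0:  # psi(x) = psi(x + 1) - 1 / x
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return acc + math.log(x) - 0.5 / x - series


def chebyshev(a, b):
    """Max-norm distance between two equal-length tuples (assumed metric)."""
    return max(abs(u - v) for u, v in zip(a, b))


def ksg_mi_sketch(xs, ys, k=3):
    """MI estimate: psi(k) - mean(psi(n_x) + psi(n_y)) + psi(n)."""
    n = len(xs)
    joint = [tuple(x) + tuple(y) for x, y in zip(xs, ys)]
    total = 0.0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Step 1: radius r = distance to the k-th nearest neighbor in XY.
        r = sorted(chebyshev(joint[i], joint[j]) for j in others)[k - 1]
        # Steps 2-3: neighbor counts in X and in Y alone (sample i excluded).
        # Assumption: points at distance exactly r are counted, so counts >= k.
        n_x = sum(chebyshev(xs[i], xs[j]) <= r for j in others)
        n_y = sum(chebyshev(ys[i], ys[j]) <= r for j in others)
        total += digamma(n_x) + digamma(n_y)
    # Final step: the expectation is replaced by the sample average.
    return digamma(k) - total / n + digamma(n)


rng = random.Random(0)
xs = [(rng.gauss(0.0, 1.0),) for _ in range(200)]
ys_indep = [(rng.gauss(0.0, 1.0),) for _ in range(200)]
mi_indep = ksg_mi_sketch(xs, ys_indep, k=5)  # independent inputs
mi_dep = ksg_mi_sketch(xs, xs, k=5)          # Y identical to X
```

In this sketch the estimate for independent inputs should stay close to zero, while the estimate for identical inputs should be clearly positive. The brute-force neighbor search is quadratic in the number of samples and is only meant to make the steps explicit.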

For CMI, the analytical solution is:

\[\psi(k) - E[(\psi(n_{xz}) + \psi(n_{yz}) - \psi(n_{z}))]\]

where \(\psi\) is the digamma function, and each expectation term is estimated by taking the sample average.

Note that the \(n_i\) terms denote the number of neighbors within radius ‘r’ in the subspace of ‘i’, where ‘i’ could be for example the X, Y, XZ, etc. subspaces. This term does not include the sample itself.
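The CMI formula can be sketched the same way. As before, this is an illustration under stated assumptions rather than the library's code: the helper names are hypothetical, the max-norm distance is assumed, `k` is an integer neighbor count, and neighbors at distance exactly ‘r’ are included so that each \(n_i\) is at least `k`.

```python
import math
import random


def digamma(x):
    """Approximate psi(x) for x > 0 via recurrence plus an asymptotic series."""
    if x <= 0:
        raise ValueError("this sketch only handles x > 0")
    acc = 0.0
    while x < 6.0:  # psi(x) = psi(x + 1) - 1 / x
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return acc + math.log(x) - 0.5 / x - series


def chebyshev(a, b):
    """Max-norm distance between two equal-length tuples (assumed metric)."""
    return max(abs(u - v) for u, v in zip(a, b))


def ksg_cmi_sketch(xs, ys, zs, k=3):
    """CMI estimate: psi(k) - mean(psi(n_xz) + psi(n_yz) - psi(n_z))."""
    n = len(xs)
    xz = [tuple(x) + tuple(z) for x, z in zip(xs, zs)]
    yz = [tuple(y) + tuple(z) for y, z in zip(ys, zs)]
    xyz = [tuple(x) + tuple(y) + tuple(z) for x, y, z in zip(xs, ys, zs)]
    total = 0.0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Step 1: radius r from the k-th nearest neighbor in the XYZ space.
        r = sorted(chebyshev(xyz[i], xyz[j]) for j in others)[k - 1]
        # Steps 2-4: counts within r in XZ, YZ, and Z (sample i excluded).
        n_xz = sum(chebyshev(xz[i], xz[j]) <= r for j in others)
        n_yz = sum(chebyshev(yz[i], yz[j]) <= r for j in others)
        n_z = sum(chebyshev(zs[i], zs[j]) <= r for j in others)
        total += digamma(n_xz) + digamma(n_yz) - digamma(n_z)
    # Step 5: the expectation is replaced by the sample average.
    return digamma(k) - total / n


rng = random.Random(1)
zs = [(rng.gauss(0.0, 1.0),) for _ in range(200)]
# X and Y depend on each other only through Z.
xs_ci = [(z[0] + rng.gauss(0.0, 1.0),) for z in zs]
ys_ci = [(z[0] + rng.gauss(0.0, 1.0),) for z in zs]
cmi_ci = ksg_cmi_sketch(xs_ci, ys_ci, zs, k=5)
# X is independent of Z and Y is a copy of X.
xs_dep = [(rng.gauss(0.0, 1.0),) for _ in range(200)]
cmi_dep = ksg_cmi_sketch(xs_dep, xs_dep, zs, k=5)
```

When X and Y are related only through Z, the sketch should return a value near zero; when Y copies X regardless of Z, it should return a clearly positive value.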