treeple.stats.build_permutation_forest
- treeple.stats.build_permutation_forest(est, perm_est, X, y, covariate_index=None, metric='s@98', n_repeats=500, verbose=False, seed=None, return_posteriors=True, **metric_kwargs)
Build a hypothesis testing forest using a permutation-forest approach.
The permutation-forest approach stems from standard permutation testing, where each null forest is trained on a new permutation of the dataset. The original test statistic is computed on the original data. The p-value is then computed by comparing the original test statistic to the null distribution of test statistics computed from the permuted forests.
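The comparison against the null distribution can be sketched in a few lines of NumPy. This is a minimal illustration, not treeple's implementation: the helper name `permutation_pvalue`, the one-sided "larger is more extreme" convention, and the +1 correction are all assumptions made for the example.

```python
import numpy as np


def permutation_pvalue(observed_stat, null_stats):
    """One-sided permutation p-value (hypothetical helper, not part of treeple).

    Assumes larger statistics are more extreme and applies the common
    +1 correction so the p-value is never exactly zero.
    """
    null_stats = np.asarray(null_stats, dtype=float)
    # Count null statistics at least as extreme as the observed one.
    n_extreme = int(np.sum(null_stats >= observed_stat))
    return (1.0 + n_extreme) / (1.0 + null_stats.size)


# Toy null distribution standing in for statistics from permuted forests.
null_stats = np.array([0.1, 0.2, 0.3, 0.4])
pvalue = permutation_pvalue(0.35, null_stats)
```

Here one of the four null statistics is at least as large as the observed value, so the sketch yields (1 + 1) / (1 + 4).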
- Parameters:
- est : Forest
The type of forest to use. Must be enabled with bootstrap=True.
- perm_est : Forest
The forest to use for the permuted dataset. Should be PermutationHonestForestClassifier.
- X : ArrayLike of shape (n_samples, n_features)
Data.
- y : ArrayLike of shape (n_samples, n_outputs)
Binary target, so n_outputs should be at most 1.
- covariate_index : ArrayLike of shape (n_covariates,), optional
The index array of covariates to shuffle, by default None.
- metric : str, optional
The metric to compute, by default "s@98", for sensitivity at 98% specificity.
- n_repeats : int, optional
Number of times to bootstrap sample the two forests to construct the null distribution, by default 500. The construction of the null forests will be parallelized according to the n_jobs argument of the est forest.
- verbose : bool, optional
Verbosity, by default False.
- seed : int, optional
Random seed, by default None.
- return_posteriors : bool, optional
Whether or not to return the posteriors, by default True.
- **metric_kwargs : dict, optional
Additional keyword arguments to pass to the metric function.
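To make the default "s@98" metric concrete, the following sketch computes sensitivity at a target specificity by scanning decision thresholds. It is an illustration only: the function name `sensitivity_at_specificity` is hypothetical, the threshold scan is a simplification that may differ from how treeple computes the metric, and it assumes both classes are present in `y_true`.

```python
import numpy as np


def sensitivity_at_specificity(y_true, y_score, target_specificity=0.98):
    """Best sensitivity among thresholds meeting the target specificity.

    Hypothetical helper for illustration; not treeple's implementation.
    """
    y_true = np.asarray(y_true).astype(bool)
    y_score = np.asarray(y_score, dtype=float)
    best = 0.0
    # Try each distinct score as a threshold: predict positive if score >= t.
    for t in np.unique(y_score):
        pred = y_score >= t
        specificity = np.mean(~pred[~y_true])  # true negative rate
        sensitivity = np.mean(pred[y_true])    # true positive rate
        if specificity >= target_specificity:
            best = max(best, sensitivity)
    return float(best)


# Toy binary labels and posterior scores for the positive class.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_score = [0.1, 0.2, 0.3, 0.6, 0.5, 0.7, 0.8, 0.9]
s_at_98 = sensitivity_at_specificity(y_true, y_score, target_specificity=0.98)
```

With only four negatives, reaching 98% specificity requires zero false positives, so the threshold must sit above the highest negative score (0.6); three of the four positives remain above it.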
- Returns:
- observe_stat : float
The test statistic. To compute the test statistic, take permute_stat_ and subtract observe_stat_.
- pvalue : float
The p-value of the test statistic.
- orig_forest_proba : ArrayLike of shape (n_estimators, n_samples, n_outputs)
The predicted posterior probabilities for each estimator on their out of bag samples.
- perm_forest_proba : ArrayLike of shape (n_estimators, n_samples, n_outputs)
The predicted posterior probabilities for each of the permuted estimators on their out of bag samples.
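Per-estimator posteriors of shape (n_estimators, n_samples, n_outputs) are typically averaged over estimators before a metric is applied. The sketch below shows one way to do that, under the assumption (not stated above) that entries for samples an estimator did not hold out of bag are stored as NaN. The array `proba` is made-up data, not output from treeple.

```python
import numpy as np

# Hypothetical posteriors: 3 estimators, 4 samples, 1 output.
# NaN marks samples that were in-bag for that estimator (an assumption).
proba = np.array([
    [[0.2], [np.nan], [0.9], [np.nan]],
    [[np.nan], [0.4], [0.7], [np.nan]],
    [[0.4], [0.6], [np.nan], [np.nan]],
])

valid = ~np.isnan(proba)
# Number of estimators with an out-of-bag prediction for each sample.
oob_counts = valid.sum(axis=0)
# Sum only the valid entries, treating NaN as zero.
oob_sums = np.where(valid, proba, 0.0).sum(axis=0)
# Average where at least one estimator contributed; leave NaN otherwise.
mean_proba = np.divide(
    oob_sums,
    oob_counts,
    out=np.full(oob_sums.shape, np.nan),
    where=oob_counts > 0,
)
```

The result has shape (n_samples, n_outputs). A sample that was never out of bag, like the last one here, stays NaN and would need to be excluded before computing a statistic.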
References