treeple.stats.build_cv_forest

treeple.stats.build_cv_forest(est, X, y, cv=5, test_size=0.2, verbose=False, return_indices=False, seed=None)

Build a hypothesis testing forest using cross-validation.

Parameters:
est : Forest

The forest estimator to use. It must be constructed with bootstrap=True.

X : ArrayLike of shape (n_samples, n_features)

Data.

y : ArrayLike of shape (n_samples, n_outputs)

Binary target, so n_outputs should be at most 1.

cv : int, optional

Number of folds to use for cross-validation, by default 5.

test_size : float, optional

Proportion of samples per tree to use for the test set, by default 0.2.

verbose : bool, optional

Verbosity, by default False.

return_indices : bool, optional

Whether or not to return the train and test indices, by default False.

seed : int, optional

Random seed, by default None.

Returns:
est : Forest

Fitted forest.

all_proba_list : list of ArrayLike of shape (n_estimators, n_samples, n_outputs)

The predicted posterior probabilities for each estimator on its out-of-bag samples. The list has one entry per split.

train_idx_list : list of ArrayLike of shape (n_samples,)

The training indices for each split.

test_idx_list : list of ArrayLike of shape (n_samples,)

The testing indices for each split.
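
Examples:

To make the structure of train_idx_list and test_idx_list concrete, the following is a minimal sketch of how a generic k-fold split partitions sample indices into one (train, test) pair per split. It uses only the Python standard library and is not treeple's implementation: the helper name kfold_indices is hypothetical, and the actual function may stratify by class or apply test_size differently.

```python
import random


def kfold_indices(n_samples, n_splits=5, seed=None):
    """Hypothetical helper: partition sample indices into k folds.

    Returns two lists of length n_splits, holding the training and
    testing indices for each split.
    """
    indices = list(range(n_samples))
    # Shuffle with a dedicated generator so the result is reproducible.
    random.Random(seed).shuffle(indices)

    # Spread any remainder so fold sizes differ by at most one.
    base, extra = divmod(n_samples, n_splits)
    fold_sizes = [base + (1 if i < extra else 0) for i in range(n_splits)]

    train_idx_list, test_idx_list = [], []
    start = 0
    for size in fold_sizes:
        test_idx = sorted(indices[start:start + size])
        held_out = set(test_idx)
        # Every sample not held out for testing is used for training.
        train_idx = [i for i in range(n_samples) if i not in held_out]
        train_idx_list.append(train_idx)
        test_idx_list.append(test_idx)
        start += size
    return train_idx_list, test_idx_list


train_idx_list, test_idx_list = kfold_indices(n_samples=10, n_splits=5, seed=0)
for split, (train_idx, test_idx) in enumerate(zip(train_idx_list, test_idx_list)):
    print(f"split {split}: {len(train_idx)} train, {len(test_idx)} test")
```

In this sketch each sample appears in exactly one test fold, so the test folds together cover the whole dataset, and the number of entries in each list equals the number of splits (cv).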