treeple.stats.build_cv_forest
- treeple.stats.build_cv_forest(est, X, y, cv=5, test_size=0.2, verbose=False, return_indices=False, seed=None)
Build a hypothesis testing forest using cross-validation.
- Parameters:
- est : Forest
The type of forest to use. Must be enabled with bootstrap=True.
- X : ArrayLike of shape (n_samples, n_features)
Data.
- y : ArrayLike of shape (n_samples, n_outputs)
Binary target, so n_outputs should be at most 1.
- cv : int, optional
Number of folds to use for cross-validation, by default 5.
- test_size : float, optional
Proportion of samples per tree to use for the test set, by default 0.2.
- verbose : bool, optional
Verbosity, by default False.
- return_indices : bool, optional
Whether or not to return the train and test indices, by default False.
- seed : int, optional
Random seed, by default None.
- Returns:
- est : Forest
Fitted forest.
- all_proba_list : list of ArrayLike of shape (n_estimators, n_samples, n_outputs)
The predicted posterior probabilities of each estimator on its out-of-bag samples. The length of the list equals the number of splits.
- train_idx_list : list of ArrayLike of shape (n_samples,)
The training indices for each split. Returned only when return_indices=True.
- test_idx_list : list of ArrayLike of shape (n_samples,)
The testing indices for each split. Returned only when return_indices=True.
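To make the structure of the index outputs concrete, the following is a minimal sketch, using only the Python standard library, of how cv folds could partition sample indices into one train list and one test list per split. The helper name kfold_indices is hypothetical, and the plain shuffled K-fold scheme is an assumption made for illustration; the splitter that build_cv_forest uses internally may differ (for example, it may stratify on y).

```python
import random


def kfold_indices(n_samples, cv=5, seed=None):
    """Hypothetical helper: shuffled K-fold split of range(n_samples).

    Returns (train_idx_list, test_idx_list), each a list of length ``cv``.
    Illustrative assumption only, not treeple's implementation.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)

    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    folds = [sorted(indices[k::cv]) for k in range(cv)]

    train_idx_list, test_idx_list = [], []
    for k in range(cv):
        # Fold k is held out for testing; all other folds form the training set.
        test_idx = folds[k]
        train_idx = sorted(
            i for j, fold in enumerate(folds) if j != k for i in fold
        )
        train_idx_list.append(train_idx)
        test_idx_list.append(test_idx)
    return train_idx_list, test_idx_list


train_idx_list, test_idx_list = kfold_indices(n_samples=10, cv=5, seed=0)
print(len(train_idx_list), len(test_idx_list))  # 5 5
```

In this sketch each sample index appears in exactly one test list, and the train and test lists of any single split are disjoint.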