Version 0.8

This release fixes a major bug in (CO)MIGHT, where low sample sizes produced biased tree posteriors. The fix stratifies the sampling of the dataset so that each class is represented in the bootstrap sample. The release also includes a number of other bug fixes and improvements to the codebase.
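As a rough illustration of why stratification matters at low sample sizes, the sketch below compares an ordinary bootstrap with a per-class (stratified) bootstrap on a small imbalanced label vector. It is not the sktree implementation: the helper names `stratified_bootstrap_indices` and `plain_bootstrap_indices` are hypothetical, and resampling each class at its original size is an assumption made only for this example.

```python
import random
from collections import defaultdict


def stratified_bootstrap_indices(labels, rng):
    """Draw a bootstrap sample of indices separately within each class.

    Hypothetical helper for illustration; not the sktree implementation.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    sample = []
    for label in sorted(by_class):
        members = by_class[label]
        # Resample with replacement inside the class, keeping its size fixed
        # (an assumption of this sketch), so no class can drop out.
        sample.extend(rng.choices(members, k=len(members)))
    return sample


def plain_bootstrap_indices(labels, rng):
    """Ordinary bootstrap: resample all indices with replacement."""
    n = len(labels)
    return rng.choices(range(n), k=n)


# Small, imbalanced toy dataset: 8 samples of class 0, 2 of class 1.
labels = [0] * 8 + [1] * 2
rng = random.Random(0)

n_trials = 1000
# Count how many bootstrap draws contain fewer than both classes.
plain_missing = sum(
    len({labels[i] for i in plain_bootstrap_indices(labels, rng)}) < 2
    for _ in range(n_trials)
)
stratified_missing = sum(
    len({labels[i] for i in stratified_bootstrap_indices(labels, rng)}) < 2
    for _ in range(n_trials)
)

print("plain bootstrap draws missing a class:", plain_missing)
print("stratified bootstrap draws missing a class:", stratified_missing)
```

With these toy labels, an ordinary bootstrap omits the minority class with probability 0.8^10, or about 11% of draws, whereas the stratified draw never omits a class by construction.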

Changelog

  • Fix Previously, missing values in the X input array for sktree estimators did not raise an error; the estimators ran silently, treating the missing values as if they were encoded as infinity. The estimators now raise a ValueError if missing values are encountered in the X input array. By Adam Li (#264)

  • Feature Simulations in sktree.datasets.hyppo now emit a warning instead of raising an error when the number of samples is less than the number of dimensions. By Sambit Panda (#279)

  • API Change sktree.HonestForestClassifier now uses bootstrap=True by default. By Adam Li (#274)

  • API Change Removed all instances of FeatureImportanceForestClassifier and outdated MIGHT code. By Adam Li (#274)

  • Fix Fixed a bug in sktree.HonestForestClassifier where posteriors estimated on out-of-bag (oob) samples were biased at low sample sizes, because class imbalance could arise in the bootstrap sample when bootstrap=True. By Adam Li (#283)
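The first Fix entry above replaces a silent run with an explicit error. The sketch below shows that pattern in plain Python, assuming missing values are encoded as NaN (the changelog does not say how they are encoded). The helper `check_no_missing` is hypothetical and is not part of sktree.

```python
import math


def check_no_missing(X):
    """Raise ValueError if any entry of X is NaN (hypothetical helper)."""
    for row_idx, row in enumerate(X):
        for col_idx, value in enumerate(row):
            if math.isnan(value):
                # Fail loudly instead of silently treating the entry
                # as some sentinel value.
                raise ValueError(
                    f"Input X contains NaN at row {row_idx}, column {col_idx}."
                )
    return X


X_clean = [[0.1, 1.2], [3.4, 5.6]]
X_missing = [[0.1, float("nan")], [3.4, 5.6]]

check_no_missing(X_clean)  # passes silently

try:
    check_no_missing(X_missing)
except ValueError as err:
    caught = str(err)
else:
    caught = None

print(caught)
```

Failing early like this surfaces data problems at fit time rather than letting them distort the fitted trees.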

Code and Documentation Contributors

Thanks to everyone who has contributed to the maintenance and improvement of the project since its inception, including: