Conservation machine learning

Table 1 Conservation random forests

Feat	Info	Cl	Perf forests	Perf jungles	Imp
10	3	2	0.85 (0.02)	0.85 (0.02)	0.0%
20	10	3	0.83 (0.02)	0.84 (0.02)	1.2%
50	40	4	0.64 (0.03)	0.69 (0.03)	7.8%
100	90	5	0.49 (0.03)	0.62 (0.03)	26.5%
200	150	6	0.30 (0.03)	0.45 (0.03)	50.0%
300	270	7	0.22 (0.03)	0.35 (0.03)	59.1%
400	350	8	0.18 (0.03)	0.29 (0.03)	61.1%
500	400	9	0.14 (0.02)	0.22 (0.03)	57.1%
1000	500	10	0.11 (0.02)	0.15 (0.03)	36.4%
1000	800	10	0.12 (0.02)	0.17 (0.03)	41.7%

Each row reports the results of 30 replicate experiments, each using 5-fold cross-validation with 100 independent runs per fold, forests of size 100, and resultant jungles of size 10,000. Feat: number of features in the dataset. Info: number of informative features. Cl: number of target classes. Perf forests: mean test-set performance of forests across all replicates (SD). Perf jungles: mean test-set performance of jungles across all replicates (SD); a jungle’s output was computed by simple majority voting. Imp: percent improvement of Perf jungles over Perf forests.
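The two computations named in the legend, majority voting over a jungle and the Imp column, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `majority_vote` and `percent_improvement`, the toy predictions, and the tie-breaking rule are all assumptions. A jungle is taken here to be the pooled trees of several forests, consistent with 100 runs of 100-tree forests giving 10,000 trees.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return, for each sample, the class predicted by the most trees.

    tree_predictions: list of per-tree prediction lists, all the same length.
    Ties go to the smallest class label (an assumption; the source does not
    say how ties were handled).
    """
    n_samples = len(tree_predictions[0])
    voted = []
    for i in range(n_samples):
        counts = Counter(preds[i] for preds in tree_predictions)
        top = max(counts.values())
        voted.append(min(label for label, c in counts.items() if c == top))
    return voted


def percent_improvement(perf_forests, perf_jungles):
    """Relative gain of the jungle score over the forest score, in percent."""
    return 100.0 * (perf_jungles - perf_forests) / perf_forests


# Toy example: two hypothetical 2-tree forests pooled into one 4-tree jungle,
# each tree predicting a class for three samples.
forest_a = [[0, 1, 1], [0, 1, 2]]
forest_b = [[0, 2, 2], [1, 1, 2]]
jungle = forest_a + forest_b

print(majority_vote(jungle))                      # [0, 1, 2]
print(round(percent_improvement(0.49, 0.62), 1))  # 26.5
```

Applying `percent_improvement` to the rounded means in each row reproduces the Imp values listed in the table, for example 26.5% for the row with 100 features.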

ISSN: 1756-0381