Table 1 Machine learning algorithms and parameters tuned in the PMLB benchmark

From: PMLB: a large benchmark suite for machine learning evaluation and comparison

| Machine learning algorithm | Tuned parameters |
| --- | --- |
| Gaussian Naïve Bayes (NB) | No parameters. |
| Bernoulli Naïve Bayes | `alpha`: Additive smoothing parameter. |
|  | `binarize`: Threshold for binarizing the features. |
|  | `fit_prior`: Whether or not to learn class prior probabilities. |
| Multinomial Naïve Bayes | `alpha`: Additive smoothing parameter. |
|  | `fit_prior`: Whether or not to learn class prior probabilities. |
| Logistic regression | `C`: Regularization strength. |
|  | `penalty`: Whether to use Lasso or Ridge regularization. |
|  | `fit_intercept`: Whether or not the intercept of the linear classifier should be computed. |
| Linear classifier trained via stochastic gradient descent (SGD) | `loss`: Loss function to be optimized. |
|  | `penalty`: Whether to use Lasso, Ridge, or ElasticNet regularization. |
|  | `alpha`: Regularization strength. |
|  | `learning_rate`: Shrinks the contribution of each successive training update. |
|  | `fit_intercept`: Whether or not the intercept of the linear classifier should be computed. |
|  | `l1_ratio`: Ratio of Lasso vs. Ridge regularization to use. Only used when the `penalty` is ElasticNet. |
|  | `eta0`: Initial learning rate. |
|  | `power_t`: Exponent for inverse scaling of the learning rate. |
| Linear classifier trained via the passive aggressive algorithm | `loss`: Loss function to be optimized. |
|  | `C`: Maximum step size for regularization. |
|  | `fit_intercept`: Whether or not the intercept of the linear classifier should be computed. |
| Support vector machine for classification (SVC) | `kernel`: ‘linear’, ‘poly’, ‘sigmoid’, or ‘rbf’. |
|  | `C`: Penalty parameter for regularization. |
|  | `gamma`: Kernel coefficient for the ‘rbf’, ‘poly’, and ‘sigmoid’ kernels. |
|  | `degree`: Degree for the ‘poly’ kernel. |
|  | `coef0`: Independent term in the ‘poly’ and ‘sigmoid’ kernels. |
| K-Nearest Neighbor (KNN) | `n_neighbors`: Number of neighbors to use. |
|  | `weights`: Function to weight the neighbors’ votes. |
| Decision tree | `min_weight_fraction_leaf`: The minimum weighted fraction of samples required for a node to be a leaf. Controls the depth and complexity of the decision tree. |
|  | `max_features`: Number of features to consider when computing the best node split. |
|  | `criterion`: Function used to measure the quality of a split. |
| Random forest & Extra random forest (a.k.a. Extra Trees Classifier) | `n_estimators`: Number of decision trees in the ensemble. |
|  | `min_weight_fraction_leaf`: The minimum weighted fraction of samples required for a node to be a leaf. Controls the depth and complexity of the decision trees. |
|  | `max_features`: Number of features to consider when computing the best node split. |
|  | `criterion`: Function used to measure the quality of a split. |
| AdaBoost | `n_estimators`: Number of decision trees in the ensemble. |
|  | `learning_rate`: Shrinks the contribution of each successive decision tree in the ensemble. |
| Gradient tree boosting | `n_estimators`: Number of decision trees in the ensemble. |
|  | `learning_rate`: Shrinks the contribution of each successive decision tree in the ensemble. |
|  | `loss`: Loss function to be optimized via gradient boosting. |
|  | `max_depth`: Maximum depth of the decision trees. Controls the complexity of the decision trees. |
|  | `max_features`: Number of features to consider when computing the best node split. |
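For concreteness, the sketch below shows how a parameter grid like the ones in Table 1 can be searched with scikit-learn's `GridSearchCV`, using the SVC row as an example. The specific value ranges, the choice of the ‘mushroom’ dataset, and the use of the `pmlb` package's `fetch_data` helper are illustrative assumptions for this sketch, not the exact grids or protocol used in the PMLB experiments.

```python
# Illustrative sketch: tuning the SVC parameters from Table 1 with grid search.
# The value ranges below are assumptions for demonstration, not the PMLB grids.
from pmlb import fetch_data
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# 'mushroom' is an arbitrary PMLB dataset chosen for this example.
X, y = fetch_data("mushroom", return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Grid over the SVC parameters listed in Table 1.
param_grid = {
    "kernel": ["linear", "poly", "sigmoid", "rbf"],
    "C": [0.01, 0.1, 1.0, 10.0],
    "gamma": [0.01, 0.1, 1.0],  # used by the 'rbf', 'poly', and 'sigmoid' kernels
    "degree": [2, 3],           # used only by the 'poly' kernel
    "coef0": [0.0, 0.5],        # used by the 'poly' and 'sigmoid' kernels
}

search = GridSearchCV(SVC(), param_grid, cv=5, n_jobs=-1)
search.fit(X_train, y_train)

print("best parameters:", search.best_params_)
print("held-out accuracy: %.3f" % search.score(X_test, y_test))
```

The same pattern applies to the other rows of the table: swap in the corresponding scikit-learn estimator and build the grid from the parameter names in its "Tuned parameters" cell.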