Table 1 Machine learning algorithms and parameters tuned in the PMLB benchmark

From: PMLB: a large benchmark suite for machine learning evaluation and comparison

| Machine learning algorithm | Tuned parameters |
| --- | --- |
| Gaussian Naïve Bayes (NB) | No parameters. |
| Bernoulli Naïve Bayes | `alpha`: Additive smoothing parameter. |
|  | `binarize`: Threshold for binarizing the features. |
|  | `fit_prior`: Whether or not to learn class prior probabilities. |
| Multinomial Naïve Bayes | `alpha`: Additive smoothing parameter. |
|  | `fit_prior`: Whether or not to learn class prior probabilities. |
| Logistic regression | `C`: Regularization strength. |
|  | `penalty`: Whether to use Lasso or Ridge regularization. |
|  | `fit_intercept`: Whether or not the intercept of the linear classifier should be computed. |
| Linear classifier trained via stochastic gradient descent (SGD) | `loss`: Loss function to be optimized. |
|  | `penalty`: Whether to use Lasso, Ridge, or ElasticNet regularization. |
|  | `alpha`: Regularization strength. |
|  | `learning_rate`: Shrinks the contribution of each successive training update. |
|  | `fit_intercept`: Whether or not the intercept of the linear classifier should be computed. |
|  | `l1_ratio`: Ratio of Lasso vs. Ridge regularization to use. Only used when the `penalty` is ElasticNet. |
|  | `eta0`: Initial learning rate. |
|  | `power_t`: Exponent for inverse scaling of the learning rate. |
| Linear classifier trained via the passive aggressive algorithm | `loss`: Loss function to be optimized. |
|  | `C`: Maximum step size for regularization. |
|  | `fit_intercept`: Whether or not the intercept of the linear classifier should be computed. |
| Support vector machine for classification (SVC) | `kernel`: ‘linear’, ‘poly’, ‘sigmoid’, or ‘rbf’. |
|  | `C`: Penalty parameter for regularization. |
|  | `gamma`: Kernel coefficient for the ‘rbf’, ‘poly’, and ‘sigmoid’ kernels. |
|  | `degree`: Degree for the ‘poly’ kernel. |
|  | `coef0`: Independent term in the ‘poly’ and ‘sigmoid’ kernels. |
| K-Nearest Neighbor (KNN) | `n_neighbors`: Number of neighbors to use. |
|  | `weights`: Function to weight the neighbors’ votes. |
| Decision tree | `min_weight_fraction_leaf`: The minimum weighted fraction of samples required for a node to be a leaf. Controls the depth and complexity of the decision tree. |
|  | `max_features`: Number of features to consider when computing the best node split. |
|  | `criterion`: Function used to measure the quality of a split. |
| Random forest & Extra random forest (a.k.a. Extra Trees Classifier) | `n_estimators`: Number of decision trees in the ensemble. |
|  | `min_weight_fraction_leaf`: The minimum weighted fraction of samples required for a node to be a leaf. Controls the depth and complexity of the decision trees. |
|  | `max_features`: Number of features to consider when computing the best node split. |
|  | `criterion`: Function used to measure the quality of a split. |
| AdaBoost | `n_estimators`: Number of decision trees in the ensemble. |
|  | `learning_rate`: Shrinks the contribution of each successive decision tree in the ensemble. |
| Gradient tree boosting | `n_estimators`: Number of decision trees in the ensemble. |
|  | `learning_rate`: Shrinks the contribution of each successive decision tree in the ensemble. |
|  | `loss`: Loss function to be optimized via gradient boosting. |
|  | `max_depth`: Maximum depth of the decision trees. Controls the complexity of the decision trees. |
|  | `max_features`: Number of features to consider when computing the best node split. |
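For concreteness, the sketch below shows how a parameter grid like the ones in Table 1 can be searched with scikit-learn's `GridSearchCV`, using the SVC row as an example. The specific value ranges, the choice of the ‘mushroom’ dataset, and the use of the `pmlb` package's `fetch_data` helper are illustrative assumptions for this sketch, not the exact grids or protocol used in the PMLB experiments.

```python
# Illustrative sketch: tuning the SVC parameters from Table 1 with grid search.
# The value ranges below are assumptions for demonstration, not the PMLB grids.
from pmlb import fetch_data
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# 'mushroom' is an arbitrary PMLB dataset chosen for this example.
X, y = fetch_data("mushroom", return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Grid over the SVC parameters listed in Table 1.
param_grid = {
    "kernel": ["linear", "poly", "sigmoid", "rbf"],
    "C": [0.01, 0.1, 1.0, 10.0],
    "gamma": [0.01, 0.1, 1.0],  # used by the 'rbf', 'poly', and 'sigmoid' kernels
    "degree": [2, 3],           # used only by the 'poly' kernel
    "coef0": [0.0, 0.5],        # used by the 'poly' and 'sigmoid' kernels
}

search = GridSearchCV(SVC(), param_grid, cv=5, n_jobs=-1)
search.fit(X_train, y_train)

print("best parameters:", search.best_params_)
print("held-out accuracy: %.3f" % search.score(X_test, y_test))
```

The same pattern applies to the other rows of the table: swap in the corresponding scikit-learn estimator and build the grid from the parameter names in its "Tuned parameters" cell.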