Table 1 Machine learning algorithms and parameters tuned in the PMLB benchmark

From: PMLB: a large benchmark suite for machine learning evaluation and comparison

Machine learning algorithm, followed by its tuned parameters:

Gaussian Naïve Bayes (NB)
  No parameters.

Bernoulli Naïve Bayes
  alpha: Additive smoothing parameter.
  binarize: Threshold for binarizing the features.
  fit_prior: Whether or not to learn class prior probabilities.

Multinomial Naïve Bayes
  alpha: Additive smoothing parameter.
  fit_prior: Whether or not to learn class prior probabilities.

Logistic regression
  C: Regularization strength.
  penalty: Whether to use Lasso or Ridge regularization.
  fit_intercept: Whether or not the intercept of the linear classifier should be computed.

Linear classifier trained via stochastic gradient descent (SGD)
  loss: Loss function to be optimized.
  penalty: Whether to use Lasso, Ridge, or ElasticNet regularization.
  alpha: Regularization strength.
  learning_rate: Shrinks the contribution of each successive training update.
  fit_intercept: Whether or not the intercept of the linear classifier should be computed.
  l1_ratio: Ratio of Lasso vs. Ridge regularization to use. Only used when the ‘penalty’ is ElasticNet.
  eta0: Initial learning rate.
  power_t: Exponent for inverse scaling of the learning rate.

Linear classifier trained via the passive aggressive algorithm
  loss: Loss function to be optimized.
  C: Maximum step size for regularization.
  fit_intercept: Whether or not the intercept of the linear classifier should be computed.

Support vector machine for classification (SVC)
  kernel: ‘linear’, ‘poly’, ‘sigmoid’, or ‘rbf’.
  C: Penalty parameter for regularization.
  gamma: Kernel coefficient for the ‘rbf’, ‘poly’, and ‘sigmoid’ kernels.
  degree: Degree for the ‘poly’ kernel.
  coef0: Independent term in the ‘poly’ and ‘sigmoid’ kernels.

K-Nearest Neighbor (KNN)
  n_neighbors: Number of neighbors to use.
  weights: Function to weight the neighbors’ votes.

Decision tree
  min_weight_fraction_leaf: The minimum number of (weighted) samples for a node to be considered a leaf. Controls the depth and complexity of the decision tree.
  max_features: Number of features to consider when computing the best node split.
  criterion: Function used to measure the quality of a split.

Random forest & Extra random forest (a.k.a. Extra Trees Classifier)
  n_estimators: Number of decision trees in the ensemble.
  min_weight_fraction_leaf: The minimum number of (weighted) samples for a node to be considered a leaf. Controls the depth and complexity of the decision trees.
  max_features: Number of features to consider when computing the best node split.
  criterion: Function used to measure the quality of a split.

AdaBoost
  n_estimators: Number of decision trees in the ensemble.
  learning_rate: Shrinks the contribution of each successive decision tree in the ensemble.

Gradient tree boosting
  n_estimators: Number of decision trees in the ensemble.
  learning_rate: Shrinks the contribution of each successive decision tree in the ensemble.
  loss: Loss function to be optimized via gradient boosting.
  max_depth: Maximum depth of the decision trees. Controls the complexity of the decision trees.
  max_features: Number of features to consider when computing the best node split.
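The algorithms and parameter names above correspond to their scikit-learn implementations. As a minimal sketch of how a grid over one of the listed classifiers could be wired up, assuming illustrative parameter values (the grid below is not the search space used in the PMLB experiments):

```python
# Minimal sketch of tuning one classifier from the table with scikit-learn.
# The grid values are illustrative assumptions, not PMLB's actual grids.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Toy data standing in for a PMLB benchmark dataset.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Grid over the random forest parameters named in the table.
param_grid = {
    "n_estimators": [10, 100, 500],
    "min_weight_fraction_leaf": [0.0, 0.1, 0.25],
    "max_features": ["sqrt", "log2", None],
    "criterion": ["gini", "entropy"],
}

search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

In an actual benchmark run, each PMLB dataset would be loaded in place of the toy data above (e.g., via the pmlb package's fetch_data helper, where that package is available), and the same pattern repeated for each classifier in the table.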