
Table 2 Criteria used for determining the number of clusters

From: Comparison of cancer subtype identification methods combined with feature selection methods in omics data analysis

Subtype Identification Methods

Determination of the number of clusters k

CC

Choose k in an ad hoc way so that the resulting consensus matrix is the cleanest, i.e. a matrix whose entries are all either 1 or 0

NMF

Choose the k at which the magnitude of the cophenetic correlation coefficient, which indicates the dispersion of the consensus matrix, begins to fall significantly

SNF

Chooses k using spectral clustering, which aims to minimize the RatioCut objective

PINS

Chooses k that minimizes the absolute difference between the original connectivity matrix and the perturbed connectivity matrices

ICB

Choose k in an ad hoc way by selecting the k at which the Bayesian Information Criterion (BIC) reaches its minimum or the deviance ratio reaches a plateau; either condition indicates that the model fits the data best when the samples are divided into k + 1 subtypes. We used both BIC and the deviance ratio to select k

NEMO

Chooses k using the eigengap method, selecting the k that maximizes the product of k and the difference between the (k + 1)-th and k-th eigenvalues of the average relative similarity matrix
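
The eigengap rule in the NEMO row can be illustrated with a minimal sketch. It is not NEMO's own implementation: the function name `eigengap_k`, the `k_max` parameter, and the ascending ordering of the eigenvalues are assumptions made here for illustration, and the input is simply a list of eigenvalues that has already been computed.

```python
import numpy as np


def eigengap_k(eigenvalues, k_max=None):
    """Return the k that maximizes k * (lambda_{k+1} - lambda_k).

    Assumption: eigenvalues are ranked in ascending order, so lambda_k
    is the k-th smallest value. `k_max` (hypothetical) caps the search.
    """
    lam = np.sort(np.asarray(eigenvalues, dtype=float))  # ascending order
    if lam.size < 2:
        raise ValueError("need at least two eigenvalues")
    if k_max is None:
        k_max = lam.size - 1
    k_max = min(k_max, lam.size - 1)

    ks = np.arange(1, k_max + 1)
    # With 0-based indexing, lam[k] - lam[k - 1] is lambda_{k+1} - lambda_k.
    gaps = lam[1:k_max + 1] - lam[:k_max]
    scores = ks * gaps
    return int(ks[np.argmax(scores)])


# Three small eigenvalues followed by a large jump suggest three clusters.
print(eigengap_k([0.0, 0.01, 0.02, 0.9, 0.95, 1.0]))  # → 3
```

Multiplying the gap by k favors larger k when two gaps are of similar size, which is the weighting described in the table entry.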