Comparison of cancer subtype identification methods combined with feature selection methods in omics data analysis

Table 2 Criteria used for determining the number of clusters

Subtype Identification Methods	Determination of the number of clusters k
CC	Choose k in an ad hoc way so that the resulting consensus matrix is the cleanest, i.e. a matrix whose entries are all either 1 or 0
NMF	Choose k where the magnitude of the cophenetic correlation coefficient, which indicates the dispersion of the consensus matrix, begins to fall significantly
SNF	Chooses k using spectral clustering, which aims to minimize the ratio cut
PINS	Chooses k that minimizes the absolute difference between the original connectivity matrix and the perturbed connectivity matrices
ICB	Choose k in an ad hoc way by selecting the k at which the Bayesian Information Criterion (BIC) reaches its minimum or at which the deviance ratio reaches a plateau; both indicate that the model fits the data best when the samples are divided into k + 1 subtypes. We used both BIC and deviance ratio to select k
NEMO	Chooses k using the eigengap method by selecting the k that maximizes the product of k and the difference between the (k + 1)-th and k-th eigenvalues of the average relative similarity matrix
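
The eigengap criterion listed for NEMO can be illustrated with a minimal sketch. It is not the NEMO implementation: it assumes the eigenvalues are those of the symmetric normalized graph Laplacian built from a sample-by-sample similarity matrix, sorted in ascending order, which is a common convention in spectral clustering that the table itself does not specify. The function name `modified_eigengap_k` and the parameter `max_k` are hypothetical.

```python
import numpy as np


def modified_eigengap_k(similarity, max_k=10):
    """Return the k that maximizes k * (lambda_{k+1} - lambda_k).

    Assumption: lambda_1 <= lambda_2 <= ... are eigenvalues of the
    symmetric normalized Laplacian of `similarity`, and every sample
    has a strictly positive total similarity (degree).
    """
    s = np.asarray(similarity, dtype=float)
    s = (s + s.T) / 2.0  # enforce symmetry
    degree = s.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # L = I - D^{-1/2} S D^{-1/2}
    laplacian = np.eye(len(s)) - d_inv_sqrt[:, None] * s * d_inv_sqrt[None, :]
    eigvals = np.sort(np.linalg.eigvalsh(laplacian))

    max_k = min(max_k, len(eigvals) - 1)
    ks = np.arange(1, max_k + 1)
    # gaps[k - 1] holds lambda_{k+1} - lambda_k
    gaps = eigvals[1:max_k + 1] - eigvals[:max_k]
    scores = ks * gaps
    return int(ks[np.argmax(scores)])


# Toy similarity matrix: three groups of five samples each, with high
# similarity within a group and low similarity between groups.
labels = np.repeat([0, 1, 2], 5)
toy_similarity = np.where(labels[:, None] == labels[None, :], 1.0, 0.01)
print(modified_eigengap_k(toy_similarity))  # 3 for this toy matrix
```

Multiplying the gap by k favors larger k when two gaps are of similar size, which is the effect of the product described in the table.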

ISSN: 1756-0381