
Table 1 Summary of datasets

From: Scalable non-negative matrix tri-factorization

| Dataset | Database | Rows | Columns | Shape | Data type | Density | Nonzero |
|---|---|---|---|---|---|---|---|
| Fetus | GIANT [29] | 25,569 | 25,608 | rectangular | sparse | 4.7% | 31M |
| TCGA-BRCA | GDC [25] | 1,222 | 60,483 | wide | dense | 100.0% | 74M |
| E-TABM-185 | ArrayExpress [28] | 5,896 | 22,283 | tall | dense | 100.0% | 131M |
| Retina | GIANT [29] | 25,823 | 25,822 | rectangular | dense | 22.0% | 147M |
| Cochlea | GIANT [29] | 25,824 | 25,824 | rectangular | dense | 42.0% | 280M |
| TCGA-Methyl | GDC [25] | 10,181 | 485,577 | wide | dense | 81.4% | 3841M |

  1. We manually categorized each data matrix into one of three shapes: tall datasets have substantially more rows than columns, wide datasets have substantially more columns than rows, and rectangular datasets have a comparable number of rows and columns. Density denotes the fraction of nonzero matrix elements. The last column gives the number of nonzero elements in each matrix.
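
The density definition in the note can be illustrated with a minimal Python sketch. The helper names `density` and `expected_nonzero` and the toy matrix are hypothetical and are not taken from the paper; the only figures drawn from the table are the Fetus row's dimensions and density.

```python
def density(matrix):
    """Fraction of nonzero entries in a list-of-lists matrix (hypothetical helper)."""
    total = sum(len(row) for row in matrix)
    nonzero = sum(1 for row in matrix for value in row if value != 0)
    return nonzero / total


def expected_nonzero(rows, cols, dens):
    """Nonzero count implied by a matrix's dimensions and its density."""
    return rows * cols * dens


# Toy 2 x 4 matrix (made up for illustration): 3 of its 8 entries are nonzero.
toy = [
    [0, 2, 0, 0],
    [1, 0, 0, 3],
]
toy_density = density(toy)

# Fetus row of the table: 25,569 rows, 25,608 columns, density 4.7%.
fetus_nonzero = expected_nonzero(25_569, 25_608, 0.047)
print(f"toy density: {toy_density:.3f}")
print(f"Fetus nonzero elements: about {fetus_nonzero / 1e6:.0f}M")
```

For matrices of the sizes listed in the table, a sparse-matrix library would be a more practical way to count nonzero entries than nested Python lists; the sketch only makes the definition concrete.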