Table 1 Summary of datasets

From: Scalable non-negative matrix tri-factorization

| Dataset | Database | Rows | Columns | Shape | Data type | Density | Nonzero elements |
|---|---|---|---|---|---|---|---|
| Fetus | GIANT [29] | 25,569 | 25,608 | rectangular | sparse | 4.7% | 31M |
| TCGA-BRCA | GDC [25] | 1,222 | 60,483 | wide | dense | 100.0% | 74M |
| E-TABM-185 | ArrayExpress [28] | 5,896 | 22,283 | tall | dense | 100.0% | 131M |
| Retina | GIANT [29] | 25,823 | 25,822 | rectangular | dense | 22.0% | 147M |
| Cochlea | GIANT [29] | 25,824 | 25,824 | rectangular | dense | 42.0% | 280M |
| TCGA-Methyl | GDC [25] | 10,181 | 485,577 | wide | dense | 81.4% | 3841M |
We manually categorized each data matrix into one of three shapes: tall datasets have substantially more rows than columns, wide datasets have substantially more columns than rows, and rectangular datasets have a comparable number of rows and columns. Density denotes the fraction of nonzero matrix elements. The number of nonzero elements in each matrix is given in the last column.
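
The quantities described in the note above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the authors' procedure: the function name `summarize` and the `ratio_threshold` parameter are hypothetical, and the shape labels in the table were assigned manually, so a fixed ratio cutoff like the one assumed here need not reproduce them.

```python
import numpy as np


def summarize(matrix, ratio_threshold=2.0):
    """Return the row/column counts, shape label, density, and nonzero count.

    ratio_threshold is an assumed cutoff for "substantially more"; the
    source assigns shape labels manually and gives no numeric rule.
    """
    rows, cols = matrix.shape
    nonzero = int(np.count_nonzero(matrix))
    # Density is the fraction of matrix elements that are nonzero.
    density = nonzero / (rows * cols)

    if rows >= ratio_threshold * cols:
        shape = "tall"
    elif cols >= ratio_threshold * rows:
        shape = "wide"
    else:
        shape = "rectangular"

    return {
        "rows": rows,
        "columns": cols,
        "shape": shape,
        "density": density,
        "nonzero": nonzero,
    }


# Toy 2 x 4 matrix with three nonzero entries.
example = np.array([[0.0, 1.5, 0.0, 2.0],
                    [3.0, 0.0, 0.0, 0.0]])
summary = summarize(example)
print(summary)
```

For matrices of the sizes listed in the table, a dense NumPy array may not fit in memory; the same counts could be taken from a sparse representation instead.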