
Table 2 Summary of the results. The finished column shows how many times the distance calculation succeeded across the different choices of read length and coverage. The following columns contain the average assembly time, the distance matrix calculation time, Pearson’s correlation coefficient of the distance matrices, the Fowlkes-Mallows index for \(k=4\) and \(k=8\), and the triplets distance. Note that the triplets distance is calculated only on a sample of read length and coverage values on the Influenza and Various datasets. The averaged results cover only the runs in which the method finished. The rank columns show the average rank of each method in distance calculation time and in correlation, including the runs in which the method did not finish. The ‘reference’ method calculates distances between the original sequences. We show only the assembly algorithms that gave the highest and the lowest correlation. Among the d-type measures, the one with the highest correlation is shown. For an explanation of the rank columns, see the Evaluation criteria section.

From: Reference-free phylogeny from sequencing data

| Data | method | finished | assem. (ms) | distances (ms) | rank (distances) | corr. | rank (corr.) | \(B_{4}\) | \(B_{8}\) | trip.d. |
|---|---|---|---|---|---|---|---|---|---|---|
| Influenza | reference | 112/112 | 0 | 2,602 | 29.2 | 1 | 1 | 1 | 1 | 0 |
| | \(\max(\lvert R_{A}\rvert,\lvert R_{B}\rvert)\) | 112/112 | 0 | 335 | 13.3 | .801 | 46.5 | .66 | .32 | 57 |
| | \(\mathsf{dist}_{\mathsf{MESSG}}(R_{A},\ R_{B})\) | 107/112 | 0 | 899,270 | 60.1 | .983 | 9.7 | 1 | 1 | 5 |
| | \(\mathsf{dist}_{\mathsf{MESSGq}}\) | 112/112 | 0 | 50,808 | 42.5 | .966 | 27.9 | 1 | .97 | 28 |
| | \(\mathsf{dist}_{\mathsf{C}}\ \mathsf{SPAdes}\) | 43/112 | 13,529 | 22,661 | 56.8 | .973 | 49.4 | .99 | .93 | 8 |
| | \(\mathsf{dist}_{\mathsf{C}}\ \mathsf{SSAKE}\) | 68/112 | 2,079 | 17,735 | 48.5 | .944 | 44.5 | .97 | .84 | 22 |
| | \(\mathsf{dist}\ \mathsf{SPAdes}\) | 112/112 | 12,380 | 625,883 | 56.7 | .983 | 8.9 | 1 | 1 | 0 |
| | \(\mathsf{dist}\ \mathsf{Velvet}\) | 111/112 | 378 | 749,033 | 57.9 | .971 | 29.1 | 1 | .99 | 23 |
| | \(\mathsf{dist}_{q}\ \mathsf{SPAdes}\) | 112/112 | 14,345 | 28,690 | 37.6 | .971 | 23.1 | 1 | .94 | 28 |
| | \(\mathsf{dist}_{q}\ \mathsf{Velvet}\) | 112/112 | 446 | 22,478 | 37.7 | .956 | 35.3 | 1 | .97 | 38 |
| | Mash | 112/112 | 0 | 101 | 9 | .679 | 46.8 | .44 | .61 | 152 |
| | \(d_{2}^{*}\) | 112/112 | 0 | 389 | 18.3 | .837 | 44.7 | .4 | .9 | 118 |
| | longest contig SPAdes | 43/112 | 13,529 | 1,465 | 48.2 | .751 | 51.5 | .71 | .56 | 106 |
| | longest contig Velvet | 110/112 | 385 | 38 | 7.5 | .569 | 53.8 | .46 | .23 | 133 |
| Various | reference | 112/112 | 0 | 57,099 | 16.9 | 1 | 1 | 1 | 1 | 0 |
| | \(\max(\lvert R_{A}\rvert,\lvert R_{B}\rvert)\) | 112/112 | 0 | 847 | 4.1 | .907 | 14.1 | .85 | .92 | 48 |
| | \(\mathsf{dist}_{\mathsf{MESSG}}\) | 64/112 | 0 | 1,299,980 | 24.8 | .933 | 13 | .93 | .93 | 19 |
| | \(\mathsf{dist}_{\mathsf{MESSGq}}\) | 109/112 | 0 | 605,647 | 20 | .927 | 8.7 | .84 | .97 | 42 |
| | \(\mathsf{dist}_{\mathsf{C}}\ \mathsf{SSAKE}\) | 108/112 | 1,235 | 749,197 | 20.7 | .928 | 5.4 | .84 | .92 | 25 |
| | \(\mathsf{dist}_{\mathsf{C}}\ \mathsf{Velvet}\) | 34/112 | 17,783 | 1,239,632 | 25.5 | .917 | 19.8 | .88 | .94 | 16 |
| | \(\mathsf{dist}\ \mathsf{Edena}\) | 69/112 | 168 | 1,681,308 | 24.6 | .932 | 12.3 | .92 | .93 | 18 |
| | \(\mathsf{dist}\ \mathsf{SSAKE}\) | 64/112 | 568 | 1,635,059 | 26.1 | .919 | 12.9 | .83 | .91 | 27 |
| | \(\mathsf{dist}_{q}\ \mathsf{ABySS}\) | 110/112 | 10,937 | 252,197 | 16.5 | .919 | 11.7 | .85 | .93 | 39 |
| | \(\mathsf{dist}_{q}\ \mathsf{SSAKE}\) | 111/112 | 2,231 | 428,540 | 17.9 | .934 | 6.4 | .84 | .95 | 62 |
| | Mash | 84/112 | 0 | 562 | 8.3 | .664 | 17.8 | .46 | .34 | 344 |
| | \(d^{q*}_{2}\) | 109/112 | 0 | 721 | 8 | .573 | 17.4 | .32 | .28 | 399 |
| | longest contig SSAKE | 108/112 | 1,235 | 385 | 3.5 | .386 | 20.9 | .48 | .43 | 349 |
| | longest contig Velvet | 34/112 | 17,783 | 34,858 | 22.5 | .681 | 21.4 | .62 | .5 | 329 |
| Hepatitis | reference | 9/9 | 0 | 1,748,984 | 16.9 | 1 | 1 | 1 | 1 | 0 |
| | \(\max(\lvert R_{A}\rvert,\lvert R_{B}\rvert)\) | 9/9 | 0 | 29,340 | 5.8 | .181 | 19.3 | .72 | .83 | 24,017 |
| | \(\mathsf{dist}_{\mathsf{MESSG}}\) | 9/9 | 0 | 42,332,682 | 21.1 | .965 | 8.3 | 1 | .9 | 4,407 |
| | \(\mathsf{dist}_{\mathsf{MESSGq}\alpha }\) | 9/9 | 0 | 1,118,585 | 15.4 | .897 | 14.2 | 1 | .94 | 4,543 |
| | \(\mathsf{dist}_{\mathsf{C}}\ \mathsf{SPAdes}\) | 2/9 | 76,514 | 31,517,537 | 23.1 | .869 | 20.6 | 1 | .89 | 7,361 |
| | \(\mathsf{dist}_{\mathsf{C}}\ \mathsf{Velvet}\) | 4/9 | 11,090 | 59,898,794 | 23.9 | .98 | 14.6 | 1 | .99 | 2,419 |
| | \(\mathsf{dist}\ \mathsf{Edena}\) | 0/9 | NaN | NaN | 24.4 | NaN | 24.4 | NaN | NaN | NaN |
| | \(\mathsf{dist}_{q\alpha }\ \mathsf{ABySS}\) | 9/9 | 48,194 | 520,227 | 12.7 | .957 | 10.6 | 1 | .93 | 13,051 |
| | \(\mathsf{dist}_{q\alpha }\ \mathsf{SSAKE}\) | 9/9 | 88,516 | 615,615 | 14.2 | .901 | 12.9 | .96 | .94 | 13,710 |
| | Mash | 9/9 | 0 | 2,350 | 1.4 | .967 | 8.1 | 1 | .92 | 9,532 |
| | \(d^{q}_{2}\) | 9/9 | 0 | 27,885 | 6.7 | .973 | 5.1 | 1 | .87 | 5,347 |
| | longest contig Edena | 9/9 | 7,038 | 1,581,613 | 15.6 | .515 | 17.8 | .92 | .76 | 23,452 |
| | longest contig Velvet | 4/9 | 11,090 | 515 | 13.1 | .296 | 21.3 | .92 | .47 | 51,443 |
| Chroms | reference | 1/1 | 0 | 668,767 | 20 | 1 | 1 | 1 | 1 | 0 |
| | \(\max(\lvert R_{A}\rvert,\lvert R_{B}\rvert)\) | 1/1 | 0 | 2,184 | 13 | .331 | 18 | .61 | .3 | 880 |
| | \(\mathsf{dist}_{\mathsf{MESSG}}\) | 1/1 | 0 | 23,758,416 | 24 | .848 | 14 | .58 | .26 | 923 |
| | \(\mathsf{dist}_{\mathsf{MESSGq}\alpha }\) | 1/1 | 0 | 202,517 | 19 | .825 | 15 | .9 | .25 | 939 |
| | \(\mathsf{dist}\ \mathsf{ABySS}\) | 1/1 | 17,838 | 24,085,638 | 25 | .911 | 6 | .64 | .34 | 707 |
| | \(\mathsf{dist}\ \mathsf{SPAdes}\) | 1/1 | 22,898 | 23,757,934 | 23 | .873 | 13 | .68 | .21 | 968 |
| | \(\mathsf{dist}_{q\alpha }\ \mathsf{SPAdes}\) | 1/1 | 22,898 | 127,061 | 16 | .881 | 11 | .81 | .33 | 991 |
| | \(\mathsf{dist}_{q\alpha }\ \mathsf{SSAKE}\) | 1/1 | 51,604 | 126,565 | 15 | .914 | 4 | .81 | .21 | 987 |
| | Mash | 1/1 | 0 | 173 | 3 | .33 | 19 | .6 | .38 | 787 |
| | \(d^{q*}_{2}\) | 1/1 | 0 | 697 | 6 | .959 | 2 | .81 | .32 | 1,083 |
| | longest contig Velvet | 1/1 | 7,866 | 31 | 1 | .574 | 16 | .81 | .4 | 1,007 |

  1. The boldface numbers mark the three best results on each dataset.
  2. Please note that some of the marked results might be in the Supplementary materials.
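
As a rough illustration of the corr. column, the following Python sketch computes Pearson's correlation coefficient between two distance matrices. It assumes the matrices are symmetric and that only the entries above the diagonal are compared; the source does not state which entries enter the calculation, so this choice is an assumption. The function names (`upper_triangle`, `pearson`, `distance_matrix_correlation`) and the toy matrices are hypothetical and are not taken from the article.

```python
from itertools import combinations
from math import sqrt


def upper_triangle(matrix):
    """Return the entries above the diagonal, row by row.

    Assumption: the distance matrix is square and symmetric, so the
    upper triangle carries all pairwise distances exactly once.
    """
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def distance_matrix_correlation(reference, estimate):
    """Correlate the pairwise distances of two matrices (hypothetical helper)."""
    return pearson(upper_triangle(reference), upper_triangle(estimate))


# Toy data: 'scaled' doubles every reference distance, 'shuffled' reorders them.
reference = [[0, 1, 4], [1, 0, 2], [4, 2, 0]]
scaled = [[0, 2, 8], [2, 0, 4], [8, 4, 0]]
shuffled = [[0, 4, 1], [4, 0, 2], [1, 2, 0]]

r_scaled = distance_matrix_correlation(reference, scaled)
r_shuffled = distance_matrix_correlation(reference, shuffled)
```

Because Pearson's coefficient is invariant to positive scaling, a method whose distances are a constant multiple of the reference distances still reaches a correlation of 1, which is why the table can compare methods whose distances live on very different scales.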