Table 1 Model summaries of linear regressions for predicting yield outputs

From: SEQdata-BEACON: a comprehensive database of sequencing performance and statistical tools for performance evaluation and yield simulation in BGISEQ-500

| Independent variables | Eq. (3) Std. error | Eq. (3) t-Value | Eq. (3) P-value | Eq. (4) Std. error | Eq. (4) t-Value | Eq. (4) P-value | Eq. (5) Std. error | Eq. (5) t-Value | Eq. (5) P-value |
|---|---|---|---|---|---|---|---|---|---|
| Constants | 11.424 | −11.353 | < 2e-16** | 10.662 | −12.173 | < 2e-16** | 39.411 | −3.578 | 0.00035** |
| TotalEsr*Dnbnumber | 0.010 | 95.192 | < 2e-16** | 0.010 | 96.130 | < 2e-16** | 0.033 | 38.837 | < 2e-16** |
| BIC | 0.192 | 7.278 | 4.73e-13** | 0.178 | 7.873 | 5.46e-15** | 0.724 | 6.556 | 6.91e-11** |
| accGRR | 6.175 | 0.023 | 0.981 | | | | 59.009 | −10.243 | < 2e-16** |
| SNR | 0.388 | −8.586 | < 2e-16** | 0.388 | −8.596 | < 2e-16** | 0.653 | −5.307 | 1.23e-07** |
| FIT | 4.757 | 5.675 | 1.57e-08** | 4.577 | 5.905 | 4.08e-09** | 11.249 | 2.567 | 0.010* |
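Table 1 reports standard errors and t-values but not the coefficient estimates themselves. In an ordinary linear regression summary the t-value is the estimate divided by its standard error, so an approximate estimate can be backed out as t-value × standard error. The sketch below does this for the Eq. (3) column. The names `EQ3_ROWS` and `implied_estimate` are illustrative and do not come from the paper, and because the tabulated values are rounded to three decimals (notably the 0.010 standard error of TotalEsr*Dnbnumber), the results are only rough.

```python
# Std. error and t-value pairs copied from the Eq. (3) column of Table 1.
EQ3_ROWS = {
    "Constants": (11.424, -11.353),
    "TotalEsr*Dnbnumber": (0.010, 95.192),
    "BIC": (0.192, 7.278),
    "accGRR": (6.175, 0.023),
    "SNR": (0.388, -8.586),
    "FIT": (4.757, 5.675),
}


def implied_estimate(std_error, t_value):
    """Back out a coefficient from the identity t = estimate / std_error."""
    return t_value * std_error


# Approximate coefficients implied by the rounded table entries.
implied = {name: implied_estimate(se, t) for name, (se, t) in EQ3_ROWS.items()}

for name, estimate in implied.items():
    print(f"{name}: {estimate:.3f}")
```

The coefficients stated in Eqs. (3) to (5) of the article remain the authoritative values. This reconstruction is only a consistency check against them.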

  1. TotalEsr: ESR (Effective Spot Rate) is the percentage of filtered reads among the DNBs recognized by basecalling. ESR = total reads/theoretical maximum number of reads in one sequencing lane. TotalEsr is the ESR value calculated over the first 15 cycles of read 1 and read 2 and held constant for the rest of each read
  2. Dnbnumber: The theoretical maximum number of DNBs on the patterned array
  3. BIC: Basecall information content, the percentage of DNBs that can be used for basecalling among the DNBs recognized by the optical system. BIC = (number of DNBs that can be used for basecalling/number of DNBs that can be recognized by the optical system) × 100%
  4. accGRR: Accumulated Good Reads Rate, the percentage of filtered reads among the DNBs recognized by basecalling, with chastity greater than 0.6 as the filtering criterion. accGRR = total reads/theoretical maximum number of reads in one sequencing lane. This value is only a statistical indicator that reflects the overall quality of the read (multi-cycle state)
  5. SNR: Signal to Noise Ratio. Taking the SNR calculation for a single DNB as an example, base A (maximum light intensity) is used as the signal, CGT is the background, and the variance of the CGT light intensity is the noise. A_SNR = A_mean/CGT_dev
  6. FIT: The FIT value represents the distribution of differences between signal and noise for each base. The FIT value is higher when the distribution of differences between signal and noise for each channel/color is more concentrated
  7. ** Significant at the 1% probability level
  8. * Significant at the 5% probability level
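The ratio definitions in the footnotes can be written out as small functions. This is a minimal sketch, not the instrument's actual implementation: the function names and the input numbers are made up for illustration, and `CGT_dev` is assumed here to be the population standard deviation of the background (C, G, T) intensities, which the footnote does not specify.

```python
from statistics import mean, pstdev


def esr(total_reads, max_reads_per_lane):
    """ESR = total reads / theoretical maximum number of reads in one lane."""
    return total_reads / max_reads_per_lane


def bic_percent(n_basecallable, n_recognized):
    """BIC = (DNBs usable for basecalling / DNBs recognized optically) x 100%."""
    return 100.0 * n_basecallable / n_recognized


def a_snr(a_intensities, cgt_intensities):
    """A_SNR = A_mean / CGT_dev.

    Assumption: CGT_dev is taken as the population standard deviation
    of the background intensities.
    """
    return mean(a_intensities) / pstdev(cgt_intensities)


# Illustrative inputs only; these are not measurements from the database.
example_esr = esr(800, 1000)
example_bic = bic_percent(90, 100)
example_snr = a_snr([10.0, 12.0], [1.0, 3.0])
print(example_esr, example_bic, example_snr)
```

If the sample standard deviation is the intended definition of `CGT_dev`, `pstdev` can be swapped for `statistics.stdev` without changing anything else.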