Table 1 Software for predicting structural variations

From: Unraveling genomic variation from next generation sequencing data

| Tool | Single-end | Paired-end | Reference genome | Insertion | Deletion | Inversion | Translocation across chromosomes | Translocation within chromosome | Properties | Input file |
|---|---|---|---|---|---|---|---|---|---|---|
| BreakDancer [61] | | X | | X | X | X | X | X | BreakDancerMax for large regions; BreakDancerMini for indels of 10-100 bp | BAM, SAM |
| CNV-seq [62] | X | | X | X | X | | | | Shotgun sequencing; robust statistical model | Map locations from a BAM file (via SAMtools) |
| GASV [63] | | X | | X | X | X | X | X | Geometric approach; an SV is represented as a polygon on a surface; comparison of SVs across multiple samples | BAM |
| HyDRa [64] | | X | | X | X | X | X | X | SV breakpoints found by clustering discordant paired-end alignments | Tab-delimited discordant paired-end mappings |
| MoDIL [65] | | X | | X | X | | | | Medium-sized (10-50 bp) indels from paired-end reads; able to identify shorter heterozygous as well as homozygous variants with higher accuracy | Software specific |
| MrFast [66] | X | | | X | X | | | | Short sequence reads (>25 bp) | FASTA, FASTQ |
| NovelSeq [67] | | X | | X | | | | | Long novel sequence insertions; multiple types of variations | Software specific |
| PEMer [68] | | X | | X | X | X | X | X | PEMer: variations; SV-Simulation: simulated paired-end reads; BreakDB: annotations | SVdB API |
| Pindel [69] | | X | X | X | X | | | | Large deletions (1 bp–10 kb); medium-sized insertions (1–20 bp) from 36 bp paired-end short reads | BAM, SAM, FASTA, FASTQ |
| rSW-seq [70] | X | | X | X | X | | | | Based on an iterative Smith-Waterman dynamic sequence alignment method | Tab-delimited file denoting the tumor/normal status of each aligned read position |
| VariationHunter [71] | | X | | X | X | X | | | Evaluates all possible mapping positions of each paired-end read and determines the final SV mappings interdependently | Software specific |
| VarScan [72, 73] | X | | X | X | X | | | | Germline variants (SNPs and indels) in individual samples or pools of samples; shared and private variants in multi-sample datasets (with mpileup); somatic mutations, LOH events, and germline variants in tumor-normal pairs; somatic copy number alterations (CNAs) in tumor-normal exome data | Pileup, VCF |
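When choosing a tool from Table 1, it can help to hold the capability columns as data and filter them. The sketch below is only an illustration: the read-type and variant entries are transcribed from the X marks in the table, while the names `TOOLS` and `tools_for` and the variant labels are hypothetical and not part of any listed tool.

```python
# Capability columns transcribed from Table 1 (an entry means the cell holds an "X").
# The labels and helper names here are illustrative assumptions.
INS, DEL, INV = "insertion", "deletion", "inversion"
TRA_ACROSS, TRA_WITHIN = "translocation_across", "translocation_within"
ALL_FIVE = frozenset({INS, DEL, INV, TRA_ACROSS, TRA_WITHIN})
INDELS = frozenset({INS, DEL})

# tool name -> (read type, variant classes marked in the table)
TOOLS = {
    "BreakDancer": ("paired", ALL_FIVE),
    "CNV-seq": ("single", INDELS),
    "GASV": ("paired", ALL_FIVE),
    "HyDRa": ("paired", ALL_FIVE),
    "MoDIL": ("paired", INDELS),
    "MrFast": ("single", INDELS),
    "NovelSeq": ("paired", frozenset({INS})),
    "PEMer": ("paired", ALL_FIVE),
    "Pindel": ("paired", INDELS),
    "rSW-seq": ("single", INDELS),
    "VariationHunter": ("paired", frozenset({INS, DEL, INV})),
    "VarScan": ("single", INDELS),
}


def tools_for(variant, reads=None):
    """Return tool names, in table order, whose row marks `variant`.

    If `reads` is "single" or "paired", keep only tools with that read type.
    """
    return [
        name
        for name, (read_type, variants) in TOOLS.items()
        if variant in variants and (reads is None or read_type == reads)
    ]


if __name__ == "__main__":
    print(tools_for(INV))                   # tools whose row marks inversions
    print(tools_for(DEL, reads="single"))   # single-end tools that mark deletions
```

The lookup reflects only what the table marks; it says nothing about accuracy, size ranges, or input formats, which are listed in the Properties and Input file columns.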

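Several tools in Table 1 work from discordant paired-end alignments, and the HyDRa entry describes locating SV breakpoints by clustering them. The sketch below shows the general idea with a simple greedy grouping. It is not HyDRa's algorithm or that of any listed tool: the `Pair` record, the `cluster_pairs` function, and the `max_gap` and `min_support` thresholds are all assumptions chosen for illustration.

```python
from typing import List, NamedTuple


class Pair(NamedTuple):
    """One discordant read pair: mapping position of each mate (hypothetical record)."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


def cluster_pairs(pairs: List[Pair], max_gap: int = 500,
                  min_support: int = 2) -> List[List[Pair]]:
    """Greedily group pairs whose two ends both lie near the previous pair's ends.

    Pairs are sorted by chromosome pair and first-mate position. A pair joins
    the most recent cluster if it is on the same chromosomes and both mates are
    within `max_gap` bp of the last pair added; otherwise it starts a new
    cluster. Clusters with fewer than `min_support` pairs are discarded.
    """
    clusters: List[List[Pair]] = []
    for p in sorted(pairs, key=lambda q: (q.chrom1, q.chrom2, q.pos1)):
        if clusters:
            last = clusters[-1][-1]
            same_chroms = (p.chrom1, p.chrom2) == (last.chrom1, last.chrom2)
            if (same_chroms
                    and p.pos1 - last.pos1 <= max_gap
                    and abs(p.pos2 - last.pos2) <= max_gap):
                clusters[-1].append(p)
                continue
        clusters.append([p])
    return [c for c in clusters if len(c) >= min_support]


if __name__ == "__main__":
    example = [
        Pair("chr1", 1000, "chr1", 5000),
        Pair("chr1", 1100, "chr1", 5100),
        Pair("chr1", 1200, "chr1", 5050),
        Pair("chr2", 300, "chr5", 900),   # unsupported singleton, filtered out
    ]
    for cluster in cluster_pairs(example):
        print(len(cluster), cluster[0].chrom1, cluster[0].pos1, cluster[-1].pos1)
```

A single-pass grouping like this splits a cluster whenever an unrelated pair falls between two supporting pairs in sort order, and it ignores read orientation and insert-size statistics, so it should be read as a conceptual outline only.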