From: Visual programming for next-generation sequencing data analytics
Library Name | Release Date | Programming Language | License | Website | Features |
---|---|---|---|---|---|
EMBOSS [43] | 2000 | C, C++, BTL, others | GNU GPL | | Sequence alignment; rapid database search; protein motif identification; nucleotide sequence pattern analysis; codon usage analysis for small genomes; rapid identification of sequence patterns in large-scale sequence sets; presentation tools for publication. |
BTL [41] | 2001 | C++ | GNU GPL | | Data structures (e.g. graphs); nucleotide string methods (e.g. Fourier transform, Needleman-Wunsch alignment). |
Bioperl [47] | 2002 | Perl | Artistic License, GNU GPL | | Access sequence data from local/remote databases; manage database formats; database search; manipulate sequences/sequence alignments; gene annotations. |
Bioconductor [50] | 2003 | R (C/C++) | Artistic, BSD, GNU GPL | | Repository of multiple libraries for analysis and comprehension of genomic and –omics data, including NGS. |
BioPHP | 2003 | PHP | GNU GPL | | DNA and protein sequence analysis; sequence alignment. |
GenomeTools [58] | 2003 | C | Open BSD | | Parsing, compression, k-mers, suffix trees, annotation, error correction and other sequence analytics (FASTA, FASTQ). |
Pizza&Chili [94] | 2005 | C/C++ | GNU Lesser GPL | | Compressed indices; text collections. |
Bio++ [42] | 2006 | C++ | CeCILL GPL | | Sequence analysis; phylogenetics; molecular evolution; population genetics. |
Biojava [46] | 2008 | Java | GNU Lesser GPL | | Manipulate biological sequences; file parsing; DAS client/server support; access to BioSQL/Ensembl databases; tools for making sequence analysis GUIs; statistical routines; dynamic programming toolkit. |
SeqAn [52] | 2008 | C++ | BSD 3-clause | | Extensive set of algorithms and data structures for the analysis of nucleotide sequences, with emphasis on NGS data; includes indexing, compression, database search, and support for NGS-specific file formats (FASTQ, SAM/BAM, VCF, BED). |
Biopython [45] | 2009 | Python, C | Biopython | | Sequence input/output; alignment input/output; population genetics; structural bioinformatics; SQL interface. |
htslib, SAMtools, BCFtools [37] | 2009 | C | MIT/Expat, Modified BSD | | Read, write, edit, index, view SAM/BAM/CRAM formats; read, write BCF2/VCF/gVCF files; call, filter, summarize SNPs/short indels. |
BioRuby [44] | 2010 | Ruby | GNU GPL | | DNA and protein sequence analysis; sequence alignment; biological database parsing; ontology; structural biology. |
BAMTools [36] | 2011 | C++ | MIT | | Read, write, manipulate BAM formats. |
libStatGen [40] | 2011 | C++ | GNU GPL | | Handle SAM/BAM, FASTQ, GLF, VCF, ASP. |
NGS++ [38] | 2013 | C++ | GNU Lesser GPL | | Read, write, manipulate multiple genomic file formats and data associated with BED-type files (epigenomics). |
Bioclojure [39] | 2014 | Clojure | GNU Lesser GPL | | Parsing of Genbank, Uniprot XML, FASTA, FASTQ formats; wrappers for BLAST, signalP, TMHMM; index files for random access; lazy processing of sequences from very large files. |