This article has Open Peer Review reports available.
Visualizing genomic information across chromosomes with PhenoGram
© Wolfe et al.; licensee BioMed Central Ltd. 2013
Received: 8 August 2013
Accepted: 2 October 2013
Published: 16 October 2013
With the abundance of information and analysis results being collected for genetic loci, user-friendly and flexible data visualization approaches can inform and improve the analysis and dissemination of these data. A chromosomal ideogram is an idealized graphic representation of chromosomes. Ideograms can be combined with overlaid points, lines, and/or shapes, to provide summary information from studies of various kinds, such as genome-wide association studies or phenome-wide association studies, coupled with genomic location information. To facilitate visualizing varied data in multiple ways using ideograms, we have developed a flexible software tool called PhenoGram which exists as a web-based tool and also a command-line program.
With PhenoGram, researchers can create chromosomal ideograms annotated with colored lines at specific base-pair locations, or with colored base-pair to base-pair regions, with or without other annotation. PhenoGram allows for annotation of chromosomal locations and/or regions with shapes in different colors, gene identifiers, or other text. PhenoGram also allows for creation of plots showing expanded chromosomal locations, providing a way to show results for specific chromosomal regions in greater detail. We have now used PhenoGram to produce a variety of different plots, and provide these as examples herein. These plots include visualization of the genomic coverage of SNPs from a genotyping array, highlighting the chromosomal coverage of imputed SNPs, copy-number variation region coverage, as well as plots similar to the NHGRI GWA Catalog of genome-wide association results.
PhenoGram is a versatile, user-friendly software tool fostering the exploration and sharing of genomic information. Through visualization of data, researchers can both explore and share complex results, facilitating a greater understanding of these data.
Keywords: Data visualization, Bioinformatics, Genome-wide association study, GWAS, Copy-number variants, CNV, SNP, Ideogram
As the types and amount of genomic data being collected continue to increase, so does the need for tools to visualize, analyze, and share these data. One useful data visualization approach for genomic results is the use of chromosomal ideograms. An ideogram is a graphical representation of chromosomes, and these plots have been used with the addition of overlaid points, lines, and shapes to provide summary information of various kinds coupled with genomic location information [1, 2]. For example, the National Human Genome Research Institute (NHGRI) Genome-Wide Association Study (GWAS) Catalog has plotted the results of multiple genome-wide association studies using ideograms, highlighting genomic regions and a range of associated phenotypes for current published GWAS (http://www.genome.gov/gwastudies/).
Any –omic data that can be represented by chromosomal base pair locations or regions can also be plotted with ideograms. Genotyping array coverage information, single nucleotide polymorphism (SNP) imputation results, and the results of association studies with multiple phenotypes, such as phenome-wide association studies (PheWAS) [4, 5], are examples of other types of data that can benefit from the broad perspective offered by visualizing data with a chromosomal ideogram. The software PhenoGram has been developed to meet the need for an accessible tool that allows researchers both to better understand complex data and to easily disseminate the results.
PhenoGram was initially conceived as a method to highlight SNP-phenotype association results across the genome through the use of color-coded circles corresponding to various phenotypes, linked by lines to genomic locations, similar to the aforementioned NHGRI GWAS Catalog plots. We subsequently expanded the PhenoGram feature set, providing more options for other types of plots. Via the command line or on the web using a graphical interface, researchers can supply different types of information along with base-pair or region data that are plotted onto an ideogram according to the researcher’s preferences. Resulting PhenoGram plots can be downloaded as 1200 dots per inch (DPI) lossless PNG images that are publication ready.
For example, researchers can annotate chromosomal locations or biologically relevant regions to indicate traits associated with specific positions, and can choose different shapes to highlight ancestry or another study attribute related to specific data points. The use of PhenoGram is not limited to association results, as it can be used to plot and annotate chromosome regions across an ideogram without phenotype information. Because an ideogram shows the chromosomes side by side, a PhenoGram plot gives a genome-wide view of the data. Data that relate gene loci, phenotypes, or other attributes to genome location can be complex, and summarizing such data with visualization methods can be important for better understanding results.
PhenoGram plotting options and arguments (− arg name) for creating PhenoGram plots using pheno_gram.rb
Show the help message and exit
Show PhenoGram version
-i input filename: Filename of input configuration file
-o output filename: Filename of output plot
Main plot title (enclose in double quotes)
-f image type: Output image format (default is PNG); other options depend on ImageMagick installation
-p phenotype spacing: Determines standard, equal, or alternative algorithm
-c color range: Determines random, web, generator, group, or list algorithm
Sets plot resolution to 1200 DPI
Plot only chromosomes with position
Plot with smaller phenotype circles
Plot phenotype circles with black outline
-Z zoom location: Zoom on chromosome (e.g. 7) or region (e.g. 7:10000–20000)
Include annotation on plot
Show more transparent lines on chromosomes
Show thinner lines across chromosomes
Increase thickness of chromosome boundary
Increase font size of phenotype labels
Add shading to inaccessible or cytogenetic chromosome regions
-r random seed: Seed for random number generator (default is 7)
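As an illustration of how the listed options combine, the following minimal Python sketch assembles an argument list for pheno_gram.rb and checks the zoom location before passing it on. It uses only flags that appear in the table above (-i, -o, -f, -Z, -r). Launching the script through `ruby`, the file names, the helper names `parse_zoom` and `build_command`, and the use of an ASCII hyphen in the zoom region are all assumptions made for the example, not documented PhenoGram behavior.

```python
import re
import shlex

# A zoom location is either a chromosome ("7") or a region ("7:10000-20000").
# The exact syntax PhenoGram accepts is assumed from the table above.
ZOOM_PATTERN = re.compile(r"^(\w+)(?::(\d+)-(\d+))?$")


def parse_zoom(zoom):
    """Split a zoom location into (chromosome, start, end).

    start and end are None when only a chromosome is given.
    """
    match = ZOOM_PATTERN.match(zoom)
    if match is None:
        raise ValueError(f"unrecognized zoom location: {zoom!r}")
    chrom, start, end = match.groups()
    if start is None:
        return chrom, None, None
    start, end = int(start), int(end)
    if start > end:
        raise ValueError("zoom start must not exceed zoom end")
    return chrom, start, end


def build_command(input_file, output_file, image_type=None, zoom=None, seed=None):
    """Assemble an argument list for pheno_gram.rb; nothing is executed."""
    # Running the script via "ruby" is an assumption of this sketch.
    args = ["ruby", "pheno_gram.rb", "-i", input_file, "-o", output_file]
    if image_type is not None:
        args += ["-f", image_type]
    if zoom is not None:
        zoom = zoom.replace("\u2013", "-")  # normalize a typographic en dash
        parse_zoom(zoom)                    # raises ValueError if malformed
        args += ["-Z", zoom]
    if seed is not None:
        args += ["-r", str(seed)]
    return args


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    command = build_command("gwas_input.txt", "gwas_plot",
                            zoom="7:10000-20000", seed=7)
    print(shlex.join(command))
```

Building the command as a list rather than a single string keeps file names with spaces intact if the list is later handed to `subprocess.run`.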
PhenoGram input file formatting parameters
Recognized column header
Base-pair location of the SNP or starting location of a base-pair to base-pair region
Name of the phenotype; required only when plotting phenotypes as colored shapes
Ending location of a base-pair to base-pair region
NOTE (or ANNOTATION): Values in this column are shown on the plot, positioned to the right of the chromosome; limited to 10 characters
ETHNICITY (or ANCESTRY): Specifies ethnicity or ancestry for the associated position; accepts up to three unique values in this column
Specifies a group identifier such that all phenotypes of the same identifier share a common color
Shades transverse lines on the line plot with a color; specified by an integer 0-7
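The constraints listed above can be checked before a file is handed to PhenoGram. The sketch below is one possible way to do so in Python. Only NOTE and ETHNICITY are named in the table as it appears here; the header names CHR, POS, END, PHENOTYPE, and POSCOLOR, the tab-delimited layout, and the helper names are assumptions for illustration and should be checked against the PhenoGram documentation. The example rows are placeholders, not real association results.

```python
import csv
import io

# NOTE and ETHNICITY come from the table above; the other header names and
# the tab-delimited layout are assumptions of this sketch.
COLUMNS = ["CHR", "POS", "END", "PHENOTYPE", "NOTE", "ETHNICITY", "POSCOLOR"]
MAX_NOTE_LENGTH = 10          # "limited to 10 characters"
MAX_ETHNICITY_VALUES = 3      # "accepts up to three unique values"
POSCOLOR_RANGE = range(0, 8)  # "an integer 0-7"


def validate_rows(rows):
    """Return a list of human-readable problems found in the rows."""
    problems = []
    for number, row in enumerate(rows, start=1):
        if len(row.get("NOTE", "")) > MAX_NOTE_LENGTH:
            problems.append(f"row {number}: NOTE longer than {MAX_NOTE_LENGTH} characters")
        end = row.get("END")
        if end not in (None, "") and int(end) < int(row["POS"]):
            problems.append(f"row {number}: END precedes POS")
        color = row.get("POSCOLOR")
        if color not in (None, "") and int(color) not in POSCOLOR_RANGE:
            problems.append(f"row {number}: POSCOLOR outside 0-7")
    ethnicities = {row["ETHNICITY"] for row in rows if row.get("ETHNICITY")}
    if len(ethnicities) > MAX_ETHNICITY_VALUES:
        problems.append("more than three unique ETHNICITY values")
    return problems


def to_tsv(rows):
    """Serialize rows as tab-delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, delimiter="\t",
                            restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder rows for illustration only.
example_rows = [
    {"CHR": "1", "POS": "1000000", "PHENOTYPE": "trait_A", "NOTE": "GENE1"},
    {"CHR": "7", "POS": "10000", "END": "20000", "PHENOTYPE": "trait_B",
     "ETHNICITY": "group_1", "POSCOLOR": "3"},
]

if __name__ == "__main__":
    issues = validate_rows(example_rows)
    print(issues if issues else to_tsv(example_rows))
```

Columns that a row does not use are written as empty fields, so single-position rows and base-pair to base-pair regions can share one file.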
Results and discussion
To show the utility of PhenoGram, and the ways that multiple options can be combined for different types of plots, we describe here several example uses of this software. For the first set of examples, we have used a subset of data from the NHGRI GWAS Catalog to demonstrate some features of PhenoGram, highlighting some of the similarities and differences in our plots compared to the NHGRI GWAS Catalog plots. We chose these data because they allowed us to represent multiple phenotypes across the genome and highlight other relationships in the data such as pleiotropy or ancestry. In addition, the GWAS Catalog data could be prepared as input to PhenoGram with a single database query and minimal data.
With the ever-increasing amounts of data being collected, visually summarizing data can be important for providing insight into complex results. Multiple data results can be plotted across chromosomes, providing useful summary information and aiding in data analyses as well as in sharing results. PhenoGram offers a robust feature set, allowing researchers to plot data of many kinds across a chromosomal ideogram according to preference. In the future, we will add further color options for plots, as well as additional software features, to expand plotting options with PhenoGram. The features of PhenoGram can further facilitate the exploration and sharing of genomic information.
Availability and requirements
Project name: PhenoGram
Project home page: http://visualization.ritchielab.psu.edu
Operating system(s): Linux, Mac OS X, Windows
Programming language: Ruby
Other requirements: RMagick
License: GNU General Public License
Any restrictions to use by non-academics: PhenoGram use is restricted to academic and non-profit users
We would like to thank everyone who has had suggestions for improvements and additions to this software. This work was supported by the following funding agencies and grants: 5U01 HG004798-03, 5R01 LM010040-02, and U19 HL065962-10.
1. Ramos PS, Criswell LA, Moser KL, Comeau ME, Williams AH, Pajewski NM, Chung SA, Graham RR, Zidovetzki R, Kelly JA, Kaufman KM, Jacob CO, Vyse TJ, Tsao BP, Kimberly RP, Gaffney PM, Alarcón-Riquelme ME, Harley JB, Langefeld CD, International Consortium on the Genetics of Systemic Erythematosus: A comprehensive analysis of shared loci between systemic lupus erythematosus (SLE) and sixteen autoimmune diseases reveals limited genetic overlap. PLoS Genet. 2011, 7: e1002406.
2. Grossman SR, Andersen KG, Shlyakhter I, Tabrizi S, Winnicki S, Yen A, Park DJ, Griesemer D, Karlsson EK, Wong SH, Cabili M, Adegbola RA, Bamezai RNK, Hill AVS, Vannberg FO, Rinn JL, Lander ES, Schaffner SF, Sabeti PC, 1000 Genomes Project: Identifying recent adaptations in large-scale genomic data. Cell. 2013, 152: 703-713.
3. Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, Collins FS, Manolio TA: Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci. 2009, 106: 9362-9367.
4. Pendergrass SA, Brown-Gentry K, Dudek SM, Torstenson ES, Ambite JL, Avery CL, Buyske S, Cai C, Fesinmeyer MD, Haiman C, Heiss G, Hindorff LA, Hsu C-N, Jackson RD, Kooperberg C, Le Marchand L, Lin Y, Matise TC, Moreland L, Monroe K, Reiner AP, Wallace R, Wilkens LR, Crawford DC, Ritchie MD: The use of phenome-wide association studies (PheWAS) for exploration of novel genotype-phenotype relationships and pleiotropy discovery. Genet Epidemiol. 2011, 35: 410-422.
5. Pendergrass SA, Brown-Gentry K, Dudek S, Frase A, Torstenson ES, Goodloe R, Ambite JL, Avery CL, Buyske S, Bůžková P, Deelman E, Fesinmeyer MD, Haiman CA, Heiss G, Hindorff LA, Hsu C-N, Jackson RD, Kooperberg C, Le Marchand L, Lin Y, Matise TC, Monroe KR, Moreland L, Park SL, Reiner A, Wallace R, Wilkens LR, Crawford DC, Ritchie MD: Phenome-Wide Association Study (PheWAS) for Detection of Pleiotropy within the Population Architecture using Genomics and Epidemiology (PAGE) Network. PLoS Genet. 2013, 9: e1003087.
6. Cortes A, Brown MA: Promise and pitfalls of the Immunochip. Arthritis Res Ther. 2011, 13: 101.
7. Pinto D, Pagnamenta AT, Klei L, Anney R, Merico D, Regan R, Conroy J, Magalhaes TR, Correia C, Abrahams BS, Almeida J, Bacchelli E, Bader GD, Bailey AJ, Baird G, Battaglia A, Berney T, Bolshakova N, Bölte S, Bolton PF, Bourgeron T, Brennan S, Brian J, Bryson SE, Carson AR, Casallo G, Casey J, Chung BHY, Cochrane L, Corsello C: Functional impact of global rare copy number variation in autism spectrum disorders. Nature. 2010, 466: 368-372.
8. Girirajan S, Johnson RL, Tassone F, Balciuniene J, Katiyar N, Fox K, Baker C, Srikanth A, Yeoh KH, Khoo SJ, Nauth TB, Hansen R, Ritchie M, Hertz-Picciotto I, Eichler EE, Pessah IN, Selleck SB: Global increases in both common and rare copy number load associated with autism. Hum Mol Genet. 2013, 22: 2870-2880.
9. Furey TS, Haussler D: Integration of the cytogenetic map with the draft human genome sequence. Hum Mol Genet. 2003, 12: 1037-1044.
10. Bickmore WA: Karyotype Analysis and Chromosome Banding. In eLS. 2001, John Wiley & Sons, Ltd.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.