Semin Reprod Med 2014; 32(01): 005-013
DOI: 10.1055/s-0033-1361817
Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.

Use of Whole Genome Shotgun Metagenomics: A Practical Guide for the Microbiome-Minded Physician Scientist

Jun Ma (1, 2), Amanda Prince (1), Kjersti M. Aagaard (1, 2, 3, 4)

1   Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, Baylor College of Medicine
2   Department of Molecular and Human Genetics, Bioinformatics Research Lab, Baylor College of Medicine
3   Department of Molecular and Cell Biology, Baylor College of Medicine
4   Alkek Center for Metagenomics and Microbiome Research, Baylor College of Medicine, Houston, Texas

Publication History

Publication Date:
03 January 2014 (online)

Abstract

Whole genome shotgun sequencing (WGS) has been increasingly recognized as the most comprehensive and robust approach for metagenomics research. Compared with 16S-based metagenomics, it offers two advantages: species-level taxonomic identification and estimation of metabolic pathway activities from human and environmental samples. Several large-scale metagenomic projects utilizing WGS have recently been conducted or are currently underway. With the generation of vast amounts of data, the bioinformatic and computational analysis of WGS results becomes vital to the success of a metagenomics study. However, each step in WGS data analysis, including metagenome assembly, gene prediction, taxonomy identification, function annotation, and pathway analysis, is complicated by the sheer amount of data. Algorithms and tools have been developed specifically to handle WGS-generated metagenomics data, with the hope of reducing the requirements for computational time and storage space. Here, we present an overview of the current state of metagenomics through WGS, the challenges frequently encountered, and up-to-date solutions. Several applications that are uniquely applicable to microbiome studies in reproductive and perinatal medicine are also discussed.

 