Pneumologie 2015; 69 - A9
DOI: 10.1055/s-0035-1556601

Normalization and calibration in the metagenomic analysis of CF lung microbiomes

L Wiehlmann 1, P Chouvarine 1, P Moran Losada 1, B Tümmler 1
  • 1 Medizinische Hochschule Hannover

Next-generation sequencing is becoming increasingly affordable, which makes whole-metagenome sequencing an attractive alternative to traditional 16S rDNA, RFLP, or culture-based approaches to the analysis of metagenomic samples. The advantages of whole-metagenome sequencing are a broader spectrum of detectable microorganisms (including fungi and many viruses), quantitative analysis of the microbial community, and information about the metabolic capacity and physiological features of the studied metagenome, even without knowledge of the genotypes and phenotypes of all members of the microbial community. Problems of 16S rDNA sequencing, such as the unknown copy number of the 16S gene and the lack of sequence homology to some of the target 16S genes, do not arise in whole-metagenome sequencing. On the other hand, next-generation sequencing suffers from biases that result in non-uniform coverage of the sequenced genomes (e.g. GC bias, sequence-based differences in ligase efficiency). In addition, microorganisms differ in genome length and in their content of variable genomic elements (e.g. genomic islands, phages). These effects can and must be normalized.
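The effect of genome length can be illustrated with a minimal sketch. It assumes that reads have already been assigned to species and that reference genome lengths are known; the function name, the species labels, and the numbers are hypothetical and are not taken from the study. Dividing each read count by the genome length gives a read density, and the densities are then scaled to sum to one.

```python
def relative_abundance(read_counts, genome_lengths):
    """Convert per-species read counts to length-normalized relative abundances.

    Both arguments are dicts keyed by species name (hypothetical inputs).
    """
    # A longer genome yields more reads per cell, so divide by genome length.
    density = {sp: read_counts[sp] / genome_lengths[sp] for sp in read_counts}
    total = sum(density.values())
    if total == 0:
        raise ValueError("no reads assigned to any species")
    # Scale the densities so that they sum to 1.
    return {sp: d / total for sp, d in density.items()}


# Illustrative numbers only: species_A has three times the reads of species_B,
# but also a genome three times as long.
counts = {"species_A": 6000, "species_B": 2000}
lengths = {"species_A": 6_000_000, "species_B": 2_000_000}
abundance = relative_abundance(counts, lengths)
print(abundance)
```

In this example both species have the same read density, so their length-normalized abundances are equal even though the raw read counts differ threefold.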

We analyzed a collection of 30 upper-airway samples from cystic fibrosis patients to validate the filtration techniques and to determine absolute bacterial abundances. The proposed filtration techniques identify and filter out reads mapping to genomic islands based on their positional distribution across the reference genome. To establish the GC-normalization model, we used a sample of seven bacteria pooled in equal DNA amounts.
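The abstract does not specify how the positional distribution is evaluated. One possible way to approach such a filter, shown here purely as a sketch under stated assumptions, is to count reads in fixed-size windows along the reference genome and to flag windows whose counts deviate strongly from the median. The window size, the median/MAD outlier rule, the threshold `k`, and the function name are all assumptions made for illustration, not the authors' method.

```python
from statistics import median


def flag_outlier_windows(read_starts, genome_length, window=10_000, k=5.0):
    """Flag windows whose read count deviates strongly from the median.

    read_starts: 0-based mapping start positions on one reference genome.
    Returns (counts per window, flagged window indices, median of kept counts).
    The last window may be shorter than `window`; this sketch ignores that.
    """
    n_windows = (genome_length + window - 1) // window
    counts = [0] * n_windows
    for pos in read_starts:
        counts[pos // window] += 1

    med = median(counts)
    # Median absolute deviation, scaled to be comparable to a standard deviation.
    mad = median([abs(c - med) for c in counts])
    spread = max(1.4826 * mad, 1.0)  # floor avoids a zero threshold

    flagged = [i for i, c in enumerate(counts) if abs(c - med) > k * spread]
    kept = [c for i, c in enumerate(counts) if i not in flagged]
    return counts, flagged, median(kept)


# Synthetic example: uniform coverage except an over-covered window (3),
# as a phage-like element might produce, and an empty window (7),
# as an island absent from the sampled strain might produce.
reads = []
for w in range(10):
    n = 1000 if w == 3 else 0 if w == 7 else 100
    reads.extend(w * 10_000 + (i * 7) % 10_000 for i in range(n))

counts, flagged, depth = flag_outlier_windows(reads, genome_length=100_000)
print(flagged, depth)
```

Reads in the flagged windows would then be excluded, and the depth estimated from the remaining windows would serve as the basis for quantification.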

While there has been substantial research on the normalization and filtration of read-count data for techniques such as RNA-seq or ChIP-seq, to our knowledge this has not been the case for the newly developing field of whole-metagenome shotgun sequencing. We present a model for correcting the GC bias that affects sequencing reads in metagenomic samples, together with the filtration and normalization techniques necessary for accurate quantification of microorganisms in such samples.
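The form of the GC-correction model is not given in the abstract. As a hedged illustration of how a calibration pool with equal DNA amounts could be used, the sketch below assumes that equal DNA mass implies an equal expected share of reads per genome, and that the log ratio of observed to expected reads is linear in genomic GC content. Both the log-linear form and the function names are assumptions (real GC bias is often not monotonic), and the calibration values are generated synthetically within the code rather than measured.

```python
import math


def fit_gc_bias(gc_fractions, read_counts):
    """Fit log(observed / expected) = a + b * GC by ordinary least squares.

    Assumes every genome in the calibration pool contributes equal DNA mass,
    so each is expected to receive the same share of reads.
    """
    n = len(read_counts)
    expected = sum(read_counts) / n
    ys = [math.log(c / expected) for c in read_counts]
    mean_x = sum(gc_fractions) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in gc_fractions)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gc_fractions, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def correct_count(count, gc, a, b):
    """Divide an observed read count by the bias predicted for its GC content."""
    return count / math.exp(a + b * gc)


# Synthetic calibration pool of seven genomes: counts are generated from an
# assumed bias in which GC-rich genomes are under-represented.
gc = [0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.65]
observed = [1000 * math.exp(-2.0 * (g - 0.5)) for g in gc]

a, b = fit_gc_bias(gc, observed)
corrected = [correct_count(c, g, a, b) for c, g in zip(observed, gc)]
print(round(b, 3), [round(c, 1) for c in corrected])
```

Because the synthetic counts follow the assumed model exactly, the corrected counts come out equal across the seven genomes, which is the behavior expected for a pool with equal DNA amounts. With measured data the fit would be approximate, and a more flexible curve might be required.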

*Presenting author