PLoS One. 2012;7(4):e36009.

doi: 10.1371/journal.pone.0036009. Epub 2012 Apr 27.

Generation and analysis of a mouse intestinal metatranscriptome through Illumina based RNA-sequencing

Xuejian Xiong¹, Daniel N Frank, Charles E Robertson, Stacy S Hung, Janet Markle, Angelo J Canty, Kathy D McCoy, Andrew J Macpherson, Philippe Poussier, Jayne S Danska, John Parkinson

PMID: 22558305
PMCID: PMC3338770
DOI: 10.1371/journal.pone.0036009

Xuejian Xiong et al. PLoS One. 2012.

Affiliation

¹ Program in Molecular Structure and Function, The Hospital for Sick Children, Toronto, Canada.

Abstract

With the advent of high-throughput sequencing (HTS), the emerging science of metagenomics is transforming our understanding of the relationships of microbial communities with their environments. While metagenomics aims to catalogue the genes present in a sample, metatranscriptomics, through assessing which genes are actively expressed, can provide a mechanistic understanding of community inter-relationships. To achieve these goals, several challenges need to be addressed, from sample preparation to sequence processing, statistical analysis and functional annotation. Here we use an inbred non-obese diabetic (NOD) mouse model, in which germ-free animals were colonized with a defined mixture of eight commensal bacteria, to explore methods of RNA extraction and to develop a pipeline for the generation and analysis of metatranscriptomic data. Applying the Illumina HTS platform, we sequenced 12 NOD cecal samples prepared using multiple RNA-extraction protocols. The absence of a complete set of reference genomes necessitated a peptide-based search strategy. Up to 16% of sequence reads could be matched to a known bacterial gene. Phylogenetic analysis of the mapped ORFs revealed a distribution consistent with that of ribosomal RNA, with the majority derived from Bacteroides or Clostridium species. To place these HTS data within a systems context, we mapped the relative abundance of corresponding Escherichia coli homologs onto metabolic and protein-protein interaction networks. These maps identified bacterial processes with components that were well represented in the datasets. In summary, this study highlights the potential of exploiting the economy of HTS platforms for metatranscriptomics.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

**Figure 1. Phylogenetic distribution of transcripts mapped to known bacterial genes.**
(A) Distribution of mRNA transcripts. (B) Distribution of rRNA transcripts. Results are shown for the 12 independent samples described in Table 1. Consistent with 16S rRNA studies, the vast majority of identified mRNA transcripts derive from Parabacteroides, Bacteroides or Clostridial species.

**Figure 2. Distribution of COG functional annotations.**
(A) Distribution of COG functional annotations of reads mapping to known bacterial transcripts for the 12 samples analysed in this study. Also shown is the distribution of COG assignments for all proteins in the COG database (background). Asterisks indicate COG categories with significant (Z-score > 2) differences in relative frequency between samples and the background (Z-score is indicated). (B) Pearson correlation coefficients of frequency of reads assigned to each COG category across the 12 samples.

**Figure 3. Metatranscriptome data mapped in the context of a global metabolic network.**
(A) Network map highlighting metabolic enzyme expression for reads obtained from the NOD503CecMN sample. Size of node indicates relative expression. Colour of node indicates functional category of enzyme as defined by KEGG superclasses (see key for details). Several example pathways are indicated. (B) Spearman rank correlation coefficients of relative enzyme expression across the 12 samples. In general there is a high degree of consistency in enzyme expression across samples. (C) Top 100 expressed enzymes in the NOD503CecMN sample. For three subnetworks, links are apparent that extend beyond the boundaries of KEGG defined pathways.
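As an illustration of the rank-based comparison reported in panel (B), the following minimal sketch computes a Spearman rank correlation between two samples by ranking the values (ties receive their average rank) and then taking the Pearson correlation of the ranks. The function names and the expression values are hypothetical and are not taken from the study; the authors' actual tooling is not described here.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical relative expression of the same five enzymes in two samples.
sample_1 = [120.0, 15.0, 980.0, 42.0, 310.0]
sample_2 = [95.0, 22.0, 1500.0, 30.0, 280.0]
rho = spearman(sample_1, sample_2)
```

Because only ranks enter the calculation, the coefficient is insensitive to the large dynamic range typical of expression values, which is one common reason to prefer it over Pearson correlation for this kind of comparison.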

**Figure 4. Samples display subtle differences in expression of enzymes of high betweenness centrality.**
For each enzyme in the network, betweenness centrality was calculated and mapped to node colour. Between the two samples we identify differences in the expression of nodes of high betweenness, suggesting an altered reliance on pathway flux within the network.
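For readers unfamiliar with the measure, betweenness centrality counts, for each node, the fraction of shortest paths between all other node pairs that pass through it. The sketch below uses Brandes' algorithm on an unweighted, undirected graph; this is one standard way to compute the measure and is an assumption here, not a description of the authors' implementation. The toy network and its node names are hypothetical.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Unnormalised betweenness for an undirected, unweighted graph.

    adjacency maps each node to an iterable of its neighbours; every
    neighbour must also appear as a key.
    """
    centrality = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from source.
        visited_order = []
        predecessors = {v: [] for v in adjacency}
        path_count = {v: 0 for v in adjacency}
        path_count[source] = 1
        dist = {v: -1 for v in adjacency}
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            visited_order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)
        # Accumulate dependencies, walking back from the farthest nodes.
        dependency = {v: 0.0 for v in adjacency}
        while visited_order:
            w = visited_order.pop()
            for v in predecessors[w]:
                share = path_count[v] / path_count[w]
                dependency[v] += share * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in centrality.items()}


# Hypothetical toy network: enzyme_B bridges two otherwise separate parts.
toy_network = {
    "enzyme_A": ["enzyme_B"],
    "enzyme_B": ["enzyme_A", "enzyme_C", "enzyme_D"],
    "enzyme_C": ["enzyme_B"],
    "enzyme_D": ["enzyme_B"],
}
scores = betweenness_centrality(toy_network)
```

In the toy network every shortest path between the three peripheral nodes runs through `enzyme_B`, so it receives the only non-zero score; a node like this is the kind of bottleneck the figure highlights.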

**Figure 5. Metatranscriptome data mapped in the context of a global *E. coli* protein interaction network.**
(A) Spearman correlation coefficients of abundance of reads mapping to *E. coli* homologs across samples. (B)–(D) Comparison between conservation, defined as the number of bacterial genomes in which a putative ortholog has been identified, and relative expression of *E. coli* homologs for three selected subsystems derived from the transcriptome data for a single sample (NOD503CecMN). Subsystems are defined from a previously generated high-quality protein-protein interaction network. Colours of nodes indicate genes involved in common functional modules. Size of nodes indicates either number of genomes or relative expression in terms of RPKM (largest = >900 genomes/transcripts). (B) Proteins involved in cell division and cell wall biogenesis. (C) Proteins involved in iron transport and tryptophan metabolism. (D) Proteins involved in select transport pathways. Note that, of the ones shown, only components involved in spermidine transport (PotA-D) are abundant within our sample, despite many being highly conserved.
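The RPKM unit used for node size normalises a raw read count by both transcript length and sequencing depth (reads per kilobase of transcript per million mapped reads). A minimal sketch of the standard formula follows; the function name, gene labels, and counts are hypothetical and are not values from the study.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = read_count * 1e9 / (gene_length_bp * total_mapped_reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical example: (reads mapped to gene, gene length in bp).
gene_counts = {
    "gene_x": (500, 1000),
    "gene_y": (500, 2000),
}
total_reads = 1_000_000  # hypothetical number of mapped reads in the sample

expression = {
    gene: rpkm(count, length, total_reads)
    for gene, (count, length) in gene_counts.items()
}
```

The two hypothetical genes receive the same number of reads, but the longer one ends up with half the RPKM, which is the length correction the unit is designed to provide.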
