Short reads from honey bee (<i>Apis</i> sp.) sequencing projects reflect microbial associate diversity



Gerth, Michael ORCID: 0000-0001-7553-4072 and Hurst, Gregory DD ORCID: 0000-0002-7163-7784
(2017) Short reads from honey bee (<i>Apis</i> sp.) sequencing projects reflect microbial associate diversity. PeerJ, 5 (7). e3529.

Abstract

High throughput (or 'next generation') sequencing has transformed most areas of biological research and is now a standard method that underpins empirical study of organismal biology and, through comparison of genomes, reveals patterns of evolution. For projects focused on animals, these sequencing methods do not discriminate between the primary target of sequencing (the animal genome) and 'contaminating' material, such as associated microbes. A common first step is to filter out these contaminants to allow better assembly of the animal genome or transcriptome. Here, we aimed to assess whether these 'contaminations' provide information about biologically important microorganisms associated with the individual. To achieve this, we examined whether short read data from <i>Apis</i> retrieved elements of its well established microbiome. We screened almost 1,000 short read libraries from honey bee (<i>Apis</i> sp.) DNA sequencing projects for the presence of microbial sequences, and found sequences from known honey bee microbial associates in at least 11% of them. In addition, we screened ∼500 <i>Apis</i> RNA sequencing libraries for evidence of viral infections, which were present in about half of them. We then used the data to reconstruct <i>de novo</i> draft genomes of three <i>Apis</i> associated bacteria, as well as several viral strains. We conclude that 'contamination' in short read sequencing libraries can provide useful genomic information on microbial taxa known to be associated with the target organisms, and may even lead to the discovery of novel associations. Finally, we demonstrate that RNAseq samples from experiments commonly carry uneven viral loads across libraries. We note that variation in viral presence and load may be a confounding feature of differential gene expression analyses and should therefore be incorporated as a random factor in such analyses.

Item Type: Article
Uncontrolled Keywords: Short read sequencing, Contamination, Symbionts, Hologenome, Metagenome, Spiroplasma, Lactobacillus, Deformed wing virus
Depositing User: Symplectic Admin
Date Deposited: 19 Jul 2017 15:02
Last Modified: 13 Oct 2023 05:01
DOI: 10.7717/peerj.3529
URI: https://livrepository.liverpool.ac.uk/id/eprint/3008566