Sibthorp, Christopher
Analysis of the Aspergillus nidulans transcriptome using high-throughput RNA sequencing.
Doctor of Philosophy thesis, University of Liverpool.
PDF: SibthorpChr_Sep2012_9973.pdf (2MB). Available under License Creative Commons Attribution No Derivatives.
Archive (ZIP): SibthorpChr_Sep2012_9973_supplementarydata.zip (2MB). Available under License Creative Commons Attribution No Derivatives.
Abstract
The filamentous fungus Aspergillus nidulans is a well-characterised model organism that has been used extensively for the study of eukaryotic cell biology and genetics over the past 60 years. The A. nidulans genome was sequenced in 2005, and various genome annotations have been released since, the majority of which rely heavily on in silico gene prediction. The development of high-throughput next-generation sequencing technologies has revolutionised transcriptomics by allowing the analysis of whole transcriptomes through massively parallel cDNA sequencing (RNA-seq). This sequencing approach has been applied to the A. nidulans transcriptome, and augmented by the development of a novel strategy for selectively sequencing the 5′ ends of RNAs on the ABI SOLiD platform. The aim was to produce a more robust resource for gene interrogation and for the investigation of regulatory elements that shape the transcriptomic landscape in A. nidulans. Bioinformatic analysis of RNA-seq data was used to define 15,375 transcription start site (TSS) regions, which have been characterised by statistical analysis of mapped 5′ end distribution. Motif finding within sequence regions surrounding these TSSs identified 16 putative functional promoter motifs based on overrepresentation and distributional analysis within promoters, and GO annotation found significant functional enrichment amongst genes associated with two of these motifs (AARARAAA and TTTYTTY). Transcript assembly of RNA-seq data has also revealed 16,065 putative transcripts, 1,112 of which were mapped to regions annotated as intergenic. From these transcripts we identified 38 strong candidates for novel protein-coding genes (six of which contained non-canonical translation start sites), and over 400 additional transcripts containing putative coding regions.
Separation of RNA-seq data into two sets of strand-specific reads was shown to greatly increase the quality of transcript assembly and facilitated the identification of 2,291 occurrences of sense:antisense overlap between assembled transcripts, four of which have been confirmed experimentally. Finally, assembled transcripts have been used to detect multiple transcript isoforms arising from alternative splicing events. In total, 374 distinct loci were identified as the origins of alternatively spliced transcripts, and six of these were verified experimentally.
Item Type: | Thesis (Doctor of Philosophy) |
---|---|
Additional Information: | Date: 2012-09 (completed) |
Uncontrolled Keywords: | Aspergillus nidulans transcriptomics rna-seq |
Divisions: | Faculty of Health and Life Sciences |
Depositing User: | Symplectic Admin |
Date Deposited: | 03 Sep 2013 10:52 |
Last Modified: | 17 Dec 2022 01:14 |
DOI: | 10.17638/00009973 |
Supervisors: | |
URI: | https://livrepository.liverpool.ac.uk/id/eprint/9973 |