Development and exploitation of GeneFriends: An online database for gene and transcript co-expression analysis



Van Dam, S (2017) Development and exploitation of GeneFriends: An online database for gene and transcript co-expression analysis. PhD thesis, University of Liverpool.

Text: 200780343_Feb2017.pdf (2MB)
Other: Thesis supplements revised.rar (37MB)

Abstract

Although many diseases have been well characterized at the molecular level, the underlying mechanisms often remain unclear. This may be attributed to the large number of genes whose roles in biological processes and diseases remain unknown. Genes involved in the same biological processes and diseases are often co-expressed, and this information can be used to predict the biological process in which a poorly annotated gene likely plays its primary role. For this purpose, we constructed a co-expression network from a large number of microarray and RNA-seq samples. We conclude that co-expression analysis can be used to postulate the functions of both coding and non-coding genes and to predict the diseases in which they likely play an important role. We also show that gene-function predictions based on a co-expression network constructed at the transcript level rather than the gene level can differentiate between the functions of transcripts originating from the same gene.

We have created GeneFriends, the first online resource that utilizes a co-expression network constructed from RNA-seq data and that allows users to query for co-expression at the transcript rather than the gene level. This allows researchers to identify and prioritize novel candidate genes and transcripts involved in biological processes and complex diseases. The value of this resource to the research community is supported by the use of GeneFriends in a number of independent publications. GeneFriends is available online at: http://GeneFriends.org/.

To validate the ability of our tool to identify genes that are relevant to diseases, we tested GeneFriends by conducting a co-expression analysis with seed lists for aging, cancer, and mitochondrial complex I disease. We identified several candidate genes that have previously been predicted as relevant targets for each of these diseases. Some of the identified genes were already being tested in clinical trials, supporting the effectiveness of this approach. Furthermore, two of the novel candidates of unknown function that were identified by GeneFriends as co-expressed with cancer genes were selected for experimental validation. Knock-down of the human homologs (C1ORF112 and C12ORF48) of these two candidate genes in HeLa cells slowed proliferation, suggesting that these genes indeed play a role in cancer growth.

Co-expression analyses often lead to large lists of gene-disease associations without a clear indication of which genes are most relevant for follow-up studies. To select such genes, those that are important nodes in a co-expression network are often identified, under the notion that they are of higher biological relevance than the others. To test whether this method selects the most relevant genes for aging, we conducted a co-expression analysis on a rat thymus dataset and identified transcription factors that are important network nodes. Whilst the literature supports that some of these transcription factors may be important regulators of the aging process, this method can also miss some of the most interesting intervention targets.

Lastly, in a rat brain aging RNA-seq dataset generated in our lab, we tested whether we could identify co-expression modules whose expression correlates with aging and investigated whether any dietary interventions potentially affected this correlation. Although modules were identified that correlated with aging, no significant effect of the dietary interventions was detected for any of these modules. This dataset also contained detailed information about the expression of microRNAs in addition to the whole-transcriptome data. We used this information to investigate whether the expression of microRNAs and that of their targets are negatively correlated, which we did not observe.

Item Type: Thesis (PhD)
Divisions: Faculty of Health and Life Sciences > Faculty of Health and Life Sciences
Depositing User: Symplectic Admin
Date Deposited: 23 Aug 2017 08:42
Last Modified: 19 Jan 2023 07:15
DOI: 10.17638/03006075
Supervisors:
  • de Magalhães, JP
URI: https://livrepository.liverpool.ac.uk/id/eprint/3006075