Molecular epidemiology of lung cancer in the Liverpool lung project (LLP) cohort



De Souza, Nicosha
Molecular epidemiology of lung cancer in the Liverpool lung project (LLP) cohort. PhD thesis, University of Liverpool.

Available under License Creative Commons Attribution No Derivatives.


Abstract

The primary aim of the project was to evaluate the epidemiological and genetic susceptibility factors associated with lung cancer in the Liverpool Lung Project (LLP) population. The datasets available for research alongside the LLP questionnaire dataset were: Office for National Statistics (ONS) data; Hospital Episode Statistics (HES) data, including comorbidity data; and single nucleotide polymorphism (SNP) data for 570 cases from Liverpool and 3000 controls from the 1958 Birth Cohort.

The epidemiological (HES) data were used to study the effect of the Charlson comorbidity index (CCI) and the Elixhauser comorbidity index (ECI) on the incidence of lung cancer using Cox proportional hazards regression, and to design 5-year sex-specific incidence models for lung cancer with crucial covariates. The ECI and CCI were significant in both univariate and multivariate analyses adjusted for age at the start of the study, sex and smoking pack-years. The developed models had good discriminatory power (AUC = 0.73 for males; AUC = 0.77 for females) when internally validated using 10-fold cross-validation.

The genetic data for the LLP lung cancer cases were used in several contexts: i) to identify SNPs associated with lung cancer under a range of allelic models (additive, dominant, recessive and genotypic), using the Wellcome Trust 1958 Birth Cohort as a control dataset; ii) to identify SNPs associated with cause-specific and overall survival in lung cancer patients, utilising the Cox proportional hazards model with adjustment for various covariates; and iii) to identify gene pathways associated with lung cancer survival using the random survival forest method. SNPs within the genes PRDM11, ZNF382 and HMGA2 were identified in the genome-wide case-control study when using the additive, dominant or genotypic models, whereas the recessive model identified the gene ITIH2. Significant SNPs (p ≤ 10⁻⁶) associated with cause-specific survival in early stage cases were rs10230420 (WIPF3), rs3746619 and rs3827103 (both in MC3R). In advanced stage cases, significant SNPs were rs1868110 (NEK10) and rs2206779 (AF357533). For the overall survival analysis, significant SNPs were rs10230420 (WIPF3), rs2056533 (ZBTB20) and rs6708630 (CYS1) in early stage cases, whereas rs1868110 (NEK10) and rs2206779 (AF357533) were significantly associated with overall survival in advanced stage non-small cell lung cancer (NSCLC) cases. The pathway analysis using the random survival forest method was undertaken on 18 pathways for both cause-specific and overall survival of lung cancer cases. The results were consistent with apoptosis, base excision repair and mismatch repair being pathways influencing survival.
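The four allelic models named in the abstract differ only in how a genotype is turned into regression covariates. The sketch below illustrates the conventional encodings and is not taken from the thesis: the function name encode_genotype and the representation of a genotype as a count of minor alleles (0, 1 or 2) are assumptions made for this example.

```python
# Illustrative sketch only: standard genotype encodings for SNP association
# models. The helper name and the 0/1/2 minor-allele-count input are
# assumptions for this example, not the thesis's own code.

MODELS = ("additive", "dominant", "recessive", "genotypic")


def encode_genotype(minor_allele_count, model):
    """Return the covariate tuple for one genotype under the given model."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1 or 2")
    if model == "additive":
        # One covariate: effect scales with each copy of the minor allele.
        return (minor_allele_count,)
    if model == "dominant":
        # One covariate: carrying at least one minor allele.
        return (1 if minor_allele_count >= 1 else 0,)
    if model == "recessive":
        # One covariate: carrying two copies of the minor allele.
        return (1 if minor_allele_count == 2 else 0,)
    if model == "genotypic":
        # Two indicator covariates: heterozygote, minor-allele homozygote.
        return (
            1 if minor_allele_count == 1 else 0,
            1 if minor_allele_count == 2 else 0,
        )
    raise ValueError("unknown model: " + repr(model))


if __name__ == "__main__":
    # Print the encoding of each genotype under every model.
    for model in MODELS:
        codes = [encode_genotype(g, model) for g in (0, 1, 2)]
        print(model, codes)
```

Under the recessive encoding, heterozygotes are grouped with major-allele homozygotes, which is one reason a recessive analysis can highlight a different gene from the other three models.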

Item Type: Thesis (PhD)
Additional Information: Date: 2014-07 (completed)
Depositing User: Symplectic Admin
Date Deposited: 29 Jan 2015 09:13
Last Modified: 17 Dec 2022 00:49
DOI: 10.17638/02006199
URI: https://livrepository.liverpool.ac.uk/id/eprint/2006199