Exploiting Advanced Methods for Membrane Protein Structure Prediction

Mesdaghi, Shahram
(2023) Exploiting Advanced Methods for Membrane Protein Structure Prediction. PhD thesis, University of Liverpool.

Thesis_shah_corrected_30JUN23.pdf - Author Accepted Manuscript


Recent strides in computational structural biology have opened up an opportunity to understand previously uncharacterised proteins. The under-representation of transmembrane proteins in the Protein Data Bank (PDB) highlights the need to apply new and advanced bioinformatics methods to shed light on their structure and function. A protein’s structural information is crucial to understanding its function and evolution. Currently, experimental structural data exist for only a tiny fraction of proteins. For instance, membrane proteins are encoded by 30% of the protein-coding genes of the human genome, but they have only a 3.5% representation in the PDB. Membrane protein families are particularly poorly understood due to experimental difficulties, such as over-expression, which can result in toxicity to host cells, as well as difficulty in finding a suitable membrane mimetic to reconstitute the protein. Additionally, membrane proteins are much less conserved across species than water-soluble proteins, making sequence-based homologue identification a challenge and, in turn, rendering homology modelling of these proteins more difficult. Until the structures of poorly characterised protein families can be elucidated experimentally, ab initio protein modelling can be used to predict a fold, allowing for structure-based function inferences. Such methods have made significant strides recently due to the availability of contact predictions, and they can now address larger targets than conventional fragment-assembly-based ab initio methods.
This study initially focusses on the structure and function of transmembrane proteins involved in the process of autophagosome construction, and demonstrates how covariance prediction data have multiple roles in modern structural bioinformatics: not only acting as restraints for model building and serving to validate the final models, but also predicting domain boundaries and revealing the presence of cryptic internal repeats not evidenced by sequence analysis. Furthermore, we characterised a contact map feature characteristic of a re-entrant helix, which may in future allow detection of this feature in other protein families. The recent innovations in computational structural biology were then employed further, giving rise to an opportunity to revise our current understanding of the structure and function of clinically important proteins. Through the modelling of the transmembrane Pfam families and subsequent mining of their structural libraries, we identified the human Oca2 protein as a protein of interest. Oca2 is located on mature melanosomal membranes, and mutations of Oca2 can result in a form of oculocutaneous albinism, which is the most prevalent and visually identifiable form of albinism. Sequence analysis predicts Oca2 to be a member of the SLC13 transporter family, but it has not been classified into any existing SLC families. The modelling of Oca2 with AlphaFold2 and other advanced methods shows that, like SLC13 members, it consists of a scaffold and transport domain and displays a pseudo inverted repeat topology that includes re-entrant loops. This finding contradicts the prevailing consensus view of its topology. In addition to the scaffold and transport domains, the presence of a cryptic GOLD domain is revealed; this domain possesses known glycosylation sites and is likely responsible for the trafficking of Oca2 from the endoplasmic reticulum to the Golgi prior to localisation at the melanosomes.
Analysis of the putative ligand-binding site of the model shows the presence of highly conserved key asparagine residues that suggest Oca2 may be a Na+/dicarboxylate symporter. Known critical pathogenic mutations map to structural features present in the repeat regions that form the transport domain. Exploiting the AlphaFold2 multimeric modelling protocol in combination with conventional homology modelling allowed the building of a plausible homodimer in both inward- and outward-facing conformations, supporting an elevator-type transport mechanism.

Item Type: Thesis (PhD)
Divisions: Faculty of Health and Life Sciences
Faculty of Health and Life Sciences > Tech, Infrastructure and Environmental Directorate
Depositing User: Symplectic Admin
Date Deposited: 30 Aug 2023 08:53
Last Modified: 30 Aug 2023 08:53
DOI: 10.17638/03171922
  • Rigden, Daniel
URI: https://livrepository.liverpool.ac.uk/id/eprint/3171922