Exploring unconventional approaches to Molecular Replacement in X-ray crystallography with SIMBAD



Simpkin, Adam (2020) Exploring unconventional approaches to Molecular Replacement in X-ray crystallography with SIMBAD. PhD thesis, University of Liverpool.


Abstract

Molecular replacement (MR) is the most popular technique for solving the phase problem in macromolecular crystallography. The conventional approach to finding search models for MR is to use the sequence of the target structure to identify a suitable homologue. This approach rests on the assumption that sequence similarity is a useful guide to structural similarity. Whilst this assumption largely holds, the strategy is not always effective, for example when a contaminant protein has been crystallised or when the closest matches by sequence are not the closest matches by structure. This thesis describes the development of SIMBAD, a three-step pipeline for sequence-independent MR. The first step performs a lattice-parameter search against the entire Protein Data Bank (PDB), rapidly determining whether the protein or a close homologue has already been solved in the same crystal form. The second step screens the data against a database of known contaminants, thereby determining whether a contaminant protein has been crystallised. The final step is a brute-force search of a non-redundant derivative of the PDB provided by the MoRDa software package. Chapter 3 presents the initial implementation of SIMBAD, which uses AMoRe’s fast rotation function, with encouraging results. Testing on a set of structures covering a wide range of resolution limits, numbers of copies in the asymmetric unit, space groups, monomer sizes and secondary-structure types gave a 40% success rate with the full MoRDa database search, rising to 52% when combined with the lattice-parameter search. Further validation has come in the form of nine structures deposited in the PDB that used SIMBAD for structure solution. Building on the work in Chapter 3, research was carried out into whether the maximum-likelihood-enhanced rotation function in Phaser would improve the sensitivity of the full MoRDa database search. The results, presented in Chapter 4, show that the use of Phaser yielded a 60% success rate on the test cases, a marked improvement on the previous iteration of SIMBAD. Combining this method with ensemble search models raised the success rate further, to 68%. Lastly, Chapter 5 explores the use of anomalous Fourier maps (AFMs) to validate partial MR solutions obtained from SIMBAD. This was necessary because the absence of sequence information meant that automated model building could not be included in the pipeline as a means of testing the correctness of a potential solution. The findings in Chapter 5 demonstrate that, when anomalous signal was available, the maximum peak height in the AFMs could be combined with R-free to train a classifier with 99% precision and recall.
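
For readers unfamiliar with the two metrics quoted for the Chapter 5 classifier, the following is a minimal sketch of how precision and recall are computed from binary labels. It is not taken from the thesis or from SIMBAD: the function name `precision_recall` and the example labels are hypothetical, and the convention that 1 marks a correct MR solution is an assumption made for illustration.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels.

    Assumed convention (not from the thesis): 1 = correct MR solution,
    0 = incorrect MR solution.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    # Precision: fraction of predicted-correct solutions that really are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of truly correct solutions that the classifier found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up labels for illustration only; these are not data from the thesis.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under this reading, 99% precision would mean that almost every solution the classifier accepts is genuinely correct, and 99% recall that almost no correct solution is rejected.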

Item Type: Thesis (PhD)
Divisions: Fac of Health & Life Sciences > Institute of Integrative Biology
Depositing User: Symplectic Admin
Date Deposited: 08 Apr 2020 14:23
Last Modified: 03 Mar 2021 10:03
DOI: 10.17638/03072448
Supervisors:
URI: https://livrepository.liverpool.ac.uk/id/eprint/3072448