Horder, Joseph (ORCID: 0000-0002-5714-6655); Connor, Abbie; Duggan, Amy; Hale, Joshua; McDermott, Frederick; Norris, Luke (ORCID: 0009-0002-4842-8164); Whinney, Sophie JD; Mesdaghi, Shahram; Murphy, David; Simpkin, Adam (ORCID: 0000-0003-1883-9376); et al.
(2023)
Deep Learning-based structural and functional annotation of Pandoravirus hypothetical proteins
[Preprint]
Abstract
Giant viruses, including Pandoraviruses, contain large amounts of genomic ‘dark matter’ - genes encoding proteins of unknown function. New generation, deep learning-based protein structure modelling offers new opportunities to apply structure-based function inference to these sequences, often labelled as hypothetical proteins. However, the AlphaFold Protein Structure Database, a convenient resource covering the majority of UniProt, currently lacks models for most viral proteins. Here, we apply a panoply of predictive methods to protein structure predictions representative of large clusters of hypothetical proteins shared among four Pandoraviruses. In several cases, strong functional predictions can be made. Thus, we identify a likely nucleotidyltransferase putatively involved in viral tRNA maturation that has a BTB domain presumably involved in protein-protein interactions. We further identify a cluster of membrane channel sequences presenting three paralogous families which may, as seen in other giant viruses, induce host cell membrane depolarization. And we identify homologues of calcium-activated potassium channel beta subunits and pinpoint their likely Acanthamoeba cellular alpha subunit counterparts. Despite these successes, many other clusters remain cryptic, having folds that are either too functionally promiscuous or too novel to provide strong clues as to their role. These results suggest that significant structural and functional novelty remains to be uncovered in the giant virus proteomes.
| Item Type: | Preprint |
|---|---|
| Uncontrolled Keywords: | 3101 Biochemistry and Cell Biology, 3102 Bioinformatics and Computational Biology, 31 Biological Sciences, Genetics, Networking and Information Technology R&D (NITRD), Biotechnology, Infectious Diseases, Machine Learning and Artificial Intelligence, 2.1 Biological and endogenous factors |
| Divisions: | Faculty of Health & Life Sciences; Faculty of Health & Life Sciences > Inst. Systems, Molec & Integrative Biology > Inst. Systems, Molec & Integrative Biology |
| Depositing User: | Symplectic Admin |
| Date Deposited: | 15 Mar 2024 16:43 |
| Last Modified: | 15 Jan 2026 21:47 |
| DOI: | 10.1101/2023.12.02.569716 |
| Open Access URL: | https://www.biorxiv.org/content/10.1101/2023.12.02... |
| URI: | https://livrepository.liverpool.ac.uk/id/eprint/3179495 |