Towards Optimal Grammars for RNA Structures



Onokpasa, Evarista, Wild, Sebastian (ORCID: 0000-0002-6061-9177) and Wong, Prudence WH (ORCID: 0000-0001-7935-7245)
(2024) Towards Optimal Grammars for RNA Structures. In: 2024 Data Compression Conference (DCC), 19-22 March 2024.

optimal-grammars.pdf - Author Accepted Manuscript
Available under License Creative Commons Attribution.


Abstract

In past work (Onokpasa, Wild, Wong, DCC 2023), we showed that (a) stochastic context-free grammars are the best known compressors for joint compression of RNA sequence and structure, and (b) grammars that compress better also perform better in ab initio structure prediction. Previous grammars were manually curated by human experts. In this work, we develop a framework for algorithms that search automatically and systematically for stochastic grammars with better compression (and prediction) ability for RNA. We perform an exhaustive search of small grammars and identify grammars that surpass the performance of human-expert grammars.

Item Type: Conference or Workshop Item (Unspecified)
Uncontrolled Keywords: 31 Biological Sciences, 3102 Bioinformatics and Computational Biology
Divisions: Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science
Depositing User: Symplectic Admin
Date Deposited: 12 Jun 2024 08:47
Last Modified: 08 Nov 2024 12:55
DOI: 10.1109/dcc58796.2024.00041
URI: https://livrepository.liverpool.ac.uk/id/eprint/3182157