Deep learning and generative methods in cheminformatics and chemical biology: navigating small molecule space intelligently



Kell, Douglas B ORCID: 0000-0001-5838-7963, Samanta, Soumitra and Swainston, Neil ORCID: 0000-0001-7020-1236
(2020) Deep learning and generative methods in cheminformatics and chemical biology: navigating small molecule space intelligently. Biochemical Journal, 477 (23). pp. 4559-4580.


Abstract

The number of 'small' molecules that may be of interest to chemical biologists - chemical space - is enormous, but the fraction that has ever been made is tiny. Most strategies to date have been discriminative, i.e. they have addressed 'forward' problems (have molecule, establish properties). However, we normally wish to solve the much harder generative or inverse problem (describe desired properties, find molecule). 'Deep' (machine) learning based on large-scale neural networks underpins technologies such as computer vision, natural language processing, driverless cars, and world-leading performance in games such as Go; it can also be applied to the solution of inverse problems in chemical biology. In particular, recent developments in deep learning admit the in silico generation of candidate molecular structures and the prediction of their properties, thereby allowing one to navigate (bio)chemical space intelligently. These methods are revolutionary but require an understanding of both (bio)chemistry and computer science to be exploited to best advantage. We give a high-level (non-mathematical) background to the deep learning revolution, and set out the crucial issue for chemical biology and informatics as a two-way mapping between the discrete nature of individual molecules and the continuous but high-dimensional latent representation that may best reflect chemical space. A variety of architectures can do this; we focus on a particular type known as variational autoencoders. We then provide some examples of recent successes of these kinds of approach, and a look towards the future.
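
As an illustration only, the two-way mapping between discrete molecules and a continuous latent space can be sketched in a few lines of Python. This is a toy under stated assumptions, not the method of the article: the character alphabet, the fixed string length, the two-dimensional latent space, the single linear layers and the class name ToyVAE are all hypothetical, and the weights are random and untrained, so the decoded string is arbitrary. It shows only the shape of the idea: encode a SMILES string to a mean and log-variance, sample a latent point, and decode that point back to a string.

```python
import math
import random

# Hypothetical toy alphabet of SMILES characters; " " pads short strings.
ALPHABET = list("CNO()=1 ")
MAX_LEN = 8      # fixed string length (assumption)
LATENT_DIM = 2   # size of the continuous latent vector (assumption)


def one_hot(smiles):
    """Discrete molecule -> flat one-hot vector of length MAX_LEN * len(ALPHABET)."""
    padded = smiles.ljust(MAX_LEN)[:MAX_LEN]
    vec = []
    for ch in padded:
        vec.extend(1.0 if ch == a else 0.0 for a in ALPHABET)
    return vec


def random_matrix(rows, cols, rng):
    """Small random weights standing in for trained parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class ToyVAE:
    """Single-layer linear encoder and decoder with untrained random weights."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        n_in = MAX_LEN * len(ALPHABET)
        self.w_mean = random_matrix(LATENT_DIM, n_in, self.rng)
        self.w_logvar = random_matrix(LATENT_DIM, n_in, self.rng)
        self.w_dec = random_matrix(n_in, LATENT_DIM, self.rng)

    def encode(self, smiles):
        """Discrete -> continuous: mean and log-variance of the latent Gaussian."""
        x = one_hot(smiles)
        return matvec(self.w_mean, x), matvec(self.w_logvar, x)

    def sample(self, mean, logvar):
        """Reparameterisation: z = mean + sigma * eps, with eps ~ N(0, 1)."""
        return [m + math.exp(0.5 * lv) * self.rng.gauss(0.0, 1.0)
                for m, lv in zip(mean, logvar)]

    def decode(self, z):
        """Continuous -> discrete: pick the highest-scoring character per position."""
        logits = matvec(self.w_dec, z)
        k = len(ALPHABET)
        chars = []
        for i in range(MAX_LEN):
            block = logits[i * k:(i + 1) * k]
            chars.append(ALPHABET[block.index(max(block))])
        return "".join(chars).rstrip()


def kl_divergence(mean, logvar):
    """KL(N(mean, sigma^2) || N(0, 1)), the regulariser in the VAE loss."""
    return -0.5 * sum(1 + lv - m * m - math.exp(lv) for m, lv in zip(mean, logvar))


vae = ToyVAE()
mean, logvar = vae.encode("CC(=O)O")   # acetic acid as an example input
z = vae.sample(mean, logvar)           # a point in the continuous latent space
candidate = vae.decode(z)              # arbitrary string, since nothing is trained
```

In a real system the encoder and decoder would be deep networks trained to minimise reconstruction error plus the KL term, and decoded strings would need to be checked for chemical validity; neither step is attempted here.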

Item Type: Article
Uncontrolled Keywords: Computer Simulation, Deep Learning, Cheminformatics
Depositing User: Symplectic Admin
Date Deposited: 17 Dec 2020 14:22
Last Modified: 18 Jan 2023 23:06
DOI: 10.1042/BCJ20200781
Open Access URL: https://portlandpress.com/biochemj/article/477/23/...
URI: https://livrepository.liverpool.ac.uk/id/eprint/3110669