Issue 1, 2022

Improving the analysis of biological ensembles through extended similarity measures

Abstract

We present new algorithms to classify structural ensembles of macromolecules based on the recently proposed extended similarity measures. Molecular dynamics provides a wealth of structural information on systems of biological interest. As computer power increases, we capture larger ensembles and larger conformational transitions between states. Typically, structural clustering provides the statistical mechanics treatment of the system to identify relevant biological states. The key advantage of our approach is that the newly introduced extended similarity indices reduce the computational complexity of assessing the similarity of a set of structures from O(N2) to O(N). Here we take advantage of this favorable cost to develop several highly efficient techniques, including a linear-scaling algorithm to determine the medoid of a set (which we effectively use to select the most representative structure of a cluster). Moreover, we use our extended similarity indices as a linkage criterion in a novel hierarchical agglomerative clustering algorithm. We apply these new metrics to analyze the ensembles of several systems of biological interest such as folding and binding of macromolecules (peptide, protein, DNA–protein). In particular, we design a new workflow that is capable of identifying the most important conformations contributing to the protein folding process. We show excellent performance in the resulting clusters (surpassing traditional linkage criteria), along with faster performance and an efficient cost-function to identify when to merge clusters.

Graphical abstract: Improving the analysis of biological ensembles through extended similarity measures

Supplementary files

Article information

Article type
Paper
Submitted
01 Sep 2021
Accepted
27 Nov 2021
First published
29 Nov 2021

Phys. Chem. Chem. Phys., 2022,24, 444-451

Improving the analysis of biological ensembles through extended similarity measures

L. Chang, A. Perez and R. A. Miranda-Quintana, Phys. Chem. Chem. Phys., 2022, 24, 444 DOI: 10.1039/D1CP04019G

To request permission to reproduce material from this article, please go to the Copyright Clearance Center request page.

If you are an author contributing to an RSC publication, you do not need to request permission provided correct acknowledgement is given.

If you are the author of this article, you do not need to request permission to reproduce figures and diagrams provided correct acknowledgement is given. If you want to reproduce the whole article in a third-party publication (excluding your thesis/dissertation for which permission is not required) please go to the Copyright Clearance Center request page.

Read more about how to correctly acknowledge RSC content.

Social activity

Spotlight

Advertisements