Machine-learning-guided identification of protein secondary structures using spectral and structural descriptors

Abstract

Interrogating the secondary structures of proteins is essential for designing and engineering more effective and safer protein-based biomaterials and other classes of theranostic materials. Protein secondary structures are commonly assessed using circular dichroism spectroscopy, followed by downstream analysis using specialized software. Because many proteins have complex secondary structures beyond the typical α-helix and β-sheet configurations, and because the derived secondary structural contents are significantly influenced by the choice of software, estimates obtained through conventional methods may be less reliable. Herein, we propose a machine-learning-based approach to improve the accuracy and reliability of the classification of protein secondary structures. Specifically, we leverage supervised machine learning to analyze the circular dichroism spectra and relevant attributes of 112 proteins to predict their secondary structures. Based on a range of spectral, structural, and molecular features, we systematically evaluate the predictive performance of numerous supervised classifiers and identify optimal combinations of algorithms and descriptors to achieve highly accurate and precise estimations of protein secondary structures. We anticipate that this work will offer deeper insight into the development of machine-learning-based approaches to streamline the delineation of protein structures for different biological and biomedical applications.
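The evaluation strategy described in the abstract, scoring every pairing of a classifier with a descriptor set and keeping the best, can be illustrated with a toy sketch. The abstract does not name the classifiers, descriptors, or validation scheme that were used, so everything below is an assumption made for illustration: the data are synthetic, the two classifiers (nearest centroid and 1-nearest-neighbour) are simple hand-written stand-ins, leave-one-out cross-validation is chosen arbitrarily, and all names (`make_sample`, `DESCRIPTOR_SETS`, `loo_accuracy`) are hypothetical.

```python
import random
from itertools import product
from math import dist

random.seed(0)  # fixed seed so the sketch is reproducible

# Synthetic stand-in data (NOT from the paper): each sample carries a toy
# "spectral" vector, a toy "structural" vector, and a class label.
def make_sample(label):
    centre = 1.0 if label == "helix" else -1.0
    return {
        "spectral": [random.gauss(centre, 0.5) for _ in range(3)],
        "structural": [random.gauss(centre, 0.5) for _ in range(2)],
        "label": label,
    }

samples = [make_sample(lab) for lab in ("helix", "sheet") for _ in range(20)]

# Hypothetical descriptor sets: each maps a sample to a feature vector.
DESCRIPTOR_SETS = {
    "spectral": lambda s: s["spectral"],
    "structural": lambda s: s["structural"],
    "spectral+structural": lambda s: s["spectral"] + s["structural"],
}

def nearest_centroid(train_X, train_y, x):
    """Predict the label whose class mean is closest to x."""
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(centroids, key=lambda lab: dist(centroids[lab], x))

def one_nearest_neighbour(train_X, train_y, x):
    """Predict the label of the single closest training sample."""
    best = min(range(len(train_X)), key=lambda i: dist(train_X[i], x))
    return train_y[best]

CLASSIFIERS = {
    "nearest-centroid": nearest_centroid,
    "1-NN": one_nearest_neighbour,
}

def loo_accuracy(classify, featurise, data):
    """Leave-one-out accuracy: hold out each sample once, train on the rest."""
    correct = 0
    for i, held_out in enumerate(data):
        train = data[:i] + data[i + 1:]
        X = [featurise(s) for s in train]
        y = [s["label"] for s in train]
        correct += classify(X, y, featurise(held_out)) == held_out["label"]
    return correct / len(data)

# Score every (classifier, descriptor set) combination and pick the best.
results = {
    (clf_name, desc_name): loo_accuracy(clf, feat, samples)
    for (clf_name, clf), (desc_name, feat)
    in product(CLASSIFIERS.items(), DESCRIPTOR_SETS.items())
}
best = max(results, key=results.get)

for combo, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(combo, round(acc, 3))
print("best combination:", best)
```

In a real analysis the synthetic samples would be replaced by measured circular dichroism spectra and curated descriptors, and the stand-in classifiers by established library implementations; the grid-over-combinations structure is the only part this sketch is meant to convey.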

Article information

Article type
Paper
Submitted
30 Jan 2025
Accepted
21 Apr 2025
First published
29 Apr 2025
This article is Open Access
Creative Commons BY-NC license

Biomater. Sci., 2025, Advance Article

Z. Wang and Kenry, Biomater. Sci., 2025, Advance Article, DOI: 10.1039/D5BM00153F

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.
