Unifying sequence-structure coding for advanced protein engineering via a multimodal diffusion transformer

Xiaohan Lin; Zhenyu Chen; Yanheng Li; Zicheng Ma; Chuanliu Fan; Ziqiang Cao; Shihao Feng; Jun Zhang; Yi Qin Gao

doi:10.1039/D5SC02055G

Xiaohan Lin,‡^a Zhenyu Chen,‡^a Yanheng Li,‡^a Zicheng Ma,^bc Chuanliu Fan,^d Ziqiang Cao,^d Shihao Feng,*^b Jun Zhang*^b and Yi Qin Gao*^ab

Author affiliations

* Corresponding authors

^a Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
E-mail: gaoyq@pku.edu.cn

^b Changping Laboratory, Beijing 102200, China
E-mail: fengsh@cpl.ac.cn, jzhang@cpl.ac.cn

^c Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China

^d Institute of Artificial Intelligence, Soochow University, Suzhou 215006, China

Abstract

Modern protein engineering demands integrated sequence–structure representations to tackle key challenges in designing, modifying, and evolving proteins for specific functions. While sequence-based methods are promising for generating novel proteins, incorporating structure-oriented information improves the success rate and helps target the corresponding functions. A consensus strategy is therefore essential, rather than reliance on sequence-based or structure-based approaches alone. Here, we introduce ProTokens, machine-learned “amino acids” derived from structural databases via self-supervised learning, which provide a compact yet information-rich representation that bridges the sequence and structure modalities. Instead of treating sequences and structures separately, we build PT-DiT, a multimodal diffusion transformer-based model that integrates both into a unified representation. This enables protein engineering in a joint sequence–structure space, which streamlines the design process and facilitates the efficient encoding of 3D folds, contextual protein design, sampling of metastable states, and directed evolution for diverse objectives. As a unified solution for in silico protein engineering, PT-DiT leverages sequence and structure insights to realize functional protein design.

Article information

DOI: https://doi.org/10.1039/D5SC02055G
Article type: Edge Article
Submitted: 16 Mar 2025
Accepted: 14 May 2025
First published: 15 May 2025
This article is Open Access

All publication charges for this article have been paid for by the Royal Society of Chemistry

Chem. Sci., 2025, Advance Article

X. Lin, Z. Chen, Y. Li, Z. Ma, C. Fan, Z. Cao, S. Feng, J. Zhang and Y. Q. Gao, Chem. Sci., 2025, Advance Article, DOI: 10.1039/D5SC02055G

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.

Chemical Science