Classification of helical polymers with deep-learning language models.

Authors :
Li, Daoyi
Jiang, Wen
Source :
Journal of Structural Biology. Dec2023, Vol. 215 Issue 4, pN.PAG-N.PAG. 1p.
Publication Year :
2023

Abstract

• HLM is a method that classifies helical filaments using 2D classification labels.
• HLM is the first method that applies language models to cryo-EM image processing.
• HLM is validated with many simulation tests and experimental cryo-EM datasets in EMPIAR.
• The Transformer-based pipeline is more robust than the traditional word2vec pipeline.
• HLM tests led to the discovery of a novel amyloid variant.

Many macromolecules in biological systems exist in the form of helical polymers. However, the inherent polymorphism and heterogeneity of samples complicate the reconstruction of helical polymers from cryo-EM images. Currently available 2D classification methods are effective at separating particles of interest from contaminants, but they do not effectively differentiate between polymorphs, resulting in heterogeneity in the 2D classes. As such, it is crucial to develop a method that can computationally divide a dataset of polymorphic helical structures into homogeneous subsets. In this work, we utilized deep-learning language models to embed the filaments as vectors in hyperspace and group them into clusters. Tests with both simulated and experimental datasets have demonstrated that our method, HLM (Helical classification with Language Model), can effectively distinguish different types of filaments in the presence of many contaminants and low signal-to-noise ratios. We also demonstrate that HLM can isolate homogeneous subsets of particles from a publicly available dataset, resulting in the discovery of a previously unreported filament variant with an extra density around the tau filaments. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
10478477
Volume :
215
Issue :
4
Database :
Academic Search Index
Journal :
Journal of Structural Biology
Publication Type :
Academic Journal
Accession number :
173973983
Full Text :
https://doi.org/10.1016/j.jsb.2023.108041