251. RNA-FrameFlow: Flow Matching for de novo 3D RNA Backbone Design
- Author
Anand, Rishabh, Joshi, Chaitanya K., Morehead, Alex, Jamasb, Arian R., Harris, Charles, Mathis, Simon V., Didi, Kieran, Hooi, Bryan, and Liò, Pietro
- Subjects
Quantitative Biology - Biomolecules, Computer Science - Machine Learning, Quantitative Biology - Genomics
- Abstract
We introduce RNA-FrameFlow, the first generative model for 3D RNA backbone design. We build upon SE(3) flow matching for protein backbone generation and establish protocols for data preparation and evaluation to address unique challenges posed by RNA modeling. We formulate RNA structures as a set of rigid-body frames and associated loss functions which account for larger, more conformationally flexible RNA backbones (13 atoms per nucleotide) vs. proteins (4 atoms per residue). Toward tackling the lack of diversity in 3D RNA datasets, we explore training with structural clustering and cropping augmentations. Additionally, we define a suite of evaluation metrics to measure whether the generated RNA structures are globally self-consistent (via inverse folding followed by forward folding) and locally recover RNA-specific structural descriptors. The most performant version of RNA-FrameFlow generates locally realistic RNA backbones of 40-150 nucleotides, over 40% of which pass our validity criteria as measured by a self-consistency TM-score >= 0.45, at which two RNAs have the same global fold. Open-source code: https://github.com/rish-16/rna-backbone-design
- Comment
To be presented as an Oral at ICML 2024 Structured Probabilistic Inference & Generative Modeling Workshop, and a Spotlight at ICML 2024 AI4Science Workshop
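The validity criterion stated in the abstract (a self-consistency TM-score >= 0.45) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code. The function names are hypothetical, the scores are made-up placeholders, and taking the best score over several refolded sequences per backbone is an assumption made for illustration; only the 0.45 threshold comes from the abstract.

```python
from typing import Sequence

# Threshold taken from the abstract; everything else here is illustrative.
SC_TM_THRESHOLD = 0.45


def is_valid(sc_tm_scores: Sequence[float],
             threshold: float = SC_TM_THRESHOLD) -> bool:
    """Return True if the best self-consistency TM-score meets the threshold.

    Assumption: each generated backbone has one or more scTM values
    (e.g. one per inverse-folded sequence) and the maximum is used.
    """
    if not sc_tm_scores:
        return False
    return max(sc_tm_scores) >= threshold


def validity_fraction(per_backbone_scores: Sequence[Sequence[float]],
                      threshold: float = SC_TM_THRESHOLD) -> float:
    """Fraction of generated backbones that pass the validity criterion."""
    if not per_backbone_scores:
        return 0.0
    n_valid = sum(is_valid(scores, threshold) for scores in per_backbone_scores)
    return n_valid / len(per_backbone_scores)


if __name__ == "__main__":
    # Placeholder scores for four hypothetical backbones, not real results.
    example_scores = [[0.50, 0.20], [0.30], [0.45], [0.10, 0.44]]
    print(f"validity: {validity_fraction(example_scores):.2f}")
```

A backbone with no scores is treated as invalid here; that is a design choice of this sketch rather than something the abstract specifies.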
- Published
- 2024