Persistent Dirac for molecular representation
- Publication Year :
- 2023
Abstract
- Molecular representations are of fundamental importance for the modeling and analysis of molecular systems. Representation models and, more generally, approaches based on topological data analysis (TDA) have demonstrated great success in various steps of drug design and materials discovery. Here we develop a mathematically rigorous computational framework for molecular representation based on the persistent Dirac operator. The properties of the spectrum of the discrete weighted and unweighted Dirac matrices are systematically discussed and used to demonstrate the geometric and topological properties of both non-homology and homology eigenvectors of real molecular structures. This allows us to assess the influence of weighting schemes on the information encoded in the Dirac eigenspectrum. A series of physical persistent attributes, which characterize the spectrum of the Dirac matrices across a filtration, are proposed and used as efficient molecular fingerprints. Finally, our persistent Dirac-based model is used for clustering molecular configurations from nine types of organic-inorganic halide perovskites. We found that our model can cluster the structures very well, demonstrating the representation and featurization power of the current approach.
- Comment: 22 pages, 7 figures
- Subjects :
- Quantitative Biology - Biomolecules
Details
- Database :
- arXiv
- Publication Type :
- Report
- Accession number :
- edsarx.2302.02386
- Document Type :
- Working Paper