1. Large-scale prediction of collision cross-section with very deep graph convolutional network for small molecule identification.
- Author
- Xie, Ting; Yang, Qiong; Sun, Jinyu; Zhang, Hailiang; Wang, Yue; Zhang, Zhimin; and Lu, Hongmei
- Subjects
- *GRAPH neural networks, *ION mobility spectroscopy, *MASS spectrometry, *RF values (Chromatography), *ADRENAL glands
- Abstract
Ion mobility spectrometry (IMS) is a promising analytical technique for mass spectrometry (MS)-based compound identification, providing collision cross-section (CCS) values as an additional dimension with structural information. Here, GraphCCS was proposed to accurately predict CCS values and expand the coverage of CCS libraries. A new adduct encoding method was proposed to encode the SMILES strings and adduct types of compounds into adduct graphs. GraphCCS extended its predictive capability to ten different adduct types. A very deep graph convolutional network (GCN) with up to 40 GCN layers was built to predict CCS values from adduct graphs. A curated dataset with 12,775 experimental CCS values was used to train, validate, and test the GraphCCS model. The resulting CCS predictions achieved a median relative error (MedRE) of 0.94% and a coefficient of determination (R²) of 0.994 on the test set. Results on external test sets showed that GraphCCS outperformed AllCCS2, CCSbase, SigmaCCS, and DeepCCS. Based on the developed GraphCCS method, a large-scale in-silico database was built, including 2,394,468 CCS values. Those CCS values can be used to filter false positives, complementary to retention times and tandem mass spectra. Finally, the effectiveness of GraphCCS in assisting compound identification was tested on a mouse adrenal gland lipid dataset with 1,960 lipids. The results demonstrated that the in-silico CCS values combined with MS spectra and retention times can efficiently filter the false positive candidates.
• GraphCCS encodes different adduct ions by constructing adduct graphs as the inputs to the graph neural network.
• A very deep graph convolutional network with up to 40 GCN layers was trained to predict CCS values.
• GraphCCS can predict CCS values for different adduct types, different compound classes, and different IMS instruments.
• The predicted CCS can be complemented with retention time and m/z, thus effectively filtering out false positive compounds.
[ABSTRACT FROM AUTHOR]
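The two reported metrics and the CCS-based candidate filtering can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 3% tolerance, and the numeric values are all assumptions chosen for illustration, and the standard definitions of median relative error and R² are assumed.

```python
from statistics import median


def median_relative_error(measured, predicted):
    """Median of |predicted - measured| / measured, as a percentage."""
    return 100.0 * median(abs(p - m) / m for m, p in zip(measured, predicted))


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def filter_by_ccs(candidates, measured_ccs, tolerance_pct=3.0):
    """Keep candidates whose predicted CCS is within tolerance_pct of the
    measured CCS. The 3% default is an assumed, hypothetical threshold."""
    return [
        name
        for name, ccs in candidates
        if abs(ccs - measured_ccs) / measured_ccs * 100.0 <= tolerance_pct
    ]


# Made-up CCS values in square angstroms; not data from the paper.
measured = [150.0, 200.0, 250.0]
predicted = [151.5, 198.0, 250.0]
medre = median_relative_error(measured, predicted)
r2 = r_squared(measured, predicted)

# Hypothetical candidate list: (name, predicted CCS).
kept = filter_by_ccs([("candidate_A", 201.0), ("candidate_B", 215.0)], 200.0)
```

In practice such a CCS filter would be applied together with m/z and retention-time matching, as the abstract describes, rather than on its own.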
- Published
- 2024