Performance prediction of polymer-fullerene organic solar cells and data mining-assisted designing of new polymers.

Authors :
Xiao, Fei
Saqib, Muhammad
Razzaq, Soha
Mubashir, Tayyaba
Tahir, Mudassir Hussain
Moussa, Ihab Mohamed
El-ansary, Hosam O.
Source :
Journal of Molecular Modeling. Aug2023, Vol. 29 Issue 8, p1-12. 12p.
Publication Year :
2023

Abstract

Context: Selecting high-performance polymer materials for organic solar cells (OSCs) remains a compelling goal for improving device morphology, stability, and efficiency. Machine learning has been reported as a powerful set of techniques for solving complex problems and for guiding exploratory researchers as they screen, map, and develop high-performance materials. In the present work, we applied machine learning tools to screen data from reported studies and to design new polymer acceptor materials. Quantitative structure-activity relationship (QSAR) models were generated using machine learning-assisted simulation techniques. For this purpose, about 3000 molecular descriptors were generated, and the descriptors with a key effect on power conversion efficiency (PCE) were identified. Numerous regression models (e.g., random forest and bagging regressor models) were then developed to predict the PCE. New materials were designed on the basis of similarity analysis. The GDB17 chemical database, consisting of 166 million organic molecules in an ordered form, was used for the similarity analysis. The similarity between GDB17 materials and materials reported in the literature was studied using RDKit (a cheminformatics software package). Notably, 100 monomers proved to be unique and effective, and their PCEs were predicted. Four of these monomers exhibited a predicted PCE higher than 14%, which is better than the values in various reported studies. Our methodology provides a unique, time- and cost-efficient approach to screening and designing new polymers for OSCs using similarity analysis without revisiting the reported studies.

Methods: Data for the machine learning analysis were collected from reported studies and online databases. Molecular descriptors for the polymer materials were generated with Dragon software, using 3D structures of the studied molecules in structure data file (SDF) format as input. About 3000 molecular descriptors were generated and exported in comma-separated value (.csv) format. Univariate regression analysis was performed to shortlist the best descriptors, which were then used to train the machine learning models. The Python packages Matplotlib, NumPy, pandas, scikit-learn, seaborn, and SciPy were used for data analysis and visualization. Random forest and bagging regressor models were applied for the machine learning analysis, and the cheminformatics software RDKit was applied for the similarity analysis. [ABSTRACT FROM AUTHOR]
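
The abstract names two steps that are easy to illustrate in isolation: shortlisting descriptors by univariate regression, and comparing molecules by similarity. The sketch below is not the authors' code. It uses only the Python standard library as a stand-in for the scikit-learn and RDKit calls they report, and it assumes that "univariate regression" means ranking each descriptor by the R² of a one-variable least-squares fit against PCE, and that similarity is a Tanimoto coefficient over fingerprint bits. The function names, descriptor names, and numbers are all hypothetical.

```python
def r_squared(x, y):
    """R^2 of a one-variable least-squares fit (with intercept) of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column cannot explain any variance
    return (sxy * sxy) / (sxx * syy)


def rank_descriptors(table, target, top_k):
    """Return the names of the top_k descriptors, ordered by univariate R^2."""
    scores = {name: r_squared(column, target) for name, column in table.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two collections of 'on' fingerprint bits."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


# Hypothetical toy data: three descriptors for five polymers, plus their PCE (%).
descriptors = {
    "desc_a": [1.0, 2.0, 3.0, 4.0, 5.0],  # nearly linear in PCE
    "desc_b": [5.0, 1.0, 4.0, 2.0, 3.0],  # weakly related to PCE
    "desc_c": [7.0, 7.0, 7.0, 7.0, 7.0],  # constant, so uninformative
}
pce = [2.1, 3.9, 6.2, 8.0, 9.8]

shortlist = rank_descriptors(descriptors, pce, top_k=2)

# Hypothetical fingerprints: each set lists the indices of the bits that are on.
candidate_bits = {1, 2, 3}
reference_bits = {2, 3, 4}
similarity = tanimoto(candidate_bits, reference_bits)
```

In a workflow like the one described, the shortlisted columns would then be passed to the regression models (random forest and bagging regressors in the abstract), and the fingerprint bits would come from a cheminformatics toolkit such as RDKit instead of hand-written sets.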

Details

Language :
English
ISSN :
16102940
Volume :
29
Issue :
8
Database :
Academic Search Index
Journal :
Journal of Molecular Modeling
Publication Type :
Academic Journal
Accession number :
169995489
Full Text :
https://doi.org/10.1007/s00894-023-05677-3