Context: The development of efficient and practical hydrogen storage materials is crucial for advancing clean energy technologies. Zeolites are promising candidates because of their porous structure and tunable properties, but selecting and designing suitable zeolites for hydrogen storage remains challenging. This study addresses that problem with a data-guided approach that uses molecular simulations to evaluate zeolite performance. We analyze the adsorption behavior of hydrogen molecules in various zeolite structures and assess the frameworks in terms of hydrogen storage capacity, adsorption energy, and diffusion properties. Of the 233 zeolites investigated, Linde type A zeolite (LTA) had the highest hydrogen capacity, 4.8 wt%. We also investigate how mass (M), density (D), helium void fraction (HVF), accessible pore volume (APV), gravimetric surface area (GSA), and largest overall cavity diameter (Di) influence hydrogen storage performance. The results show that Di, D, and M have a negative effect on the percentage weight capacity, while GSA and VSA make the largest positive contributions. These findings show the importance of considering multiple factors when evaluating zeolites and demonstrate the potential of combining different computational methods to provide a more comprehensive understanding of materials.
The current study contributes to the understanding of zeolite-based materials for hydrogen storage applications, aiding in the development of more efficient and practical hydrogen storage systems.
Methods: Computational techniques were employed to investigate the hydrogen storage properties of zeolites. Molecular simulations were performed using classical force fields and molecular dynamics methods. The calculations were carried out at a force field level of theory with the GGA functional. To capture the thermodynamics and kinetics of hydrogen adsorption, sampling techniques such as Monte Carlo simulations and molecular dynamics with metadynamics were used. Hydrogen adsorption in the zeolite structures was modeled with Grand Canonical Monte Carlo (GCMC) simulations, using 10,000 Monte Carlo steps to equilibrate the system. The cutoff distance for particle interactions was 12.5 Å, and the energy calculations used a framework charge of 0.000 e per cell and a sorbate charge of 0.000 e. A simulation cell of 50 × 50 × 50 Å was chosen to mirror real-world conditions, and the lower and upper fugacity values were set to 1 and 10 atm to cover the range of gas pressures simulated. These GCMC simulations form the core of our hydrogen storage evaluation. DFT calculations were used to investigate the interactions between zeolites and hydrogen. Pseudopotentials, chosen in line with DFT norms and for compatibility with the basis sets, described electron behavior in the zeolite systems. The simulation cell was designed to replicate zeolite periodicity and eliminate boundary effects. Pre-geometry optimization was performed with HyperChem29, using strict convergence criteria to ensure stable conformations.
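The GCMC settings quoted above (10,000 Monte Carlo steps, a 12.5 Å cutoff, a 50 Å cubic cell, zero charges, and fugacities from 1 to 10 atm) can be illustrated with a minimal sketch. This is not the workflow used in the study: the box contains no zeolite framework, only insertion and deletion moves are attempted, and the temperature (298 K) and the Lennard-Jones parameters for H2 (`eps_k`, `sigma`) are assumptions that do not come from the text. The function and variable names are hypothetical.

```python
import math
import random

K_B = 1.380649e-23   # Boltzmann constant, J/K
ATM = 101325.0       # pascals per atmosphere
ANGSTROM3 = 1e-30    # cubic metres per cubic angstrom


def lj_energy(pos, others, box, eps_k, sigma, cutoff):
    """Lennard-Jones energy (in kelvin) of one particle with all others."""
    total = 0.0
    for other in others:
        r2 = 0.0
        for a, b in zip(pos, other):
            d = a - b
            d -= box * round(d / box)  # minimum-image convention
            r2 += d * d
        if 0.0 < r2 < cutoff * cutoff:
            s6 = (sigma * sigma / r2) ** 3
            total += 4.0 * eps_k * (s6 * s6 - s6)
    return total


def gcmc(fugacity_atm, temperature=298.0, box=50.0, cutoff=12.5,
         n_steps=10_000, eps_k=34.2, sigma=2.96, seed=0):
    """Return the average particle count over the second half of the run.

    temperature, eps_k (K) and sigma (angstrom) are assumed values.
    Charges are zero, so no electrostatic term is included.
    """
    rng = random.Random(seed)
    # Dimensionless group beta * f * V used in the acceptance rules.
    beta_f_v = fugacity_atm * ATM * box ** 3 * ANGSTROM3 / (K_B * temperature)
    particles = []
    samples = []
    for step in range(n_steps):
        if rng.random() < 0.5:
            # Insertion: accept with min(1, beta*f*V/(N+1) * exp(-beta*dU)).
            new = tuple(rng.uniform(0.0, box) for _ in range(3))
            du = lj_energy(new, particles, box, eps_k, sigma, cutoff)
            boltz = math.exp(min(700.0, -du / temperature))
            if rng.random() < beta_f_v / (len(particles) + 1) * boltz:
                particles.append(new)
        elif particles:
            # Deletion: accept with min(1, N/(beta*f*V) * exp(-beta*dU)).
            i = rng.randrange(len(particles))
            others = particles[:i] + particles[i + 1:]
            du = -lj_energy(particles[i], others, box, eps_k, sigma, cutoff)
            boltz = math.exp(min(700.0, -du / temperature))
            if rng.random() < len(particles) / beta_f_v * boltz:
                particles.pop(i)
        if step >= n_steps // 2:
            samples.append(len(particles))
    return sum(samples) / len(samples)


if __name__ == "__main__":
    for f in (1.0, 5.0, 10.0):
        print(f"fugacity {f:4.1f} atm: <N> = {gcmc(f):.1f}")
```

With `eps_k=0.0` the particles do not interact and the average count should approach the ideal-gas value βfV, which gives a quick sanity check on the acceptance rules. A production calculation would add the framework atoms, translation moves, and a validated force field.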
We used the 6-31+G(d) and LanL2DZ basis sets for light and heavy atoms, respectively, in line with field standards for computational efficiency and precision. A machine learning algorithm was used to rank the importance of structural features, namely mass (M), density (D), helium void fraction (HVF), accessible pore volume (APV), gravimetric surface area (GSA), and largest overall cavity diameter (Di), and to determine how they affect the capacity of the zeolites. Machine learning analysis was performed with Scikit-learn, an open-source Python library. We employed a range of machine learning models, including SVMs, random forests, and neural networks, primarily for data analysis and feature extraction. Pearson correlation analysis, a classical statistical technique, was used to evaluate the strength and direction of linear relationships between variables, complementing the machine learning models. Further quantum chemical calculations were performed to obtain the adsorption energy, global reactivity electronic descriptors, and natural bond orbital analysis, providing insight into the interaction of the zeolites with hydrogen. The simulations and data analysis were performed using BIOVIA Materials Studio, Gaussian, and OriginPro.
(© 2024. The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.)
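The feature-ranking step described in the Methods (Pearson correlation alongside a Scikit-learn model) could look roughly like the sketch below. It is only an illustration: the data are synthetic random numbers, the response function is invented so that the example has something to rank, and the choice of a random forest with 200 trees is an assumption rather than the configuration used in the study. All function names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Descriptor names taken from the text; the values below are synthetic.
FEATURES = ["M", "D", "HVF", "APV", "GSA", "Di"]


def make_synthetic_zeolites(n=233, seed=0):
    """Generate placeholder descriptors and an invented capacity response."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n, len(FEATURES)))
    # Invented relationship (not a result of the study): capacity rises
    # with GSA (column 4) and falls with D (column 1), plus small noise.
    y = 3.0 * X[:, 4] - 2.0 * X[:, 1] + rng.normal(0.0, 0.1, size=n)
    return X, y


def pearson_by_feature(X, y):
    """Pearson correlation coefficient between each descriptor and y."""
    return {name: float(np.corrcoef(X[:, j], y)[0, 1])
            for j, name in enumerate(FEATURES)}


def forest_importance(X, y, seed=0):
    """Impurity-based feature importances from a random forest regressor."""
    model = RandomForestRegressor(n_estimators=200, random_state=seed)
    model.fit(X, y)
    return dict(zip(FEATURES, model.feature_importances_.tolist()))


if __name__ == "__main__":
    X, y = make_synthetic_zeolites()
    r = pearson_by_feature(X, y)
    imp = forest_importance(X, y)
    for name in sorted(FEATURES, key=imp.get, reverse=True):
        print(f"{name:>3}: importance = {imp[name]:.3f}, Pearson r = {r[name]:+.2f}")
```

The two measures answer different questions: the Pearson coefficient gives the sign and strength of a linear trend, while the forest importance is unsigned and can also pick up nonlinear effects, which is one reason to report both.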