
ydata-profiling: Accelerating data-centric AI with high-quality data.

Authors :
Clemente, Fabiana
Ribeiro, Gonçalo Martins
Quemy, Alexandre
Santos, Miriam Seoane
Pereira, Ricardo Cardoso
Barros, Alex
Source :
Neurocomputing. Oct2023, Vol. 554, pN.PAG-N.PAG. 1p.
Publication Year :
2023

Abstract

ydata-profiling is an open-source Python package for advanced exploratory data analysis that enables users to generate data profiling reports in a simple, fast, and efficient manner, fostering a standardized and visual understanding of the data. Beyond traditional descriptive properties and statistics, ydata-profiling follows a Data-Centric AI approach to exploratory analysis, as it focuses on the automatic detection and highlighting of complex data characteristics often associated with potential data quality issues, such as high ratios of missing or imbalanced data, infinite, unique, or constant values, skewness, high correlation, high cardinality, non-stationarity, seasonality, duplicate records, and other inconsistencies. The source code, documentation, and examples are available in the GitHub repository: https://github.com/ydataai/ydata-profiling. [ABSTRACT FROM AUTHOR]
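To make the kinds of checks named in the abstract concrete, the following is a minimal standard-library sketch of a few of them (missing-value ratio, constant columns, unique columns, duplicate records). It is not ydata-profiling's implementation or API: the function names `profile_alerts` and `count_duplicate_rows`, the `missing_threshold` parameter, the alert labels, and the sample data are all hypothetical and chosen for illustration only.

```python
from collections import Counter
from typing import Any


def profile_alerts(
    rows: list[dict[str, Any]], missing_threshold: float = 0.5
) -> dict[str, list[str]]:
    """Map each column name to a list of illustrative alert labels.

    Hypothetical helper: a value of None is treated as missing, and
    all values are assumed to be hashable.
    """
    alerts: dict[str, list[str]] = {}
    n = len(rows)
    # Collect every column name that appears in any row.
    columns = sorted({key for row in rows for key in row})
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        distinct = set(present)
        labels: list[str] = []
        # High ratio of missing values.
        if n and (n - len(present)) / n > missing_threshold:
            labels.append("missing")
        # Every non-missing value is identical.
        if present and len(distinct) == 1:
            labels.append("constant")
        # Every non-missing value is different.
        if len(present) > 1 and len(distinct) == len(present):
            labels.append("unique")
        if labels:
            alerts[col] = labels
    return alerts


def count_duplicate_rows(rows: list[dict[str, Any]]) -> int:
    """Count rows that repeat an earlier row exactly (hypothetical helper)."""
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    return sum(c - 1 for c in counts.values())


# Made-up sample data, not taken from the paper.
sample = [
    {"id": 1, "country": "PT", "score": None},
    {"id": 2, "country": "PT", "score": None},
    {"id": 3, "country": "PT", "score": None},
    {"id": 4, "country": "PT", "score": 0.7},
    {"id": 5, "country": "PT", "score": 0.7},
    {"id": 6, "country": "PT", "score": 0.9},
]
sample_alerts = profile_alerts(sample, missing_threshold=0.4)
sample_duplicates = count_duplicate_rows([{"a": 1}, {"a": 1}, {"a": 2}])
```

The sketch covers only a small subset of the characteristics listed in the abstract; checks such as skewness, correlation, non-stationarity, and seasonality would need statistical routines that are out of scope here.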

Details

Language :
English
ISSN :
0925-2312
Volume :
554
Database :
Academic Search Index
Journal :
Neurocomputing
Publication Type :
Academic Journal
Accession number :
170047139
Full Text :
https://doi.org/10.1016/j.neucom.2023.126585