1. Wikidata Completeness Profiling Using ProWD
- Author
Simon Razniewski, Avicenna Wisesa, Fariz Darari, Adila Krisnadhi, and Werner Nutt
- Subjects
Information retrieval; Computer science; 010401 analytical chemistry; 02 engineering and technology; computer.file_format; 01 natural sciences; 0104 chemical sciences; Data profiling; 020204 information systems; Falsity; Data quality; 0202 electrical engineering, electronic engineering, information engineering; SPARQL; Profiling (information science); RDF; Date of birth; Completeness (statistics); computer
- Abstract
Completeness is a crucial data quality aspect that deals with the question: do we have all the data we need? A lack of awareness of the completeness state of a knowledge graph (KG) may result in bias or even falsity in any decisions made based on the KG. Given a KG, one may wonder how its completeness varies across different topics. In this paper, we present ProWD, a framework and tool for profiling the completeness of Wikidata, a central KG on the (Semantic) Web that is open and free to use. ProWD measures the degree of completeness based on Class-Facet-Attribute (CFA) profiles. A class denotes a collection of entities, which can be divided along multiple facets, allowing attribute completeness to be analyzed and compared, e.g., how does the completeness of the attributes "educated at" and "date of birth" compare between male, German computer scientists and female, Indonesian computer scientists? ProWD generates summaries and visualizations for such analysis, giving insights into the KG completeness. ProWD is available online at http://prowd.id.
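The Class-Facet-Attribute idea in the abstract can be illustrated with a minimal sketch. It is not ProWD's implementation: it assumes completeness is simply the fraction of entities in a class/facet group that have any value for an attribute, it works on a small in-memory list rather than on Wikidata, and the names `matches`, `completeness_profile`, and `TOY_ENTITIES` as well as the toy records are hypothetical.

```python
from typing import Dict, List, Optional

# An entity is a flat mapping from attribute name to value (None = missing).
Entity = Dict[str, Optional[str]]


def matches(entity: Entity, cls: str, facets: Dict[str, str]) -> bool:
    """True if the entity belongs to the class and has every facet value."""
    if entity.get("class") != cls:
        return False
    return all(entity.get(key) == value for key, value in facets.items())


def completeness_profile(
    entities: List[Entity],
    cls: str,
    facets: Dict[str, str],
    attributes: List[str],
) -> Dict[str, float]:
    """Fraction of entities in the class/facet group that have each attribute.

    Assumption: an attribute counts as complete for an entity if any
    value is present. An empty group yields 0.0 for every attribute.
    """
    group = [e for e in entities if matches(e, cls, facets)]
    if not group:
        return {attr: 0.0 for attr in attributes}
    return {
        attr: sum(1 for e in group if e.get(attr) is not None) / len(group)
        for attr in attributes
    }


# Made-up toy data, for illustration only (not real Wikidata content).
TOY_ENTITIES: List[Entity] = [
    {"class": "computer scientist", "gender": "male", "citizenship": "Germany",
     "date of birth": "known", "educated at": "known"},
    {"class": "computer scientist", "gender": "male", "citizenship": "Germany",
     "date of birth": "known"},
    {"class": "computer scientist", "gender": "female", "citizenship": "Indonesia",
     "educated at": "known"},
    {"class": "computer scientist", "gender": "female", "citizenship": "Indonesia"},
]

ATTRIBUTES = ["educated at", "date of birth"]

profile_a = completeness_profile(
    TOY_ENTITIES, "computer scientist",
    {"gender": "male", "citizenship": "Germany"}, ATTRIBUTES)
profile_b = completeness_profile(
    TOY_ENTITIES, "computer scientist",
    {"gender": "female", "citizenship": "Indonesia"}, ATTRIBUTES)

if __name__ == "__main__":
    print("male, German:", profile_a)
    print("female, Indonesian:", profile_b)
```

Comparing the two dictionaries attribute by attribute gives the kind of side-by-side view the abstract describes. A real profile would draw its entities from Wikidata, for example through SPARQL queries, rather than from a hand-written list.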
- Published
- 2019