Experiences of Using WDumper to Create Topical Subsets from Wikidata

Seyed Amir Hosseini Beghaeiraveri, Alasdair J. G. Gray, Fiona J. McNeill

Research output: Contribution to journal › Conference article › peer-review

Abstract

Wikidata is a general-purpose knowledge graph covering a wide variety of topics with content being crowd-sourced through an open wiki. There are now over 90M interrelated data items in Wikidata which are accessible through a public query endpoint and data dumps. However, execution timeout limits and the size of data dumps make it difficult to use the data. The creation of arbitrary topical subsets of Wikidata, where only the relevant data is kept, would enable reuse of that data with the benefits of cost reduction, ease of access, and flexibility. In this paper, we provide a working definition for topical subsets over the Wikidata Knowledge Graph and evaluate a third-party tool (WDumper) to extract these topical subsets from Wikidata.
Original language: English
Article number: 13
Journal: CEUR Workshop Proceedings
Volume: 2873
Publication status: Published - 2 Jun 2021
Event: 2nd International Workshop On Knowledge Graph Construction: Co-located with the ESWC 2021 - online, Hersonissos, Greece
Duration: 6 Jun 2021 to 6 Jun 2021

Keywords

  • Knowledge graph subsetting
  • Topical subset
  • WDumper
  • Wikidata

ASJC Scopus subject areas

  • Computer Science (all)
