Reference Statistics in Wikidata Topical Subsets

Seyed Amir Hosseini Beghaeiraveri, Alasdair J. G. Gray, Fiona McNeill

Research output: Contribution to journal, Conference article, peer-review



Wikidata is the only general-purpose open knowledge graph with the capability of specifying references for every single statement. Currently, about 68% of Wikidata statements have at least one reference, but the quality of these references is rarely covered in data quality studies. There is also no comprehensive framework for evaluating references. In this paper, we investigate the statistics of Wikidata references in six topical subsets of Wikidata. We compare these statistics across two Wikidata dumps: one from 2016 and one from 2021.
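
As an illustration of the kind of statistic quoted in the abstract (the share of statements with at least one reference), the following is a minimal sketch, not the authors' method. It assumes entities shaped like the Wikidata JSON format, where `claims` maps property IDs to lists of statements and a referenced statement carries a non-empty `references` list. The function name `reference_coverage` and the toy `sample` data are hypothetical.

```python
from typing import Iterable


def reference_coverage(entities: Iterable[dict]) -> float:
    """Return the fraction of statements that carry at least one reference."""
    total = 0
    referenced = 0
    for entity in entities:
        # "claims" maps property IDs (e.g. "P31") to lists of statements.
        for statements in entity.get("claims", {}).values():
            for statement in statements:
                total += 1
                # Treat a missing or empty "references" list as unreferenced.
                if statement.get("references"):
                    referenced += 1
    return referenced / total if total else 0.0


# Toy, made-up entities for illustration only (not real Wikidata content).
sample = [
    {
        "id": "Q1",
        "claims": {
            "P31": [{"references": [{"snaks": {}}]}, {}],
            "P279": [{"references": []}],
        },
    },
    {"id": "Q2", "claims": {"P31": [{"references": [{"snaks": {}}]}]}},
]

coverage = reference_coverage(sample)
print(f"{coverage:.0%} of statements have at least one reference")
```

Applied to a topical subset, the same count could be repeated per dump to compare coverage between two snapshots; how subsets are extracted is outside the scope of this sketch.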
Original language: English
Article number: 3
Journal: CEUR Workshop Proceedings
Publication status: Published - 14 Oct 2021
Event: 2nd Wikidata Workshop 2021
Duration: 24 Oct 2021 → 24 Oct 2021

Keywords


  • Data quality
  • Gene Wiki
  • Reference quality
  • Topical subset
  • WikiProject
  • Wikidata

ASJC Scopus subject areas

  • Computer Science (all)

