Data Quality Issues in Current Nanopublications

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Nanopublications are a granular way of publishing scientific claims together with their associated provenance and publication information. More than 10 million nanopublications have been published by a handful of researchers covering a wide range of topics within the life sciences. We were motivated to replicate an existing analysis of these nanopublications, but then went deeper into the structure of the existing nanopublications. In this paper, we analyse the usage of nanopublications by investigating the distribution of triples in each part and discuss the data quality issues that were subsequently revealed. We argue that there is a need for the community to develop a set of guidelines for the modelling of nanopublications.
Original language: English
Title of host publication: 2019 IEEE 15th International Conference on e-Science (e-Science)
Publisher: IEEE
Publication status: Accepted/In press - 29 Jul 2019

Keywords

  • Semantic Publication
  • Nanopublication
  • Reproducibility
  • Provenance
  • Linked Data
  • Data Quality

Cite this

Asif, I., Chen-Burger, J., & Gray, A. J. G. (Accepted/In press). Data Quality Issues in Current Nanopublications. In 2019 IEEE 15th International Conference on e-Science (e-Science). IEEE.
Asif, Imran ; Chen-Burger, Jessica ; Gray, Alasdair J. G. / Data Quality Issues in Current Nanopublications. 2019 IEEE 15th International Conference on e-Science (e-Science). IEEE, 2019.
@inproceedings{4080f18693464efba800f0682d59ed67,
title = "Data Quality Issues in Current Nanopublications",
abstract = "Nanopublications are a granular way of publishing scientific claims together with their associated provenance and publication information. More than 10 million nanopublications have been published by a handful of researchers covering a wide range of topics within the life sciences. We were motivated to replicate an existing analysis of these nanopublications, but then went deeper into the structure of the existing nanopublications. In this paper, we analyse the usage of nanopublications by investigating the distribution of triples in each part and discuss the data quality issues that were subsequently revealed. We argue that there is a need for the community to develop a set of guidelines for the modelling of nanopublications.",
keywords = "Semantic Publication, Nanopublication, Reproducibility, Provenance, Linked Data, Data Quality",
author = "Imran Asif and Jessica Chen-Burger and Gray, {Alasdair J. G.}",
year = "2019",
month = "7",
day = "29",
language = "English",
booktitle = "2019 IEEE 15th International Conference on e-Science (e-Science)",
publisher = "IEEE",
address = "United States",

}

Asif, I, Chen-Burger, J & Gray, AJG 2019, Data Quality Issues in Current Nanopublications. in 2019 IEEE 15th International Conference on e-Science (e-Science). IEEE.

Data Quality Issues in Current Nanopublications. / Asif, Imran; Chen-Burger, Jessica; Gray, Alasdair J. G.

2019 IEEE 15th International Conference on e-Science (e-Science). IEEE, 2019.

TY - GEN

T1 - Data Quality Issues in Current Nanopublications

AU - Asif, Imran

AU - Chen-Burger, Jessica

AU - Gray, Alasdair J. G.

PY - 2019/7/29

Y1 - 2019/7/29

AB - Nanopublications are a granular way of publishing scientific claims together with their associated provenance and publication information. More than 10 million nanopublications have been published by a handful of researchers covering a wide range of topics within the life sciences. We were motivated to replicate an existing analysis of these nanopublications, but then went deeper into the structure of the existing nanopublications. In this paper, we analyse the usage of nanopublications by investigating the distribution of triples in each part and discuss the data quality issues that were subsequently revealed. We argue that there is a need for the community to develop a set of guidelines for the modelling of nanopublications.

KW - Semantic Publication

KW - Nanopublication

KW - Reproducibility

KW - Provenance

KW - Linked Data

KW - Data Quality

UR - https://doi.org/10.5281/zenodo.3358983

M3 - Conference contribution

BT - 2019 IEEE 15th International Conference on e-Science (e-Science)

PB - IEEE

ER -

Asif I, Chen-Burger J, Gray AJG. Data Quality Issues in Current Nanopublications. In 2019 IEEE 15th International Conference on e-Science (e-Science). IEEE. 2019