Interoperability and FAIRness through a novel combination of Web technologies

Mark D. Wilkinson, Ruben Verborgh, Luiz Olavo Bonino da Silva Santos, Tim Clark, Morris A. Swertz, Fleur D. L. Kelpin, Alasdair J. G. Gray, Erik A. Schultes, Erik M. van Mulligen, Paolo Ciccarese, Arnold Kuzniar, Anand Gavai, Mark Thompson, Rajaram Kaliyaperumal, Jerven T. Bolleman, Michel Dumontier

Research output: Contribution to journal (Article)

Abstract

Data in the life sciences are extremely diverse and are stored in a broad spectrum of repositories ranging from those designed for particular data types (such as KEGG for pathway data or UniProt for protein data) to those that are general-purpose (such as FigShare, Zenodo, Dataverse or EUDAT). These data have widely different levels of sensitivity and security considerations. For example, clinical observations about genetic mutations in patients are highly sensitive, while observations of species diversity are generally not. The lack of uniformity in data models from one repository to another, and in the richness and availability of metadata descriptions, makes integration and analysis of these data a manual, time-consuming task with no scalability. Here we explore a set of resource-oriented Web design patterns for data discovery, accessibility, transformation, and integration that can be implemented by any general- or special-purpose repository as a means to assist users in finding and reusing their data holdings. We show that by using off-the-shelf technologies, interoperability can be achieved at the level of an individual spreadsheet cell. We note that the behaviours of this architecture compare favourably to the desiderata defined by the FAIR Data Principles, and can therefore represent an exemplar implementation of those principles. The proposed interoperability design patterns may be used to improve discovery and integration of both new and legacy data, maximizing the utility of all scholarly outputs.
Language: English
Article number: e110
Number of pages: 34
Journal: PeerJ Computer Science
Volume: 3
DOI: https://doi.org/10.7717/peerj-cs.110
Publication status: Published - 24 Apr 2017

Keywords

  • FAIR Data
  • Interoperability
  • Data Integration
  • Semantic Web
  • Linked Data
  • REST

Cite this

Wilkinson, M. D., Verborgh, R., Olavo Bonino da Silva Santos, L., Clark, T., Swertz, M. A., Kelpin, F. D. L., ... Dumontier, M. (2017). Interoperability and FAIRness through a novel combination of Web technologies. PeerJ Computer Science, 3, [e110]. https://doi.org/10.7717/peerj-cs.110
Wilkinson, Mark D. ; Verborgh, Ruben ; Olavo Bonino da Silva Santos, Luiz ; Clark, Tim ; Swertz, Morris A. ; Kelpin, Fleur D. L. ; Gray, Alasdair J. G. ; Schultes, Erik A. ; van Mulligen, Erik M. ; Ciccarese, Paolo ; Kuzniar, Arnold ; Gavai, Anand ; Thompson, Mark ; Kaliyaperumal, Rajaram ; Bolleman, Jerven T. ; Dumontier, Michel. / Interoperability and FAIRness through a novel combination of Web technologies. In: PeerJ Computer Science. 2017 ; Vol. 3.
@article{c369773e891047df8448340ce2665148,
title = "Interoperability and FAIRness through a novel combination of Web technologies",
keywords = "FAIR Data, Interoperability, Data Integration, Semantic Web, Linked Data, REST",
author = "Wilkinson, {Mark D.} and Ruben Verborgh and {Olavo Bonino da Silva Santos}, Luiz and Tim Clark and Swertz, {Morris A.} and Kelpin, {Fleur D. L.} and Gray, {Alasdair J. G.} and Schultes, {Erik A.} and {van Mulligen}, {Erik M.} and Paolo Ciccarese and Arnold Kuzniar and Anand Gavai and Mark Thompson and Rajaram Kaliyaperumal and Bolleman, {Jerven T.} and Michel Dumontier",
year = "2017",
month = "4",
day = "24",
doi = "10.7717/peerj-cs.110",
language = "English",
volume = "3",
journal = "PeerJ Computer Science",
issn = "2376-5992",
publisher = "PeerJ Inc.",

}

Wilkinson, MD, Verborgh, R, Olavo Bonino da Silva Santos, L, Clark, T, Swertz, MA, Kelpin, FDL, Gray, AJG, Schultes, EA, van Mulligen, EM, Ciccarese, P, Kuzniar, A, Gavai, A, Thompson, M, Kaliyaperumal, R, Bolleman, JT & Dumontier, M 2017, 'Interoperability and FAIRness through a novel combination of Web technologies', PeerJ Computer Science, vol. 3, e110. https://doi.org/10.7717/peerj-cs.110

Interoperability and FAIRness through a novel combination of Web technologies. / Wilkinson, Mark D.; Verborgh, Ruben; Olavo Bonino da Silva Santos, Luiz; Clark, Tim; Swertz, Morris A.; Kelpin, Fleur D. L.; Gray, Alasdair J. G.; Schultes, Erik A.; van Mulligen, Erik M.; Ciccarese, Paolo; Kuzniar, Arnold; Gavai, Anand; Thompson, Mark; Kaliyaperumal, Rajaram; Bolleman, Jerven T.; Dumontier, Michel.

In: PeerJ Computer Science, Vol. 3, e110, 24.04.2017.

TY - JOUR
T1 - Interoperability and FAIRness through a novel combination of Web technologies
AU - Wilkinson, Mark D.
AU - Verborgh, Ruben
AU - Olavo Bonino da Silva Santos, Luiz
AU - Clark, Tim
AU - Swertz, Morris A.
AU - Kelpin, Fleur D. L.
AU - Gray, Alasdair J. G.
AU - Schultes, Erik A.
AU - van Mulligen, Erik M.
AU - Ciccarese, Paolo
AU - Kuzniar, Arnold
AU - Gavai, Anand
AU - Thompson, Mark
AU - Kaliyaperumal, Rajaram
AU - Bolleman, Jerven T.
AU - Dumontier, Michel
PY - 2017/4/24
Y1 - 2017/4/24
KW - FAIR Data
KW - Interoperability
KW - Data Integration
KW - Semantic Web
KW - Linked Data
KW - REST
U2 - 10.7717/peerj-cs.110
DO - 10.7717/peerj-cs.110
M3 - Article
VL - 3
JO - PeerJ Computer Science
T2 - PeerJ Computer Science
JF - PeerJ Computer Science
SN - 2376-5992
M1 - e110
ER -

Wilkinson MD, Verborgh R, Olavo Bonino da Silva Santos L, Clark T, Swertz MA, Kelpin FDL et al. Interoperability and FAIRness through a novel combination of Web technologies. PeerJ Computer Science. 2017 Apr 24;3. e110. https://doi.org/10.7717/peerj-cs.110