Creating and Exploiting the Intrinsically Disordered Protein Knowledge Graph (IDP-KG)

Alasdair J. G. Gray, Petros Papadopoulos, Imran Asif, Ivan Mičetić, András Hatos

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

There are many data sources containing overlapping information about Intrinsically Disordered Proteins (IDP). IDPcentral aims to be a registry to aid the discovery of data about proteins known to be intrinsically disordered by aggregating the content from these sources. Traditional ETL approaches for populating IDPcentral require the API and data model of each source to be wrapped and then transformed into a common model.

In this paper, we investigate using Bioschemas markup as a mechanism to populate the IDPcentral registry by constructing the Intrinsically Disordered Protein Knowledge Graph (IDP-KG). Bioschemas markup is a lightweight, machine-readable representation of the content of each page in a site, embedded in the page's HTML. For any site, it is accessible through an HTTP request. We harvest the Bioschemas markup from three IDP sources and show that the resulting IDP-KG has the same breadth of proteins available as the original sources, and that it can be used to gain deeper insight into their content by querying them as a single, consolidated knowledge graph.
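
As an illustration of the harvesting idea described above, the following is a minimal sketch, not the authors' pipeline, of how embedded markup might be pulled out of a page once its HTML has been retrieved. It assumes the markup is serialised as JSON-LD inside `<script type="application/ld+json">` elements, which is a common way of embedding such markup but is not stated in the abstract. The names `JsonLdExtractor`, `extract_jsonld`, and `SAMPLE_PAGE`, and the example.org identifier, are hypothetical and used only for illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False   # True while inside a JSON-LD script element
        self._buffer = []         # text chunks of the current script element
        self.documents = []       # parsed JSON-LD documents found so far

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            text = "".join(self._buffer).strip()
            if text:
                self.documents.append(json.loads(text))


def extract_jsonld(html):
    """Return a list of JSON-LD documents embedded in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


# Hypothetical page standing in for the response to an HTTP request.
SAMPLE_PAGE = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Protein",
 "@id": "https://example.org/protein/P12345", "name": "Example protein"}
</script>
</head><body><p>Human-readable page content</p></body></html>
"""

docs = extract_jsonld(SAMPLE_PAGE)
print(docs[0]["@type"], docs[0]["@id"])
```

For a live page, the HTML string could be obtained with a plain request (for example `urllib.request.urlopen(url).read().decode()`) before being passed to `extract_jsonld`. Documents harvested from several sources could then be loaded into a single graph store for querying.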
Original language: English
Title of host publication: 13th International Semantic Web Applications and Tools for Health Care and Life Sciences Conference (SWAT4HCLS 2022)
Publication status: Published - 11 Jan 2022
Event: 13th International Semantic Web Applications and Tools for Health Care and Life Sciences Conference 2022 - Leiden, Netherlands
Duration: 10 Jan 2022 to 13 Jan 2022
http://www.swat4ls.org/

Conference

Conference: 13th International Semantic Web Applications and Tools for Health Care and Life Sciences Conference 2022
Abbreviated title: SWAT4HCLS 2022
Country/Territory: Netherlands
City: Leiden
Period: 10/01/22 to 13/01/22
