Carole Goble, Rafael C. Jimenez, Alasdair J. G. Gray, Niall Beard, Giuseppe Profiti, Norman Morrison, Bioschemas Consortium

Research output: Contribution to conference, Poster, peer-reviewed



Project Website:
Source Code:
License: Creative Commons Attribution-ShareAlike License (version 3.0)

Abstract: Schema.org provides a way to add semantic markup to web pages. It describes ‘types’ of information, which in turn have ‘properties’. For example, ‘Event’ is a type with properties such as ‘startDate’, ‘endDate’ and ‘description’. Bioschemas aims to improve data interoperability in the life sciences by encouraging the life science community to use Schema.org markup. This structured information makes it easier to discover, collate and analyse distributed data. Bioschemas reuses and extends Schema.org in several ways: it defines a minimum information model for each datatype being described, using as few concepts as possible and adding new properties only where necessary, and it introduces cardinalities and controlled vocabularies. The main outcome of Bioschemas is a collection of specifications that provide guidelines for a more consistent adoption of Schema.org markup within the life sciences, supporting the “Find” part of the FAIR (Findable, Accessible, Interoperable, Reusable) principles.
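As a minimal sketch of the type and property idea, the Python below builds an ‘Event’ description as JSON-LD (one of the serialisations in which Schema.org markup can be embedded in a web page) and checks it against a toy minimum-information profile with cardinalities. The profile `EVENT_PROFILE`, its cardinality values, the helper names and the example values are all assumptions made for illustration; they are not taken from any Bioschemas specification.

```python
import json

# Hypothetical profile: property -> (minimum, maximum) occurrences.
# A maximum of None means "unbounded". These values are illustrative only.
EVENT_PROFILE = {
    "name": (1, 1),
    "startDate": (1, 1),
    "endDate": (0, 1),
    "description": (0, 1),
}


def make_event(name, start_date, end_date=None, description=None):
    """Build a Schema.org 'Event' as a JSON-LD dictionary."""
    event = {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": name,
        "startDate": start_date,
    }
    if end_date is not None:
        event["endDate"] = end_date
    if description is not None:
        event["description"] = description
    return event


def check_cardinalities(item, profile):
    """Return a list of messages for properties that violate the profile."""
    problems = []
    for prop, (minimum, maximum) in profile.items():
        value = item.get(prop)
        if value is None:
            count = 0
        elif isinstance(value, list):
            count = len(value)
        else:
            count = 1
        if count < minimum:
            problems.append(f"{prop}: expected at least {minimum}, found {count}")
        if maximum is not None and count > maximum:
            problems.append(f"{prop}: expected at most {maximum}, found {count}")
    return problems


if __name__ == "__main__":
    event = make_event(
        "Example training workshop",  # made-up example values
        "2017-07-25",
        description="An illustrative event record.",
    )
    print(json.dumps(event, indent=2))
    print(check_cardinalities(event, EVENT_PROFILE))
```

The printed JSON could be placed in a `<script type="application/ld+json">` element on the page describing the event. A real profile would also constrain some property values to controlled vocabularies, which this sketch leaves out.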

In 2016 Bioschemas was successfully piloted with training materials and events, enabling the EU ELIXIR Research Infrastructure Training Portal (TeSS) to harvest metadata from community sites rapidly and simply. Encouraged by this, in March 2017 we launched a 12-month project to pilot Bioschemas for data repositories and datasets. Specifically, we are working on:
General descriptions for datasets and data repositories
Specific descriptions for prioritised datatypes: Samples, Human Beacons, Plant phenotypes and Protein annotations
Facilitating discovery by registries and data aggregators, and by general search engines
Facilitating tool development for annotation and validation of compliant resources
All work is grounded in describing real data resources for real use cases. To this end, both large and small datasets are part of the project: Pfam, InterPro, PDBe, UniProt, BRENDA, EGA, COPaKB, and Gene3D. Participating data aggregators include InterMine, BioSamples and OmicsDI. Registries include DataMed, BioSharing and the Beacon Network. Bioschemas operates as an open community initiative, sponsored by the EU ELIXIR Research Infrastructure and supported by the NIH BD2K programme and Google.
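To illustrate how a registry or aggregator might harvest such markup, the following sketch extracts JSON-LD blocks from an HTML page and keeps the items of a requested type. It is not the method used by TeSS or by any of the resources named above. It assumes the markup is embedded as JSON-LD in `<script type="application/ld+json">` elements (microdata and RDFa are ignored), and the class `JsonLdExtractor`, the function `harvest` and the sample page are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect JSON-LD objects found in <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # skip malformed blocks rather than fail the harvest
            self.items.extend(parsed if isinstance(parsed, list) else [parsed])


def harvest(html, wanted_type):
    """Return the JSON-LD items in `html` whose @type equals `wanted_type`."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [
        item for item in parser.items
        if isinstance(item, dict) and item.get("@type") == wanted_type
    ]


# Made-up landing page used only to exercise the sketch.
SAMPLE_PAGE = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Dataset", "name": "Example dataset"}
</script>
</head><body><p>Landing page</p></body></html>"""

if __name__ == "__main__":
    for dataset in harvest(SAMPLE_PAGE, "Dataset"):
        print(dataset["name"])
```

A production harvester would also need to fetch pages, respect crawling policies and handle nested or multi-typed items, none of which is attempted here.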


Conference: Joint 25th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and 16th European Conference on Computational Biology (ECCB) 2017
Abbreviated title: ISMB/ECCB
Period: 25/07/17 → …

