Abstract
Project Website: http://bioschemas.org/
Source Code: https://github.com/BioSchemas/bioschemas
License: Creative Commons Attribution-ShareAlike License (version 3.0)
Schema.org provides a way to add semantic markup to web pages. It describes ‘types’ of information, each of which has ‘properties’. For example, ‘Event’ is a type with properties such as ‘startDate’, ‘endDate’ and ‘description’. Bioschemas aims to improve data interoperability in the life sciences by encouraging the community to use schema.org markup. This structured information makes it easier to discover, collate and analyse distributed data. Bioschemas reuses and extends schema.org in a number of ways: it defines a minimum information model for each datatype being described, using as few concepts as possible and adding new properties only where necessary, and it introduces cardinalities and controlled vocabularies. The main outcome of Bioschemas is a collection of specifications that provide guidelines for more consistent adoption of schema.org markup within the life sciences, supporting the “Find” part of the FAIR (Findable, Accessible, Interoperable, Reusable) principles.
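As an illustration of the ‘type’ and ‘property’ idea, the following minimal Python sketch builds JSON-LD markup for a schema.org ‘Event’ and checks it against a small cardinality table. The event details, the `EXAMPLE_PROFILE` table and the `check_cardinalities` helper are assumptions made up for this example; they are not taken from any published Bioschemas specification.

```python
import json

# Hypothetical profile for illustration only: property -> (min count, max count).
# A max of None means "no upper bound". Not an actual Bioschemas specification.
EXAMPLE_PROFILE = {
    "name": (1, 1),
    "startDate": (1, 1),
    "endDate": (0, 1),
    "description": (0, 1),
}


def make_event_markup(name, start_date, end_date, description):
    """Build a schema.org Event as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": name,
        "startDate": start_date,
        "endDate": end_date,
        "description": description,
    }


def check_cardinalities(markup, profile):
    """Return a list of human-readable cardinality problems (empty if none)."""
    problems = []
    for prop, (minimum, maximum) in profile.items():
        value = markup.get(prop)
        if value is None:
            count = 0
        elif isinstance(value, list):
            count = len(value)
        else:
            count = 1
        if count < minimum:
            problems.append(f"{prop}: expected at least {minimum}, found {count}")
        if maximum is not None and count > maximum:
            problems.append(f"{prop}: expected at most {maximum}, found {count}")
    return problems


# Placeholder event details, invented for the example.
event = make_event_markup(
    name="Example training course",
    start_date="2017-07-25",
    end_date="2017-07-26",
    description="A placeholder description of the course.",
)

# JSON-LD is commonly embedded in a web page inside a script element.
script_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(event, indent=2)
    + "\n</script>"
)
print(script_tag)
print(check_cardinalities(event, EXAMPLE_PROFILE))
```

The cardinality table is kept separate from the markup so that the same check could be reused with a different profile.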
In 2016 Bioschemas was successfully piloted with training materials and events, enabling the EU ELIXIR Research Infrastructure Training Portal (TeSS) to harvest metadata from community sites rapidly and simply. Encouraged by this, in March 2017 we launched a 12-month project to pilot Bioschemas for data repositories and datasets. Specifically, we are working on:
General descriptions for datasets and data repositories
Specific descriptions for prioritised datatypes: Samples, Human Beacons, Plant phenotypes and Protein annotations
Facilitating discovery by registries and data aggregators, and by general search engines
Facilitating tool development for annotation and validation of compliant resources
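To make the discovery and harvesting goals above concrete, here is a minimal sketch of how an aggregator might pull embedded JSON-LD out of a web page using only the Python standard library. It assumes the markup is embedded in `script` elements of type `application/ld+json`; the sample page, the `JsonLdExtractor` class and the `harvest` function are hypothetical names for this illustration and do not describe how TeSS or any other named registry actually works.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the JSON-LD objects embedded in an HTML document."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        # Start buffering when a JSON-LD script element opens.
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Parse the buffered text when the script element closes.
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed markup rather than abort the harvest.


def harvest(html_text, wanted_type):
    """Return the embedded JSON-LD objects whose @type equals wanted_type."""
    parser = JsonLdExtractor()
    parser.feed(html_text)
    parser.close()
    return [
        item
        for item in parser.items
        if isinstance(item, dict) and item.get("@type") == wanted_type
    ]


# Hypothetical community page, invented for the example.
SAMPLE_PAGE = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Event",
 "name": "Example course", "startDate": "2017-07-25"}
</script>
</head><body><p>Course page</p></body></html>"""

events = harvest(SAMPLE_PAGE, "Event")
print(events[0]["name"])
```

This sketch only handles a single top-level object per script element with a string-valued `@type`; real pages may also use arrays, `@graph` containers, or other embedding syntaxes such as Microdata or RDFa.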
All work is grounded in describing real data resources for real use cases. To this end, both large and small datasets are part of the project: Pfam, InterPro, PDBe, UniProt, BRENDA, EGA, COPaKB and Gene3D. Participating data aggregators include InterMine, BioSamples and OmicsDI. Registries include Identifiers.org, DataMed, Biosharing and the Beacon Network. Bioschemas operates as an open community initiative, sponsored by the EU ELIXIR Research Infrastructure and supported by the NIH BD2K programme and Google.
Original language | English |
---|---|
Publication status | Published - Jul 2017 |
Event | Joint 25th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and 16th European Conference on Computational Biology (ECCB) 2017 - Duration: 25 Jul 2017 → … |
Conference
Conference | Joint 25th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and 16th European Conference on Computational Biology (ECCB) 2017 |
---|---|
Abbreviated title | ISMB/ECCB |
Period | 25/07/17 → … |