Abstract
Natural language understanding is a key task in a wide range of applications targeting data interoperability or analytics. For the analysis of domain-specific data, specialised knowledge resources (terminologies, grammars, word vector models, lexical databases) are necessary. The heterogeneity of such resources is, however, a major obstacle to their efficient use, especially in combination. This paper presents the open-source Diversicon Framework, which helps application developers find, integrate, and access lexical domain knowledge, both symbolic and statistical, in a unified manner. The major components of the framework are: (1) an API and domain knowledge model that allow applications to retrieve domain knowledge through a common interface from a diversity of resource types, (2) implementations of the API for some of the most commonly used symbolic and statistical knowledge sources, (3) a domain-aware knowledge base that helps integrate static lexico-semantic resources, and (4) an online catalogue that either hosts or links to existing resources from multiple domains. Support for Diversicon is already integrated into two of the most popular ontology matcher applications, a fact that we exploit to validate the framework and to demonstrate its use in an example study that evaluates the effect of several common-sense and domain knowledge resources on a medical ontology matching task.
Original language | English |
---|---|
Pages (from-to) | 219-234 |
Number of pages | 16 |
Journal | Journal on Data Semantics |
Volume | 8 |
Issue number | 4 |
Early online date | 4 Sept 2019 |
Publication status | Published - Dec 2019 |
Keywords
- Domain knowledge
- Knowledge framework
- Lexical knowledge
- Natural language understanding
- Word vector models
ASJC Scopus subject areas
- Information Systems
- Computer Networks and Communications
- Artificial Intelligence