LexModel: Core Terminology for Lexicography

Research output: Other contribution


This article is the result of work undertaken within the ELEXIS project (European Lexicographic Infrastructure, Horizon 2020, ID 731015), specifically in WP5, whose aim is to propose an ELEXIS curriculum. Within this curriculum design, we were responsible for the module “Standards for Representing Lexical Data: An Overview”.
In this article, we propose a definition of a core terminology for lexicography, with the aim of arriving at a data model. The article comprises two main parts. The first, Standards and Formats, describes the most common standards and formats used by the lexicographic community, so that readers become familiar with best practices for representing lexicographic data. The second presents definitions of the concepts designated by the terms that make up the core terminology of lexicographic work. Each concept is first defined in formal language using UML (Unified Modelling Language) diagrams, and the proposed definition is then given in natural language. Finally, we illustrate a lexicographic article modelled with OntoLex-Lemon, TEI/TEI Lex-0 and LMF.
The main objective is to present the formal and natural language definitions of the core concepts side by side, making them easier to understand for the wider community that deals with dictionaries: traditional lexicographers as well as the NLP and ontology communities. The definitions are expressed using a context-free grammar and draw on ISO 704 (2009) best practices and on the UML notation for defining concepts set out in ISO 24156-1 (2014).
Original language: English
Publisher: Archive ouverte HAL
Number of pages: 9
Publication status: Published, 2022


  • Lexicography
  • Data modelling
  • Standards

