Building a Portuguese oenological dictionary: from corpus to terminology via co-occurrence networks

William Martinez, Sílvia Barbosa

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review



This paper focuses on the elaboration of a dictionary of Portuguese terms that describe the wine-tasting experience. We present a corpus-based analysis aimed at designing an electronic dictionary: working from a compilation of approximately 21,000 wine descriptions downloaded from a dozen Portuguese websites, we estimated, by both frequency analysis and lexicographical study, which terms were recurrent, relevant and representative of the “hard to put into words” occupation that is oenology. From these results, we drew up a list of words that describe sensory analysis in its three main aspects: visual, olfactory and gustatory. An exhaustive co-occurrence analysis then identified the terms that contribute most to structuring the text through their tendency to attract other words against statistical odds. When displayed in a co-occurrence network, these anchors emerge from the mesh as the foundational lexicon for wine tasting, and can be evaluated as prime candidates for a distributional thesaurus.
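As a rough illustration of the co-occurrence step described in the abstract, the sketch below scores word pairs that appear together in the same description more often than chance would predict, then counts how many such links each word has. The abstract does not name the association statistic used, so pointwise mutual information (PMI) is swapped in here purely as a stand-in; the function names, the `min_pair_count` threshold, and the toy descriptions are all hypothetical and not taken from the paper or its corpus.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_pmi(descriptions, min_pair_count=2):
    """Score word pairs by PMI, treating each description as one context.

    PMI is an assumed stand-in for the paper's (unspecified) statistic.
    """
    n_docs = len(descriptions)
    word_df = Counter()  # number of descriptions containing each word
    pair_df = Counter()  # number of descriptions containing both words
    for text in descriptions:
        words = sorted(set(text.lower().split()))
        word_df.update(words)
        pair_df.update(combinations(words, 2))

    scores = {}
    for (a, b), n_ab in pair_df.items():
        if n_ab < min_pair_count:
            continue  # ignore pairs seen too rarely to be meaningful
        pmi = math.log2(n_ab * n_docs / (word_df[a] * word_df[b]))
        if pmi > 0:  # keep only pairs that attract "against the odds"
            scores[(a, b)] = pmi
    return scores


def anchor_degrees(scores):
    """Count attracted partners per word (its degree in the network)."""
    degree = Counter()
    for a, b in scores:
        degree[a] += 1
        degree[b] += 1
    return degree


# Invented toy descriptions, for illustration only.
toy = [
    "aroma frutado intenso",
    "aroma frutado elegante",
    "cor rubi intensa",
    "cor rubi profunda",
    "aroma floral fresco",
    "final longo elegante",
]
scores = cooccurrence_pmi(toy)
degrees = anchor_degrees(scores)
```

In this sketch, words with the highest degree would play the role of the "anchors" mentioned in the abstract; a real analysis would need proper tokenisation, lemmatisation, and a significance test suited to corpus size.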
Original language: English
Title of host publication: Proceedings of the XVIII EURALEX International Congress
Subtitle of host publication: Lexicography in Global Contexts
Editors: Jaka Čibej, Vojko Gorjanc, Iztok Kosem, Simon Krek
Place of Publication: Ljubljana
Publisher: Ljubljana University Press, Faculty of Arts
Number of pages: 10
ISBN (Electronic): 978-961-06-0097-8
Publication status: Published - 17 Jul 2018
Event: XVIII EURALEX International Congress: Lexicography in Global Contexts - The Centre for Language Resources and Technologies at the University of Ljubljana and Trojina, Institute for Applied Slovene Studies, Ljubljana, Slovenia
Duration: 17 Jul 2018 to 21 Jul 2018


Conference: XVIII EURALEX International Congress: Lexicography in Global Contexts


  • Collocations
  • Co-occurrences
  • Word network
  • Corpus linguistics
  • Oenology
  • Terminology

