Bilingually Learning Word Senses for Translation

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

All words in every natural language are ambiguous, especially when translation is at stake. Translation tasks therefore require finding adequate translations for such words in the contexts where they occur. This article describes a bilingual strategy for clustering words according to their meanings. A publicly available, sentence-aligned parallel corpus is used. Word senses are discriminated by their translations and by the words occurring in a window, in both the source-language and target-language parallel sentences. The strategy is language independent and uses a correlation algorithm to filter out irrelevant features. The clusters obtained were evaluated in terms of F-measure (averaging 94%), and their homogeneity and completeness were determined using V-measure (averaging 83%). The learned clusters are then used to train a support vector machine to tag ambiguous words with their translations in the contexts where they occur. This task was also evaluated in terms of F-measure and compared against a baseline.
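
The abstract reports cluster quality as homogeneity and completeness combined into V-measure. As a rough illustration of what that metric computes, the sketch below implements the standard entropy-based definition in plain Python. It is not the authors' evaluation code: the function names and the toy data (occurrences of an ambiguous word labelled with invented translations) are assumptions made purely for illustration.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy H(labels), natural log."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """Conditional entropy H(target | given)."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g])
                for (g, _), c in joint.items())


def v_measure(gold, predicted):
    """Return (homogeneity, completeness, V) for two labelings."""
    h_gold = entropy(gold)
    h_pred = entropy(predicted)
    # Homogeneity: each cluster should contain a single gold sense.
    homogeneity = (1.0 if h_gold == 0
                   else 1.0 - conditional_entropy(gold, predicted) / h_gold)
    # Completeness: each gold sense should fall into a single cluster.
    completeness = (1.0 if h_pred == 0
                    else 1.0 - conditional_entropy(predicted, gold) / h_pred)
    if homogeneity + completeness == 0:
        return homogeneity, completeness, 0.0
    v = 2 * homogeneity * completeness / (homogeneity + completeness)
    return homogeneity, completeness, v


# Hypothetical data: gold senses given as translations, plus cluster ids.
gold = ["banco", "banco", "margem", "margem"]
clusters = [0, 0, 1, 1]
h, c, v = v_measure(gold, clusters)
```

V is the harmonic mean of the two scores, so a clustering that lumps every occurrence into one cluster is fully complete but not homogeneous, and scores zero overall.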
Original language: English
Title of host publication: Lecture Notes in Computer Science
Pages: 283 to 295
Publication status: Published - 1 Jan 2014
Event: CICLing -
Duration: 1 Jan 2014 → …

Conference

Conference: CICLing
Period: 1/01/14 → …
