Parallel Texts Alignment

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Alignment of parallel texts (texts that are translations of each other) is a required step for many applications that use such texts, including statistical machine translation, automatic extraction of translation equivalents, and automatic creation of concordances. Most existing methods for parallel text alignment try to infer simultaneously a bilingual word lexicon and a set of correspondences between the occurrences of those words in the texts. Some authors suggest that an external lexicon can be used to complement the inferred one, but they tend to treat it as secondary or optional. We argue that lexicon inference should not be embedded in the alignment process, and present LEXIC-AL, a new alignment method that relies exclusively on externally managed lexicons. In our experiments with the European Constitution corpus, LEXIC-AL achieves 84.45% precision and 84.55% recall.
Original language: Unknown
Title of host publication: New Trends in Artificial Intelligence
Editors: Unknown
Publisher: Universidade de Aveiro
Pages: 513-524
Publication status: Published - 1 Jan 2009
Event: EPIA 2009, Portuguese Conference on Artificial Intelligence
Duration: 1 Jan 2009 → …

Conference

Conference: EPIA 2009, Portuguese Conference on Artificial Intelligence
Period: 1/01/09 → …
