Alignment of parallel texts (texts that are translations of each other) is a required step for many applications that use such texts, including statistical machine translation, automatic extraction of translation equivalents, and automatic creation of concordances. Most existing methods for parallel text alignment try to infer simultaneously a bilingual word lexicon and a set of correspondences between the occurrences of those words in the texts. Some authors suggest that an external lexicon can be used to complement the inferred one, but they tend to treat it as secondary or optional. We argue that lexicon inference should not be embedded in the alignment process, and present LEXIC-AL, a new alignment method that relies exclusively on externally managed lexicons. In our experiments with the European Constitution corpus, LEXIC-AL achieves 84.45% precision and 84.55% recall.
|Title of host publication||New Trends in Artificial Intelligence|
|Publisher||Universidade de Aveiro|
|Publication status||Published - 1 Jan 2009|
|Event||EPIA 2009, Portuguese Conference on Artificial Intelligence (Duration: 1 Jan 2009 → …)|
|Conference||EPIA 2009, Portuguese Conference on Artificial Intelligence|
|Period||1/01/09 → …|