Abstract
This report is the result of a 400-hour internship at Unbabel. The internship was built around the goal of reducing the cost per word (CPP) in line with the actual need for editing. To achieve this goal, experiments were carried out to steer editors' focus toward reducing the time spent per task; the experiment presented in this report is the blocking of translation memories (TM) in texts for post-editing. To understand the impact of this idea on the quality of machine translation (MT), the post-edited output was analyzed through error annotation. The analysis found every type of inconsistency defined in the typology for this experiment, with the number of punctuation inconsistencies differing markedly from that of the other types. Nevertheless, despite the inconsistencies found, quality after the TM were blocked was higher than quality before the experiment. This work also aimed to examine the methods of quality control in MT and how important quality control is for the maintenance and development of MT.
Translated title of the contribution | Machine translation (English-Portuguese) analysis: lexical issues |
---|---|
Original language | Portuguese |
Qualification | Master of Philosophy |
Supervisors/Advisors | |
Award date | 22 Oct 2021 |
Publication status | Published - 22 Oct 2021 |
Keywords
- Tradução Automática
- Memória de Tradução
- Anotação
- Pós-edição
- Machine Translation
- Translation Memory
- Annotation
- Post-editing