TY - GEN
T1 - Dwell in the Beginning: How Language Models Embed Long Documents for Dense Retrieval
T2 - 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024
AU - Coelho, João
AU - Martins, Bruno
AU - Magalhães, João
AU - Callan, Jamie
AU - Xiong, Chenyan
N1 - info:eu-repo/grantAgreement/FCT/Concurso de avaliação no âmbito do Programa Plurianual de Financiamento de Unidades de I&D (2017%2F2018) - Financiamento Base/UIDB%2F50021%2F2020/PT#
info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDP%2F04516%2F2020/PT#
info:eu-repo/grantAgreement/FCT/OE/PRT%2FBD%2F153683%2F2021/PT#
Funding Information:
We thank the anonymous reviewers for their valuable comments and suggestions. This research was supported by the Portuguese Recovery and Resilience Plan through project C645008882-00000055 (i.e., the Center for Responsible AI).
Publisher Copyright:
© 2024 Association for Computational Linguistics.
PY - 2024
Y1 - 2024
AB - This study investigates the existence of positional biases in Transformer-based language models for text representation learning, particularly in the context of web document retrieval. We build on previous research that demonstrated loss of information in the middle of input sequences for causal language models, extending it to the domain of embedding learning. We examine positional biases at multiple stages of the training pipeline for an encoder-decoder neural retrieval model, namely language model pre-training, contrastive pre-training, and contrastive fine-tuning. Experiments with the MS-MARCO document collection reveal that after contrastive pre-training the model already generates embeddings that better capture the beginning of the input content, with fine-tuning further aggravating this effect.
UR - http://www.scopus.com/inward/record.url?scp=85203841391&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85203841391
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 370
EP - 377
BT - Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
A2 - Ku, Lun-Wei
A2 - Martins, Andre F. T.
A2 - Srikumar, Vivek
PB - Association for Computational Linguistics (ACL)
Y2 - 11 August 2024 through 16 August 2024
ER -