Master's Thesis
Diving into Gender Translation Bias for the Portuguese Language - A Comparative Analysis of Commercial Machine Translation Systems, General-Purpose LLMs, and Non-Commercial Translation-Specific Models
2025
Key information
Published in
06/29/2025
Abstract
Bias in machine translation (MT) models is a growing concern, particularly when translating into languages with more extensive grammatical gender marking, where producing a translation may require assuming a gender. In such cases, MT models can perpetuate and even amplify stereotypes, reinforcing harmful patterns of societal discrimination. This work addresses a gap in current research by investigating gender bias in English-to-Portuguese translation. In the first stage, we conduct experiments that compare commercial MT systems, general-purpose large language models (LLMs), and non-commercial translation-specific models across different contexts and dimensions of gender bias. We evaluate how gender bias manifests in both single- and multi-sentence contexts and assess whether sentence sentiment affects gender assignment. Additionally, we compare the Portuguese results with those for other Romance languages (French, Spanish, Italian). Our findings show that commercial MT systems still lead in producing unbiased translations, although significant biases persist, particularly in inter-sentence contexts. Moreover, although these systems have improved, translations into the other Romance languages still exhibit greater gender bias than translations into Portuguese. In the second stage, we explore bias-mitigation strategies, particularly fine-tuning. We adapt an existing model using a small, gender-balanced dataset and demonstrate that fine-tuning is a promising and efficient approach to mitigating bias. This work advances the state of the art by offering a detailed evaluation of the current gender bias landscape for the English-Portuguese language pair and underscores the need for more equitable language technologies.
Publication details
Authors in the community:
Sofia Seabra Bonifácio
ist192559
Supervisors of this institution:
Fields of Science and Technology (FOS)
electrical-engineering-electronic-engineering-information-engineering - Electrical engineering, electronic engineering, information engineering
Publication language (ISO code)
eng - English
Rights type:
Embargoed access
Date available:
03/29/2026
Institution name
Instituto Superior Técnico