Master's Dissertation
Standardization of Portuguese Addresses: Transformer-Based Architecture
2025
Key information
Authors:
Advisors:
Published on
26/05/2025
Abstract
Address standardization is crucial for applications such as geocoding, logistics, navigation, and data management. However, inconsistent and incomplete address data significantly reduce the accuracy and efficiency of address processing systems. Traditional rule-based and heuristic models often struggle with these problems because they have limited capacity to capture the complex patterns and variations found in real-world address data. This dissertation first develops a detailed understanding of the problem and reviews the existing literature, covering the key concepts related to the topic. It then proposes a deep learning approach to standardization: a Seq2Seq model with an attention mechanism, adapted for Portuguese addresses. The methodology involves implementing a deep learning architecture capable of learning the complex relationships between address components. Unlike traditional models that rely on predefined rules or patterns, the proposed model learns directly from the data, allowing it to adapt to the diverse and evolving nature of address formats. Because standardized Portuguese address data were lacking, non-standardized data had to be used and standardized as far as possible. The available data were nevertheless insufficient to cover all cases, and the model reached an accuracy of 71.4%. The findings suggest that, with a more extensive dataset, accuracy could exceed 90%.
Publication details
Community authors:
Fátima Agostinho Napoleão
ist191605
Advisors from this institution:
João Luís Gustavo de Matos
ist13346
Scientific domain (FOS)
electrical-engineering-electronic-engineering-information-engineering - Electrical Engineering, Electronic Engineering, Information Engineering
Publication language (ISO code)
eng - English
Access to publication:
Embargoed access
Embargo end date:
29/03/2026
Institution name
Instituto Superior Técnico