Master's Dissertation
Sparse Transformers for High Order Epistasis Detection
2022
Key information
Authors:
Advisors:
Published on
23/11/2022
Abstract
Genome-Wide Association Studies (GWAS) aim to identify relations between Single Nucleotide Polymorphisms (SNPs) and the manifestation of certain diseases, an important challenge in biomedicine. However, most genetic diseases are explained not only by the effects of individual SNPs but also by interactions between several SNPs, known as epistasis. Detecting high-order epistasis is computationally very demanding, due to the exponential increase in the number of SNP combinations to evaluate. Recently, deep learning has emerged as a possible solution for genomic prediction, but the black-box nature of neural networks and their lack of explainability remain an unsolved drawback. This dissertation presents a new framework for interpreting neural networks for epistasis detection. Using sparse transformers, a technique not yet employed for epistasis detection, SNPs can be assigned attention scores that quantify their relevance for predicting a phenotype. The methodology is tested on IPUs, a recent class of massively parallel processors aimed at machine learning workloads and efficient processing of sparse data. Results on simulated datasets show that the proposed framework outperforms state-of-the-art explainability methods, identifying SNP interactions in various epistasis scenarios. Furthermore, training on IPUs provides higher performance than on GPUs and TPUs, achieving speedups of up to 2.79x. Finally, the proposed framework is validated on a real breast cancer dataset, identifying second- to fifth-order interactions among the top 40% most relevant SNPs.
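To give a rough sense of the combinatorial growth the abstract mentions, the sketch below counts how many distinct SNP combinations an exhaustive search would have to evaluate at a given interaction order. It is only an illustration: the function name `count_snp_combinations` and the dataset size of 1000 SNPs are assumptions made here, not values taken from the dissertation.

```python
from math import comb


def count_snp_combinations(num_snps: int, order: int) -> int:
    """Number of distinct SNP subsets of size `order` among `num_snps` SNPs.

    Hypothetical helper for illustration; an exhaustive epistasis search
    would need to score each of these subsets.
    """
    return comb(num_snps, order)


if __name__ == "__main__":
    num_snps = 1000  # assumed dataset size, chosen only for illustration
    for order in range(2, 6):  # second- to fifth-order interactions
        print(order, count_snp_combinations(num_snps, order))
```

Running the script prints one count per interaction order, which makes it easy to see how quickly the search space grows as the order increases.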
Publication details
Community authors:
Miguel Ângelo da Silva Graça
ist190142
Advisors from this institution:
Aleksandar Ilic
ist166430
Scientific domain (FOS)
electrical-engineering-electronic-engineering-information-engineering - Electrical Engineering, Electronic Engineering, Information Engineering
Publication language (ISO code)
eng - English
Access to publication:
Embargo lifted
Embargo end date:
30/08/2023
Institution name
Instituto Superior Técnico