Master's Thesis

Single and Multi-Objective Epistasis Scoring: A Matter of Frequency

Madalena Carvalho de Azevedo Moreira, 2020

Key information

Authors:

Madalena Carvalho de Azevedo Moreira

Supervisors:

Sergio Santander-Jiménez; Aleksandar Ilic

Published on

September 30, 2020

Abstract

Epistasis detection studies focus on finding interactions between Single Nucleotide Polymorphisms (SNPs) that may be linked with susceptibility to, and development of, complex disease states. Since existing search-and-score methods for detecting significant SNP combinations focus heavily on the search algorithm, the question of how best to evaluate the epistatic contribution of these interactions still lacks a satisfactory answer. This dissertation proposes a novel methodology for evaluating the performance of six widely used objective functions for epistasis detection based on the genotype distribution in the dataset. This analysis reveals a correlation between high scoring power and extreme frequency table values, defined by two parameters. The first is based on genotypes with extreme differences between the counts of cases and controls, and the second is a simplified heritability formulation that takes into account the total number of observations and cases for each genotype. A threshold is defined for these parameters above which, for the simulated datasets analysed, a single objective function can correctly identify associated SNP combinations on its own. Below this threshold, combining two or three complementary objective functions in a multi-objective approach increases scoring power. This frequency-table-based approach is innovative in that there is currently no defined methodology for evaluating and comparing the performance of objective functions. The defined parameters can be applied to real datasets, representing a first step towards validating the results of existing epistasis detection methods and promoting the choice of the least complex scoring method possible for a specific dataset.

Publication details

Fields of Science and Technology (FOS)

Electrical engineering, electronic engineering, information engineering

Publication language (ISO code)

eng - English

Rights type:

Embargo lifted

Date available:

July 26, 2021

Institution name

Instituto Superior Técnico