Master's Thesis
Classification of microcytic anaemias using machine learning methods
2021
Key information
Authors:
Supervisors:
Published in
11/10/2021
Abstract
The prevalence of anaemia in the world population is 24.8%. Proper discrimination between microcytic anaemias is essential to provide the right treatment and genetic counselling. Because the most reliable methods for diagnosing thalassemias and iron deficiency anaemia (IDA), some of the most common microcytic anaemias, are expensive and time-consuming, many discrimination indexes have been developed over the years. None of these indexes, however, has proved to be 100% accurate. In this thesis, haematological data and diagnoses from a sample of 390 individuals from the Portuguese population were used to train and test different machine learning algorithms. The objective was to develop a binary classifier, specifically adapted to the Portuguese population, to discriminate β-thalassemia carriers from IDA patients. In addition, a multi-class classifier capable of distinguishing between β-thalassemia carriers, α-thalassemia carriers, IDA patients, and healthy subjects was developed. So as not to compromise the main objective of a quick and accessible diagnosis, the classifiers were based only on information obtained from a complete blood count, one of the most common laboratory tests in medicine. Although the binary classifiers could not surpass the performance of the most reliable index for the Portuguese population, the red cell distribution width index (RDWI), which presented a median accuracy of 95.4%, the random forest algorithm matched it. This algorithm performed excellently in both binary and multi-class classification; in the latter it achieved promising results, with a median accuracy of 93.0%.
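For readers unfamiliar with the RDWI baseline mentioned in the abstract, the following is a minimal sketch of how such an index could be computed from complete blood count values. The formula (MCV × RDW / RBC) and the cut-off of 220 are taken from the general literature on discrimination indexes, not from this abstract, so both should be read as assumptions; the function names are hypothetical and do not reflect the thesis implementation.

```python
def rdwi(mcv_fl: float, rdw_percent: float, rbc_10e12_per_l: float) -> float:
    """Red cell distribution width index, assumed here to be MCV * RDW / RBC.

    mcv_fl: mean corpuscular volume in femtolitres
    rdw_percent: red cell distribution width in percent
    rbc_10e12_per_l: red blood cell count in 10^12 cells per litre
    """
    if rbc_10e12_per_l <= 0:
        raise ValueError("RBC count must be positive")
    return mcv_fl * rdw_percent / rbc_10e12_per_l


def classify_by_rdwi(value: float, cutoff: float = 220.0) -> str:
    """Binary rule: below the (assumed) cut-off suggests a beta-thalassemia
    carrier, otherwise iron deficiency anaemia (IDA)."""
    return "beta-thalassemia carrier" if value < cutoff else "IDA"


# Illustrative values only, not data from the thesis.
low = rdwi(mcv_fl=60.0, rdw_percent=15.0, rbc_10e12_per_l=6.0)
high = rdwi(mcv_fl=70.0, rdw_percent=18.0, rbc_10e12_per_l=4.0)
print(low, classify_by_rdwi(low))
print(high, classify_by_rdwi(high))
```

A single-threshold rule like this is the kind of baseline against which the binary classifiers described above were compared; the cut-off appropriate for a given population may differ from the value assumed here.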
Publication details
Authors in the community:
Beatriz Neves Leitão
ist197309
Supervisors of this institution:
Fields of Science and Technology (FOS)
industrial-biotechnology - Industrial Biotechnology
Publication language (ISO code)
eng - English
Rights type:
Embargo lifted
Date available:
09/19/2022
Institution name
Instituto Superior Técnico