Master's Thesis
Personality Trait Prediction Using Textual data from Social Media
2024
Key information
Authors:
Supervisors:
Published in
06/25/2024
Abstract
Personality assessment has been a popular topic in psychology since the early days of the field. Many personality models have been proposed since then, but arguably the most dominant is the Big Five model, which identifies five traits that compose human personality. With the rise of computers in recent decades, automatic personality assessment has gained popularity in both the psychology and machine learning communities. Additionally, the rise of social media has produced a large amount of data that arguably contains much information about users' personalities. Many machine learning models have been applied to this task, but for text-based personality prediction, large pre-trained language models have been gaining popularity, achieving state-of-the-art results in many cases. In this work, two datasets of textual data, from the Facebook and Twitter social media platforms, were used to predict users' Big Five personality scores. The proposed model uses the large pre-trained language model Sentence-BERT to extract sentence embeddings and a neural network as the regression model. With this architecture, state-of-the-art results were achieved on the Twitter dataset, outperforming the previous state of the art by around 16-18%, while on the Facebook dataset the model underperformed the state of the art. Additionally, a comparative analysis was performed of how data from different sources can be combined and applied to one another in the scope of personality trait prediction.
Publication details
Authors in the community:
Supervisors of this institution:
Fields of Science and Technology (FOS)
electrical-engineering-electronic-engineering-information-engineering - Electrical engineering, electronic engineering, information engineering
Publication language (ISO code)
eng - English
Rights type:
Embargoed access
Date available:
04/29/2025
Institution name
Instituto Superior Técnico