Master's Thesis

PROLEGIS – Intelligent Search in Legislation Databases

Hugo Miguel de Jesus Lopes, 2016

Key information

Authors:

Hugo Miguel de Jesus Lopes

Supervisors:

Carlos Alberto Pinto Ferreira; Luís Manuel Marques Custódio

Published in

05/31/2016

Abstract

Portuguese legislation, like that of other countries, is not published in an organized way, whether by topic or by concept. Instead, it is organized by a numbering system that follows the order of publication. For ordinary citizens, and even for researchers, searching for information about a subject or a specific problem is a hard and complex task. Categorizing legal texts requires specialized labour and, given the quantity of published documents, would take a great amount of time. This work evaluates the possibility of automatically assigning a category to these legislative documents using Machine Learning algorithms. The focus is on the supervised domain; nevertheless, an unsupervised clustering analysis is also explored. Multiple supervised classification algorithms are tested on a set of pre-classified documents in order to compare their classification performance. Support Vector Machines, K-Nearest Neighbours, Multinomial Naive Bayes and Decision Trees were used individually and, in an attempt to improve the results, in conjunction with various feature pre-processing techniques: Latent Semantic Indexing, feature selection with different metrics, and stemming.
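
As an illustration of one of the classifiers named in the abstract, the following is a minimal sketch of Multinomial Naive Bayes with Laplace smoothing, written in plain Python. It is not the thesis implementation: the function names (`train`, `predict`), the smoothing value `alpha`, and the toy tokenised documents and category labels are all assumptions made for this example.

```python
import math
from collections import Counter


def train(docs, labels, alpha=1.0):
    """Fit multinomial Naive Bayes on tokenised documents.

    docs   -- list of token lists (hypothetical toy input)
    labels -- one category label per document
    alpha  -- Laplace smoothing constant (assumed value)
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    log_prior, log_lik = {}, {}
    for c in set(labels):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        # Class prior: fraction of training documents in class c.
        log_prior[c] = math.log(len(class_docs) / len(docs))
        counts = Counter(tok for d in class_docs for tok in d)
        total = sum(counts.values()) + alpha * len(vocab)
        # Smoothed per-class token likelihoods.
        log_lik[c] = {t: math.log((counts[t] + alpha) / total) for t in vocab}
    return log_prior, log_lik


def predict(model, doc):
    """Return the class with the highest log-posterior score."""
    log_prior, log_lik = model
    scores = {}
    for c in log_prior:
        # Tokens outside the training vocabulary are ignored.
        scores[c] = log_prior[c] + sum(
            log_lik[c][t] for t in doc if t in log_lik[c]
        )
    return max(scores, key=scores.get)


# Toy, made-up training set: four tiny "documents" in two categories.
train_docs = [
    ["tax", "income", "rate"],
    ["tax", "vat", "rate"],
    ["school", "teacher", "exam"],
    ["school", "student", "exam"],
]
train_labels = ["finance", "finance", "education", "education"]

model = train(train_docs, train_labels)
print(predict(model, ["tax", "exam", "rate"]))  # finance
```

In a comparison such as the one the abstract describes, the same train-and-predict interface would be repeated for each classifier and scored on held-out pre-classified documents; stemming or feature selection would be applied to the token lists before training.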

Publication details

Fields of Science and Technology (FOS)

Electrical engineering, electronic engineering, information engineering

Publication language (ISO code)

eng - English

Rights type:

Embargo lifted

Date available:

04/08/2017

Institution name

Instituto Superior Técnico