Investigação de técnicas de classificação hierárquica para problemas de bioinformática / Investigation of hierarchical classification techniques for bioinformatics problems

AUTHOR(S)
PUBLICATION DATE

2008

ABSTRACT

In Machine Learning and Data Mining, most of the classification research reported in the literature involves flat classification, where each example is assigned to one class out of a finite (and usually small) set of flat classes. Nevertheless, there are more complex classification problems in which the classes to be predicted can be arranged in a hierarchy. In this context, hierarchical classification techniques and concepts have been shown to be useful. One research direction with great potential is the application of hierarchical classification techniques to Bioinformatics problems. Accordingly, this MSc thesis presents a study of hierarchical classification techniques applied to the prediction of functional classes of proteins. Twelve algorithms were investigated: eleven based on the Top-Down approach, which was the focus of this study, and HC4.5, an algorithm based on the Big-Bang approach. Some of these algorithms are based on a variation of the Top-Down approach, named Top-Down Ensembles, proposed in this study. Some of the algorithms based on this new approach presented promising results, which were better than the results of other algorithms. A specific evaluation measure for hierarchical classification, named depth-dependent accuracy, was used to evaluate the classification models. In addition, three other evaluation measures were used so that the results they reported could be compared.
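
To make the Top-Down idea mentioned in the abstract concrete, the following is a minimal sketch and not the thesis's implementation. It assumes that class labels encode the hierarchy with dots (for example, "2.1" is a child of "2"), and it uses a nearest-centroid rule as a stand-in local classifier at each internal node. The function names `path_to`, `train_top_down`, and `predict_top_down`, as well as the toy data, are hypothetical.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def path_to(leaf):
    """Expand a dotted label into its path, e.g. '2.1' -> ['2', '2.1'] (assumed label format)."""
    parts = leaf.split(".")
    return [".".join(parts[: i + 1]) for i in range(len(parts))]


def train_top_down(examples):
    """Fit one local nearest-centroid model per internal node.

    `examples` is a list of (feature_vector, leaf_label) pairs.
    Returns {parent_node: {child_class: centroid}}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for x, leaf in examples:
        parent = "root"
        for child in path_to(leaf):
            # Each example trains the local model of every node on its path.
            grouped[parent][child].append(x)
            parent = child
    return {
        parent: {
            child: tuple(sum(col) / len(vecs) for col in zip(*vecs))
            for child, vecs in children.items()
        }
        for parent, children in grouped.items()
    }


def predict_top_down(models, x):
    """Descend from the root, choosing the closest child centroid at each level."""
    node, path = "root", []
    while node in models:  # stop once the current node has no local model
        centroids = models[node]
        node = min(centroids, key=lambda c: dist(x, centroids[c]))
        path.append(node)
    return path


# Toy data (hypothetical): two top-level classes, each with two subclasses.
TRAIN = [
    ((0.0, 0.0), "1.1"),
    ((0.0, 1.0), "1.2"),
    ((5.0, 0.0), "2.1"),
    ((5.0, 1.0), "2.2"),
]

models = train_top_down(TRAIN)
print(predict_top_down(models, (4.8, 0.9)))
```

In this sketch a wrong choice at an upper level cannot be corrected further down, which is a general property of top-down prediction. By contrast, a Big-Bang approach builds a single model over the whole hierarchy.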

SUBJECT(S)

data mining; hierarchical classification; machine learning; bioinformatics
