CobMiner - Mineração de Padrões Arborescentes com Restrições (CobMiner: Tree Pattern Mining with Constraints)
AUTHOR(S)
Nyara de Araújo Silva
PUBLICATION DATE
2007
ABSTRACT
Most work on pattern mining focuses on simple data structures such as itemsets or sequences of itemsets. However, many recent applications dealing with complex data, such as chemical compounds, protein structures, social networks, XML, and Web log databases, require much more sophisticated data structures (trees or graphs) for their specification. Here, interesting patterns involve not only frequent object values (labels) appearing in the trees (or graphs) but also frequent specific topologies found in these structures. Mining frequent tree patterns has been extensively studied, motivated by increasing interest and applicability in different areas (Web mining, bioinformatics, etc.). However, conventional tree mining systems normally consider only the minimum support criterion as a mechanism for filtering the patterns to be mined. After the mining process, considerable effort is required to filter the patterns that match user interests. In this dissertation, we propose CobMiner (Constraint-based Miner), a tree pattern mining algorithm that incorporates tree automata into the mining process in order to restrict the mining scope and to generate frequent patterns more closely related to user interests. We compare two methods for introducing user constraints into the discovery process: the first is CobMiner, which incorporates tree automata constraints as an intra-mining mechanism; the second is TreeMinerPP, which consists of a well-known tree pattern mining algorithm, TreeMiner, followed by a post-processing phase in which patterns are filtered using a tree automaton. An extensive set of experiments executed over synthetic and real data (XML documents) allows us to conclude that incorporating constraints during the mining process is far more effective than filtering the frequent and interesting patterns after the mining process.
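As a rough illustration of the post-processing idea described in the abstract, the sketch below checks candidate tree patterns against a small bottom-up tree automaton and keeps only the accepted ones. It is not the thesis's implementation: the class name TreeAutomaton, the tuple encoding of trees, and the labels "book", "title", and "author" are all assumptions made for this example. An intra-mining approach such as CobMiner would apply the constraint while candidates are generated, which this sketch does not attempt to reproduce.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical encoding: a tree is (label, (child, child, ...)).
Tree = Tuple[str, tuple]


class TreeAutomaton:
    """Deterministic bottom-up tree automaton (illustrative only)."""

    def __init__(
        self,
        transitions: Dict[Tuple[str, Tuple[str, ...]], str],
        accepting: Set[str],
    ) -> None:
        # transitions maps (node label, states of children) -> state
        self.transitions = transitions
        self.accepting = accepting

    def state_of(self, tree: Tree) -> Optional[str]:
        """Return the state reached at the root, or None if no rule applies."""
        label, children = tree
        child_states = []
        for child in children:
            state = self.state_of(child)
            if state is None:
                return None  # a subtree is rejected, so the whole tree is
            child_states.append(state)
        return self.transitions.get((label, tuple(child_states)))

    def accepts(self, tree: Tree) -> bool:
        return self.state_of(tree) in self.accepting


# Assumed constraint: a "book" whose first child is a "title",
# optionally followed by an "author".
automaton = TreeAutomaton(
    transitions={
        ("title", ()): "q_title",
        ("author", ()): "q_author",
        ("book", ("q_title",)): "q_book",
        ("book", ("q_title", "q_author")): "q_book",
    },
    accepting={"q_book"},
)

# Hypothetical frequent patterns, as a miner might return them.
patterns: List[Tree] = [
    ("book", (("title", ()),)),
    ("book", (("author", ()),)),
    ("book", (("title", ()), ("author", ()))),
    ("title", ()),
]

# Post-processing step: keep only patterns the automaton accepts.
filtered = [p for p in patterns if automaton.accepts(p)]
```

In this sketch every pattern is mined first and discarded afterwards if it fails the constraint, which is the cost that pushing the automaton into the mining process aims to avoid.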
SUBJECT(S)
tree pattern mining; XML document mining; XML mining; data mining (computing); tree automata; web mining; frequent pattern discovery; constraint-based data mining; computer science
ARTICLE ACCESS
http://www.bdtd.ufu.br//tde_busca/arquivo.php?codArquivo=1588
RELATED DOCUMENTS
- CobMiner - Mineração de Padrões Arborescentes com Restrições
- VISTREE: uma linguagem visual para análise de padrões arborescentes e para especificação de restrições em um ambiente de mineração de árvores
- Mineração de padrões seqüênciais múltiplos
- Mineração de padrões no gênero textual blog
- Descoberta de padrões de alarme redundantes com técnicas de mineração de dados e redes complexas