Hybrid framework for tool integration and knowledge reuse in binary data mining problems

AUTHOR(S)
PUBLICATION DATE

2009

ABSTRACT

Data Mining emerged from the need to extract knowledge from the massive quantities of data generated by companies and institutions. As the field has grown and computer processing power has increased, organizations providing KDD (Knowledge Discovery in Databases) services have accumulated a large number of documents and files related to past projects. At the same time, developing a Data Mining project today requires the specialist to combine a variety of tools, programming languages and methodologies with their own experience in order to solve the problem. One of the biggest practical problems in KDD is how to provide interoperability among the different existing platforms, so that the processes remain centralized and documented in a single environment. Another major problem is the lack of knowledge reuse, caused by the complexity of the process and its heavy dependence on the user. As a result, the experience acquired in previous projects is not properly documented, managed or controlled, and the errors of previous projects are repeated. In other words, there is a lack of platforms able to reuse the knowledge acquired in past projects. The main objective of this work is to create a hybrid framework for developing Data Mining solutions that integrates several available tools and provides an integrated environment for knowledge reuse in KDD. This environment centralizes and standardizes the artifacts generated along the KDD process, and takes advantage of the best features of each tool available on the market. To validate the framework, metadata were collected for 69 real data mining projects, together with 61 lessons learned from the professionals who worked on those projects and 654 bodies of knowledge (conferences, software, publications, etc.) about KDD.
The studies presented, mainly those on the initial project definition, showed that the framework makes it possible to understand the characteristics that led projects to succeed or fail. Thus, the framework is an environment that ensures the development of high-quality KDD projects, meeting customer expectations within the anticipated time and budget.

SUBJECT(S)

data mining; framework; knowledge discovery in databases (KDD); knowledge reuse; meta data mining; computer science; KDD process; interoperability
