Particle Swarm Applied to Attribute Selection

Viviane Dal Molin de Souza

Abstract: The particle swarm optimization (PSO) algorithm is a recently developed metaheuristic that belongs to the category of swarm intelligence techniques. Swarm intelligence concepts are inspired by the social behavior of animals that move in groups, such as flocks of birds, ant colonies, and schools of fish. PSO is a population-based algorithm that uses a population of individuals to probe promising regions of the search space. The behavior of each individual is influenced by either the local-best or the global-best individual. As in evolutionary algorithms, the performance of each individual is measured by a fitness function. The population is referred to as a swarm and the individuals are called particles. The particles move through a multi-dimensional search space with an adaptable velocity. In PSO, each particle remembers the best position it has visited in the past, as well as the best position ever attained by the particles of the swarm. This property helps the particles search the multi-dimensional space faster. PSO has proved useful in a wide variety of optimization tasks in many fields, such as nonlinear function optimization, artificial neural network training, fuzzy system design, scheduling problems, and the traveling salesman problem, among others. Because it tends to converge quickly, PSO is also used to solve multi-objective optimization problems. Many problems involve the simultaneous optimization of multiple objectives that often compete with one another. In a multi-objective optimization problem, there may be no single solution that is best with respect to all objectives. Usually, the aim is to determine the tradeoff surface, which is a set of nondominated solution points known as Pareto-optimal or noninferior solutions. Since none of the solutions in the nondominated set is absolutely better than any other, any one of them is an acceptable solution. Choosing one solution over another requires knowledge of the problem and depends on a number of problem-related factors.
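The particle update described above can be illustrated with a minimal single-objective sketch. It assumes the common global-best PSO variant with an inertia weight; the parameter values (`w`, `c1`, `c2`), the search bounds, and the `sphere` test function are illustrative assumptions and are not taken from the dissertation, whose own approach is multi-objective and uses integer variables.

```python
import random


def pso_minimize(fitness, dim, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0, seed=0):
    """Global-best PSO sketch; all default parameter values are assumptions."""
    rng = random.Random(seed)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers the best position it has visited so far.
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    # The swarm remembers the best position attained by any particle.
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Illustrative test function with its minimum (0) at the origin."""
    return sum(v * v for v in x)


best, best_val = pso_minimize(sphere, dim=2)
```

The fixed seed only makes the sketch reproducible; in practice the swarm would be run with several seeds and the results compared.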
Recently, many investigations have been undertaken to apply PSO approaches to Knowledge Discovery in Databases (KDD) procedures. A KDD procedure can comprise the following steps: database selection, attribute selection, data pre-processing, data mining, and data post-processing. The objective of this dissertation is attribute selection using a PSO approach based on multi-objective optimization and integer variables, for both the selection and the evaluation of the selected attributes. The dissertation also presents an analysis of the influence of attribute selection on data mining tasks. The proposed attribute selection method, based on a multi-objective PSO approach, was evaluated on ten databases obtained from the UCI Machine Learning Repository (University of California, Irvine). In this context, the multi-objective problem solved by PSO considered two objectives: i) minimization of the error rate and ii) minimization of the size of the trees produced by the C4.5 algorithm. In addition, the C4.5 algorithm was defined as the comparison criterion for the solutions found by the multi-objective PSO approach. Simulation results showed that the proposed PSO approach found better solutions than the full solution (the solution based on all attributes) in six of the ten UCI databases. Moreover, the quality of the results obtained by PSO was similar to that of the full solution in two databases, while in the other two databases the solutions found by PSO were worse than the full solution.
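The notion of a nondominated set for the two objectives named above (error rate and tree size, both minimized) can be sketched as follows. The candidate values are invented purely for illustration and are not results from the dissertation; the function names `dominates` and `nondominated` are likewise hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def nondominated(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (error_rate, tree_size) pairs, one per candidate attribute subset.
candidates = [(0.10, 25), (0.12, 15), (0.10, 30), (0.20, 9), (0.15, 15)]
front = nondominated(candidates)
print(front)  # [(0.1, 25), (0.12, 15), (0.2, 9)]
```

Here (0.10, 30) is dropped because (0.10, 25) has the same error rate with a smaller tree, and (0.15, 15) is dropped because (0.12, 15) has a lower error rate at the same size. The three remaining points trade error against size, so picking among them requires the problem knowledge mentioned in the abstract.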
