Particle swarm optimization

﻿
Particle swarm optimization

Particle swarm optimization (PSO) is a swarm intelligence based algorithm to find a solution to an optimization problem in a search space, or model and predict social behavior in the presence of objectives.

Overview

Particle swarm optimization is a stochastic, population-based evolutionary computer algorithm for problem solving. It is a kind of swarm intelligence that is based on social-psychological principles and provides insights into social behavior, as well as contributing to engineering applications. The particle swarm optimization algorithm was first described in 1995 by James Kennedy and Russell C. Eberhart. The techniques have evolved greatly since then, and the original version of the algorithm is barely recognizable in the current ones.

Social influence and social learning enable a person to maintain cognitive consistency. People solve problems by talking with other people about them, and as they interact their beliefs, attitudes, and behaviors change; the changes could typically be depicted as the individuals moving toward one another in a sociocognitive space.

The particle swarm simulates this kind of social optimization. A problem is given, and some way to evaluate a proposed solution to it exists in the form of a fitness function. A communication structure or social network is also defined, assigning neighbors for each individual to interact with. Then a population of individuals defined as random guesses at the problem solutions is initialized. These individuals are candidate solutions. They are also known as the particles, hence the name particle swarm. An iterative process to improve these candidate solutions is set in motion. The particles iteratively evaluate the fitness of the candidate solutions and remember the location where they had their best success. The individual's best solution is called the particle best or the local best. Each particle makes this information available to their neighbors. They are also able to see where their neighbors have had success. Movements through the search space are guided by these successes, with the population usually converging, by the end of a trial, on a problem solution better than that of non-swarm approach using the same methods.

The swarm is typically modelled by particles in multidimensional space that have a position and a velocity. These particles fly through hyperspace (i.e., $mathbb\left\{R\right\}^n$) and have two essential reasoning capabilities: their memory of their own best position and knowledge of the global or their neighborhood's best. In a minimization optimization problem, "best" simply meaning the position with the smallest objective value. Members of a swarm communicate good positions to each other and adjust their own position and velocity based on these good positions. So a particle has the following information to make a suitable change in its position and velocity:
* A global best that is known to all and immediately updated when a new best position is found by any particle in the swarm.
* Neighborhood best that the particle obtains by communicating with a subset of the swarm.
* The local best, which is the best solution that the particle has seen.

The particle position and velocity update equations in the simplest form that govern the PSO are given by
* $v_\left\{i,j\right\} leftarrow c_0 v_\left\{i\right\} + c_1 r_1 \left( global best_j - x_\left\{i,j\right\} \right) + c_2 r_2 \left( local best_\left\{i,j\right\} - x_\left\{i,j\right\} \right) + c_3 r_3 \left( neighborhood best_j - x_\left\{i,j\right\} \right)$
* $x_\left\{i,j\right\} leftarrow x_\left\{i,j\right\} + v_\left\{i,j\right\}$

As the swarm iterates, the fitness of the global best solution improves (decreases for minimization problem). It could happen that all particles being influenced by the global best eventually approach the global best, and there on the fitness never improves despite however many runs the PSO is iterated thereafter. The particles also move about in the search space in close proximity to the global best and not exploring the rest of search space. This phenomenon is called 'convergence'. If the inertial coefficient of the velocity is small, all particles could slow down until they approach zero velocity at the global best. The selection of coefficients in the velocity update equations affects the convergence and the ability of the swarm to find the optimum. One way to come out of the situation is to reinitialize the particles positions at intervals or when convergence is detected.

Some research approaches investigated the application of constriction coefficients and inertia weights. Numerous techniques for preventing premature convergence. The introduction of the fully informed particle swarm. Many variations on the social network topology, parameter-free, fully adaptive swarms, and some highly simplified models have been created. The algorithm has been analyzed as a dynamical system, and has been used in hundreds of engineering applications; it is used to compose music, to model markets and organizations, and in art installations.

A basic, canonical PSO algorithm

The algorithm presented below uses the global best and local bests but no neighborhood bests. Neighborhood bests allow parallel exploration of the search space and reduce the susceptibility of PSO to falling into local minima, but slow down convergence speed. Note that neighborhoods merely slow down the proliferation of new bests, rather than creating isolated subswarms because of the overlapping of neighborhoods: to make neighborhoods of size 3, say, particle 1 would only communicate with particles 2 through 5, particle 2 with 3 through 6, and so on. But then a new best position discovered by particle 2's neighborhood would be communicated to particle 1's neighborhood at the next iteration of the PSO algorithm presented below. Smaller neighborhoods lead to slower convergence, while larger neighborhoods to faster convergence, with a global best representing a neighborhood consisting of the entire swarm. The tendency is now to use partly random neighborhoods (see Standard PSO on the Particle Swarm Central).

A single particle by itself is unable to accomplish anything. The power is in interactive collaboration.

Let $f : mathbb\left\{R\right\}^m ightarrow mathbb\left\{R\right\}$ be the fitness function that takes a particle's solution with several components in higher dimensional space and maps it to a single dimension metric. Let there be $n$ particles, each with associated positions $mathbf\left\{x\right\}_i in mathbb\left\{R\right\}^m$ and velocities $mathbf\left\{v\right\}_i in mathbb\left\{R\right\}^m$, $i = 1, ldots, n$. Let $hat\left\{mathbf\left\{x_i$ be the current best position of each particle and let $hat\left\{mathbf\left\{g$ be the global best.

* Initialize $mathbf\left\{x\right\}_i$ and $mathbf\left\{v\right\}_i$ for all $i$. One common choice is to take $mathbf\left\{x\right\}_\left\{ij\right\} in U \left[a_j, b_j\right]$ and $mathbf\left\{v\right\}_i = mathbf\left\{0\right\}$ for all $i$ and $j = 1, ldots, m$, where $a_j, b_j$ are the limits of the search domain in each dimension, and $U$ represents the Uniform distribution (continuous).
* $hat\left\{mathbf\left\{x_i leftarrow mathbf\left\{x\right\}_i$ and $hat\left\{mathbf\left\{g leftarrow argmin_\left\{mathbf\left\{x\right\}_i\right\} f\left(mathbf\left\{x\right\}_i\right), i = 1, ldots, n$.

* While not converged:
** For each particle $1 leq i leq n$:
*** Create random vectors $mathbf\left\{r\right\}_\left\{1\right\}$, $mathbf\left\{r\right\}_\left\{2\right\}$: $mathbf\left\{r\right\}_\left\{1 j\right\}$ and $mathbf\left\{r\right\}_\left\{2 j\right\}$ for all $j$,by taking $mathbf\left\{r\right\}_\left\{1 j\right\},mathbf\left\{r\right\}_\left\{2 j\right\} in U \left[0, 1\right]$ for $j = 1, ldots, m$
*** Update the particle positions: $mathbf\left\{x\right\}_i leftarrow mathbf\left\{x\right\}_i+mathbf\left\{v\right\}_i$.
*** Update the particle velocities: $mathbf\left\{v\right\}_i leftarrow \left\{omega\right\}mathbf\left\{v\right\}_i+c_1mathbf\left\{r\right\}_1circ\left(hat\left\{mathbf\left\{x_i-mathbf\left\{x\right\}_i\right)+c_2mathbf\left\{r\right\}_2circ\left(hat\left\{mathbf\left\{g-mathbf\left\{x\right\}_i\right)$.
*** Update the local bests: If $f\left(mathbf\left\{x\right\}_i\right) < f\left(hat\left\{mathbf\left\{x_i\right)$, $hat\left\{mathbf\left\{x_i leftarrow mathbf\left\{x\right\}_i$.
*** Update the global best If $f\left(mathbf\left\{x\right\}_i\right) < f\left(hat\left\{mathbf\left\{g\right)$, $hat\left\{mathbf\left\{g leftarrow mathbf\left\{x\right\}_i$.
* $hat\left\{mathbf\left\{g$ is the optimal solution with fitness $f\left(hat\left\{mathbf\left\{g\right)$.

Note the following about the above algorithm:
* $omega$ is an inertial constant. Good values are usually slightly less than 1.
* $c_1$ and $c_2$ are constants that say how much the particle is directed towards good positions. They represent a "cognitive" and a "social" component, respectively, in that they affect how much the particle's personal best and the global best (respectively) influence its movement. Usually we take $c_1, c_2 approx 2$.
* $mathbf\left\{r\right\}_1, mathbf\left\{r\right\}_2$ are two random vectors with each component generally a uniform random number between 0 and 1.
* $circ$ operator indicates element-by-element multiplication i.e. the Hadamard matrix multiplication operator.

*** Note that there is a misconception arising from the tendency to write the velocity formula in a "vector notation" (see for example D.N. Wilke's papers). The original intent (see M.C.'s "Particle Swarm Optimization, 2006") was to multiply a NEW random component per dimension, rather than multiplying the same component with each dimension per particle. Moreover, r1 and r2 are supposed to consist of a single number, defined as Cmax, which normally has a relationship with omega (defined as C1 in the literature) through a transcendental function, given the value 'phi': C1 = 1.0 / (phi - 1.0 + (v_phi * v_phi) - (2.0 * v_phi)) - and - Cmax = C1 * phi. Optimal "confidence coefficients" are therefore approximately in the ratio scale of C1=0.7 and Cmax=1.43. The Pseudo code shown below however, describes the intent correctly - mishka

Pseudo code

Here follows a pseudo code example of the basic PSO.Note that the random vectors $mathbf\left\{r\right\}_1, mathbf\left\{r\right\}_2$are implemented as scalars inside the dimension loop which is equivalentto the mathematical description given above.

// Initialize the particle positions and their velocities for I = 1 to number of particles n do for J = 1 to number of dimensions m do X [I] [J] = lower limit + (upper limit - lower limit) * uniform random number V [I] [J] = 0 enddo enddo // Initialize the global and local fitness to the worst possible fitness_gbest = inf; for I = 1 to number of particles n do fitness_lbest [I] = inf enddo // Loop until convergence, in this example a finite number of iterations chosen for k = 1 to number of iterations t do // evaluate the fitness of each particle fitness_X = evaluate_fitness(X) // Update the local bests and their fitness for I = 1 to number of particles n do if (fitness_X(I) < fitness_lbest(I)) fitness_lbest [I] = fitness_X(I) for J = 1 to number of dimensions m do X_lbest [I,J] = X(I,J) enddo endif enddo // Update the global best and its fitness [min_fitness, min_fitness_index] = min(fitness_X(I)) if (min_fitness < fitness_gbest) fitness_gbest = min_fitness for J = 1 to number of dimensions m do X_gbest [J] = X(min_fitness_index,J) enddo endif // Update the particle velocity and position for I = 1 to number of particles n do for J = 1 to number of dimensions m do R1 = uniform random number R2 = uniform random number V [I] [J] = w*V [I] [J] + C1*R1*(X_lbest [I] [J] - X [I] [J] ) + C2*R2*(X_gbest [J] - X [I] [J] ) X [I] [J] = X [I] [J] + V [I] [J] enddo enddo enddo

Discussion

By studying this algorithm, we see that we are essentially carrying out something like a discrete-time simulation where each iteration of it represents a "tick" of time. The particles "communicate" information they find about each other by updating their velocities in terms of local and global bests. When a new best is found, the particles will change their positions accordingly so that the new information is "broadcast" to the swarm. The particles are always drawn back both to their own personal best positions and also to the best position of the entire swarm. They also have stochastic exploration capability via the use of the random multipliers $r_1, r_2$. The vector, floating-point nature of the algorithm suggests that high-performance implementations could be created that take advantage of modern hardware extensions pertaining to vectorization, such as Streaming SIMD Extensions and Altivec.

Typical convergence conditions include reaching a certain number of iterations, reaching a certain fitness value, and so on.

Variations and practicalities

There are a number of considerations in using PSO in practice; one might wish to clamp the velocities to a certain maximum amount, for instance. The considerable adaptability of PSO to variations and hybrids is seen as a strength over other robust evolutionary optimization mechanisms, such as genetic algorithms. For example, one common, reasonable modification is to add a probabilistic bit-flipping local search heuristic to the loop. Normally, a stochastic hill-climber risks getting stuck at local maxima, but the stochastic exploration and communication of the swarm overcomes this. Thus, PSO can be seen as a basic search "workbench" that can be adapted as needed for the problem at hand.

Note that the research literature has uncovered many heuristics and variants determined to be better with respect to convergence speed and robustness, such as clever choices of $omega$, $c_i$, and $r_i$. There are also other variants of the algorithm, such as discretized versions for searching over subsets of $mathbb\left\{Z\right\}^n$ rather than $mathbb\left\{R\right\}^n$. There has also been experimentation with coevolutionary versions of the PSO algorithm with good results reported. Very frequently the value of $omega$ is taken to decrease over time; e.g., one might have the PSO run for a certain number of iterations and DECREASE linearly from a starting value (0.9, say) to a final value (0.4, say) in order to facilitate exploitation over exploration in later states of the search. The literature is full of such heuristics. In other words, the canonical PSO algorithm is not as strong as various improvements which have been developed on several common function optimization benchmarks and consulting the literature for ideas on parameter choices and variants for particular problems is likely to be helpful.

Significant, non-trivial modifications have been developed for multi-objective optimization, versions designed to find solutions satisfying linear or non-linear constraints, as well as "niching" versions designed to find multiple solutions to problems where it is believed or known that there are multiple global minima which ought to be located.

There is also a modified version of the algorithm called repulsive particle swarm optimization, in which a new factor, called repulsion, is added to the basic algorithm step.

Applications

Although a relatively new paradigm, PSO has been applied to a variety of tasks, such as the training of artificial neural networks and for finite element updating. Very recently, PSO has been applied in combination with grammatical evolution to create a hybrid optimization paradigm called "grammatical swarms".

* Repulsive particle swarm optimization
* Ant colony optimization
* Optimization (mathematics)
* Differential evolution

References

* J. Kennedy and R. C. Eberhart. "Swarm Intelligence." Morgan Kaufmann. 2001

* M. Clerc. "Particle Swarm Optimization." ISTE, 2006.

* D. N. Wilke, S. Kok, and A. A. Groenwold, "Comparison of linear and classical velocity update rules in particle swarm optimization: notes on diversity", International Journal for Numerical Methods in Engineering, Vol. 70, No. 8, pp. 962&ndash;984, 2007.

* A. Chatterjee, P. Siarry, "Nonlinear inertia variation for dynamic adaptation in particle swarm optimization", Computers and Operations Research, Vol. 33, No. 3, pp. 859&ndash;871, 2006.

* A. P. Engelbrecht. "Fundamentals of Computational Swarm Intelligence." Wiley, 2005. [http://si.cs.up.ac.za/]

* D. N. Wilke. "Analysis of the particle swarm optimization algorithm", Master's Dissertation, University of Pretoria, 2005. [http://upetd.up.ac.za/thesis/available/etd-01312006-125743/]

* T. Marwala. Finite element model updating using particle swarm optimization. International Journal of Engineering Simulation, 2005, 6(2), pp. 25-30. ISSN: 1468-1137.

* M. Clerc, and J. Kennedy, "The Particle Swarm-Explosion, Stability, and Convergence in a Multidimensional Complex Space", IEEE Transactions on Evolutionary Computation, 2002, 6, 58-73

* J. Kennedy, and R. Eberhart, "Particle swarm optimization", in Proc. of the IEEE Int. Conf. on Neural Networks, Piscataway, NJ, pp. 1942&ndash;1948, 1995.

* [http://www.particleswarm.info Particle Swarm Central] . News, people, places, programs, papers, etc. See in particular the current Standard PSO.

* [http://paradiseo.gforge.inria.fr/ ParadisEO is a powerful C++ framework dedicated to the reusable design of metaheuristics including PSO algorithms] . Ready-to-use algorithms, many tutorials to easily implement your PSO.

* [http://www1.webng.com/economics FORTRAN Codes Particle Swarm Optimization] Performance on Benchmark functions

* [http://jswarm-pso.sourceforge.net JSwarm-PSO Particle swarm optimization package]

* [http://search.cpan.org/~kylesch/AI-PSO-0.86/lib/AI/PSO.pm Perl PSO Module]

* [http://gecco.org.chemie.uni-frankfurt.de/PsoVis/index.html Java Applet for 3D-visualisation of PSO]

* [http://www.projectcomputing.com/resources/psovis/index.html Java Applet 3D-visualisation of PSO with source code]

* [http://cilib.sourceforge.net CILib] - GPLed computational intelligence simulation and research environment written in Java, includes various PSO implementations

* [http://www.cs.up.ac.za/cs/fvdbergh/publications/phd_thesis.ps.gz An Analysis of Particle Swarm Optimizers] - F. van den Bergh 2002

Wikimedia Foundation. 2010.

Look at other dictionaries:

• Repulsive particle swarm optimization — In mathematics, specifically in optimization, repulsive particle swarm optimization (RPSO) is a global optimization algorithm. It belongs to the class of stochastic evolutionary global optimizers, and is a variant of particle swarm optimization… …   Wikipedia

• Swarm intelligence — (SI) is artificial intelligence based on the collective behavior of decentralized, self organized systems. The expression was introduced by Gerardo Beni and Jing Wang in 1989, in the context of cellular robotic systems [ Beni, G., Wang, J. Swarm… …   Wikipedia

• Optimization (mathematics) — In mathematics, the term optimization, or mathematical programming, refers to the study of problems in which one seeks to minimize or maximize a real function by systematically choosing the values of real or integer variables from within an… …   Wikipedia

• Ant colony optimization algorithms — Ant behavior was the inspiration for the metaheuristic optimization technique. In computer science and operations research, the ant colony optimization algorithm (ACO) is a probabilistic technique for solving computational problems which can be… …   Wikipedia

• Meta-optimization — concept. In numerical optimization, meta optimization is the use of one optimization method to tune another optimization method. Meta optimization is reported to have been used as early as in the late 1970s by Mercer and Sampson [1] for finding… …   Wikipedia

• Ant colony optimization — The ant colony optimization algorithm (ACO), introduced by Marco Dorigo in 1992 in his PhD thesis, is a probabilistic technique for solving computational problems which can be reduced to finding good paths through graphs. They are inspired by the …   Wikipedia

• Category:Optimization algorithms — An optimization algorithm is an algorithm for finding a value x such that f(x) is as small (or as large) as possible, for a given function f, possibly with some constraints on x. Here, x can be a scalar or vector of continuous or discrete values …   Wikipedia

• Global optimization — is a branch of applied mathematics and numerical analysis that deals with the optimization of a function or a set of functions to some criteria. General The most common form is the minimization of one real valued function f in the parameter space …   Wikipedia

• Mathematical optimization — For other uses, see Optimization (disambiguation). The maximum of a paraboloid (red dot) In mathematics, computational science, or management science, mathematical optimization (alternatively, optimization or mathematical programming) refers to… …   Wikipedia

• Multidisciplinary design optimization — Multi disciplinary design optimization (MDO) is a field of engineering that uses optimization methods to solve design problems incorporating a number of disciplines. As defined by Prof. Carlo Poloni, MDO is the art of finding the best compromise …   Wikipedia