## Tuesday, March 18, 2014

### Gaussian Perturbation Optimization

This type of optimization I'll be discussing is meant to be good for very quickly finding a close to optimal solution to an optimization problem, and not suffer from becoming stuck at a local optimum.
As an example problem I'll take trying to maximize the total pairwise distance between a collection of points on a sphere by moving the points anywhere on the sphere.
These are in spherical coordinates
points = [[0,0],[1.5,0],[3,0], [2, 1], [3,2], [1,5]]
My algorithm does the following:
1. Perturb* the points to get newpoints
2. If the perturbed points have a higher pairwise total distance make points=newpoints
3. repeat the above over a certain number of iterations, I'll use 1000.

*Perturb: By perturb I mean to change the value of a variable to a new value determined by a Gaussian distribution with mean of the original value and deviation in this case .1.
After 1000 iterations this algorithm gives (rounded to the nearest 100th:
final = [[2.98, 0.42], [0.02, 5.64], [0.027, 5.88], [0.16, 0.22], [0.05, 2.84], [3.12, 5.62]]
with a total pairwise sum of distances of 27.66
It is known that the best that can be done is vertices of an octahedron which has a total pairwise sum of distances of: 28.26, so this algorithm came within 2% of the best possible value in much less than a second in python.

**Reasoning behind the algorithm**
The reason the algorithm perturbs the points according to a gaussian distribution is that it will generally look at values for the variables close to where the last good value was but at the same time it will occasionally try values quite a ways from the last best value, this has the effect of both generally fine-tuning the best known value but also avoids getting "stuck" in local maximums.

It's possible to run the algorithm for longer like when I run it for 1,000,000 iterations it came to 28.1 out of 28.2 possible, but most of the improvement is early on so I think it's best used to get close quickly.

It's kind of interesting to look more in depth, starting from the exact worse solution:
points = [[0,0],[0,0],[0,0], [0, 0], [0,0], [0,0]]
26.9520852441
1 after just one perturbation
27.0974971069
4 after four
27.4152852654
30
27.5816941025
291
27.6328574385
361
27.7552858729
432
27.7695129873
646
27.9358914081
751
27.9447444767
2187
27.9689427837
2193
27.9997490744
3163
28.0452579514
7390
28.0455509183
79179
28.0788724788
100877
28.0916073445
300182
28.1013096908
327265
28.1079556186
339103
28.142734259
563316
[[3.141504678216728, 5.578547112469853], [0.0035838527843349106, 6.106448209218512], [3.1264686990942385, 0.27331447965308603], [0.006630475839969886, 5.600804904080315], [0.0379240041525053, 0.3160069253736454], [3.13340891247032, 6.14277187932977]]

So there is a very long tail of improvements but they get more and more expensive as far as iterations.