Sunday, February 16, 2014

Parallelizable sorting algorithm (Array Sort)


**I noticed after review a counter example, and that this may only work for square matrices, the change from what I write below would be always to use a square matrix with null elements to pad it out.
There are a lot of sorting algorithms, some provably as fast on one processor as possible. But lately with the possibility of gpu and multiple core cpu architectures it is interesting to find one that can easily be split across the individual compute units acting independently. This one I call array sort which I think fits the need.
1. Start with your list of numbers or whatever is to be sorted, this is the data for my example:
In this example there are 20 items, but there can be any number. The idea for the next step is to fill an array that is as square as possible for you items.Here that would be a 4x5 array, if there had been 21 items we could use either a 4x6 array or 5x5 and fill the extra spaces with null items that would be discarded at the end. 
Now the next step is to sort each row. Since each row is sorted independently this step can be distributed across many processors for large lists. 
Now the next step is to sort by columns, this also can be distributed across many compute units. 
The third step is to sort by diagonals, where by diagonals I mean as in the linear algebra sense that each diagonal wraps around, for example the third diagonal is 3,12,1,3,15 because the third item in row one is three the fourth item in row 2 is twelve, and we wrap around to the 1 in the third row and first column and continue. This step can also be parralelized across man processors for large lists. 
Now the array can be read off by rows in sorted order. I haven't seen a case where this wasn't already sorted but if it turns out this isn't enough maybe another step where sort the top right to bottom left diagonals could be added. 
The algorithm can also be used recursively, if when we originally had a matrix that was very large, we can use the array algorithm on each row to get each of those in order etc...
*Proof*
This is my best attempt so far...
Suppose in our array we follow the item starting in cell [i,j] and the item starting in cell [k,l]. If they are in the same row then the larger will be put further to the right. And further sorting by column and then diagonal won't move the lesser further to the right unless it is to a row further up. If the two values don't start on the same row then they may be on the same column, in which case it will be put on a higher row and sorting by diagonals won't move the lesser to  a row further down than the greater because it would have to be greater than the number in its diagonal on the same row as the greater number but it is clearly less than that number after sorting by rows and columns.  If they start in different columns and different rows they may be on the same diagonal which will be sorted so the greater is on a further down row and further column to the right. If they aren't on the same diagonal, the lesser can't end up on a row farther down than the greater because it's been sorted by columns, and it can only move further to the right if it's on a lesser row because it's been sorted by rows. Since this is true for an arbitrary pair of numbers all the numbers will be arranged so that a number that's greater will be either on the same row and to the right of lesser numbers or on a row further down. Hence the numbers are sorted read across by rows. 
**Further idea**
Because of the recursive nature it would be sufficient if I prove by exhaustion that it works for a certain size matrix like say 4x4, then the problem could be recursively cut down until it became solving many 4x4 matrices.

No comments:

Post a Comment