Wednesday, December 26, 2012

Kinetic Hash

There are a lot of times in computer science you want to "hash" a certain piece of information. The important thing about a hash is it always produces the same output for a given input, and it's really hard to go from the output and figure out what the input was. Also sometimes you'd like the amount of computation to be very little so you can generate them quickly, other times you'd like it to be expensive to compute so someone can't just check every possible input rapidly to figure it out. I think this one does a very good job at all those criteria.
 For example imagine you had this data:
(14)(5)(12)(7) (3)(8)(8)(15) (2)(1)(5)(12) (5)(3)(8)(7) (7)(10)(4)(2)
This data can be in pretty much any form for this hash as you'll see but I'm assuming base 16 numbers in groups of 4.
The way this hash works is you imagine numbered balls on a grid with velocity vectors, the first two numbers of each group of four in the above give the x,y coordinates, the second two give the velocity vector from 0 to 16 being from -8 to 8 squares in the x direction and 0 to 16 being -8 to 8 squares in the y direction. I've shortened the vectors here for clarity. If two end up on the same square simply shift the second ball a certain amount in the x and y direction until you find an empty square :
Then the algorithm simply runs the kinetic simulation constrained to the integers. The balls can bounce off each other and change direction or off the "walls". This can be iterated a certain number of times or the number of "seconds" of the physics simulation. After say 10,000 seconds which can be done in an instant on a computer, the output is the x and y coordinate but not the velocity vector of the first n balls where 2*n is the desired size of the output. If we wanted 10 hashed numbers we'd use all 5. 
(15)(2)(2)(3)(5)(11)(16)(8)(5)(2)
The velocities are left off to make it untraceable going in the reverse direction. So this process is deterministic but really chaotic. After a certain time period there is really just a uniform probability of a ball being anywhere on the grid. And whether you want this to take a lot of cpu time or a little you would just vary the amount of time the simulation is run. 



No comments:

Post a Comment