analytics

Sunday, December 30, 2012

Returning 1 if two parametric functions cross

I found this equation which, given two parametric curves, one in a parameter t and one in a parameter u, each with x and y components, evaluates to 1 if the two curves are ever equal, or in other words if they cross:
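Heaviside( int( int( Dirac(x1(t) - x2(u)) * Dirac(y1(t) - y2(u)), u = c..d ), t = a..b ) )

(Here the first curve is written as (x1(t), y1(t)) for t = a..b and the second as (x2(u), y2(u)) for u = c..d; that's my reading of the description below.)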
For instance:
It equals 1 because the functions cross at some point:

The proof is:
Inside the integral are two Dirac functions, set up so that they are both Dirac(0) only when x1(t) = x2(u) and y1(t) = y2(u). The Dirac function is only non-zero (infinite) at Dirac(0), and the product of the two will only be non-zero where both are. Hence the integral over the domain of u and t will be greater than 0 if the two parametric curves are ever equal. The Heaviside function returns 1 when its argument is greater than 0, so the whole expression is 1 when the parametric curves are equal somewhere over the two intervals and 0 otherwise.
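As a quick numerical sketch of the idea (not the Maple computation itself): approximate each Dirac by a narrow Gaussian and take the double integral on a grid. The example curves, intervals, width eps, and grid size below are all just choices made for illustration.

import numpy as np

def dirac_approx(z, eps=0.01):
    # Nascent delta: a narrow Gaussian that integrates to 1.
    return np.exp(-z**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))

def crosses(x1, y1, x2, y2, a, b, c, d, n=800, eps=0.01):
    t = np.linspace(a, b, n)
    u = np.linspace(c, d, n)
    T, U = np.meshgrid(t, u, indexing="ij")
    integrand = dirac_approx(x1(T) - x2(U), eps) * dirac_approx(y1(T) - y2(U), eps)
    dt = (b - a) / (n - 1)
    du = (d - c) / (n - 1)
    integral = integrand.sum() * dt * du
    # Heaviside step applied to the (approximate) integral.
    return 1 if integral > 1e-6 else 0

# The parabola (t, t^2) and the line (u, 2 - u) cross, so this prints 1.
print(crosses(lambda t: t, lambda t: t**2,
              lambda u: u, lambda u: 2 - u,
              -3, 3, -3, 3))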

This one was kind of frustrating to develop, because this is one of the forms I tried first when I had the idea, but I had a small mistake in it so I thought it didn't work. I then spent several days trying all kinds of different things until finally I tried it again and got it to work. It can be adapted to other types of equations. In practice it takes Maple a second to evaluate the integral, so it might be faster to just solve the equations, but maybe the form of this is useful or interesting beyond the strict application of it.
You might wonder why I didn't do it in one of several other ways, like taking the Dirac of the distance function between the two curves... It turned out that those were impossible for Maple to solve, while this one works well.



Friday, December 28, 2012

Picture of Rose

For a break from the math for a minute I did this picture of my parents' dog Rose. I did the red, green, and blue channels separately and then combined them to see how it would work.
So the way I did this was I wrote a program that looks at the red, green, and blue components and rounds each one to the nearest of 8 shades to make these:

Then I paint over each one to smooth out all the speckly borders you can see, for instance, in the image above, to get this:
The other two channels look like:
Then I recombine the 3 components to get the final image.
For instance the heart shape on her head is orange which is a mixture of a certain shade of red and a certain shade of green.
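In code, the channel-rounding step might look something like this (a sketch using Pillow; the evenly spaced shades and the file names are my assumptions, since the post doesn't give those details):

from PIL import Image

def round_channel(channel, levels=8):
    step = 255.0 / (levels - 1)
    # Round each pixel value to the nearest of the evenly spaced shades.
    return channel.point(lambda v: int(round(v / step) * step))

img = Image.open("rose.jpg").convert("RGB")
r, g, b = img.split()
r8, g8, b8 = (round_channel(c) for c in (r, g, b))

# Each rounded channel can be saved out and painted over by hand to smooth
# the speckly borders; here the three are just recombined directly.
Image.merge("RGB", (r8, g8, b8)).save("rose_8_shades.png")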

Wednesday, December 26, 2012

Kinetic Hash

There are a lot of times in computer science when you want to "hash" a certain piece of information. The important thing about a hash is that it always produces the same output for a given input, and it's really hard to go from the output back to what the input was. Also, sometimes you'd like the amount of computation to be very small so you can generate hashes quickly; other times you'd like it to be expensive to compute so someone can't just check every possible input rapidly to figure it out. I think this one does a very good job on all those criteria.
For example, imagine you had this data:
(14)(5)(12)(7) (3)(8)(8)(15) (2)(1)(5)(12) (5)(3)(8)(7) (7)(10)(4)(2)
This data can be in pretty much any form for this hash, as you'll see, but I'm assuming base-16 numbers in groups of 4.
The way this hash works is you imagine numbered balls on a grid with velocity vectors. The first two numbers of each group of four above give the x,y coordinates, and the second two give the velocity vector, with 0 to 16 mapping to -8 to 8 squares in the x direction and in the y direction respectively. I've shortened the vectors here for clarity. If two balls end up on the same square, simply shift the second ball a certain amount in the x and y directions until you find an empty square:
Then the algorithm simply runs the kinetic simulation, constrained to the integers. The balls can bounce off each other or off the "walls" and change direction. This can be iterated a certain number of times, or a certain number of "seconds" of the physics simulation. After, say, 10,000 seconds, which can be done in an instant on a computer, the output is the x and y coordinates (but not the velocity vectors) of the first n balls, where 2*n is the desired size of the output. If we wanted 10 hashed numbers we'd use all 5 balls.
(15)(2)(2)(3)(5)(11)(16)(8)(5)(2)
The velocities are left off to make it untraceable in the reverse direction. So this process is deterministic but really chaotic: after a certain time period there is essentially a uniform probability of a ball being anywhere on the grid. And whether you want this to take a lot of CPU time or a little, you just vary the amount of time the simulation is run.
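Here's a toy sketch of how this could be coded. The specifics in it, a 16x16 grid, decoding a velocity digit as the digit minus 8, elastic reflection off the walls, and swapping velocities when two balls land on the same square, are guesses at details the post leaves open rather than the exact scheme above.

def kinetic_hash(digits, steps=10000, size=16, out_balls=None):
    # Decode groups of 4 base-16 digits into [x, y, vx, vy] for each ball.
    balls = []
    for i in range(0, len(digits), 4):
        x, y, vx, vy = digits[i:i + 4]
        balls.append([x % size, y % size, vx - 8, vy - 8])

    # If two balls start on the same square, shift the later one until free.
    occupied = set()
    for b in balls:
        while (b[0], b[1]) in occupied:
            b[0] = (b[0] + 1) % size
            b[1] = (b[1] + 3) % size
        occupied.add((b[0], b[1]))

    for _ in range(steps):
        for b in balls:
            for axis in (0, 1):
                b[axis] += b[axis + 2]
                # Bounce off the walls, staying on integer squares.
                while b[axis] < 0 or b[axis] >= size:
                    if b[axis] < 0:
                        b[axis] = -b[axis]
                    else:
                        b[axis] = 2 * (size - 1) - b[axis]
                    b[axis + 2] = -b[axis + 2]
        # Collision: if two balls share a square, swap their velocities.
        for i in range(len(balls)):
            for j in range(i + 1, len(balls)):
                if balls[i][:2] == balls[j][:2]:
                    balls[i][2:], balls[j][2:] = balls[j][2:], balls[i][2:]

    # Output only the positions (never the velocities) of the first n balls.
    n = out_balls if out_balls is not None else len(balls)
    return [c for b in balls[:n] for c in b[:2]]

data = [14, 5, 12, 7, 3, 8, 8, 15, 2, 1, 5, 12, 5, 3, 8, 7, 7, 10, 4, 2]
print(kinetic_hash(data))   # 10 output digits from the 5 balls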



Friday, December 21, 2012

Inverse polynomial approximation

Consider the limiting process of using Simpson's rule to approximate the integral of e^x from x=0..5 using n subintervals:
n=2    165.11927912116
n=4    149.09326952253
n=6    147.77767796066
n=8    147.53254079859
n=10  147.46285985231

Intuitively you can imagine the values going to a limit as n goes to infinity in this process; they just get closer and closer from above to the real value of the integral of e^x over the interval.

But if you try to model this process with polynomial interpolation it does very poorly.

You can see it actually tends towards infinity past the region where it is forced to match the given values, instead of tending to a limit.

I found that if you instead solve for what I'm calling an inverse polynomial, that is of the form:
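a + b/x + c/x^2 + d/x^3 + e/x^4

(a polynomial in 1/x; that's my reading of the form, going by the five equations in five unknowns set up below)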
you can get much closer. I'll go into detail about how this is done in a minute, but the polynomial you end up with using the data from the example I first mentioned is:
The graph of this looks like:
This looks a lot more like it models the process. In fact the values of this inverse polynomial at x=10 and x=12 are:
147.4628599
147.4349611
so it is approaching the actual limit of the process, which is the integral of e^x for x=0..5:
147.4131591
So by modeling the algorithm and then looking ahead a few values we can "see" closer to the limit than we would by actually running the algorithm at higher values of n, which for a very expensive algorithm would be very handy.

Now the way I solve for this polynomial is to consider this matrix, where the function values are given at x, x+2, x+4, etc. for a certain value of x:
In my case I evaluate the above matrix at x=2 and that gives this to solve:

Solving these 5 equations for the 5 variables gives the inverse polynomial coefficients.
Then these are plugged into:
to get: 

I think this process should work well to model functions that tend towards a limit, for instance many algorithms. 
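Here is a rough sketch of the whole process in code, using numpy, with the inverse-polynomial form assumed above and the Simpson's rule values recomputed rather than typed in from the table:

import numpy as np

def simpson_exp(n, lo=0.0, hi=5.0):
    # Simpson's rule for the integral of e^x over [lo, hi] with n subintervals.
    x = np.linspace(lo, hi, n + 1)
    y = np.exp(x)
    h = (hi - lo) / n
    return h / 3 * (y[0] + y[-1] + 4 * y[1:-1:2].sum() + 2 * y[2:-1:2].sum())

# The data: Simpson's rule values at n = 2, 4, 6, 8, 10.
ns = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
vals = np.array([simpson_exp(int(n)) for n in ns])

# Fit a + b/x + c/x^2 + d/x^3 + e/x^4 by solving 5 equations in 5 unknowns.
A = np.vander(1.0 / ns, 5, increasing=True)   # columns: 1, 1/x, 1/x^2, ...
coeffs = np.linalg.solve(A, vals)

def inverse_poly(x):
    return sum(c / x**k for k, c in enumerate(coeffs))

# Looking a little past the data gets closer to the true limit than n=10 did.
print("inverse polynomial at x=12:", inverse_poly(12.0))
print("exact integral e^5 - 1:   ", np.exp(5.0) - 1.0)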




Tuesday, December 11, 2012

Tourney ranking test

Suppose the actual strength order of the 25 teams we're trying to rank, which will be hidden from the algorithm, is:
1.A
2.B
3.C
4.D
5.E
6.F
7.G
8.H
9.I
10.J
11.K
12.L
13.M
14.N
15.O
16.P
17.Q
18.R
19.S
20.T
21.U
22.V
23.W
24.X
25.Y

Because we don't know the actual ranking ahead of time, we just have them listed randomly:
1. M
2. V
3. G
4. F
5. L
6. Y
7. R
8. U
9. T
10. W
11. B
12. S
13. P
14. A
15. X
16. C
17. H
18. J
19. N
20. K
21. O
22. E
23. D
24. I
25. Q
Now a matchup grid is made. Every team plays 5 other teams. It actually took a little while to think of a pattern that had 5 x's in each row and each column, where if there was an x in a (row, column) pair there was also one in the transposed position, and where nothing was on the top-left to bottom-right diagonal, which would mean a team playing itself.

Using the fact that earlier letters in the alphabet beat later ones, a 1 is put in for every win of a row versus a column.



So after 10 iterations of putting the sum of each row's entries into the non-zero entries of the corresponding column, it gives the ranking of:
1. A
2. D
3. F
4. B
5. G
6. H
6. E
7. C
8. O
8. M
9. L
9. K
9. Q
10. C
10. N
11. J
12. P
13. T
14. U
15. R
15. S
16. V
17. Y
18. X

So it recovered most of the original alphabetical order.
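Here is a sketch of this test in code. One liberty is taken with the schedule: instead of the 5-game pattern in the grid above, each team plays the teams 1, 2, and 3 places away (circularly) in the shuffled list, which also gives a symmetric grid with an empty diagonal. The iteration uses the row-sum-plus-one rule spelled out in the ranking-system post below.

import random
import string

teams = list(string.ascii_uppercase[:25])   # true strength order: A beats B beats C ...
order = teams[:]
random.shuffle(order)                       # the random listing the algorithm actually sees

n = len(order)
wins = [[0] * n for _ in range(n)]          # wins[i][j] = 1 if the team in slot i beat the team in slot j
for i in range(n):
    for k in (1, 2, 3):                     # stand-in schedule: play the teams 1, 2, 3 slots away
        j = (i + k) % n
        if order[i] < order[j]:             # earlier letters always beat later ones
            wins[i][j] = 1
        else:
            wins[j][i] = 1

grid = [row[:] for row in wins]
for _ in range(10):                         # 10 iterations of the ranking rule
    row_sums = [sum(row) for row in grid]
    # Each non-zero entry of column j becomes (sum of row j) + 1.
    grid = [[row_sums[j] + 1 if wins[i][j] else 0 for j in range(n)] for i in range(n)]

scores = [sum(row) for row in grid]
for rank, (score, team) in enumerate(sorted(zip(scores, order), reverse=True), 1):
    print(rank, team, score)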

Monday, December 10, 2012

Ranking system

This method would work for all the teams in a league but I just did this one for the top 25 teams in the NCAA college football rankings.
I'll be using letters to represent teams according to this:

First you make a grid where there is a 1 if the team in row X beats team in column Y. 
So a 1 in row F and Column D means Georgia beat Florida. 

Now the algorithm simply adds up the 1's in row X, adds 1 to that, and puts that number in each non-zero entry of column X. So in row D there are four 1's, so 5 replaces all the 1's in column D. After one iteration it looks like:

Now on the next iteration of the algorithm, every team that beat Florida, which is team D, will get 5 points for beating them, because Florida beat 4 teams. On the iteration after that, a team will also get points for the teams beaten by the teams that Florida beat. And so on...
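In code the iteration might look like this. The win grid here is a tiny made-up 4-team example rather than the actual top-25 data from the chart above.

def rank_teams(names, wins, iterations=10):
    # wins[i][j] = 1 means team i beat team j.
    n = len(names)
    grid = [row[:] for row in wins]
    for _ in range(iterations):
        row_sums = [sum(row) for row in grid]
        # Each non-zero entry of column j becomes (sum of row j) + 1.
        grid = [[row_sums[j] + 1 if wins[i][j] else 0 for j in range(n)]
                for i in range(n)]
    scores = [sum(row) for row in grid]
    return sorted(zip(names, scores), key=lambda ts: -ts[1])

# Made-up results: A beat B and C, B beat C and D, C beat D.
names = ["A", "B", "C", "D"]
wins = [[0, 1, 1, 0],
        [0, 0, 1, 1],
        [0, 0, 0, 1],
        [0, 0, 0, 0]]
print(rank_teams(names, wins))   # A comes out on top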
After 10 iterations the algorithm says that the ranking of these teams should be:
1. Florida (1244)
2. Alabama (999)
3. Georgia (799)
4. LSU (644)
5. Texas A&M (517)
6. Ohio State (20)
It dropped off quickly here because Ohio State only beat Michigan and Nebraska in the top 25, and together they only beat Northwestern, who didn't beat any top 25 team.
7. Utah State (18)
8. Nebraska (18)
9. Notre Dame (13)
10. Oklahoma (12)
11. Stanford (9)
12. Oregon (3)
13. Oregon State (2)
14. Kansas State (1)
15. Northern Illinois (1)
16. Michigan (1)

The rest of the teams never beat a top 25 team. 

Now I admit that to be fair the whole league would have to be taken into account, not just the top 25 teams, but I didn't feel like entering all that information :)

Thursday, December 6, 2012

wave of waves

I was working on something and came across this wave:
It has a really interesting Fourier transform if one takes just the part of the wave around 0 and tapers it off exponentially towards infinity and -infinity:
So it has an interesting shape in the complex plane. 


Power series from points

Here's how you can find the power series of an unknown function from some points at which you know the function's value.
The main equation is: 
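f(x + h) = f(x) + b*h + (c/2)*h^2 + (d/6)*h^3

(As I read it, this is the truncated Taylor expansion about x with the derivative values b, c, d as the unknowns; each known point supplies one copy of this equation with its own h.)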
Say we're given the points:
Let's make the center point the place where we want the power series centered. This makes x=0, f(x)=2 because that's the y value at x=0, f(x-2)=-6 from the leftmost point, and so on. So:
after multiplying:

Now this can be solved for b, c, d. 

What b, c, and d represent are the values of the first, second, and third derivatives at x=0, which get substituted into the power series equation about x=0.
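Here is a sketch of the solve using numpy. Only f(x)=2 at the center and f(x-2)=-6 come from the example above; the other sample points and values are made up here just so the system is complete.

import numpy as np

center = 0.0
f0 = 2.0                                  # f at the center (from the example)
xs = np.array([-2.0, 2.0, 4.0])           # the other sample points (spacing assumed)
ys = np.array([-6.0, 4.0, 30.0])          # f at those points; only the -6 is from the example

# One equation per point: f(center + h) - f(center) = b*h + (c/2)*h^2 + (d/6)*h^3
h = xs - center
A = np.column_stack([h, h**2 / 2, h**3 / 6])
b, c, d = np.linalg.solve(A, ys - f0)

print("f'(0) =", b, "  f''(0) =", c, "  f'''(0) =", d)
# Power series about 0: f(x) is approximately 2 + b*x + (c/2)*x^2 + (d/6)*x^3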