The formula I found to decompose a distribution into normal components, similar to how the Fourier transform decomposes a wave into sine and cosine components is:
Where n is the number of points to be interpolated... So if we know the value of the distribution say at x=1, 2, and 3 we evaluate the above with n=3 and those values for x1, x2, x3:
Now we can evaluate the above at each of x=1,2,3, with our y values of let's say .9, .7, and .8 respectively, and solve:
Now we can plot the solution:
Friday, November 27, 2015
Sunday, November 22, 2015
Preference Wheel on the complex plane
This example is for giving a person the task to rank their 3 favorite choices out of 10 possible choices labeled alphabetically... The data is as follows:
The above data means the first person chose their favorite to be choice e, their second favorite to be choice d, and their third favorite to be choice b... And the second person chose their favorite to be choice i, and their second favorite to be choice g, and their third choice was b... Etc...
So I thought to make a preference chart where a person's 3rd favorite always comes halfway between their first and second favorite on a circle... First I solved these equations:
So I thought to make a preference chart where a person's 3rd favorite always comes halfway between their first and second favorite on a circle... First I solved these equations:
In complex numbers on the unit circle this is saying the variable in the right hand side of each equation is the point on the circle halfway between the two variables on the left hand side around the circle... And the last equation a=1 is to disallow rotational symmetry we set one variable to equal the real axis...
There were quite a few solutions, but most were trivial that all the variables equal 1 or 0 or all real number solutions, the two interesting ones are mirror images across the real axis, one of them is:
Which plotted on the unit circle is:
And see that it encodes the information perfectly for example that f is halfway around the circle between j and e, when two variables are equal the halfway point is either equal to either or on the opposite side of the circle..
So once plotted the question is can we extrapolate that someone who listed, for example, a and e as their first two favorites, which was not information given, will usually like h as a third choice... I'll have to find a real dataset to statistically determine whether that is the case or not...
Thursday, November 19, 2015
Multidimensional projections on 2d plane
For 3 dimensions you can imagine that A,B,C are your 3 dimensional axes turned in such a way that looking at them they appear like this:
Define a point P on the plane to be P=a*A+b*B+c*C, with a,b,c values from the sliders...
Then we can look at all the ways to fix 2 of the sliders at 1 or 0 and move the 3rd, and tracing the paths they make gives:
Which is the "shadow" or projection of the 3d cube on the plane...
Now what if we have four orthogonal axes, so that their projection on the 2d plane looked like this:
Define a point P on the plane to be P=a*A+b*B+c*C, with a,b,c values from the sliders...
Then we can look at all the ways to fix 2 of the sliders at 1 or 0 and move the 3rd, and tracing the paths they make gives:
Which is the "shadow" or projection of the 3d cube on the plane...
Now what if we have four orthogonal axes, so that their projection on the 2d plane looked like this:
We can define P to be P=a*A+b*B+c*C+d*D
There are a lot of ways to hold all but one of the sliders fixed at 1 or 0 and move the other one, doing so traces this onto the plane:
This is the projection on the plane of a 4 dimensional cube!
Monday, November 16, 2015
Knowing SSA of triangle find other 2 angles approximately without inverse trig functions
This is for when you know 2 sides and an angle of a triangle and you want to find the other 2 angles approximately without using inverse trig functions...
The formula is:
In addition it gets 45-45-90 and Equilateral triangles exactly right! The 180-C can of course be replaced with pi-C when working in radians... There might be a kind of 3 variable Cauchy series that this uses averages of the first two terms of, I'll have to look further...
The formula is:
In addition it gets 45-45-90 and Equilateral triangles exactly right! The 180-C can of course be replaced with pi-C when working in radians... There might be a kind of 3 variable Cauchy series that this uses averages of the first two terms of, I'll have to look further...
Saturday, November 14, 2015
Simple point in polygon test
I've seen a few ways to test for whether a point is in a convex polygon or not that necessitated using inverse trig functions, but this one is all easily calculated the distance functions can be left without taking the square root and just comparing distance-squared instead...
R and L are exactly the points you get when you rotate he line AB 90 degrees around it's midpoint... I think the nice thing is how the reasoning might extend to 3d convex polytopes with L and R making a line orthogonal to each face...
R and L are exactly the points you get when you rotate he line AB 90 degrees around it's midpoint... I think the nice thing is how the reasoning might extend to 3d convex polytopes with L and R making a line orthogonal to each face...
Friday, November 13, 2015
A standard form for adjacency matrices to solve the graph isomorphism problem
Supposing you have some graph with no more than one edge between vertices:
Here A, B, and C are all connected to themselves but they don't have to be, the above graph has adjacency matrix:
Indicating with a 1 where a row, column pair of vertices are connected...
Now a problem is to find a standard form for this and all adjacency matrices corresponding to isomorphic graphs, though maybe the vertices are labeled differently...
First define a row's C to be a binary number corresponding to the row with the most significant digit to the right.. for example the first row above reads as 1100 so that would be the binary number for decimal 3 or 2^0 + 2^1
The algorithm I'm proposing is to simultaneously swap two rows and corresponding columns producing a new matrix if it increases a row's C with importance towards the top row. For example, swap two columns and corresponding rows if it either increases the first row's C or if that's not possible, the second rows C while also not making the first row's C less, or if that's not possible swap so the third row's C is increased while at the same time not making row 1's and row 2's C less, etc... Each iteration of the algorithm I call an "improvement" of the adjacency matrix, the algorithm is done when no more improvements are possible...
Above is 3 steps of the algorithm describing which rows and columns are to be swapped highlighted in red and which row is improved on the new matrix after the swapping... The final matrix on the right can not be improved by the above definition by swapping any pair of rows and corresponding columns... I think the exact specifics of how to choose which rows and corresponding columns is not important to the final result which by definition can not be improved...
So the final matrix is I believe in a standard form for all adjacency matrices corresponding to isomorphic graphs. In other words all adjacency matrices corresponding to isomorphic graphs and only to isomorphic graphs will improve to a particular unique standard matrix at the end of the algorithm...
Here A, B, and C are all connected to themselves but they don't have to be, the above graph has adjacency matrix:
Indicating with a 1 where a row, column pair of vertices are connected...
Now a problem is to find a standard form for this and all adjacency matrices corresponding to isomorphic graphs, though maybe the vertices are labeled differently...
First define a row's C to be a binary number corresponding to the row with the most significant digit to the right.. for example the first row above reads as 1100 so that would be the binary number for decimal 3 or 2^0 + 2^1
The algorithm I'm proposing is to simultaneously swap two rows and corresponding columns producing a new matrix if it increases a row's C with importance towards the top row. For example, swap two columns and corresponding rows if it either increases the first row's C or if that's not possible, the second rows C while also not making the first row's C less, or if that's not possible swap so the third row's C is increased while at the same time not making row 1's and row 2's C less, etc... Each iteration of the algorithm I call an "improvement" of the adjacency matrix, the algorithm is done when no more improvements are possible...
So the final matrix is I believe in a standard form for all adjacency matrices corresponding to isomorphic graphs. In other words all adjacency matrices corresponding to isomorphic graphs and only to isomorphic graphs will improve to a particular unique standard matrix at the end of the algorithm...
Wednesday, November 11, 2015
Easier Bezier by 2nd derivative smoothing
So supposing you have n points on the plane with n greater than or equal to 4 and you want to find a smooth curve through them, you could do a n-degree bezier curve through them but that takes exponential time with respect to n to calculate or you could use bezier splines of a lesser degree but if you use , say, quadratic bezier splines through the n points directly there is a smoothness problem... The approach I'll outline below is to generate with a formula the 2*n-1 "halfway" points between known points then use quadratic bezier curves to interpolate but the way the "halfway" points are calculated gives a smooth final result...
So first we take our n points and label them A(2*(j)-1) for j from 0 to n, and from here on call them A(i)... Then we use this formula:
This formula is saying that the 2nd derivative by divided differences at a point should be the average of the 2nd derivative at the point before it and the point after it...
Now here is a python program to figure out what the i values should be in the formula above to get the right number of equations for the number of new "halfway" points :
n=7
d = 1.0*(2*n-6)/(n-2)
for i in range(0, n-1):
print(round(3+i*d))
For n=4 we get:
3,4,5
For n = 5 an odd number of points, we need 4 equations for the 4 in between points so we make the i's using the program::
3,4,6,7
For n = 6 there are 5 in between points we need equations for so we make the i's:
3,4,6,8,9
For n = 7:
3,5,6,8,9,11
For n = 6 we know the six points A1, A3, A5, A7, A9,A11 and we want to know A2, A4, A6, A8, A10 so we use the i values from the program {3,4,6,8,9} and get five equations to solve:
Then the third step is to use these newly generated points to compute the quadratic Bezier paths between odd labeled points (I used a graphics program to manually draw the Beziers so they don't go exactly through the center of each point but mathematically they would)...
Compare this to using quadratic Beziers directly...
And there are some issues using just Beziers with even number of points about how exactly to divide the curve into quadratics because you need groups of 3 points for quadratic Beziers...
So first we take our n points and label them A(2*(j)-1) for j from 0 to n, and from here on call them A(i)... Then we use this formula:
This formula is saying that the 2nd derivative by divided differences at a point should be the average of the 2nd derivative at the point before it and the point after it...
Now here is a python program to figure out what the i values should be in the formula above to get the right number of equations for the number of new "halfway" points :
n=7
d = 1.0*(2*n-6)/(n-2)
for i in range(0, n-1):
print(round(3+i*d))
For n=4 we get:
3,4,5
For n = 5 an odd number of points, we need 4 equations for the 4 in between points so we make the i's using the program::
3,4,6,7
For n = 6 there are 5 in between points we need equations for so we make the i's:
3,4,6,8,9
For n = 7:
3,5,6,8,9,11
For n = 6 we know the six points A1, A3, A5, A7, A9,A11 and we want to know A2, A4, A6, A8, A10 so we use the i values from the program {3,4,6,8,9} and get five equations to solve:
Then the third step is to use these newly generated points to compute the quadratic Bezier paths between odd labeled points (I used a graphics program to manually draw the Beziers so they don't go exactly through the center of each point but mathematically they would)...
Monday, November 9, 2015
4 point sigmoidal elliptical curve
H and J are used to smoothly blend from one elliptical parameterization to another to produce the curve... the final curve is parameterized over [-pi/2] to [pi]... It might be good for something like designing fonts because it can do circular arcs that are found in a lot of letters well that bezier splines don't manage very well...
Sunday, November 8, 2015
Elliptical Pizza Theorem
**UPDATE**
Actually I was suggesting my above answer to the people at geogebra.org and they told me about this nice parametric equation that only needs the center point and any 2 points on the ellipse, where A is the center:
f(t)=A+(B-A)*sin(t)+(C-A)*cos(t),
at 0 it equals C, at 1/2)*Pi it equals B, then at 2*pi back to C!
This one just wows me after as hard as I worked to figure out the one above...
And from that I derived this one:
f(t) = (1/2)*D+(1/2)*B+((1/2)*B-(1/2)*D)*sin(t)+(C-(1/2)*D-(1/2)*B)*cos(t)
Which given 3 points on the ellipse parameterizes it so that between B and D is the shorter axis...
Friday, November 6, 2015
Experimental comparison of G-decomposition of waves vs. Fourier
Here F is the Fourier style decomposition, for 5 points , and G is the new decomposition I've been working with, written also for five points:
So I tried not to cherry pick these values, I've gotten similar results with every set of points I've tried so far, but let's let the points be [1,1],[3,2],[5,5],[7,3],[9,2] and solve for the coefficients:
I think the problem with the Fourier is that it maxes to 6 height when the largest point is only 5, the G decomposition seems more reasonable,
**Note**
I don't know if the exponents in G cause the coefficients to get too large with a large number of points that will require some more investigation...And of course using powers of trigonometric functions might make them too costly to computer, but there might be some use cases where that isn't as important as the best fit to the points... There might even be some variations between these two where you use some small powers of sines and cosines and some multiples of frequencies...
So I tried not to cherry pick these values, I've gotten similar results with every set of points I've tried so far, but let's let the points be [1,1],[3,2],[5,5],[7,3],[9,2] and solve for the coefficients:
I think the problem with the Fourier is that it maxes to 6 height when the largest point is only 5, the G decomposition seems more reasonable,
**Note**
I don't know if the exponents in G cause the coefficients to get too large with a large number of points that will require some more investigation...And of course using powers of trigonometric functions might make them too costly to computer, but there might be some use cases where that isn't as important as the best fit to the points... There might even be some variations between these two where you use some small powers of sines and cosines and some multiples of frequencies...
Wednesday, November 4, 2015
algebraic boolean algebra to prove De Morgan's laws
For 2 binary variables, which may also represent truth and falseness, we can solve for algebraic expressions representing the usual boolean operations like below... we just solve
a*x+b*y+c*x*y+d
algebraically for the four conditions in the truth table to get the coefficients...
Here I solve for ~(x OR y)
which is:
So the equation above ranging over the binary parameters is:
regular OR is:
and it's easily proven that to negate any expression multiply the expression by -1 and add 1, the OR above is -1*(NOR) +1 for example, and the NOR is -1*(OR) +1 as well. Just see that any time the expression evaluates to 1, 1*(-1)+1 = 0 and when it evaluates to 0, 0*(-1)+1 = 1
So to prove de Morgan's laws we need an AND as well...
So de Morgan's laws are that ~(x AND y) = (~x OR ~y)...
So ~AND is the same as -x*y + 1 to negate x*y, and then we multiply each variable in the or expression by -1 and add 1 to get...
And simplification shows we have equality...
In the above the powers of a variable were removed to get to the bottom line to show that (x OR y) AND ~(x OR y) is always false or 0... I think this type of exponent elimination and algebraic simplification makes the 16 possible basic expressions from the 16 truth tables closed under composition...
and in general you have coefficients for multiplying the variables 0 at a time (the constant factor), 1 at a time, 2 at a time, etc. up to n at a time for n variables... This is 2^n coefficients because the number of coefficients is the same cardinality as the power set on n items which is 2^n, and there are 2^n possiblities of variables making 2^n linear equations in the coefficients so it can be solved...
That x OR y has a 52% chance of happening, I'll have to write a program and see whether empirically this checks out... For fuzzy logic you'd have to do away with the rule that exponents disappear of course...
a*x+b*y+c*x*y+d
algebraically for the four conditions in the truth table to get the coefficients...
Here I solve for ~(x OR y)
which is:
So the equation above ranging over the binary parameters is:
regular OR is:
and it's easily proven that to negate any expression multiply the expression by -1 and add 1, the OR above is -1*(NOR) +1 for example, and the NOR is -1*(OR) +1 as well. Just see that any time the expression evaluates to 1, 1*(-1)+1 = 0 and when it evaluates to 0, 0*(-1)+1 = 1
So to prove de Morgan's laws we need an AND as well...
So de Morgan's laws are that ~(x AND y) = (~x OR ~y)...
So ~AND is the same as -x*y + 1 to negate x*y, and then we multiply each variable in the or expression by -1 and add 1 to get...
And simplification shows we have equality...
**Powers and composing**
Sometimes when composing by substituting one of these expressions in for a variable in another, you get powers of a variable, but this is resolved by noting that any counting number power of 1 is 1 and any likewise for 0 is 0... For example This is substituing (X or Y) in for x and ~(x OR y) in for Y in the expression for (x AND y):In the above the powers of a variable were removed to get to the bottom line to show that (x OR y) AND ~(x OR y) is always false or 0... I think this type of exponent elimination and algebraic simplification makes the 16 possible basic expressions from the 16 truth tables closed under composition...
**3 or more variables**
For 3 variables you solve this equation for the 8 possibilities of the unknowns:and in general you have coefficients for multiplying the variables 0 at a time (the constant factor), 1 at a time, 2 at a time, etc. up to n at a time for n variables... This is 2^n coefficients because the number of coefficients is the same cardinality as the power set on n items which is 2^n, and there are 2^n possiblities of variables making 2^n linear equations in the coefficients so it can be solved...
**Fuzzy Logic**
Using the expression for OR which is x +y - x*y, I wonder if it's true that if x has a 25% chance of happening and y has a 37% chance of happening, that we can plug in those values and get:That x OR y has a 52% chance of happening, I'll have to write a program and see whether empirically this checks out... For fuzzy logic you'd have to do away with the rule that exponents disappear of course...
Sunday, November 1, 2015
median heaps and a sorting method
Suppose you have a list of some sortable items indexed naturally starting at 1:
L=2,4,7,9,3,1,5,8,6
Imagine building a heap but not a max heap or a min heap but a median heap, where the property is that position n is greater than or equal to position 2*n and less than or equal to position 2*n+1. Or thinking of it as a tree, that the left child is less than or equal to the parent which is less than or equal to the right child... So looking at the above list:
Start with the first three items:
H=2,4,7
H(1)=2, H(2)=4, H(3)=7 so this does not yet have the property that H(2*n) <= H(n) <= H(2*n+1) for n=1
so we rearrange those 3 items so the property is true:
4,2,7
Now the property is satisfied so we add two more numbers from the list
4,2,7,9,3
H(2) =2 H(4)=9 H(5)=3 so we have to rearrange those three numbers to satisfy the property
4,3,7,2,9
Now since H(2) changed we have to again see whether H(2)<= H(1) <=H(3) which it is
so add another two numbers
4,3,7,2,9,1,5
and H(3)=7 H(6)=1 H(7) = 5 so we have to rearrange to make the property true...
4,3,5,2,9,1,7
And since H(3) changed we have to check whether H(2) <= H(1) <=H(3) which it is
so we add two more
4,3,5,2,9,1,7,8,6
H(8) is not less than or equal to H(4) and H(4) less than or equal to H(9) so we rearrange:
4,3,5,6,9,1,7,2,8
Now H(2) is not between H(4) and H(5) so we rearrange:
4,6,5,3,9,1,7,2,8
And now H(1) is not between H(2) and H(3) so again rearrange:
5,4,6,3,9,1,7,2,8
Graphically it looks like this:
Now the median heap is built... notice that position n is greater than or equal to position 2*n
and less than position 2*n+1, it happens that the first number is the median and the left branches are less than the right branches, but this won't always be the case. So a second step can make it so that that is the case which may break the median heap property but is nicer for sorting, say...
It's possible to recurse the two steps on the two new half lists to eventually get a completely ordered list...
**Example**
The following is the output from the source code below for n=15, in general the list has to have an odd number of items for this code to run
input:
[86 57 24 16 53 91 2 87 83 58 22 89 91 89 33]
output:
[58 53 89 24 86 57 89 16 87 22 83 2 91 33 91]
See the root is the median of the set of numbers and all the odd positions are less than the even positions... Note that this does not have the median heap property because n=5 is 86 which is not between n=10 22 and n=11 83 because the second step broke the median heap property to make it so the root became the median and all even positions are less than the odd positions... I'll mark the spot where the second step begins in the code with a comment...
**Go source code**
L=2,4,7,9,3,1,5,8,6
Imagine building a heap but not a max heap or a min heap but a median heap, where the property is that position n is greater than or equal to position 2*n and less than or equal to position 2*n+1. Or thinking of it as a tree, that the left child is less than or equal to the parent which is less than or equal to the right child... So looking at the above list:
Start with the first three items:
H=2,4,7
H(1)=2, H(2)=4, H(3)=7 so this does not yet have the property that H(2*n) <= H(n) <= H(2*n+1) for n=1
so we rearrange those 3 items so the property is true:
4,2,7
Now the property is satisfied so we add two more numbers from the list
4,2,7,9,3
H(2) =2 H(4)=9 H(5)=3 so we have to rearrange those three numbers to satisfy the property
4,3,7,2,9
Now since H(2) changed we have to again see whether H(2)<= H(1) <=H(3) which it is
so add another two numbers
4,3,7,2,9,1,5
and H(3)=7 H(6)=1 H(7) = 5 so we have to rearrange to make the property true...
4,3,5,2,9,1,7
And since H(3) changed we have to check whether H(2) <= H(1) <=H(3) which it is
so we add two more
4,3,5,2,9,1,7,8,6
H(8) is not less than or equal to H(4) and H(4) less than or equal to H(9) so we rearrange:
4,3,5,6,9,1,7,2,8
Now H(2) is not between H(4) and H(5) so we rearrange:
4,6,5,3,9,1,7,2,8
And now H(1) is not between H(2) and H(3) so again rearrange:
5,4,6,3,9,1,7,2,8
Graphically it looks like this:
Now the median heap is built... notice that position n is greater than or equal to position 2*n
and less than position 2*n+1, it happens that the first number is the median and the left branches are less than the right branches, but this won't always be the case. So a second step can make it so that that is the case which may break the median heap property but is nicer for sorting, say...
It's possible to recurse the two steps on the two new half lists to eventually get a completely ordered list...
**Example**
The following is the output from the source code below for n=15, in general the list has to have an odd number of items for this code to run
input:
[86 57 24 16 53 91 2 87 83 58 22 89 91 89 33]
output:
[58 53 89 24 86 57 89 16 87 22 83 2 91 33 91]
See the root is the median of the set of numbers and all the odd positions are less than the even positions... Note that this does not have the median heap property because n=5 is 86 which is not between n=10 22 and n=11 83 because the second step broke the median heap property to make it so the root became the median and all even positions are less than the odd positions... I'll mark the spot where the second step begins in the code with a comment...
**Go source code**
package main import ( "fmt" "math/rand" "time" ) func t(a int, b int, c int) (int, int, int, bool){ if (b <= a && a <= c){ return a,b,c, false } if (c <= a && a <=b){ return a,c,b, true } if (a <= b && b <= c){ return b,a,c, true } if (c <=b && b <= a){ return b,c,a, true } if (a <= c && c <= b){ return c,a,b, true } return c,b,a, true } func hc(h [] int, o int) ([] int){ f := true for f == true{ a ,b ,c, f2 := t(h[o],h[2*o+1], h[2*o+2]) f = f2 h[o] = a h[2*o+1] = b h[2*o+2] = c o = int(o/2.0) } return h } func main() { l := make([] int, 15) h := make([] int, 15) rand.Seed(time.Now().UnixNano()) for i:=0; i<15; i++{ l[i] = rand.Intn(100) } fmt.Println(l)
h[0] = l[0]
for i:=0; i<int(len(l)/2.0); i++{
h[2*i+1] = l[2*i+1]
h[2*i+2] = l[2*i+2]
h = hc(h, i)
}
//Second step breaking median heap property begins
for i:=1; i< len(h); i+=2{
f :=false
h[0], h[i], h[i+1], f = t(h[0], h[i], h[i+1])
if f == true{
i =1
}
}
fmt.Println(h)
}