To make a speech sound you control how open your mouth is, where your tongue sits in your mouth, how rounded your lips are, whether your vocal cords are vibrating, and if so what pitch in your vocal range they are producing. Every syllable has a beginning, middle and end, where you move from one combination of all those variables to another, and then to a final state.
So going through these, let's call how open your mouth is M(m), where m is a number from 0 to 6, with 0 closed and 6 fully open. The tongue is at a position T(f, u), where f is how far forward the tongue is and u is how far up in the mouth it is. A good range is f from -2 to 2, for all the way back to all the way forward, and u from 0 to 2, for the bottom, middle and top of the mouth. Lips are L(l), with l from 0 to 1, for not at all rounded to fully rounded. Vocal cords are V(o, t), with o being binary, 0 for off or 1 for on, and t from 0 to 1, for the lowest to the highest pitch you can make. And B(b) is whether you're breathing out or not, with b being 1 if you are and 0 if you aren't.
So a part of a syllable looks like:
M(m), T(f, u), L(l), V(o, t), B(b)
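One way to pin these variables down is to write a single snapshot as a small Python record. This is only a sketch: the State name, its field names and the range checks are made up for illustration, with the ranges taken from the description above. B(b) is included because it is one of the variables defined there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """One snapshot of the mouth, tongue, lips, voice and breath."""
    m: float    # M(m): mouth opening, 0 (closed) to 6 (fully open)
    f: float    # T(f, u): tongue position, -2 (all the way back) to 2 (all the way forward)
    u: float    # T(f, u): tongue height, 0 (bottom) to 2 (top)
    l: float    # L(l): lip rounding, 0 (not rounded) to 1 (fully rounded)
    o: int      # V(o, t): vocal cords off (0) or on (1)
    t: float    # V(o, t): pitch, 0 (lowest) to 1 (highest)
    b: int = 1  # B(b): breathing out (1) or not (0)

    def __post_init__(self):
        # Reject values outside the ranges given in the text.
        ranges = {"m": (0, 6), "f": (-2, 2), "u": (0, 2), "l": (0, 1),
                  "o": (0, 1), "t": (0, 1), "b": (0, 1)}
        for name, (low, high) in ranges.items():
            value = getattr(self, name)
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside {low}..{high}")
        # o and b are on/off switches, so nothing in between is allowed.
        if self.o not in (0, 1) or self.b not in (0, 1):
            raise ValueError("o and b must be 0 or 1")

    def __str__(self):
        # Print the snapshot in the same notation the text uses.
        return (f"M({self.m}), T({self.f}, {self.u}), L({self.l}), "
                f"V({self.o}, {self.t}), B({self.b})")


print(State(m=0, f=2, u=0, l=1, o=1, t=0.5))
```

Making the record frozen keeps a snapshot from being changed by accident once it is part of a syllable.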
And a whole syllable looks like:
M(m1) -> M(m2) -> M(m3)
T(f1,u1) -> T(f2,u2) -> T(f3,u3)
L(l1) -> L(l2) -> L(l3)
V(o1,t1) -> V(o2,t2) -> V(o3,t3)
B(b1) -> B(b2) -> B(b3)
For instance, take the syllable "boy?", where the question mark means the tone rises as you go through the word:
M(0) -> M(3) -> M(3)
T(2, 0) -> T(1, 0) -> T(1, 0)
L(1) -> L(.5) -> L(0)
V(1, .5) -> V(1,.75) -> V(1, 1)
B(1) -> B(1) -> B(1)
This basically means the mouth goes from closed to half open, the tongue goes from all the way forward on the bottom of the mouth to just a bit back, still on the bottom, the lips go from rounded to not rounded, and the voice stays on and rises in pitch while you keep breathing out.
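To see what the in-between movement might look like, here is a rough Python sketch that stores "boy?" as its three states and estimates the state at any point in the syllable. Two things here are assumptions and not part of the notation: the beginning-to-middle and middle-to-end moves are treated as taking equal time, and the values are assumed to change in a straight line between states. The names State, BOY and state_at are made up for this example.

```python
from typing import NamedTuple


class State(NamedTuple):
    m: float  # mouth opening, 0..6
    f: float  # tongue forwardness, -2..2
    u: float  # tongue height, 0..2
    l: float  # lip rounding, 0..1
    o: int    # vocal cords off (0) or on (1)
    t: float  # pitch, 0..1
    b: int    # breathing out (1) or not (0)


# "boy?" exactly as written above: beginning, middle, end.
BOY = (
    State(m=0, f=2, u=0, l=1.0, o=1, t=0.5, b=1),
    State(m=3, f=1, u=0, l=0.5, o=1, t=0.75, b=1),
    State(m=3, f=1, u=0, l=0.0, o=1, t=1.0, b=1),
)


def state_at(syllable, x):
    """Estimate the state at position x, where 0 is the beginning,
    0.5 the middle and 1 the end (assumed equal-length halves)."""
    if not 0 <= x <= 1:
        raise ValueError("x must be between 0 and 1")
    start, middle, end = syllable
    if x <= 0.5:
        src, dst, w = start, middle, x / 0.5
    else:
        src, dst, w = middle, end, (x - 0.5) / 0.5
    # Straight-line blend of every variable (an assumption, see above).
    blended = State(*(p + (q - p) * w for p, q in zip(src, dst)))
    # On/off variables cannot blend: hold the earlier value until the
    # later state is actually reached.
    return blended._replace(o=src.o if w < 1 else dst.o,
                            b=src.b if w < 1 else dst.b)


# A quarter of the way through, the mouth is partly open and the pitch
# has started to rise.
print(state_at(BOY, 0.25))
```

Real speech almost certainly does not move in straight lines like this, but the sketch shows how three states are enough to describe a whole path through the syllable.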
A different example might be "swipe":
M(3) -> M(1) -> M(0)
T(1, 2) -> T(1, 1) -> T(1, 1)
L(0) -> L(0) -> L(0)
V(1, .5) -> V(1,.75) -> V(0, 0)
B(1) -> B(1) -> B(1)
So really a syllable is pretty complicated...