11b. Kepler and Newton

[math] \newcommand{\mathds}{\mathbb}[/math]

As a continuation of the above, or rather as a complement, let us do some physics. Theorem 11.5 was the foundational result of modern mathematics, and by a remarkable twist of fate, the foundational result of physics is something related to it, as follows:

Theorem

Planets and other celestial bodies move around the Sun on conics,

[[math]] C=\left\{(x,y)\in\mathbb R^2\Big|P(x,y)=0\right\} [[/math]]
with [math]P\in\mathbb R[x,y][/math] being of degree [math]2[/math], which can be ellipses, parabolas or hyperbolas.


Proof

This is something quite long, due to Kepler and Newton, but no fear, we know calculus, and therefore what can resist us. The proof goes as follows:


(1) According to observations and calculations performed over the centuries, since the ancient times, and first formalized by Newton, following some groundbreaking work of Kepler, the force of attraction between two bodies of masses [math]M,m[/math] is given by:

[[math]] ||F||=G\cdot\frac{Mm}{d^2} [[/math]]


Here [math]d[/math] is the distance between the two bodies, and [math]G\simeq 6.674\times 10^{-11}[/math] is the gravitational constant, expressed in SI units. Now assuming that [math]M[/math] is fixed at [math]0\in\mathbb R^3[/math], the force exerted on [math]m[/math] positioned at [math]x\in\mathbb R^3[/math], regarded as a vector [math]F\in\mathbb R^3[/math], is given by the following formula:

[[math]] \begin{eqnarray*} F &=&-||F||\cdot\frac{x}{||x||}\\ &=&-\frac{GMm}{||x||^2}\cdot\frac{x}{||x||}\\ &=&-\frac{GMmx}{||x||^3} \end{eqnarray*} [[/math]]


But [math]F=ma=m\ddot{x}[/math], with [math]a=\ddot{x}[/math] being the acceleration, second derivative of the position, so the equation of motion of [math]m[/math], assuming that [math]M[/math] is fixed at [math]0[/math], is:

[[math]] \ddot{x}=-\frac{GMx}{||x||^3} [[/math]]


Obviously, the problem happens in 2 dimensions, namely in the plane spanned by the initial position and the initial speed of [math]m[/math], and you can even find, as an exercise, a formal proof of that, based on the above equation, if you really want to. So the most convenient here is to use standard [math]x,y[/math] coordinates in that plane, and denote our point as [math]z=(x,y)[/math]. With this change made, and by setting [math]K=GM[/math], the equation of motion becomes:

[[math]] \ddot{z}=-\frac{Kz}{||z||^3} [[/math]]
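Before solving this equation by hand, it can be instructive to integrate it numerically. The following is only a minimal sketch, not part of the original argument: the function names, the velocity Verlet scheme, the step size and the choice [math]K=1[/math] are all assumptions made for illustration. It checks that the initial data [math]z_0=(1,0)[/math], [math]\dot{z}_0=(0,1)[/math] produces a trajectory staying close to the unit circle.

```python
import math

def acceleration(z, K):
    """Right-hand side of z'' = -K z / ||z||^3, for z = (x, y)."""
    x, y = z
    r3 = math.hypot(x, y) ** 3
    return (-K * x / r3, -K * y / r3)

def integrate_orbit(z0, v0, K=1.0, dt=1e-3, steps=1000):
    """Velocity Verlet integration (an assumed scheme); returns visited positions."""
    (x, y), (vx, vy) = z0, v0
    ax, ay = acceleration((x, y), K)
    path = [(x, y)]
    for _ in range(steps):
        # half kick, drift, recompute the acceleration, half kick
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = acceleration((x, y), K)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        path.append((x, y))
    return path

# Circular motion: with K = 1 and r = 1 the speed 1 balances the attraction.
# 6283 steps of size 1e-3 cover roughly one full period 2*pi.
path = integrate_orbit((1.0, 0.0), (0.0, 1.0), K=1.0, dt=1e-3, steps=6283)
radii = [math.hypot(x, y) for x, y in path]
print(min(radii), max(radii))
```

Other initial speeds can be tried in the same way, and the resulting paths plotted, in order to get a feeling for the shapes that the rest of the proof identifies.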


(2) The idea now is that the problem can be solved via some calculus. Let us write indeed our vector [math]z=(x,y)[/math] in polar coordinates, as follows:

[[math]] x=r\cos\theta\quad,\quad y=r\sin\theta [[/math]]


We have then [math]||z||=r[/math], and our equation of motion becomes:

[[math]] \ddot{z}=-\frac{Kz}{r^3} [[/math]]


Let us differentiate now [math]x,y[/math]. By using the standard calculus rules, we have:

[[math]] \dot{x}=\dot{r}\cos\theta-r\sin\theta\cdot\dot{\theta} [[/math]]

[[math]] \dot{y}=\dot{r}\sin\theta+r\cos\theta\cdot\dot{\theta} [[/math]]


Differentiating one more time gives the following formulae:

[[math]] \ddot{x}=\ddot{r}\cos\theta-2\dot{r}\sin\theta\cdot\dot{\theta}-r\cos\theta\cdot\dot{\theta}^2-r\sin\theta\cdot\ddot\theta [[/math]]

[[math]] \ddot{y}=\ddot{r}\sin\theta+2\dot{r}\cos\theta\cdot\dot{\theta}-r\sin\theta\cdot\dot{\theta}^2+r\cos\theta\cdot\ddot\theta [[/math]]


Consider now the following two quantities, appearing as coefficients in the above:

[[math]] a=\ddot{r}-r\dot{\theta}^2\quad,\quad b=2\dot{r}\dot{\theta}+r\ddot{\theta} [[/math]]


In terms of these quantities, our second derivative formulae read:

[[math]] \ddot{x}=a\cos\theta-b\sin\theta [[/math]]

[[math]] \ddot{y}=a\sin\theta+b\cos\theta [[/math]]
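As a sanity check for the above formulae, here is a small numerical sketch, with everything in it being an illustrative assumption: the test functions [math]r(t)=2+\sin t[/math] and [math]\theta(t)=t+t^2/2[/math], the helper names, and the step [math]h[/math]. It compares a finite difference approximation of [math]\ddot{x},\ddot{y}[/math] with the expressions [math]a\cos\theta-b\sin\theta[/math] and [math]a\sin\theta+b\cos\theta[/math].

```python
import math

# Arbitrary smooth test functions (assumptions, chosen only for the check)
def r(t):
    return 2.0 + math.sin(t)

def theta(t):
    return t + 0.5 * t * t

def x(t):
    return r(t) * math.cos(theta(t))

def y(t):
    return r(t) * math.sin(theta(t))

def second_difference(f, t, h=1e-4):
    """Central finite difference approximation of f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

def polar_acceleration(t):
    """Return (a, b), with a = r'' - r theta'^2 and b = 2 r' theta' + r theta''."""
    r_dot, r_ddot = math.cos(t), -math.sin(t)
    th_dot, th_ddot = 1.0 + t, 1.0
    a = r_ddot - r(t) * th_dot ** 2
    b = 2.0 * r_dot * th_dot + r(t) * th_ddot
    return a, b

def residuals(t):
    """Differences between numerical x'', y'' and the polar expressions."""
    a, b = polar_acceleration(t)
    c, s = math.cos(theta(t)), math.sin(theta(t))
    return (second_difference(x, t) - (a * c - b * s),
            second_difference(y, t) - (a * s + b * c))

for t in (0.3, 1.0, 2.0):
    print(t, residuals(t))
```

The residuals should be small, of the order of the finite difference error, at any time [math]t[/math].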


(3) We can now solve the equation of motion from (1). Indeed, with the formulae that we found for [math]\ddot{x},\ddot{y}[/math], our equation of motion takes the following form:

[[math]] a\cos\theta-b\sin\theta=-\frac{K}{r^2}\cos\theta [[/math]]

[[math]] a\sin\theta+b\cos\theta=-\frac{K}{r^2}\sin\theta [[/math]]


But these two formulae can be written in the following way:

[[math]] \left(a+\frac{K}{r^2}\right)\cos\theta=b\sin\theta\quad,\quad \left(a+\frac{K}{r^2}\right)\sin\theta=-b\cos\theta [[/math]]


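By multiplying these two equations we obtain [math](a+K/r^2)^2\cos\theta\sin\theta=-b^2\cos\theta\sin\theta[/math]. Now assuming that we are in a non-degenerate case, where the angle [math]\theta[/math] varies indeed, we can divide by [math]\cos\theta\sin\theta[/math], which vanishes only at isolated moments, and we obtain by positivity, and then by continuity, that we must have: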

[[math]] a+\frac{K}{r^2}=b=0 [[/math]]


(4) Let us first examine the second equation, [math]b=0[/math]. This can be solved as follows:

[[math]] \begin{eqnarray*} b=0 &\iff&2\dot{r}\dot{\theta}+r\ddot{\theta}=0\\ &\iff&\frac{\ddot{\theta}}{\dot{\theta}}=-2\frac{\dot{r}}{r}\\ &\iff&(\log\dot{\theta})'=(-2\log r)'\\ &\iff&\log\dot{\theta}=-2\log r+c\\ &\iff&\dot{\theta}=\frac{\lambda}{r^2} \end{eqnarray*} [[/math]]


As for the first equation that we found, namely [math]a+K/r^2=0[/math], by using [math]a=\ddot{r}-r\dot{\theta}^2[/math] and [math]\dot{\theta}=\lambda/r^2[/math], this becomes:

[[math]] \ddot{r}-\frac{\lambda^2}{r^3}+\frac{K}{r^2}=0 [[/math]]


As a conclusion to all this, in polar coordinates, [math]x=r\cos\theta[/math], [math]y=r\sin\theta[/math], our equations of motion are as follows, with [math]\lambda[/math] being a constant, not depending on [math]t[/math]:

[[math]] \ddot{r}=\frac{\lambda^2}{r^3}-\frac{K}{r^2}\quad,\quad \dot{\theta}=\frac{\lambda}{r^2} [[/math]]


Even better now, by writing [math]K=\lambda^2/c[/math], these equations read:

[[math]] \ddot{r}=\frac{\lambda^2}{r^2}\left(\frac{1}{r}-\frac{1}{c}\right)\quad,\quad \dot{\theta}=\frac{\lambda}{r^2} [[/math]]


(5) In order to study the first equation, we use a trick. Let us write:

[[math]] r(t)=\frac{1}{f(\theta(t))} [[/math]]


Abbreviated, and recalling that [math]f[/math] takes [math]\theta=\theta(t)[/math] as variable, this reads:

[[math]] r=\frac{1}{f} [[/math]]


With the convention that dots mean as usual derivatives with respect to [math]t[/math], and that the primes will denote derivatives with respect to [math]\theta=\theta(t)[/math], we have:

[[math]] \dot{r}=-\frac{f'\dot{\theta}}{f^2}=-\frac{f'}{f^2}\cdot\frac{\lambda}{r^2}=-\lambda f' [[/math]]


By differentiating one more time with respect to [math]t[/math], we obtain:

[[math]] \ddot{r}=-\lambda f''\dot{\theta}=-\lambda f''\cdot\frac{\lambda}{r^2}=-\frac{\lambda^2}{r^2}f'' [[/math]]


On the other hand, our equation for [math]\ddot{r}[/math] found in (4) above reads:

[[math]] \ddot{r}=\frac{\lambda^2}{r^2}\left(\frac{1}{r}-\frac{1}{c}\right)=\frac{\lambda^2}{r^2}\left(f-\frac{1}{c}\right) [[/math]]


Thus, in terms of [math]f=1/r[/math] as above, our equation for [math]\ddot{r}[/math] simply reads:

[[math]] f''+f=\frac{1}{c} [[/math]]


But this latter equation is elementary to solve. Indeed, both functions [math]\cos\theta,\sin\theta[/math] satisfy [math]g''+g=0[/math], so any linear combination of them satisfies this equation as well. But the solutions of [math]f''+f=1/c[/math] being those of [math]g''+g=0[/math] shifted by [math]1/c[/math], we obtain, with the two free constants written as [math]\varepsilon/c[/math] and [math]\delta/c[/math]:

[[math]] f=\frac{1+\varepsilon\cos\theta+\delta\sin\theta}{c} [[/math]]


Now by inverting, we obtain the following formula:

[[math]] r=\frac{c}{1+\varepsilon\cos\theta+\delta\sin\theta} [[/math]]
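The formula just obtained can be tested against a direct numerical integration of the polar equations [math]\ddot{r}=\lambda^2/r^3-K/r^2[/math], [math]\dot{\theta}=\lambda/r^2[/math] from (4). What follows is a hedged sketch only: the Runge-Kutta scheme, the function names and the sample values [math]K=c=1[/math], [math]\varepsilon=0.3[/math], [math]\delta=0.2[/math] are assumptions, and the initial radial speed is obtained by differentiating the closed formula at [math]\theta=0[/math].

```python
import math

def polar_rhs(state, K, lam):
    """state = (r, r_dot, theta); returns its derivative with respect to t."""
    r, r_dot, _theta = state
    return (r_dot, lam ** 2 / r ** 3 - K / r ** 2, lam / r ** 2)

def rk4_step(state, dt, K, lam):
    """One classical Runge-Kutta step (an assumed choice of integrator)."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = polar_rhs(state, K, lam)
    k2 = polar_rhs(shifted(state, k1, dt / 2), K, lam)
    k3 = polar_rhs(shifted(state, k2, dt / 2), K, lam)
    k4 = polar_rhs(shifted(state, k3, dt), K, lam)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def closed_form_radius(theta, c, eps, delta):
    """r = c / (1 + eps cos(theta) + delta sin(theta))."""
    return c / (1 + eps * math.cos(theta) + delta * math.sin(theta))

def max_deviation(K=1.0, c=1.0, eps=0.3, delta=0.2, dt=1e-3, steps=5000):
    """Largest gap between the integrated r and the closed formula."""
    lam = math.sqrt(K * c)  # from K = lambda^2 / c, taking lambda > 0
    # Initial data at theta = 0, chosen to lie on the closed-form orbit
    state = (c / (1 + eps), -delta * math.sqrt(K / c), 0.0)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, dt, K, lam)
        r, _, theta = state
        worst = max(worst, abs(r - closed_form_radius(theta, c, eps, delta)))
    return worst

print(max_deviation())
```

The printed deviation should be tiny compared to the size of the orbit, which is what we expect if the integrated motion follows the formula of [math]r[/math] in terms of [math]\theta[/math].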


(6) But this leads to the conclusion that the trajectory is a conic. Indeed, in terms of the parameter [math]\theta[/math], the formulae of the coordinates are:

[[math]] x=\frac{c\cos\theta}{1+\varepsilon\cos\theta+\delta\sin\theta} [[/math]]

[[math]] y=\frac{c\sin\theta}{1+\varepsilon\cos\theta+\delta\sin\theta} [[/math]]


Now observe that these two functions [math]x,y[/math] satisfy the following formula:

[[math]] x^2+y^2 =\frac{c^2(\cos^2\theta+\sin^2\theta)}{(1+\varepsilon\cos\theta+\delta\sin\theta)^2} =\frac{c^2}{(1+\varepsilon\cos\theta+\delta\sin\theta)^2} [[/math]]


On the other hand, these two functions satisfy as well the following formula:

[[math]] \begin{eqnarray*} (\varepsilon x+\delta y-c)^2 &=&\frac{c^2\big(\varepsilon\cos\theta+\delta\sin\theta-(1+\varepsilon\cos\theta+\delta\sin\theta)\big)^2}{(1+\varepsilon\cos\theta+\delta\sin\theta)^2}\\ &=&\frac{c^2}{(1+\varepsilon\cos\theta+\delta\sin\theta)^2} \end{eqnarray*} [[/math]]


We conclude that our coordinates [math]x,y[/math] satisfy the following equation:

[[math]] x^2+y^2=(\varepsilon x+\delta y-c)^2 [[/math]]


But what we have here is an equation of a conic, and we are done.
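As an illustration, the conic equation can be checked numerically on sample points, and the type of the conic can be read from the parameters. This is a sketch under stated assumptions: the function names and sample values are hypothetical, and the classification rule is not in the text above, but comes from the discriminant of the quadratic part of [math]x^2+y^2=(\varepsilon x+\delta y-c)^2[/math], which is [math]4(\varepsilon^2+\delta^2-1)[/math], in the non-degenerate case [math]c\neq0[/math].

```python
import math

def orbit_point(theta, c, eps, delta):
    """Point (x, y) of the trajectory at polar angle theta."""
    r = c / (1 + eps * math.cos(theta) + delta * math.sin(theta))
    return r * math.cos(theta), r * math.sin(theta)

def conic_residual(x, y, c, eps, delta):
    """Value of x^2 + y^2 - (eps x + delta y - c)^2, zero on the trajectory."""
    return x * x + y * y - (eps * x + delta * y - c) ** 2

def conic_type(eps, delta, tol=1e-12):
    """Classify via e = sqrt(eps^2 + delta^2): e < 1, e = 1, e > 1."""
    e = math.hypot(eps, delta)
    if abs(e - 1) <= tol:
        return "parabola"
    return "ellipse" if e < 1 else "hyperbola"

# Sample parameters (assumptions, for illustration only)
c, eps, delta = 2.0, 0.3, 0.2
for k in range(7):
    px, py = orbit_point(float(k), c, eps, delta)
    print(k, conic_residual(px, py, c, eps, delta))
print(conic_type(eps, delta))
```

The quantity [math]e=\sqrt{\varepsilon^2+\delta^2}[/math] used here is the eccentricity of the conic, and the tolerance in the parabola test is only there to absorb rounding errors.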

The above result is not the end of the story, because there is still some discussion to be made, in relation with degeneration. There is as well a discussion concerning normalization, because in the Kepler problem we assumed [math]M[/math] to be fixed at [math]0[/math]. However, when changing coordinates via a translation, we can obtain in this way all conics.


Finally, from a physical perspective, that of concretely solving the gravity equation, there is a long discussion, and lots of additional formulae, regarding the trajectory and its parameters, as functions of the initial data. Without getting into full details here, let us record however the following result, coming as a useful version of Theorem 11.8:

Theorem

In the context of a [math]2[/math]-body problem, with [math]M[/math] fixed at [math]0[/math], and [math]m[/math] starting its movement from [math]Ox[/math], the equation of motion of [math]m[/math], namely

[[math]] \ddot{z}=-\frac{Kz}{||z||^3} [[/math]]
with [math]K=GM[/math], and [math]z=(x,y)[/math], becomes in polar coordinates, [math]x=r\cos\theta[/math], [math]y=r\sin\theta[/math],

[[math]] \ddot{r}=\frac{\lambda^2}{r^2}\left(\frac{1}{r}-\frac{1}{c}\right)\quad,\quad \dot{\theta}=\frac{\lambda}{r^2} [[/math]]
for some [math]\lambda,c\in\mathbb R[/math], related by [math]\lambda^2=Kc[/math]. The value of [math]r[/math] in terms of [math]\theta[/math] is given by

[[math]] r=\frac{c}{1+\varepsilon\cos\theta+\delta\sin\theta} [[/math]]
for some [math]\varepsilon,\delta\in\mathbb R[/math]. At the level of the affine coordinates [math]x,y[/math], this means

[[math]] x=\frac{c\cos\theta}{1+\varepsilon\cos\theta+\delta\sin\theta}\quad,\quad y=\frac{c\sin\theta}{1+\varepsilon\cos\theta+\delta\sin\theta} [[/math]]
with [math]\theta=\theta(t)[/math] being subject to [math]\dot{\theta}=\lambda/r^2[/math], as above. Finally, we have

[[math]] x^2+y^2=(\varepsilon x+\delta y-c)^2 [[/math]]
which is a degree [math]2[/math] equation, and so the resulting trajectory is a conic.


Proof

This is a sort of “best of” the formulae found in the proof of Theorem 11.8, in the hope of course that we have not forgotten anything. Finally, let us mention that the simplest illustration for this is the circular motion, and for details on this, not included in the above, we refer to the proof of Theorem 11.8.

As a first concrete question, we would like to understand how the various parameters appearing above, namely [math]\lambda,c,\varepsilon,\delta[/math], which via some basic math tell us more about the shape of the orbit, appear from the initial data. The formulae here are as follows:

Proposition

In the context of Theorem 11.9, and in polar coordinates, [math]x=r\cos\theta[/math], [math]y=r\sin\theta[/math], the initial data is as follows, with [math]R=r_0[/math]:

[[math]] r_0=\frac{c}{1+\varepsilon}\quad,\quad\theta_0=0 [[/math]]

[[math]] \dot{r}_0=-\frac{\delta\sqrt{K}}{\sqrt{c}}\quad,\quad\dot{\theta}_0=\frac{\sqrt{Kc}}{R^2} [[/math]]

[[math]] \ddot{r}_0=\frac{\varepsilon K}{R^2}\quad,\quad\ddot{\theta}_0=\frac{2\delta K}{R^3} [[/math]]
The corresponding formulae for the affine coordinates [math]x,y[/math] can be deduced from this. Also, the various motion parameters [math]c,\varepsilon,\delta[/math] and [math]\lambda=\sqrt{Kc}[/math] can be recovered from this data.


Proof

We have several assertions here, the idea being as follows:


(1) As mentioned in Theorem 11.9, the object [math]m[/math] begins its movement on [math]Ox[/math]. Thus we have [math]\theta_0=0[/math], and from this we get the formula of [math]r_0[/math] in the statement.


(2) Regarding the initial speed now, the formula of [math]\dot{\theta}_0[/math] follows from:

[[math]] \dot{\theta}=\frac{\lambda}{r^2}=\frac{\sqrt{Kc}}{r^2} [[/math]]


Also, in what concerns the radial speed, the formula of [math]\dot{r}_0[/math] follows from:

[[math]] \begin{eqnarray*} \dot{r} &=&\frac{c(\varepsilon\sin\theta-\delta\cos\theta)\dot{\theta}}{(1+\varepsilon\cos\theta+\delta\sin\theta)^2}\\ &=&\frac{c(\varepsilon\sin\theta-\delta\cos\theta)}{c^2/r^2}\cdot\frac{\sqrt{Kc}}{r^2}\\ &=&\frac{\sqrt{K}(\varepsilon\sin\theta-\delta\cos\theta)}{\sqrt{c}} \end{eqnarray*} [[/math]]


(3) Regarding now the initial acceleration, by using [math]\dot{\theta}=\sqrt{Kc}/r^2[/math] we find:

[[math]] \ddot{\theta}=-2\sqrt{Kc}\cdot\frac{\dot{r}}{r^3}=-\frac{2\sqrt{Kc}\cdot\dot{r}}{r^3} [[/math]]


In particular at [math]t=0[/math] we obtain the following formula:

[[math]] \ddot{\theta}_0=-\frac{2\sqrt{Kc}\cdot\dot{r}_0}{R^3} =\frac{2\sqrt{Kc}}{R^3}\cdot\frac{\delta\sqrt{K}}{\sqrt{c}} =\frac{2\delta K}{R^3} [[/math]]


(4) Also regarding acceleration, with [math]\lambda=\sqrt{Kc}[/math] our main motion formula reads:

[[math]] \ddot{r}=\frac{Kc}{r^2}\left(\frac{1}{r}-\frac{1}{c}\right) [[/math]]


In particular at [math]t=0[/math] we obtain the formula in the statement, namely:

[[math]] \ddot{r}_0=\frac{Kc}{R^2}\left(\frac{1}{R}-\frac{1}{c}\right)=\frac{Kc}{R^2}\cdot\frac{\varepsilon}{c}=\frac{\varepsilon K}{R^2} [[/math]]


(5) Finally, the last assertion is clear, and since the formulae look better anyway in polar coordinates than in affine coordinates, we will not get into details here.
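For what it is worth, the recovery of the motion parameters from the initial data can be sketched as follows. The function names are hypothetical, and we assume [math]\lambda \gt 0[/math], [math]\theta_0=0[/math], and that the values of [math]K,R,\dot{r}_0,\dot{\theta}_0[/math] are known. The formulae used are simply those of [math]r_0,\dot{r}_0,\dot{\theta}_0[/math] from the statement, inverted.

```python
import math

def initial_data(K, c, eps, delta):
    """Initial values (R, r_dot0, theta_dot0) at theta_0 = 0, as in the statement."""
    R = c / (1 + eps)
    return R, -delta * math.sqrt(K / c), math.sqrt(K * c) / R ** 2

def recover_parameters(K, R, r_dot0, theta_dot0):
    """Invert the initial data formulae, assuming lambda > 0."""
    lam = R ** 2 * theta_dot0            # from theta' = lambda / r^2
    c = lam ** 2 / K                     # from lambda^2 = K c
    eps = c / R - 1                      # from R = c / (1 + eps)
    delta = -r_dot0 * math.sqrt(c / K)   # from r'_0 = -delta sqrt(K) / sqrt(c)
    return lam, c, eps, delta

# Round trip on sample values (assumptions, for illustration only)
K, c, eps, delta = 2.0, 1.5, 0.3, -0.2
R, r_dot0, theta_dot0 = initial_data(K, c, eps, delta)
print(recover_parameters(K, R, r_dot0, theta_dot0))
```

Note that the accelerations [math]\ddot{r}_0,\ddot{\theta}_0[/math] are not needed for this, the position and speed at [math]t=0[/math] being enough, as expected for a second order equation.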

With the above formulae in hand, which are a precious complement to Theorem 11.9, we can do some reverse engineering at the level of parameters, and work out how various initial speeds and accelerations lead to various types of conics. The computations here are quite interesting, and we will leave them as an instructive exercise.

General references

Banica, Teo (2024). "Calculus and applications". arXiv:2401.00911 [math.CO].