11a. Equations, conics


As an application of the theory of partial derivatives that we developed, and of calculus in general, as we know it at this point, we can now talk about some beautiful things, namely geometry and physics. As a starting point, you have surely noticed that the Sun moves around the Earth on a circle. However, when carefully measuring it, this circle is not exactly a circle, but rather an ellipse. Also, some further possible trajectories of one object with respect to another, due to gravity, include parabolas and hyperbolas, which you can observe with a telescope, by looking at asteroids and comets.

So, before even starting to look at the equations of gravity, and having some fun in solving them, we need a mathematical theory of curves like ellipses, parabolas and hyperbolas, which are what we can expect to find, as trajectories, from gravity computations. And, good news, this theory has existed since the ancient Greeks. Let us start with:

Definition 11.1

A conic is a plane algebraic curve of the form

[[math]] C=\left\{(x,y)\in\mathbb R^2\Big|P(x,y)=0\right\} [[/math]]

with [math]P\in\mathbb R[x,y][/math] being of degree [math]\leq2[/math].

As basic examples of conics, we have the ellipses, parabolas and hyperbolas. The simplest examples of these are as follows, with the ellipse actually being a circle:

[[math]] x^2+y^2=1\quad,\quad x^2=y\quad,\quad xy=1 [[/math]]

Observe that, due to our assumption [math]\deg P\leq 2[/math], we have as conics some degenerate curves as well, such as lines, [math]\emptyset[/math], and [math]\mathbb R^2[/math] itself, coming from [math]\deg P\leq1[/math], as follows:

[[math]] x=0\quad,\quad 1=0\quad,\quad 0=0 [[/math]]

This might suggest replacing our assumption [math]\deg P\leq 2[/math] by [math]\deg P=2[/math], but we will not do so, because [math]\deg P=2[/math] does not rule out degenerate situations, such as:

[[math]] x^2+y^2=-1\quad,\quad x^2+y^2=0\quad,\quad x^2=0\quad,\quad xy=0 [[/math]]

In fact, what we get here are [math]\emptyset[/math], points, lines, and pairs of lines, so in the end, assuming [math]\deg P=2[/math] instead of [math]\deg P\leq 2[/math] would only rule out [math]\mathbb R^2[/math] itself, which is not worth it.

Summarizing, our notion of conic from Definition 11.1 looks quite reasonable, so let us agree on this notion. Getting now to classification matters, we first have:

Theorem 11.2

Up to non-degenerate linear transformations of the plane,

[[math]] \binom{x}{y}\to A\binom{x}{y} [[/math]]

with [math]\det A\neq0[/math], the conics fall into two classes, as follows:

Non-degenerate: circles, parabolas, hyperbolas.
Degenerate: [math]\emptyset[/math], points, lines, pairs of lines, [math]\mathbb R^2[/math].

Proof

As a first observation, it looks like we forgot the ellipses, but via linear transformations these become circles, so things are fine. As for the proof, this goes as follows:

(1) Consider an arbitrary conic, written as follows, with [math]a,b,c,d,e,f\in\mathbb R[/math]:

[[math]] ax^2+by^2+cxy+dx+ey+f=0 [[/math]]

(2) Assume first [math]a\neq0[/math]. By making a square out of [math]ax^2[/math], up to a linear transformation in [math](x,y)[/math], we can get rid of the term [math]cxy[/math], and we are left with:

[[math]] ax^2+by^2+dx+ey+f=0 [[/math]]

In the case [math]b\neq0[/math] we can make two obvious squares, and again up to a linear transformation in [math](x,y)[/math], we are left with an equation as follows:

[[math]] x^2\pm y^2=k [[/math]]

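In the case of positive sign, [math]x^2+y^2=k[/math], the solutions are the circle, when [math]k \gt 0[/math], the point, when [math]k=0[/math], and [math]\emptyset[/math], when [math]k \lt 0[/math]. As for the case of negative sign, [math]x^2-y^2=k[/math], which reads [math](x-y)(x+y)=k[/math], here once again by linearity our equation becomes [math]xy=l[/math], which is a hyperbola when [math]l\neq0[/math], and two lines when [math]l=0[/math]. Finally, in the remaining case [math]b=0[/math], our equation is [math]ax^2+dx+ey+f=0[/math]. When [math]e\neq0[/math], by making a square and again by linearity, this becomes [math]x^2=y[/math], which is a parabola. When [math]e=0[/math] we are left with [math]x^2=k[/math], which gives two parallel lines when [math]k \gt 0[/math], a line when [math]k=0[/math], and [math]\emptyset[/math] when [math]k \lt 0[/math].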

(3) In the case [math]b\neq0[/math] the study is similar, with the same solutions, so we are left with the case [math]a=b=0[/math]. Here our conic is as follows, with [math]c,d,e,f\in\mathbb R[/math]:

[[math]] cxy+dx+ey+f=0 [[/math]]

If [math]c\neq 0[/math], by linearity our equation becomes [math]xy=l[/math], which produces a hyperbola or two lines, as explained before. As for the remaining case, [math]c=0[/math], here our equation is:

[[math]] dx+ey+f=0 [[/math]]

But this is generically the equation of a line, unless we are in the case [math]d=e=0[/math], where our equation is [math]f=0[/math], having as solutions [math]\emptyset[/math] when [math]f\neq0[/math], and [math]\mathbb R^2[/math] when [math]f=0[/math].

(4) So, this was the study of an arbitrary conic, and by putting now everything together, we are led to the conclusions in the statement.

■

In order now to plainly classify the conics, without reference to a linear transformation of the plane, we just need to apply linear transformations to the curves that we found in Theorem 11.2. This leads to the following classification result:

Theorem 11.3

The conics fall into two classes, as follows:

Non-degenerate: ellipses, parabolas, hyperbolas.
Degenerate: [math]\emptyset[/math], points, lines, pairs of lines, [math]\mathbb R^2[/math].

Also, the compact conics are [math]\emptyset[/math], the points, and the ellipses.

Proof

We have several assertions here, the idea being as follows:

(1) As said above, in order to get to such a classification result, we just need to apply linear transformations to the curves that we found in Theorem 11.2. But this leaves the list there unchanged, up to the circles becoming ellipses, as stated above.

(2) Regarding the last assertion, this is clear from the first one, but since this assertion is quite interesting, let us give it a quick, independent proof as well. Consider an arbitrary conic, written as follows, with [math]a,b,c,d,e,f\in\mathbb R[/math]:

[[math]] ax^2+by^2+cxy+dx+ey+f=0 [[/math]]

By rotating the coordinate axes we can get rid of the term [math]cxy[/math], and since rotations preserve compactness, we can assume that our conic is in fact:

[[math]] ax^2+by^2+dx+ey+f=0 [[/math]]

But then with [math]a,b\neq0[/math] we must have by compactness [math]a,b \gt 0[/math] or [math]a,b \lt 0[/math], and we get an ellipse, or a point or [math]\emptyset[/math] in the degenerate situations. Then with [math]a=0,b\neq0[/math] or [math]a\neq0,b=0[/math] compactness rules out everything except for [math]\emptyset[/math], and finally with [math]a=b=0[/math] compactness rules out again everything, except for [math]\emptyset[/math].

■
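As an illustration of the above classification, here is a minimal Python sketch which reads the type of a conic directly from its coefficients. It relies on a standard criterion which is not proved in the text above, namely the sign of the discriminant [math]c^2-4ab[/math], together with the determinant of the symmetric [math]3\times3[/math] matrix attached to the conic. The function name `classify_conic` is hypothetical, and the answer is exact only for integer or rational input.

```python
def classify_conic(a, b, c, d, e, f):
    """Classify ax^2 + by^2 + cxy + dx + ey + f = 0 (exact for integer input)."""
    # Four times the determinant of the symmetric 3x3 matrix
    # [[a, c/2, d/2], [c/2, b, e/2], [d/2, e/2, f]] of the conic.
    # This is a standard criterion, assumed here: it vanishes exactly
    # in the degenerate cases (points, lines, pairs of lines, R^2, ...).
    det4 = 4*a*b*f + c*d*e - a*e**2 - b*d**2 - f*c**2
    if det4 == 0:
        return "degenerate"
    disc = c**2 - 4*a*b
    if disc > 0:
        return "hyperbola"
    if disc == 0:
        return "parabola"
    # disc < 0: a real ellipse, unless the equation has no real solutions.
    # (The text counts the empty set among the degenerate conics as well.)
    return "ellipse" if (a + b) * det4 < 0 else "empty set"


# The basic examples from the text: x^2+y^2=1, x^2=y, xy=1, x^2+y^2=-1, xy=0
print(classify_conic(1, 1, 0, 0, 0, -1))
print(classify_conic(1, 0, 0, 0, -1, 0))
print(classify_conic(0, 0, 1, 0, 0, -1))
print(classify_conic(1, 1, 0, 0, 0, 1))
print(classify_conic(0, 0, 1, 0, 0, 0))
```

Note that a conic such as [math]x^2+xy+y^2=1[/math], which has [math]c\neq0[/math], is reported as an ellipse, in agreement with the fact that the term [math]cxy[/math] can be removed by a rotation.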

As a third main result now on the conics, also known since the ancient Greeks, and which justifies the name “conics”, coming from “cone”, we have:

Theorem 11.4

Up to some degenerate cases, the conics are exactly the curves which appear by cutting a [math]2[/math]-sided cone with a plane.

Proof

This is something quite tricky, the idea being as follows:

(1) By suitably choosing our coordinate axes [math](x,y,z)[/math], we can assume that our 2-sided cone is given by an equation as follows, with [math]k \gt 0[/math]:

[[math]] x^2+y^2=kz^2 [[/math]]

In order to prove the result, we must intersect this cone with an arbitrary plane, which has an equation as follows, with [math](a,b,c)\neq(0,0,0)[/math]:

[[math]] ax+by+cz=d [[/math]]

(2) However, before getting into computations, observe that what we want to find is a certain degree 2 equation in the above plane, for the intersection. Thus, it is convenient to change the coordinates, so that our plane is given by the following equation:

[[math]] z=0 [[/math]]

(3) But with this done, what we have to do is to see how the cone equation [math]x^2+y^2=kz^2[/math] changes, under this change of coordinates, and then set [math]z=0[/math], so as to get the [math](x,y)[/math] equation of the intersection. But this leads, via some thinking or computations, to the conclusion that the cone equation [math]x^2+y^2=kz^2[/math] becomes in this way a degree 2 equation in [math](x,y)[/math], which can be arbitrary, and so to the final conclusion in the statement.

(4) Alternatively, and perhaps more concretely, we can use the original coordinates, with the cone being [math]x^2+y^2=kz^2[/math], and compute the intersection, with the conclusion that what we get, depending on the slope of the plane with respect to the cone, and modulo degenerate cases, is an ellipse, hyperbola or parabola. So, by invoking Theorem 11.3, we obtain the result.

(5) Summarizing, we have proved the result, modulo some details and interesting computations which are left to you, reader. Left to you as well is the full discussion concerning degree 2 curve degeneracy vs cone cutting degeneracy, with the remark that in what regards the cone cuts, the degenerate cases are very easy to identify and list, with the list consisting of [math]\emptyset[/math], the points, the lines, the pairs of lines, and [math]\mathbb R^2[/math] itself.

■
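The computation in step (4) can be sketched in Python in a simple special case. We assume here, as a simplification which is not in the text, that the plane is of the form [math]z=mx+h[/math], which by rotating around the [math]z[/math] axis covers all non-vertical planes. We use the coordinates [math](x,y)[/math] on this plane, obtained by vertical projection, which does not change the type of the conic. The helper name `cone_cut` is hypothetical.

```python
def cone_cut(k, m, h):
    """Cut the cone x^2 + y^2 = k z^2, k > 0, by the plane z = m x + h."""
    # Substituting z = m x + h into the cone equation gives:
    # (1 - k m^2) x^2 + y^2 - 2 k m h x - k h^2 = 0
    a, b, c = 1 - k * m**2, 1, 0
    d, e, f = -2 * k * m * h, 0, -k * h**2
    disc = c**2 - 4 * a * b  # equals 4 (k m^2 - 1)
    if h == 0:
        kind = "degenerate"  # the plane passes through the apex of the cone
    elif disc < 0:
        kind = "ellipse"     # plane less steep than the cone
    elif disc == 0:
        kind = "parabola"    # plane exactly as steep as the cone
    else:
        kind = "hyperbola"   # plane steeper than the cone
    return (a, b, c, d, e, f), kind


# With k = 1 the cone has slope 1, so the slope m of the plane decides the type.
for m in (0, 1, 2):
    print(m, cone_cut(1, m, 1))
```

In this sketch the type of the cut depends only on the sign of [math]km^2-1[/math], that is, on how the slope of the plane compares with the slope of the cone.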

All this is very nice, and as a conclusion to what we have so far about conics, we have the following statement, containing all the needed essentials:

Theorem 11.5

The conics, which are the algebraic curves of degree at most [math]2[/math] in the plane,

[[math]] C=\left\{(x,y)\in\mathbb R^2\Big|P(x,y)=0\right\} [[/math]]

with [math]\deg P\leq 2[/math], appear modulo degeneration by cutting a [math]2[/math]-sided cone with a plane, and can be classified into ellipses, parabolas and hyperbolas.

Proof

This follows indeed by putting together the above results, and with the discussion concerning degeneration being left, as usual, as an instructive exercise.

■

Moving ahead now, the most interesting conics, which are both compact and non-degenerate, are the ellipses. So, let us study them more in detail. As a starting point, we have the following statement, summarizing our knowledge about ellipses:

Theorem 11.6

The compact non-degenerate conics are the ellipses, which can be written, modulo rotations and translations in the plane, as

[[math]] \left(\frac{x}{a}\right)^2+\left(\frac{y}{b}\right)^2=1 [[/math]]

with [math]a,b \gt 0[/math] being half the side lengths of the smallest box containing the ellipse. These ellipses also appear by compactly cutting a cone with a plane. The area of such an ellipse is [math]A=\pi ab[/math].

Proof

In this statement most of the mathematics is from above, and with our explanations regarding the parameters [math]a,b \gt 0[/math] coming from the following picture:

[[math]] \xymatrix@R=6.7pt@C=8.7pt{ &&&&&&\ar@{-}[d]\\ &&&&&&\bullet_b\ar@{-}[dddddd]\ar@{-}@/_/[dllll]&\\ &&\ar@{-}@/_/[ddl]&&&&&&&&\ar@{-}@/_/[ullll]\\ &&&&&&&&&&\\ \ar@{-}[r]&\bullet_{-a}\ar@{-}@/_/[ddr]\ar@{-}[rrrrr]&&&&&\ar@{-}[rrrrr]&&&&&\bullet_a\ar@{-}@/_/[uul]\ar@{-}[r]&\\ &&&&&&&&&&\\ &&\ar@{-}@/_/[drrrr]&&&&&&&&\ar@{-}@/_/[uur]\\ &&&&&&\bullet_{-b}\ar@{-}@/_/[urrrr]\ar@{-}[d]\\ &&&&&&} [[/math]]

As for the formula [math]A=\pi ab[/math], this comes from a computation from chapter 4, namely:

[[math]] \begin{eqnarray*} A &=&2\int_{-a}^ab\sqrt{1-\frac{x^2}{a^2}}\,dx\\ &=&\frac{4b}{a}\int_0^a\sqrt{a^2-x^2}\,dx\\ &=&4ab\int_0^1\sqrt{1-y^2}\,dy\\ &=&4ab\cdot\frac{\pi}{4}\\ &=&\pi ab \end{eqnarray*} [[/math]]

Finally, as a verification, for [math]a=b=1[/math] we get [math]A=\pi[/math], as we should.

■
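As a numerical sanity check of the formula [math]A=\pi ab[/math], here is a minimal Python sketch which approximates the first integral in the above computation by the midpoint rule. The function name `ellipse_area_midpoint`, the number of subintervals and the sample values of [math]a,b[/math] are arbitrary choices, made for illustration only.

```python
import math


def ellipse_area_midpoint(a, b, n=100_000):
    """Approximate 2 * integral_{-a}^{a} b*sqrt(1 - x^2/a^2) dx (midpoint rule)."""
    h = 2 * a / n
    total = 0.0
    for i in range(n):
        x = -a + (i + 0.5) * h  # midpoint of the i-th subinterval
        # max(...) guards against tiny negative values due to rounding
        total += b * math.sqrt(max(0.0, 1 - (x / a) ** 2))
    return 2 * total * h


a, b = 3.0, 2.0
print(ellipse_area_midpoint(a, b), math.pi * a * b)
```

The two printed numbers should agree to several decimal places, with the small difference coming from the discretization.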

The above result is not the end of the story with ellipses, because we have as well the following result, which complements it, or can even be regarded as a rival result, being just fine on its own:

Theorem 11.7

The ellipses appear via equations of the following type, with [math]p,q[/math] being two points in the plane, and with [math]l \gt d(p,q)[/math] being a certain length:

[[math]] d(z,p)+d(z,q)=l [[/math]]

For an ellipse parametrized as before, [math](x/a)^2+(y/b)^2=1[/math] with [math]a\geq b \gt 0[/math], the focal points are [math]p=(-r,0)[/math] and [math]q=(r,0)[/math], with [math]r=\sqrt{a^2-b^2}[/math], and the length is [math]l=2a[/math].

Proof

As already mentioned, it is possible to take [math]d(z,p)+d(z,q)=l[/math] as a definition for the ellipses, which is nice because all you need for drawing such an ellipse is a string and a pencil, and then work out all the theory starting from this. In what concerns us, we will rather further build on what we know from Theorem 11.6, as follows:

(1) After some routine thinking, in order to fully prove the result, what we have to do is to take an ellipse as parametrized in Theorem 11.6, and look for the focal points:

[[math]] \xymatrix@R=7pt@C=7pt{ &&&&&&\ar@{-}[d]\\ &&&&&&\bullet_b\ar@{-}[dddddd]\ar@{-}@/_/[dllll]&\\ &&\ar@{-}@/_/[ddl]&&&&&&&&\ar@{-}@/_/[ullll]\\ &&&&&&&&&&\\ \ar@{-}[r]&\bullet_{-a}\ar@{-}@/_/[ddr]\ar@{-}[rrr]&&&\bullet_{-r}\ar@{-}[rr]&&\ar@{-}[rr]&&\bullet_r\ar@{-}[rrr]&&&\bullet_a\ar@{-}@/_/[uul]\ar@{-}[r]&\\ &&&&&&&&&&\\ &&\ar@{-}@/_/[drrrr]&&&&&&&&\ar@{-}@/_/[uur]\\ &&&&&&\bullet_{-b}\ar@{-}@/_/[urrrr]\ar@{-}[d]\\ &&&&&&} [[/math]]

To be more precise, we are looking for a number [math]r \gt 0[/math], and a number [math]l \gt 0[/math], such that our ellipse appears as [math]d(z,p)+d(z,q)=l[/math], with [math]p=(-r,0)[/math] and [math]q=(r,0)[/math].

(2) Let us first compute these numbers [math]r,l \gt 0[/math]. Assuming that our result holds indeed as stated, by taking [math]z=(a,0)[/math], we see that the length [math]l[/math] is:

[[math]] l=(a-r)+(a+r)=2a [[/math]]

As for the parameter [math]r[/math], by taking [math]z=(0,b)[/math], we conclude that we must have:

[[math]] 2\sqrt{b^2+r^2}=2a\implies r=\sqrt{a^2-b^2} [[/math]]

(3) With these observations made, let us prove the result. Given [math]l,r \gt 0[/math], and setting [math]p=(-r,0)[/math] and [math]q=(r,0)[/math], we have the following computation, with [math]z=(x,y)[/math]:

[[math]] \begin{eqnarray*} &&d(z,p)+d(z,q)=l\\ &\iff&\sqrt{(x+r)^2+y^2}+\sqrt{(x-r)^2+y^2}=l\\ &\iff&\sqrt{(x+r)^2+y^2}=l-\sqrt{(x-r)^2+y^2}\\ &\iff&(x+r)^2+y^2=(x-r)^2+y^2+l^2-2l\sqrt{(x-r)^2+y^2}\\ &\iff&2l\sqrt{(x-r)^2+y^2}=l^2-4xr\\ &\iff&4l^2(x^2+r^2-2xr+y^2)=l^4+16x^2r^2-8l^2xr\\ &\iff&4l^2x^2+4l^2r^2+4l^2y^2=l^4+16x^2r^2\\ &\iff&(4x^2-l^2)(4r^2-l^2)=4l^2y^2 \end{eqnarray*} [[/math]]

(4) Now observe that we can further process the equation that we found as follows:

[[math]] \begin{eqnarray*} (4x^2-l^2)(4r^2-l^2)=4l^2y^2 &\iff&\frac{4x^2-l^2}{l^2}=\frac{4y^2}{4r^2-l^2}\\ &\iff&\frac{4x^2-l^2}{l^2}=-\frac{y^2}{l^2/4-r^2}\\ &\iff&\left(\frac{x}{l/2}\right)^2-1=-\left(\frac{y}{\sqrt{l^2/4-r^2}}\right)^2\\ &\iff&\left(\frac{x}{l/2}\right)^2+\left(\frac{y}{\sqrt{l^2/4-r^2}}\right)^2=1 \end{eqnarray*} [[/math]]

(5) Thus, our result holds indeed, with the parameters of the ellipse being [math]a=l/2[/math] and [math]b=\sqrt{l^2/4-r^2}[/math]. This gives back, and no surprise here, the formulae [math]l=2a[/math] and [math]r=\sqrt{a^2-b^2}[/math] found in (2) above.

■
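The focal property can be checked numerically as well. Here is a minimal Python sketch, taking the focal points on the horizontal axis, at distance [math]r=\sqrt{a^2-b^2}[/math] from the origin, as in the picture and computation above. The function name `focal_sum`, the parametrization [math]z=(a\cos t,b\sin t)[/math] and the sample values [math]a=5[/math], [math]b=3[/math] are choices made for illustration only.

```python
import math


def focal_sum(a, b, t):
    """d(z,p) + d(z,q) for z = (a cos t, b sin t), with p = (-r, 0), q = (r, 0)."""
    r = math.sqrt(a**2 - b**2)  # requires a >= b > 0
    x, y = a * math.cos(t), b * math.sin(t)
    return math.hypot(x + r, y) + math.hypot(x - r, y)


a, b = 5.0, 3.0  # here r = 4
sums = [focal_sum(a, b, 2 * math.pi * j / 12) for j in range(12)]
# The largest deviation from l = 2a should be at rounding-error level.
print(max(abs(s - 2 * a) for s in sums))
```

At [math]t=0[/math] we recover the computation [math]l=(a-r)+(a+r)=2a[/math] from step (2), and at [math]t=\pi/2[/math] the computation [math]2\sqrt{b^2+r^2}=2a[/math].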

The above results, which are as old as modern mathematics itself, are foundational for both algebraic and differential geometry. We will be back to them later, after doing some physics, following Kepler and Newton, making the link with calculus.

General references

Banica, Teo (2024). "Calculus and applications". arXiv:2401.00911 [math.CO].