Complex numbers



5a. Complex numbers

In this second part of the present book we discuss the functions of one complex variable [math]f:\mathbb C\to\mathbb C[/math]. This is certainly something to be done before going to several variables, be they real or complex, because we have some unfinished business with [math]\sin,\cos,\exp,\log[/math], which remain quite mysterious objects. We will see here that these functions are much better understood as complex functions [math]f:\mathbb C\to\mathbb C[/math]. In fact, even a dumb polynomial [math]P\in\mathbb R[X][/math] is better understood as a complex polynomial, [math]P\in\mathbb C[X][/math], because its roots, which might not always exist as real numbers, always exist as complex numbers.


On top of this, we have physics. You might know this or not, but physics is all about waves, and waves require complex numbers for their understanding. And there is even more, because the [math]\mathbb R[/math] that we are so used to, from classical mechanics, dealing with questions at our usual, human scale, naturally becomes, via some sort of complicated procedure, [math]\mathbb C[/math] in quantum mechanics, that is, when zooming down, where the new science and technologies are. But more on this later, towards the end of the present book.


In short, many interesting things to be discussed, and this not necessarily for the sake of doing complex functions, but also with the goal of better understanding the real functions themselves. Let us begin with the complex numbers. There is a lot of magic here, and we will carefully explain this material. Their definition is as follows:

Definition

The complex numbers are variables of the form

[[math]] x=a+ib [[/math]]
with [math]a,b\in\mathbb R[/math], which add in the obvious way, and multiply according to the following rule:

[[math]] i^2=-1 [[/math]]
Each real number can be regarded as a complex number, [math]a=a+i\cdot 0[/math].

In other words, we consider variables as above, without bothering for the moment with their precise meaning. Now consider two such complex numbers:

[[math]] x=a+ib\quad,\quad y=c+id [[/math]]


The formula for the sum is then the obvious one, as follows:

[[math]] x+y=(a+c)+i(b+d) [[/math]]


As for the formula of the product, by using the rule [math]i^2=-1[/math], we obtain:

[[math]] \begin{eqnarray*} xy &=&(a+ib)(c+id)\\ &=&ac+iad+ibc+i^2bd\\ &=&ac+iad+ibc-bd\\ &=&(ac-bd)+i(ad+bc) \end{eqnarray*} [[/math]]


Thus, the complex numbers as introduced above are well-defined. The multiplication formula is of course quite tricky, and hard to memorize, but we will see later some alternative ways, which are more conceptual, for performing the multiplication.
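
For the readers who like to verify things on a computer, here is a minimal sketch of the above multiplication rule, in Python, with the pair encoding of [math]a+ib[/math] and the function name mul being of course our own choices:

```python
# A complex number x = a + ib, encoded as a pair (a, b)
def mul(x, y):
    a, b = x
    c, d = y
    # (a+ib)(c+id) = (ac-bd) + i(ad+bc), using i^2 = -1
    return (a*c - b*d, a*d + b*c)

x, y = (1.0, 2.0), (3.0, -4.0)    # 1+2i and 3-4i
print(mul(x, y))                  # (11.0, 2.0), that is, 11+2i
print(complex(*x) * complex(*y))  # (11+2j): Python's built-in arithmetic agrees
```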


The advantage of using the complex numbers comes from the fact that the equation [math]x^2=-1[/math] has now a solution, [math]x=i[/math]. In fact, this equation has two solutions, namely:

[[math]] x=\pm i [[/math]]


This is of course very good news. More generally, we have the following result, regarding the arbitrary degree 2 equations, with real coefficients:

Theorem

The complex solutions of [math]ax^2+bx+c=0[/math] with [math]a,b,c\in\mathbb R[/math] are

[[math]] x_{1,2}=\frac{-b\pm\sqrt{b^2-4ac}}{2a} [[/math]]
with the square root of negative real numbers being defined as

[[math]] \sqrt{-m}=\pm i\sqrt{m} [[/math]]
and with the square root of positive real numbers being the usual one.


Show Proof

We can write our equation in the following way:

[[math]] \begin{eqnarray*} ax^2+bx+c=0 &\iff&x^2+\frac{b}{a}x+\frac{c}{a}=0\\ &\iff&\left(x+\frac{b}{2a}\right)^2-\frac{b^2}{4a^2}+\frac{c}{a}=0\\ &\iff&\left(x+\frac{b}{2a}\right)^2=\frac{b^2-4ac}{4a^2}\\ &\iff&x+\frac{b}{2a}=\pm\frac{\sqrt{b^2-4ac}}{2a} \end{eqnarray*} [[/math]]


Thus, we are led to the conclusion in the statement.
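
As a quick illustration, here is a numerical check of the above formula, this time assuming Python with its built-in complex numbers, and with cmath.sqrt returning one of the two complex square roots, the other being its negative:

```python
import cmath

def solve_quadratic(a, b, c):
    # x = (-b +/- sqrt(b^2 - 4ac)) / 2a, with a complex square root
    s = cmath.sqrt(b*b - 4*a*c)
    return (-b + s) / (2*a), (-b - s) / (2*a)

x1, x2 = solve_quadratic(1, 2, 5)   # x^2 + 2x + 5 = 0, with b^2 - 4ac = -16
print(x1, x2)                       # (-1+2j) (-1-2j)
print(x1*x1 + 2*x1 + 5)             # 0j, up to rounding
```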

We will see later that any degree 2 complex equation has solutions as well, and that more generally, any polynomial equation, real or complex, has solutions. Moving ahead now, we can represent the complex numbers in the plane, in the following way:

Proposition

The complex numbers, written as usual

[[math]] x=a+ib [[/math]]
can be represented in the plane, according to the following identification:

[[math]] x=\binom{a}{b} [[/math]]
With this convention, the sum of complex numbers is the usual sum of vectors.


Show Proof

Consider indeed two arbitrary complex numbers:

[[math]] x=a+ib\quad,\quad y=c+id [[/math]]


Their sum is then by definition the following complex number:

[[math]] x+y=(a+c)+i(b+d) [[/math]]


Now let us represent [math]x,y[/math] in the plane, as in the statement:

[[math]] x=\binom{a}{b}\quad,\quad y=\binom{c}{d} [[/math]]


In this picture, their sum is given by the following formula:

[[math]] x+y=\binom{a+c}{b+d} [[/math]]


But this is indeed the vector corresponding to [math]x+y[/math], so we are done.

Here we have assumed that you are a bit familiar with vector calculus. If not, no problem, the idea is simply that vectors add by forming a parallelogram, as follows:

[[math]] \xymatrix@R=10pt@C=15pt{ &&&\\ b+d&&&&\bullet^{x+y}\\ d&&\bullet^y\ar@{-}[urr]&&\\ b&&&\bullet_x\ar@{-}[uur]&\\ &\bullet\ar@{-}[urr]\ar[rrrr]\ar[uuuu]\ar@{-}[uur]&&&&\\ &&\ c\ &\ a\ &a+c} [[/math]]


Observe that in our geometric picture from Proposition 5.3, the real numbers correspond to the numbers on the [math]Ox[/math] axis. As for the purely imaginary numbers, these lie on the [math]Oy[/math] axis, with the number [math]i[/math] itself being given by the following formula:

[[math]] i=\binom{0}{1} [[/math]]


As an illustration for this, let us record now a basic picture, with some key complex numbers, namely [math]1,i,-1,-i[/math], represented according to our conventions:

[[math]] \xymatrix@R=15pt@C=13pt{ &&&\\ &&\\ &&&\bullet^i\ar@{-}[d]\ar@{.}@/^/[dr]\ar[uu]\\ \ar@{-}[rr]&&\bullet^{-1}\ar@{-}[r]\ar@{.}@/^/[ur]&\ar@{-}[r]\ar@{-}[d]&\bullet^1\ar@{.}@/^/[dl]\ar[rr]&&\\ &&&\bullet_{-i}\ar@{.}@/^/[ul]\ar@{-}[dd]\\ &&\\ &&&&} [[/math]]


You might perhaps wonder why I chose to draw that circle, connecting the numbers [math]1,i,-1,-i[/math], which does not look very useful. More on this in a moment, the idea being that that circle can be immensely useful. And coming in advance, some advice:

Advice

When drawing complex numbers, always begin with the coordinate axes [math]Ox,Oy[/math], and with a copy of the unit circle.

We have so far a quite good understanding of the complex numbers, and their addition. In order to understand now the multiplication operation, we must do something more complicated, namely using polar coordinates. Let us start with:

Definition

The complex numbers [math]x=a+ib[/math] can be written in polar coordinates,

[[math]] x=r(\cos t+i\sin t) [[/math]]
with the connecting formulae being as follows,

[[math]] a=r\cos t\quad,\quad b=r\sin t [[/math]]
and in the other sense being as follows,

[[math]] r=\sqrt{a^2+b^2}\quad,\quad \tan t=\frac{b}{a} [[/math]]
and with [math]r,t[/math] being called modulus, and argument.

There is a clear relation here with the vector notation from Proposition 5.3, because [math]r[/math] is the length of the vector, and [math]t[/math] is the angle made by the vector with the [math]Ox[/math] axis. To be more precise, the picture for what is going on in Definition 5.5 is as follows:

[[math]] \xymatrix@R=10pt@C=15pt{ &&&\\ &&&\\ b&\ar@{.}[rrr]&&&\bullet^x\ar@{.}[ddd]\\ &&&&\\ &&&&\\ &\bullet\ar[rrrrr]\ar[uuuuu]\ar@{.}[uuurrr]^r&&\ar@{.}@/_/[ul]_t&&&\\ &&&&a} [[/math]]


As a basic example here, the number [math]i[/math] takes the following form:

[[math]] i=\cos\left(\frac{\pi}{2}\right)+i\sin\left(\frac{\pi}{2}\right) [[/math]]


The point now is that in polar coordinates, the multiplication formula for the complex numbers, which was so far something quite opaque, takes a very simple form:

Theorem

Two complex numbers written in polar coordinates,

[[math]] x=r(\cos s+i\sin s)\quad,\quad y=p(\cos t+i\sin t) [[/math]]
multiply according to the following formula:

[[math]] xy=rp(\cos(s+t)+i\sin(s+t)) [[/math]]
In other words, the moduli multiply, and the arguments sum up.


Show Proof

This follows from the following formulae, that we know well:

[[math]] \cos(s+t)=\cos s\cos t-\sin s\sin t [[/math]]

[[math]] \sin(s+t)=\cos s\sin t+\sin s\cos t [[/math]]


Indeed, we can assume that we have [math]r=p=1[/math], by dividing everything by these numbers. Now with this assumption made, we have the following computation:

[[math]] \begin{eqnarray*} xy &=&(\cos s+i\sin s)(\cos t+i\sin t)\\ &=&(\cos s\cos t-\sin s\sin t)+i(\cos s\sin t+\sin s\cos t)\\ &=&\cos(s+t)+i\sin(s+t) \end{eqnarray*} [[/math]]


Thus, we are led to the conclusion in the statement.
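
Here is a short numerical check of this, assuming again Python, whose cmath.rect and cmath.polar functions convert between Cartesian and polar coordinates:

```python
import cmath

x = cmath.rect(2.0, 0.5)    # x = 2(cos 0.5 + i sin 0.5)
y = cmath.rect(3.0, 1.1)    # y = 3(cos 1.1 + i sin 1.1)

r, t = cmath.polar(x * y)
print(r)                    # 6.0: the moduli multiply
print(t, 0.5 + 1.1)         # 1.6 both times: the arguments sum up (no 2*pi wrap here)
```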

The above result, which is based on some non-trivial trigonometry, is quite powerful. As a basic application of it, we can now compute powers, as follows:

Theorem

The powers of a complex number, written in polar form,

[[math]] x=r(\cos t+i\sin t) [[/math]]
are given by the following formula, valid for any exponent [math]k\in\mathbb N[/math]:

[[math]] x^k=r^k(\cos kt+i\sin kt) [[/math]]
Moreover, this formula holds in fact for any [math]k\in\mathbb Z[/math], and even for any [math]k\in\mathbb Q[/math].


Show Proof

Given a complex number [math]x[/math], written in polar form as above, and an exponent [math]k\in\mathbb N[/math], we have indeed the following computation, with [math]k[/math] terms everywhere:

[[math]] \begin{eqnarray*} x^k &=&x\ldots x\\ &=&r(\cos t+i\sin t)\ldots r(\cos t+i\sin t)\\ &=&r^k(\cos(t+\ldots+t)+i\sin(t+\ldots+t))\\ &=&r^k(\cos kt+i\sin kt) \end{eqnarray*} [[/math]]


Thus, we are done with the case [math]k\in\mathbb N[/math]. Regarding now the generalization to the case [math]k\in\mathbb Z[/math], it is enough here to do the verification for [math]k=-1[/math], where the formula is:

[[math]] x^{-1}=r^{-1}(\cos(-t)+i\sin(-t)) [[/math]]


But this number [math]x^{-1}[/math] is indeed the inverse of [math]x[/math], as shown by:

[[math]] \begin{eqnarray*} xx^{-1} &=&r(\cos t+i\sin t)\cdot r^{-1}(\cos(-t)+i\sin(-t))\\ &=&\cos(t-t)+i\sin(t-t)\\ &=&\cos 0+i\sin 0\\ &=&1 \end{eqnarray*} [[/math]]


Finally, regarding the generalization to the case [math]k\in\mathbb Q[/math], it is enough to do the verification for exponents of type [math]k=1/n[/math], with [math]n\in\mathbb N[/math]. The claim here is that:

[[math]] x^{1/n}=r^{1/n}\left[\cos\left(\frac{t}{n}\right)+i\sin\left(\frac{t}{n}\right)\right] [[/math]]


In order to prove this, let us compute the [math]n[/math]-th power of this number. We can use the power formula for the exponent [math]n\in\mathbb N[/math], that we already established, and we obtain:

[[math]] \begin{eqnarray*} (x^{1/n})^n &=&(r^{1/n})^n\left[\cos\left(n\cdot\frac{t}{n}\right)+i\sin\left(n\cdot\frac{t}{n}\right)\right]\\ &=&r(\cos t+i\sin t)\\ &=&x \end{eqnarray*} [[/math]]


Thus, we have indeed an [math]n[/math]-th root of [math]x[/math], and our proof is now complete.
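
Again at the level of computer checks, here is a small Python sketch of the above power formula, first for an integer exponent, and then for an exponent of type [math]1/n[/math]:

```python
import cmath

r, t, k = 1.5, 0.7, 5
x = cmath.rect(r, t)        # x = r(cos t + i sin t)

# x^k = r^k(cos kt + i sin kt), up to rounding
print(abs(x**k - cmath.rect(r**k, k*t)))   # ~0

# candidate cube root, r^{1/3}(cos(t/3) + i sin(t/3))
y = cmath.rect(r**(1/3), t/3)
print(abs(y**3 - x))                       # ~0: indeed a cube root of x
```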

We should mention that there is a bit of ambiguity in the above, in the case of the exponents [math]k\in\mathbb Q[/math], due to the fact that the square roots, and the higher roots as well, can take multiple values, in the complex number setting. We will be back to this.


As a basic application of Theorem 5.7, we have the following result:

Proposition

Each complex number, written in polar form,

[[math]] x=r(\cos t+i\sin t) [[/math]]
has two square roots, given by the following formula:

[[math]] \sqrt{x}=\pm\sqrt{r}\left[\cos\left(\frac{t}{2}\right)+i\sin\left(\frac{t}{2}\right)\right] [[/math]]
When [math]x \gt 0[/math], these roots are [math]\pm\sqrt{x}[/math]. When [math]x \lt 0[/math], these roots are [math]\pm i\sqrt{-x}[/math].


Show Proof

The first assertion is clear indeed from the general formula in Theorem 5.7, at [math]k=1/2[/math]. As for its particular cases with [math]x\in\mathbb R[/math], these are clear from it.

As a comment here, for [math]x \gt 0[/math] we are very used to calling the usual [math]\sqrt{x}[/math] the square root of [math]x[/math]. However, for [math]x \lt 0[/math], or more generally for [math]x\in\mathbb C-\mathbb R_+[/math], there is less interest in choosing one of the possible [math]\sqrt{x}[/math] and calling it “the” square root of [math]x[/math], because all this is based on our convention that [math]i[/math] goes up, instead of down, which is something rather arbitrary. Actually, with clocks turning clockwise, [math]i[/math] should rather be coming down. All this is a matter of taste, but in any case, for our math, the best is to keep some ambiguity, as above.


With the above results in hand, and notably with the square root formula from Proposition 5.8, we can now go back to the degree 2 equations, and we have:

Theorem

The complex solutions of [math]ax^2+bx+c=0[/math] with [math]a,b,c\in\mathbb C[/math] are

[[math]] x_{1,2}=\frac{-b\pm\sqrt{b^2-4ac}}{2a} [[/math]]
with the square root of complex numbers being defined as above.


Show Proof

This is clear, the computations being the same as in the real case. To be more precise, our degree 2 equation can be written as follows:

[[math]] \left(x+\frac{b}{2a}\right)^2=\frac{b^2-4ac}{4a^2} [[/math]]


Now since we know from Proposition 5.8 that any complex number has a square root, we are led to the conclusion in the statement.

As a last general topic regarding the complex numbers, let us discuss conjugation. This is something quite tricky, complex number specific, as follows:

Definition

The complex conjugate of [math]x=a+ib[/math] is the following number,

[[math]] \bar{x}=a-ib [[/math]]
obtained by making a reflection with respect to the [math]Ox[/math] axis.

As before with other such operations on complex numbers, a quick picture says it all. Here is the picture, with the numbers [math]x,\bar{x},-x,-\bar{x}[/math] being all represented:

[[math]] \xymatrix@R=6pt@C=10pt{ &&&&&&\\ &&&&&&\\ &\bullet^{-\bar{x}}\ar@{.}[ddd]\ar@{.}[rrr]&&&\ar@{.}[rrr]&&&\bullet^x\ar@{.}[ddd]\\ &&&&&&&\\ &&&&&&&\\ \ar@{-}[rrrr]&&&&\bullet\ar[rrrrr]\ar[uuuuu]\ar@{.}[dddlll]\ar@{.}[dddrrr]\ar@{.}[uuulll]\ar@{-}[ddddd]\ar@{.}[uuurrr]^r&&\ar@{.}@/_/[ul]_t&&&\\ &&&&&&&\\ &&&&&&&\\ &\bullet_{-x}\ar@{.}[rrr]\ar@{.}[uuu]&&&\ar@{.}[rrr]&&&\bullet_{\bar{x}}\ar@{.}[uuu]\\ &&&&&&\\ &&&&&& } [[/math]]


Observe that the conjugate of a real number [math]x\in\mathbb R[/math] is the number itself, [math]x=\bar{x}[/math]. In fact, the equation [math]x=\bar{x}[/math] characterizes the real numbers, among the complex numbers. At the level of non-trivial examples now, we have the following formula:

[[math]] \bar{i}=-i [[/math]]


There are many things that can be said about the conjugation of the complex numbers, and here is a summary of basic such things that can be said:

Theorem

The conjugation operation [math]x\to\bar{x}[/math] has the following properties:

  • [math]x=\bar{x}[/math] precisely when [math]x[/math] is real.
  • [math]x=-\bar{x}[/math] precisely when [math]x[/math] is purely imaginary.
  • [math]x\bar{x}=|x|^2[/math], with [math]|x|=r[/math] being as usual the modulus.
  • With [math]x=r(\cos t+i\sin t)[/math], we have [math]\bar{x}=r(\cos t-i\sin t)[/math].
  • We have the formula [math]\overline{xy}=\bar{x}\bar{y}[/math], for any [math]x,y\in\mathbb C[/math].
  • The solutions of [math]ax^2+bx+c=0[/math] with [math]a,b,c\in\mathbb R[/math] are conjugate.


Show Proof

These results are all elementary, the idea being as follows:


(1) This is something that we already know, coming from definitions.


(2) This is something clear too, because with [math]x=a+ib[/math] our equation [math]x=-\bar{x}[/math] reads [math]a+ib=-a+ib[/math], and so [math]a=0[/math], which amounts to saying that [math]x[/math] is purely imaginary.


(3) This is a key formula, which can be proved as follows, with [math]x=a+ib[/math]:

[[math]] \begin{eqnarray*} x\bar{x} &=&(a+ib)(a-ib)\\ &=&a^2+b^2\\ &=&|x|^2 \end{eqnarray*} [[/math]]


(4) This is clear indeed from the picture following Definition 5.10.


(5) This is something quite magic, which can be proved as follows:

[[math]] \begin{eqnarray*} \overline{(a+ib)(c+id)} &=&\overline{(ac-bd)+i(ad+bc)}\\ &=&(ac-bd)-i(ad+bc)\\ &=&(a-ib)(c-id) \end{eqnarray*} [[/math]]


However, what we have been doing here is not very clear, geometrically speaking, and our formula is worth an alternative proof. Here is that proof, which after inspection contains no computations at all, making it clear that the polar writing is the best:

[[math]] \begin{eqnarray*} &&\overline{r(\cos s+i\sin s)\cdot p(\cos t+i\sin t)}\\ &=&\overline{rp(\cos (s+t)+i\sin(s+t))}\\ &=&rp(\cos(-s-t)+i\sin(-s-t))\\ &=&r(\cos(-s)+i\sin(-s))\cdot p(\cos(-t)+i\sin(-t))\\ &=&\overline{r(\cos s+i\sin s)}\cdot\overline{p(\cos t+i\sin t)} \end{eqnarray*} [[/math]]


(6) This comes from the formula of the solutions, that we know from Theorem 5.2, but we can deduce this as well directly, without computations. Indeed, by using our assumption that the coefficients are real, [math]a,b,c\in\mathbb R[/math], we have:

[[math]] \begin{eqnarray*} ax^2+bx+c=0 &\implies&\overline{ax^2+bx+c}=0\\ &\implies&\bar{a}\bar{x}^2+\bar{b}\bar{x}+\bar{c}=0\\ &\implies&a\bar{x}^2+b\bar{x}+c=0 \end{eqnarray*} [[/math]]


Thus, we are led to the conclusion in the statement.
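
As usual, the skeptical reader can check all this on a computer; here is a quick Python verification of (3) and (5), with conjugate being Python's built-in conjugation:

```python
x, y = 2+3j, -1+4j

print((x*y).conjugate())                # conjugate of the product
print(x.conjugate() * y.conjugate())    # product of the conjugates: the same

print(x * x.conjugate(), abs(x)**2)     # (13+0j) and 13.0: indeed |x|^2
```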

5b. Exponential writing

We discuss now the theory of complex functions [math]f:\mathbb C\to\mathbb C[/math], in analogy with the theory of the real functions [math]f:\mathbb R\to\mathbb R[/math]. We will see that many results that we know from the real setting extend to the complex setting. Before starting, two remarks on this:


(1) Most of the real functions [math]f:\mathbb R\to\mathbb R[/math] that we know, such as [math]\sin,\cos,\exp,\log[/math], extend into complex functions [math]f:\mathbb C\to\mathbb C[/math], and the study of these latter extensions brings some new light on the original real functions. Thus, what we will be doing here will be, in a certain sense, a refinement of the theory developed in chapters 1-4.


(2) On the other hand, since we have [math]\mathbb C\simeq\mathbb R^2[/math], the complex functions [math]f:\mathbb C\to\mathbb C[/math] that we will study here can be regarded as functions [math]f:\mathbb R^2\to\mathbb R^2[/math]. This is something quite subtle, but in any case, what we will be doing here will stand as well as an introduction to the functions of type [math]f:\mathbb R^N\to\mathbb R^M[/math], that we will study in chapters 9-16 below.


In short, one complex variable is something in between one real variable, and two or more real variables, and we can only expect to end up with a mysterious mixture of surprising and unsurprising results. Welcome to complex analysis. Let us start with:

Definition

A complex function [math]f:\mathbb C\to\mathbb C[/math], or more generally [math]f:X\to\mathbb C[/math], with [math]X\subset\mathbb C[/math] being a subset, is called continuous when, for any [math]x_n,x\in X[/math]:

[[math]] x_n\to x\implies f(x_n)\to f(x) [[/math]]
where the convergence of the sequences of complex numbers, [math]x_n\to x[/math], means by definition that for [math]n[/math] big enough, the quantity [math]|x_n-x|[/math] becomes arbitrarily small.

Observe that in real coordinates, [math]x=(a,b)[/math], the distances appearing in the definition of the convergence [math]x_n\to x[/math] are given by the following formula:

[[math]] |x_n-x|=\sqrt{(a_n-a)^2+(b_n-b)^2} [[/math]]


Thus [math]x_n\to x[/math] in the complex sense means that [math](a_n,b_n)\to(a,b)[/math] in the usual, intuitive sense, with respect to the usual distance in the plane [math]\mathbb R^2[/math], and as a consequence, a function [math]f:\mathbb C\to\mathbb C[/math] is continuous precisely when it is continuous, in an intuitive sense, when regarded as function [math]f:\mathbb R^2\to\mathbb R^2[/math]. But more on this later, in chapters 9-10 below.


At the level of examples, we have the following result:

Theorem

We can exponentiate the complex numbers, according to the formula

[[math]] e^x=\sum_{k=0}^\infty\frac{x^k}{k!} [[/math]]
and the function [math]x\to e^x[/math] is continuous, and satisfies [math]e^{x+y}=e^xe^y[/math].


Show Proof

We must first prove that the series converges. But this follows from:

[[math]] \begin{eqnarray*} |e^x| &=&\left|\sum_{k=0}^\infty\frac{x^k}{k!}\right|\\ &\leq&\sum_{k=0}^\infty\left|\frac{x^k}{k!}\right|\\ &=&\sum_{k=0}^\infty\frac{|x|^k}{k!}\\ &=&e^{|x|} \lt \infty \end{eqnarray*} [[/math]]


Regarding the formula [math]e^{x+y}=e^xe^y[/math], this follows too as in the real case, as follows:

[[math]] \begin{eqnarray*} e^{x+y} &=&\sum_{k=0}^\infty\frac{(x+y)^k}{k!}\\ &=&\sum_{k=0}^\infty\sum_{s=0}^k\binom{k}{s}\cdot\frac{x^sy^{k-s}}{k!}\\ &=&\sum_{k=0}^\infty\sum_{s=0}^k\frac{x^sy^{k-s}}{s!(k-s)!}\\ &=&e^xe^y \end{eqnarray*} [[/math]]


Finally, the continuity of [math]x\to e^x[/math] comes at [math]x=0[/math] from the following computation:

[[math]] \begin{eqnarray*} |e^t-1| &=&\left|\sum_{k=1}^\infty\frac{t^k}{k!}\right|\\ &\leq&\sum_{k=1}^\infty\left|\frac{t^k}{k!}\right|\\ &=&\sum_{k=1}^\infty\frac{|t|^k}{k!}\\ &=&e^{|t|}-1 \end{eqnarray*} [[/math]]


As for the continuity of [math]x\to e^x[/math] in general, this can be deduced now as follows:

[[math]] \lim_{t\to0}e^{x+t}=\lim_{t\to0}e^xe^t=e^x\lim_{t\to0}e^t=e^x\cdot 1=e^x [[/math]]


Thus, we are led to the conclusions in the statement.
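
Here is a minimal computational sketch of all this, with the series summed term by term, and with the number of terms, 40, being an ad-hoc choice of ours, largely sufficient at small [math]|x|[/math]:

```python
import cmath

def exp_series(x, terms=40):
    # e^x = sum_k x^k / k!, summed term by term
    total, term = 0j, 1+0j
    for k in range(terms):
        total += term
        term *= x / (k + 1)   # turns x^k/k! into x^{k+1}/(k+1)!
    return total

x, y = 0.3+0.8j, -1.1+0.2j
print(abs(exp_series(x) - cmath.exp(x)))                 # ~0
print(abs(cmath.exp(x+y) - cmath.exp(x)*cmath.exp(y)))   # ~0: e^{x+y} = e^x e^y
```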

We will be back to more functions later. As an important fact, however, let us point out that, contrary to what the above might suggest, not everything extends trivially from the real to the complex case. For instance, we have:

Proposition

We have the following formula, valid for any [math]|x| \lt 1[/math],

[[math]] \frac{1}{1-x}=1+x+x^2+\ldots [[/math]]
but, unlike in the real case, the geometric meaning of this formula is quite unclear.


Show Proof

Here the formula in the statement holds indeed, by multiplying and cancelling terms, and with the convergence being justified by the following estimate:

[[math]] \left|\sum_{n=0}^\infty x^n\right|\leq\sum_{n=0}^\infty|x|^n=\frac{1}{1-|x|} [[/math]]


As for the last assertion, this is something quite informal. To be more precise, for [math]x=1/2[/math] our formula is clear, by cutting the interval [math][0,2][/math] into half, and so on:

[[math]] 1+\frac{1}{2}+\frac{1}{4}+\frac{1}{8}+\ldots=2 [[/math]]


More generally, for [math]x\in(-1,1)[/math] the meaning of the formula in the statement is something quite clear and intuitive, geometrically speaking, by using a similar argument. However, when [math]x[/math] is complex, and not real, we are led into a kind of mysterious spiral there, and the only case where the formula is “obvious”, geometrically speaking, is that when [math]x=rw[/math], with [math]r\in[0,1)[/math], and with [math]w[/math] being a root of unity. To be more precise here, by anticipating a bit, assume that we have a number [math]w\in\mathbb C[/math] satisfying [math]w^N=1[/math], for some [math]N\in\mathbb N[/math]. We have then the following formula, for our infinite sum:

[[math]] \begin{eqnarray*} 1+rw+r^2w^2+\ldots &=&(1+rw+\ldots+r^{N-1}w^{N-1})\\ &+&(r^N+r^{N+1}w+\ldots+r^{2N-1}w^{N-1})\\ &+&(r^{2N}+r^{2N+1}w+\ldots+r^{3N-1}w^{N-1})\\ &+&\ldots \end{eqnarray*} [[/math]]


Thus, by grouping the terms with the same argument, our infinite sum is:

[[math]] \begin{eqnarray*} 1+rw+r^2w^2+\ldots &=&(1+r^N+r^{2N}+\ldots)\\ &+&(r+r^{N+1}+r^{2N+1}+\ldots)w\\ &+&\ldots\\ &+&(r^{N-1}+r^{2N-1}+r^{3N-1}+\ldots)w^{N-1} \end{eqnarray*} [[/math]]


But the sums of each ray can be computed with the real formula for geometric series, that we know and understand well, and with an extra bit of algebra, we get:

[[math]] \begin{eqnarray*} 1+rw+r^2w^2+\ldots &=&\frac{1}{1-r^N}+\frac{rw}{1-r^N}+\ldots+\frac{r^{N-1}w^{N-1}}{1-r^N}\\ &=&\frac{1}{1-r^N}\left(1+rw+\ldots+r^{N-1}w^{N-1}\right)\\ &=&\frac{1}{1-r^N}\cdot\frac{1-r^N}{1-rw}\\ &=&\frac{1}{1-rw} \end{eqnarray*} [[/math]]


Summarizing, as claimed above, the geometric series formula can be understood, in a purely geometric way, for variables of type [math]x=rw[/math], with [math]r\in[0,1)[/math], and with [math]w[/math] being a root of unity. In general, however, this formula tells us that the numbers on a certain infinite spiral sum up to a certain number, which remains something quite mysterious.
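
For the reader wanting to see that mysterious spiral, here is a small Python sketch, summing the series at a complex number [math]x[/math] of modulus [math]0.9[/math], whose direction is not an obvious root of unity:

```python
# Partial sums of 1 + x + x^2 + ... spiral towards 1/(1-x), when |x| < 1
x = 0.9 * complex(0.6, 0.8)      # modulus 0.9, genuinely complex direction
partial, term = 0j, 1+0j
for n in range(200):
    partial += term              # print(partial) here, to watch the spiral
    term *= x
print(partial)                   # very close to...
print(1 / (1 - x))               # ...the limit 1/(1-x)
```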

Getting back now to less mysterious mathematics, which in fact will turn out to be quite mysterious as well, as is often the case with things involving complex numbers, as an application of all this, let us discuss the final and most convenient writing of the complex numbers, which is a variation on the polar writing, as follows:

[[math]] x=re^{it} [[/math]]


The point with this formula comes from the following deep result:

Theorem

We have the following formula,

[[math]] e^{it}=\cos t+i\sin t [[/math]]
valid for any [math]t\in\mathbb R[/math].


Show Proof

Our claim is that this follows from the formula of the complex exponential, and from the following formulae for the Taylor series of [math]\cos[/math] and [math]\sin[/math], that we know well:

[[math]] \cos t=\sum_{l=0}^\infty(-1)^l\frac{t^{2l}}{(2l)!}\quad,\quad \sin t=\sum_{l=0}^\infty(-1)^l\frac{t^{2l+1}}{(2l+1)!} [[/math]]


Indeed, let us first recall from Theorem 5.13 that we have the following formula, for the exponential of an arbitrary complex number [math]x\in\mathbb C[/math]:

[[math]] e^x=\sum_{k=0}^\infty\frac{x^k}{k!} [[/math]]


Now let us plug [math]x=it[/math] in this formula. We obtain the following formula:

[[math]] \begin{eqnarray*} e^{it} &=&\sum_{k=0}^\infty\frac{(it)^k}{k!}\\ &=&\sum_{k=2l}\frac{(it)^k}{k!}+\sum_{k=2l+1}\frac{(it)^k}{k!}\\ &=&\sum_{l=0}^\infty(-1)^l\frac{t^{2l}}{(2l)!}+i\sum_{l=0}^\infty(-1)^l\frac{t^{2l+1}}{(2l+1)!}\\ &=&\cos t+i\sin t \end{eqnarray*} [[/math]]


Thus, we are led to the conclusion in the statement.

As a main application of the above formula, we have:

Theorem

We have the following formula,

[[math]] e^{\pi i}=-1 [[/math]]
and we have [math]E=mc^2[/math] as well.


Show Proof

We have two assertions here, the idea being as follows:


(1) The first formula, [math]e^{\pi i}=-1[/math], which is actually the main formula in mathematics, comes from Theorem 5.15, by setting [math]t=\pi[/math]. Indeed, we obtain:

[[math]] \begin{eqnarray*} e^{\pi i} &=&\cos\pi+i\sin\pi\\ &=&-1+i\cdot 0\\ &=&-1 \end{eqnarray*} [[/math]]


(2) As for [math]E=mc^2[/math], which is the main formula in physics, this is something deep too. Although we will not really need it here, we recommend learning it as well, for symmetry reasons between math and physics, say from Feynman [1], [2], [3].
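
Regarding (1), and the formula [math]e^{it}=\cos t+i\sin t[/math] that it comes from, here is the obligatory numerical check, again in Python:

```python
import cmath, math

t = 2.3
print(cmath.exp(1j*t))                    # e^{it}
print(complex(math.cos(t), math.sin(t)))  # cos t + i sin t: the same number

print(cmath.exp(1j*math.pi))              # essentially -1: e^{pi i} = -1, up to rounding
```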

Now back to our [math]x=re^{it}[/math] objectives, with the above theory in hand we can indeed use from now on this notation, the complete statement being as follows:

Theorem

The complex numbers [math]x=a+ib[/math] can be written in polar coordinates,

[[math]] x=re^{it} [[/math]]
with the connecting formulae being

[[math]] a=r\cos t\quad,\quad b=r\sin t [[/math]]
and in the other sense being

[[math]] r=\sqrt{a^2+b^2}\quad,\quad \tan t=\frac{b}{a} [[/math]]
and with [math]r,t[/math] being called modulus, and argument.


Show Proof

This is a reformulation of our previous Definition 5.5, by using the formula [math]e^{it}=\cos t+i\sin t[/math] from Theorem 5.15, and multiplying everything by [math]r[/math].

With this in hand, we can now go back to the basics, namely the addition and multiplication of the complex numbers. We have the following result:

Theorem

In polar coordinates, the complex numbers multiply as

[[math]] re^{is}\cdot pe^{it}=rp\,e^{i(s+t)} [[/math]]
with the arguments [math]s,t[/math] being taken modulo [math]2\pi[/math].


Show Proof

This is something that we already know, from Theorem 5.6, reformulated by using the notations from Theorem 5.17. Observe that this follows as well directly, from the fact that we have [math]e^{a+b}=e^ae^b[/math], that we know from analysis.

The above formula is obviously very powerful. However, in polar coordinates we do not have a simple formula for the sum. Thus, this formalism has its limitations.


We can investigate as well more complicated operations, as follows:

Theorem

We have the following operations on the complex numbers, written in polar form, as above:

  • Inversion: [math](re^{it})^{-1}=r^{-1}e^{-it}[/math].
  • Square roots: [math]\sqrt{re^{it}}=\pm\sqrt{r}e^{it/2}[/math].
  • Powers: [math](re^{it})^a=r^ae^{ita}[/math].
  • Conjugation: [math]\overline{re^{it}}=re^{-it}[/math].


Show Proof

This is something that we already know, from Theorem 5.7, but we can now discuss all this, from a more conceptual viewpoint, the idea being as follows:


(1) We have indeed the following computation, using Theorem 5.18:

[[math]] \begin{eqnarray*} (re^{it})(r^{-1}e^{-it}) &=&rr^{-1}\cdot e^{i(t-t)}\\ &=&1\cdot 1\\ &=&1 \end{eqnarray*} [[/math]]


(2) Once again by using Theorem 5.18, we have:

[[math]] (\pm\sqrt{r}e^{it/2})^2 =(\sqrt{r})^2e^{i(t/2+t/2)} =re^{it} [[/math]]


(3) Given an arbitrary number [math]a\in\mathbb R[/math], we can define, as stated:

[[math]] (re^{it})^a=r^ae^{ita} [[/math]]


Due to Theorem 5.18, this operation [math]x\to x^a[/math] is indeed the correct one.


(4) This comes from the fact, that we know from Theorem 5.11, that the conjugation operation [math]x\to\bar{x}[/math] keeps the modulus, and switches the sign of the argument.

5c. Equations, roots

Getting back to algebra, recall from Theorem 5.9 that any degree 2 equation has 2 complex roots. We can in fact prove that any polynomial equation, of arbitrary degree [math]N\in\mathbb N[/math], has exactly [math]N[/math] complex solutions, counted with multiplicities:

Theorem

Any polynomial [math]P\in\mathbb C[X][/math] decomposes as

[[math]] P=c(X-a_1)\ldots (X-a_N) [[/math]]
with [math]c\in\mathbb C[/math] and with [math]a_1,\ldots,a_N\in\mathbb C[/math].


Show Proof

The problem is that of proving that our polynomial has at least one root, because afterwards we can proceed by recurrence. We prove this by contradiction. So, assume that [math]P[/math] has no roots, and pick a number [math]z\in\mathbb C[/math] where [math]|P|[/math] attains its minimum:

[[math]] |P(z)|=\min_{x\in\mathbb C}|P(x)| \gt 0 [[/math]]

Since [math]Q(t)=P(z+t)-P(z)[/math] is a polynomial which vanishes at [math]t=0[/math], this polynomial must be of the form [math]ct^k[/math] + higher terms, with [math]c\neq0[/math], and with [math]k\geq1[/math] being an integer. We obtain from this that, with [math]t\in\mathbb C[/math] small, we have the following estimate:

[[math]] P(z+t)\simeq P(z)+ct^k [[/math]]


Now let us write [math]t=rw[/math], with [math]r \gt 0[/math] small, and with [math]|w|=1[/math]. Our estimate becomes:

[[math]] P(z+rw)\simeq P(z)+cr^kw^k [[/math]]


Now recall that we assumed [math]P(z)\neq0[/math]. We can therefore choose [math]w\in\mathbb T[/math] such that [math]cw^k[/math] points in the opposite direction to that of [math]P(z)[/math], and we obtain in this way:

[[math]] \begin{eqnarray*} |P(z+rw)| &\simeq&|P(z)+cr^kw^k|\\ &=&|P(z)|-|c|r^k \end{eqnarray*} [[/math]]


Now by choosing [math]r \gt 0[/math] small enough, so that the error in the first estimate is small, and overcome by the negative quantity [math]-|c|r^k[/math], we obtain from this:

[[math]] |P(z+rw)| \lt |P(z)| [[/math]]


But this contradicts our definition of [math]z\in\mathbb C[/math], as a point where [math]|P|[/math] attains its minimum. Thus [math]P[/math] has a root, and by recurrence it has [math]N[/math] roots, as stated.

All this is very nice, and we will see applications in a moment. As a word of warning, however, we should mention that the above result remains something quite theoretical. Indeed, the proof is by contradiction, and there is no way of recycling the material there into something explicit, that can be used for effectively computing the roots.
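
That being said, numerical methods do compute the roots very well, just not via the above proof. Here is an illustration, assuming Python with the numpy library, whose roots function computes the eigenvalues of the companion matrix of the polynomial:

```python
import numpy as np

coeffs = [1, 0, 0, 1, 1]     # P = X^4 + X + 1, highest degree first
roots = np.roots(coeffs)     # eigenvalues of the companion matrix of P
print(roots)                 # 4 complex roots, as guaranteed by the theorem
print([abs(np.polyval(coeffs, r)) for r in roots])   # all ~0
```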


Still talking polynomials and their roots, let us try however to understand what the analogue of [math]\Delta=b^2-4ac[/math] is, for an arbitrary polynomial [math]P\in\mathbb C[X][/math]. We will need:

Theorem

Given two polynomials [math]P,Q\in\mathbb C[X][/math], written as follows,

[[math]] P=c(X-a_1)\ldots(X-a_k)\quad,\quad Q=d(X-b_1)\ldots(X-b_l) [[/math]]
the following quantity, which is called resultant of [math]P,Q[/math],

[[math]] R(P,Q)=c^ld^k\prod_{ij}(a_i-b_j) [[/math]]
is a polynomial in the coefficients of [math]P,Q[/math], with integer coefficients, and we have

[[math]] R(P,Q)=0 [[/math]]
precisely when [math]P,Q[/math] have a common root.


Show Proof

Given [math]P,Q\in\mathbb C[X][/math], we can certainly construct the quantity [math]R(P,Q)[/math] in the statement, and we have then [math]R(P,Q)=0[/math] precisely when [math]P,Q[/math] have a common root. The whole point is that of proving that [math]R(P,Q)[/math] is a polynomial in the coefficients of [math]P,Q[/math], with integer coefficients. But this can be checked as follows:


(1) We can expand the formula of [math]R(P,Q)[/math], and in what regards [math]a_1,\ldots,a_k[/math], which are the roots of [math]P[/math], we obtain in this way certain symmetric functions in these variables, which will be therefore polynomials in the coefficients of [math]P[/math], with integer coefficients.


(2) We can then look what happens with respect to the remaining variables [math]b_1,\ldots,b_l[/math], which are the roots of [math]Q[/math]. Once again what we have here are certain symmetric functions, and so polynomials in the coefficients of [math]Q[/math], with integer coefficients.


(3) Thus, we are led to the conclusion in the statement, that [math]R(P,Q)[/math] is a polynomial in the coefficients of [math]P,Q[/math], with integer coefficients, and with the remark that the [math]c^ld^k[/math] factor is there for these latter coefficients to be indeed integers, instead of rationals.

All this might seem a bit complicated, and as an illustration, let us work out an example. Consider the case of a polynomial of degree 2, and a polynomial of degree 1:

[[math]] P=ax^2+bx+c\quad,\quad Q=dx+e [[/math]]


In order to compute the resultant, let us factorize our polynomials:

[[math]] P=a(x-p)(x-q)\quad,\quad Q=d(x-r) [[/math]]


The resultant can be then computed as follows, by using the method above:

[[math]] \begin{eqnarray*} R(P,Q) &=&ad^2(p-r)(q-r)\\ &=&ad^2(pq-(p+q)r+r^2)\\ &=&cd^2+bd^2r+ad^2r^2\\ &=&cd^2-bde+ae^2 \end{eqnarray*} [[/math]]


Finally, observe that [math]R(P,Q)=0[/math] corresponds indeed to the fact that [math]P,Q[/math] have a common root. Indeed, the root of [math]Q[/math] is [math]r=-e/d[/math], and we have:

[[math]] P(r) =\frac{ae^2}{d^2}-\frac{be}{d}+c =\frac{R(P,Q)}{d^2} [[/math]]


Thus [math]P(r)=0[/math] precisely when [math]R(P,Q)=0[/math], as predicted by Theorem 5.21.
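
As a verification of all this, here is a symbolic check using the sympy library, whose resultant function implements precisely the notion from Theorem 5.21:

```python
from sympy import symbols, resultant

x, a, b, c, d, e = symbols('x a b c d e')
P = a*x**2 + b*x + c
Q = d*x + e

print(resultant(P, Q, x).expand())   # a*e^2 - b*d*e + c*d^2, as computed above
```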


With this, we can now talk about the discriminant of any polynomial, as follows:

Theorem

Given a polynomial [math]P\in\mathbb C[X][/math], written as

[[math]] P(X)=cX^N+dX^{N-1}+\ldots [[/math]]
its discriminant, defined as being the following quantity,

[[math]] \Delta(P)=\frac{(-1)^{\binom{N}{2}}}{c}R(P,P') [[/math]]
is a polynomial in the coefficients of [math]P[/math], with integer coefficients, and

[[math]] \Delta(P)=0 [[/math]]
happens precisely when [math]P[/math] has a double root.


Show Proof

This follows from Theorem 5.21, applied with [math]Q=P'[/math], with the division by [math]c[/math] being indeed possible, over [math]\mathbb Z[/math], and with the sign being there for various reasons, including the compatibility with some well-known formulae, at small values of [math]N\in\mathbb N[/math].

As an illustration, let us see what happens in degree 2. Here we have:

[[math]] P=aX^2+bX+c\quad,\quad P'=2aX+b [[/math]]


Thus, the resultant is given by the following formula:

[[math]] \begin{eqnarray*} R(P,P') &=&ab^2-b(2a)b+c(2a)^2\\ &=&4a^2c-ab^2\\ &=&-a(b^2-4ac) \end{eqnarray*} [[/math]]


With the normalizations in Theorem 5.22 made, we obtain, as we should:

[[math]] \Delta(P)=b^2-4ac [[/math]]


As another illustration, let us work out what happens in degree 3. Here the result, which is useful and interesting, and is probably new to you, is as follows:

Theorem

The discriminant of a degree [math]3[/math] polynomial,

[[math]] P=aX^3+bX^2+cX+d [[/math]]
is the number [math]\Delta(P)=b^2c^2-4ac^3-4b^3d-27a^2d^2+18abcd[/math].


Show Proof

We need to do some tough computations here. Let us first compute resultants. Consider two polynomials, of degree 3 and degree 2, written as follows:

[[math]] P=aX^3+bX^2+cX+d=a(X-p)(X-q)(X-r) [[/math]]

[[math]] Q=eX^2+fX+g=e(X-s)(X-t) [[/math]]


The resultant of these two polynomials is then given by:

[[math]] \begin{eqnarray*} R(P,Q) &=&a^2e^3(p-s)(p-t)(q-s)(q-t)(r-s)(r-t)\\ &=&a^2\cdot e(p-s)(p-t)\cdot e(q-s)(q-t)\cdot e(r-s)(r-t)\\ &=&a^2Q(p)Q(q)Q(r)\\ &=&a^2(ep^2+fp+g)(eq^2+fq+g)(er^2+fr+g) \end{eqnarray*} [[/math]]


By expanding, we obtain the following formula for this resultant:

[[math]] \begin{eqnarray*} \frac{R(P,Q)}{a^2} &=&e^3p^2q^2r^2+e^2f(p^2q^2r+p^2qr^2+pq^2r^2)\\ &+&e^2g(p^2q^2+p^2r^2+q^2r^2)+ef^2(p^2qr+pq^2r+pqr^2)\\ &+&efg(p^2q+pq^2+p^2r+pr^2+q^2r+qr^2)+f^3pqr\\ &+&eg^2(p^2+q^2+r^2)+f^2g(pq+pr+qr)\\ &+&fg^2(p+q+r)+g^3 \end{eqnarray*} [[/math]]


Note in passing that we have 27 terms on the right, as we should, and with this kind of check being mandatory, when doing such computations. Next, we have:

[[math]] p+q+r=-\frac{b}{a}\quad,\quad pq+pr+qr=\frac{c}{a}\quad,\quad pqr=-\frac{d}{a} [[/math]]


By using these formulae, we can produce some more, as follows:

[[math]] p^2+q^2+r^2=(p+q+r)^2-2(pq+pr+qr)=\frac{b^2}{a^2}-\frac{2c}{a} [[/math]]

[[math]] p^2q+pq^2+p^2r+pr^2+q^2r+qr^2=(p+q+r)(pq+pr+qr)-3pqr=-\frac{bc}{a^2}+\frac{3d}{a} [[/math]]

[[math]] p^2q^2+p^2r^2+q^2r^2=(pq+pr+qr)^2-2pqr(p+q+r)=\frac{c^2}{a^2}-\frac{2bd}{a^2} [[/math]]


By plugging now this data into the formula of [math]R(P,Q)[/math], we obtain:

[[math]] \begin{eqnarray*} R(P,Q) &=&a^2e^3\cdot\frac{d^2}{a^2}-a^2e^2f\cdot\frac{cd}{a^2}+a^2e^2g\left(\frac{c^2}{a^2}-\frac{2bd}{a^2}\right)+a^2ef^2\cdot\frac{bd}{a^2}\\ &+&a^2efg\left(-\frac{bc}{a^2}+\frac{3d}{a}\right)-a^2f^3\cdot\frac{d}{a}\\ &+&a^2eg^2\left(\frac{b^2}{a^2}-\frac{2c}{a}\right)+a^2f^2g\cdot\frac{c}{a}-a^2fg^2\cdot\frac{b}{a}+a^2g^3 \end{eqnarray*} [[/math]]


Thus, we have the following formula for the resultant:

[[math]] \begin{eqnarray*} R(P,Q) &=&d^2e^3-cde^2f+c^2e^2g-2bde^2g+bdef^2-bcefg+3adefg\\ &-&adf^3+b^2eg^2-2aceg^2+acf^2g-abfg^2+a^2g^3 \end{eqnarray*} [[/math]]


Getting back now to our discriminant problem, with [math]Q=P'[/math], which corresponds to [math]e=3a[/math], [math]f=2b[/math], [math]g=c[/math], we obtain the following formula:

[[math]] \begin{eqnarray*} R(P,P') &=&27a^3d^2-18a^2bcd+9a^2c^3-18a^2bcd+12ab^3d-6ab^2c^2+18a^2bcd\\ &-&8ab^3d+3ab^2c^2-6a^2c^3+4ab^2c^2-2ab^2c^2+a^2c^3 \end{eqnarray*} [[/math]]


By simplifying terms, and dividing by [math]a[/math], we obtain the following formula:

[[math]] -\Delta(P)=27a^2d^2-18abcd+4ac^3+4b^3d-b^2c^2 [[/math]]


But this gives the formula in the statement, and we are done.
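
Such computations being tedious, let us mention that sympy can check them for us, its discriminant function using the same normalizations as in Theorem 5.22:

```python
from sympy import symbols, discriminant

x, a, b, c, d = symbols('x a b c d')

print(discriminant(a*x**2 + b*x + c, x))   # b^2 - 4ac, as before
print(discriminant(a*x**3 + b*x**2 + c*x + d, x).expand())
# b^2 c^2 - 4ac^3 - 4b^3 d - 27a^2 d^2 + 18abcd, as in the theorem
```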

Still talking degree 3 equations, let us try to solve [math]P=0[/math], with [math]P=aX^3+bX^2+cX+d[/math] as above. By linear transformations we can assume [math]a=1,b=0[/math], and then it is convenient to write [math]c=3p,d=2q[/math]. Thus, our equation becomes [math]x^3+3px+2q=0[/math], and regarding such equations, we have the following famous result, due to Cardano:

Theorem

For a normalized degree [math]3[/math] equation, namely

[[math]] x^3+3px+2q=0 [[/math]]
the discriminant is [math]\Delta=-108(p^3+q^2)[/math]. Assuming [math]p,q\in\mathbb R[/math] and [math]\Delta \lt 0[/math], the number

[[math]] x=\sqrt[3]{-q+\sqrt{p^3+q^2}}+\sqrt[3]{-q-\sqrt{p^3+q^2}} [[/math]]
is a real solution of our equation.


Show Proof

The formula of [math]\Delta[/math] is clear from definitions, and with [math]108=4\times 27[/math]. Now with [math]x[/math] as in the statement, by using [math](a+b)^3=a^3+b^3+3ab(a+b)[/math], we have:

[[math]] \begin{eqnarray*} x^3 &=&\left(\sqrt[3]{-q+\sqrt{p^3+q^2}}+\sqrt[3]{-q-\sqrt{p^3+q^2}}\right)^3\\ &=&-2q+3\sqrt[3]{-q+\sqrt{p^3+q^2}}\cdot\sqrt[3]{-q-\sqrt{p^3+q^2}}\cdot x\\ &=&-2q+3\sqrt[3]{q^2-p^3-q^2}\cdot x\\ &=&-2q-3px \end{eqnarray*} [[/math]]


Thus, we are led to the conclusion in the statement.
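
Here is a quick numerical verification of Cardano's formula, in a case with [math]p,q \gt 0[/math], where [math]p^3+q^2 \gt 0[/math], so that the square root is the usual real one:

```python
# Cardano's formula for x^3 + 3px + 2q = 0, in the case p^3 + q^2 > 0
p, q = 1.0, 2.0
s = (p**3 + q**2) ** 0.5

def cbrt(u):                    # real cube root, valid for negative u too
    return u**(1/3) if u >= 0 else -((-u)**(1/3))

x = cbrt(-q + s) + cbrt(-q - s)
print(x)                        # -1.0, in this case
print(x**3 + 3*p*x + 2*q)       # ~0, as it should be
```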

There are many more things that can be said about degree 3 equations, along these lines, and we will certainly have an exercise about this, at the end of this chapter.

5d. Roots of unity

We kept the best for the end. As a last topic regarding the complex numbers, which is something really beautiful, we have the roots of unity. Let us start with:

Theorem

The equation [math]x^N=1[/math] has [math]N[/math] complex solutions, namely

[[math]] \left\{w^k\Big|k=0,1,\ldots,N-1\right\}\quad,\quad w=e^{2\pi i/N} [[/math]]
which are called roots of unity of order [math]N[/math].


Show Proof

This follows from the general multiplication formula for complex numbers, from Theorem 5.18. Indeed, with [math]x=re^{it}[/math] our equation reads:

[[math]] r^Ne^{itN}=1 [[/math]]


Thus [math]r=1[/math], and [math]t\in[0,2\pi)[/math] must be a multiple of [math]2\pi/N[/math], as stated.

As an illustration here, the roots of unity of small order, along with some of their basic properties, which are very useful for computations, are as follows:


\underline{[math]N=1[/math]}. Here the unique root of unity is 1.


\underline{[math]N=2[/math]}. Here we have two roots of unity, namely 1 and [math]-1[/math].


\underline{[math]N=3[/math]}. Here we have 1, then [math]w=e^{2\pi i/3}[/math], and then [math]w^2=\bar{w}=e^{4\pi i/3}[/math].


\underline{[math]N=4[/math]}. Here the roots of unity, read as usual counterclockwise, are [math]1,i,-1,-i[/math].


\underline{[math]N=5[/math]}. Here, with [math]w=e^{2\pi i/5}[/math], the roots of unity are [math]1,w,w^2,w^3,w^4[/math].


\underline{[math]N=6[/math]}. Here a useful alternative writing is [math]\{\pm1,\pm w,\pm w^2\}[/math], with [math]w=e^{2\pi i/3}[/math].


\underline{[math]N=7[/math]}. Here, with [math]w=e^{2\pi i/7}[/math], the roots of unity are [math]1,w,w^2,w^3,w^4,w^5,w^6[/math].


\underline{[math]N=8[/math]}. Here the roots of unity, read as usual counterclockwise, are the numbers [math]1,w,i,iw,-1,-w,-i,-iw[/math], with [math]w=e^{\pi i/4}[/math], which is also given by [math]w=(1+i)/\sqrt{2}[/math].


The roots of unity are very useful variables, and have many interesting properties. As a first application, we can now solve the ambiguity questions related to the extraction of [math]N[/math]-th roots, from Theorem 5.7 and Theorem 5.19, the statement being as follows:

Theorem

Any nonzero complex number, written as

[[math]] x=re^{it} [[/math]]
has exactly [math]N[/math] roots of order [math]N[/math], which appear as

[[math]] y=r^{1/N}e^{it/N} [[/math]]
multiplied by the [math]N[/math] roots of unity of order [math]N[/math].


Show Proof

We must solve the equation [math]z^N=x[/math], over the complex numbers. Since the number [math]y[/math] in the statement clearly satisfies [math]y^N=x[/math], our equation is equivalent to:

[[math]] z^N=y^N [[/math]]


Now observe that we can write this equation as follows:

[[math]] \left(\frac{z}{y}\right)^N=1 [[/math]]


We conclude that the solutions [math]z[/math] appear by multiplying [math]y[/math] by the solutions of [math]u^N=1[/math], which are the [math]N[/math]-th roots of unity, as claimed.
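
Once again the computer can keep us honest; here is a Python check that multiplying one [math]N[/math]-th root by the roots of unity produces all the others:

```python
import cmath, math

N = 5
x = cmath.rect(2.0, 1.3)              # x = 2e^{1.3i}
y = cmath.rect(2.0**(1/N), 1.3/N)     # one 5th root, r^{1/N} e^{it/N}

w = cmath.exp(2j*math.pi/N)           # w = e^{2 pi i/N}
print([abs((y * w**k)**N - x) for k in range(N)])   # all ~0: five 5th roots
```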

The roots of unity appear in connection with many other interesting questions, and there are many useful formulae relating them, which are good to know. Here is a basic such formula, very beautiful, to be used many times in what follows:

Theorem

The roots of unity, [math]\{w^k\}[/math] with [math]w=e^{2\pi i/N}[/math], have the property

[[math]] \sum_{k=0}^{N-1}(w^k)^s=N\delta_{N|s} [[/math]]
for any exponent [math]s\in\mathbb N[/math], where on the right we have a Kronecker symbol.


Show Proof

The numbers in the statement, when written more conveniently as [math](w^s)^k[/math] with [math]k=0,\ldots,N-1[/math], form a certain regular polygon [math]P_s[/math] in the plane. Thus, if we denote by [math]C_s[/math] the barycenter of this polygon, we have the following formula:

[[math]] \frac{1}{N}\sum_{k=0}^{N-1}w^{ks}=C_s [[/math]]


Now observe that in the case [math]N\nmid s[/math] our polygon [math]P_s[/math] is non-degenerate, circling around the unit circle, and having center [math]C_s=0[/math]. As for the case [math]N|s[/math], here the polygon is degenerate, lying at 1, and having center [math]C_s=1[/math]. Thus, we have the following formula:

[[math]] C_s=\delta_{N|s} [[/math]]


Thus, we obtain the formula in the statement.
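
This barycenter formula is easily tested as well; here is a short Python loop doing so, at [math]N=6[/math]:

```python
import cmath, math

N = 6
w = cmath.exp(2j*math.pi/N)

for s in range(1, 13):
    total = sum(w**(k*s) for k in range(N))
    print(s, round(abs(total)))       # prints 6 when 6 | s, and 0 otherwise
```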

As an interesting philosophical fact, regarding the roots of unity, and the complex numbers in general, we can now solve the following equation, in a “uniform” way:

[[math]] x_1+\ldots+x_N=0 [[/math]]


With this being not a joke. Frankly, can you find some nice-looking family of real numbers [math]x_1,\ldots,x_N[/math] satisfying [math]x_1+\ldots+x_N=0[/math]? Certainly not. But with complex numbers we have now our answer, the sum of the [math]N[/math]-th roots of unity being zero.


This was for our basic presentation of the complex numbers. We will be back to more theory regarding them, and the roots of unity, later on. Among others, we will see later some non-trivial applications of our above solution to [math]x_1+\ldots+x_N=0[/math].

General references

Banica, Teo (2024). "Calculus and applications". arXiv:2401.00911 [math.CO].

References

  1. R.P. Feynman, R.B. Leighton and M. Sands, The Feynman lectures on physics I: mainly mechanics, radiation and heat, Caltech (1963).
  2. R.P. Feynman, R.B. Leighton and M. Sands, The Feynman lectures on physics II: mainly electromagnetism and matter, Caltech (1964).
  3. R.P. Feynman, R.B. Leighton and M. Sands, The Feynman lectures on physics III: quantum mechanics, Caltech (1966).