<div class="d-none"><math>
\newcommand{\mathds}{\mathbb}</math></div>
{{Alert-warning|This article was automatically generated from a tex file and may contain conversion errors. If permitted, you may login and edit this article to improve the conversion. }}
===1a. Binomials, factorials===
We denote by <math>\mathbb N</math> the set of natural numbers, <math>\mathbb N=\{0,1,2,3,\ldots\}</math>, with <math>\mathbb N</math> standing for “natural”. Quite often in computations we will need negative numbers too, and we denote by <math>\mathbb Z</math> the set of all integers, <math>\mathbb Z=\{\ldots,-2,-1,0,1,2,\ldots\}</math>, with <math>\mathbb Z</math> standing for “Zahlen”, which is German for “numbers”. Finally, there are many questions in mathematics involving fractions, or quotients, which are called rational numbers:
{{defncard|label=|id=|The rational numbers are the quotients of type
<math display="block">
r=\frac{a}{b}
</math>
with <math>a,b\in\mathbb Z</math>, and <math>b\neq0</math>, identified according to the usual rule for quotients, namely:
<math display="block">
\frac{a}{b}=\frac{c}{d}\iff ad=bc
</math>
We denote the set of rational numbers by <math>\mathbb Q</math>, standing for “quotients”.}}
Observe that we have inclusions <math>\mathbb N\subset\mathbb Z\subset\mathbb Q</math>. The integers add and multiply according to the rules that you know well. As for the rational numbers, these add according to the usual rule for quotients, which is as follows, and death penalty for forgetting it:
<math display="block">
\frac{a}{b}+\frac{c}{d}=\frac{ad+bc}{bd}
</math>
Also, the rational numbers multiply according to the usual rule for quotients, namely:
<math display="block">
\frac{a}{b}\cdot\frac{c}{d}=\frac{ac}{bd}
</math>
Beyond rationals, we have the real numbers, whose set is denoted <math>\mathbb R</math>, and which include beasts such as <math>\sqrt{3}=1.73205\ldots</math> or <math>\pi=3.14159\ldots</math> But more on these later. For the moment, let us see what can be done with integers, and their quotients. As a first theorem, solving a problem which often appears in real life, we have:
{{proofcard|Theorem|theorem-1|The number of possibilities of choosing <math>k</math> objects among <math>n</math> objects is
<math display="block">
\binom{n}{k}=\frac{n!}{k!(n-k)!}
</math>
called binomial number, where <math>n!=1\cdot2\cdot3\ldots(n-2)(n-1)n</math>, called “factorial <math>n</math>”.
|Imagine a set consisting of <math>n</math> objects. We have <math>n</math> possibilities for choosing our 1st object, then <math>n-1</math> possibilities for choosing our 2nd object, out of the <math>n-1</math> objects left, and so on up to <math>n-k+1</math> possibilities for choosing our <math>k</math>-th object, out of the <math>n-k+1</math> objects left. Since the possibilities multiply, the total number of choices is:
<math display="block">
\begin{eqnarray*}
N
&=&n(n-1)\ldots(n-k+1)\\
&=&n(n-1)\ldots(n-k+1)\cdot\frac{(n-k)(n-k-1)\ldots2\cdot1}{(n-k)(n-k-1)\ldots 2\cdot1}\\
&=&\frac{n(n-1)\ldots2\cdot 1}{(n-k)(n-k-1)\ldots 2\cdot1}\\
&=&\frac{n!}{(n-k)!}
\end{eqnarray*}
</math>
But is this correct? Normally a mathematical theorem coming with a mathematical proof is guaranteed to be <math>100\%</math> correct, and if in addition the proof is truly clever, like the above proof was, with that fraction trick, the confidence rate jumps up to <math>200\%</math>.
This being said, one never knows, so let us doublecheck, by taking for instance <math>n=3,k=2</math>. Here we have to choose 2 objects among 3 objects, and this is something easily done, because all we have to do is to dismiss one of the objects, with 3 choices here, and keep the 2 objects left. Thus, we have <math>N=3</math> choices. On the other hand our genius math computation gives <math>N=3!/1!=6</math>, which is obviously the wrong answer.
So, where is the mistake? Thinking a bit, the number <math>N</math> that we computed is in fact the number of possibilities of choosing <math>k</math> ordered objects among <math>n</math> objects. Thus, we must divide everything by the number <math>M</math> of orderings of the <math>k</math> objects that we chose:
<math display="block">
\binom{n}{k}=\frac{N}{M}
</math>
In order to compute now the missing number <math>M</math>, imagine a set consisting of <math>k</math> objects. There are <math>k</math> choices for the object to be designated <math>\#1</math>, then <math>k-1</math> choices for the object to be designated <math>\#2</math>, and so on up to 1 choice for the object to be designated <math>\#k</math>. We conclude that we have <math>M=k(k-1)\ldots 2\cdot 1=k!</math>, and so:
<math display="block">
\binom{n}{k}=\frac{n!/(n-k)!}{k!}=\frac{n!}{k!(n-k)!}
</math>
And this is the correct answer, because, well, that is how things are. In case you doubt, at <math>n=3,k=2</math> for instance we obtain <math>3!/2!1!=3</math>, which is correct.}}
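In the spirit of the doublechecking performed in the proof, this formula can also be verified by machine. Here is a minimal Python sketch, with the helper names <math>binomial</math> and <math>binomial\_by\_counting</math> being ours, for illustration only:

```python
from math import factorial
from itertools import combinations

def binomial(n, k):
    # The formula from the theorem: n! / (k! (n-k)!)
    return factorial(n) // (factorial(k) * factorial(n - k))

def binomial_by_counting(n, k):
    # Brute-force count of the k-element subsets of an n-element set
    return sum(1 for _ in combinations(range(n), k))

# The problematic case discussed in the proof: 2 objects among 3
assert binomial(3, 2) == 3
assert binomial_by_counting(3, 2) == 3

# A larger doublecheck, formula vs. direct counting
for n in range(8):
    for k in range(n + 1):
        assert binomial(n, k) == binomial_by_counting(n, k)
```

The brute-force count is hopeless for large <math>n</math>, of course, which is precisely why the formula matters.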
All this is quite interesting, and in addition to some exciting mathematics, on which more in a moment, we have as well some philosophical conclusions. Formulae can be right or wrong, and as the above shows, good-looking, formal mathematical proofs can be right or wrong too. So, what to do? Here is my advice:
'''Advice 1.3.''' ''Always doublecheck what you're doing, regularly, and definitely at the end, either with an alternative proof, or with some numerics.''
This is something very serious. Unless you're doing something very familiar, that you have been used to for at least 5-10 years or so, like doing additions and multiplications for you, or some easy calculus for me, the formulae and proofs that you come upon are by default wrong. In order to make them correct, and ready to use, you must check and doublecheck and correct them, helped by alternative methods, or numerics.
Which brings us to the question of whether mathematics is an exact science or not. Not clear. Chemistry for instance is an exact science, because findings of type “a mixture of water and salt cannot explode” look rock-solid. Same for biology, with findings of type “crocodiles eat fish” being rock-solid too. In what regards mathematics however, and theoretical physics too, things are always prone to human mistake.
And for ending this discussion, you might ask then, what about engineering? After all, this is mathematics and physics, and it is usually <math>100\%</math> correct, because most of the bridges, buildings and other things built by engineers don't collapse. Well, this is because engineers follow, and in a truly maniac way, the above Advice 1.3. You won't declare a project for a bridge, building, engine and so on final and correct, ready for production, until you have checked and doublechecked it with 10 different methods or so, will you?
Back to work now. As an important addition to Theorem 1.2, we have:
'''Convention 1.4.''' ''By definition, <math>0!=1</math>.''
This convention comes, and no surprise here, from Advice 1.3. Indeed, we obviously have <math>\binom{n}{n}=1</math>, but if we want to recover this formula via Theorem 1.2 we are a bit in trouble, and so we must declare that <math>0!=1</math>, so that the following computation works:
<math display="block">
\binom{n}{n}=\frac{n!}{n!0!}=\frac{n!}{n!\times1}=1
</math>
Going ahead now with more mathematics and less philosophy, with Theorem 1.2 complemented by Convention 1.4 being in final form (trust me), we have:
{{proofcard|Theorem|theorem-2|We have the binomial formula
<math display="block">
(a+b)^n=\sum_{k=0}^n\binom{n}{k}a^kb^{n-k}
</math>
valid for any two numbers <math>a,b\in\mathbb Q</math>.
|We have to compute the following quantity, with <math>n</math> terms in the product:
<math display="block">
(a+b)^n=(a+b)(a+b)\ldots(a+b)
</math>
When expanding, we obtain a certain sum of products of <math>a,b</math> variables, with each such product being a quantity of type <math>a^kb^{n-k}</math>. Thus, we have a formula as follows:
<math display="block">
(a+b)^n=\sum_{k=0}^nC_ka^kb^{n-k}
</math>
In order to finish, it remains to compute the coefficients <math>C_k</math>. But, according to our product formula, <math>C_k</math> is the number of choices for the <math>k</math> needed <math>a</math> variables among the <math>n</math> available <math>a</math> variables. Thus, according to Theorem 1.2, we have:
<math display="block">
C_k=\binom{n}{k}
</math>
We are therefore led to the formula in the statement.}}
Theorem 1.5 is something quite interesting, so let us doublecheck it with some numerics. At small values of <math>n</math> we obtain the following formulae, which are all correct:
<math display="block">
(a+b)^0=1
</math>
<math display="block">
(a+b)^1=a+b
</math>
<math display="block">
(a+b)^2=a^2+2ab+b^2
</math>
<math display="block">
(a+b)^3=a^3+3a^2b+3ab^2+b^3
</math>
<math display="block">
(a+b)^4=a^4+4a^3b+6a^2b^2+4ab^3+b^4
</math>
<math display="block">
(a+b)^5=a^5+5a^4b+10a^3b^2+10a^2b^3+5ab^4+b^5
</math>
<math display="block">
\vdots
</math>
Now observe that in these formulae, say for memorization purposes, the powers of the <math>a,b</math> variables are something very simple, that can be recovered right away. What matters are the coefficients, which are the binomial coefficients <math>\binom{n}{k}</math>, which form a triangle. So, it is enough to memorize this triangle, and this can be done by using:
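And, per Advice 1.3, these expansions can be doublechecked by machine as well. A small Python sketch, with the helper name <math>binomial\_expansion</math> being ours:

```python
from math import comb

def binomial_expansion(a, b, n):
    # Right-hand side of the binomial formula: sum_k C(n,k) a^k b^(n-k)
    return sum(comb(n, k) * a**k * b**(n - k) for k in range(n + 1))

# Doublecheck (a+b)^n against the expansion, for small n and sample values
for n in range(6):
    for a, b in [(1, 1), (2, 3), (-1, 4), (5, -2)]:
        assert (a + b)**n == binomial_expansion(a, b, n)
```

At <math>a=b=1</math> this recovers in particular the formula <math>\sum_k\binom{n}{k}=2^n</math>.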
{{proofcard|Theorem|theorem-3|The Pascal triangle, formed by the binomial coefficients <math>\binom{n}{k}</math>,
<math display="block">
1
</math>
<math display="block">
1\ \ ,\ \ 1
</math>
<math display="block">
1\ \ ,\ \ 2\ \ ,\ \ 1
</math>
<math display="block">
1\ \ ,\ \ 3\ \ ,\ \ 3\ \ ,\ \ 1
</math>
<math display="block">
1\ \ ,\ \ 4\ \ ,\ \ 6\ \ ,\ \ 4\ \ ,\ \ 1
</math>
<math display="block">
1\ \ ,\ \ 5\ \ ,\ \ 10\ \ ,\ \ 10\ \ ,\ \ 5\ \ ,\ \ 1
</math>
<math display="block">
\vdots
</math>
has the property that each entry is the sum of the two entries above it.
|In practice, the theorem states that the following formula holds:
<math display="block">
\binom{n}{k}=\binom{n-1}{k-1}+\binom{n-1}{k}
</math>
There are many ways of proving this formula, all instructive, as follows:
(1) Brute-force computation. We have indeed, as desired:
<math display="block">
\begin{eqnarray*}
\binom{n-1}{k-1}+\binom{n-1}{k}
&=&\frac{(n-1)!}{(k-1)!(n-k)!}+\frac{(n-1)!}{k!(n-k-1)!}\\
&=&\frac{(n-1)!}{(k-1)!(n-k-1)!}\left(\frac{1}{n-k}+\frac{1}{k}\right)\\
&=&\frac{(n-1)!}{(k-1)!(n-k-1)!}\cdot\frac{n}{k(n-k)}\\
&=&\binom{n}{k}
\end{eqnarray*}
</math>
(2) Algebraic proof. We have the following formula, to start with:
<math display="block">
(a+b)^n=(a+b)^{n-1}(a+b)
</math>
By using the binomial formula, this formula becomes:
<math display="block">
\sum_{k=0}^n\binom{n}{k}a^kb^{n-k}=\left[\sum_{r=0}^{n-1}\binom{n-1}{r}a^rb^{n-1-r}\right](a+b)
</math>
Now let us perform the multiplication on the right. We obtain a certain sum of terms of type <math>a^kb^{n-k}</math>, and to be more precise, each such <math>a^kb^{n-k}</math> term can either come from the <math>\binom{n-1}{k-1}</math> terms <math>a^{k-1}b^{n-k}</math> multiplied by <math>a</math>, or from the <math>\binom{n-1}{k}</math> terms <math>a^kb^{n-1-k}</math> multiplied by <math>b</math>. Thus, the coefficient of <math>a^kb^{n-k}</math> on the right is <math>\binom{n-1}{k-1}+\binom{n-1}{k}</math>, as desired.
(3) Combinatorics. Let us count <math>k</math> objects among <math>n</math> objects, with one of the <math>n</math> objects having a hat on top. Obviously, the hat has nothing to do with the count, and we obtain <math>\binom{n}{k}</math>. On the other hand, we can say that there are two possibilities. Either the object with hat is counted, and we have <math>\binom{n-1}{k-1}</math> possibilities here, or the object with hat is not counted, and we have <math>\binom{n-1}{k}</math> possibilities here. Thus <math>\binom{n}{k}=\binom{n-1}{k-1}+\binom{n-1}{k}</math>, as desired.}}
There are many more things that can be said about binomial coefficients, with all sorts of interesting formulae, but the idea is always the same, namely that in order to find such formulae you have a choice between algebra and combinatorics, and that when it comes to proofs, the brute-force computation method is useful too. In practice, the best is to master all 3 techniques. Among others, because of Advice 1.3. You will have in this way 3 different methods, for making sure that your formulae are correct indeed.
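As one more doublecheck, in the spirit of Advice 1.3, the recurrence can be used to generate the triangle by machine, and then compared with the binomial coefficients themselves. A small Python sketch, with the helper name <math>pascal\_row</math> being ours:

```python
from math import comb

def pascal_row(n):
    # Build row n from row n-1, using C(n,k) = C(n-1,k-1) + C(n-1,k):
    # each interior entry is the sum of the two entries above it
    row = [1]
    for _ in range(n):
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]
    return row

# The row of the triangle displayed last in the theorem
assert pascal_row(5) == [1, 5, 10, 10, 5, 1]

# Doublecheck the recurrence against the binomial coefficients
for n in range(10):
    assert pascal_row(n) == [comb(n, k) for k in range(n + 1)]
```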
===1b. Real numbers, analysis===
All the above was very nice, but remember that we are here for doing science and physics, and more specifically for mathematically understanding the numeric variables <math>x,y,z,\ldots</math> coming from real life. Such variables can be lengths, volumes, pressures and so on, which vary continuously with time, and common sense dictates that there is little to no chance for our variables to be rational, <math>x,y,z,\ldots\notin\mathbb Q</math>. In fact, we will even see soon a theorem, stating that the probability for such a variable to be rational is exactly 0. Or, to put it in a dramatic way, “rational numbers don't exist in real life”.
You are certainly familiar with the real numbers, but let us review now their definition, which is something quite tricky. As a first goal, we would like to construct a number <math>x=\sqrt{2}</math> having the property <math>x^2=2</math>. But how to do this? Let us start with:
{{proofcard|Proposition|proposition-1|There is no number <math>r\in\mathbb Q_+</math> satisfying <math>r^2=2</math>. In fact, we have
<math display="block">
\mathbb Q_+=\left\{p\in\mathbb Q_+\Big|p^2 < 2\right\}\bigsqcup\left\{q\in\mathbb Q_+\Big|q^2 > 2\right\}
</math>
with this being a disjoint union.
|In what regards the first assertion, assuming that <math>r=a/b</math> with <math>a,b\in\mathbb N</math> prime to each other satisfies <math>r^2=2</math>, we have <math>a^2=2b^2</math>, so <math>a\in2\mathbb N</math>, say <math>a=2c</math>. But then <math>a^2=2b^2</math> reads <math>4c^2=2b^2</math>, so <math>b^2=2c^2</math>, and we obtain <math>b\in2\mathbb N</math> too, contradicting the fact that <math>a,b</math> are prime to each other. As for the second assertion, this follows from the first one.}}
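As a numeric illustration, a brute-force search confirms that no fraction with small numerator and denominator squares to 2, with every candidate falling into one of the two halves of the disjoint union. A Python sketch, using exact rational arithmetic, and with the search bound 200 chosen arbitrarily:

```python
from fractions import Fraction

# Search all fractions a/b with 1 <= a, b <= 200 for a solution of r^2 = 2,
# and record on which side of 2 each square falls
below = above = exact = 0
for a in range(1, 201):
    for b in range(1, 201):
        r2 = Fraction(a, b) ** 2
        if r2 < 2:
            below += 1
        elif r2 > 2:
            above += 1
        else:
            exact += 1

assert exact == 0                 # no rational square root of 2 in this range
assert below > 0 and above > 0    # both halves of the disjoint union are inhabited
```

Of course this checks only finitely many fractions, so it is an illustration of the proposition, not a proof of it.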
It looks like we are a bit stuck. We can't really tell who <math>\sqrt{2}</math> is, and the only piece of information about <math>\sqrt{2}</math> that we have comes from the knowledge of the rational numbers satisfying <math>p^2 < 2</math> or <math>q^2 > 2</math>. To be more precise, the picture that emerges is:
'''Conclusion 1.8.''' ''The number <math>\sqrt{2}</math> is the abstract beast which is bigger than all positive rationals satisfying <math>p^2 < 2</math>, and smaller than all positive rationals satisfying <math>q^2 > 2</math>.''
This does not look very good, but you know what, instead of looking for more clever solutions to our problem, what about relaxing, or being lazy, or cowardly, or you name it, and taking Conclusion 1.8 as a definition for <math>\sqrt{2}</math>. This is actually something not that bad, and leads to the following “lazy” definition for the real numbers:
{{defncard|label=|id=|The real numbers <math>x\in\mathbb R</math> are formal cuts in the set of rationals,
<math display="block">
\mathbb Q=\mathbb Q_{\leq x}\sqcup\mathbb Q_{ > x}
</math>
with such a cut being by definition subject to the following condition:
<math display="block">
p\in\mathbb Q_{\leq x}\ ,\ q\in \mathbb Q_{ > x}\implies p < q
</math>
These numbers add and multiply by adding and multiplying the corresponding cuts.}}
This might look quite original, but believe me, there is some genius behind this definition. As a first observation, we have an inclusion <math>\mathbb Q\subset\mathbb R</math>, obtained by identifying each rational number <math>r\in\mathbb Q</math> with the obvious cut that it produces, namely:
<math display="block">
\mathbb Q_{\leq r}=\left\{p\in\mathbb Q\Big|p\leq r\right\}
\quad,\quad
\mathbb Q_{ > r}=\left\{q\in\mathbb Q\Big|q > r\right\}
</math>
As a second observation, the addition and multiplication of real numbers, obtained by adding and multiplying the corresponding cuts, in the obvious way, is something very simple. To be more precise, in what regards the addition, the formula is as follows:
<math display="block">
\mathbb Q_{\leq x+y}=\mathbb Q_{\leq x}+\mathbb Q_{\leq y}
</math>
As for the multiplication, the formula here is similar, namely <math>\mathbb Q_{\leq xy}=\mathbb Q_{\leq x}\mathbb Q_{\leq y}</math>, up to some mess with positives and negatives, which is quite easy to untangle, and with this being a good exercise. We can also talk about order between real numbers, as follows:
<math display="block">
x\leq y\iff\mathbb Q_{\leq x}\subset\mathbb Q_{\leq y}
</math>
But let us perhaps leave more abstractions for later, and go back to more concrete things. As a first success of our theory, we can formulate the following theorem:
{{proofcard|Theorem|theorem-4|The equation <math>x^2=2</math> has two solutions over the real numbers, namely the positive solution, denoted <math>\sqrt{2}</math>, and its negative counterpart, which is <math>-\sqrt{2}</math>.
|By using <math>x\to-x</math>, it is enough to prove that <math>x^2=2</math> has exactly one positive solution <math>\sqrt{2}</math>. But this is clear, because <math>\sqrt{2}</math> can only come from the following cut:
<math display="block">
\mathbb Q_{\leq\sqrt{2}}=\mathbb Q_-\bigsqcup\left\{p\in\mathbb Q_+\Big|p^2\leq 2\right\}\quad,\quad \mathbb Q_{ > \sqrt{2}}=\left\{q\in\mathbb Q_+\Big|q^2 > 2\right\}
</math>
Thus, we are led to the conclusion in the statement.}}
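As a numeric companion to this theorem, the cut picture suggests a concrete way of locating <math>\sqrt{2}</math>: test rational midpoints against the condition <math>p^2\leq2</math>, and bisect. A Python sketch, using exact rational arithmetic, with the helper name <math>sqrt2\_approx</math> being ours:

```python
from fractions import Fraction

def sqrt2_approx(steps):
    # Maintain rationals lo < sqrt(2) < hi, and repeatedly test on which
    # side of the cut the midpoint falls, exactly as the cut picture suggests
    lo, hi = Fraction(1), Fraction(2)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
    return lo, hi

lo, hi = sqrt2_approx(30)
assert lo * lo <= 2 <= hi * hi            # the cut is bracketed at every step
assert hi - lo < Fraction(1, 10**6)       # pinned down to within 10^-6
```

After 30 steps the interval has length <math>2^{-30}</math>, consistent with <math>\sqrt{2}=1.41421\ldots</math>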
More generally, the same method works in order to extract the square root <math>\sqrt{r}</math> of any number <math>r\in\mathbb Q_+</math>, or even of any number <math>r\in\mathbb R_+</math>, and we have the following result:
{{proofcard|Theorem|theorem-5|The solutions of <math>ax^2+bx+c=0</math> with <math>a,b,c\in\mathbb R</math>, <math>a\neq0</math>, are
<math display="block">
x_{1,2}=\frac{-b\pm\sqrt{b^2-4ac}}{2a}
</math>
provided that <math>b^2-4ac\geq0</math>. In the case <math>b^2-4ac < 0</math>, there are no solutions.
|We can write our equation in the following way:
<math display="block">
\begin{eqnarray*}
ax^2+bx+c=0
&\iff&x^2+\frac{b}{a}x+\frac{c}{a}=0\\
&\iff&\left(x+\frac{b}{2a}\right)^2-\frac{b^2}{4a^2}+\frac{c}{a}=0\\
&\iff&\left(x+\frac{b}{2a}\right)^2=\frac{b^2-4ac}{4a^2}\\
&\iff&x+\frac{b}{2a}=\pm\frac{\sqrt{b^2-4ac}}{2a}
\end{eqnarray*}
</math>
Thus, we are led to the conclusion in the statement.}}
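Per Advice 1.3, let us doublecheck this formula numerically as well. A Python sketch, with the helper name <math>solve\_quadratic</math> being ours:

```python
from math import sqrt, isclose

def solve_quadratic(a, b, c):
    # Real roots of ax^2 + bx + c = 0, a != 0, via the formula in the theorem
    disc = b * b - 4 * a * c
    if disc < 0:
        return []                     # no real solutions
    return [(-b + sqrt(disc)) / (2 * a), (-b - sqrt(disc)) / (2 * a)]

# Doublecheck on x^2 - 3x + 2 = (x - 1)(x - 2)
assert sorted(solve_quadratic(1, -3, 2)) == [1.0, 2.0]

# Each claimed root should indeed satisfy the equation
for x in solve_quadratic(2, 1, -6):
    assert isclose(2 * x * x + x - 6, 0, abs_tol=1e-9)

assert solve_quadratic(1, 0, 1) == []  # the case b^2 - 4ac < 0
```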
Summarizing, we have a nice definition for the real numbers, that we can certainly do some math with. However, for anything more advanced we are in need of the decimal writing for the real numbers. The result here is as follows:
{{proofcard|Theorem|theorem-6|The real numbers <math>x\in\mathbb R</math> can be written in decimal form,
<math display="block">
x=\pm a_1\ldots a_n.b_1b_2b_3\ldots
</math>
with <math>a_i,b_i\in\{0,1,\ldots,9\}</math>, with the convention <math>\ldots b999\ldots=\ldots(b+1)000\ldots</math>
|This is something quite non-trivial, even assuming that you already have some familiarity with such things, for the rational numbers. The idea is as follows:
(1) First of all, our precise claim is that any <math>x\in\mathbb R</math> can be written in the form in the statement, with the integer <math>\pm a_1\ldots a_n</math> and then each of the digits <math>b_1,b_2,b_3,\ldots</math> providing the best approximation of <math>x</math>, at that stage of the approximation.
(2) Moreover, we have a second claim as well, namely that any expression of type <math>x=\pm a_1\ldots a_n.b_1b_2b_3\ldots</math> corresponds to a real number <math>x\in\mathbb R</math>, and that with the convention <math>\ldots b999\ldots=\ldots(b+1)000\ldots\,</math>, the correspondence is bijective.
(3) In order to prove now these two assertions, our first claim is that we can restrict the attention to the case <math>x\in[0,1)</math>, with this meaning of course <math>0\leq x < 1</math>, with respect to the order relation for the reals discussed in the above.
(4) Getting started now, let <math>x\in\mathbb R</math>, coming from a cut <math>\mathbb Q=\mathbb Q_{\leq x}\sqcup\mathbb Q_{ > x}</math>. Since the set <math>\mathbb Q_{\leq x}\cap\mathbb Z</math> consists of integers, and is bounded from above by any element <math>q\in\mathbb Q_{ > x}</math> of your choice, this set has a maximal element, that we denote <math>[x]</math>:
<math display="block">
[x]=\max\left(\mathbb Q_{\leq x}\cap\mathbb Z\right)
</math>
It follows from definitions that <math>[x]</math> has the usual properties of the integer part, namely:
<math display="block">
[x]\leq x < [x]+1
</math>
Thus we have <math>x=[x]+y</math> with <math>[x]\in\mathbb Z</math> and <math>y\in[0,1)</math>, and getting back now to what we want to prove, namely (1) and (2) above, it is clear that it is enough to prove these assertions for the remainder <math>y\in[0,1)</math>. Thus, we have proved (3), and we can assume <math>x\in[0,1)</math>.
(5) So, assume <math>x\in[0,1)</math>. We are first looking for a best approximation from below of type <math>0.b_1</math>, with <math>b_1\in\{0,\ldots,9\}</math>, and it is clear that such an approximation exists, simply by comparing <math>x</math> with the numbers <math>0.0,0.1,\ldots,0.9</math>. Thus, we have our first digit <math>b_1</math>, and then we can construct the second digit <math>b_2</math> as well, by comparing <math>x</math> with the numbers <math>0.b_10,0.b_11,\ldots,0.b_19</math>. And so on, which finishes the proof of our claim (1).
(6) In order to prove now the remaining claim (2), let us restrict again the attention, as explained in (4), to the case <math>x\in[0,1)</math>. First, it is clear that any expression of type <math>x=0.b_1b_2b_3\ldots</math> defines a real number <math>x\in[0,1]</math>, simply by declaring that the corresponding cut <math>\mathbb Q=\mathbb Q_{\leq x}\sqcup\mathbb Q_{ > x}</math> comes from the following set, and its complement:
<math display="block">
\mathbb Q_{\leq x}=\bigcup_{n\geq1}\left\{p\in\mathbb Q\Big|p\leq 0.b_1\ldots b_n\right\}
</math>
(7) Thus, we have our correspondence between real numbers as cuts, and real numbers as decimal expressions, and we are left with the question of investigating the bijectivity of this correspondence. But here, the only bug that happens is that expressions of type <math>x=\ldots b999\ldots</math>, which produce reals <math>x\in\mathbb R</math> via (6), do not come from reals <math>x\in\mathbb R</math> via (5). So, in order to finish our proof, we must investigate such expressions.
(8) So, consider an expression of type <math>\ldots b999\ldots</math> By going back to the construction in (6), we are led to the conclusion that we have the following equality:
<math display="block">
\mathbb Q_{\leq\ldots b999\ldots}=\mathbb Q_{\leq\ldots (b+1)000\ldots}
</math>
Thus, at the level of the real numbers defined as cuts, we have:
<math display="block">
\ldots b999\ldots=\ldots(b+1)000\ldots
</math>
But this solves our problem, because with the identification <math>\ldots b999\ldots=\ldots(b+1)000\ldots</math> made, the bijectivity issue of our correspondence is fixed, and we are done.}}
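The digit-by-digit construction in step (5) can be illustrated by machine, using exact rational arithmetic. A Python sketch, with the helper name <math>decimal\_digits</math> being ours, and with <math>x\in[0,1)</math>, as in the proof:

```python
from fractions import Fraction

def decimal_digits(x, n):
    # Greedy digit extraction, as in step (5) of the proof: at each stage
    # the next digit is the integer part of 10x, which is the largest digit
    # whose corresponding truncation stays below x
    digits = []
    for _ in range(n):
        x *= 10
        d = int(x)      # truncation toward zero, as needed for x in [0,1)
        digits.append(d)
        x -= d
    return digits

# 3/8 = 0.375000..., and 1/3 = 0.333...
assert decimal_digits(Fraction(3, 8), 6) == [3, 7, 5, 0, 0, 0]
assert decimal_digits(Fraction(1, 3), 5) == [3, 3, 3, 3, 3]
```

Note that, as observed in step (7), this greedy procedure never produces a tail of 9s, which is exactly where the convention <math>\ldots b999\ldots=\ldots(b+1)000\ldots</math> is needed.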
The above theorem was of course quite difficult, but this is how things are. You might ask why bother with cuts, instead of taking <math>x=\pm a_1\ldots a_n.b_1b_2b_3\ldots</math> as the definition of the real numbers. Well, this is certainly possible, but when it comes to summing such numbers, or making products, or proving basic things such as the existence of <math>\sqrt{2}</math>, things become fairly complicated in the decimal writing picture. So, all the above is not as stupid as it seems. And we will come back anyway to all this later, with a 3rd picture for the real numbers, involving scary things like <math>\varepsilon</math> and <math>\delta</math>, and it will be up to you to decide, at that time, which picture is the one that you prefer.
Moving on, we made the claim at the beginning of this chapter that “in real life, real numbers are never rational”. Here is a theorem, justifying this claim:
{{proofcard|Theorem|theorem-7|The probability for a real number <math>x\in\mathbb R</math> to be rational is <math>0</math>.
|This is something quite tricky, the idea being as follows:
(1) Before starting, let us point out the fact that probability theory is something quite tricky, with probability 0 not necessarily meaning that the event cannot happen, but rather meaning that “better not count on that”. For instance, according to my computations the probability of you winning <math>1</math> billion at the lottery is 0, but you are of course free to disagree, and prove me wrong, by playing every day at the lottery.
(2) With this discussion made, and extrapolating now from finance and lottery to our question regarding real numbers, your possible argument of type “yes, but if I pick <math>x\in\mathbb R</math> to be <math>x=3/2</math>, I have proof that the probability for <math>x\in\mathbb Q</math> is nonzero” is therefore dismissed. Thus, our claim as stated makes sense, so let us try now to prove it.
(3) By translation, it is enough to prove that the probability for a real number <math>x\in[0,1]</math> to be rational is 0. For this purpose, let us write the rational numbers <math>r\in[0,1]</math> in the form of a sequence <math>r_1,r_2,r_3,\ldots\,</math>, with this being possible say by ordering our rationals <math>r=a/b</math> according to the lexicographic order on the pairs <math>(a,b)</math>:
<math display="block">
\mathbb Q\cap[0,1]=\big\{r_1,r_2,r_3,\ldots\big\}
</math>
Let us also pick a number <math>c > 0</math>. The probability of having <math>x=r_1</math> is certainly smaller than <math>c/2</math>, the probability of having <math>x=r_2</math> is certainly smaller than <math>c/4</math>, the probability of having <math>x=r_3</math> is certainly smaller than <math>c/8</math>, and so on. Thus, the probability for <math>x</math> to be rational satisfies the following inequality:
<math display="block">
\begin{eqnarray*}
P
&\leq&\frac{c}{2}+\frac{c}{4}+\frac{c}{8}+\ldots\\
&=&c\left(\frac{1}{2}+\frac{1}{4}+\frac{1}{8}+\ldots\right)\\
&=&c
\end{eqnarray*}
</math>
Here we have used the well-known formula <math>\frac{1}{2}+\frac{1}{4}+\frac{1}{8}+\ldots=1</math>, which comes by dividing <math>[0,1]</math> into half, and then one of the halves into half again, and so on, and then saying in the end that the pieces that we have must sum up to 1. Thus, we have indeed <math>P\leq c</math>, and since the number <math>c > 0</math> was arbitrary, we obtain <math>P=0</math>, as desired.}}
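As a numeric doublecheck of the geometric series formula used at the end, the partial sums of <math>\frac{1}{2}+\frac{1}{4}+\frac{1}{8}+\ldots</math> can be computed exactly:

```python
from fractions import Fraction

# Exact partial sum of 1/2 + 1/4 + ... + 1/2^30
s = Fraction(0)
for k in range(1, 31):
    s += Fraction(1, 2**k)

# The partial sum falls short of 1 by exactly 2^-30, consistent with
# the halving picture: what is missing is the last undivided piece
assert s == 1 - Fraction(1, 2**30)
assert 1 - s < Fraction(1, 10**6)
```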
As a comment here, all the above is of course quite tricky, and a bit borderline with respect to what can be called “rigorous mathematics”. But we will be back to all this, namely general probability theory, and in particular the meaning of the mysterious formula <math>P=0</math>, countable sets, infinite sums and so on, on several occasions, throughout this book.
Moving ahead now, let us construct some more real numbers. We already know about <math>\sqrt{2}</math> and other numbers of the same type, namely roots of polynomials, and our knowledge here being quite decent, there is no hurry with this, and we will be back to it later. So, let us get now into <math>\pi</math> and trigonometry. To start with, we have the following result:
{{proofcard|Theorem|theorem-8|The following two definitions of <math>\pi</math> are equivalent:
<ul><li> The length of the unit circle is <math>L=2\pi</math>.
</li>
<li> The area of the unit disk is <math>A=\pi</math>.
</li>
</ul>
|In order to prove this theorem let us cut the unit disk as a pizza, into <math>N</math> slices, and forgetting about gastronomy, leave aside the rounded parts:
<math display="block">
\xymatrix@R=23pt@C=8pt{
&\circ\ar@{-}[rr]\ar@{-}[dl]\ar@{-}[dr]&&\circ\ar@{-}[dl]\ar@{-}[dr]\\
\circ\ar@{-}[rr]&&\circ\ar@{-}[rr]&&\circ\\
&\circ\ar@{-}[rr]\ar@{-}[ul]\ar@{-}[ur]&&\circ\ar@{-}[ul]\ar@{-}[ur]
}
</math>
The area to be eaten can be then computed as follows, where <math>H</math> is the height of the slices, <math>S</math> is the length of their sides, and <math>P=NS</math> is the total length of the sides:
<math display="block">
\begin{eqnarray*}
A
&=&N\times \frac{HS}{2}\\
&=&\frac{HP}{2}\\
&\simeq&\frac{1\times L}{2}
\end{eqnarray*}
</math>
Thus, with <math>N\to\infty</math> we obtain that we have <math>A=L/2</math>, as desired.}}
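In the spirit of Advice 1.3, the pizza computation can be doublechecked numerically. We cheat a bit by borrowing the library's sine and <math>\pi</math>, and the helper name <math>pizza\_estimate</math> is ours:

```python
from math import sin, pi

def pizza_estimate(N):
    # Cut the unit disk into N slices and leave aside the rounded parts:
    # each slice is a triangle with two sides of length 1 and apex angle
    # 2*pi/N, hence of area (1/2) sin(2*pi/N)
    A = N * 0.5 * sin(2 * pi / N)    # area of the inscribed N-gon
    L = 2 * N * sin(pi / N)          # perimeter of the inscribed N-gon
    return A, L

A, L = pizza_estimate(10**6)
assert abs(A - L / 2) < 1e-6     # the relation A = L/2, up to the discarded crust
assert abs(A - pi) < 1e-6        # and both recover pi, in the limit
```

At <math>N=6</math> the same helper gives an area of about <math>2.598</math>, confirming that the hexagonal pizza picture only shows <math>\pi > 3</math> via the perimeter, not by much.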
In what regards now the precise value of <math>\pi</math>, the above picture at <math>N=6</math> shows that we have <math>\pi > 3</math>, but not by much. The precise figure is <math>\pi=3.14159\ldots\,</math>, but we will come back to this later, once we have appropriate tools for dealing with such questions. It is also possible to prove that <math>\pi</math> is irrational, <math>\pi\notin\mathbb Q</math>, but this is not trivial either.
Let us end this discussion about real numbers with some trigonometry. There are many things that can be said, that you certainly know, the basics being as follows:
{{proofcard|Theorem|theorem-9|The following happen:
<ul><li> We can talk about angles <math>x\in\mathbb R</math>, by using the unit circle, in the usual way, and in this correspondence, the right angle has a value of <math>\pi/2</math>.
</li>
<li> Associated to any <math>x\in\mathbb R</math> are numbers <math>\sin x,\cos x\in\mathbb R</math>, constructed in the usual way, by using a triangle. These numbers satisfy <math>\sin^2x+\cos^2x=1</math>.
</li>
</ul>
|These are certainly things that you know, the idea being as follows:
(1) The formula <math>L=2\pi</math> from Theorem 1.14 shows that the length of a quarter of the unit circle is <math>l=\pi/2</math>, and so the right angle has indeed this value, <math>\pi/2</math>.
(2) As for <math>\sin^2x+\cos^2x=1</math>, which is Pythagoras' theorem, this comes from the following picture, consisting of two squares and four identical triangles, as indicated:
<math display="block">
\xymatrix@R=13pt@C=13pt{
\circ\ar@{-}[r]\ar@{-}[dd]&\circ\ar@{-}[rr]\ar@{-}[drr]&&\circ\ar@{-}[d]\\
&&&\circ\ar@{-}[dd]^{\sin x}&\\
\circ\ar@{-}[d]\ar@{-}[uur]\ar@{-}[drr]&&&&\\
\circ\ar@{-}[rr]&&\circ\ar@{-}[r]_{\cos x}\ar@{-}[uur]^1&\circ
}
</math>
Indeed, when computing the area of the outer square, we obtain:
<math display="block">
(\sin x+\cos x)^2=1+4\times\frac{\sin x\cos x}{2}
</math>
Now when expanding we obtain <math>\sin^2x+\cos^2x=1</math>, as claimed.}}
It is possible to say many more things about angles and <math>\sin x</math>, <math>\cos x</math>, and also to talk about some supplementary quantities, such as <math>\tan x=\sin x/\cos x</math>. But more on this later, once we have some appropriate tools, beyond basic geometry, in order to discuss all this.
===1c. Sequences, convergence=== | |||
We already met, on several occasions, infinite sequences or sums, and their limits. Time now to clarify all this. Let us start with the following definition: | |||
{{defncard|label=|id=|We say that a sequence <math>\{x_n\}_{n\in\mathbb N}\subset\mathbb R</math> converges to <math>x\in\mathbb R</math> when: | |||
<math display="block"> | |||
\forall\varepsilon > 0,\exists N\in\mathbb N,\forall n\geq N,|x_n-x| < \varepsilon | |||
</math> | |||
In this case, we write <math>\lim_{n\to\infty}x_n=x</math>, or simply <math>x_n\to x</math>.}} | |||
This might look quite scary, at first glance, but when thinking a bit, there is nothing scary about it. Indeed, let us see how we should translate <math>x_n\to x</math> into mathematical language. The condition <math>x_n\to x</math> tells us that “when <math>n</math> is big, <math>x_n</math> is close to <math>x</math>”, and to be more precise, it tells us that “when <math>n</math> is big enough, <math>x_n</math> gets arbitrarily close to <math>x</math>”. But <math>n</math> big enough means <math>n\geq N</math>, for some <math>N\in\mathbb N</math>, and <math>x_n</math> arbitrarily close to <math>x</math> means <math>|x_n-x| < \varepsilon</math>, for any <math>\varepsilon > 0</math>. Thus, we are led to the above definition.
As a basic example for all this, we have: | |||
{{proofcard|Proposition|proposition-2|We have <math>1/n\to0</math>. | |||
|This is obvious, but let us prove it by using Definition 1.16. We have: | |||
<math display="block"> | |||
\left|\frac{1}{n}-0\right| < \varepsilon | |||
\iff\frac{1}{n} < \varepsilon | |||
\iff\frac{1}{\varepsilon} < n | |||
</math> | |||
Thus we can take <math>N=[1/\varepsilon]+1</math> in Definition 1.16, and we are done.}} | |||
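As a quick numerical doublecheck of this, here is a short Python sketch (the name <code>threshold</code> and the chosen <math>\varepsilon</math> are ours, purely illustrative), which computes the <math>N=[1/\varepsilon]+1</math> from the proof, and verifies the definition on a sample of indices <math>n\geq N</math>:

```python
import math

# N = [1/eps] + 1, as in the proof of Proposition 1.17
def threshold(eps: float) -> int:
    return math.floor(1 / eps) + 1

eps = 2 ** -10          # an exact binary value, to avoid rounding issues
N = threshold(eps)
# check |1/n - 0| < eps for a sample of indices n >= N
assert all(abs(1 / n - 0) < eps for n in range(N, N + 1000))
print(N)  # -> 1025
```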
There are many other examples, and more on this in a moment. Going ahead with more theory, let us complement Definition 1.16 with: | |||
{{defncard|label=|id=|We write <math>x_n\to\infty</math> when the following condition is satisfied: | |||
<math display="block"> | |||
\forall K > 0,\exists N\in\mathbb N,\forall n\geq N,x_n > K | |||
</math> | |||
Similarly, we write <math>x_n\to-\infty</math> when the same happens, with <math>x_n < -K</math> at the end.}} | |||
Again, this is something very intuitive, coming from the fact that <math>x_n\to\infty</math> can only mean that <math>x_n</math> is arbitrarily big, for <math>n</math> big enough. As a basic illustration, we have: | |||
{{proofcard|Proposition|proposition-3|We have <math>n^2\to\infty</math>. | |||
|As before, this is obvious, but let us prove it using Definition 1.18. We have: | |||
<math display="block"> | |||
n^2 > K\iff n > \sqrt{K} | |||
</math> | |||
Thus we can take <math>N=[\sqrt{K}]+1</math> in Definition 1.18, and we are done.}} | |||
We can unify and generalize Proposition 1.17 and Proposition 1.19, as follows: | |||
{{proofcard|Proposition|proposition-4|We have the following convergence, with <math>n\to\infty</math>: | |||
<math display="block"> | |||
n^a\to\begin{cases} | |||
0&(a < 0)\\ | |||
1&(a=0)\\ | |||
\infty&(a > 0) | |||
\end{cases} | |||
</math> | |||
|This follows indeed by using the same method as in the proof of Proposition 1.17 and Proposition 1.19, first for <math>a</math> rational, and then for <math>a</math> real as well.}} | |||
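The three regimes in this statement can also be seen numerically, by evaluating <math>n^a</math> at some big <math>n</math> (a Python sketch, with the sample values being ours):

```python
n = 10 ** 8

# a < 0: goes to 0; a = 0: constantly 1; a > 0: goes to infinity
assert n ** (-0.5) < 1e-3       # already very small
assert n ** 0.0 == 1.0
assert n ** 0.5 > 1e3           # already very big, and growing
```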
We have some general results about limits, summarized as follows: | |||
{{proofcard|Theorem|theorem-10|The following happen: | |||
<ul><li> The limit <math>\lim_{n\to\infty}x_n</math>, if it exists, is unique. | |||
</li> | |||
<li> If <math>x_n\to x</math>, with <math>x\in(-\infty,\infty)</math>, then <math>x_n</math> is bounded. | |||
</li> | |||
<li> If <math>x_n</math> is increasing or decreasing, then it converges, possibly to <math>\pm\infty</math>.
</li> | |||
<li> Assuming <math>x_n\to x</math>, any subsequence of <math>x_n</math> converges to <math>x</math>. | |||
</li> | |||
</ul> | |||
|All this is elementary, coming from definitions: | |||
(1) Assuming <math>x_n\to x</math>, <math>x_n\to y</math> we have indeed, for any <math>\varepsilon > 0</math>, for <math>n</math> big enough: | |||
<math display="block"> | |||
|x-y|\leq|x-x_n|+|x_n-y| < 2\varepsilon | |||
</math> | |||
(2) Assuming <math>x_n\to x</math>, we have <math>|x_n-x| < 1</math> for <math>n\geq N</math>, and so, for any <math>k\in\mathbb N</math>: | |||
<math display="block"> | |||
|x_k| < 1+|x|+\sup\left(|x_1|,\ldots,|x_{N-1}|\right)
</math> | |||
(3) By using <math>x\to-x</math>, it is enough to prove the result for increasing sequences. But here we can construct the limit <math>x\in(-\infty,\infty]</math> in the following way: | |||
<math display="block"> | |||
\bigcup_{n\in\mathbb N}(-\infty,x_n)=(-\infty,x) | |||
</math> | |||
(4) This is clear from definitions.}} | |||
Here are as well some general rules for computing limits: | |||
{{proofcard|Theorem|theorem-11|The following happen, with the conventions <math>\infty+\infty=\infty</math>, <math>\infty\cdot\infty=\infty</math>, <math>1/\infty=0</math>, and with the conventions that <math>\infty-\infty</math> and <math>\infty\cdot0</math> are undefined: | |||
<ul><li> <math>x_n\to x</math> implies <math>\lambda x_n\to\lambda x</math>. | |||
</li> | |||
<li> <math>x_n\to x</math>, <math>y_n\to y</math> implies <math>x_n+y_n\to x+y</math>. | |||
</li> | |||
<li> <math>x_n\to x</math>, <math>y_n\to y</math> implies <math>x_ny_n\to xy</math>. | |||
</li> | |||
<li> <math>x_n\to x</math> with <math>x\neq0</math> implies <math>1/x_n\to 1/x</math>. | |||
</li> | |||
</ul> | |||
|All this is again elementary, coming from definitions: | |||
(1) This is something which is obvious from definitions. | |||
(2) This follows indeed from the following estimate: | |||
<math display="block"> | |||
|x_n+y_n-x-y|\leq|x_n-x|+|y_n-y| | |||
</math> | |||
(3) This follows indeed from the following estimate: | |||
<math display="block"> | |||
\begin{eqnarray*} | |||
|x_ny_n-xy| | |||
&=&|(x_n-x)y_n+x(y_n-y)|\\ | |||
&\leq&|x_n-x|\cdot|y_n|+|x|\cdot|y_n-y| | |||
\end{eqnarray*} | |||
</math> | |||
(4) This is again clear, by estimating <math>1/x_n-1/x</math>, in the obvious way.}} | |||
As an application of the above rules, we have the following useful result: | |||
{{proofcard|Proposition|proposition-5|The <math>n\to\infty</math> limits of quotients of polynomials are given by | |||
<math display="block"> | |||
\lim_{n\to\infty}\frac{a_pn^p+a_{p-1}n^{p-1}+\ldots+a_0}{b_qn^q+b_{q-1}n^{q-1}+\ldots+b_0}=\lim_{n\to\infty}\frac{a_pn^p}{b_qn^q} | |||
</math> | |||
with the limit on the right being <math>\pm\infty</math>, <math>0</math>, <math>a_p/b_q</math>, depending on the values of <math>p,q</math>. | |||
|The first assertion comes from the following computation: | |||
<math display="block"> | |||
\begin{eqnarray*} | |||
\lim_{n\to\infty}\frac{a_pn^p+a_{p-1}n^{p-1}+\ldots+a_0}{b_qn^q+b_{q-1}n^{q-1}+\ldots+b_0} | |||
&=&\lim_{n\to\infty}\frac{n^p}{n^q}\cdot\frac{a_p+a_{p-1}n^{-1}+\ldots+a_0n^{-p}}{b_q+b_{q-1}n^{-1}+\ldots+b_0n^{-q}}\\ | |||
&=&\lim_{n\to\infty}\frac{a_pn^p}{b_qn^q} | |||
\end{eqnarray*} | |||
</math> | |||
As for the second assertion, this comes from Proposition 1.20.}} | |||
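As usual, such formulae are worth a numerical doublecheck. The following Python sketch (coefficient lists and names are ours) evaluates a quotient of polynomials at a big <math>n</math>, and compares with <math>a_p/b_q</math>, in the case <math>p=q</math>:

```python
def poly(coeffs, n):
    # coeffs = [a_0, a_1, ..., a_p], giving a_0 + a_1 n + ... + a_p n^p
    return sum(c * n ** k for k, c in enumerate(coeffs))

# (3n^2 + 5n + 1) / (7n^2 + 2) should approach 3/7
n = 10 ** 6
ratio = poly([1, 5, 3], n) / poly([2, 0, 7], n)
assert abs(ratio - 3 / 7) < 1e-5
```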
Getting back now to theory, some sequences which obviously do not converge, like for instance <math>x_n=(-1)^n</math>, have however “2 limits instead of 1”. So let us formulate: | |||
{{defncard|label=|id=|Given a sequence <math>\{x_n\}_{n\in\mathbb N}\subset\mathbb R</math>, we let | |||
<math display="block"> | |||
\liminf_{n\to\infty}x_n\in[-\infty,\infty]\quad,\quad\limsup_{n\to\infty}x_n\in[-\infty,\infty] | |||
</math> | |||
be the smallest and the biggest limit of a subsequence of <math>(x_n)</math>, respectively.}}
Observe that the above quantities are defined indeed for any sequence <math>x_n</math>. For instance, for <math>x_n=(-1)^n</math> we obtain <math>-1</math> and <math>1</math>. Also, for <math>x_n=n</math> we obtain <math>\infty</math> and <math>\infty</math>. And so on. Of course, and generalizing the <math>x_n=n</math> example, if <math>x_n\to x</math> we obtain <math>x</math> and <math>x</math>. | |||
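For inspecting such subsequential limits on the computer, a crude numerical stand-in for <math>\liminf</math> and <math>\limsup</math> is to take the min and max over a far tail of the sequence (a Python sketch, ours, valid only as an illustration):

```python
def tail_extremes(seq, start=1000, length=1000):
    # min and max of x_n over a far tail, as a rough stand-in
    # for liminf and limsup
    tail = [seq(n) for n in range(start, start + length)]
    return min(tail), max(tail)

lo, hi = tail_extremes(lambda n: (-1) ** n)
assert (lo, hi) == (-1, 1)          # the "2 limits" of (-1)^n

lo, hi = tail_extremes(lambda n: 1 / n)
assert hi - lo < 1e-3               # for a convergent sequence, both approach x
```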
Going ahead with more theory, here is a key result: | |||
{{proofcard|Theorem|theorem-12|A sequence <math>x_n</math> converges, with finite limit <math>x\in\mathbb R</math>, precisely when | |||
<math display="block"> | |||
\forall\varepsilon > 0,\exists N\in\mathbb N,\forall m,n\geq N,|x_m-x_n| < \varepsilon | |||
</math> | |||
called the Cauchy condition.
|In one sense, this is clear. In the other sense, we can say for instance that the Cauchy condition forces the decimal writings of our numbers <math>x_n</math> to coincide more and more, with <math>n\to\infty</math>, and so we can construct a limit <math>x=\lim_{n\to\infty}x_n</math>, as desired.}} | |||
The above result is quite interesting, and as an application, we have: | |||
{{proofcard|Theorem|theorem-13|<math>\mathbb R</math> is the completion of <math>\mathbb Q</math>, in the sense that it is the space of Cauchy sequences over <math>\mathbb Q</math>, identified when the virtual limit is the same, in the sense that: | |||
<math display="block"> | |||
x_n\sim y_n\iff |x_n-y_n|\to0 | |||
</math> | |||
Moreover, <math>\mathbb R</math> is complete, in the sense that it equals its own completion. | |||
|Let us denote the completion operation by <math>X\to\bar{X}=C_X/\sim</math>, where <math>C_X</math> is the space of Cauchy sequences over <math>X</math>, and <math>\sim</math> is the above equivalence relation. Since by Theorem 1.25 any Cauchy sequence <math>(x_n)\in C_\mathbb Q</math> has a limit <math>x\in\mathbb R</math>, we obtain <math>\bar{\mathbb Q}=\mathbb R</math>. As for the equality <math>\bar{\mathbb R}=\mathbb R</math>, this is clear again by using Theorem 1.25.}} | |||
===1d. Series, the number e=== | |||
With the above understood, we are now ready to get into some truly interesting mathematics. Let us start with the following definition: | |||
{{defncard|label=|id=|Given numbers <math>x_0,x_1,x_2,\ldots\in\mathbb R</math>, we write | |||
<math display="block"> | |||
\sum_{n=0}^\infty x_n=x | |||
</math> | |||
with <math>x\in[-\infty,\infty]</math> when <math>\lim_{k\to\infty}\sum_{n=0}^kx_n=x</math>.}} | |||
As before with the sequences, there is some general theory that can be developed for the series, and more on this in a moment. As a first, basic example, we have: | |||
{{proofcard|Theorem|theorem-14|We have the “geometric series” formula | |||
<math display="block"> | |||
\sum_{n=0}^\infty x^n=\frac{1}{1-x} | |||
</math> | |||
valid for any <math>|x| < 1</math>. For <math>|x|\geq1</math>, the series diverges. | |||
|Our first claim, which comes by multiplying and simplifying, is that: | |||
<math display="block"> | |||
\sum_{n=0}^kx^n=\frac{1-x^{k+1}}{1-x} | |||
</math> | |||
But this proves the first assertion, because with <math>k\to\infty</math> we get: | |||
<math display="block"> | |||
\sum_{n=0}^kx^n\to\frac{1}{1-x} | |||
</math> | |||
As for the second assertion, this is clear as well from our formula above.}} | |||
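Here is a quick numerical doublecheck, both of the closed form for the partial sums from the proof, and of the limit itself (a Python sketch, with the sample values of <math>x,k</math> being ours):

```python
def geometric_partial(x, k):
    # sum of x^n for n = 0, ..., k
    return sum(x ** n for n in range(k + 1))

x, k = 0.3, 10
# closed form for the partial sum, as in the proof
assert abs(geometric_partial(x, k) - (1 - x ** (k + 1)) / (1 - x)) < 1e-12
# with k big, the partial sums approach 1/(1-x)
assert abs(geometric_partial(x, 100) - 1 / (1 - x)) < 1e-12
```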
Less trivial now is the following result, due to Riemann: | |||
{{proofcard|Theorem|theorem-15|We have the following formula: | |||
<math display="block"> | |||
1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\ldots=\infty | |||
</math> | |||
In fact, <math>\sum_n1/n^a</math> converges for <math>a > 1</math>, and diverges for <math>a\leq1</math>. | |||
|We have to prove several things, the idea being as follows: | |||
(1) The first assertion comes from the following computation: | |||
<math display="block"> | |||
\begin{eqnarray*} | |||
1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\ldots | |||
&=&1+\frac{1}{2}+\left(\frac{1}{3}+\frac{1}{4}\right) | |||
+\left(\frac{1}{5}+\frac{1}{6}+\frac{1}{7}+\frac{1}{8}\right)+\ldots\\ | |||
&\geq&1+\frac{1}{2}+\left(\frac{1}{4}+\frac{1}{4}\right) | |||
+\left(\frac{1}{8}+\frac{1}{8}+\frac{1}{8}+\frac{1}{8}\right)+\ldots\\ | |||
&=&1+\frac{1}{2}+\frac{1}{2}+\frac{1}{2}+\ldots\\ | |||
&=&\infty | |||
\end{eqnarray*} | |||
</math> | |||
(2) Regarding now the second assertion, we have divergence at <math>a=1</math>, by (1), and so divergence at any <math>a\leq1</math> as well, by comparison. Thus, it remains to prove that at <math>a > 1</math> the series converges. Let us first discuss the case <math>a=2</math>, which will prove the convergence at any <math>a\geq2</math>. The trick here is as follows:
<math display="block"> | |||
\begin{eqnarray*} | |||
1+\frac{1}{4}+\frac{1}{9}+\frac{1}{16}+\ldots | |||
&\leq&1+\frac{1}{3}+\frac{1}{6}+\frac{1}{10}+\ldots\\ | |||
&=&2\left(\frac{1}{2}+\frac{1}{6}+\frac{1}{12}+\frac{1}{20}+\ldots\right)\\ | |||
&=&2\left[\left(1-\frac{1}{2}\right)+\left(\frac{1}{2}-\frac{1}{3}\right)+\left(\frac{1}{3}-\frac{1}{4}\right)+\left(\frac{1}{4}-\frac{1}{5}\right)\ldots\right]\\ | |||
&=&2 | |||
\end{eqnarray*} | |||
</math> | |||
(3) It remains to prove that the series converges at <math>a\in(1,2)</math>, and here it is enough to deal with the case of the exponents <math>a=1+1/p</math> with <math>p\in\mathbb N</math>. We already know how to do this at <math>p=1</math>, and the proof at <math>p\in\mathbb N</math> will be based on a similar trick. We have: | |||
<math display="block"> | |||
\sum_{n=1}^\infty\left(\frac{1}{n^{1/p}}-\frac{1}{(n+1)^{1/p}}\right)=1
</math> | |||
Let us compute, or rather estimate, the generic term of this series. By using the formula <math>a^p-b^p=(a-b)(a^{p-1}+a^{p-2}b+\ldots+ab^{p-2}+b^{p-1})</math>, we have: | |||
<math display="block"> | |||
\begin{eqnarray*} | |||
\frac{1}{n^{1/p}}-\frac{1}{(n+1)^{1/p}} | |||
&=&\frac{(n+1)^{1/p}-n^{1/p}}{n^{1/p}(n+1)^{1/p}}\\ | |||
&=&\frac{1}{n^{1/p}(n+1)^{1/p}[(n+1)^{1-1/p}+\ldots+n^{1-1/p}]}\\ | |||
&\geq&\frac{1}{n^{1/p}(n+1)^{1/p}\cdot p(n+1)^{1-1/p}}\\ | |||
&=&\frac{1}{pn^{1/p}(n+1)}\\ | |||
&\geq&\frac{1}{p(n+1)^{1+1/p}} | |||
\end{eqnarray*} | |||
</math> | |||
We therefore obtain the following estimate for our Riemann series:
<math display="block"> | |||
\begin{eqnarray*} | |||
\sum_{n=1}^\infty\frac{1}{n^{1+1/p}}
&=&1+\sum_{n=1}^\infty\frac{1}{(n+1)^{1+1/p}}\\
&\leq&1+p\sum_{n=1}^\infty\left(\frac{1}{n^{1/p}}-\frac{1}{(n+1)^{1/p}}\right)\\
&=&1+p | |||
\end{eqnarray*} | |||
</math> | |||
Thus, we are done with the case <math>a=1+1/p</math>, which finishes the proof.}} | |||
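The Riemann criterion can be doublechecked numerically, by computing partial sums (a Python sketch, with the cutoffs being ours):

```python
def riemann_partial(a, terms):
    return sum(1 / n ** a for n in range(1, terms + 1))

# a = 2: all partial sums stay below 2, as found above
assert riemann_partial(2, 10 ** 5) < 2
# a = 1 + 1/p with p = 2: all partial sums stay below 1 + p = 3
assert riemann_partial(1.5, 10 ** 5) < 3
# a = 1: the partial sums grow without bound, though very slowly
assert riemann_partial(1, 10 ** 5) > 10
```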
Here is another tricky result, this time about alternating sums: | |||
{{proofcard|Theorem|theorem-16|We have the following convergence result: | |||
<math display="block"> | |||
1-\frac{1}{2}+\frac{1}{3}-\frac{1}{4}+\ldots < \infty | |||
</math> | |||
However, when rearranging terms, we can obtain any <math>x\in[-\infty,\infty]</math> as limit. | |||
|Both the assertions follow from Theorem 1.29, as follows: | |||
(1) We have the following computation, using the Riemann criterion at <math>a=2</math>: | |||
<math display="block"> | |||
\begin{eqnarray*} | |||
1-\frac{1}{2}+\frac{1}{3}-\frac{1}{4}+\ldots | |||
&=&\left(1-\frac{1}{2}\right)+\left(\frac{1}{3}-\frac{1}{4}\right)+\ldots\\ | |||
&=&\frac{1}{2}+\frac{1}{12}+\frac{1}{30}+\ldots\\ | |||
& < &\frac{1}{1^2}+\frac{1}{2^2}+\frac{1}{3^2}+\ldots\\ | |||
& < &\infty | |||
\end{eqnarray*} | |||
</math> | |||
(2) We have the following formulae, coming from the Riemann criterion at <math>a=1</math>: | |||
<math display="block"> | |||
\frac{1}{2}+\frac{1}{4}+\frac{1}{6}+\frac{1}{8}+\ldots=\frac{1}{2}\left(1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\ldots\right)=\infty | |||
</math> | |||
<math display="block"> | |||
1+\frac{1}{3}+\frac{1}{5}+\frac{1}{7}+\ldots\geq\frac{1}{2}+\frac{1}{4}+\frac{1}{6}+\frac{1}{8}+\ldots=\infty | |||
</math> | |||
Thus, both these series diverge. The point now is that, by using this, when rearranging terms in the alternating series in the statement, we can arrange for the partial sums to go arbitrarily high, or arbitrarily low, and we can obtain any <math>x\in[-\infty,\infty]</math> as limit.}} | |||
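The rearrangement procedure at the end of the proof can be made algorithmic: greedily take positive terms <math>1,1/3,1/5,\ldots</math> while below the target, and negative terms <math>-1/2,-1/4,\ldots</math> while above it. Here is a Python sketch of this (the target and step count are ours):

```python
def rearranged_partial(target, steps):
    # greedy rearrangement of 1 - 1/2 + 1/3 - 1/4 + ...
    pos, neg, s = 1, 2, 0.0
    for _ in range(steps):
        if s <= target:
            s += 1 / pos    # next unused positive term, 1/pos
            pos += 2
        else:
            s -= 1 / neg    # next unused negative term, 1/neg
            neg += 2
    return s

# the partial sums oscillate around the target, ever more tightly
assert abs(rearranged_partial(0.9, 10 ** 5) - 0.9) < 1e-3
```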
Back now to the general case, we first have the following statement: | |||
{{proofcard|Theorem|theorem-17|The following hold, with the converses of <math>(1)</math> and <math>(2)</math> being wrong, and with <math>(3)</math> not holding when the assumption <math>x_n\geq0</math> is removed: | |||
<ul><li> If <math>\sum_nx_n</math> converges then <math>x_n\to0</math>. | |||
</li> | |||
<li> If <math>\sum_n|x_n|</math> converges then <math>\sum_nx_n</math> converges. | |||
</li> | |||
<li> If <math>\sum_nx_n</math> converges, <math>x_n\geq0</math> and <math>x_n/y_n\to1</math> then <math>\sum_ny_n</math> converges. | |||
</li> | |||
</ul> | |||
|This is a mixture of trivial and non-trivial results, as follows: | |||
(1) We know that <math>\sum_nx_n</math> converges when <math>S_k=\sum_{n=0}^kx_n</math> converges. Thus by Cauchy we have <math>x_k=S_k-S_{k-1}\to0</math>, and this gives the result. As for the simplest counterexample for the converse, this is <math>1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\ldots=\infty</math>, coming from Theorem 1.29. | |||
(2) This follows again from the Cauchy criterion, by using: | |||
<math display="block"> | |||
|x_n+x_{n+1}+\ldots+x_{n+k}|\leq|x_n|+|x_{n+1}|+\ldots+|x_{n+k}| | |||
</math> | |||
As for the simplest counterexample for the converse, this is <math>1-\frac{1}{2}+\frac{1}{3}-\frac{1}{4}+\ldots < \infty</math>, coming from Theorem 1.30, coupled with <math>1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\ldots=\infty</math> from (1). | |||
(3) Again, the main assertion here is clear, coming from, for <math>n</math> big: | |||
<math display="block"> | |||
(1-\varepsilon)x_n\leq y_n\leq(1+\varepsilon)x_n | |||
</math> | |||
In what regards now the failure of the result, when the assumption <math>x_n\geq0</math> is removed, this is something quite tricky, the simplest counterexample being as follows: | |||
<math display="block"> | |||
x_n=\frac{(-1)^n}{\sqrt{n}} | |||
\quad,\quad | |||
y_n=\frac{1}{n}+\frac{(-1)^n}{\sqrt{n}} | |||
</math> | |||
To be more precise, we have <math>y_n/x_n\to1</math>, so <math>x_n/y_n\to1</math> too, but according to the above-mentioned results from (1,2), modified a bit, <math>\sum_nx_n</math> converges, while <math>\sum_ny_n</math> diverges.}} | |||
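This counterexample can itself be checked numerically: the partial sums of <math>\sum_nx_n</math> stay bounded, while those of <math>\sum_ny_n=\sum_n1/n+\sum_nx_n</math> drift off to infinity (a Python sketch, with the cutoff being ours):

```python
import math

terms = 10 ** 5
sx = sum((-1) ** n / math.sqrt(n) for n in range(1, terms))
sy = sum(1 / n + (-1) ** n / math.sqrt(n) for n in range(1, terms))

assert abs(sx) < 1      # partial sums of x_n stay bounded
assert sy - sx > 10     # the harmonic part of y_n has already passed 10
```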
Summarizing, we have some useful positive results about series, which are however quite trivial, along with various counterexamples to their possible modifications, which are non-trivial. Staying positive, here are some more positive results: | |||
{{proofcard|Theorem|theorem-18|The following happen, and in all cases, the situation where <math>c=1</math> is indeterminate, in the sense that the series can converge or diverge:
<ul><li> If <math>|x_{n+1}/x_n|\to c</math>, the series <math>\sum_nx_n</math> converges if <math>c < 1</math>, and diverges if <math>c > 1</math>. | |||
</li> | |||
<li> If <math>\sqrt[n]{|x_n|}\to c</math>, the series <math>\sum_nx_n</math> converges if <math>c < 1</math>, and diverges if <math>c > 1</math>. | |||
</li> | |||
<li> With <math>c=\limsup_{n\to\infty}\sqrt[n]{|x_n|}</math>, <math>\sum_nx_n</math> converges if <math>c < 1</math>, and diverges if <math>c > 1</math>. | |||
</li> | |||
</ul> | |||
|Again, this is a mixture of trivial and non-trivial results, as follows: | |||
(1) Here the main assertions, regarding the cases <math>c < 1</math> and <math>c > 1</math>, are both clear by comparing with the geometric series <math>\sum_nc^n</math>. As for the case <math>c=1</math>, this is what happens for the Riemann series <math>\sum_n1/n^a</math>, so we can have both convergent and divergent series. | |||
(2) Again, the main assertions, where <math>c < 1</math> or <math>c > 1</math>, are clear by comparing with the geometric series <math>\sum_nc^n</math>, and the <math>c=1</math> examples come from the Riemann series. | |||
(3) Here the case <math>c < 1</math> is dealt with as in (2), and the same goes for the examples at <math>c=1</math>. As for the case <math>c > 1</math>, this is clear too, because here <math>x_n\to0</math> fails.}} | |||
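As an illustration of the ratio test, consider <math>x_n=n/2^n</math>, whose ratio <math>|x_{n+1}/x_n|=(n+1)/(2n)</math> tends to <math>c=1/2 < 1</math>, so the series converges. A Python sketch of this (the example is ours, with the sum of the series being the known value <math>2</math>):

```python
# ratio |x_{n+1}/x_n| for x_n = n / 2^n, at a moderately large n
def xn(n):
    return n / 2 ** n

c = xn(51) / xn(50)     # equals 51/100, already close to the limit 1/2
assert c < 1

# and indeed, the series converges, with sum 2
assert abs(sum(xn(n) for n in range(1, 200)) - 2.0) < 1e-9
```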
Finally, generalizing the first assertion in Theorem 1.30, we have: | |||
{{proofcard|Theorem|theorem-19|If <math>x_n\searrow0</math> then <math>\sum_n(-1)^nx_n</math> converges. | |||
|We have <math>\sum_n(-1)^nx_n=\sum_ky_k</math>, where:
<math display="block"> | |||
y_k=x_{2k}-x_{2k+1} | |||
</math> | |||
But, by drawing for instance the numbers <math>x_i</math> on the real line, we see that <math>y_k</math> are positive numbers, and that <math>\sum_ky_k</math> is the sum of lengths of certain disjoint intervals, included in the interval <math>[0,x_0]</math>. Thus we have <math>\sum_ky_k\leq x_0</math>, and this gives the result.}} | |||
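As a numerical illustration of this, take <math>x_n=1/(n+1)</math>, which decreases to <math>0</math>; the alternating sums then converge, with the limit turning out to be <math>\log 2</math>, a fact that we will be able to prove later, with better tools. A Python sketch (cutoff ours):

```python
import math

def alternating_partial(terms):
    # partial sums of 1 - 1/2 + 1/3 - 1/4 + ...
    return sum((-1) ** n / (n + 1) for n in range(terms))

s = alternating_partial(10 ** 6)
assert 0 < s < 1                        # bounded by x_0 = 1, as in the proof
assert abs(s - math.log(2)) < 1e-5      # the limit is log 2 = 0.693...
```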
All this was a bit theoretical, and as something more concrete now, we have: | |||
{{proofcard|Theorem|theorem-20|We have the following convergence | |||
<math display="block"> | |||
\left(1+\frac{1}{n}\right)^n\to e | |||
</math> | |||
where <math>e=2.71828\ldots</math> is a certain number. | |||
|This is something quite tricky, as follows: | |||
(1) Our first claim is that the following sequence is increasing: | |||
<math display="block"> | |||
x_n=\left(1+\frac{1}{n}\right)^n | |||
</math> | |||
In order to prove this, we use the following arithmetic-geometric inequality: | |||
<math display="block"> | |||
\frac{1+\sum_{i=1}^n\left(1+\frac{1}{n}\right)}{n+1}\geq\sqrt[n+1]{1\cdot\prod_{i=1}^n\left(1+\frac{1}{n}\right)} | |||
</math> | |||
In practice, this gives the following inequality: | |||
<math display="block"> | |||
1+\frac{1}{n+1}\geq\left(1+\frac{1}{n}\right)^{n/(n+1)} | |||
</math> | |||
Now by raising to the power <math>n+1</math> we obtain, as desired: | |||
<math display="block"> | |||
\left(1+\frac{1}{n+1}\right)^{n+1}\geq\left(1+\frac{1}{n}\right)^n | |||
</math> | |||
(2) Normally we are left with proving that <math>x_n</math> is bounded from above, but this is non-trivial, and we have to use a trick. Consider the following sequence: | |||
<math display="block"> | |||
y_n=\left(1+\frac{1}{n}\right)^{n+1} | |||
</math> | |||
We will prove that this sequence <math>y_n</math> is decreasing, and together with the fact that we have <math>x_n/y_n\to1</math>, this will give the result. So, this will be our plan. | |||
(3) In order to prove now that <math>y_n</math> is decreasing, we use, a bit as before: | |||
<math display="block"> | |||
\frac{1+\sum_{i=1}^n\left(1-\frac{1}{n}\right)}{n+1}\geq\sqrt[n+1]{1\cdot\prod_{i=1}^n\left(1-\frac{1}{n}\right)} | |||
</math> | |||
In practice, this gives the following inequality: | |||
<math display="block"> | |||
1-\frac{1}{n+1}\geq\left(1-\frac{1}{n}\right)^{n/(n+1)} | |||
</math> | |||
Now by raising to the power <math>n+1</math> we obtain from this: | |||
<math display="block"> | |||
\left(1-\frac{1}{n+1}\right)^{n+1}\geq\left(1-\frac{1}{n}\right)^n | |||
</math> | |||
The point now is that we have the following inversion formulae: | |||
<math display="block"> | |||
\left(1-\frac{1}{n+1}\right)^{-1}=\left(\frac{n}{n+1}\right)^{-1}=\frac{n+1}{n}=1+\frac{1}{n} | |||
</math> | |||
<math display="block"> | |||
\left(1-\frac{1}{n}\right)^{-1}=\left(\frac{n-1}{n}\right)^{-1}=\frac{n}{n-1}=1+\frac{1}{n-1} | |||
</math> | |||
Thus by inverting the inequality that we found, we obtain, as desired: | |||
<math display="block"> | |||
\left(1+\frac{1}{n}\right)^{n+1}\leq\left(1+\frac{1}{n-1}\right)^n | |||
</math> | |||
(4) But with this, we can now finish. Indeed, the sequence <math>x_n</math> is increasing, the sequence <math>y_n</math> is decreasing, and we have <math>x_n < y_n</math>, as well as: | |||
<math display="block"> | |||
\frac{y_n}{x_n}=1+\frac{1}{n}\to1 | |||
</math> | |||
Thus, both sequences <math>x_n,y_n</math> converge to a certain number <math>e</math>, as desired. | |||
(5) Finally, regarding the numerics for our limiting number <math>e</math>, we know from the above that we have <math>x_n < e < y_n</math> for any <math>n\in\mathbb N</math>, which reads: | |||
<math display="block"> | |||
\left(1+\frac{1}{n}\right)^n < e < \left(1+\frac{1}{n}\right)^{n+1} | |||
</math> | |||
Thus <math>e\in[2,3]</math>, and with a bit of patience, or a computer, we obtain <math>e=2.71828\ldots</math> We will actually come back to this question later, with better methods.}} | |||
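Regarding these numerics, the sandwich <math>x_n < e < y_n</math> from the proof is easy to implement (a Python sketch, with the choice of <math>n</math> being ours):

```python
import math

n = 10 ** 6
x = (1 + 1 / n) ** n            # increasing, below e
y = (1 + 1 / n) ** (n + 1)      # decreasing, above e

assert x < math.e < y
assert y - x < 1e-5             # the gap y_n - x_n = x_n / n shrinks to 0
```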
We should mention that there are many other ways of getting into <math>e</math>. For instance it is possible to prove that we have the following formula, which is a bit more conceptual than the formula in Theorem 1.34, and also with the convergence being very quick: | |||
<math display="block"> | |||
\sum_{n=0}^\infty\frac{1}{n!}=e | |||
</math> | |||
Importantly, all this is not the end of the story with <math>e</math>. For instance, in relation with the first formula that we found, in Theorem 1.34, we have, more generally:
<math display="block"> | |||
\left(1+\frac{x}{n}\right)^n\to e^x | |||
</math> | |||
Also, in relation with the second formula, from above, we have, more generally: | |||
<math display="block"> | |||
\sum_{n=0}^\infty\frac{x^n}{n!}=e^x | |||
</math> | |||
To be more precise, these latter two formulae are something that we know at <math>x=1</math>. The case <math>x=0</math> is trivial, the case <math>x=-1</math> follows from the case <math>x=1</math>, via some simple manipulations, and with a bit more work, we can get these formulae for any <math>x\in\mathbb N</math>, and then for any <math>x\in\mathbb Z</math>. However, the general case <math>x\in\mathbb R</math> is quite tricky, requiring a good knowledge of the theory of real functions. And, good news, real functions will be what we will be doing in the remainder of this first part, in chapters 2-4 below. | |||
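These two formulae for <math>e^x</math> can be doublechecked numerically as well, against the exponential function of the computer (a Python sketch, with the cutoffs being ours). The series is indeed remarkably quick, while the limit formula converges slowly:

```python
import math

def exp_by_limit(x, n=10 ** 6):
    return (1 + x / n) ** n

def exp_by_series(x, terms=30):
    return sum(x ** k / math.factorial(k) for k in range(terms))

for x in (-1.0, 0.5, 2.0):
    assert abs(exp_by_series(x) - math.exp(x)) < 1e-12   # 30 terms suffice
    assert abs(exp_by_limit(x) - math.exp(x)) < 1e-4     # slow convergence
```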
==General references== | |||
{{cite arXiv|last1=Banica|first1=Teo|year=2024|title=Calculus and applications|eprint=2401.00911|class=math.CO}} |
Latest revision as of 15:13, 21 April 2025
1a. Binomials, factorials
We denote by [math]\mathbb N[/math] the set of positive integers, [math]\mathbb N=\{0,1,2,3,\ldots\}[/math], with [math]\mathbb N[/math] standing for “natural”. Quite often in computations we will need negative numbers too, and we denote by [math]\mathbb Z[/math] the set of all integers, [math]\mathbb Z=\{\ldots,-2,-1,0,1,2,\ldots\}[/math], with [math]\mathbb Z[/math] standing from “zahlen”, which is German for “numbers”. Finally, there are many questions in mathematics involving fractions, or quotients, which are called rational numbers:
The rational numbers are the quotients of type
Observe that we have inclusions [math]\mathbb N\subset\mathbb Z\subset\mathbb Q[/math]. The integers add and multiply according to the rules that you know well. As for the rational numbers, these add according to the usual rule for quotients, which is as follows, and death penalty for forgetting it:
Also, the rational numbers multiply according to the usual rule for quotients, namely:
Beyond rationals, we have the real numbers, whose set is denoted [math]\mathbb R[/math], and which include beasts such as [math]\sqrt{3}=1.73205\ldots[/math] or [math]\pi=3.14159\ldots[/math] But more on these later. For the moment, let us see what can be done with integers, and their quotients. As a first theorem, solving a problem which often appears in real life, we have:
The number of possibilities of choosing [math]k[/math] objects among [math]n[/math] objects is
Imagine a set consisting of [math]n[/math] objects. We have [math]n[/math] possibilities for choosing our 1st object, then [math]n-1[/math] possibilities for choosing our 2nd object, out of the [math]n-1[/math] objects left, and so on up to [math]n-k+1[/math] possibilities for choosing our [math]k[/math]-th object, out of the [math]n-k+1[/math] objects left. Since the possibilities multiply, the total number of choices is:
But is this correct. Normally a mathematical theorem coming with mathematical proof is guaranteed to be [math]100\%[/math] correct, and if in addition the proof is truly clever, like the above proof was, with that fraction trick, the confidence rate jumps up to [math]200\%[/math].
This being said, never knows, so let us doublecheck, by taking for instance [math]n=3,k=2[/math]. Here we have to choose 2 objects among 3 objects, and this is something easily done, because what we have to do is to dismiss one of the objects, and [math]N=3[/math] choices here, and keep the 2 objects left. Thus, we have [math]N=3[/math] choices. On the other hand our genius math computation gives [math]N=3!/1!=6[/math], which is obviously the wrong answer.
So, where is the mistake? Thinking a bit, the number [math]N[/math] that we computed is in fact the number of possibilities of choosing [math]k[/math] ordered objects among [math]n[/math] objects. Thus, we must divide everything by the number [math]M[/math] of orderings of the [math]k[/math] objects that we chose:
In order to compute now the missing number [math]M[/math], imagine a set consisting of [math]k[/math] objects. There are [math]k[/math] choices for the object to be designated [math]\#1[/math], then [math]k-1[/math] choices for the object to be designated [math]\#2[/math], and so on up to 1 choice for the object to be designated [math]\#k[/math]. We conclude that we have [math]M=k(k-1)\ldots 2\cdot 1=k![/math], and so:
And this is the correct answer, because, well, that is how things are. In case you doubt, at [math]n=3,k=2[/math] for instance we obtain [math]3!/2!1!=3[/math], which is correct.
All this is quite interesting, and in addition to having some exciting mathematics going on, and more on this in a moment, we have as well some philosophical conclusions. Formulae can be right or wrong, and as the above shows, good-looking, formal mathematical proofs can be right or wrong too. So, what to do? Here is my advice: \begin{advice} Always doublecheck what you're doing, regularly, and definitely at the end, either with an alternative proof, or with some numerics. \end{advice} This is something very serious. Unless you're doing something very familiar, that you're used to for at least 5-10 years or so, like doing additions and multiplications for you, or some easy calculus for me, formulae and proofs that you can come upon are by default wrong. In order to make them correct, and ready to use, you must check and doublecheck and correct them, helped by alternative methods, or numerics.
Which brings us into the question on whether mathematics is an exact science or not. Not clear. Chemistry for instance is an exact science, because findings of type “a mixture of water and salt cannot explode” look rock-solid. Same for biology, with findings of type “crocodiles eat fish” being rock-solid too. In what regards mathematics however, and theoretical physics too, things are always prone to human mistake.
And for ending this discussion, you might ask then, what about engineering? After all, this is mathematics and physics, which is usually [math]100\%[/math] correct, because most of the bridges, buildings and other things built by engineers don't collapse. Well, this is because engineers follow, and in a truly maniac way, the above Advice 1.3. You won't declare a project for a bridge, building, engine and so on final and correct, ready for production, until you checked and doublechecked it with 10 different methods or so, won't you.
Back to work now, as an important adding to Theorem 1.2, we have:
'''Convention 1.4.''' By definition, [math]0!=1[/math].
This convention comes, and no surprise here, from Advice 1.3. Indeed, we obviously have [math]\binom{n}{n}=1[/math], but if we want to recover this formula via Theorem 1.2 we are a bit in trouble, and so we must declare that [math]0!=1[/math], for the following computation to work:
Going ahead now with more mathematics and less philosophy, with Theorem 1.2 complemented by Convention 1.4 being in final form (trust me), we have:
We have the binomial formula
We have to compute the following quantity, with [math]n[/math] terms in the product:
When expanding, we obtain a certain sum of products of [math]a,b[/math] variables, with each such product being a quantity of type [math]a^kb^{n-k}[/math]. Thus, we have a formula as follows:
In order to finish, it remains to compute the coefficients [math]C_k[/math]. But, according to our product formula, [math]C_k[/math] is the number of choices for the [math]k[/math] needed [math]a[/math] variables among the [math]n[/math] available [math]a[/math] variables. Thus, according to Theorem 1.2, we have:
We are therefore led to the formula in the statement.
Theorem 1.5 is something quite interesting, so let us doublecheck it with some numerics. At small values of [math]n[/math] we obtain the following formulae, which are all correct:
Now observe that in these formulae, say for memorization purposes, the powers of the [math]a,b[/math] variables are something very simple, that can be recovered right away. What matters are the coefficients, which are the binomial coefficients [math]\binom{n}{k}[/math], which form a triangle. So, it is enough to memorize this triangle, and this can be done by using:
The Pascal triangle, formed by the binomial coefficients [math]\binom{n}{k}[/math],
In practice, the theorem states that the following formula holds:
There are many ways of proving this formula, all instructive, as follows:
(1) Brute-force computation. We have indeed, as desired:
(2) Algebraic proof. We have the following formula, to start with:
By using the binomial formula, this formula becomes:
Now let us perform the multiplication on the right. We obtain a certain sum of terms of type [math]a^kb^{n-k}[/math], and to be more precise, each such [math]a^kb^{n-k}[/math] term can either come from the [math]\binom{n-1}{k-1}[/math] terms [math]a^{k-1}b^{n-k}[/math] multiplied by [math]a[/math], or from the [math]\binom{n-1}{k}[/math] terms [math]a^kb^{n-1-k}[/math] multiplied by [math]b[/math]. Thus, the coefficient of [math]a^kb^{n-k}[/math] on the right is [math]\binom{n-1}{k-1}+\binom{n-1}{k}[/math], as desired.
(3) Combinatorics. Let us count [math]k[/math] objects among [math]n[/math] objects, with one of the [math]n[/math] objects having a hat on top. Obviously, the hat has nothing to do with the count, and we obtain [math]\binom{n}{k}[/math]. On the other hand, we can say that there are two possibilities. Either the object with hat is counted, and we have [math]\binom{n-1}{k-1}[/math] possibilities here, or the object with hat is not counted, and we have [math]\binom{n-1}{k}[/math] possibilities here. Thus [math]\binom{n}{k}=\binom{n-1}{k-1}+\binom{n-1}{k}[/math], as desired.
There are many more things that can be said about binomial coefficients, with all sorts of interesting formulae, but the idea is always the same, namely that in order to find such formulae you have a choice between algebra and combinatorics, and that when it comes to proofs, the brute-force computation method is useful too. In practice, the best is to master all 3 techniques. Among others, because of Advice 1.3. You will have in this way 3 different methods, for making sure that your formulae are correct indeed.
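Following Advice 1.3, here is a small Python sketch doublechecking the above with some numerics; the function name <code>binomial</code> is of course just an illustration:

```python
from math import factorial

# binomial coefficients via factorials, as in Theorem 1.2, with 0! = 1
def binomial(n, k):
    return factorial(n) // (factorial(k) * factorial(n - k))

# Pascal's rule: C(n,k) = C(n-1,k-1) + C(n-1,k)
for n in range(1, 10):
    for k in range(1, n):
        assert binomial(n, k) == binomial(n - 1, k - 1) + binomial(n - 1, k)

# the binomial formula, checked at a = 3, b = 5
a, b = 3, 5
for n in range(10):
    assert (a + b) ** n == sum(
        binomial(n, k) * a ** k * b ** (n - k) for k in range(n + 1)
    )

print("all checks passed")
```

Since everything here is integer arithmetic, these checks are exact, not merely approximate.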
===1b. Real numbers, analysis===
All the above was very nice, but remember that we are here for doing science and physics, and more specifically for mathematically understanding the numeric variables [math]x,y,z,\ldots[/math] coming from real life. Such variables can be lengths, volumes, pressures and so on, which vary continuously with time, and common sense dictates that there is little to no chance for our variables to be rational, [math]x,y,z,\ldots\notin\mathbb Q[/math]. In fact, we will even see soon a theorem, stating that the probability for such a variable to be rational is exactly 0. Or, to put it in a dramatic way, “rational numbers don't exist in real life”.
You are certainly familiar with the real numbers, but let us review now their definition, which is something quite tricky. As a first goal, we would like to construct a number [math]x=\sqrt{2}[/math] having the property [math]x^2=2[/math]. But how to do this? Let us start with:
There is no number [math]r\in\mathbb Q_+[/math] satisfying [math]r^2=2[/math]. In fact, we have
In what regards the first assertion, assuming that [math]r=a/b[/math] with [math]a,b\in\mathbb N[/math] prime to each other satisfies [math]r^2=2[/math], we have [math]a^2=2b^2[/math], so [math]a\in2\mathbb N[/math]. Writing [math]a=2c[/math], the formula [math]a^2=2b^2[/math] becomes [math]b^2=2c^2[/math], so we obtain [math]b\in2\mathbb N[/math] as well, contradiction. As for the second assertion, this is obvious.
It looks like we are a bit stuck. We can't really tell who [math]\sqrt{2}[/math] is, and the only piece of information about [math]\sqrt{2}[/math] that we have comes from the knowledge of the rational numbers satisfying [math]p^2 \lt 2[/math] or [math]q^2 \gt 2[/math]. To be more precise, the picture that emerges is:

'''Conclusion 1.8.''' The number [math]\sqrt{2}[/math] is the abstract beast which is bigger than all rationals satisfying [math]p^2 \lt 2[/math], and smaller than all positive rationals satisfying [math]q^2 \gt 2[/math].

This does not look very good, but you know what, instead of looking for more clever solutions to our problem, what about relaxing, or being lazy, or coward, or you name it, and taking Conclusion 1.8 as a definition for [math]\sqrt{2}[/math]. This is actually something not that bad, and leads to the following “lazy” definition for the real numbers:
The real numbers [math]x\in\mathbb R[/math] are formal cuts in the set of rationals,
This might look quite original, but believe me, there is some genius behind this definition. As a first observation, we have an inclusion [math]\mathbb Q\subset\mathbb R[/math], obtained by identifying each rational number [math]r\in\mathbb Q[/math] with the obvious cut that it produces, namely:
As a second observation, the addition and multiplication of real numbers, obtained by adding and multiplying the corresponding cuts, in the obvious way, is something very simple. To be more precise, in what regards the addition, the formula is as follows:
As for the multiplication, the formula here is similar, namely [math]\mathbb Q_{\leq xy}=\mathbb Q_{\leq x}\mathbb Q_{\leq y}[/math], up to some mess with positives and negatives, which is quite easy to untangle, and with this being a good exercise. We can also talk about order between real numbers, as follows:
But let us perhaps leave more abstractions for later, and go back to more concrete things. As a first success of our theory, we can formulate the following theorem:
The equation [math]x^2=2[/math] has two solutions over the real numbers, namely the positive solution, denoted [math]\sqrt{2}[/math], and its negative counterpart, which is [math]-\sqrt{2}[/math].
By using [math]x\to-x[/math], it is enough to prove that [math]x^2=2[/math] has exactly one positive solution [math]\sqrt{2}[/math]. But this is clear, because [math]\sqrt{2}[/math] can only come from the following cut:
Thus, we are led to the conclusion in the statement.
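To make the cut defining [math]\sqrt{2}[/math] a bit more concrete, here is a small Python sketch, bisecting between rationals [math]p[/math] with [math]p^2 \lt 2[/math] and rationals [math]q[/math] with [math]q^2 \gt 2[/math]; exact rational arithmetic is used throughout, so no real numbers are secretly involved:

```python
from fractions import Fraction

# start with p^2 < 2 < q^2, and bisect, staying inside the rationals
p, q = Fraction(1), Fraction(2)
for _ in range(30):
    m = (p + q) / 2
    if m * m < 2:
        p = m
    else:
        q = m

# p and q squeeze sqrt(2) from both sides, within 2^-30
print(float(p), float(q))
assert p * p < 2 < q * q
```

Every number appearing in the computation is rational, and yet the two sides of the cut close in on the “abstract beast” [math]\sqrt{2}=1.41421\ldots[/math]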
More generally, the same method works in order to extract the square root [math]\sqrt{r}[/math] of any number [math]r\in\mathbb Q_+[/math], or even of any number [math]r\in\mathbb R_+[/math], and we have the following result:
The solutions of [math]ax^2+bx+c=0[/math] with [math]a,b,c\in\mathbb R[/math] are
We can write our equation in the following way:
Thus, we are led to the conclusion in the statement.
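In code, the quadratic formula reads as follows; this is an illustrative sketch, with the function name being mine, and with the discriminant [math]b^2-4ac[/math] deciding whether real solutions exist at all:

```python
from math import sqrt

def solve_quadratic(a, b, c):
    """Real solutions of ax^2 + bx + c = 0, with a != 0."""
    d = b * b - 4 * a * c          # the discriminant
    if d < 0:
        return []                  # no real solutions
    if d == 0:
        return [-b / (2 * a)]      # a double root
    return [(-b - sqrt(d)) / (2 * a), (-b + sqrt(d)) / (2 * a)]

print(solve_quadratic(1, 0, -2))   # the two solutions of x^2 = 2
```

The case [math]d \lt 0[/math] returns nothing here, in agreement with the fact that negative numbers have no real square roots; complex numbers, which fix this, come later in the book.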
Summarizing, we have a nice definition for the real numbers, that we can certainly do some math with. However, for anything more advanced we are in need of the decimal writing for the real numbers. The result here is as follows:
The real numbers [math]x\in\mathbb R[/math] can be written in decimal form,
This is something quite non-trivial, assuming that you already have some familiarity with such things, for the rational numbers. The idea is as follows:
(1) First of all, our precise claim is that any [math]x\in\mathbb R[/math] can be written in the form in the statement, with the integer [math]\pm a_1\ldots a_n[/math] and then each of the digits [math]b_1,b_2,b_3,\ldots[/math] providing the best approximation of [math]x[/math], at that stage of the approximation.
(2) Moreover, we have a second claim as well, namely that any expression of type [math]x=\pm a_1\ldots a_n.b_1b_2b_3\ldots[/math] corresponds to a real number [math]x\in\mathbb R[/math], and that with the convention [math]\ldots b999\ldots=\ldots(b+1)000\ldots\,[/math], the correspondence is bijective.
(3) In order to prove now these two assertions, our first claim is that we can restrict the attention to the case [math]x\in[0,1)[/math], and with this meaning of course [math]0\leq x \lt 1[/math], with respect to the order relation for the reals discussed in the above.
(4) Getting started now, let [math]x\in\mathbb R[/math], coming from a cut [math]\mathbb Q=\mathbb Q_{\leq x}\sqcup\mathbb Q_{ \gt x}[/math]. Since the set [math]\mathbb Q_{\leq x}\cap\mathbb Z[/math] consists of integers, and is bounded from above by any element [math]q\in\mathbb Q_{ \gt x}[/math] of your choice, this set has a maximal element, that we can denote [math][x][/math]:
It follows from definitions that [math][x][/math] has the usual properties of the integer part, namely:
Thus we have [math]x=[x]+y[/math] with [math][x]\in\mathbb Z[/math] and [math]y\in[0,1)[/math], and getting back now to what we want to prove, namely (1,2) above, it is clear that it is enough to prove these assertions for the remainder [math]y\in[0,1)[/math]. Thus, we have proved (3), and we can assume [math]x\in[0,1)[/math].
(5) So, assume [math]x\in[0,1)[/math]. We are first looking for a best approximation from below of type [math]0.b_1[/math], with [math]b_1\in\{0,\ldots,9\}[/math], and it is clear that such an approximation exists, simply by comparing [math]x[/math] with the numbers [math]0.0,0.1,\ldots,0.9[/math]. Thus, we have our first digit [math]b_1[/math], and then we can construct the second digit [math]b_2[/math] as well, by comparing [math]x[/math] with the numbers [math]0.b_10,0.b_11,\ldots,0.b_19[/math]. And so on, which finishes the proof of our claim (1).
(6) In order to prove now the remaining claim (2), let us restrict again the attention, as explained in (4), to the case [math]x\in[0,1)[/math]. First, it is clear that any expression of type [math]x=0.b_1b_2b_3\ldots[/math] defines a real number [math]x\in[0,1][/math], simply by declaring that the corresponding cut [math]\mathbb Q=\mathbb Q_{\leq x}\sqcup\mathbb Q_{ \gt x}[/math] comes from the following set, and its complement:
(7) Thus, we have our correspondence between real numbers as cuts, and real numbers as decimal expressions, and we are left with the question of investigating the bijectivity of this correspondence. But here, the only bug that happens is that numbers of type [math]x=\ldots b999\ldots[/math], which produce reals [math]x\in\mathbb R[/math] via (6), do not come from reals [math]x\in\mathbb R[/math] via (5). So, in order to finish our proof, we must investigate such numbers.
(8) So, consider an expression of type [math]\ldots b999\ldots\,[/math]. Going back to the construction in (6), we are led to the conclusion that we have the following equality:
Thus, at the level of the real numbers defined as cuts, we have:
But this solves our problem, because by identifying [math]\ldots b999\ldots=\ldots(b+1)000\ldots[/math] the bijectivity issue of our correspondence is fixed, and we are done.
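The digit-by-digit construction from step (5) of the proof can be sketched in code: at each stage, the next digit is the largest one keeping the approximation below [math]x[/math]. Exact rational arithmetic is used here, and the function name is just an illustration:

```python
from fractions import Fraction

def decimal_digits(x, n):
    """First n decimal digits of x in [0,1), given as a Fraction."""
    digits = []
    for _ in range(n):
        x *= 10
        b = int(x)      # the best digit b at this stage of the approximation
        digits.append(b)
        x -= b
    return digits

print(decimal_digits(Fraction(1, 7), 6))   # 1/7 = 0.142857...
```

Multiplying by 10 and taking the integer part is exactly the comparison with [math]0.b_1\ldots b_{k-1}0,\ldots,0.b_1\ldots b_{k-1}9[/math] described in the proof.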
The above theorem was of course quite difficult, but this is how things are. You might perhaps ask why bother with cuts, and not take [math]x=\pm a_1\ldots a_n.b_1b_2b_3\ldots[/math] as a definition for the real numbers. Well, this is certainly possible, but when it comes to summing such numbers, or making products, or proving basic things such as the existence of [math]\sqrt{2}[/math], things become fairly complicated with the decimal writing picture. So, all the above is not as stupid as it seems. And we will come back anyway to all this later, with a 3rd picture for the real numbers, involving scary things like [math]\varepsilon[/math] and [math]\delta[/math], and it will be up to you to decide, at that time, which picture is the one that you prefer.
Moving on, we made the claim in the beginning of this chapter that “in real life, real numbers are never rational”. Here is a theorem, justifying this claim:
The probability for a real number [math]x\in\mathbb R[/math] to be rational is [math]0[/math].
This is something quite tricky, the idea being as follows:
(1) Before starting, let us point out the fact that probability theory is something quite tricky, with probability 0 not necessarily meaning that the event cannot happen, but rather meaning that “better not count on that”. For instance according to my computations the probability of you winning [math]1[/math] billion at the lottery is 0, but you are of course free to disagree, and prove me wrong, by playing every day at the lottery.
(2) With this discussion made, and extrapolating now from finance and lottery to our question regarding real numbers, your possible argument of type “yes, but if I pick [math]x\in\mathbb R[/math] to be [math]x=3/2[/math], I have proof that the probability for [math]x\in\mathbb Q[/math] is nonzero” is therefore dismissed. Thus, our claim as stated makes sense, so let us try now to prove it.
(3) By translation, it is enough to prove that the probability for a real number [math]x\in[0,1][/math] to be rational is 0. For this purpose, let us write the rational numbers [math]r\in[0,1][/math] in the form of a sequence [math]r_1,r_2,r_3\ldots\,[/math], with this being possible say by ordering our rationals [math]r=a/b[/math] according to the lexicographic order on the pairs [math](a,b)[/math]:
Let us also pick a number [math]c \gt 0[/math]. The probability of having [math]x=r_1[/math] is certainly smaller than [math]c/2[/math], the probability of having [math]x=r_2[/math] is certainly smaller than [math]c/4[/math], the probability of having [math]x=r_3[/math] is certainly smaller than [math]c/8[/math], and so on. Thus, the probability for [math]x[/math] to be rational satisfies the following inequality:
Here we have used the well-known formula [math]\frac{1}{2}+\frac{1}{4}+\frac{1}{8}+\ldots=1[/math], which comes by dividing [math][0,1][/math] into half, and then one of the halves into half again, and so on, and then saying in the end that the pieces that we have must sum up to 1. Thus, we have indeed [math]P\leq c[/math], and since the number [math]c \gt 0[/math] was arbitrary, we obtain [math]P=0[/math], as desired.
As a comment here, all the above is of course quite tricky, and a bit borderline with respect to what can be called “rigorous mathematics”. But we will be back to this, namely general probability theory, and in particular the meaning of the mysterious formula [math]P=0[/math], countable sets, infinite sums and so on, on several occasions, throughout this book.
Moving ahead, let us now construct some more real numbers. We already know about [math]\sqrt{2}[/math] and other numbers of the same type, namely roots of polynomials, and since our knowledge here is quite decent, there is no hurry with this, and we will be back to it later. So, let us get now into [math]\pi[/math] and trigonometry. To start with, we have the following result:
The following two definitions of [math]\pi[/math] are equivalent:
- The length of the unit circle is [math]L=2\pi[/math].
- The area of the unit disk is [math]A=\pi[/math].
In order to prove this theorem let us cut the unit disk as a pizza, into [math]N[/math] slices, and forgetting about gastronomy, leave aside the rounded parts:
The area to be eaten can be then computed as follows, where [math]H[/math] is the height of the slices, [math]S[/math] is the length of their sides, and [math]P=NS[/math] is the total length of the sides:
Thus, with [math]N\to\infty[/math] we obtain that we have [math]A=L/2[/math], as desired.
In what regards now the precise value of [math]\pi[/math], the above picture at [math]N=6[/math] shows that we have [math]\pi \gt 3[/math], but not by much. The precise figure is [math]\pi=3.14159\ldots\,[/math], but we will come back to this later, once we have appropriate tools for dealing with such questions. It is also possible to prove that [math]\pi[/math] is irrational, [math]\pi\notin\mathbb Q[/math], but this is not trivial either.
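As a first taste of such numerics, here is a classical scheme, close in spirit to the pizza cutting above: inscribe regular polygons in the unit circle, and keep doubling the number of sides. Only square roots are needed, via the side-doubling formula [math]s'=\sqrt{2-\sqrt{4-s^2}}[/math], so no prior knowledge of [math]\pi[/math] is assumed. A sketch:

```python
from math import sqrt

s, n = 1.0, 6          # the inscribed regular hexagon has side length 1
for _ in range(10):
    s = sqrt(2 - sqrt(4 - s * s))   # side length after doubling the sides
    n *= 2

# half the perimeter of the inscribed n-gon approximates pi from below
print(n, n * s / 2)
```

At [math]n=6[/math] this gives the estimate [math]\pi\simeq3[/math] from the picture, and after 10 doublings, so with 6144 sides, it already agrees with [math]\pi=3.14159\ldots[/math] to several decimals.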
Let us end this discussion about real numbers with some trigonometry. There are many things that can be said, that you certainly know, the basics being as follows:
The following happen:
- We can talk about angles [math]x\in\mathbb R[/math], by using the unit circle, in the usual way, and in this correspondence, the right angle has a value of [math]\pi/2[/math].
- Associated to any [math]x\in\mathbb R[/math] are numbers [math]\sin x,\cos x\in\mathbb R[/math], constructed in the usual way, by using a triangle. These numbers satisfy [math]\sin^2x+\cos^2x=1[/math].
These are certainly things that you know, the idea being as follows:
(1) The formula [math]L=2\pi[/math] from Theorem 1.14 shows that the length of a quarter of the unit circle is [math]l=\pi/2[/math], and so the right angle has indeed this value, [math]\pi/2[/math].
(2) As for [math]\sin^2x+\cos^2x=1[/math], called Pythagoras' theorem, this comes from the following picture, consisting of two squares and four identical triangles, as indicated:
Indeed, when computing the area of the outer square, we obtain:
Now when expanding we obtain [math]\sin^2x+\cos^2x=1[/math], as claimed.
It is possible to say many more things about angles and [math]\sin x[/math], [math]\cos x[/math], and also to talk about some supplementary quantities, such as [math]\tan x=\sin x/\cos x[/math]. But more on this later, once we have some appropriate tools, beyond basic geometry, in order to discuss all this.
===1c. Sequences, convergence===
We already met, on several occasions, infinite sequences or sums, and their limits. Time now to clarify all this. Let us start with the following definition:
We say that a sequence [math]\{x_n\}_{n\in\mathbb N}\subset\mathbb R[/math] converges to [math]x\in\mathbb R[/math] when:
This might look quite scary, at a first glance, but when thinking a bit, there is nothing scary about it. Indeed, let us try to understand, how shall we translate [math]x_n\to x[/math] into mathematical language. The condition [math]x_n\to x[/math] tells us that “when [math]n[/math] is big, [math]x_n[/math] is close to [math]x[/math]”, and to be more precise, it tells us that “when [math]n[/math] is big enough, [math]x_n[/math] gets arbitrarily close to [math]x[/math]”. But [math]n[/math] big enough means [math]n\geq N[/math], for some [math]N\in\mathbb N[/math], and [math]x_n[/math] arbitrarily close to [math]x[/math] means [math]|x_n-x| \lt \varepsilon[/math], for some [math]\varepsilon \gt 0[/math]. Thus, we are led to the above definition.
As a basic example for all this, we have:
We have [math]1/n\to0[/math].
This is obvious, but let us prove it by using Definition 1.16. We have:
Thus we can take [math]N=[1/\varepsilon]+1[/math] in Definition 1.16, and we are done.
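In the spirit of Advice 1.3, the explicit recipe [math]N=[1/\varepsilon]+1[/math] from this proof can be checked numerically, say as follows:

```python
# the N produced by the proof of 1/n -> 0, for a given epsilon
def N_for(eps):
    return int(1 / eps) + 1

# check that |1/n - 0| < eps indeed holds for all n >= N (a sample of them)
for eps in [0.5, 0.1, 0.01]:
    N = N_for(eps)
    assert all(abs(1 / n - 0) < eps for n in range(N, N + 1000))
    print(eps, N)
```

The point of Definition 1.16 is precisely this: convergence is not a vague feeling, but an explicit challenge-response game, with [math]\varepsilon[/math] as the challenge, and [math]N[/math] as the response.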
There are many other examples, and more on this in a moment. Going ahead with more theory, let us complement Definition 1.16 with:
We write [math]x_n\to\infty[/math] when the following condition is satisfied:
Again, this is something very intuitive, coming from the fact that [math]x_n\to\infty[/math] can only mean that [math]x_n[/math] is arbitrarily big, for [math]n[/math] big enough. As a basic illustration, we have:
We have [math]n^2\to\infty[/math].
As before, this is obvious, but let us prove it using Definition 1.18. We have:
Thus we can take [math]N=[\sqrt{K}]+1[/math] in Definition 1.18, and we are done.
We can unify and generalize Proposition 1.17 and Proposition 1.19, as follows:
We have the following convergence, with [math]n\to\infty[/math]:
This follows indeed by using the same method as in the proofs of Proposition 1.17 and Proposition 1.19, first for [math]a[/math] rational, and then for [math]a[/math] real as well.
We have some general results about limits, summarized as follows:
The following happen:
- The limit [math]\lim_{n\to\infty}x_n[/math], if it exists, is unique.
- If [math]x_n\to x[/math], with [math]x\in(-\infty,\infty)[/math], then [math]x_n[/math] is bounded.
- If [math]x_n[/math] is increasing or decreasing, then it converges.
- Assuming [math]x_n\to x[/math], any subsequence of [math]x_n[/math] converges to [math]x[/math].
All this is elementary, coming from definitions:
(1) Assuming [math]x_n\to x[/math], [math]x_n\to y[/math] we have indeed, for any [math]\varepsilon \gt 0[/math], for [math]n[/math] big enough:
(2) Assuming [math]x_n\to x[/math], we have [math]|x_n-x| \lt 1[/math] for [math]n\geq N[/math], and so, for any [math]k\in\mathbb N[/math]:
(3) By using [math]x\to-x[/math], it is enough to prove the result for increasing sequences. But here we can construct the limit [math]x\in(-\infty,\infty][/math] in the following way:
(4) This is clear from definitions.
Here are as well some general rules for computing limits:
The following happen, with the conventions [math]\infty+\infty=\infty[/math], [math]\infty\cdot\infty=\infty[/math], [math]1/\infty=0[/math], and with the conventions that [math]\infty-\infty[/math] and [math]\infty\cdot0[/math] are undefined:
- [math]x_n\to x[/math] implies [math]\lambda x_n\to\lambda x[/math].
- [math]x_n\to x[/math], [math]y_n\to y[/math] implies [math]x_n+y_n\to x+y[/math].
- [math]x_n\to x[/math], [math]y_n\to y[/math] implies [math]x_ny_n\to xy[/math].
- [math]x_n\to x[/math] with [math]x\neq0[/math] implies [math]1/x_n\to 1/x[/math].
All this is again elementary, coming from definitions:
(1) This is something which is obvious from definitions.
(2) This follows indeed from the following estimate:
(3) This follows indeed from the following estimate:
(4) This is again clear, by estimating [math]1/x_n-1/x[/math], in the obvious way.
As an application of the above rules, we have the following useful result:
The [math]n\to\infty[/math] limits of quotients of polynomials are given by
The first assertion comes from the following computation:
As for the second assertion, this comes from Proposition 1.20.
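As a numeric illustration of this, here is a sketch on a hypothetical example, checking that the leading terms dominate:

```python
# (3n^2 + n) / (5n^2 + 7) should approach 3/5, the ratio of the
# leading coefficients, as n -> infinity
def q(n):
    return (3 * n ** 2 + n) / (5 * n ** 2 + 7)

for n in [10, 1000, 100000]:
    print(n, q(n))   # approaches 3/5 = 0.6
```

Dividing numerator and denominator by [math]n^2[/math], as in the computation above, makes it clear why only the leading coefficients survive in the limit.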
Getting back now to theory, some sequences which obviously do not converge, like for instance [math]x_n=(-1)^n[/math], have however “2 limits instead of 1”. So let us formulate:
Given a sequence [math]\{x_n\}_{n\in\mathbb N}\subset\mathbb R[/math], we let
Observe that the above quantities are defined indeed for any sequence [math]x_n[/math]. For instance, for [math]x_n=(-1)^n[/math] we obtain [math]-1[/math] and [math]1[/math]. Also, for [math]x_n=n[/math] we obtain [math]\infty[/math] and [math]\infty[/math]. And so on. Of course, and generalizing the [math]x_n=n[/math] example, if [math]x_n\to x[/math] we obtain [math]x[/math] and [math]x[/math].
Going ahead with more theory, here is a key result:
A sequence [math]x_n[/math] converges, with finite limit [math]x\in\mathbb R[/math], precisely when
In one sense, this is clear. In the other sense, we can say for instance that the Cauchy condition forces the decimal writings of our numbers [math]x_n[/math] to coincide more and more, with [math]n\to\infty[/math], and so we can construct a limit [math]x=\lim_{n\to\infty}x_n[/math], as desired.
The above result is quite interesting, and as an application, we have:
[math]\mathbb R[/math] is the completion of [math]\mathbb Q[/math], in the sense that it is the space of Cauchy sequences over [math]\mathbb Q[/math], identified when the virtual limit is the same, in the sense that:
Let us denote the completion operation by [math]X\to\bar{X}=C_X/\sim[/math], where [math]C_X[/math] is the space of Cauchy sequences over [math]X[/math], and [math]\sim[/math] is the above equivalence relation. Since by Theorem 1.25 any Cauchy sequence [math](x_n)\in C_\mathbb Q[/math] has a limit [math]x\in\mathbb R[/math], we obtain [math]\bar{\mathbb Q}=\mathbb R[/math]. As for the equality [math]\bar{\mathbb R}=\mathbb R[/math], this is clear again by using Theorem 1.25.
===1d. Series, the number e===
With the above understood, we are now ready to get into some truly interesting mathematics. Let us start with the following definition:
Given numbers [math]x_0,x_1,x_2,\ldots\in\mathbb R[/math], we write
As before with the sequences, there is some general theory that can be developed for the series, and more on this in a moment. As a first, basic example, we have:
We have the “geometric series” formula
Our first claim, which comes by multiplying and simplifying, is that:
But this proves the first assertion, because with [math]k\to\infty[/math] we get:
As for the second assertion, this is clear as well from our formula above.
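Numerically, the partial sums of the geometric series match the closed formula [math](1-x^{k+1})/(1-x)[/math] found in the proof, and approach [math]1/(1-x)[/math]; here is a quick check at [math]x=1/2[/math]:

```python
x = 0.5
for k in [5, 10, 20]:
    partial = sum(x ** n for n in range(k + 1))    # 1 + x + ... + x^k
    closed = (1 - x ** (k + 1)) / (1 - x)          # the closed formula
    assert abs(partial - closed) < 1e-12
    print(k, partial)   # approaches 1/(1-x) = 2
```

At [math]x=1/2[/math] this recovers in particular the formula [math]1+\frac{1}{2}+\frac{1}{4}+\frac{1}{8}+\ldots=2[/math], already met in the proof of Theorem 1.13.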
Less trivial now is the following result, due to Riemann:
We have the following formula:
We have to prove several things, the idea being as follows:
(1) The first assertion comes from the following computation:
(2) Regarding now the second assertion, we have divergence at [math]a=1[/math], by the first assertion, and so divergence at any [math]a\leq1[/math]. Thus, it remains to prove that at [math]a \gt 1[/math] the series converges. Let us first discuss the case [math]a=2[/math], which will prove the convergence at any [math]a\geq2[/math]. The trick here is as follows:
(3) It remains to prove that the series converges at [math]a\in(1,2)[/math], and here it is enough to deal with the case of the exponents [math]a=1+1/p[/math] with [math]p\in\mathbb N[/math]. We already know how to do this at [math]p=1[/math], and the proof at [math]p\in\mathbb N[/math] will be based on a similar trick. We have:
Let us compute, or rather estimate, the generic term of this series. By using the formula [math]a^p-b^p=(a-b)(a^{p-1}+a^{p-2}b+\ldots+ab^{p-2}+b^{p-1})[/math], we have:
We therefore obtain the following estimate for the Riemann sum:
Thus, we are done with the case [math]a=1+1/p[/math], which finishes the proof.
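Again following Advice 1.3, here is a numeric look at the two basic cases: the harmonic series at [math]a=1[/math], whose partial sums keep growing, and the series at [math]a=2[/math], whose partial sums stay bounded, below 2, in agreement with the trick in (2):

```python
# partial sums of the Riemann series sum 1/n^a
def partial(a, k):
    return sum(1 / n ** a for n in range(1, k + 1))

for k in [10, 1000, 100000]:
    print(k, partial(1, k), partial(2, k))
```

Beware that the [math]a=1[/math] sums grow only like [math]\log k[/math], so the divergence is extremely slow; this is a case where numerics alone, without the proof above, could easily mislead you.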
Here is another tricky result, this time about alternating sums:
We have the following convergence result:
Both the assertions follow from Theorem 1.29, as follows:
(1) We have the following computation, using the Riemann criterion at [math]a=2[/math]:
(2) We have the following formulae, coming from the Riemann criterion at [math]a=1[/math]:
Thus, both these series diverge. The point now is that, by using this, when rearranging terms in the alternating series in the statement, we can arrange for the partial sums to go arbitrarily high, or arbitrarily low, and we can obtain any [math]x\in[-\infty,\infty][/math] as limit.
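The rearrangement procedure from (2) can be sketched as a greedy algorithm: spend positive terms [math]1,1/3,1/5,\ldots[/math] while below the target, and negative terms [math]-1/2,-1/4,\ldots[/math] while above it. The target [math]3/2[/math] below is of course just an illustration, any [math]x\in\mathbb R[/math] works:

```python
target = 1.5
s, pos, neg = 0.0, 1, 2
for _ in range(100000):
    if s < target:
        s += 1.0 / pos   # next unused positive term, 1/pos with pos odd
        pos += 2
    else:
        s -= 1.0 / neg   # next unused negative term, -1/neg with neg even
        neg += 2

print(s)   # close to the target 1.5
```

Since both the positive part and the negative part have divergent sums, this procedure never runs out of terms, and after the first crossing the error stays bounded by the size of the last term used, so the rearranged series converges to the target.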
Back now to the general case, we first have the following statement:
The following hold, with the converses of [math](1)[/math] and [math](2)[/math] being wrong, and with [math](3)[/math] not holding when the assumption [math]x_n\geq0[/math] is removed:
- If [math]\sum_nx_n[/math] converges then [math]x_n\to0[/math].
- If [math]\sum_n|x_n|[/math] converges then [math]\sum_nx_n[/math] converges.
- If [math]\sum_nx_n[/math] converges, [math]x_n\geq0[/math] and [math]x_n/y_n\to1[/math] then [math]\sum_ny_n[/math] converges.
This is a mixture of trivial and non-trivial results, as follows:
(1) We know that [math]\sum_nx_n[/math] converges when [math]S_k=\sum_{n=0}^kx_n[/math] converges. Thus by Cauchy we have [math]x_k=S_k-S_{k-1}\to0[/math], and this gives the result. As for the simplest counterexample for the converse, this is [math]1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\ldots=\infty[/math], coming from Theorem 1.29.
(2) This follows again from the Cauchy criterion, by using:
As for the simplest counterexample for the converse, this is [math]1-\frac{1}{2}+\frac{1}{3}-\frac{1}{4}+\ldots \lt \infty[/math], coming from Theorem 1.30, coupled with [math]1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\ldots=\infty[/math] from (1).
(3) Again, the main assertion here is clear, coming from, for [math]n[/math] big:
In what regards now the failure of the result, when the assumption [math]x_n\geq0[/math] is removed, this is something quite tricky, the simplest counterexample being as follows:
To be more precise, we have [math]y_n/x_n\to1[/math], so [math]x_n/y_n\to1[/math] too, but according to the above-mentioned results from (1,2), modified a bit, [math]\sum_nx_n[/math] converges, while [math]\sum_ny_n[/math] diverges.
Summarizing, we have some useful positive results about series, which are however quite trivial, along with various counterexamples to their possible modifications, which are non-trivial. Staying positive, here are some more positive results:
The following happen, and in all cases, the situation where [math]c=1[/math] is indeterminate, in the sense that the series can converge or diverge:
- If [math]|x_{n+1}/x_n|\to c[/math], the series [math]\sum_nx_n[/math] converges if [math]c \lt 1[/math], and diverges if [math]c \gt 1[/math].
- If [math]\sqrt[n]{|x_n|}\to c[/math], the series [math]\sum_nx_n[/math] converges if [math]c \lt 1[/math], and diverges if [math]c \gt 1[/math].
- With [math]c=\limsup_{n\to\infty}\sqrt[n]{|x_n|}[/math], [math]\sum_nx_n[/math] converges if [math]c \lt 1[/math], and diverges if [math]c \gt 1[/math].
Again, this is a mixture of trivial and non-trivial results, as follows:
(1) Here the main assertions, regarding the cases [math]c \lt 1[/math] and [math]c \gt 1[/math], are both clear by comparing with the geometric series [math]\sum_nc^n[/math]. As for the case [math]c=1[/math], this is what happens for the Riemann series [math]\sum_n1/n^a[/math], so we can have both convergent and divergent series.
(2) Again, the main assertions, where [math]c \lt 1[/math] or [math]c \gt 1[/math], are clear by comparing with the geometric series [math]\sum_nc^n[/math], and the [math]c=1[/math] examples come from the Riemann series.
(3) Here the case [math]c \lt 1[/math] is dealt with as in (2), and the same goes for the examples at [math]c=1[/math]. As for the case [math]c \gt 1[/math], this is clear too, because here [math]x_n\to0[/math] fails.
Finally, generalizing the first assertion in Theorem 1.30, we have:
If [math]x_n\searrow0[/math] then [math]\sum_n(-1)^nx_n[/math] converges.
We have [math]\sum_n(-1)^nx_n=\sum_ky_k[/math], where:
But, by drawing for instance the numbers [math]x_i[/math] on the real line, we see that [math]y_k[/math] are positive numbers, and that [math]\sum_ky_k[/math] is the sum of lengths of certain disjoint intervals, included in the interval [math][0,x_0][/math]. Thus we have [math]\sum_ky_k\leq x_0[/math], and this gives the result.
All this was a bit theoretical, and as something more concrete now, we have:
We have the following convergence
This is something quite tricky, as follows:
(1) Our first claim is that the following sequence is increasing:
In order to prove this, we use the following arithmetic-geometric inequality:
In practice, this gives the following inequality:
Now by raising to the power [math]n+1[/math] we obtain, as desired:
(2) Normally we are left with proving that [math]x_n[/math] is bounded from above, but this is non-trivial, and we have to use a trick. Consider the following sequence:
We will prove that this sequence [math]y_n[/math] is decreasing, and together with the fact that we have [math]x_n/y_n\to1[/math], this will give the result. So, this will be our plan.
(3) In order to prove now that [math]y_n[/math] is decreasing, we use, a bit as before, the arithmetic-geometric inequality, this time with the [math]n+2[/math] numbers [math]a_1=\ldots=a_{n+1}=\frac{n}{n+1}[/math] and [math]a_{n+2}=1[/math]:
[math]
\sqrt[n+2]{\left(\frac{n}{n+1}\right)^{n+1}\cdot 1}\leq\frac{(n+1)\cdot\frac{n}{n+1}+1}{n+2}
[/math]
In practice, this gives the following inequality:
[math]
\sqrt[n+2]{\left(\frac{n}{n+1}\right)^{n+1}}\leq\frac{n+1}{n+2}
[/math]
Now by raising to the power [math]n+2[/math] we obtain from this:
[math]
\left(\frac{n}{n+1}\right)^{n+1}\leq\left(\frac{n+1}{n+2}\right)^{n+2}
[/math]
The point now is that we have the following inversion formulae:
[math]
\left(\frac{n}{n+1}\right)^{-1}=1+\frac{1}{n}\ ,\quad\left(\frac{n+1}{n+2}\right)^{-1}=1+\frac{1}{n+1}
[/math]
Thus by inverting the inequality that we found, we obtain, as desired:
[math]
\left(1+\frac{1}{n}\right)^{n+1}\geq\left(1+\frac{1}{n+1}\right)^{n+2}
[/math]
(4) But with this, we can now finish. Indeed, the sequence [math]x_n[/math] is increasing, the sequence [math]y_n[/math] is decreasing, and we have [math]x_n \lt y_n[/math], as well as:
[math]
\frac{y_n}{x_n}=1+\frac{1}{n}\to1
[/math]
Thus, both sequences [math]x_n,y_n[/math] converge to a certain number [math]e[/math], as desired.
(5) Finally, regarding the numerics for our limiting number [math]e[/math], we know from the above that we have [math]x_n \lt e \lt y_n[/math] for any [math]n\in\mathbb N[/math], which reads:
[math]
\left(1+\frac{1}{n}\right)^n \lt e \lt \left(1+\frac{1}{n}\right)^{n+1}
[/math]
Thus [math]e\in[2,3][/math], coming for instance from [math]e \gt x_1=2[/math] and [math]e \lt y_5=(6/5)^6 \lt 3[/math], and with a bit of patience, or a computer, we obtain [math]e=2.71828\ldots[/math] We will actually come back to this question later, with better methods.
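Speaking computers, here is a short Python sketch of ours, illustrating the above: [math]x_n[/math] increases, [math]y_n[/math] decreases, and the two sequences squeeze [math]e[/math] in between, with the sample points being arbitrary choices:

```python
import math

# Illustration of Theorem 1.34: x_n = (1+1/n)^n increases, y_n = (1+1/n)^(n+1)
# decreases, and e is squeezed in between, for every n.

x = lambda n: (1 + 1 / n) ** n
y = lambda n: (1 + 1 / n) ** (n + 1)

for n in range(1, 10):
    assert x(n) < x(n + 1) < math.e < y(n + 1) < y(n)

print(x(10**6))  # just below e = 2.71828...
print(y(10**6))  # just above e
```

The convergence here is quite slow, with the error behaving roughly like [math]1/n[/math], which is one reason for preferring the series formula discussed next.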
We should mention that there are many other ways of getting into [math]e[/math]. For instance it is possible to prove that we have the following formula, which is a bit more conceptual than the formula in Theorem 1.34, and also with the convergence being very quick:
[math]
e=\sum_{k=0}^\infty\frac{1}{k!}
[/math]
Importantly, all this is not the end of the story with [math]e[/math]. For instance, in relation with the first formula that we found, from Theorem 1.34, we have, more generally:
[math]
\left(1+\frac{x}{n}\right)^n\to e^x
[/math]
Also, in relation with the second formula, from above, we have, more generally:
[math]
e^x=\sum_{k=0}^\infty\frac{x^k}{k!}
[/math]
To be more precise, these latter two formulae are something that we know at [math]x=1[/math]. The case [math]x=0[/math] is trivial, the case [math]x=-1[/math] follows from the case [math]x=1[/math], via some simple manipulations, and with a bit more work, we can get these formulae for any [math]x\in\mathbb N[/math], and then for any [math]x\in\mathbb Z[/math]. However, the general case [math]x\in\mathbb R[/math] is quite tricky, requiring a good knowledge of the theory of real functions. And, good news, real functions will be what we will be doing in the remainder of this first part, in chapters 2-4 below.
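For a quick numerical comparison of these two formulae for [math]e^x[/math], here is a Python sketch of ours, with the truncation levels being arbitrary choices; it also shows the series converging much faster than the limit formula:

```python
import math

# Illustration: the two formulae for e^x, truncated. The series needs only
# ~50 terms for full precision, while (1+x/n)^n needs a huge n.

def exp_series(x, K=50):
    """Partial sum of sum_k x^k / k!, computed term by term."""
    term, total = 1.0, 0.0
    for k in range(K):
        total += term
        term *= x / (k + 1)
    return total

def exp_limit(x, n=10**7):
    """The approximation (1 + x/n)^n, converging slowly."""
    return (1 + x / n) ** n

for v in [-1.0, 0.0, 1.0, 2.5]:
    print(v, exp_series(v), exp_limit(v), math.exp(v))
```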
General references
Banica, Teo (2024). "Calculus and applications". arXiv:2401.00911 [math.CO].