<div class="d-none"><math>
\newcommand{\mathds}{\mathbb}</math></div>
===7a. Function spaces===
In this chapter we go back to the functions of one real variable, <math>f:\mathbb R\to\mathbb R</math> or <math>f:\mathbb R\to\mathbb C</math>, with some applications, by using our complex function technology. We will be mainly interested in constructing the Fourier transform, which is an operation <math>f\to\widehat{f}</math> on such functions, which can solve various questions. Some advertisement first:
'''Advertisement.''' There is only one serious tool in mathematics, and that is the Fourier transform.
Or at least, that's what some people say. And as fans of calculus, we should of course side with them. We will see for instance how, via Fourier transform, light can be decomposed into color components, distant galaxies can be observed, atoms can be distinguished from each other, the future of the universe can be predicted, and so on.


Before doing that, however, let us study the spaces that the functions <math>f:\mathbb R\to\mathbb C</math> can form. These functions can be continuous, differentiable, infinitely differentiable, and so on, but there are many more properties that these functions can have, that we will investigate now. This will lead to various spaces of functions <math>f:\mathbb R\to\mathbb C</math>, that can be used, among others, in order to well-define the Fourier transform operation <math>f\to\widehat{f}</math>.
Let us start with some well-known and useful inequalities, as follows:
{{proofcard|Theorem|theorem-1|Given two functions <math>f,g:\mathbb R\to\mathbb C</math> and an exponent <math>p\geq1</math>, we have
<math display="block">
\left(\int_\mathbb R|f+g|^p\right)^{1/p}\leq\left(\int_\mathbb R|f|^p\right)^{1/p}+\left(\int_\mathbb R|g|^p\right)^{1/p}
</math>
called Minkowski inequality. Also, assuming that <math>p,q\geq1</math> satisfy <math>1/p+1/q=1</math>, we have
<math display="block">
\int_\mathbb R|fg|\leq \left(\int_\mathbb R|f|^p\right)^{1/p}\left(\int_\mathbb R|g|^q\right)^{1/q}
</math>
called Hölder inequality. These inequalities hold as well for <math>\infty</math> values of the exponents.
|All this is very standard, the idea being as follows:
(1) As a first observation, at <math>p=2</math>, which is a special exponent, we have <math>q=2</math> as well, and the Minkowski and Hölder inequalities are as follows:
<math display="block">
\left(\int_\mathbb R|f+g|^2\right)^{1/2}\leq\left(\int_\mathbb R|f|^2\right)^{1/2}+\left(\int_\mathbb R|g|^2\right)^{1/2}
</math>
<math display="block">
\int_\mathbb R|fg|\leq \left(\int_\mathbb R|f|^2\right)^{1/2}\left(\int_\mathbb R|g|^2\right)^{1/2}
</math>
But the proof of the Hölder inequality, called Cauchy-Schwarz inequality in this case, is something elementary, coming from the fact that <math>I(t)=\int_\mathbb R|f+twg|^2</math> with <math>|w|=1</math> is a positive degree 2 polynomial in <math>t\in\mathbb R</math>, and so its discriminant must be negative or zero. As for the Minkowski inequality, this follows from Cauchy-Schwarz, by taking squares and simplifying.
(2) In general, let us first prove Hölder, in the case of finite exponents, <math>p,q\in(1,\infty)</math>. By linearity we can assume that <math>f,g</math> are normalized, in the following way:
<math display="block">
\int_\mathbb R|f|^p=\int_\mathbb R|g|^q=1
</math>
In this case, we want to prove that we have <math>\int_\mathbb R|fg|\leq1</math>. And for this purpose, we can use the Young inequality from chapter 3, which gives, for any <math>x\in\mathbb R</math>:
<math display="block">
|f(x)g(x)|\leq\frac{|f(x)|^p}{p}+\frac{|g(x)|^q}{q}
</math>
By integrating now over <math>x\in\mathbb R</math>, we obtain from this, as desired:
<math display="block">
\int_\mathbb R|fg|
\leq\int_\mathbb R\frac{|f|^p}{p}+\frac{|g|^q}{q}
=\frac{1}{p}+\frac{1}{q}
=1
</math>
Let us prove now Minkowski, again in the finite exponent case, <math>p\in(1,\infty)</math>. We have the following estimate, using the Hölder inequality, and the conjugate exponent:
<math display="block">
\begin{eqnarray*}
\int_\mathbb R|f+g|^p
&=&\int_\mathbb R|f+g|\cdot|f+g|^{p-1}\\
&\leq&\int_\mathbb R|f|\cdot|f+g|^{p-1}+\int_\mathbb R|g|\cdot|f+g|^{p-1}\\
&\leq&\left(\int_\mathbb R|f|^p\right)^{1/p}\left(\int_\mathbb R|f+g|^{(p-1)q}\right)^{1/q}\\
&+&\left(\int_\mathbb R|g|^p\right)^{1/p}\left(\int_\mathbb R|f+g|^{(p-1)q}\right)^{1/q}\\
&=&\left[\left(\int_\mathbb R|f|^p\right)^{1/p}+\left(\int_\mathbb R|g|^p\right)^{1/p}\right]\left(\int_\mathbb R|f+g|^p\right)^{1-1/p}
\end{eqnarray*}
</math>
Thus, since <math>(p-1)q=p</math>, by dividing both sides by <math>\left(\int_\mathbb R|f+g|^p\right)^{1-1/p}</math>, we are led to the Minkowski inequality in the statement.
(3) Finally, with the convention that <math>(\int_\mathbb R|f|^p)^{1/p}</math> takes as value at <math>p=\infty</math> the essential supremum of <math>f</math>, the Minkowski inequality holds as well at <math>p=\infty</math>, trivially:
<math display="block">
\sup\big|f+g\big|\leq\sup\big|f\big|+\sup\big|g\big|
</math>
The same goes for the Hölder inequality at <math>p=\infty,q=1</math>, which is simply:
<math display="block">
\int_\mathbb R|fg|\leq\sup\big|f\big|\times\int_\mathbb R|g|
</math>
And finally, the same goes for the Hölder inequality at <math>p=1,q=\infty</math>.}}
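These inequalities are easy to test numerically. Here is a quick Python check, using midpoint Riemann sums on a truncated domain; the helper `integral` and the test functions are choices of this illustration, not part of the theory:

```python
import math

# midpoint Riemann sum on [-L, L]; truncating the domain only shrinks the
# integrals, so the inequalities below still hold exactly for the sums
def integral(fun, L=10.0, n=20000):
    dx = 2 * L / n
    return sum(fun(-L + (k + 0.5) * dx) for k in range(n)) * dx

def p_norm(f, p):
    return integral(lambda x: abs(f(x)) ** p) ** (1 / p)

f = lambda x: math.exp(-x * x)
g = lambda x: 1.0 / (1 + x * x)
p, q = 3.0, 1.5   # conjugate exponents: 1/p + 1/q = 1

# Minkowski: ||f+g||_p <= ||f||_p + ||g||_p
mink_lhs = p_norm(lambda x: f(x) + g(x), p)
mink_rhs = p_norm(f, p) + p_norm(g, p)
assert mink_lhs <= mink_rhs

# Hölder: ||fg||_1 <= ||f||_p ||g||_q
hold_lhs = integral(lambda x: abs(f(x) * g(x)))
hold_rhs = p_norm(f, p) * p_norm(g, q)
assert hold_lhs <= hold_rhs
```

Running this with other integrable functions, or other conjugate exponents, provides similar checks.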
As a consequence of the above results, we can formulate:
{{proofcard|Theorem|theorem-2|Given an interval <math>I\subset\mathbb R</math> and an exponent <math>p\geq1</math>, the following space, with the convention that functions are identified up to equality almost everywhere,
<math display="block">
L^p(I)=\left\{f:I\to\mathbb C\Big|\int_I|f(x)|^pdx < \infty\right\}
</math>
is a vector space, and the following quantity,
<math display="block">
||f||_p=\left(\int_I|f(x)|^p\right)^{1/p}
</math>
is a norm on it, in the sense that it satisfies the usual conditions for a vector space norm. Moreover, <math>L^p(I)</math> is complete with respect to the distance <math>d(f,g)=||f-g||_p</math>.
|This basically follows from Theorem 7.2, the idea being as follows:
(1) Again, let us first see what happens at <math>p=2</math>. Here everything is standard from what we have in Theorem 7.2, and with the remark that the space <math>L^2(I)</math> that we obtain is more than just a normed vector space, because we have as well a scalar product, related to the norm by the formula <math>||f||_2=\sqrt{ < f,f > }</math>, constructed as follows:
<math display="block">
< f,g > =\int_If(x)\,\overline{g(x)}\,dx
</math>
(2) In the general case now, where <math>p\geq1</math> is still finite, but arbitrary, the proof is similar, basically coming from the Minkowski inequality from Theorem 7.2.
(3) Finally, the extension at <math>p=\infty</math> is clear too, coming from definitions, and with the various conventions made at the end of the proof of Theorem 7.2.}}
There are many more things that can be said about the above spaces <math>L^p(I)</math>, which are all good to know, and we refer here to any functional analysis book. Going ahead now with our study of functions <math>f:\mathbb R\to\mathbb C</math>, let us define an interesting operation on such functions, called convolution, which is useful for many purposes. Let us start with:
{{defncard|label=|id=|The convolution of two functions <math>f,g:\mathbb R\to\mathbb C</math> is the function
<math display="block">
(f*g)(x)=\int_\mathbb R f(x-y)g(y)dy
</math>
provided that the function <math>y\to f(x-y)g(y)</math> is indeed integrable, for any <math>x</math>.}}
There are many reasons for introducing this operation, that we will gradually discover, in what follows. As a basic example, let us take <math>g=\chi_{[0,1]}</math>. We have then:
<math display="block">
(f*g)(x)=\int_0^1f(x-y)dy
</math>
Thus, with this choice of <math>g</math>, the operation <math>f\to f*g</math> has some sort of “regularizing effect”, that can be useful for many purposes. We will be back to this, later.
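To see this regularizing effect concretely, here is a small Python sketch, a choice of illustration rather than part of the theory, convolving the discontinuous sign function with <math>g=\chi_{[0,1]}</math>:

```python
# smoothing effect of f -> f*g with g = χ_[0,1]: convolving the discontinuous
# sign function produces a continuous, piecewise linear function
def smooth(f, x, n=1000):
    # (f*g)(x) = ∫_0^1 f(x-y) dy, computed as a midpoint Riemann sum
    dy = 1.0 / n
    return sum(f(x - (k + 0.5) * dy) for k in range(n)) * dy

sign = lambda x: 1.0 if x > 0 else -1.0

# far from the jump, f is locally constant, so f*g agrees with f
assert abs(smooth(sign, 5.0) - 1.0) < 1e-9
assert abs(smooth(sign, -5.0) + 1.0) < 1e-9
# across the jump, f*g interpolates linearly: at x = 0.5 the value is about 0
assert abs(smooth(sign, 0.5)) < 1e-2
```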
Going ahead with more theory, let us try to understand when the convolution operation is well-defined. We have here the following basic result:
{{proofcard|Theorem|theorem-3|The convolution operation is well-defined on the space
<math display="block">
C_c(\mathbb R)=\left\{f\in C(\mathbb R)\Big|\,{\rm supp}(f)\ {\rm compact}\right\}
</math>
of continuous functions <math>f:\mathbb R\to\mathbb C</math> having compact support.
|We have several things to be proved, the idea being as follows:
(1) First we must show that given two functions <math>f,g\in C_c(\mathbb R)</math>, their convolution <math>f*g</math> is well-defined, as a function <math>f*g:\mathbb R\to\mathbb C</math>. But this follows from the following estimate, where <math>l</math> denotes the length of the compact subsets of <math>\mathbb R</math>:
<math display="block">
\begin{eqnarray*}
\int_\mathbb R|f(x-y)g(y)|dy
&=&\int_{supp(g)}|f(x-y)g(y)|dy\\
&\leq&\max|g|\int_{supp(g)}|f(x-y)|dy\\
&\leq&\max|g|\cdot l(supp(g))\cdot\max|f|\\
& < &\infty
\end{eqnarray*}
</math>
(2) Next, we must show that the function <math>f*g:\mathbb R\to\mathbb C</math> that we constructed is indeed continuous. But this follows from the following estimate, where <math>\omega_f(\varepsilon)=\sup_{|s-t|\leq\varepsilon}|f(s)-f(t)|</math> is the modulus of continuity of <math>f</math>, which satisfies <math>\omega_f(\varepsilon)\to0</math> with <math>\varepsilon\to0</math>, the function <math>f\in C_c(\mathbb R)</math> being uniformly continuous:
<math display="block">
\begin{eqnarray*}
|(f*g)(x+\varepsilon)-(f*g)(x)|
&=&\left|\int_\mathbb R f(x+\varepsilon-y)g(y)dy-\int_\mathbb R f(x-y)g(y)dy\right|\\
&=&\left|\int_\mathbb R\left(f(x+\varepsilon-y)-f(x-y)\right)g(y)dy\right|\\
&\leq&\int_\mathbb R|f(x+\varepsilon-y)-f(x-y)|\cdot|g(y)|dy\\
&\leq&\omega_f(\varepsilon)\int_\mathbb R|g|
\end{eqnarray*}
</math>
(3) Finally, we must show that the function <math>f*g\in C(\mathbb R)</math> that we constructed has indeed compact support. For this purpose, our claim is that we have:
<math display="block">
supp(f*g)\subset supp(f)+supp(g)
</math>
In order to prove this claim, observe that we have, by definition of <math>f*g</math>:
<math display="block">
\begin{eqnarray*}
(f*g)(x)
&=&\int_\mathbb R f(x-y)g(y)dy\\
&=&\int_{supp(g)}f(x-y)g(y)dy
\end{eqnarray*}
</math>
But this latter quantity being 0 for <math>x\notin supp(f)+supp(g)</math>, this gives the result.}}
Here are now a few remarkable properties of the convolution operation:
{{proofcard|Proposition|proposition-1|The following hold, for the functions in <math>C_c(\mathbb R)</math>:
<ul><li> <math>f*g=g*f</math>.
</li>
<li> <math>f*(g*h)=(f*g)*h</math>.
</li>
<li> <math>f*(\lambda g+\mu h)=\lambda f*g+\mu f*h</math>.
</li>
</ul>
|These formulae are all elementary, the idea being as follows:
(1) This follows from the following computation, with <math>y=x-t</math>:
<math display="block">
\begin{eqnarray*}
(f*g)(x)
&=&\int_\mathbb R f(x-y)g(y)dy\\
&=&\int_\mathbb R f(t)g(x-t)dt\\
&=&\int_\mathbb R g(x-t)f(t)dt\\
&=&(g*f)(x)
\end{eqnarray*}
</math>
(2) This is clear from definitions.
(3) Once again, this is clear from definitions.}}
In relation with derivatives, and with the “regularizing effect” of the convolution operation mentioned after Definition 7.4, we have the following result:
{{proofcard|Theorem|theorem-4|Given two functions <math>f,g\in C_c(\mathbb R)</math>, assuming that <math>g</math> is differentiable, then so is <math>f*g</math>, with derivative given by the following formula:
<math display="block">
(f*g)'=f*g'
</math>
More generally, given <math>f,g\in C_c(\mathbb R)</math>, and assuming that <math>g</math> is <math>k</math> times differentiable, then so is <math>f*g</math>, with <math>k</math>-th derivative given by <math>(f*g)^{(k)}=f*g^{(k)}</math>.
|In what regards the first assertion, with <math>y=x-t</math>, then <math>t=x-y</math>, we get:
<math display="block">
\begin{eqnarray*}
(f*g)'(x)
&=&\frac{d}{dx}\int_\mathbb R f(x-y)g(y)dy\\
&=&\frac{d}{dx}\int_\mathbb R f(t)g(x-t)dt\\
&=&\int_\mathbb R f(t)g'(x-t)dt\\
&=&\int_\mathbb R f(x-y)g'(y)dy\\
&=&(f*g')(x)
\end{eqnarray*}
</math>
As for the second assertion, this follows from the first one, by recurrence.}}
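The formula <math>(f*g)'=f*g'</math> can be verified numerically. The sketch below uses Gaussians instead of compactly supported functions, a harmless relaxation made for illustration purposes, and compares a finite-difference derivative of <math>f*g</math> with <math>f*g'</math>:

```python
import math

# midpoint Riemann sum on [-L, L]
def integral(fun, L=10.0, n=4000):
    dx = 2 * L / n
    return sum(fun(-L + (k + 0.5) * dx) for k in range(n)) * dx

# Gaussians stand in for compactly supported functions (an assumption of
# this illustration; their tails are negligible on [-L, L])
f = lambda x: math.exp(-x * x)
g = lambda x: math.exp(-x * x)
gp = lambda x: -2 * x * math.exp(-x * x)   # g'

def conv(a, b, x):
    # (a*b)(x) = ∫ a(x-y) b(y) dy
    return integral(lambda y: a(x - y) * b(y))

x0, h = 0.7, 1e-5
lhs = (conv(f, g, x0 + h) - conv(f, g, x0 - h)) / (2 * h)   # (f*g)'(x0)
rhs = conv(f, gp, x0)                                       # (f*g')(x0)
assert abs(lhs - rhs) < 1e-5
```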
Finally, getting beyond the compactly supported continuous functions, we have the following result, which is of particular theoretical importance:
{{proofcard|Theorem|theorem-5|The convolution operation is well-defined on <math>L^1(\mathbb R)</math>, and we have:
<math display="block">
||f*g||_1\leq||f||_1||g||_1
</math>
Thus, if <math>f\in L^1(\mathbb R)</math> and <math>g\in C_c^k(\mathbb R)</math>, then <math>f*g</math> is well-defined, and <math>f*g\in C^k(\mathbb R)</math>.
|In what regards the first assertion, this follows from the following computation, involving an intuitive manipulation of double integrals, namely the Fubini theorem, that we will use as such here, and that we will fully clarify later on, in this book:
<math display="block">
\begin{eqnarray*}
\int_\mathbb R|(f*g)(x)|dx
&\leq&\int_\mathbb R\int_\mathbb R|f(x-y)g(y)|dydx\\
&=&\int_\mathbb R\int_\mathbb R|f(x-y)g(y)|dxdy\\
&=&\int_\mathbb R|f(x)|dx\int_\mathbb R|g(y)|dy
\end{eqnarray*}
</math>
As for the second assertion, this follows from the first one, and from Theorem 7.7.}}
Summarizing, we have now some good knowledge of the various spaces that the functions <math>f:\mathbb R\to\mathbb C</math> can form, and we have as well an interesting regularization operation <math>f\to f*g</math> on such functions, that can be used for various purposes.
===7b. Fourier transform===
We discuss here the construction and main properties of the Fourier transform, which is the main tool in analysis, and even in mathematics in general. We first have:
{{defncard|label=|id=|Given <math>f\in L^1(\mathbb R)</math>, we define a function <math>\widehat{f}:\mathbb R\to\mathbb C</math> by
<math display="block">
\widehat{f}(\xi)=\int_\mathbb R e^{ix\xi}f(x)dx
</math>
and call it Fourier transform of <math>f</math>.}}
As a first observation, even if <math>f</math> is a real function, <math>\widehat{f}</math> is a complex function, which is not necessarily real. Observe also that <math>\widehat{f}</math> is well-defined, because <math>f\in L^1(\mathbb R)</math> and <math>|e^{ix\xi}|=1</math>. In fact, the condition <math>f\in L^1(\mathbb R)</math> is basically needed for constructing <math>\widehat{f}</math>, because:
<math display="block">
\widehat{f}(0)=\int_\mathbb Rf(x)dx
</math>
Generally speaking, the Fourier transform is there for helping with various computations, with the above formula <math>\widehat{f}(0)=\int f</math> being quite illustrative. Here are some basic properties of the Fourier transform, all providing some good motivations:
{{proofcard|Proposition|proposition-2|The Fourier transform has the following properties:
<ul><li> Linearity: <math>\widehat{f+g}=\widehat{f}+\widehat{g}</math>, <math>\widehat{\lambda f}=\lambda\widehat{f}</math>.
</li>
<li> Regularity: <math>\widehat{f}</math> is continuous and bounded.
</li>
<li> If <math>f</math> is even then <math>\widehat{f}</math> is even.
</li>
<li> If <math>f</math> is odd then <math>\widehat{f}</math> is odd.
</li>
</ul>
|These results are all elementary, as follows:
(1) The additivity formula is clear from definitions, as follows:
<math display="block">
\begin{eqnarray*}
\widehat{f+g}(\xi)
&=&\int_\mathbb R e^{ix\xi}(f+g)(x)dx\\
&=&\int_\mathbb R e^{ix\xi}f(x)dx+\int_\mathbb R e^{ix\xi}g(x)dx\\
&=&\widehat{f}(\xi)+\widehat{g}(\xi)
\end{eqnarray*}
</math>
As for the formula <math>\widehat{\lambda f}=\lambda\widehat{f}</math>, this is clear as well.
(2) The continuity of <math>\widehat{f}</math> follows indeed from:
<math display="block">
\begin{eqnarray*}
|\widehat{f}(\xi+\varepsilon)-\widehat{f}(\xi)|
&\leq&\int_\mathbb R\left|(e^{ix(\xi+\varepsilon)}-e^{ix\xi})f(x)\right|dx\\
&=&\int_\mathbb R\left|e^{ix\xi}(e^{ix\varepsilon}-1)f(x)\right|dx\\
&=&\int_\mathbb R|e^{ix\varepsilon}-1|\cdot|f(x)|dx
\end{eqnarray*}
</math>
Indeed, since we have <math>|e^{ix\varepsilon}-1|\leq2</math> and <math>e^{ix\varepsilon}\to1</math> pointwise, with <math>\varepsilon\to0</math>, the last integral goes to <math>0</math> with <math>\varepsilon\to0</math>, by dominated convergence. As for the boundedness of <math>\widehat{f}</math>, this is clear as well, from <math>|\widehat{f}(\xi)|\leq||f||_1</math>.
(3) This follows from the following computation, assuming that <math>f</math> is even:
<math display="block">
\begin{eqnarray*}
\widehat{f}(-\xi)
&=&\int_\mathbb R e^{-ix\xi}f(x)dx\\
&=&\int_\mathbb R e^{ix\xi}f(-x)dx\\
&=&\int_\mathbb R e^{ix\xi}f(x)dx\\
&=&\widehat{f}(\xi)
\end{eqnarray*}
</math>
(4) The proof here is similar to the proof of (3), by changing some signs.}}
We will be back to more theory in a moment, but let us explore now the examples. Here are some basic computations of Fourier transforms:
{{proofcard|Proposition|proposition-3|We have the following Fourier transform formulae,
<math display="block">
f=\chi_{[-a,a]}\implies\widehat{f}(\xi)=\frac{2\sin(a\xi)}{\xi}
</math>
<math display="block">
f=e^{-ax}\chi_{[0,\infty)}(x)\implies\widehat{f}(\xi)=\frac{1}{a-i\xi}
</math>
<math display="block">
f=e^{ax}\chi_{(-\infty,0]}(x)\implies\widehat{f}(\xi)=\frac{1}{a+i\xi}
</math>
<math display="block">
f=e^{-a|x|}\implies\widehat{f}(\xi)=\frac{2a}{a^2+\xi^2}
</math>
<math display="block">
f=sgn(x)e^{-a|x|}\implies\widehat{f}(\xi)=\frac{2i\xi}{a^2+\xi^2}
</math>
valid for any number <math>a > 0</math>.
|All this follows from some calculus, as follows:
(1) In what regards the first formula, assuming <math>f=\chi_{[-a,a]}</math>, we have, by using the fact that <math>\sin(x\xi)</math> is an odd function, whose integral vanishes on centered intervals:
<math display="block">
\begin{eqnarray*}
\widehat{f}(\xi)
&=&\int_{-a}^ae^{ix\xi}dx\\
&=&\int_{-a}^a\cos(x\xi)dx+i\int_{-a}^a\sin(x\xi)dx\\
&=&\int_{-a}^a\cos(x\xi)dx\\
&=&\left[\frac{\sin(x\xi)}{\xi}\right]_{-a}^a\\
&=&\frac{2\sin(a\xi)}{\xi}
\end{eqnarray*}
</math>
(2) With <math>f(x)=e^{-ax}\chi_{[0,\infty)}(x)</math>, the computation goes as follows:
<math display="block">
\begin{eqnarray*}
\widehat{f}(\xi)
&=&\int_0^\infty e^{ix\xi-ax}dx\\
&=&\int_0^\infty e^{(i\xi-a)x}dx\\
&=&\left[\frac{e^{(i\xi-a)x}}{i\xi-a}\right]_0^\infty\\
&=&\frac{1}{a-i\xi}
\end{eqnarray*}
</math>
(3) Regarding the third formula, this follows from the second one, by using the following fact, generalizing the parity computations from Proposition 7.10:
<math display="block">
F(x)=f(-x)\implies\widehat{F}(\xi)=\widehat{f}(-\xi)
</math>
(4) The last two formulae follow from the previous ones, by making sums and differences, and by using the linearity properties of the Fourier transform, from Proposition 7.10.}}
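These formulae can be verified numerically. Here is a Python check of the fourth one, for <math>f=e^{-a|x|}</math>, with the truncation bounds and step counts being choices of this illustration:

```python
import cmath, math

# \hat f(ξ) = ∫ e^{ixξ} f(x) dx, midpoint Riemann sum on [-L, L];
# L and n are chosen so that truncation and discretization errors are tiny
def fourier(f, xi, L=30.0, n=60000):
    dx = 2 * L / n
    return sum(cmath.exp(1j * x * xi) * f(x)
               for x in (-L + (k + 0.5) * dx for k in range(n))) * dx

a = 1.5
f = lambda x: math.exp(-a * abs(x))

# check \hat f(ξ) = 2a/(a² + ξ²) at a few points
for xi in (0.0, 1.0, 2.5):
    exact = 2 * a / (a * a + xi * xi)
    assert abs(fourier(f, xi) - exact) < 1e-3
```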
We will see many other examples, in what follows. Getting back now to theory, we have the following result, adding to the various general properties in Proposition 7.10, and providing more motivations for the Fourier transform:
{{proofcard|Proposition|proposition-4|Given <math>f,g\in L^1(\mathbb R)</math> we have <math>\widehat{f}g,f\widehat{g}\in L^1(\mathbb R)</math> and
<math display="block">
\int_\mathbb R f(\xi)\widehat{g}(\xi)d\xi=\int_\mathbb R\widehat{f}(x)g(x)dx
</math>
called “exchange of hat” formula.
|Regarding the fact that we have indeed <math>\widehat{f}g,f\widehat{g}\in L^1(\mathbb R)</math>, this comes from the fact that <math>\widehat{f},\widehat{g}</math> are bounded, by Proposition 7.10 (2). This given, we have:
<math display="block">
\int_\mathbb R f(\xi)\widehat{g}(\xi)d\xi=\int_\mathbb R\int_\mathbb R f(\xi)e^{ix\xi}g(x)dxd\xi
</math>
On the other hand, we have as well the following formula:
<math display="block">
\int_\mathbb R\widehat{f}(x)g(x)dx=\int_\mathbb R\int_\mathbb R e^{ix\xi}f(x)g(\xi)dxd\xi
</math>
Thus, with <math>x\leftrightarrow\xi</math>, we are led to the formula in the statement.}}
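The exchange of hat formula can be illustrated with the transforms computed in Proposition 7.11: for <math>f=\chi_{[-1,1]}</math> and <math>g=e^{-|x|}</math>, both sides can be evaluated numerically, and both come out equal, to <math>\pi</math> as it happens (the quadrature parameters below are choices of this sketch):

```python
import math

# midpoint Riemann sum on [a, b]
def integral(fun, a, b, n=100000):
    dx = (b - a) / n
    return sum(fun(a + (k + 0.5) * dx) for k in range(n)) * dx

# f = χ_[-1,1] with \hat f(x) = 2 sin(x)/x, and g = e^{-|x|} with
# \hat g(ξ) = 2/(1+ξ²), both taken from Proposition 7.11
lhs = integral(lambda xi: 2 / (1 + xi * xi), -1.0, 1.0)            # ∫ f ĝ
rhs = integral(lambda x: (2 * math.sin(x) / x) * math.exp(-abs(x)),
               -40.0, 40.0)                                        # ∫ f̂ g
assert abs(lhs - rhs) < 1e-4
assert abs(lhs - math.pi) < 1e-4   # both sides equal 4 arctan(1) = π
```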
As an important result now, showing the power of the Fourier transform, this transform converts the derivative into something very simple, namely multiplication by the variable:
{{proofcard|Theorem|theorem-6|Given <math>f:\mathbb R\to\mathbb C</math> such that <math>f,f'\in L^1(\mathbb R)</math>, we have:
<math display="block">
\widehat{f'}(\xi)=-i\xi\widehat{f}(\xi)
</math>
More generally, assuming <math>f,f',f'',\ldots,f^{(n)}\in L^1(\mathbb R)</math>, we have
<math display="block">
\widehat{f^{(k)}}(\xi)=(-i\xi)^k\widehat{f}(\xi)
</math>
for any <math>k=1,2,\ldots,n</math>.
|These results follow by doing a partial integration, as follows:
(1) Assuming that <math>f:\mathbb R\to\mathbb C</math> has compact support, we have indeed:
<math display="block">
\begin{eqnarray*}
\widehat{f'}(\xi)
&=&\int_\mathbb R e^{ix\xi}f'(x)dx\\
&=&-\int_\mathbb R i\xi e^{ix\xi}f(x)dx\\
&=&-i\xi\int_\mathbb R e^{ix\xi}f(x)dx\\
&=&-i\xi\widehat{f}(\xi)
\end{eqnarray*}
</math>
(2) Regarding the higher derivatives, the formula here follows by recurrence.}}
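Here is a numerical check of the formula <math>\widehat{f'}(\xi)=-i\xi\widehat{f}(\xi)</math>, for the Gaussian <math>f(x)=e^{-x^2}</math>, which is not compactly supported, but lies in <math>L^1(\mathbb R)</math> together with its derivative, so the theorem applies:

```python
import cmath, math

# \hat f(ξ) = ∫ e^{ixξ} f(x) dx, midpoint Riemann sum on [-L, L]
def fourier(f, xi, L=10.0, n=40000):
    dx = 2 * L / n
    return sum(cmath.exp(1j * x * xi) * f(x)
               for x in (-L + (k + 0.5) * dx for k in range(n))) * dx

f = lambda x: math.exp(-x * x)             # f, with f, f' in L^1
fp = lambda x: -2 * x * math.exp(-x * x)   # f'

# check \hat{f'}(ξ) = -iξ \hat f(ξ) at a few points
for xi in (0.5, 1.0, 2.0):
    assert abs(fourier(fp, xi) - (-1j * xi) * fourier(f, xi)) < 1e-6
```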
Importantly, we have a converse statement as well, as follows:
{{proofcard|Theorem|theorem-7|Assuming that <math>f\in L^1(\mathbb R)</math> is such that <math>F(x)=xf(x)</math> belongs to <math>L^1(\mathbb R)</math> too, the function <math>\widehat{f}</math> is differentiable, with derivative given by:
<math display="block">
(\widehat{f})'(\xi)=i\widehat{F}(\xi)
</math>
More generally, if <math>F_k(x)=x^kf(x)</math> belongs to <math>L^1(\mathbb R)</math>, for <math>k=0,1,\ldots,n</math>, we have
<math display="block">
(\widehat{f})^{(k)}(\xi)=i^k\widehat{F_k}(\xi)
</math>
for any <math>k=1,2,\ldots,n</math>.
|These results are both elementary, as follows:
(1) Regarding the first assertion, the computation here is as follows:
<math display="block">
\begin{eqnarray*}
(\widehat{f})'(\xi)
&=&\frac{d}{d\xi}\int_\mathbb R e^{ix\xi}f(x)dx\\
&=&\int_\mathbb R ixe^{ix\xi}f(x)dx\\
&=&i\int_\mathbb R e^{ix\xi}xf(x)dx\\
&=&i\widehat{F}(\xi)
\end{eqnarray*}
</math>
(2) As for the second assertion, this follows from the first one, by recurrence.}}
As a conclusion to all this, we are on a good way with our theory, and we have:
'''Conclusion.''' Modulo normalization factors, the Fourier transform converts the derivatives into multiplications by the variable, and vice versa.
And isn't this interesting, because computing derivatives is usually a difficult task. Here is now another useful result, of the same type, this time regarding convolutions:
{{proofcard|Theorem|theorem-8|Assuming <math>f,g\in L^1(\mathbb R)</math>, the following happens:
<math display="block">
\widehat{f*g}=\widehat{f}\cdot\widehat{g}
</math>
Moreover, under suitable assumptions, the formula <math>\widehat{fg}=\widehat{f}*\widehat{g}</math> holds too.
|This is something quite subtle, the idea being as follows:
(1) Regarding the first assertion, this is something elementary, as follows:
<math display="block">
\begin{eqnarray*}
\widehat{f*g}(\xi)
&=&\int_\mathbb R e^{ix\xi}(f*g)(x)dx\\
&=&\int_\mathbb R\int_\mathbb R e^{ix\xi}f(x-y)g(y)dxdy\\
&=&\int_\mathbb R e^{iy\xi}\left(\int_\mathbb Re^{i(x-y)\xi}f(x-y)dx\right)g(y)dy\\
&=&\int_\mathbb R e^{iy\xi}\left(\int_\mathbb Re^{it\xi}f(t)dt\right)g(y)dy\\
&=&\int_\mathbb R e^{iy\xi}\widehat{f}(\xi)g(y)dy\\
&=&\widehat{f}(\xi)\widehat{g}(\xi)
\end{eqnarray*}
</math>
(2) As for the second assertion, this is something more tricky, and we will be back to it later. In the meantime, here is however some sort of proof, not very honest:
<math display="block">
\begin{eqnarray*}
(\widehat{f}*\widehat{g})(\xi)
&=&\int_\mathbb R\widehat{f}(\xi-\eta)\widehat{g}(\eta)d\eta\\
&=&\int_\mathbb R\int_\mathbb R\int_\mathbb R e^{ix(\xi-\eta)}f(x)e^{iy\eta}g(y)dxdyd\eta\\
&=&\int_\mathbb R\int_\mathbb R\int_\mathbb R e^{ix\xi}e^{i(y-x)\eta}f(x)g(y)dxdyd\eta\\
&=&\int_\mathbb R e^{ix\xi}f(x)g(x)dx\\
&=&\widehat{fg}(\xi)
\end{eqnarray*}
</math>
To be more precise, the point here is that we can pass from the triple to the single integral by arguing that “we must have <math>x=y</math>”. We will be back to this later.}}
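The convolution formula <math>\widehat{f*g}=\widehat{f}\cdot\widehat{g}</math> can be checked numerically for two-sided exponentials, whose convolution has an elementary closed form, obtained by splitting the integral at the kinks; this closed form is an ingredient of this sketch, rather than of the theory:

```python
import cmath, math

a, b = 1.0, 2.0
f = lambda x: math.exp(-a * abs(x))
g = lambda x: math.exp(-b * abs(x))

# closed form for f*g, obtained by elementary integration (assumed here,
# for the sake of the sketch); at x = 0 it gives 2/(a+b), as it should
conv = lambda x: 2 * (b * math.exp(-a * abs(x))
                      - a * math.exp(-b * abs(x))) / (b * b - a * a)

# \hat h(ξ) = ∫ e^{ixξ} h(x) dx, midpoint Riemann sum on [-L, L]
def fourier(h, xi, L=30.0, n=60000):
    dx = 2 * L / n
    return sum(cmath.exp(1j * x * xi) * h(x)
               for x in (-L + (k + 0.5) * dx for k in range(n))) * dx

for xi in (0.0, 1.0, 3.0):
    lhs = fourier(conv, xi)
    rhs = fourier(f, xi) * fourier(g, xi)
    assert abs(lhs - rhs) < 1e-3
```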
As an updated conclusion to all this, we have, modulo a few bugs, to be fixed:
'''Conclusion.''' The Fourier transform converts the derivatives into multiplications by the variable, and convolutions into products, and vice versa.
We will see applications of this later, after developing some more general theory. So, let us develop now more theory for the Fourier transform. We first have:
{{proofcard|Theorem|theorem-9|Given <math>f\in L^1(\mathbb R)</math>, its Fourier transform satisfies
<math display="block">
\lim_{\xi\to\pm\infty}\widehat{f}(\xi)=0
</math>
called Riemann-Lebesgue property of <math>\widehat{f}</math>.
|This is something quite technical, as follows:
(1) Given a function <math>f:\mathbb R\to\mathbb C</math> and a number <math>y\in\mathbb R</math>, let us set:
<math display="block">
f_y(x)=f(x-y)
</math>
Our claim is then that if <math>f\in L^p(\mathbb R)</math>, then the following function is uniformly continuous, with respect to the usual <math>p</math>-norm on the right:
<math display="block">
\mathbb R\to L^p(\mathbb R)\quad,\quad y\to f_y
</math>
(2) In order to prove this, fix <math>\varepsilon > 0</math>. Since <math>f\in L^p(\mathbb R)</math>, we can find a continuous function <math>g:\mathbb R\to\mathbb C</math>, supported on some compact interval <math>[-K,K]</math>, such that:
<math display="block">
||f-g||_p < \varepsilon
</math>
Now since <math>g</math> is uniformly continuous, we can find <math>\delta\in(0,K)</math> such that:
<math display="block">
|s-t| < \delta\implies|g(s)-g(t)| < (3K)^{-1/p}\varepsilon
</math>
But this shows that we have the following estimate:
<math display="block">
\begin{eqnarray*}
||g_s-g_t||_p
&=&\left(\int_\mathbb R\big|g(x-s)-g(x-t)\big|^pdx\right)^{1/p}\\
& < &\left[(3K)^{-1}\varepsilon^p(2K+\delta)\right]^{1/p}\\
& < &\varepsilon
\end{eqnarray*}
</math>
By using now the formula <math>||f||_p=||f_s||_p</math>, which is clear, we obtain:
<math display="block">
\begin{eqnarray*}
||f_s-f_t||_p
&\leq&||f_s-g_s||_p+||g_s-g_t||_p+||g_t-f_t||_p\\
& < &\varepsilon+\varepsilon+\varepsilon\\
&=&3\varepsilon
\end{eqnarray*}
</math>
But this being true for any <math>|s-t| < \delta</math>, we have proved our claim.
(3) Let us prove now the Riemann-Lebesgue property of <math>\widehat{f}</math>, as formulated in the statement. By using <math>e^{\pi i}=-1</math>, and the change of variables <math>x\to x-\pi/\xi</math>, we have:
<math display="block">
\begin{eqnarray*}
\widehat{f}(\xi)
&=&\int_\mathbb Re^{ix\xi}f(x)dx\\
&=&-\int_\mathbb Re^{ix\xi}e^{\pi i}f(x)dx\\
&=&-\int_\mathbb Re^{i\xi(x+\pi/\xi)}f(x)dx\\
&=&-\int_\mathbb Re^{ix\xi}f\left(x-\frac{\pi}{\xi}\right)dx
\end{eqnarray*}
</math>
On the other hand, we have as well the following formula:
<math display="block">
\widehat{f}(\xi)=\int_\mathbb Re^{ix\xi}f(x)dx
</math>
Thus by summing, we obtain the following formula:
<math display="block">
2\widehat{f}(\xi)=\int_\mathbb Re^{ix\xi}\left(f(x)-f\left(x-\frac{\pi}{\xi}\right)\right)dx
</math>
But this gives the following estimate, with notations from (1):
<math display="block">
2|\widehat{f}(\xi)|\leq||f-f_{\pi/\xi}||_1
</math>
Since by (1) this goes to <math>0</math> with <math>\xi\to\pm\infty</math>, this gives the result.}}
Quite remarkably, and as a main result now regarding Fourier transforms, a function <math>f:\mathbb R\to\mathbb C</math> can be recovered from its Fourier transform <math>\widehat{f}:\mathbb R\to\mathbb C</math>, as follows:
{{proofcard|Theorem|theorem-10|Assuming <math>f,\widehat{f}\in L^1(\mathbb R)</math>, we have
<math display="block">
f(x)=\frac{1}{2\pi}\int_\mathbb R e^{-ix\xi}\widehat{f}(\xi)d\xi
</math>
almost everywhere, called Fourier inversion formula.
|This is something quite tricky, due to the fact that a direct attempt by double integration fails. Consider the following function, depending on a parameter <math>\lambda > 0</math>:
<math display="block">
\varphi_\lambda(x)=\int_\mathbb Re^{-ix\xi-\lambda|\xi|}d\xi
</math>
We have then the following computation:
<math display="block">
\begin{eqnarray*}
(f*\varphi_\lambda)(x)
&=&\int_\mathbb Rf(x-y)\varphi_\lambda(y)dy\\
&=&\int_\mathbb R\int_\mathbb Rf(x-y)e^{-iy\xi-\lambda|\xi|}d\xi dy\\
&=&\int_\mathbb Re^{-\lambda|\xi|}\left(\int_\mathbb Rf(x-y)e^{-iy\xi}dy\right)d\xi\\
&=&\int_\mathbb Re^{-\lambda|\xi|}e^{-ix\xi}\widehat{f}(\xi)d\xi
\end{eqnarray*}
</math>
By letting now <math>\lambda\to0</math>, we obtain from this the following formula:
<math display="block">
\lim_{\lambda\to0}(f*\varphi_\lambda)(x)=\int_\mathbb R e^{-ix\xi}\widehat{f}(\xi)d\xi
</math>
On the other hand, an explicit integration gives <math>\varphi_\lambda(x)=\frac{2\lambda}{\lambda^2+x^2}</math>, so that we have <math>\int_\mathbb R\varphi_\lambda=2\pi</math>, and so <math>\frac{1}{2\pi}\,\varphi_\lambda</math> is an approximate unit. Thus, by using Theorem 7.18, we obtain that, almost everywhere:
<math display="block">
\lim_{\lambda\to0}(f*\varphi_\lambda)(x)=2\pi f(x)
</math>
Thus, we are led to the conclusion in the statement.}}
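As a numerical sanity check, observe that with the convention <math>\widehat{f}(\xi)=\int_\mathbb Re^{ix\xi}f(x)dx</math> used in this chapter, the inversion integral comes with a normalization factor <math>1/2\pi</math>. The Python sketch below recovers <math>f(x)=e^{-|x|}</math> from its transform <math>\widehat{f}(\xi)=2/(1+\xi^2)</math>, computed in Proposition 7.11; the truncation bound and step count are choices of this illustration:

```python
import cmath, math

# inverse transform, with the 1/(2π) normalization matching the convention
# \hat f(ξ) = ∫ e^{ixξ} f(x) dx; midpoint Riemann sum on [-L, L]
def inverse(fhat, x, L=200.0, n=100000):
    dxi = 2 * L / n
    s = sum(cmath.exp(-1j * x * xi) * fhat(xi)
            for xi in (-L + (k + 0.5) * dxi for k in range(n)))
    return (s * dxi / (2 * math.pi)).real

fhat = lambda xi: 2 / (1 + xi * xi)   # transform of f(x) = e^{-|x|}

# the inversion recovers f, up to truncation error
for x in (0.0, 0.5, 2.0):
    assert abs(inverse(fhat, x) - math.exp(-abs(x))) < 1e-2
```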
There are many more things that can be said about Fourier transforms, a key result being the Plancherel formula, allowing us to talk about the Fourier transform over the space <math>L^2(\mathbb R)</math>. Also, we can talk about the Fourier transform over the space <math>\mathcal S</math> of functions all of whose derivatives are rapidly decreasing, called Schwartz space. It is beyond our scope here to discuss all this, and we recommend instead any good book on the subject.
===7c. Groups, extensions===
All the above theory concerns the Fourier transform over <math>\mathbb R</math>, but there are some other types of Fourier transforms too, the idea being that we have such a transform over any locally compact abelian group <math>G</math>. In order to discuss this, let us start with:
{{defncard|label=|id=|An abelian group is a set <math>G</math> with a multiplication operation
<math display="block">
(g,h)\to gh
</math>
which must satisfy the following conditions:
<ul><li> Commutativity: we have <math>gh=hg</math>, for any <math>g,h\in G</math>.
</li>
<li> Associativity: we have <math>(gh)k=g(hk)</math>, for any <math>g,h,k\in G</math>.
</li>
<li> Unit: there is an element <math>1\in G</math> such that <math>g1=g</math>, for any <math>g\in G</math>.
</li>
<li> Inverses: for any <math>g\in G</math> there is <math>g^{-1}\in G</math> such that <math>gg^{-1}=1</math>.
</li>
</ul>}}
There are many examples of abelian groups, and in order to understand what is going on, and to develop our Fourier transform theory, let us first restrict the attention to the finite group case. Here we have as basic examples the cyclic group <math>\mathbb Z_N</math>, or more generally the products of such cyclic groups, which are clearly both finite and abelian:
<math display="block">
G=\mathbb Z_{N_1}\times\ldots\times\mathbb Z_{N_k}
</math>
Our goal in what follows will be that of proving that these are in fact all the finite abelian groups, and also that a Fourier transform theory, called “discrete Fourier transform”, can be developed over such groups. Let us start our study with:
{{proofcard|Proposition|proposition-5|Given a finite abelian group <math>G</math>, the group morphisms
<math display="block">
\chi:G\to\mathbb T
</math>
with <math>\mathbb T</math> being the unit circle, called characters of <math>G</math>, form a finite abelian group <math>\widehat{G}</math>.
|There are several things to be proved here, the idea being as follows:
(1) Our first claim is that <math>\widehat{G}</math> is a group, with the pointwise multiplication, namely:
<math display="block">
(\chi\rho)(g)=\chi(g)\rho(g)
</math>
Indeed, if <math>\chi,\rho</math> are characters, so is <math>\chi\rho</math>, and so the multiplication is well-defined on <math>\widehat{G}</math>. Regarding the unit, this is the trivial character <math>1:G\to\mathbb T</math>, mapping <math>g\to1</math>, for any <math>g\in G</math>. Finally, we have inverses, with the inverse of <math>\chi:G\to\mathbb T</math> being its conjugate:
<math display="block">
\bar{\chi}:G\to\mathbb T\quad,\quad
g\to\overline{\chi(g)}
</math>
(2) Our next claim is that <math>\widehat{G}</math> is finite. Indeed, given a group element <math>g\in G</math>, we can talk about its order, which is the smallest integer <math>k\in\mathbb N</math> such that <math>g^k=1</math>. Now assuming that we have a character <math>\chi:G\to\mathbb T</math>, we have the following formula:
<math display="block">
\chi(g)^k=\chi(g^k)=\chi(1)=1
</math>
Thus <math>\chi(g)</math> must be one of the <math>k</math>-th roots of unity, and in particular there are finitely many choices for <math>\chi(g)</math>. Thus, there are finitely many choices for <math>\chi</math>, as desired.
(3) Finally, the fact that <math>\widehat{G}</math> is abelian follows from definitions, because the pointwise multiplication of functions, and in particular of characters, is commutative.}}
The above construction is quite interesting, and we have:
{{proofcard|Theorem|theorem-11|The character group operation <math>G\to\widehat{G}</math> for the finite abelian groups, called Pontrjagin duality, has the following properties:
<ul><li> The dual of a cyclic group is the group itself, <math>\widehat{\mathbb Z}_N=\mathbb Z_N</math>.
</li>
<li> The dual of a product is the product of duals, <math>\widehat{G\times H}=\widehat{G}\times\widehat{H}</math>.
</li>
<li> Any product of cyclic groups <math>G=\mathbb Z_{N_1}\times\ldots\times\mathbb Z_{N_k}</math> is self-dual, <math>G=\widehat{G}</math>.
</li>
</ul>
|We have several things to be proved, the idea being as follows:
(1) A character <math>\chi:\mathbb Z_N\to\mathbb T</math> is uniquely determined by its value <math>z=\chi(g)</math> on the standard generator <math>g\in\mathbb Z_N</math>. But this value must satisfy:
<math display="block">
z^N=1
</math>
Thus we must have <math>z\in\mathbb Z_N</math>, with the cyclic group <math>\mathbb Z_N</math> being regarded this time as being the group of <math>N</math>-th roots of unity. Now conversely, any <math>N</math>-th root of unity <math>z\in\mathbb Z_N</math> defines a character <math>\chi:\mathbb Z_N\to\mathbb T</math>, by setting, for any <math>r\in\mathbb N</math>:
<math display="block">
\chi(g^r)=z^r
</math>
Thus we have an identification <math>\widehat{\mathbb Z}_N=\mathbb Z_N</math>, as claimed.
(2) A character of a product of groups <math>\chi:G\times H\to\mathbb T</math> must satisfy:
<math display="block">
\chi(g,h)=\chi\left[(g,1)(1,h)\right]=\chi(g,1)\chi(1,h)
</math>
Thus <math>\chi</math> must appear as the product of its restrictions <math>\chi_{|G},\chi_{|H}</math>, which must be both characters, and this gives the identification in the statement.
(3) This follows from (1) and (2). Alternatively, any character <math>\chi:G\to\mathbb T</math> is uniquely determined by its values <math>\chi(g_1),\ldots,\chi(g_k)</math> on the standard generators of <math>\mathbb Z_{N_1},\ldots,\mathbb Z_{N_k}</math>, which must belong to <math>\mathbb Z_{N_1},\ldots,\mathbb Z_{N_k}\subset\mathbb T</math>, and this gives <math>\widehat{G}=G</math>, as claimed.}}
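To make the identification <math>\widehat{\mathbb Z}_N=\mathbb Z_N</math> concrete, here is a small numerical sketch in Python, with <math>N=6</math> being an arbitrary choice: it builds the <math>N</math> characters from the <math>N</math>-th roots of unity, and checks the morphism property.

```python
import cmath

# Characters of Z_N: chi_z(g^r) = z^r, with z an N-th root of unity.
# N = 6 is an arbitrary choice, for illustration.
N = 6
roots = [cmath.exp(2j * cmath.pi * k / N) for k in range(N)]

def character(z):
    # the character chi determined by chi(g) = z, as the list of values chi(g^r)
    return [z ** r for r in range(N)]

chars = [character(z) for z in roots]

# morphism property: chi(g^{r+s}) = chi(g^r) chi(g^s), indices mod N
for chi in chars:
    for r in range(N):
        for s in range(N):
            assert abs(chi[(r + s) % N] - chi[r] * chi[s]) < 1e-9

# exactly N characters, matching |Z_N| = N
assert len(chars) == N
```

The point is that a character is determined by one root of unity, so <math>\widehat{\mathbb Z}_N</math> has exactly <math>N</math> elements, as in the proof above.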
At a more advanced level now, we have the following result:
{{proofcard|Theorem|theorem-12|The finite abelian groups are the following groups,
<math display="block">
G=\mathbb Z_{N_1}\times\ldots\times\mathbb Z_{N_k}
</math>
and these groups are all self-dual, <math>G=\widehat{G}</math>.
|This is something quite tricky, the idea being as follows:
(1) In order to prove our result, assume that <math>G</math> is finite and abelian. For any prime number <math>p\in\mathbb N</math>, let us define <math>G_p\subset G</math> to be the subset of elements having as order a power of <math>p</math>. Equivalently, this subset <math>G_p\subset G</math> can be defined as follows:
<math display="block">
G_p=\left\{g\in G\Big|\exists k\in\mathbb N,g^{p^k}=1\right\}
</math>
(2) It is then routine to check, based on definitions, that each <math>G_p</math> is a subgroup. Our claim now is that we have a direct product decomposition as follows:
<math display="block">
G=\prod_pG_p
</math>
(3) Indeed, by using the fact that our group <math>G</math> is abelian, we have a morphism as follows, with the order of the factors when computing <math>\prod_pg_p</math> being irrelevant:
<math display="block">
\prod_pG_p\to G\quad,\quad (g_p)\to\prod_pg_p
</math>
Moreover, it is routine to check that this morphism is both injective and surjective, via some simple manipulations, so we have our group decomposition, as in (2).
(4) Thus, we are left with proving that each component <math>G_p</math> decomposes as a product of cyclic groups, having as orders powers of <math>p</math>, as follows:
<math display="block">
G_p=\mathbb Z_{p^{r_1}}\times\ldots\times\mathbb Z_{p^{r_s}}
</math>
But this is something that can be checked by recurrence on <math>|G_p|</math>, via some routine computations, and we are led to the conclusion in the statement.
(5) Finally, the fact that the finite abelian groups are self-dual, <math>G=\widehat{G}</math>, follows from the structure result that we just proved, and from Theorem 7.22 (3).}}
In relation now with Fourier analysis, the result is as follows:
{{proofcard|Theorem|theorem-13|Given a finite abelian group <math>G</math>, we have an isomorphism as follows, obtained by linearizing/delinearizing the characters,
<math display="block">
C^*(G)\simeq C(\widehat{G})
</math>
where <math>C^*(G)</math> is the algebra of functions <math>\varphi:G\to\mathbb C</math>, with convolution product, and <math>C(\widehat{G})</math> is the algebra of functions <math>\varphi:\widehat{G}\to\mathbb C</math>, with usual product.
|There are several things going on here, the idea being as follows:
(1) Given a finite abelian group <math>G</math>, we can talk about the complex algebra <math>C(G)</math> formed by the complex functions <math>\varphi:G\to\mathbb C</math>, with usual product, namely:
<math display="block">
(\varphi\psi)(g)=\varphi(g)\psi(g)
</math>
Observe that we have <math>C(G)\simeq\mathbb C^N</math> as an algebra, where <math>N=|G|</math>, with this being best seen via the basis of <math>C(G)</math> formed by the Dirac masses at the points of <math>G</math>:
<math display="block">
C(G)=\left\{\sum_{g\in G}\lambda_g\delta_g\Big|\lambda_g\in\mathbb C\right\}
</math>
(2) On the other hand, we can talk as well about the algebra <math>C^*(G)</math> formed by the same functions <math>\varphi:G\to\mathbb C</math>, but this time with the convolution product, namely:
<math display="block">
(\varphi*\psi)(g)=\sum_{h\in G}\varphi(gh^{-1})\psi(h)
</math>
Since we have <math>\delta_k*\delta_l=\delta_{kl}</math> for any <math>k,l\in G</math>, as you can easily check by using the above formula, the Dirac masses <math>\delta_g\in C^*(G)</math> behave like the group elements <math>g\in G</math>. Thus, we can view our algebra as follows, with multiplication given by <math>g\cdot h=gh</math>, and linearity:
<math display="block">
C^*(G)=\left\{\sum_{g\in G}\lambda_gg\Big|\lambda_g\in\mathbb C\right\}
</math>
(3) Now that we know what the statement is about, let us go for the proof. The first observation is that we have a morphism of algebras as follows:
<math display="block">
C^*(G)\to C(\widehat{G})\quad,\quad g\to\left[\chi\to\chi(g)\right]
</math>
Now since on both sides we have vector spaces of dimension <math>N=|G|</math>, it is enough to check that this morphism is injective. But this is best done via Theorem 7.23, which shows that the characters <math>\chi\in\widehat{G}</math> separate the points <math>g\in G</math>, as desired.}}
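The isomorphism in the last proof can be watched numerically: for <math>G=\mathbb Z_N</math>, evaluating a function on the characters is a discrete Fourier transform, which should turn the convolution product into the pointwise product. A short sketch, with <math>N=5</math> and the vectors <math>\varphi,\psi</math> being arbitrary test data:

```python
import cmath

# For G = Z_N, the map C*(G) -> C(G^) is phi -> (sum_g phi(g) chi(g))_chi,
# i.e. a discrete Fourier transform; it sends convolution to product.
N = 5
w = cmath.exp(2j * cmath.pi / N)

def dft(phi):
    # evaluations of phi on the characters chi_k(g) = w^{gk}
    return [sum(phi[g] * w ** (g * k) for g in range(N)) for k in range(N)]

def convolve(phi, psi):
    return [sum(phi[(g - h) % N] * psi[h] for h in range(N)) for g in range(N)]

phi = [1, 2, 0, -1, 3]
psi = [0, 1, 1, 2, -2]

lhs = dft(convolve(phi, psi))
rhs = [a * b for a, b in zip(dft(phi), dft(psi))]
assert all(abs(a - b) < 1e-9 for a, b in zip(lhs, rhs))
```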
We can feel that Theorem 7.24 is related to Fourier analysis, and we have:
\begin{fact}
The following happen, regarding the locally compact abelian groups:
<ul><li> What we did in the finite case, namely group characters, and construction and basic properties of the dual, can be extended to them.
</li>
<li> As basic examples of this, besides what we have in the finite case, and notably <math>\widehat{\mathbb Z}_N=\mathbb Z_N</math>, we have <math>\widehat{\mathbb Z}=\mathbb T</math>, <math>\widehat{\mathbb T}=\mathbb Z</math>, and also <math>\widehat{\mathbb R}=\mathbb R</math>.
</li>
<li> With some care for analytic aspects, <math>C^*(G)\simeq C(\widehat{G})</math> remains true in this setting, and in the case <math>G=\mathbb R</math>, this isomorphism is the Fourier transform.
</li>
</ul>
\end{fact}
Obviously, all this is a bit heavy, but you get the point: we have 3 types of Fourier analysis in life, namely the “standard” one that we previously learned in this chapter, corresponding to <math>G=\mathbb R</math>, then another one that we skipped, and that we encourage you to learn, called the “Fourier series” one, corresponding to <math>G=\mathbb Z,\mathbb T</math>, and finally the “discrete” one that we started to learn, over <math>G=\mathbb Z_N</math> and other finite abelian groups.
In practice, all this is a bit complicated, and back now to the finite abelian groups, let us work out a softer version of all the above, which is what is really needed, in practice, when doing discrete Fourier analysis. For <math>G=\mathbb Z_N</math>, what we need is:
{{defncard|label=|id=|The Fourier matrix <math>F_N</math> is the following matrix, with <math>w=e^{2\pi i/N}</math>:
<math display="block">
F_N=
\begin{pmatrix}
1&1&1&\ldots&1\\
1&w&w^2&\ldots&w^{N-1}\\
1&w^2&w^4&\ldots&w^{2(N-1)}\\
\vdots&\vdots&\vdots&&\vdots\\
1&w^{N-1}&w^{2(N-1)}&\ldots&w^{(N-1)^2}
\end{pmatrix}
</math>
That is, <math>F_N=(w^{ij})_{ij}</math>, with indices <math>i,j\in\{0,1,\ldots,N-1\}</math>, taken modulo <math>N</math>.}}
Observe that this matrix is Hadamard, in the sense that its entries are on the unit circle, and the rows are pairwise orthogonal. In fact, in general, we have:
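This Hadamard property can be verified directly, for instance as follows, with <math>N=4</math> being an arbitrary choice:

```python
import cmath

# The Fourier matrix F_N = (w^{ij}), with w = e^{2 pi i / N}: entries on
# the unit circle, and pairwise orthogonal rows, <R_i, R_j> = N delta_ij.
N = 4
w = cmath.exp(2j * cmath.pi / N)
F = [[w ** (i * j) for j in range(N)] for i in range(N)]

for row in F:
    for entry in row:
        assert abs(abs(entry) - 1) < 1e-9  # entries on the unit circle

for i in range(N):
    for j in range(N):
        dot = sum(F[i][k] * F[j][k].conjugate() for k in range(N))
        assert abs(dot - (N if i == j else 0)) < 1e-9
```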
{{proofcard|Theorem|theorem-14|Given a finite abelian group <math>G</math>, with dual group <math>\widehat{G}=\{\chi:G\to\mathbb T\}</math>, consider the corresponding Fourier coupling, namely:
<math display="block">
\mathcal F_G:G\times\widehat{G}\to\mathbb T\quad,\quad
(i,\chi)\to\chi(i)
</math>
<ul><li> Via the standard isomorphism <math>G\simeq\widehat{G}</math>, this Fourier coupling can be regarded as a square matrix, <math>F_G\in M_G(\mathbb T)</math>, which is a complex Hadamard matrix.
</li>
<li> In the case of the cyclic group <math>G=\mathbb Z_N</math> we obtain in this way, via the standard identification <math>\mathbb Z_N=\{1,\ldots,N\}</math>, the Fourier matrix <math>F_N</math>.
</li>
<li> In general, when using a decomposition <math>G=\mathbb Z_{N_1}\times\ldots\times\mathbb Z_{N_k}</math>, the corresponding Fourier matrix is given by <math>F_G=F_{N_1}\otimes\ldots\otimes F_{N_k}</math>.
</li>
</ul>
|This follows indeed by using the above finite abelian group theory:
(1) With the identification <math>G\simeq\widehat{G}</math> made, our matrix is given by <math>(F_G)_{i\chi}=\chi(i)</math>, and the scalar products between the rows are computed as follows:
<math display="block">
< R_i,R_j >
=\sum_\chi\chi(i)\overline{\chi(j)}
=\sum_\chi\chi(i-j)
=|G|\cdot\delta_{ij}
</math>
Thus, we obtain indeed a complex Hadamard matrix.
(2) This follows from the well-known and elementary fact that, via the identifications <math>\mathbb Z_N=\widehat{\mathbb Z_N}=\{1,\ldots,N\}</math>, the Fourier coupling here is as follows, with <math>w=e^{2\pi i/N}</math>:
<math display="block">
(i,j)\to w^{ij}
</math>
(3) We use here the following formula that we know, for the duals of products:
<math display="block">
\widehat{H\times K}=\widehat{H}\times\widehat{K}
</math>
At the level of the corresponding Fourier couplings, we obtain from this:
<math display="block">
F_{H\times K}=F_H\otimes F_K
</math>
Now by decomposing <math>G</math> into cyclic groups, as in the statement, and by using (2) for the cyclic components, we obtain the formula in the statement.}}
As a nice application of discrete Fourier analysis, we have:
{{proofcard|Theorem|theorem-15|For a matrix <math>M\in M_N(\mathbb C)</math>, the following are equivalent:
<ul><li> <math>M</math> is circulant, <math>M_{ij}=\xi_{j-i}</math>, for a certain vector <math>\xi\in\mathbb C^N</math>.
</li>
<li> <math>M</math> is Fourier-diagonal, <math>M=F_NQF_N^*</math>, for a certain diagonal matrix <math>Q</math>.
</li>
</ul>
Moreover, if these conditions hold, then <math>\xi=F_N^*q</math>, where <math>q=(Q_{11},\ldots,Q_{NN})</math>.
|This follows from some computations with roots of unity, as follows:
<math>(1)\implies(2)</math> Assuming <math>M_{ij}=\xi_{j-i}</math>, the matrix <math>Q=F_N^*MF_N</math> is indeed diagonal, as shown by the following computation:
<math display="block">
\begin{eqnarray*}
Q_{ij}
&=&\sum_{kl}w^{jl-ik}\xi_{l-k}\\
&=&\sum_{kr}w^{j(k+r)-ik}\xi_r\\
&=&N\delta_{ij}\sum_rw^{jr}\xi_r
\end{eqnarray*}
</math>
<math>(2)\implies(1)</math> Assuming <math>Q=diag(q_1,\ldots,q_N)</math>, the matrix <math>M=F_NQF_N^*</math> is indeed circulant, as shown by the following computation:
<math display="block">
M_{ij}
=\sum_kw^{ik}Q_{kk}w^{-jk}
=\sum_kw^{(i-j)k}q_k
</math>
Indeed, since the last term depends only on <math>j-i</math>, we have <math>M_{ij}=\xi_{j-i}</math>, with <math>\xi_i=\sum_kw^{-ik}q_k=(F_N^*q)_i</math>. Thus, we are led to the conclusions in the statement.}}
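Here is a numerical illustration of this equivalence, checking that conjugating an arbitrary circulant matrix by the Fourier matrix produces a diagonal matrix. The unnormalized <math>F_N</math> only contributes scalar factors, which do not affect diagonality, and the vector <math>\xi</math> below is an arbitrary test vector:

```python
import cmath

# Circulant matrices are Fourier-diagonal: for M_{ij} = xi_{j-i}, the
# conjugated matrix F_N^* M F_N has vanishing off-diagonal entries.
N = 4
w = cmath.exp(2j * cmath.pi / N)
F = [[w ** (i * j) for j in range(N)] for i in range(N)]
Fs = [[(w ** (i * j)).conjugate() for j in range(N)] for i in range(N)]

xi = [1, 2, -1, 0.5]
M = [[xi[(j - i) % N] for j in range(N)] for i in range(N)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

Q = matmul(matmul(Fs, M), F)
for i in range(N):
    for j in range(N):
        if i != j:
            assert abs(Q[i][j]) < 1e-8  # off-diagonal entries vanish
```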
As an illustration for the above result, the all-one matrix diagonalizes as follows:
<math display="block">
\begin{pmatrix}
1&\ldots&\ldots&1\\
\vdots&&&\vdots\\
\vdots&&&\vdots\\
1&\ldots&\ldots&1\end{pmatrix}
=\frac{1}{N}\,F_N
\begin{pmatrix}
N\\
&0\\
&&\ddots\\
&&&0\end{pmatrix}F_N^*
</math>
But you might know this already, since this was a recurrent exercise, in this book.
===7d. Independence, limits===
With the above theory in hand, let us go back to probability. Our claim is that things are very interesting here, with the real-life notion of independence corresponding to our mathematical notion of convolution, and with the Fourier transform being something very useful in probability, in order to understand independence. Let us start with:
{{defncard|label=|id=|Two variables <math>f,g\in L^\infty(X)</math> are called independent when
<math display="block">
E(f^kg^l)=E(f^k)\, E(g^l)
</math>
happens, for any <math>k,l\in\mathbb N</math>.}}
This definition hides some non-trivial things. Indeed, by linearity, we would like to have a formula as follows, valid for any polynomials <math>P,Q\in\mathbb R[X]</math>:
<math display="block">
E[P(f)Q(g)]=E[P(f)]\,E[Q(g)]
</math>
By a continuity argument, it is enough to have this formula for the characteristic functions <math>\chi_I,\chi_J</math> of the arbitrary measurable sets of real numbers <math>I,J\subset\mathbb R</math>:
<math display="block">
E[\chi_I(f)\chi_J(g)]=E[\chi_I(f)]\,E[\chi_J(g)]
</math>
Thus, we are led to the usual definition of independence, namely:
<math display="block">
P(f\in I,g\in J)=P(f\in I)\,P(g\in J)
</math>
All this might seem a bit abstract, but in practice, the idea is of course that <math>f,g</math> must be independent, in an intuitive, real-life sense. As a first result now, we have:
{{proofcard|Theorem|theorem-16|Assuming that <math>f,g\in L^\infty(X)</math> are independent, we have
<math display="block">
\mu_{f+g}=\mu_f*\mu_g
</math>
where <math>*</math> is the convolution of real probability measures.
|We have the following computation, using the independence of <math>f,g</math>:
<math display="block">
\begin{eqnarray*}
M_k(f+g)
&=&E((f+g)^k)\\
&=&\sum_r\binom{k}{r}E(f^rg^{k-r})\\
&=&\sum_r\binom{k}{r}M_r(f)M_{k-r}(g)
\end{eqnarray*}
</math>
On the other hand, we have as well the following computation:
<math display="block">
\begin{eqnarray*}
\int_\mathbb Rx^kd(\mu_f*\mu_g)(x)
&=&\int_{\mathbb R\times\mathbb R}(x+y)^kd\mu_f(x)d\mu_g(y)\\
&=&\sum_r\binom{k}{r}\int_\mathbb Rx^rd\mu_f(x)\int_\mathbb Ry^{k-r}d\mu_g(y)\\
&=&\sum_r\binom{k}{r}M_r(f)M_{k-r}(g)
\end{eqnarray*}
</math>
Thus <math>\mu_{f+g}</math> and <math>\mu_f*\mu_g</math> have the same moments, and so they coincide, as claimed.}}
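As a down-to-earth illustration, take <math>f,g</math> to be two independent fair dice. The moments of <math>f+g</math>, computed from the convolution <math>\mu_f*\mu_g</math>, must then satisfy the binomial formula from the above proof. A quick exact check, in Python:

```python
from math import comb
from fractions import Fraction

# Two independent fair dice: check M_k(f+g) = sum_r C(k,r) M_r(f) M_{k-r}(g),
# with the law of f+g computed as the convolution of the two dice laws.
die = {k: Fraction(1, 6) for k in range(1, 7)}

def moment(mu, k):
    return sum(p * x ** k for x, p in mu.items())

conv = {}
for a in die:
    for b in die:
        conv[a + b] = conv.get(a + b, Fraction(0)) + die[a] * die[b]

for k in range(6):
    lhs = moment(conv, k)
    rhs = sum(comb(k, r) * moment(die, r) * moment(die, k - r)
              for r in range(k + 1))
    assert lhs == rhs  # exact equality, via rational arithmetic
```

In particular, at <math>k=1</math>, this recovers the well-known fact that the sum of two dice has mean <math>7</math>.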
Here is now a second result on independence, which is something more advanced:
{{proofcard|Theorem|theorem-17|Assuming that <math>f,g\in L^\infty(X)</math> are independent, we have
<math display="block">
F_{f+g}=F_fF_g
</math>
where <math>F_f(x)=E(e^{ixf})</math> is the Fourier transform.
|We have the following computation, by using Theorem 7.30:
<math display="block">
\begin{eqnarray*}
F_{f+g}(x)
&=&\int_\mathbb Re^{ixz}d\mu_{f+g}(z)\\
&=&\int_\mathbb Re^{ixz}d(\mu_f*\mu_g)(z)\\
&=&\int_{\mathbb R\times\mathbb R}e^{ix(z+t)}d\mu_f(z)d\mu_g(t)\\
&=&\int_\mathbb Re^{ixz}d\mu_f(z)\int_\mathbb Re^{ixt}d\mu_g(t)\\
&=&F_f(x)F_g(x)
\end{eqnarray*}
</math>
Thus, we are led to the conclusion in the statement.}}
In order to work out some basic illustrations, let us go back to the Poisson laws <math>p_t</math>, from chapter 4. We can now say more about them, first with the following result:
{{proofcard|Theorem|theorem-18|We have the following formula, for any <math>s,t > 0</math>,
<math display="block">
p_s*p_t=p_{s+t}
</math>
so the Poisson laws form a convolution semigroup.
|By using <math>\delta_k*\delta_l=\delta_{k+l}</math> and the binomial formula, we obtain:
<math display="block">
\begin{eqnarray*}
p_s*p_t
&=&e^{-s}\sum_k\frac{s^k}{k!}\,\delta_k*e^{-t}\sum_l\frac{t^l}{l!}\,\delta_l\\
&=&e^{-s-t}\sum_n\delta_n\sum_{k+l=n}\frac{s^kt^l}{k!l!}\\
&=&e^{-s-t}\sum_n\frac{\delta_n}{n!}\sum_{k+l=n}\frac{n!}{k!l!}s^kt^l\\
&=&e^{-s-t}\sum_n\frac{(s+t)^n}{n!}\,\delta_n\\
&=&p_{s+t}
\end{eqnarray*}
</math>
Thus, we are led to the conclusion in the statement.}}
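This semigroup property can be checked numerically on the Poisson weights themselves, the underlying identity being the binomial formula, as in the proof. The parameters <math>s=1.5</math>, <math>t=2</math> are arbitrary choices:

```python
from math import exp, factorial

# p_s * p_t = p_{s+t}, checked weight by weight on the first few points.
def poisson(t, n):
    # the weight of p_t at the point n
    return exp(-t) * t ** n / factorial(n)

s, t = 1.5, 2.0
for n in range(10):
    conv_weight = sum(poisson(s, k) * poisson(t, n - k) for k in range(n + 1))
    assert abs(conv_weight - poisson(s + t, n)) < 1e-12
```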
Next in line, we have the following result, which is fundamental as well:
{{proofcard|Theorem|theorem-19|The Poisson laws appear as formal exponentials
<math display="block">
p_t=\sum_k\frac{t^k(\delta_1-\delta_0)^{*k}}{k!}
</math>
with respect to the convolution of measures <math>*</math>.
|By using the binomial formula, the measure on the right is:
<math display="block">
\begin{eqnarray*}
\mu
&=&\sum_k\frac{t^k}{k!}\sum_{r+s=k}(-1)^s\frac{k!}{r!s!}\delta_r\\
&=&\sum_kt^k\sum_{r+s=k}(-1)^s\frac{\delta_r}{r!s!}\\
&=&\sum_r\frac{t^r\delta_r}{r!}\sum_s\frac{(-t)^s}{s!}\\
&=&e^{-t}\sum_r\frac{t^r\delta_r}{r!}\\
&=&p_t
\end{eqnarray*}
</math>
Thus, we are led to the conclusion in the statement.}}
Regarding now the Fourier transform computation, this is as follows:
{{proofcard|Theorem|theorem-20|The Fourier transform of <math>p_t</math> is given by
<math display="block">
F_{p_t}(y)=\exp\left((e^{iy}-1)t\right)
</math>
for any <math>t > 0</math>.
|We have indeed the following computation:
<math display="block">
\begin{eqnarray*}
F_{p_t}(y)
&=&e^{-t}\sum_k\frac{t^k}{k!}F_{\delta_k}(y)\\
&=&e^{-t}\sum_k\frac{t^k}{k!}\,e^{iky}\\
&=&e^{-t}\sum_k\frac{(e^{iy}t)^k}{k!}\\
&=&\exp(-t)\exp(e^{iy}t)\\
&=&\exp\left((e^{iy}-1)t\right)
\end{eqnarray*}
</math>
Thus, we obtain the formula in the statement.}}
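Here is a quick numerical confirmation of this formula, with the expectation computed as a truncated series over the Poisson weights, and with <math>t=2</math>, <math>y=0.7</math> being arbitrary choices:

```python
import cmath
from math import exp, factorial

# F_{p_t}(y) = sum_k e^{-t} t^k/k! e^{iky}, against the closed formula.
t, y = 2.0, 0.7
series = sum(exp(-t) * t ** k / factorial(k) * cmath.exp(1j * k * y)
             for k in range(60))  # tail beyond k = 60 is negligible for t = 2
closed = cmath.exp((cmath.exp(1j * y) - 1) * t)
assert abs(series - closed) < 1e-12
```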
Observe that the above formula gives an alternative proof for Theorem 7.32, by using the fact that the logarithm of the Fourier transform linearizes the convolution. As another application, we can now establish the Poisson Limit Theorem, as follows:
{{proofcard|Theorem (PLT)|theorem-21|We have the following convergence, in moments,
<math display="block">
\left(\left(1-\frac{t}{n}\right)\delta_0+\frac{t}{n}\,\delta_1\right)^{*n}\to p_t
</math>
for any <math>t > 0</math>.
|Let us denote by <math>\nu_n</math> the measure under the convolution sign, namely:
<math display="block">
\nu_n=\left(1-\frac{t}{n}\right)\delta_0+\frac{t}{n}\,\delta_1
</math>
We have the following computation, for the Fourier transform of the limit:
<math display="block">
\begin{eqnarray*}
F_{\delta_r}(y)=e^{iry}
&\implies&F_{\nu_n}(y)=\left(1-\frac{t}{n}\right)+\frac{t}{n}\,e^{iy}\\
&\implies&F_{\nu_n^{*n}}(y)=\left(\left(1-\frac{t}{n}\right)+\frac{t}{n}\,e^{iy}\right)^n\\
&\implies&F_{\nu_n^{*n}}(y)=\left(1+\frac{(e^{iy}-1)t}{n}\right)^n\\
&\implies&F_{\nu_n^{*n}}(y)\to\exp\left((e^{iy}-1)t\right)
\end{eqnarray*}
</math>
Thus, we obtain indeed the Fourier transform of <math>p_t</math>, as desired.}}
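Since <math>\nu_n^{*n}</math> is simply the binomial law <math>B(n,t/n)</math>, the convergence can also be watched numerically, for instance as follows, with <math>t=1</math> and <math>n=2000</math> being arbitrary choices:

```python
from math import comb, exp, factorial

# nu_n^{*n} = B(n, t/n); compare its first weights with those of p_t.
t, n = 1.0, 2000
binom = [comb(n, k) * (t / n) ** k * (1 - t / n) ** (n - k)
         for k in range(8)]
poisson = [exp(-t) * t ** k / factorial(k) for k in range(8)]

for b, p in zip(binom, poisson):
    assert abs(b - p) < 1e-3  # error is O(t^2/n), small for n = 2000
```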
We have accumulated so far a lot of probability knowledge, going along with our learning of calculus, in chapter 4, chapter 6, and in the present chapter. All this certainly needs some systematic discussion, and we will be back to it once we have the full calculus tools that are needed, in chapter 14, which will be dedicated to probability.
==General references==
{{cite arXiv|last1=Banica|first1=Teo|year=2024|title=Calculus and applications|eprint=2401.00911|class=math.CO}}

Latest revision as of 15:13, 21 April 2025


7a. Function spaces

In this chapter we go back to the functions of one real variable, [math]f:\mathbb R\to\mathbb R[/math] or [math]f:\mathbb R\to\mathbb C[/math], with some applications, by using our complex function technology. We will be mainly interested in constructing the Fourier transform, which is an operation [math]f\to\widehat{f}[/math] on such functions, which can solve various questions. Some advertisement first: \begin{advertisement} There is only one serious tool in mathematics, and that is the Fourier transform. \end{advertisement} Or at least, that's what some people say. And as fans of calculus, we should of course side with them. We will see for instance how, via Fourier transform, light can be decomposed into color components, distant galaxies can be observed, atoms can be distinguished from each other, the future of the universe can be predicted, and so on.


Before doing that, however, let us study the spaces that the functions [math]f:\mathbb R\to\mathbb C[/math] can form. These functions can be continuous, differentiable, infinitely differentiable, and so on, but there are many more properties that these functions can have, that we will investigate now. This will lead to various spaces of functions [math]f:\mathbb R\to\mathbb C[/math], that can be used, among others, in order to well-define the Fourier transform operation [math]f\to\widehat{f}[/math].


Let us start with some well-known and useful inequalities, as follows:

Theorem

Given two functions [math]f,g:\mathbb R\to\mathbb C[/math] and an exponent [math]p\geq1[/math], we have

[[math]] \left(\int_\mathbb R|f+g|^p\right)^{1/p}\leq\left(\int_\mathbb R|f|^p\right)^{1/p}+\left(\int_\mathbb R|g|^p\right)^{1/p} [[/math]]
called Minkowski inequality. Also, assuming that [math]p,q\geq1[/math] satisfy [math]1/p+1/q=1[/math], we have

[[math]] \int_\mathbb R|fg|\leq \left(\int_\mathbb R|f|^p\right)^{1/p}\left(\int_\mathbb R|g|^q\right)^{1/q} [[/math]]
called Hölder inequality. These inequalities hold as well for [math]\infty[/math] values of the exponents.


Proof

All this is very standard, the idea being as follows:


(1) As a first observation, at [math]p=2[/math], which is a special exponent, we have [math]q=2[/math] as well, and the Minkowski and Hölder inequalities are as follows:

[[math]] \left(\int_\mathbb R|f+g|^2\right)^{1/2}\leq\left(\int_\mathbb R|f|^2\right)^{1/2}+\left(\int_\mathbb R|g|^2\right)^{1/2} [[/math]]

[[math]] \int_\mathbb R|fg|\leq \left(\int_\mathbb R|f|^2\right)^{1/2}\left(\int_\mathbb R|g|^2\right)^{1/2} [[/math]]


But the proof of the Hölder inequality, called Cauchy-Schwarz inequality in this case, is something elementary, coming from the fact that [math]I(t)=\int_\mathbb R|f+twg|^2[/math] with [math]|w|=1[/math] is a positive degree 2 polynomial in [math]t\in\mathbb R[/math], and so its discriminant must be negative. As for the Minkowski inequality, this follows from this, by taking squares and simplifying.


(2) In general, let us first prove Hölder, in the case of finite exponents, [math]p,q\in(1,\infty)[/math]. By linearity we can assume that [math]f,g[/math] are normalized, in the following way:

[[math]] \int_\mathbb R|f|^p=\int_\mathbb R|g|^q=1 [[/math]]


In this case, we want to prove that we have [math]\int_\mathbb R|fg|\leq1[/math]. And for this purpose, we can use the Young inequality from chapter 3, which gives, for any [math]x\in\mathbb R[/math]:

[[math]] |f(x)g(x)|\leq\frac{|f(x)|^p}{p}+\frac{|g(x)|^q}{q} [[/math]]


By integrating now over [math]x\in\mathbb R[/math], we obtain from this, as desired:

[[math]] \int_\mathbb R|fg| \leq\int_\mathbb R\frac{|f|^p}{p}+\frac{|g|^q}{q} =\frac{1}{p}+\frac{1}{q} =1 [[/math]]


Let us prove now Minkowski, again in the finite exponent case, [math]p\in(1,\infty)[/math]. We have the following estimate, using the Hölder inequality, and the conjugate exponent:

[[math]] \begin{eqnarray*} \int_\mathbb R|f+g|^p &=&\int_\mathbb R|f+g|\cdot|f+g|^{p-1}\\ &\leq&\int_\mathbb R|f|\cdot|f+g|^{p-1}+\int_\mathbb R|g|\cdot|f+g|^{p-1}\\ &\leq&\left(\int_\mathbb R|f|^p\right)^{1/p}\left(\int_\mathbb R|f+g|^{(p-1)q}\right)^{1/q}\\ &+&\left(\int_\mathbb R|g|^p\right)^{1/p}\left(\int_\mathbb R|f+g|^{(p-1)q}\right)^{1/q}\\ &=&\left[\left(\int_\mathbb R|f|^p\right)^{1/p}+\left(\int_\mathbb R|g|^p\right)^{1/p}\right]\left(\int_\mathbb R|f+g|^p\right)^{1-1/p} \end{eqnarray*} [[/math]]


Thus, we are led to the Minkowski inequality in the statement.


(3) Finally, with the convention that [math](\int_\mathbb R|f|^p)^{1/p}[/math] takes as value at [math]p=\infty[/math] the essential supremum of [math]f[/math], the Minkowski inequality holds as well at [math]p=\infty[/math], trivially:

[[math]] \sup\big|f+g\big|\leq\sup\big|f\big|+\sup\big|g\big| [[/math]]


The same goes for the Hölder inequality at [math]p=\infty,q=1[/math], which is simply:

[[math]] \int_\mathbb R|fg|\leq\sup\big|f\big|\times\int_\mathbb R|g| [[/math]]


And finally, the same goes for the Hölder inequality at [math]p=1,q=\infty[/math].
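As a quick sanity check of these inequalities, here is a discrete analogue, with sums in the role of integrals, on arbitrary test vectors:

```python
# Discrete Hölder and Minkowski inequalities, with arbitrary test data
# and conjugate exponents p = 3, q = 3/2 (so 1/p + 1/q = 1).
f = [0.3, -1.2, 2.0, 0.7, -0.5]
g = [1.1, 0.4, -0.9, 2.2, 0.6]
p, q = 3.0, 1.5

def norm(h, e):
    # the discrete e-norm, (sum |h_i|^e)^{1/e}
    return sum(abs(a) ** e for a in h) ** (1 / e)

# Hölder: sum |f g| <= ||f||_p ||g||_q
holder_lhs = sum(abs(a * b) for a, b in zip(f, g))
assert holder_lhs <= norm(f, p) * norm(g, q) + 1e-12

# Minkowski: ||f + g||_p <= ||f||_p + ||g||_p
mink_lhs = norm([a + b for a, b in zip(f, g)], p)
assert mink_lhs <= norm(f, p) + norm(g, p) + 1e-12
```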

As a consequence of the above results, we can formulate:

Theorem

Given an interval [math]I\subset\mathbb R[/math] and an exponent [math]p\geq1[/math], the following space, with the convention that functions are identified up to equality almost everywhere,

[[math]] L^p(I)=\left\{f:I\to\mathbb C\Big|\int_I|f(x)|^pdx \lt \infty\right\} [[/math]]
is a vector space, and the following quantity,

[[math]] ||f||_p=\left(\int_I|f(x)|^p\right)^{1/p} [[/math]]
is a norm on it, in the sense that it satisfies the usual conditions for a vector space norm. Moreover, [math]L^p(I)[/math] is complete with respect to the distance [math]d(f,g)=||f-g||_p[/math].


Proof

This basically follows from Theorem 7.2, the idea being as follows:


(1) Again, let us first see what happens at [math]p=2[/math]. Here everything is standard from what we have in Theorem 7.2, and with the remark that the space [math]L^2(I)[/math] that we obtain is more than just a normed vector space, because we have as well a scalar product, related to the norm by the formula [math]||f||_2=\sqrt{ \lt f,f \gt }[/math], constructed as follows:

[[math]] \lt f,g \gt =\int_If(x)\,\overline{g(x)}\,dx [[/math]]


(2) In the general case now, where [math]p\geq1[/math] is still finite, but arbitrary, the proof is similar, basically coming from the Minkowski inequality from Theorem 7.2.


(3) Finally, the extension at [math]p=\infty[/math] is clear too, coming from definitions, and with the various conventions made at the end of the proof of Theorem 7.2.

There are many more things that can be said about the above spaces [math]L^p(I)[/math], which are all good to know, and we refer here to any functional analysis book. Going ahead now with our study of functions [math]f:\mathbb R\to\mathbb C[/math], let us define an interesting operation on such functions, called convolution, which is useful for many purposes. Let us start with:

Definition

The convolution of two functions [math]f,g:\mathbb R\to\mathbb C[/math] is the function

[[math]] (f*g)(x)=\int_\mathbb R f(x-y)g(y)dy [[/math]]
provided that the function [math]y\to f(x-y)g(y)[/math] is indeed integrable, for any [math]x[/math].

There are many reasons for introducing this operation, that we will gradually discover, in what follows. As a basic example, let us take [math]g=\chi_{[0,1]}[/math]. We have then:

[[math]] (f*g)(x)=\int_0^1f(x-y)dy [[/math]]


Thus, with this choice of [math]g[/math], the operation [math]f\to f*g[/math] has some sort of “regularizing effect”, that can be useful for many purposes. We will be back to this, later.
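To see this regularizing effect concretely, here is a small Riemann sum sketch, where [math]f[/math] is the step function [math]\chi_{[0,\infty)}[/math]: the convolution [math](f*g)(x)=\int_0^1f(x-y)dy[/math] ramps linearly from [math]0[/math] to [math]1[/math] on [math][0,1][/math], instead of jumping.

```python
# f = chi_[0,infty), g = chi_[0,1]: the convolution (f*g)(x) equals
# int_0^1 f(x-y) dy, computed here by a midpoint Riemann sum.
def f(x):
    return 1.0 if x >= 0 else 0.0

def conv(x, steps=1000):
    h = 1.0 / steps
    return sum(f(x - (k + 0.5) * h) for k in range(steps)) * h

assert conv(-1.0) == 0.0                # far left of the jump: still 0
assert abs(conv(0.5) - 0.5) < 1e-2      # on [0,1]: linear ramp
assert abs(conv(2.0) - 1.0) < 1e-9      # far right: already 1
```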


Going ahead with more theory, let us try to understand when the convolution operation is well-defined. We have here the following basic result:

Theorem

The convolution operation is well-defined on the space

[[math]] C_c(\mathbb R)=\left\{f\in C(\mathbb R)\Big|supp(f)\ {\rm compact}\right\} [[/math]]
of continuous functions [math]f:\mathbb R\to\mathbb C[/math] having compact support.


Proof

We have several things to be proved, the idea being as follows:


(1) First we must show that given two functions [math]f,g\in C_c(\mathbb R)[/math], their convolution [math]f*g[/math] is well-defined, as a function [math]f*g:\mathbb R\to\mathbb C[/math]. But this follows from the following estimate, where [math]l[/math] denotes the length of the compact subsets of [math]\mathbb R[/math]:

[[math]] \begin{eqnarray*} \int_\mathbb R|f(x-y)g(y)|dy &=&\int_{supp(g)}|f(x-y)g(y)|dy\\ &\leq&\max(g)\int_{supp(g)}|f(x-y)|dy\\ &\leq&\max(g)\cdot l(supp(g))\cdot\max(f)\\ & \lt &\infty \end{eqnarray*} [[/math]]


(2) Next, we must show that the function [math]f*g:\mathbb R\to\mathbb C[/math] that we constructed is indeed continuous. But this follows from the following estimate, where [math]K_f[/math] is the constant of uniform continuity for the function [math]f\in C_c(\mathbb R)[/math]:

[[math]] \begin{eqnarray*} |(f*g)(x+\varepsilon)-(f*g)(x)| &=&\left|\int_\mathbb R f(x+\varepsilon-y)g(y)dy-\int_\mathbb R f(x-y)g(y)dy\right|\\ &=&\left|\int_\mathbb R\left(f(x+\varepsilon-y)-f(x-y)\right)g(y)dy\right|\\ &\leq&\int_\mathbb R|f(x+\varepsilon-y)-f(x-y)|\cdot|g(y)|dy\\ &\leq&K_f\cdot\varepsilon\cdot\int_\mathbb R|g| \end{eqnarray*} [[/math]]


(3) Finally, we must show that the function [math]f*g\in C(\mathbb R)[/math] that we constructed has indeed compact support. For this purpose, our claim is that we have:

[[math]] supp(f*g)\subset supp(f)+supp(g) [[/math]]


In order to prove this claim, observe that we have, by definition of [math]f*g[/math]:

[[math]] \begin{eqnarray*} (f*g)(x) &=&\int_\mathbb R f(x-y)g(y)dy\\ &=&\int_{supp(g)}f(x-y)g(y)dy \end{eqnarray*} [[/math]]


But this latter quantity being 0 for [math]x\notin supp(f)+supp(g)[/math], this gives the result.

Here are now a few remarkable properties of the convolution operation:

Proposition

The following hold, for the functions in [math]C_c(\mathbb R)[/math]:

  • [math]f*g=g*f[/math].
  • [math]f*(g*h)=(f*g)*h[/math].
  • [math]f*(\lambda g+\mu h)=\lambda f*g+\mu f*h[/math].


Proof

These formulae are all elementary, the idea being as follows:


(1) This follows from the following computation, with [math]y=x-t[/math]:

[[math]] \begin{eqnarray*} (f*g)(x) &=&\int_\mathbb R f(x-y)g(y)dy\\ &=&\int_\mathbb R f(t)g(x-t)dt\\ &=&\int_\mathbb R g(x-t)f(t)dt\\ &=&(g*f)(x) \end{eqnarray*} [[/math]]


(2) This is clear from definitions.


(3) Once again, this is clear from definitions.

In relation with derivatives, and with the “regularizing effect” of the convolution operation mentioned after Definition 7.4, we have the following result:

Theorem

Given two functions [math]f,g\in C_c(\mathbb R)[/math], assuming that [math]g[/math] is differentiable, then so is [math]f*g[/math], with derivative given by the following formula:

[[math]] (f*g)'=f*g' [[/math]]
More generally, given [math]f,g\in C_c(\mathbb R)[/math], and assuming that [math]g[/math] is [math]k[/math] times differentiable, then so is [math]f*g[/math], with [math]k[/math]-th derivative given by [math](f*g)^{(k)}=f*g^{(k)}[/math].


Proof

In what regards the first assertion, with [math]y=x-t[/math], then [math]t=x-y[/math], we get:

[[math]] \begin{eqnarray*} (f*g)'(x) &=&\frac{d}{dx}\int_\mathbb R f(x-y)g(y)dy\\ &=&\frac{d}{dx}\int_\mathbb R f(t)g(x-t)dt\\ &=&\int_\mathbb R f(t)g'(x-t)dt\\ &=&\int_\mathbb R f(x-y)g'(y)dy\\ &=&(f*g')(x) \end{eqnarray*} [[/math]]


As for the second assertion, this follows from the first one, by recurrence.
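The formula [math](f*g)'=f*g'[/math] can be tested numerically with a finite difference. The Gaussian-type test functions below are arbitrary choices, not compactly supported, but decaying fast enough for a truncated Riemann sum to be accurate:

```python
from math import exp

# Check (f*g)'(x) = (f*g')(x) at one point, via central differences.
def f(x): return exp(-x * x)
def g(x): return exp(-2.0 * (x - 1.0) ** 2)
def gp(x): return -4.0 * (x - 1.0) * exp(-2.0 * (x - 1.0) ** 2)  # g'

def conv(a, b, x, lo=-8.0, hi=8.0, steps=4000):
    # midpoint Riemann sum for int a(x-y) b(y) dy, on a window where
    # both factors are numerically negligible outside
    h = (hi - lo) / steps
    return sum(a(x - (lo + (k + 0.5) * h)) * b(lo + (k + 0.5) * h)
               for k in range(steps)) * h

x, eps = 0.3, 1e-4
numeric = (conv(f, g, x + eps) - conv(f, g, x - eps)) / (2 * eps)
exact = conv(f, gp, x)
assert abs(numeric - exact) < 1e-4
```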

Finally, getting beyond the compactly supported continuous functions, we have the following result, which is of particular theoretical importance:

Theorem

The convolution operation is well-defined on [math]L^1(\mathbb R)[/math], and we have:

[[math]] ||f*g||_1\leq||f||_1||g||_1 [[/math]]
Thus, if [math]f\in L^1(\mathbb R)[/math] and [math]g\in C_c^k(\mathbb R)[/math], then [math]f*g[/math] is well-defined, and [math]f*g\in C^k(\mathbb R)[/math].


Show Proof

In what regards the first assertion, this follows from the following computation, involving an intuitive manipulation on the double integrals, called Fubini theorem, that we will use as such here, and that we will fully clarify later on, in this book:

[[math]] \begin{eqnarray*} \int_\mathbb R|(f*g)(x)|dx &\leq&\int_\mathbb R\int_\mathbb R|f(x-y)g(y)|dydx\\ &=&\int_\mathbb R\int_\mathbb R|f(x-y)g(y)|dxdy\\ &=&\int_\mathbb R|f(x)|dx\int_\mathbb R|g(y)|dy \end{eqnarray*} [[/math]]


As for the second assertion, this follows from the first one, and from Theorem 7.7.

Summarizing, we have now some good knowledge of the various spaces that the functions [math]f:\mathbb R\to\mathbb C[/math] can form, and we have as well an interesting regularization operation [math]f\to f*g[/math] on such functions, that can be used for various purposes.
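
Numerically, the inequality [math]||f*g||_1\leq||f||_1\,||g||_1[/math] from the above theorem can be illustrated as follows, in Python, assuming NumPy; the test functions are our choice, with [math]g[/math] changing sign, so that the inequality is strict:

```python
import numpy as np

x = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]

f = np.exp(-np.abs(x))            # ||f||_1 is close to 2
g = np.sign(x) * np.exp(-x**2)    # changes sign, producing cancellations

# Riemann sum approximating (f*g)(x) = int f(x-y)g(y)dy
conv = np.convolve(f, g) * dx

def norm1(h):
    return np.sum(np.abs(h)) * dx

assert norm1(conv) <= norm1(f) * norm1(g)
assert abs(norm1(f) - 2) < 1e-2
```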

7b. Fourier transform

We discuss here the construction and main properties of the Fourier transform, which is the main tool in analysis, and even in mathematics in general. We first have:

Definition

Given [math]f\in L^1(\mathbb R)[/math], we define a function [math]\widehat{f}:\mathbb R\to\mathbb C[/math] by

[[math]] \widehat{f}(\xi)=\int_\mathbb R e^{ix\xi}f(x)dx [[/math]]
and call it Fourier transform of [math]f[/math].

As a first observation, even if [math]f[/math] is a real function, [math]\widehat{f}[/math] is a complex function, which is not necessarily real. Note also that [math]\widehat{f}[/math] is well-defined, because [math]f\in L^1(\mathbb R)[/math] and [math]|e^{ix\xi}|=1[/math]. Finally, the condition [math]f\in L^1(\mathbb R)[/math] is basically needed for constructing [math]\widehat{f}[/math], because:

[[math]] \widehat{f}(0)=\int_\mathbb Rf(x)dx [[/math]]


Generally speaking, the Fourier transform is there for helping with various computations, with the above formula [math]\widehat{f}(0)=\int f[/math] being quite illustrative in this respect. Here are some basic properties of the Fourier transform, all providing good motivations:

Proposition

The Fourier transform has the following properties:

  • Linearity: [math]\widehat{f+g}=\widehat{f}+\widehat{g}[/math], [math]\widehat{\lambda f}=\lambda\widehat{f}[/math].
  • Regularity: [math]\widehat{f}[/math] is continuous and bounded.
  • If [math]f[/math] is even then [math]\widehat{f}[/math] is even.
  • If [math]f[/math] is odd then [math]\widehat{f}[/math] is odd.


Show Proof

These results are all elementary, as follows:


(1) The additivity formula is clear from definitions, as follows:

[[math]] \begin{eqnarray*} \widehat{f+g}(\xi) &=&\int_\mathbb R e^{ix\xi}(f+g)(x)dx\\ &=&\int_\mathbb R e^{ix\xi}f(x)dx+\int_\mathbb R e^{ix\xi}g(x)dx\\ &=&\widehat{f}(\xi)+\widehat{g}(\xi) \end{eqnarray*} [[/math]]


As for the formula [math]\widehat{\lambda f}=\lambda\widehat{f}[/math], this is clear as well.


(2) Regarding the continuity of [math]\widehat{f}[/math], we have the following estimate:

[[math]] \begin{eqnarray*} |\widehat{f}(\xi+\varepsilon)-\widehat{f}(\xi)| &\leq&\int_\mathbb R\left|(e^{ix(\xi+\varepsilon)}-e^{ix\xi})f(x)\right|dx\\ &=&\int_\mathbb R\left|e^{ix\xi}(e^{ix\varepsilon}-1)f(x)\right|dx\\ &=&\int_\mathbb R|e^{ix\varepsilon}-1|\,|f(x)|dx \end{eqnarray*} [[/math]]


Since the integrand on the right is dominated by [math]2|f|[/math], and goes to [math]0[/math] pointwise as [math]\varepsilon\to0[/math], the dominated convergence theorem applies, and gives the continuity of [math]\widehat{f}[/math]. As for the boundedness of [math]\widehat{f}[/math], this is clear as well, with [math]|\widehat{f}(\xi)|\leq||f||_1[/math].


(3) This follows from the following computation, assuming that [math]f[/math] is even:

[[math]] \begin{eqnarray*} \widehat{f}(-\xi) &=&\int_\mathbb R e^{-ix\xi}f(x)dx\\ &=&\int_\mathbb R e^{ix\xi}f(-x)dx\\ &=&\int_\mathbb R e^{ix\xi}f(x)dx\\ &=&\widehat{f}(\xi) \end{eqnarray*} [[/math]]


(4) The proof here is similar to the proof of (3), by changing some signs.

We will be back to more theory in a moment, but let us explore now the examples. Here are some basic computations of Fourier transforms:

Proposition

We have the following Fourier transform formulae,

[[math]] f=\chi_{[-a,a]}\implies\widehat{f}(\xi)=\frac{2\sin(a\xi)}{\xi} [[/math]]

[[math]] f=e^{-ax}\chi_{[0,\infty)}(x)\implies\widehat{f}(\xi)=\frac{1}{a-i\xi} [[/math]]

[[math]] f=e^{ax}\chi_{(-\infty,0]}(x)\implies\widehat{f}(\xi)=\frac{1}{a+i\xi} [[/math]]

[[math]] f=e^{-a|x|}\implies\widehat{f}(\xi)=\frac{2a}{a^2+\xi^2} [[/math]]

[[math]] f=sgn(x)e^{-a|x|}\implies\widehat{f}(\xi)=\frac{2i\xi}{a^2+\xi^2} [[/math]]
valid for any number [math]a \gt 0[/math].


Show Proof

All this follows from some calculus, as follows:


(1) In what regards the first formula, assuming [math]f=\chi_{[-a,a]}[/math], we have, by using the fact that [math]\sin(x\xi)[/math] is an odd function of [math]x[/math], whose integral vanishes on centered intervals:

[[math]] \begin{eqnarray*} \widehat{f}(\xi) &=&\int_{-a}^ae^{ix\xi}dx\\ &=&\int_{-a}^a\cos(x\xi)dx+i\int_{-a}^a\sin(x\xi)dx\\ &=&\int_{-a}^a\cos(x\xi)dx\\ &=&\left[\frac{\sin(x\xi)}{\xi}\right]_{-a}^a\\ &=&\frac{2\sin(a\xi)}{\xi} \end{eqnarray*} [[/math]]


(2) With [math]f(x)=e^{-ax}\chi_{[0,\infty)}(x)[/math], the computation goes as follows:

[[math]] \begin{eqnarray*} \widehat{f}(\xi) &=&\int_0^\infty e^{ix\xi-ax}dx\\ &=&\int_0^\infty e^{(i\xi-a)x}dx\\ &=&\left[\frac{e^{(i\xi-a)x}}{i\xi-a}\right]_0^\infty\\ &=&\frac{1}{a-i\xi} \end{eqnarray*} [[/math]]


(3) Regarding the third formula, this follows from the second one, by using the following fact, generalizing the parity computations from Proposition 7.10:

[[math]] F(x)=f(-x)\implies\widehat{F}(\xi)=\widehat{f}(-\xi) [[/math]]


(4) The last two formulae follow from the previous ones, by taking sums and differences, and by using the linearity properties of the Fourier transform, from Proposition 7.10.
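
The formulae above can be checked numerically, by discretizing the defining integral of [math]\widehat{f}[/math]. Here is a Python sketch for [math]f=e^{-a|x|}[/math], assuming NumPy, with grid and tolerance being our choices, for illustration:

```python
import numpy as np

a = 1.5
x = np.linspace(-40, 40, 160001)
dx = x[1] - x[0]
f = np.exp(-a * np.abs(x))

# Riemann sum for fhat(xi) = int e^{ix xi} f(x) dx
for xi in [0.0, 0.7, 3.0]:
    fhat = np.sum(np.exp(1j * x * xi) * f) * dx
    exact = 2 * a / (a**2 + xi**2)
    assert abs(fhat - exact) < 1e-5
```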

We will see many other examples, in what follows. Getting back now to theory, we have the following result, adding to the various general properties in Proposition 7.10, and providing more motivations for the Fourier transform:

Proposition

Given [math]f,g\in L^1(\mathbb R)[/math] we have [math]\widehat{f}g,f\widehat{g}\in L^1(\mathbb R)[/math] and

[[math]] \int_\mathbb R f(\xi)\widehat{g}(\xi)d\xi=\int_\mathbb R\widehat{f}(x)g(x)dx [[/math]]
called “exchange of hat” formula.


Show Proof

Regarding the fact that we have indeed [math]\widehat{f}g,f\widehat{g}\in L^1(\mathbb R)[/math], this is actually a bit non-trivial, but we will be back to this later. Assuming this, we have:

[[math]] \int_\mathbb R f(\xi)\widehat{g}(\xi)d\xi=\int_\mathbb R\int_\mathbb R f(\xi)e^{ix\xi}g(x)dxd\xi [[/math]]


On the other hand, we have as well the following formula:

[[math]] \int_\mathbb R\widehat{f}(x)g(x)dx=\int_\mathbb R\int_\mathbb R e^{ix\xi}f(x)g(\xi)dxd\xi [[/math]]


Thus, with [math]x\leftrightarrow\xi[/math], we are led to the formula in the statement.

As an important result, showing the power of the Fourier transform, this transform converts the derivative into something very simple, namely a multiplication by the variable:

Theorem

Given [math]f:\mathbb R\to\mathbb C[/math] such that [math]f,f'\in L^1(\mathbb R)[/math], we have:

[[math]] \widehat{f'}(\xi)=-i\xi\widehat{f}(\xi) [[/math]]
More generally, assuming [math]f,f',f'',\ldots,f^{(n)}\in L^1(\mathbb R)[/math], we have

[[math]] \widehat{f^{(k)}}(\xi)=(-i\xi)^k\widehat{f}(\xi) [[/math]]
for any [math]k=1,2,\ldots,n[/math].


Show Proof

These results follow by doing a partial integration, as follows:


(1) Assuming that [math]f:\mathbb R\to\mathbb C[/math] has compact support, we have indeed:

[[math]] \begin{eqnarray*} \widehat{f'}(\xi) &=&\int_\mathbb R e^{ix\xi}f'(x)dx\\ &=&-\int_\mathbb R i\xi e^{ix\xi}f(x)dx\\ &=&-i\xi\int_\mathbb R e^{ix\xi}f(x)dx\\ &=&-i\xi\widehat{f}(\xi) \end{eqnarray*} [[/math]]


(2) Regarding the higher derivatives, the formula here follows by recurrence.
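
Here is a quick numerical check of the formula [math]\widehat{f'}(\xi)=-i\xi\widehat{f}(\xi)[/math], in Python with NumPy, for the Gaussian [math]f=e^{-x^2}[/math], which is our choice of test function:

```python
import numpy as np

x = np.linspace(-10, 10, 40001)
dx = x[1] - x[0]
f = np.exp(-x**2)
fp = -2 * x * np.exp(-x**2)   # f'

def fourier(h, xi):
    # Riemann sum for int e^{ix xi} h(x) dx
    return np.sum(np.exp(1j * x * xi) * h) * dx

for xi in [0.5, 1.3, 4.0]:
    assert abs(fourier(fp, xi) - (-1j * xi) * fourier(f, xi)) < 1e-5
```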

Importantly, we have a converse statement as well, as follows:

Theorem

Assuming that [math]f\in L^1(\mathbb R)[/math] is such that [math]F(x)=xf(x)[/math] belongs to [math]L^1(\mathbb R)[/math] too, the function [math]\widehat{f}[/math] is differentiable, with derivative given by:

[[math]] (\widehat{f})'(\xi)=i\widehat{F}(\xi) [[/math]]
More generally, if [math]F_k(x)=x^kf(x)[/math] belongs to [math]L^1(\mathbb R)[/math], for [math]k=0,1,\ldots,n[/math], we have

[[math]] (\widehat{f})^{(k)}(\xi)=i^k\widehat{F_k}(\xi) [[/math]]
for any [math]k=1,2,\ldots,n[/math].


Show Proof

These results are both elementary, as follows:


(1) Regarding the first assertion, the computation here is as follows:

[[math]] \begin{eqnarray*} (\widehat{f})'(\xi) &=&\frac{d}{d\xi}\int_\mathbb R e^{ix\xi}f(x)dx\\ &=&\int_\mathbb R ixe^{ix\xi}f(x)dx\\ &=&i\int_\mathbb R e^{ix\xi}xf(x)dx\\ &=&i\widehat{F}(\xi) \end{eqnarray*} [[/math]]


(2) As for the second assertion, this follows from the first one, by recurrence.

As a conclusion to all this, we are on a good way with our theory, and we have:

\begin{conclusion}
Modulo normalization factors, the Fourier transform converts the derivatives into multiplications by the variable, and vice versa.
\end{conclusion}

And isn't this interesting, because computing derivatives can be a difficult task. Here is now another useful result, of the same type, this time regarding convolutions:

Theorem

Assuming [math]f,g\in L^1(\mathbb R)[/math], the following happens:

[[math]] \widehat{f*g}=\widehat{f}\cdot\widehat{g} [[/math]]
Moreover, under suitable assumptions, the formula [math]\widehat{fg}=\widehat{f}*\widehat{g}[/math] holds too.


Show Proof

This is something quite subtle, the idea being as follows:


(1) Regarding the first assertion, this is something elementary, as follows:

[[math]] \begin{eqnarray*} \widehat{f*g}(\xi) &=&\int_\mathbb R e^{ix\xi}(f*g)(x)dx\\ &=&\int_\mathbb R\int_\mathbb R e^{ix\xi}f(x-y)g(y)dxdy\\ &=&\int_\mathbb R e^{iy\xi}\left(\int_\mathbb Re^{i(x-y)\xi}f(x-y)dx\right)g(y)dy\\ &=&\int_\mathbb R e^{iy\xi}\left(\int_\mathbb Re^{it\xi}f(t)dt\right)g(y)dy\\ &=&\int_\mathbb R e^{iy\xi}\widehat{f}(\xi)g(y)dy\\ &=&\widehat{f}(\xi)\widehat{g}(\xi) \end{eqnarray*} [[/math]]


(2) As for the second assertion, this is something more tricky, and we will be back to it later. In the meantime, here is however some sort of proof, not very honest:

[[math]] \begin{eqnarray*} (\widehat{f}*\widehat{g})(\xi) &=&\int_\mathbb R\widehat{f}(\xi-\eta)\widehat{g}(\eta)d\eta\\ &=&\int_\mathbb R\int_\mathbb R\int_\mathbb R e^{ix(\xi-\eta)}f(x)e^{iy\eta}g(y)dxdyd\eta\\ &=&\int_\mathbb R\int_\mathbb R\int_\mathbb R e^{ix\xi}e^{i(y-x)\eta}f(x)g(y)dxdyd\eta\\ &=&\int_\mathbb R e^{ix\xi}f(x)g(x)dx\\ &=&\widehat{fg}(\xi) \end{eqnarray*} [[/math]]


To be more precise, the point here is that we can pass from the triple to the single integral by arguing that “we must have [math]x=y[/math]”. We will be back to this later.
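
The formula [math]\widehat{f*g}=\widehat{f}\cdot\widehat{g}[/math] can be verified numerically as follows, in Python with NumPy, with Gaussian test functions of our choice:

```python
import numpy as np

x = np.linspace(-10, 10, 8001)
dx = x[1] - x[0]
f = np.exp(-x**2)
g = np.exp(-2 * x**2)

conv = np.convolve(f, g) * dx              # f*g
xc = np.arange(len(conv)) * dx + 2 * x[0]  # its grid, from -20 to 20

def ft(h, grid, xi):
    # Riemann sum for int e^{ix xi} h(x) dx
    return np.sum(np.exp(1j * grid * xi) * h) * dx

for xi in [0.0, 0.9, 2.0]:
    assert abs(ft(conv, xc, xi) - ft(f, x, xi) * ft(g, x, xi)) < 1e-4
```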

As an updated conclusion to all this, we have, modulo a few bugs, to be fixed:

\begin{conclusion}
The Fourier transform converts the derivatives into multiplications by the variable, and convolutions into products, and vice versa.
\end{conclusion}

We will see applications of this later, after developing some more general theory. So, let us develop now more theory for the Fourier transform. We first have:

Theorem

Given [math]f\in L^1(\mathbb R)[/math], its Fourier transform satisfies

[[math]] \lim_{\xi\to\pm\infty}\widehat{f}(\xi)=0 [[/math]]
called Riemann-Lebesgue property of [math]\widehat{f}[/math].


Show Proof

This is something quite technical, as follows:


(1) Given a function [math]f:\mathbb R\to\mathbb C[/math] and a number [math]y\in\mathbb R[/math], let us set:

[[math]] f_y(x)=f(x-y) [[/math]]


Our claim is then that if [math]f\in L^p(\mathbb R)[/math], then the following function is uniformly continuous, with respect to the usual [math]p[/math]-norm on the right:

[[math]] \mathbb R\to L^p(\mathbb R)\quad,\quad y\to f_y [[/math]]


(2) In order to prove this, fix [math]\varepsilon \gt 0[/math]. Since [math]f\in L^p(\mathbb R)[/math], we can find a continuous function [math]g[/math], supported on some interval [math][-K,K][/math], such that:

[[math]] ||f-g||_p \lt \varepsilon [[/math]]


Now since [math]g[/math] is uniformly continuous, we can find [math]\delta\in(0,K)[/math] such that:

[[math]] |s-t| \lt \delta\implies|g(s)-g(t)| \lt (3K)^{-1/p}\varepsilon [[/math]]


But this shows that we have the following estimate:

[[math]] \begin{eqnarray*} ||g_s-g_t||_p &=&\left(\int_\mathbb R\big|g(x-s)-g(x-t)\big|^pdx\right)^{1/p}\\ & \lt &\left[(3K)^{-1}\varepsilon^p(2K+\delta)\right]^{1/p}\\ & \lt &\varepsilon \end{eqnarray*} [[/math]]


By using now the formula [math]||f||_p=||f_s||_p[/math], which is clear, we obtain:

[[math]] \begin{eqnarray*} ||f_s-f_t||_p &\leq&||f_s-g_s||_p+||g_s-g_t||_p+||g_t-f_t||_p\\ & \lt &\varepsilon+\varepsilon+\varepsilon\\ &=&3\varepsilon \end{eqnarray*} [[/math]]


But this being true for any [math]|s-t| \lt \delta[/math], we have proved our claim.


(3) Let us prove now the Riemann-Lebesgue property of [math]\widehat{f}[/math], as formulated in the statement. By using [math]e^{\pi i}=-1[/math], and the change of variables [math]x\to x-\pi/\xi[/math], we have:

[[math]] \begin{eqnarray*} \widehat{f}(\xi) &=&\int_\mathbb Re^{ix\xi}f(x)dx\\ &=&-\int_\mathbb Re^{ix\xi}e^{\pi i}f(x)dx\\ &=&-\int_\mathbb Re^{i\xi(x+\pi/\xi)}f(x)dx\\ &=&-\int_\mathbb Re^{ix\xi}f\left(x-\frac{\pi}{\xi}\right)dx \end{eqnarray*} [[/math]]


On the other hand, we have as well the following formula:

[[math]] \widehat{f}(\xi)=\int_\mathbb Re^{ix\xi}f(x)dx [[/math]]


Thus by summing, we obtain the following formula:

[[math]] 2\widehat{f}(\xi)=\int_\mathbb Re^{ix\xi}\left(f(x)-f\left(x-\frac{\pi}{\xi}\right)\right)dx [[/math]]


But this gives the following estimate, with notations from (1):

[[math]] 2|\widehat{f}(\xi)|\leq||f-f_{\pi/\xi}||_1 [[/math]]


Since by (1) this goes to [math]0[/math] with [math]\xi\to\pm\infty[/math], this gives the result.

Quite remarkably, and as a main result now regarding Fourier transforms, a function [math]f:\mathbb R\to\mathbb C[/math] can be recovered from its Fourier transform [math]\widehat{f}:\mathbb R\to\mathbb C[/math], as follows:

Theorem

Assuming [math]f,\widehat{f}\in L^1(\mathbb R)[/math], we have

[[math]] f(x)=\int_\mathbb R e^{-ix\xi}\widehat{f}(\xi)d\xi [[/math]]
almost everywhere, called Fourier inversion formula.


Show Proof

This is something quite tricky, due to the fact that a direct attempt by double integration fails. Consider the following function, depending on a parameter [math]\lambda \gt 0[/math]:

[[math]] \varphi_\lambda(x)=\int_\mathbb Re^{-ix\xi-\lambda|\xi|}d\xi [[/math]]


We have then the following computation:

[[math]] \begin{eqnarray*} (f*\varphi_\lambda)(x) &=&\int_\mathbb Rf(x-y)\varphi_\lambda(y)dy\\ &=&\int_\mathbb R\int_\mathbb Rf(x-y)e^{-iy\xi-\lambda|\xi|}d\xi dy\\ &=&\int_\mathbb Re^{-\lambda|\xi|}\left(\int_\mathbb Rf(x-y)e^{-iy\xi}dy\right)d\xi\\ &=&\int_\mathbb Re^{-\lambda|\xi|}e^{-ix\xi}\widehat{f}(\xi)d\xi \end{eqnarray*} [[/math]]


By letting now [math]\lambda\to0[/math], we obtain from this the following formula:

[[math]] \lim_{\lambda\to0}(f*\varphi_\lambda)(x)=\int_\mathbb R e^{-ix\xi}\widehat{f}(\xi)d\xi [[/math]]


On the other hand, by using Theorem 7.18 we obtain that, almost everywhere:

[[math]] \lim_{\lambda\to0}(f*\varphi_\lambda)(x)=f(x) [[/math]]


Thus, we are led to the conclusion in the statement.
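
As a numerical sanity check, in Python with NumPy: we use here the standard Gaussian transform [math]\int_\mathbb Re^{ix\xi-x^2}dx=\sqrt{\pi}e^{-\xi^2/4}[/math], and note that, with the convention for [math]\widehat{f}[/math] used in this chapter, the inversion integral recovers [math]f[/math] up to a factor [math]1/(2\pi)[/math], which is a normalization constant of the type mentioned in the conclusion above:

```python
import numpy as np

# f(x) = e^{-x^2} has fhat(xi) = sqrt(pi) e^{-xi^2/4}, with our convention
xi = np.linspace(-40, 40, 80001)
dxi = xi[1] - xi[0]
fhat = np.sqrt(np.pi) * np.exp(-xi**2 / 4)

# Inversion: f(x) = (1/2pi) int e^{-ix xi} fhat(xi) dxi
for x0 in [0.0, 0.5, -1.2]:
    inv = np.sum(np.exp(-1j * x0 * xi) * fhat) * dxi / (2 * np.pi)
    assert abs(inv - np.exp(-x0**2)) < 1e-5
```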

There are many more things that can be said about Fourier transforms, a key result being the Plancherel formula, allowing us to talk about the Fourier transform over the space [math]L^2(\mathbb R)[/math]. Also, we can talk about the Fourier transform over the space [math]\mathcal S[/math] of functions all of whose derivatives are rapidly decreasing, called Schwartz space. It is beyond our scope here to discuss all this, and we recommend instead any good book on the subject.

7c. Groups, extensions

All the above theory concerns the Fourier transform over [math]\mathbb R[/math], but there are some other types of Fourier transforms too, the idea being that we have such a transform over any locally compact abelian group [math]G[/math]. In order to discuss this, let us start with:

Definition

An abelian group is a set [math]G[/math] with a multiplication operation

[[math]] (g,h)\to gh [[/math]]
which must satisfy the following conditions:

  • Commutativity: we have [math]gh=hg[/math], for any [math]g,h\in G[/math].
  • Associativity: we have [math](gh)k=g(hk)[/math], for any [math]g,h,k\in G[/math].
  • Unit: there is an element [math]1\in G[/math] such that [math]g1=g[/math], for any [math]g\in G[/math].
  • Inverses: for any [math]g\in G[/math] there is [math]g^{-1}\in G[/math] such that [math]gg^{-1}=1[/math].

There are many examples of abelian groups, and in order to understand what is going on, and to develop our Fourier transform theory, let us first restrict the attention to the finite group case. Here we have as basic examples the cyclic group [math]\mathbb Z_N[/math], or more generally the products of such cyclic groups, which are clearly both finite and abelian:

[[math]] G=\mathbb Z_{N_1}\times\ldots\times\mathbb Z_{N_k} [[/math]]


Our goal in what follows will be that of proving that these are in fact all the finite abelian groups, and also that a Fourier transform theory, called “discrete Fourier transform”, can be developed over such groups. Let us start our study with:

Proposition

Given a finite abelian group [math]G[/math], the group morphisms

[[math]] \chi:G\to\mathbb T [[/math]]
with [math]\mathbb T[/math] being the unit circle, called characters of [math]G[/math], form a finite abelian group [math]\widehat{G}[/math].


Show Proof

There are several things to be proved here, the idea being as follows:


(1) Our first claim is that [math]\widehat{G}[/math] is a group, with the pointwise multiplication, namely:

[[math]] (\chi\rho)(g)=\chi(g)\rho(g) [[/math]]


Indeed, if [math]\chi,\rho[/math] are characters, so is [math]\chi\rho[/math], and so the multiplication is well-defined on [math]\widehat{G}[/math]. Regarding the unit, this is the trivial character [math]1:G\to\mathbb T[/math], mapping [math]g\to1[/math], for any [math]g\in G[/math]. Finally, we have inverses, with the inverse of [math]\chi:G\to\mathbb T[/math] being its conjugate:

[[math]] \bar{\chi}:G\to\mathbb T\quad,\quad g\to\overline{\chi(g)} [[/math]]


(2) Our next claim is that [math]\widehat{G}[/math] is finite. Indeed, given a group element [math]g\in G[/math], we can talk about its order, which is the smallest integer [math]k\in\mathbb N[/math] such that [math]g^k=1[/math]. Now assuming that we have a character [math]\chi:G\to\mathbb T[/math], we have the following formula:

[[math]] \chi(g)^k=1 [[/math]]


Thus [math]\chi(g)[/math] must be one of the [math]k[/math]-th roots of unity, and in particular there are finitely many choices for [math]\chi(g)[/math]. Thus, there are finitely many choices for [math]\chi[/math], as desired.


(3) Finally, the fact that [math]\widehat{G}[/math] is abelian follows from definitions, because the pointwise multiplication of functions, and in particular of characters, is commutative.

The above construction is quite interesting, and we have:

Theorem

The character group operation [math]G\to\widehat{G}[/math] for the finite abelian groups, called Pontrjagin duality, has the following properties:

  • The dual of a cyclic group is the group itself, [math]\widehat{\mathbb Z}_N=\mathbb Z_N[/math].
  • The dual of a product is the product of duals, [math]\widehat{G\times H}=\widehat{G}\times\widehat{H}[/math].
  • Any product of cyclic groups [math]G=\mathbb Z_{N_1}\times\ldots\times\mathbb Z_{N_k}[/math] is self-dual, [math]G=\widehat{G}[/math].


Show Proof

We have several things to be proved, the idea being as follows:


(1) A character [math]\chi:\mathbb Z_N\to\mathbb T[/math] is uniquely determined by its value [math]z=\chi(g)[/math] on the standard generator [math]g\in\mathbb Z_N[/math]. But this value must satisfy:

[[math]] z^N=1 [[/math]]


Thus we must have [math]z\in\mathbb Z_N[/math], with the cyclic group [math]\mathbb Z_N[/math] being regarded this time as being the group of [math]N[/math]-th roots of unity. Now conversely, any [math]N[/math]-th root of unity [math]z\in\mathbb Z_N[/math] defines a character [math]\chi:\mathbb Z_N\to\mathbb T[/math], by setting, for any [math]r\in\mathbb N[/math]:

[[math]] \chi(g^r)=z^r [[/math]]


Thus we have an identification [math]\widehat{\mathbb Z}_N=\mathbb Z_N[/math], as claimed.


(2) A character of a product of groups [math]\chi:G\times H\to\mathbb T[/math] must satisfy:

[[math]] \chi(g,h)=\chi\left[(g,1)(1,h)\right]=\chi(g,1)\chi(1,h) [[/math]]


Thus [math]\chi[/math] must appear as the product of its restrictions [math]\chi_{|G},\chi_{|H}[/math], which must be both characters, and this gives the identification in the statement.


(3) This follows from (1) and (2). Alternatively, any character [math]\chi:G\to\mathbb T[/math] is uniquely determined by its values [math]\chi(g_1),\ldots,\chi(g_k)[/math] on the standard generators of [math]\mathbb Z_{N_1},\ldots,\mathbb Z_{N_k}[/math], which must belong to [math]\mathbb Z_{N_1},\ldots,\mathbb Z_{N_k}\subset\mathbb T[/math], and this gives [math]\widehat{G}=G[/math], as claimed.
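
The identification [math]\widehat{\mathbb Z}_N=\mathbb Z_N[/math] can be made concrete on a computer. Here is a Python sketch, with NumPy, listing the characters [math]\chi_j(r)=w^{jr}[/math], and checking the group properties, with [math]N=6[/math] as an illustration:

```python
import numpy as np

# Characters of Z_N: chi_j(r) = w^{jr}, with w = e^{2 pi i/N}
N = 6
w = np.exp(2j * np.pi / N)

def chi(j, r):
    return w ** (j * r)

for j in range(N):
    for r in range(N):
        for s in range(N):
            # morphism property: chi_j(r+s) = chi_j(r) chi_j(s)
            assert np.isclose(chi(j, (r + s) % N), chi(j, r) * chi(j, s))

# pointwise product of characters: chi_j chi_k = chi_{j+k}, so Z_N^ = Z_N
for j in range(N):
    for k in range(N):
        for r in range(N):
            assert np.isclose(chi(j, r) * chi(k, r), chi((j + k) % N, r))
```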

At a more advanced level now, we have the following result:

Theorem

The finite abelian groups are the following groups,

[[math]] G=\mathbb Z_{N_1}\times\ldots\times\mathbb Z_{N_k} [[/math]]
and these groups are all self-dual, [math]G=\widehat{G}[/math].


Show Proof

This is something quite tricky, the idea being as follows:


(1) In order to prove our result, assume that [math]G[/math] is finite and abelian. For any prime number [math]p\in\mathbb N[/math], let us define [math]G_p\subset G[/math] to be the subset of elements having as order a power of [math]p[/math]. Equivalently, this subset [math]G_p\subset G[/math] can be defined as follows:

[[math]] G_p=\left\{g\in G\Big|\exists k\in\mathbb N,g^{p^k}=1\right\} [[/math]]


(2) It is then routine to check, based on definitions, that each [math]G_p[/math] is a subgroup. Our claim now is that we have a direct product decomposition as follows:

[[math]] G=\prod_pG_p [[/math]]


(3) Indeed, by using the fact that our group [math]G[/math] is abelian, we have a morphism as follows, with the order of the factors when computing [math]\prod_pg_p[/math] being irrelevant:

[[math]] \prod_pG_p\to G\quad,\quad (g_p)\to\prod_pg_p [[/math]]


Moreover, it is routine to check that this morphism is both injective and surjective, via some simple manipulations, so we have our group decomposition, as in (2).


(4) Thus, we are left with proving that each component [math]G_p[/math] decomposes as a product of cyclic groups, having as orders powers of [math]p[/math], as follows:

[[math]] G_p=\mathbb Z_{p^{r_1}}\times\ldots\times\mathbb Z_{p^{r_s}} [[/math]]


But this is something that can be checked by recurrence on [math]|G_p|[/math], via some routine computations, and we are led to the conclusion in the statement.


(5) Finally, the fact that the finite abelian groups are self-dual, [math]G=\widehat{G}[/math], follows from the structure result that we just proved, and from Theorem 7.22 (3).

In relation now with Fourier analysis, the result is as follows:

Theorem

Given a finite abelian group [math]G[/math], we have an isomorphism as follows, obtained by linearizing/delinearizing the characters,

[[math]] C^*(G)\simeq C(\widehat{G}) [[/math]]
where [math]C^*(G)[/math] is the algebra of functions [math]\varphi:G\to\mathbb C[/math], with convolution product, and [math]C(\widehat{G})[/math] is the algebra of functions [math]\varphi:\widehat{G}\to\mathbb C[/math], with usual product.


Show Proof

There are several things going on here, the idea being as follows:


(1) Given a finite abelian group [math]G[/math], we can talk about the complex algebra [math]C(G)[/math] formed by the complex functions [math]\varphi:G\to\mathbb C[/math], with usual product, namely:

[[math]] (\varphi\psi)(g)=\varphi(g)\psi(g) [[/math]]


Observe that we have [math]C(G)\simeq\mathbb C^N[/math] as an algebra, where [math]N=|G|[/math], with this being best seen via the basis of [math]C(G)[/math] formed by the Dirac masses at the points of [math]G[/math]:

[[math]] C(G)=\left\{\sum_{g\in G}\lambda_g\delta_g\Big|\lambda_g\in\mathbb C\right\} [[/math]]


(2) On the other hand, we can talk as well about the algebra [math]C^*(G)[/math] formed by the same functions [math]\varphi:G\to\mathbb C[/math], but this time with the convolution product, namely:

[[math]] (\varphi*\psi)(g)=\sum_{h\in G}\varphi(gh^{-1})\psi(h) [[/math]]


Since we have [math]\delta_k*\delta_l=\delta_{kl}[/math] for any [math]k,l\in G[/math], as you can easily check by using the above formula, the Dirac masses [math]\delta_g\in C^*(G)[/math] behave like the group elements [math]g\in G[/math]. Thus, we can view our algebra as follows, with multiplication given by [math]g\cdot h=gh[/math], and linearity:

[[math]] C^*(G)=\left\{\sum_{g\in G}\lambda_gg\Big|\lambda_g\in\mathbb C\right\} [[/math]]



(3) Now that we know what the statement is about, let us go for the proof. The first observation is that we have a morphism of algebras as follows:

[[math]] C^*(G)\to C(\widehat{G})\quad,\quad g\to\left[\chi\to\chi(g)\right] [[/math]]


Now since on both sides we have vector spaces of dimension [math]N=|G|[/math], it is enough to check that this morphism is injective. But this is best done via Theorem 7.23, which shows that the characters [math]\chi\in\widehat{G}[/math] separate the points [math]g\in G[/math], as desired.
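
For [math]G=\mathbb Z_N[/math], the isomorphism [math]C^*(G)\simeq C(\widehat{G})[/math] is implemented by the discrete Fourier transform, which converts the convolution product into the pointwise product. A quick numerical check, in Python with NumPy, with random test vectors:

```python
import numpy as np

N = 8
rng = np.random.default_rng(0)
a = rng.standard_normal(N)
b = rng.standard_normal(N)

# convolution product in C*(Z_N)
conv = np.array([sum(a[(g - h) % N] * b[h] for h in range(N))
                 for g in range(N)])

# DFT of the convolution = pointwise product of the DFTs
assert np.allclose(np.fft.fft(conv), np.fft.fft(a) * np.fft.fft(b))
```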

We can feel that Theorem 7.24 is related to Fourier analysis, and we have:

\begin{fact}
The following happen, regarding the locally compact abelian groups:

  • What we did in the finite case, namely group characters, and construction and basic properties of the dual, can be extended to them.
  • As basic examples of this, besides what we have in the finite case, and notably [math]\widehat{\mathbb Z}_N=\mathbb Z_N[/math], we have [math]\widehat{\mathbb Z}=\mathbb T[/math], [math]\widehat{\mathbb T}=\mathbb Z[/math], and also [math]\widehat{\mathbb R}=\mathbb R[/math].
  • With some care for analytic aspects, [math]C^*(G)\simeq C(\widehat{G})[/math] remains true in this setting, and in the case [math]G=\mathbb R[/math], this isomorphism is the Fourier transform.

\end{fact}

Obviously, all this is a bit heavy, but you get the point, we have 3 types of Fourier analysis in life, namely the “standard” one that we previously learned in this chapter, corresponding to [math]G=\mathbb R[/math], then another one that we skipped, and that we encourage you to learn, called the “Fourier series” one, corresponding to [math]G=\mathbb Z,\mathbb T[/math], and finally the “discrete” one that we started to learn, over [math]G=\mathbb Z_N[/math] and other finite abelian groups.


In practice, all this is a bit complicated, and back now to the finite abelian groups, let us work out a softer version of all the above, which is what is really needed, in practice, when doing discrete Fourier analysis. For [math]G=\mathbb Z_N[/math], what we need is:

Definition

The Fourier matrix [math]F_N[/math] is the following matrix, with [math]w=e^{2\pi i/N}[/math]:

[[math]] F_N= \begin{pmatrix} 1&1&1&\ldots&1\\ 1&w&w^2&\ldots&w^{N-1}\\ 1&w^2&w^4&\ldots&w^{2(N-1)}\\ \vdots&\vdots&\vdots&&\vdots\\ 1&w^{N-1}&w^{2(N-1)}&\ldots&w^{(N-1)^2} \end{pmatrix} [[/math]]
That is, [math]F_N=(w^{ij})_{ij}[/math], with indices [math]i,j\in\{0,1,\ldots,N-1\}[/math], taken modulo [math]N[/math].

Observe that this matrix is Hadamard, in the sense that its entries are on the unit circle, and the rows are pairwise orthogonal. In fact, in general, we have:
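
These two properties of [math]F_N[/math] are easy to verify numerically. A Python sketch, assuming NumPy, with [math]N=5[/math] as an illustration:

```python
import numpy as np

N = 5
idx = np.arange(N)
w = np.exp(2j * np.pi / N)
F = w ** np.outer(idx, idx)   # F_N = (w^{ij}), with i,j = 0,...,N-1

# entries on the unit circle
assert np.allclose(np.abs(F), 1)

# rows pairwise orthogonal: F_N F_N^* = N Id
assert np.allclose(F @ F.conj().T, N * np.eye(N))
```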

Theorem

Given a finite abelian group [math]G[/math], with dual group [math]\widehat{G}=\{\chi:G\to\mathbb T\}[/math], consider the corresponding Fourier coupling, namely:

[[math]] \mathcal F_G:G\times\widehat{G}\to\mathbb T\quad,\quad (i,\chi)\to\chi(i) [[/math]]

  • Via the standard isomorphism [math]G\simeq\widehat{G}[/math], this Fourier coupling can be regarded as a square matrix, [math]F_G\in M_G(\mathbb T)[/math], which is a complex Hadamard matrix.
  • In the case of the cyclic group [math]G=\mathbb Z_N[/math] we obtain in this way, via the standard identification [math]\mathbb Z_N=\{1,\ldots,N\}[/math], the Fourier matrix [math]F_N[/math].
  • In general, when using a decomposition [math]G=\mathbb Z_{N_1}\times\ldots\times\mathbb Z_{N_k}[/math], the corresponding Fourier matrix is given by [math]F_G=F_{N_1}\otimes\ldots\otimes F_{N_k}[/math].


Show Proof

This follows indeed by using the above finite abelian group theory:


(1) With the identification [math]G\simeq\widehat{G}[/math] made, our matrix is given by [math](F_G)_{i\chi}=\chi(i)[/math], and the scalar products between the rows are computed as follows:

[[math]] \lt R_i,R_j \gt =\sum_\chi\chi(i)\overline{\chi(j)} =\sum_\chi\chi(i-j) =|G|\cdot\delta_{ij} [[/math]]


Thus, we obtain indeed a complex Hadamard matrix.


(2) This follows from the well-known and elementary fact that, via the identifications [math]\mathbb Z_N=\widehat{\mathbb Z_N}=\{1,\ldots,N\}[/math], the Fourier coupling here is as follows, with [math]w=e^{2\pi i/N}[/math]:

[[math]] (i,j)\to w^{ij} [[/math]]


(3) We use here the following formula that we know, for the duals of products:

[[math]] \widehat{H\times K}=\widehat{H}\times\widehat{K} [[/math]]


At the level of the corresponding Fourier couplings, we obtain from this:

[[math]] F_{H\times K}=F_H\otimes F_K [[/math]]


Now by decomposing [math]G[/math] into cyclic groups, as in the statement, and by using (2) for the cyclic components, we obtain the formula in the statement.
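
The tensor product formula [math]F_G=F_{N_1}\otimes\ldots\otimes F_{N_k}[/math] can be checked with np.kron, which implements the Kronecker, or tensor, product of matrices. A Python sketch, assuming NumPy, for [math]G=\mathbb Z_2\times\mathbb Z_3[/math]:

```python
import numpy as np

def fourier_matrix(N):
    idx = np.arange(N)
    return np.exp(2j * np.pi * np.outer(idx, idx) / N)

# Fourier matrix of G = Z_2 x Z_3, as a Kronecker (tensor) product
F = np.kron(fourier_matrix(2), fourier_matrix(3))

# still a complex Hadamard matrix, now of size 6
assert np.allclose(np.abs(F), 1)
assert np.allclose(F @ F.conj().T, 6 * np.eye(6))
```

Since [math]\mathbb Z_2\times\mathbb Z_3\simeq\mathbb Z_6[/math], this matrix agrees with [math]F_6[/math], up to a relabeling of indices, coming from the Chinese remainder theorem.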

As a nice application of discrete Fourier analysis, we have:

Theorem

For a matrix [math]M\in M_N(\mathbb C)[/math], the following are equivalent:

  • [math]M[/math] is circulant, [math]M_{ij}=\xi_{j-i}[/math], for a certain vector [math]\xi\in\mathbb C^N[/math].
  • [math]M[/math] is Fourier-diagonal, [math]M=F_NQF_N^*[/math], for a certain diagonal matrix [math]Q[/math].

Moreover, if these conditions hold, then [math]\xi=F_N^*q[/math], where [math]q=(Q_{11},\ldots,Q_{NN})[/math].


Show Proof

This follows from some computations with roots of unity, as follows:


[math](1)\implies(2)[/math] Assuming [math]M_{ij}=\xi_{j-i}[/math], the matrix [math]Q=F_N^*MF_N[/math] is indeed diagonal, as shown by the following computation:

[[math]] \begin{eqnarray*} Q_{ij} &=&\sum_{kl}w^{jl-ik}\xi_{l-k}\\ &=&\sum_{kr}w^{j(k+r)-ik}\xi_r\\ &=&N\delta_{ij}\sum_rw^{jr}\xi_r \end{eqnarray*} [[/math]]


[math](2)\implies(1)[/math] Assuming [math]Q=diag(q_1,\ldots,q_N)[/math], the matrix [math]M=F_NQF_N^*[/math] is indeed circulant, as shown by the following computation:

[[math]] M_{ij} =\sum_kw^{ik}Q_{kk}w^{-jk} =\sum_kw^{(i-j)k}q_k [[/math]]


Indeed, since the last term depends only on [math]j-i[/math], we have [math]M_{ij}=\xi_{j-i}[/math], with [math]\xi_i=\sum_kw^{-ik}q_k=(F_N^*q)_i[/math]. Thus, we are led to the conclusions in the statement.
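
Numerically, the diagonalization of circulant matrices looks as follows, in Python with NumPy; note the [math]1/N[/math] normalizations in our sketch, coming from [math]F_NF_N^*=N\cdot Id[/math]:

```python
import numpy as np

N = 4
idx = np.arange(N)
w = np.exp(2j * np.pi / N)
F = w ** np.outer(idx, idx)   # Fourier matrix F_N

rng = np.random.default_rng(1)
xi = rng.standard_normal(N)

# circulant matrix M_ij = xi_{j-i}, with indices mod N
M = np.array([[xi[(j - i) % N] for j in range(N)] for i in range(N)])

# F_N^* M F_N is diagonal, once divided by N
Q = F.conj().T @ M @ F / N
assert np.allclose(Q, np.diag(np.diag(Q)))

# its diagonal entries are sum_r w^{kr} xi_r
assert np.allclose(np.diag(Q), F @ xi)

# and M is recovered as M = F_N diag(q) F_N^*, with q = F xi / N
assert np.allclose(M, F @ np.diag(F @ xi / N) @ F.conj().T)
```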

As an illustration for the above result, the all-one matrix diagonalizes as follows:

[[math]] \begin{pmatrix} 1&\ldots&\ldots&1\\ \vdots&&&\vdots\\ \vdots&&&\vdots\\ 1&\ldots&\ldots&1\end{pmatrix} =\frac{1}{N}\,F_N \begin{pmatrix} N\\ &0\\ &&\ddots\\ &&&0\end{pmatrix}F_N^* [[/math]]


But you might know this already, since this was a recurrent exercise, in this book.

7d. Independence, limits

With the above theory in hand, let us go back to probability. Our claim is that things are very interesting here, with the real-life notion of independence corresponding to our mathematical notion of convolution, and with the Fourier transform being something very useful in probability, in order to understand the independence. Let us start with:

Definition

Two variables [math]f,g\in L^\infty(X)[/math] are called independent when

[[math]] E(f^kg^l)=E(f^k)\, E(g^l) [[/math]]
happens, for any [math]k,l\in\mathbb N[/math].

This definition hides some non-trivial things. Indeed, by linearity, we would like to have a formula as follows, valid for any polynomials [math]P,Q\in\mathbb R[X][/math]:

[[math]] E[P(f)Q(g)]=E[P(f)]\,E[Q(g)] [[/math]]


By a continuity argument, it is enough to have this formula for the characteristic functions [math]\chi_I,\chi_J[/math] of arbitrary measurable sets of real numbers [math]I,J\subset\mathbb R[/math]:

[[math]] E[\chi_I(f)\chi_J(g)]=E[\chi_I(f)]\,E[\chi_J(g)] [[/math]]


Thus, we are led to the usual definition of independence, namely:

[[math]] P(f\in I,g\in J)=P(f\in I)\,P(g\in J) [[/math]]


All this might seem a bit abstract, but in practice, the idea is of course that [math]f,g[/math] must be independent, in an intuitive, real-life sense. As a first result now, we have:

Theorem

Assuming that [math]f,g\in L^\infty(X)[/math] are independent, we have

[[math]] \mu_{f+g}=\mu_f*\mu_g [[/math]]
where [math]*[/math] is the convolution of real probability measures.


Show Proof

We have the following computation, using the independence of [math]f,g[/math]:

[[math]] \begin{eqnarray*} M_k(f+g) &=&E((f+g)^k)\\ &=&\sum_r\binom{k}{r}E(f^rg^{k-r})\\ &=&\sum_r\binom{k}{r}M_r(f)M_{k-r}(g) \end{eqnarray*} [[/math]]


On the other hand, we have as well the following computation:

[[math]] \begin{eqnarray*} \int_\mathbb Rx^kd(\mu_f*\mu_g)(x) &=&\int_{\mathbb R\times\mathbb R}(x+y)^kd\mu_f(x)d\mu_g(y)\\ &=&\sum_r\binom{k}{r}\int_\mathbb Rx^rd\mu_f(x)\int_\mathbb Ry^{k-r}d\mu_g(y)\\ &=&\sum_r\binom{k}{r}M_r(f)M_{k-r}(g) \end{eqnarray*} [[/math]]


Thus [math]\mu_{f+g}[/math] and [math]\mu_f*\mu_g[/math] have the same moments, and so they coincide, as claimed.
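As an elementary illustration, not from the text, consider two independent fair dice. The law of their sum can be computed either by convolving the two uniform laws, as in the theorem above, or by brute-force enumeration of all outcomes; the following sketch checks that the two agree:

```python
import numpy as np

# Two independent fair dice, each uniform on {1, ..., 6}
p = np.ones(6) / 6                 # density of one die

# mu_{f+g} = mu_f * mu_g: the law of the sum is the convolution
p_sum = np.convolve(p, p)          # density of f + g, on the values 2, ..., 12

# Brute-force check, by enumerating all 36 outcomes
counts = np.zeros(11)
for a in range(1, 7):
    for b in range(1, 7):
        counts[a + b - 2] += 1     # outcome a + b lands at index a + b - 2
assert np.allclose(p_sum, counts / 36)
```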

Here is now a second result on independence, which is something more advanced:

Theorem

Assuming that [math]f,g\in L^\infty(X)[/math] are independent, we have

[[math]] F_{f+g}=F_fF_g [[/math]]
where [math]F_f(x)=E(e^{ixf})[/math] is the Fourier transform.


Show Proof

We have the following computation, by using Theorem 7.30:

[[math]] \begin{eqnarray*} F_{f+g}(x) &=&\int_\mathbb Re^{ixz}d\mu_{f+g}(z)\\ &=&\int_\mathbb Re^{ixz}d(\mu_f*\mu_g)(z)\\ &=&\int_{\mathbb R\times\mathbb R}e^{ix(z+t)}d\mu_f(z)d\mu_g(t)\\ &=&\int_\mathbb Re^{ixz}d\mu_f(z)\int_\mathbb Re^{ixt}d\mu_g(t)\\ &=&F_f(x)F_g(x) \end{eqnarray*} [[/math]]


Thus, we are led to the conclusion in the statement.
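Continuing with the dice illustration, the multiplicativity [math]F_{f+g}=F_fF_g[/math] can be checked numerically, with the Fourier transform of a discrete law computed directly from its definition:

```python
import numpy as np

def char_fn(values, pmf, x):
    """Fourier transform F(x) = E(e^{ixf}) of a discrete law."""
    return np.sum(pmf * np.exp(1j * x * values))

vals = np.arange(1, 7)            # one fair die, uniform on {1, ..., 6}
p = np.ones(6) / 6
sum_vals = np.arange(2, 13)       # values of the sum of two dice
p_sum = np.convolve(p, p)         # law of the sum of two independent dice

# F_{f+g} = F_f F_g, here with f, g identically distributed
for x in [0.0, 0.7, 1.3, -2.1]:
    assert np.isclose(char_fn(sum_vals, p_sum, x), char_fn(vals, p, x) ** 2)
```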

In order to work out some basic illustrations, let us go back to the Poisson laws [math]p_t[/math], from chapter 4. We can now say more about them, first with the following result:

Theorem

We have the following formula, for any [math]s,t \gt 0[/math],

[[math]] p_s*p_t=p_{s+t} [[/math]]
so the Poisson laws form a convolution semigroup.


Show Proof

By using [math]\delta_k*\delta_l=\delta_{k+l}[/math] and the binomial formula, we obtain:

[[math]] \begin{eqnarray*} p_s*p_t &=&e^{-s}\sum_k\frac{s^k}{k!}\,\delta_k*e^{-t}\sum_l\frac{t^l}{l!}\,\delta_l\\ &=&e^{-s-t}\sum_n\delta_n\sum_{k+l=n}\frac{s^kt^l}{k!l!}\\ &=&e^{-s-t}\sum_n\frac{\delta_n}{n!}\sum_{k+l=n}\frac{n!}{k!l!}s^kt^l\\ &=&e^{-s-t}\sum_n\frac{(s+t)^n}{n!}\,\delta_n\\ &=&p_{s+t} \end{eqnarray*} [[/math]]


Thus, we are led to the conclusion in the statement.
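This semigroup property can be verified numerically, as a sketch, by convolving truncated Poisson densities. The truncation is harmless for the first [math]K+1[/math] atoms, since the convolution there only involves atoms of index at most [math]K[/math]:

```python
import numpy as np
from math import exp, factorial

def poisson_pmf(t, K):
    """Density of the Poisson law p_t on the atoms 0, 1, ..., K."""
    return np.array([exp(-t) * t**k / factorial(k) for k in range(K + 1)])

s, t, K = 1.5, 2.5, 40
conv = np.convolve(poisson_pmf(s, K), poisson_pmf(t, K))

# The first K+1 entries of the convolution are exact, and match p_{s+t}
assert np.allclose(conv[:K + 1], poisson_pmf(s + t, K))
```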

Next in line, we have the following result, which is fundamental as well:

Theorem

The Poisson laws appear as formal exponentials

[[math]] p_t=\sum_k\frac{t^k(\delta_1-\delta_0)^{*k}}{k!} [[/math]]
with respect to the convolution of measures [math]*[/math].


Show Proof

By using the binomial formula, the measure on the right is:

[[math]] \begin{eqnarray*} \mu &=&\sum_k\frac{t^k}{k!}\sum_{r+s=k}(-1)^s\frac{k!}{r!s!}\delta_r\\ &=&\sum_kt^k\sum_{r+s=k}(-1)^s\frac{\delta_r}{r!s!}\\ &=&\sum_r\frac{t^r\delta_r}{r!}\sum_s\frac{(-t)^s}{s!}\\ &=&e^{-t}\sum_r\frac{t^r\delta_r}{r!}\\ &=&p_t \end{eqnarray*} [[/math]]


Thus, we are led to the conclusion in the statement.
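The formal exponential can be checked numerically as well, by representing signed measures on [math]\{0,1,2,\ldots\}[/math] as coefficient arrays, with the convolution [math]*[/math] implemented by np.convolve, and with the exponential series truncated at some large order:

```python
import numpy as np
from math import exp, factorial

t, K = 1.0, 40
delta = np.array([-1.0, 1.0])        # delta_1 - delta_0, as a signed measure on {0, 1}

term = np.array([1.0])               # (delta_1 - delta_0)^{*0} = delta_0
total = np.zeros(K + 1)
total[0] += term[0]                  # k = 0 term of the exponential series
for k in range(1, K + 1):
    term = np.convolve(term, delta)  # k-th convolution power
    total[:k + 1] += t**k / factorial(k) * term

# The truncated series matches the Poisson density p_t, on the first atoms
p_t = np.array([exp(-t) * t**n / factorial(n) for n in range(K + 1)])
assert np.allclose(total[:10], p_t[:10])
```

Here the tail of the series beyond [math]k=K[/math] is negligible, its entries being dominated by [math](2t)^k/k![/math].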

Regarding now the Fourier transform computation, this is as follows:

Theorem

The Fourier transform of [math]p_t[/math] is given by

[[math]] F_{p_t}(y)=\exp\left((e^{iy}-1)t\right) [[/math]]
for any [math]t \gt 0[/math].


Show Proof

We have indeed the following computation:

[[math]] \begin{eqnarray*} F_{p_t}(y) &=&e^{-t}\sum_k\frac{t^k}{k!}F_{\delta_k}(y)\\ &=&e^{-t}\sum_k\frac{t^k}{k!}\,e^{iky}\\ &=&e^{-t}\sum_k\frac{(e^{iy}t)^k}{k!}\\ &=&\exp(-t)\exp(e^{iy}t)\\ &=&\exp\left((e^{iy}-1)t\right) \end{eqnarray*} [[/math]]


Thus, we obtain the formula in the statement.
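Here is a quick numerical check of this formula, with the Fourier transform of [math]p_t[/math] computed as a truncated sum over the atoms of the Poisson density:

```python
import numpy as np
from math import exp, factorial

t, K = 2.0, 60
k = np.arange(K + 1)
pmf = np.array([exp(-t) * t**j / factorial(j) for j in range(K + 1)])

# F_{p_t}(y) = exp((e^{iy} - 1) t), checked at a few sample points
for y in [0.0, 0.5, 1.7, -3.0]:
    lhs = np.sum(pmf * np.exp(1j * k * y))    # truncated sum defining F_{p_t}(y)
    rhs = np.exp((np.exp(1j * y) - 1) * t)
    assert np.isclose(lhs, rhs)
```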

Observe that the above formula gives an alternative proof for Theorem 7.32, by using the fact that the logarithm of the Fourier transform linearizes the convolution. As another application, we can now establish the Poisson Limit Theorem, as follows:

Theorem (PLT)

We have the following convergence, in moments,

[[math]] \left(\left(1-\frac{t}{n}\right)\delta_0+\frac{t}{n}\,\delta_1\right)^{*n}\to p_t [[/math]]
for any [math]t \gt 0[/math].


Show Proof

Let us denote by [math]\nu_n[/math] the measure under the convolution sign, namely:

[[math]] \nu_n=\left(1-\frac{t}{n}\right)\delta_0+\frac{t}{n}\,\delta_1 [[/math]]


We have the following computation, for the Fourier transform of the limit:

[[math]] \begin{eqnarray*} F_{\delta_r}(y)=e^{iry} &\implies&F_{\nu_n}(y)=\left(1-\frac{t}{n}\right)+\frac{t}{n}\,e^{iy}\\ &\implies&F_{\nu_n^{*n}}(y)=\left(\left(1-\frac{t}{n}\right)+\frac{t}{n}\,e^{iy}\right)^n\\ &\implies&F_{\nu_n^{*n}}(y)=\left(1+\frac{(e^{iy}-1)t}{n}\right)^n\\ &\implies&F_{\nu_n^{*n}}(y)\to\exp\left((e^{iy}-1)t\right) \end{eqnarray*} [[/math]]


Thus, we obtain indeed the Fourier transform of [math]p_t[/math], as desired.
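Finally, the PLT itself can be illustrated numerically: the [math]n[/math]-fold convolution power in the statement is the Binomial[math](n,t/n)[/math] law, and for [math]n[/math] large its density is already close to that of [math]p_t[/math]:

```python
import numpy as np
from math import exp, factorial

t, n = 1.0, 1000
nu = np.array([1 - t / n, t / n])    # nu_n = (1 - t/n) delta_0 + (t/n) delta_1

pmf = np.array([1.0])
for _ in range(n):                   # n-fold convolution power nu_n^{*n}
    pmf = np.convolve(pmf, nu)       # this is the Binomial(n, t/n) law

# Compare with the Poisson density p_t on the first few atoms
p_t = np.array([exp(-t) * t**k / factorial(k) for k in range(10)])
err = np.max(np.abs(pmf[:10] - p_t))
assert err < 1e-2
```

The error here is of order [math]t^2/n[/math], so it shrinks as [math]n[/math] grows, in accordance with the theorem.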

We have accumulated so far a lot of probability knowledge, going along with our learning of calculus, in chapter 4, chapter 6, and in the present chapter. All this certainly needs some systematic discussion, and we will come back to it once we have the full calculus tools needed, in chapter 14, which is dedicated to probability.

General references

Banica, Teo (2024). "Calculus and applications". arXiv:2401.00911 [math.CO].