Normal variables


14a. Probability basics

Good news, with the calculus that we know, we can have a crash course in advanced probability. We have already met some probability in this book, scattered at the end of chapters 4, 6, 7, and we will first review that material. Then, helped by the Gauss integral formula [math]\int_\mathbb R e^{-x^2}dx=\sqrt{\pi}[/math] that we just learned, and by multivariable integration in general, we will develop the theory of normal variables, real and complex, which is central to advanced probability. And then, there will certainly be more, all interesting things.


Sounds exciting, doesn't it. Cat however seems unfazed, and declares: “Probability is the same thing as measure theory. Read Rudin.” Damn cat, looks like we disagree more and more as time goes by, especially on these multivariable integration topics. So, let me ask you cat, how many times did you come upon a function which is not integrable? How many times did you fail to catch a mouse, due to a failure of Fubini? What about catching birds, does the Zorn lemma really help there? Things are nice and smooth in life, or at least that's my belief. To which cat replies: “Yes for mice and birds, but electrons can be quite tricky.” Humm, good point, but let's leave electrons for later, for chapter 16. So, forgetting now about philosophy and pedagogy matters, with the dear reader free to have their own opinion here, and why not agree with cat, let us go ahead with our plan, and first review the few things that we know about probability, from chapters 4, 6, 7.


There will be of course a bit of redundancy, but it is always good to talk again about these things, with a bit fewer details this time. As a starting point, we have:

Definition

Let [math]X[/math] be a probability space, that is, a space with a probability measure, and with the corresponding integration denoted [math]E[/math], and called expectation.

  • The random variables are the real functions [math]f\in L^\infty(X)[/math].
  • The moments of such a variable are the numbers [math]M_k(f)=E(f^k)[/math].
  • The law of such a variable is the measure given by [math]M_k(f)=\int_\mathbb Rx^kd\mu_f(x)[/math].

Here, as explained in chapter 7, the fact that [math]\mu_f[/math] as above indeed exists is not exactly trivial. But we can do this by looking at formulae of the following type:

[[math]] E(\varphi(f))=\int_\mathbb R\varphi(x)d\mu_f(x) [[/math]]


Indeed, having this for monomials [math]\varphi(x)=x^n[/math], as above, is the same as having it for polynomials [math]\varphi\in\mathbb R[X][/math], which in turn is the same as having it for the characteristic functions [math]\varphi=\chi_I[/math] of measurable sets [math]I\subset\mathbb R[/math]. Thus, in the end, what we need is:

[[math]] P(f\in I)=\mu_f(I) [[/math]]


But this formula can serve as a definition for [math]\mu_f[/math], and we are done.


Regarding now independence, as explained in chapter 7, we can formulate here:

Definition

Two variables [math]f,g\in L^\infty(X)[/math] are called independent when

[[math]] E(f^kg^l)=E(f^k)\,E(g^l) [[/math]]
happens, for any [math]k,l\in\mathbb N[/math].

Again, this definition hides some non-trivial things, the idea being a bit as before, namely that of looking at formulae of the following type:

[[math]] E[\varphi(f)\psi(g)]=E[\varphi(f)]\,E[\psi(g)] [[/math]]


To be more precise, passing as before from monomials to polynomials, then to characteristic functions, we are led to the usual definition of independence, namely:

[[math]] P(f\in I,g\in J)=P(f\in I)\,P(g\in J) [[/math]]


As a first result now, that we know well from chapter 7, we have:

Theorem

Assuming that [math]f,g\in L^\infty(X)[/math] are independent, we have

[[math]] \mu_{f+g}=\mu_f*\mu_g [[/math]]
where [math]*[/math] is the convolution of real probability measures.


Show Proof

We have the following computation, using the independence of [math]f,g[/math]:

[[math]] \int_\mathbb Rx^kd\mu_{f+g}(x) =E((f+g)^k) =\sum_r\binom{k}{r}M_r(f)M_{k-r}(g) [[/math]]


On the other hand, we have as well the following computation:

[[math]] \begin{eqnarray*} \int_\mathbb Rx^kd(\mu_f*\mu_g)(x) &=&\int_{\mathbb R\times\mathbb R}(x+y)^kd\mu_f(x)d\mu_g(y)\\ &=&\sum_r\binom{k}{r}M_r(f)M_{k-r}(g) \end{eqnarray*} [[/math]]


Thus [math]\mu_{f+g}[/math] and [math]\mu_f*\mu_g[/math] have the same moments, so they coincide, as claimed.
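
Before moving on, here is a quick numerical illustration of the above theorem, as a small sketch in Python with NumPy, of course not needed for what follows. We take two hypothetical independent discrete variables, compute the moments of their sum on the product space, and compare with the moments of the convolution of their laws:

```python
import numpy as np

# Hypothetical toy variables: f uniform on {0,1}, g uniform on {0,1,2}, independent.
vals_f, p_f = np.array([0, 1]), np.array([0.5, 0.5])
vals_g, p_g = np.array([0, 1, 2]), np.array([1/3, 1/3, 1/3])

# Moments of f+g, computed directly on the product space (this uses independence).
def moment_of_sum(k):
    return sum(pf * pg * (x + y) ** k
               for x, pf in zip(vals_f, p_f)
               for y, pg in zip(vals_g, p_g))

# Moments of the convolution mu_f * mu_g, whose weights live on {0,1,2,3}.
conv_p = np.convolve(p_f, p_g)
conv_vals = np.arange(len(conv_p))

for k in range(5):
    print(k, moment_of_sum(k), float(np.sum(conv_p * conv_vals ** k)))  # equal columns
```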

As a second result on independence, which is more advanced, we have:

Theorem

Assuming that [math]f,g\in L^\infty(X)[/math] are independent, we have

[[math]] F_{f+g}=F_fF_g [[/math]]
where [math]F_f(x)=E(e^{ixf})[/math] is the Fourier transform.


Show Proof

This is something that we know too from chapter 7, coming from:

[[math]] \begin{eqnarray*} F_{f+g}(x) &=&\int_\mathbb Re^{ixz}d(\mu_f*\mu_g)(z)\\ &=&\int_{\mathbb R\times\mathbb R}e^{ix(z+t)}d\mu_f(z)d\mu_g(t)\\ &=&\int_\mathbb Re^{ixz}d\mu_f(z)\int_\mathbb Re^{ixt}d\mu_g(t)\\ &=&F_f(x)F_g(x) \end{eqnarray*} [[/math]]


Thus, we are led to the conclusion in the statement.

So long for laws, moments and independence, at the theoretical level. In what follows Definitions 14.3 and 14.4 and Theorems 14.5 and 14.6 are what we need, I mean we will certainly get away with that, as general theory, and this no matter what cat says.


However, still staying general, we will need as well, from time to time, some key complex analysis results that we established in chapter 6. First, we have:

Theorem

The density of a real probability measure [math]\mu[/math] can be recaptured from the sequence of moments [math]\{M_k\}_{k\geq0}[/math] via the Stieltjes inversion formula

[[math]] d\mu (x)=\lim_{t\searrow 0}-\frac{1}{\pi}\,Im\left(G(x+it)\right)\cdot dx [[/math]]
where the function on the right, given in terms of moments by

[[math]] G(\xi)=\xi^{-1}+M_1\xi^{-2}+M_2\xi^{-3}+\ldots [[/math]]
is the Cauchy transform of the measure [math]\mu[/math].


Show Proof

This is something quite subtle and heavy, and for the full proof, along with some basic applications, we refer to chapter 6, the idea being as follows:


(1) Regarding the proof, the Cauchy transform of our measure [math]\mu[/math] is given by:

[[math]] G(\xi) =\xi^{-1}\sum_{k=0}^\infty M_k\xi^{-k} =\int_\mathbb R\frac{1}{\xi-y}\,d\mu(y) [[/math]]


Now with [math]\xi=x+it[/math], we obtain from this the following formula:

[[math]] \begin{eqnarray*} Im(G(x+it)) &=&\int_\mathbb RIm\left(\frac{1}{x-y+it}\right)d\mu(y)\\ &=&-\int_\mathbb R\frac{t}{(x-y)^2+t^2}\,d\mu(y) \end{eqnarray*} [[/math]]


By integrating over [math][a,b][/math] we obtain, with the change of variables [math]x=y+tz[/math]:

[[math]] \begin{eqnarray*} \int_a^bIm(G(x+it))dx &=&-\int_\mathbb R\int_{(a-y)/t}^{(b-y)/t}\frac{t}{(tz)^2+t^2}\,t\,dz\,d\mu(y)\\ &=&-\int_\mathbb R\int_{(a-y)/t}^{(b-y)/t}\frac{1}{1+z^2}\,dz\,d\mu(y)\\ &=&-\int_\mathbb R\left(\arctan\frac{b-y}{t}-\arctan\frac{a-y}{t}\right)d\mu(y) \end{eqnarray*} [[/math]]


(2) The point now is that with [math]t\searrow0[/math] we have the following estimates:

[[math]] \lim_{t\searrow0}\left(\arctan\frac{b-y}{t}-\arctan\frac{a-y}{t}\right) =\begin{cases} \frac{\pi}{2}-\frac{\pi}{2}=0& (y \lt a)\\ \frac{\pi}{2}-0=\frac{\pi}{2}& (y=a)\\ \frac{\pi}{2}-(-\frac{\pi}{2})=\pi& (a \lt y \lt b)\\ 0-(-\frac{\pi}{2})=\frac{\pi}{2}& (y=b)\\ -\frac{\pi}{2}-(-\frac{\pi}{2})=0& (y \gt b) \end{cases} [[/math]]


We therefore obtain the following formula, which proves our result:

[[math]] \lim_{t\searrow0}\int_a^bIm(G(x+it))dx=-\pi\left(\mu(a,b)+\frac{\mu(a)+\mu(b)}{2}\right) [[/math]]


(3) As applications, the first thing that we did in chapter 6 was to find the measure having as even moments the Catalan numbers, [math]C_k=\frac{1}{k+1}\binom{2k}{k}[/math], and having all odd moments [math]0[/math]. We found the following measure, called Wigner semicircle law on [math][-2,2][/math]:

[[math]] \gamma_1=\frac{1}{2\pi}\sqrt{4-x^2}dx [[/math]]


(4) We also computed the measure having as moments the Catalan numbers, [math]M_k=C_k[/math]. We found the following measure, called Marchenko-Pastur law on [math][0,4][/math]:

[[math]] \pi_1=\frac{1}{2\pi}\sqrt{4x^{-1}-1}\,dx [[/math]]


(5) Next, we found that the measure having as moments the central binomial coefficients, [math]D_k=\binom{2k}{k}[/math], is the following measure, called arcsine law on [math][0,4][/math]:

[[math]] \alpha_1=\frac{1}{\pi\sqrt{x(4-x)}}\,dx [[/math]]


(6) Finally, for the middle binomial coefficients, [math]E_k=\binom{k}{[k/2]}[/math], we found the following law, called modified arcsine law on [math][-2,2][/math]:

[[math]] \sigma_1=\frac{1}{2\pi}\sqrt{\frac{2+x}{2-x}}\,dx [[/math]]


(7) All this is very nice, and as already mentioned in chapter 6, these measures are those appearing via random walks on basic graphs. In addition, these measures are as well the main laws in Random Matrix Theory (RMT). More on them later.
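
As a numerical sketch of how the Stieltjes inversion formula is used in practice, here is a short Python/NumPy computation recovering the semicircle density from (3) out of its Cauchy transform. The closed form [math]G(\xi)=(\xi-\sqrt{\xi-2}\sqrt{\xi+2})/2[/math] used below is the resummation of the moment series above, with the even moments being the Catalan numbers:

```python
import numpy as np

# Cauchy transform of gamma_1, with principal square roots; the two-factor form
# picks the branch with G(xi) ~ 1/xi at infinity, on the upper half-plane.
def G(xi):
    return (xi - np.sqrt(xi - 2) * np.sqrt(xi + 2)) / 2

x = np.linspace(-1.9, 1.9, 7)
t = 1e-8                                      # the t \searrow 0 limit, numerically
recovered = -np.imag(G(x + 1j * t)) / np.pi   # Stieltjes inversion
exact = np.sqrt(4 - x ** 2) / (2 * np.pi)     # semicircle density
print(np.max(np.abs(recovered - exact)))      # essentially zero
```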

We have as well the following result, also from chapter 6, which can be useful too:

Theorem

A sequence of numbers [math]M_0,M_1,M_2,M_3,\ldots\in\mathbb R[/math], with [math]M_0=1[/math], is the series of moments of a real probability measure [math]\mu[/math] precisely when:

[[math]] \begin{vmatrix}M_0\end{vmatrix}\geq0\quad,\quad \begin{vmatrix} M_0&M_1\\ M_1&M_2 \end{vmatrix}\geq0\quad,\quad \begin{vmatrix} M_0&M_1&M_2\\ M_1&M_2&M_3\\ M_2&M_3&M_4\\ \end{vmatrix}\geq0\quad,\quad \ldots [[/math]]
That is, the associated Hankel determinants must be all positive.


Show Proof

This is something a bit heavier, and as a first observation, the positivity conditions in the statement tell us that the following linear forms must be positive:

[[math]] \sum_{i,j=1}^nc_i\bar{c}_jM_{i+j}\geq0 [[/math]]


But this is something very classical, in one sense the result being elementary, coming from the following computation, which shows that we have positivity indeed:

[[math]] \int_\mathbb R\left|\sum_{i=1}^nc_ix^i\right|^2d\mu(x) =\int_\mathbb R\sum_{i,j=1}^nc_i\bar{c}_jx^{i+j}d\mu(x) =\sum_{i,j=1}^nc_i\bar{c}_jM_{i+j} [[/math]]


As for the other sense, here the result comes once again from the above formula, this time via some standard study, inspired from the positivity results from chapter 12.
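
As an illustration, here is a small Python/NumPy sketch testing the above criterion on the Catalan numbers, which we know to be the moments of the Marchenko-Pastur law [math]\pi_1[/math]. All the Hankel determinants come out positive, and in fact numerically equal to 1:

```python
import numpy as np
from math import comb

M = [comb(2 * k, k) // (k + 1) for k in range(12)]   # Catalan numbers C_0, C_1, ...

for n in range(1, 6):
    H = np.array([[M[i + j] for j in range(n)] for i in range(n)], dtype=float)
    print(n, np.linalg.det(H))   # Hankel determinants, all positive (in fact 1)
```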

All this is very nice, and we have some interesting theory going on. Let us discuss now some illustrations. We will first talk about discrete probability. We have:

Definition

The Poisson law of parameter [math]t \gt 0[/math] is the measure

[[math]] p_t=e^{-t}\sum_{k\in\mathbb N}\frac{t^k}{k!}\,\delta_k [[/math]]
with the letter “p” standing for Poisson.

We have already talked about these laws, first in chapter 4, with an elementary discussion, using binomials and factorials, and then in chapter 7 too. So, let us quickly review what we know. Going directly for the kill, Fourier transform computation, we have:

Theorem

The Fourier transform of [math]p_t[/math] is given by:

[[math]] F_{p_t}(y)=\exp\left((e^{iy}-1)t\right) [[/math]]
In particular we have [math]p_s*p_t=p_{s+t}[/math], called convolution semigroup property.


Show Proof

We have indeed the following computation:

[[math]] \begin{eqnarray*} F_{p_t}(y) &=&e^{-t}\sum_k\frac{t^k}{k!}F_{\delta_k}(y)\\ &=&e^{-t}\sum_k\frac{t^k}{k!}\,e^{iky}\\ &=&e^{-t}\sum_k\frac{(e^{iy}t)^k}{k!}\\ &=&\exp\left((e^{iy}-1)t\right) \end{eqnarray*} [[/math]]


As for the second assertion, this follows from the fact that [math]\log F_{p_t}[/math] is linear in [math]t[/math], via the linearization property for the convolution from Theorem 14.6.
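
Here is a quick numerical check of the convolution semigroup property, as a Python/NumPy sketch, with some hypothetical parameter values, on a truncated support:

```python
import numpy as np
from math import factorial, exp

def poisson_weights(t, K):
    # weights of p_t on the truncated support {0,...,K}
    return np.array([exp(-t) * t ** k / factorial(k) for k in range(K + 1)])

s, t, K = 1.5, 2.0, 40
conv = np.convolve(poisson_weights(s, K), poisson_weights(t, K))[:K + 1]
print(np.max(np.abs(conv - poisson_weights(s + t, K))))   # zero, up to rounding
```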

The above result suggests that the laws [math]p_t[/math] should appear as some sort of exponentials with respect to convolution, and yes indeed, this is the case, as shown by:

Proposition

The Poisson laws appear as formal exponentials

[[math]] p_t=\sum_k\frac{t^k(\delta_1-\delta_0)^{*k}}{k!} [[/math]]
with respect to the convolution of measures [math]*[/math].


Show Proof

By using the binomial formula, the measure on the right is:

[[math]] \begin{eqnarray*} \mu &=&\sum_k\frac{t^k}{k!}\sum_{r+s=k}(-1)^s\frac{k!}{r!s!}\delta_r\\ &=&\sum_r\frac{t^r\delta_r}{r!}\sum_s\frac{(-t)^s}{s!}\\ &=&e^{-t}\sum_r\frac{t^r\delta_r}{r!}\\ &=&p_t \end{eqnarray*} [[/math]]


Thus, we are led to the conclusion in the statement.

As a main result now, we have the Poisson Limit Theorem, as follows:

Theorem (PLT)

We have the following convergence, in moments,

[[math]] \left(\left(1-\frac{t}{n}\right)\delta_0+\frac{t}{n}\delta_1\right)^{*n}\to p_t [[/math]]
for any [math]t \gt 0[/math].


Show Proof

Indeed, if we denote by [math]\nu_n[/math] the measure under the convolution sign, we have the following computation, for the Fourier transform of the limit:

[[math]] \begin{eqnarray*} F_{\delta_r}(y)=e^{iry} &\implies&F_{\nu_n}(y)=\left(1-\frac{t}{n}\right)+\frac{t}{n}e^{iy}\\ &\implies&F_{\nu_n^{*n}}(y)=\left(\left(1-\frac{t}{n}\right)+\frac{t}{n}e^{iy}\right)^n\\ &\implies&F_{\nu_n^{*n}}(y)=\left(1+\frac{(e^{iy}-1)t}{n}\right)^n\\ &\implies&F(y)=\exp\left((e^{iy}-1)t\right) \end{eqnarray*} [[/math]]


Thus, we obtain indeed the Fourier transform of [math]p_t[/math], as desired.
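
As usual, a small numerical sketch cannot hurt. In Python with NumPy, and with hypothetical values of [math]t,n[/math], the [math]n[/math]-fold convolution in the statement, which is a binomial law, is indeed close to [math]p_t[/math]:

```python
import numpy as np
from math import factorial, exp

t, n = 1.0, 500
bern = np.array([1 - t / n, t / n])     # the measure (1-t/n) delta_0 + (t/n) delta_1
law = np.array([1.0])
for _ in range(n):
    law = np.convolve(law, bern)        # n-fold convolution, supported on {0,...,n}

poisson = np.array([exp(-t) * t ** k / factorial(k) for k in range(len(law))])
print(np.max(np.abs(law - poisson)))    # small, and -> 0 as n grows
```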

Finally, one more thing that we know about the Poisson laws is a set of interesting formulae for their moments, from chapter 4. The result there was as follows:

Theorem

The moments of [math]p_1[/math] are the Bell numbers,

[[math]] M_k(p_1)=|P(k)| [[/math]]
where [math]P(k)[/math] is the set of partitions of [math]\{1,\ldots,k\}[/math]. More generally, we have

[[math]] M_k(p_t)=\sum_{\pi\in P(k)}t^{|\pi|} [[/math]]
for any [math]t \gt 0[/math], where [math]|.|[/math] is the number of blocks.


Show Proof

We know that the moments of [math]p_1[/math] are given by the following formula:

[[math]] M_k=\frac{1}{e}\sum_r\frac{r^k}{r!} [[/math]]


We therefore have the following recurrence formula for these moments, obtained via the index shift [math]r\to r+1[/math] in the above sum:

[[math]] \begin{eqnarray*} M_{k+1} &=&\frac{1}{e}\sum_r\frac{r^k}{r!}\left(1+\frac{1}{r}\right)^k\\ &=&\frac{1}{e}\sum_r\frac{r^k}{r!}\sum_s\binom{k}{s}r^{-s}\\ &=&\sum_s\binom{k}{s}M_{k-s} \end{eqnarray*} [[/math]]


But the Bell numbers [math]B_k=|P(k)|[/math] satisfy the same recurrence, so we have [math]M_k=B_k[/math], as claimed. Next, we know that the moments of [math]p_t[/math] with [math]t \gt 0[/math] are given by:

[[math]] N_k=e^{-t}\sum_r\frac{t^rr^k}{r!} [[/math]]


We therefore have the following recurrence formula for these moments, again obtained via the index shift [math]r\to r+1[/math]:

[[math]] \begin{eqnarray*} N_{k+1} &=&e^{-t}\sum_r\frac{t^{r+1}r^k}{r!}\left(1+\frac{1}{r}\right)^k\\ &=&e^{-t}\sum_r\frac{t^{r+1}r^k}{r!}\sum_s\binom{k}{s}r^{-s}\\ &=&t\sum_s\binom{k}{s}N_{k-s} \end{eqnarray*} [[/math]]


But the numbers [math]S_k=\sum_{\pi\in P(k)}t^{|\pi|}[/math] are easily seen to satisfy the same recurrence, with the same initial values, namely [math]t[/math] and [math]t+t^2[/math], so we have [math]N_k=S_k[/math], as claimed.
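
For the skeptical reader, here is a quick Python sketch comparing the two sides at [math]t=1[/math]: the moments of [math]p_1[/math], computed from the series [math]M_k=\frac{1}{e}\sum_r\frac{r^k}{r!}[/math], against the Bell numbers, computed from the recurrence used in the proof:

```python
from math import factorial, exp, comb

def poisson_moment(k, terms=60):
    # M_k = (1/e) * sum_r r^k / r!, truncated to finitely many terms
    return exp(-1) * sum(r ** k / factorial(r) for r in range(terms))

bell = [1]
for k in range(8):
    bell.append(sum(comb(k, s) * bell[k - s] for s in range(k + 1)))

for k in range(9):
    print(k, round(poisson_moment(k), 6), bell[k])   # 1, 1, 2, 5, 15, 52, ...
```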

Summarizing, we know so far what probability theory is, modulo perhaps some annoying foundational details, that do not seem needed for computations. Then we have some powerful tools, namely convolution, Fourier, Stieltjes, and Hankel determinants if needed. Then, as a main achievement, we have a full theory in the main discrete case, that of the Poisson laws. And finally, in the continuous case, we came across a number of interesting laws, namely Wigner, Marchenko-Pastur, arcsine, and modified arcsine.


This is not bad, as a start. In what follows we will first do the necessary, namely mirror what we know about the Poisson laws, with a complete study of the normal laws, which are the central objects in continuous probability. And then, we will go for more.

14b. Normal variables

In order to introduce the normal variables, we will need the Gauss integral formula, established in chapter 13. So, let us recall from there that we have:

Theorem

We have the following formula,

[[math]] \int_\mathbb Re^{-x^2}dx=\sqrt{\pi} [[/math]]
called Gauss integral formula.


Show Proof

Let [math]I[/math] be the above integral. By using polar coordinates, we obtain:

[[math]] \begin{eqnarray*} I^2 &=&\int_\mathbb R\int_\mathbb Re^{-x^2-y^2}dxdy\\ &=&\int_0^{2\pi}\int_0^\infty e^{-r^2}rdrdt\\ &=&2\pi\times\frac{1}{2} \end{eqnarray*} [[/math]]


Thus, we are led to the formula in the statement.

As a main application of the Gauss formula, we can now formulate:

Definition

The normal law of parameter [math]1[/math] is the following measure:

[[math]] g_1=\frac{1}{\sqrt{2\pi}}e^{-x^2/2}dx [[/math]]
More generally, the normal law of parameter [math]t \gt 0[/math] is the following measure:

[[math]] g_t=\frac{1}{\sqrt{2\pi t}}e^{-x^2/2t}dx [[/math]]
These are also called Gaussian distributions, with “g” standing for Gauss.

Observe that the above laws indeed have mass 1, as they should. This follows from the Gauss formula, which gives, with [math]x=\sqrt{2t}\,y[/math]:

[[math]] \begin{eqnarray*} \int_\mathbb R e^{-x^2/2t}dx &=&\int_\mathbb R e^{-y^2}\sqrt{2t}\,dy\\ &=&\sqrt{2t}\int_\mathbb R e^{-y^2}dy\\ &=&\sqrt{2t}\times\sqrt{\pi}\\ &=&\sqrt{2\pi t} \end{eqnarray*} [[/math]]


Generally speaking, the normal laws appear a bit everywhere, in real life. The reasons behind this phenomenon come from the Central Limit Theorem (CLT), that we will explain in a moment, after developing some general theory. As a first result, we have:

Proposition

We have the variance formula

[[math]] V(g_t)=t [[/math]]
valid for any [math]t \gt 0[/math].


Show Proof

The first moment is 0, because our normal law [math]g_t[/math] is centered. As for the second moment, this can be computed as follows:

[[math]] \begin{eqnarray*} M_2 &=&\frac{1}{\sqrt{2\pi t}}\int_\mathbb Rx^2e^{-x^2/2t}dx\\ &=&\frac{1}{\sqrt{2\pi t}}\int_\mathbb R(tx)\left(-e^{-x^2/2t}\right)'dx\\ &=&\frac{1}{\sqrt{2\pi t}}\int_\mathbb Rte^{-x^2/2t}dx\\ &=&t \end{eqnarray*} [[/math]]


We conclude from this that the variance is [math]V=M_2=t[/math].

Here is another result, which is the key one for the study of the normal laws:

Theorem

We have the following formula, valid for any [math]t \gt 0[/math]:

[[math]] F_{g_t}(x)=e^{-tx^2/2} [[/math]]
In particular, the normal laws satisfy [math]g_s*g_t=g_{s+t}[/math], for any [math]s,t \gt 0[/math].


Show Proof

The Fourier transform formula can be established as follows:

[[math]] \begin{eqnarray*} F_{g_t}(x) &=&\frac{1}{\sqrt{2\pi t}}\int_\mathbb Re^{-y^2/2t+ixy}dy\\ &=&\frac{1}{\sqrt{2\pi t}}\int_\mathbb Re^{-(y/\sqrt{2t}-\sqrt{t/2}ix)^2-tx^2/2}dy\\ &=&\frac{1}{\sqrt{2\pi t}}\int_\mathbb Re^{-z^2-tx^2/2}\sqrt{2t}dz\\ &=&\frac{1}{\sqrt{\pi}}e^{-tx^2/2}\int_\mathbb Re^{-z^2}dz\\ &=&\frac{1}{\sqrt{\pi}}e^{-tx^2/2}\cdot\sqrt{\pi}\\ &=&e^{-tx^2/2} \end{eqnarray*} [[/math]]


As for the last assertion, this follows from the fact that [math]\log F_{g_t}[/math] is linear in [math]t[/math].
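
Here is a quick numerical verification of the above Fourier transform formula, as a Python/NumPy sketch, with the integral approximated by a Riemann sum, and with a hypothetical value of [math]t[/math]:

```python
import numpy as np

t = 2.0
y, dy = np.linspace(-30, 30, 200001, retstep=True)
density = np.exp(-y ** 2 / (2 * t)) / np.sqrt(2 * np.pi * t)   # density of g_t

for x in [0.0, 0.5, 1.0, 2.0]:
    F = np.sum(np.exp(1j * x * y) * density) * dy               # F_{g_t}(x), numerically
    print(x, F.real, np.exp(-t * x ** 2 / 2))                   # the two columns agree
```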

We are now ready to state and prove the CLT, as follows:

Theorem (CLT)

Given random variables [math]f_1,f_2,f_3,\ldots\in L^\infty(X)[/math] which are i.i.d., centered, and with variance [math]t \gt 0[/math], we have, with [math]n\to\infty[/math], in moments,

[[math]] \frac{1}{\sqrt{n}}\sum_{i=1}^nf_i\sim g_t [[/math]]
where [math]g_t[/math] is the Gaussian law of parameter [math]t[/math], having as density [math]\frac{1}{\sqrt{2\pi t}}e^{-y^2/2t}dy[/math].


Show Proof

We use the Fourier transform, which is by definition given by:

[[math]] F_f(x)=E(e^{ixf}) [[/math]]


In terms of moments, we have the following formula:

[[math]] \begin{eqnarray*} F_f(x) &=&E\left(\sum_{k=0}^\infty\frac{(ixf)^k}{k!}\right)\\ &=&\sum_{k=0}^\infty\frac{(ix)^kE(f^k)}{k!}\\ &=&\sum_{k=0}^\infty\frac{i^kM_k(f)}{k!}\,x^k \end{eqnarray*} [[/math]]


Thus, the Fourier transform of the variable in the statement is:

[[math]] \begin{eqnarray*} F(x) &=&\left[F_f\left(\frac{x}{\sqrt{n}}\right)\right]^n\\ &=&\left[1-\frac{tx^2}{2n}+O(n^{-3/2})\right]^n\\ &\simeq&\left[1-\frac{tx^2}{2n}\right]^n\\ &\simeq&e^{-tx^2/2} \end{eqnarray*} [[/math]]


But this latter function being the Fourier transform of [math]g_t[/math], we obtain the result.
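
The CLT is of course something that can be watched at work on a computer. Here is a Monte Carlo sketch, in Python with NumPy, using centered uniform variables on [math][-1,1][/math], whose variance is [math]t=1/3[/math], and comparing a few moments of the rescaled sum with the Gaussian moments [math]t^{k/2}\times k!![/math], discussed right below:

```python
import numpy as np

rng = np.random.default_rng(0)
n, samples, t = 100, 50_000, 1 / 3
# each row: f_1 + ... + f_n, rescaled by 1/sqrt(n), with f_i uniform on [-1,1]
sums = rng.uniform(-1, 1, size=(samples, n)).sum(axis=1) / np.sqrt(n)

for k, kdf in [(2, 1), (4, 3), (6, 15)]:               # k!! = 1, 3, 15
    print(k, np.mean(sums ** k), t ** (k / 2) * kdf)   # approximately equal
```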

Let us discuss now some further properties of the normal law. We first have:

Proposition

The even moments of the normal law are the numbers

[[math]] M_k(g_t)=t^{k/2}\times k!! [[/math]]
where [math]k!!=(k-1)(k-3)(k-5)\ldots\,[/math], and the odd moments vanish.


Show Proof

We have the following computation, valid for any integer [math]k\in\mathbb N[/math]:

[[math]] \begin{eqnarray*} M_k &=&\frac{1}{\sqrt{2\pi t}}\int_\mathbb Ry^ke^{-y^2/2t}dy\\ &=&\frac{1}{\sqrt{2\pi t}}\int_\mathbb R(ty^{k-1})\left(-e^{-y^2/2t}\right)'dy\\ &=&\frac{1}{\sqrt{2\pi t}}\int_\mathbb Rt(k-1)y^{k-2}e^{-y^2/2t}dy\\ &=&t(k-1)\times\frac{1}{\sqrt{2\pi t}}\int_\mathbb Ry^{k-2}e^{-y^2/2t}dy\\ &=&t(k-1)M_{k-2} \end{eqnarray*} [[/math]]


Now recall from the proof of Proposition 14.16 that we have [math]M_0=1[/math], [math]M_1=0[/math]. Thus by recurrence, we are led to the formula in the statement.

We have the following alternative formulation of the above result:

Proposition

The moments of the normal law are the numbers

[[math]] M_k(g_t)=t^{k/2}|P_2(k)| [[/math]]
where [math]P_2(k)[/math] is the set of pairings of [math]\{1,\ldots,k\}[/math].


Show Proof

Let us count the pairings of [math]\{1,\ldots,k\}[/math]. In order to have such a pairing, we must pair [math]1[/math] with one of the numbers [math]2,\ldots,k[/math], and then use a pairing of the remaining [math]k-2[/math] numbers. Thus, we have the following recurrence formula:

[[math]] |P_2(k)|=(k-1)|P_2(k-2)| [[/math]]


As for the initial data, this is [math]|P_2(1)|=0[/math], [math]|P_2(2)|=1[/math]. Thus, we are led to the result.

We are not done yet, and here is one more improvement of the above:

Theorem

The moments of the normal law are the numbers

[[math]] M_k(g_t)=\sum_{\pi\in P_2(k)}t^{|\pi|} [[/math]]
where [math]P_2(k)[/math] is the set of pairings of [math]\{1,\ldots,k\}[/math], and [math]|.|[/math] is the number of blocks.


Show Proof

This follows indeed from Proposition 14.20, because the number of blocks of a pairing of [math]\{1,\ldots,k\}[/math] is trivially [math]k/2[/math], independently of the pairing.

Let us discuss now the complex analogues of all this, with a notion of complex normal, or Gaussian law. To start with, we have the following definition:

Definition

The complex normal, or Gaussian law of parameter [math]t \gt 0[/math] is

[[math]] G_t=law\left(\frac{1}{\sqrt{2}}(a+ib)\right) [[/math]]
where [math]a,b[/math] are independent, each following the law [math]g_t[/math].

In short, the complex normal laws appear as natural complexifications of the real normal laws. As in the real case, these measures form convolution semigroups:

Proposition

The complex Gaussian laws have the property

[[math]] G_s*G_t=G_{s+t} [[/math]]
for any [math]s,t \gt 0[/math], and so they form a convolution semigroup.


Show Proof

This follows indeed from the real result, namely [math]g_s*g_t=g_{s+t}[/math], established in Theorem 14.17, simply by taking the real and imaginary parts.

We have as well the following complex analogue of the CLT:

Theorem (CCLT)

Given complex variables [math]f_1,f_2,f_3,\ldots\in L^\infty(X)[/math] which are i.i.d., centered, and with common variance [math]t \gt 0[/math], we have

[[math]] \frac{1}{\sqrt{n}}\sum_{i=1}^nf_i\sim G_t [[/math]]
with [math]n\to\infty[/math], in moments.


Show Proof

This follows indeed from the real CLT, established in Theorem 14.18, simply by taking the real and imaginary parts of all the variables involved.

Regarding now the moments, the situation here is more complicated than in the real case, because in order to have good results, we have to deal with both the complex variables, and their conjugates. Let us formulate the following definition:

Definition

The moments of a complex variable [math]f\in L^\infty(X)[/math] are the numbers

[[math]] M_k=E(f^k) [[/math]]
depending on colored integers [math]k=\circ\bullet\bullet\circ\ldots\,[/math], with the conventions

[[math]] f^\emptyset=1\quad,\quad f^\circ=f\quad,\quad f^\bullet=\bar{f} [[/math]]
and multiplicativity, in order to define the colored powers [math]f^k[/math].

Observe that, since [math]f,\bar{f}[/math] commute, we can permute terms, and restrict attention to exponents of type [math]k=\ldots\circ\circ\circ\bullet\bullet\bullet\ldots\,[/math], if we want to. However, our results about the complex Gaussian laws, and other complex laws too, later on, will actually look better without doing so, and so we will use Definition 14.25 as stated. We first have:

Theorem

The moments of the complex normal law are given by

[[math]] M_k(G_t)=\begin{cases} t^pp!&(k\ {\rm uniform, of\ length}\ 2p)\\ 0&(k\ {\rm not\ uniform}) \end{cases} [[/math]]
where [math]k=\circ\bullet\bullet\circ\ldots[/math] is called uniform when it contains the same number of [math]\circ[/math] and [math]\bullet[/math].


Show Proof

We must compute the moments, with respect to colored integer exponents [math]k=\circ\bullet\bullet\circ\ldots[/math], of the variable from Definition 14.22, namely:

[[math]] f=\frac{1}{\sqrt{2}}(a+ib) [[/math]]


We can assume that we are in the case [math]t=1[/math], and the proof here goes as follows:


(1) As a first observation, in the case where our exponent [math]k=\circ\bullet\bullet\circ\ldots[/math] is not uniform, a standard rotation argument shows that the corresponding moment of [math]f[/math] vanishes. To be more precise, the variable [math]f'=wf[/math] is complex Gaussian too, for any complex number [math]w\in\mathbb T[/math], and from [math]M_k(f)=M_k(f')[/math] we obtain [math]M_k(f)=0[/math], in this case.


(2) In the uniform case now, where the exponent [math]k=\circ\bullet\bullet\circ\ldots[/math] consists of [math]p[/math] copies of [math]\circ[/math] and [math]p[/math] copies of [math]\bullet[/math]\,, the corresponding moment can be computed as follows:

[[math]] \begin{eqnarray*} M_k &=&\int(f\bar{f})^p\\ &=&\frac{1}{2^p}\int(a^2+b^2)^p\\ &=&\frac{1}{2^p}\sum_r\binom{p}{r}\int a^{2r}\int b^{2p-2r}\\ &=&\frac{1}{2^p}\sum_r\binom{p}{r}(2r)!!(2p-2r)!!\\ &=&\frac{1}{2^p}\sum_r\frac{p!}{r!(p-r)!}\cdot\frac{(2r)!}{2^rr!}\cdot\frac{(2p-2r)!}{2^{p-r}(p-r)!}\\ &=&\frac{p!}{4^p}\sum_r\binom{2r}{r}\binom{2p-2r}{p-r} \end{eqnarray*} [[/math]]


(3) In order to finish now the computation, let us recall that we have the following formula, coming from the generalized binomial formula, or from the Taylor formula:

[[math]] \frac{1}{\sqrt{1+t}}=\sum_{q=0}^\infty\binom{2q}{q}\left(\frac{-t}{4}\right)^q [[/math]]


By taking the square of this series, we obtain the following formula:

[[math]] \frac{1}{1+t} =\sum_p\left(\frac{-t}{4}\right)^p\sum_r\binom{2r}{r}\binom{2p-2r}{p-r} [[/math]]


Now by looking at the coefficient of [math]t^p[/math] on both sides, we conclude that the sum on the right equals [math]4^p[/math]. Thus, we can finish the moment computation in (2), as follows:

[[math]] M_k=\frac{p!}{4^p}\times 4^p=p! [[/math]]


We are therefore led to the conclusion in the statement.
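
Again, a Monte Carlo sketch cannot hurt. In Python with NumPy, at [math]t=1[/math], the uniform moments [math]E((f\bar{f})^p)[/math] of a complex Gaussian variable are indeed close to [math]p![/math]:

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(1)
N = 1_000_000
# f = (a+ib)/sqrt(2), with a,b independent standard Gaussian, so f follows G_1
f = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

for p in range(5):
    print(p, np.mean(np.abs(f) ** (2 * p)), factorial(p))   # approximately equal
```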

As before with the real Gaussian laws, a better-looking statement is in terms of partitions. Given a colored integer [math]k=\circ\bullet\bullet\circ\ldots\,[/math], we say that a pairing [math]\pi\in P_2(k)[/math] is matching when it pairs [math]\circ-\bullet[/math] symbols. With this convention, we have the following result:

Theorem

The moments of the complex normal law are the numbers

[[math]] M_k(G_t)=\sum_{\pi\in\mathcal P_2(k)}t^{|\pi|} [[/math]]
where [math]\mathcal P_2(k)[/math] are the matching pairings of [math]\{1,\ldots,k\}[/math], and [math]|.|[/math] is the number of blocks.


Show Proof

This is a reformulation of Theorem 14.26. Indeed, we can assume that we are in the case [math]t=1[/math], and here we know from Theorem 14.26 that the moments are:

[[math]] M_k=\begin{cases} (|k|/2)!&(k\ {\rm uniform})\\ 0&(k\ {\rm not\ uniform}) \end{cases} [[/math]]


On the other hand, the numbers [math]|\mathcal P_2(k)|[/math] are given by exactly the same formula. Indeed, in order to have a matching pairing of [math]k[/math], our exponent [math]k=\circ\bullet\bullet\circ\ldots[/math] must be uniform, consisting of [math]p[/math] copies of [math]\circ[/math] and [math]p[/math] copies of [math]\bullet[/math], with [math]p=|k|/2[/math]. But then the matching pairings of [math]k[/math] correspond to the permutations of the [math]\bullet[/math] symbols, as to be matched with [math]\circ[/math] symbols, and so we have [math]p![/math] such pairings. Thus, we have the same formula as for the moments of [math]f[/math], and we are led to the conclusion in the statement.

In practice, we also need to know how to compute joint moments. We have here:

Theorem (Wick formula)

Given independent variables [math]f_i[/math], each following the complex normal law [math]G_t[/math], with [math]t \gt 0[/math] being a fixed parameter, we have the formula

[[math]] E\left(f_{i_1}^{k_1}\ldots f_{i_s}^{k_s}\right)=t^{s/2}\#\left\{\pi\in\mathcal P_2(k)\Big|\pi\leq\ker i\right\} [[/math]]
where [math]k=k_1\ldots k_s[/math] and [math]i=i_1\ldots i_s[/math], for the joint moments of these variables, and where [math]\pi\leq\ker i[/math] means that the indices of [math]i[/math] must fit into the blocks of [math]\pi[/math], in the obvious way, that is, [math]i[/math] must take a constant value on each block of [math]\pi[/math].


Show Proof

This is something well-known, which can be proved as follows:


(1) Let us first discuss the case where we have a single variable [math]f[/math], which amounts to taking [math]f_i=f[/math] for any [math]i[/math] in the formula in the statement. What we have to compute here are the moments of [math]f[/math], with respect to colored integer exponents [math]k=\circ\bullet\bullet\circ\ldots\,[/math], and the formula in the statement tells us that these moments must be:

[[math]] E(f^k)=t^{|k|/2}|\mathcal P_2(k)| [[/math]]


But this is the formula in Theorem 14.27, so we are done with this case.


(2) In general now, when expanding the product [math]f_{i_1}^{k_1}\ldots f_{i_s}^{k_s}[/math] and rearranging the terms, we are left with doing a number of computations as in (1), and then making the product of the expectations that we found. But this amounts to counting the partitions in the statement, with the condition [math]\pi\leq\ker i[/math] there standing for the fact that we are doing the various type (1) computations independently, and then making the product.
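
Here is a small Monte Carlo sketch of the Wick formula, in Python with NumPy, for two independent variables following [math]G_1[/math]. For the exponent [math]k=\circ\circ\bullet\bullet[/math] and indices [math]i=(1,2,1,2)[/math] there is exactly one matching pairing with [math]\pi\leq\ker i[/math], while for [math]i=(1,1,1,1)[/math] there are two, and the simulation reproduces these counts:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 1_000_000
f1 = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
f2 = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

print(np.mean(f1 * f2 * np.conj(f1) * np.conj(f2)).real)   # ~ 1, one pairing fits
print(np.mean(f1 * f1 * np.conj(f1) * np.conj(f1)).real)   # ~ 2, two pairings fit
```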

The above statement is one of the possible formulations of the Wick formula, and there are many more formulations, which are all useful. For instance, we have:

Theorem (Wick formula 2)

Given independent variables [math]f_i[/math], each following the complex normal law [math]G_t[/math], with [math]t \gt 0[/math] being a fixed parameter, we have the formula

[[math]] E\left(f_{i_1}\ldots f_{i_k}f_{j_1}^*\ldots f_{j_k}^*\right)=t^k\#\left\{\pi\in S_k\Big|i_{\pi(r)}=j_r,\forall r\right\} [[/math]]
for the non-vanishing joint moments of these variables.


Show Proof

This follows from the usual Wick formula, from Theorem 14.28. With some changes in the indices and notations, the formula there reads:

[[math]] E\left(f_{I_1}^{K_1}\ldots f_{I_s}^{K_s}\right)=t^{s/2}\#\left\{\sigma\in\mathcal P_2(K)\Big|\sigma\leq\ker I\right\} [[/math]]


Now observe that we have [math]\mathcal P_2(K)=\emptyset[/math], unless the colored integer [math]K=K_1\ldots K_s[/math] is uniform, in the sense that it contains the same number of [math]\circ[/math] and [math]\bullet[/math] symbols. Up to permutations, the non-trivial case, where the moment is non-vanishing, is the case where the colored integer [math]K=K_1\ldots K_s[/math] is of the following special form:

[[math]] K=\underbrace{\circ\circ\ldots\circ}_k\ \underbrace{\bullet\bullet\ldots\bullet}_k [[/math]]


So, let us focus on this case, which is the non-trivial one. Here we have [math]s=2k[/math], and we can write the multi-index [math]I=I_1\ldots I_s[/math] in the following way:

[[math]] I=i_1\ldots i_k\ j_1\ldots j_k [[/math]]


With these changes made, the above usual Wick formula reads:

[[math]] E\left(f_{i_1}\ldots f_{i_k}f_{j_1}^*\ldots f_{j_k}^*\right)=t^k\#\left\{\sigma\in\mathcal P_2(K)\Big|\sigma\leq\ker(ij)\right\} [[/math]]


The point now is that the matching pairings [math]\sigma\in\mathcal P_2(K)[/math], with [math]K=\circ\ldots\circ\bullet\ldots\bullet\,[/math], of length [math]2k[/math], as above, correspond to the permutations [math]\pi\in S_k[/math], in the obvious way. With this identification made, the above modified usual Wick formula becomes:

[[math]] E\left(f_{i_1}\ldots f_{i_k}f_{j_1}^*\ldots f_{j_k}^*\right)=t^k\#\left\{\pi\in S_k\Big|i_{\pi(r)}=j_r,\forall r\right\} [[/math]]


Thus, we have reached the formula in the statement, and we are done.

Finally, here is one more formulation of the Wick formula, useful as well:

Theorem (Wick formula 3)

Given independent variables [math]f_i[/math], each following the complex normal law [math]G_t[/math], with [math]t \gt 0[/math] being a fixed parameter, we have the formula

[[math]] E\left(f_{i_1}f_{j_1}^*\ldots f_{i_k}f_{j_k}^*\right)=t^k\#\left\{\pi\in S_k\Big|i_{\pi(r)}=j_r,\forall r\right\} [[/math]]
for the non-vanishing joint moments of these variables.


Show Proof

This follows from our second Wick formula, from Theorem 14.29, simply by permuting the terms, so as to have an alternating sequence of plain and conjugate variables. Alternatively, we can start with Theorem 14.28, and then perform the same manipulations as in the proof of Theorem 14.29, but with the exponent being this time as follows:

[[math]] K=\underbrace{\circ\bullet\circ\bullet\ldots\ldots\circ\bullet}_{2k} [[/math]]


Thus, we are led to the conclusion in the statement.

So long for the normal laws, real and complex. These appear pretty much everywhere in mathematics, and with the above, you are fully armed for dealing with them. In particular the Wick formula is what you need for Random Matrix Theory (RMT), should you ever get interested in that, a good reference here being Mehta [1].

14c. Hyperspherical laws

We can do many things with the probability theory that we learned so far, and especially with the real and complex normal laws. As a first application, let us go back to the spherical integrals, studied in chapter 13. In the real case, we have:

Theorem

The moments of the hyperspherical variables are

[[math]] \int_{S^{N-1}_\mathbb R}x_i^pdx=\frac{(N-1)!!p!!}{(N+p-1)!!} [[/math]]
and the rescaled variables [math]y_i=\sqrt{N}x_i[/math] become normal and independent with [math]N\to\infty[/math].


Show Proof

The moment formula in the statement follows from the general formulae in chapter 13. As a consequence, with [math]N\to\infty[/math] we have the following estimate:

[[math]] \begin{eqnarray*} \int_{S^{N-1}_\mathbb R}x_i^pdx &\simeq&N^{-p/2}\times p!!\\ &=&N^{-p/2}M_p(g_1) \end{eqnarray*} [[/math]]


Thus, the rescaled variables [math]\sqrt{N}x_i[/math] become normal with [math]N\to\infty[/math], as claimed. As for the proof of the asymptotic independence, this is standard too, once again by using the formulae in chapter 13. Indeed, the joint moments of [math]x_1,\ldots,x_N[/math] are given by:

[[math]] \begin{eqnarray*} \int_{S^{N-1}_\mathbb R}x_1^{k_1}\ldots x_N^{k_N}\,dx &=&\frac{(N-1)!!k_1!!\ldots k_N!!}{(N+\Sigma k_i-1)!!}\\ &\simeq&N^{-\Sigma k_i/2}\times k_1!!\ldots k_N!! \end{eqnarray*} [[/math]]


By rescaling, the joint moments of the variables [math]y_i=\sqrt{N}x_i[/math] are given by:

[[math]] \int_{S^{N-1}_\mathbb R}y_1^{k_1}\ldots y_N^{k_N}\,dx\simeq k_1!!\ldots k_N!! [[/math]]


Thus, we have multiplicativity, and so independence with [math]N\to\infty[/math], as claimed.
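
As a numerical sketch, one can sample uniform points on the sphere, as normalized Gaussian vectors, and compare the empirical moments with the formula above. The following Python/NumPy code does this at [math]N=5[/math], with the convention [math]k!!=(k-1)(k-3)\ldots[/math] used in this book:

```python
import numpy as np

def dfact(k):
    # k!! = (k-1)(k-3)(k-5)..., with the convention of the text
    return 1 if k <= 1 else (k - 1) * dfact(k - 2)

rng = np.random.default_rng(3)
N, samples = 5, 1_000_000
g = rng.standard_normal((samples, N))
x1 = g[:, 0] / np.linalg.norm(g, axis=1)   # first coordinate of a uniform point on S^{N-1}

for p in [2, 4, 6]:
    exact = dfact(N - 1) * dfact(p) / dfact(N + p - 1)
    print(p, np.mean(x1 ** p), exact)      # approximately equal
```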

Importantly, we can recover the normal laws as well in connection with the rotation groups. Indeed, we have the following reformulation of Theorem 14.31:

Theorem

We have the integration formula

[[math]] \int_{O_N}U_{ij}^p\,dU=\frac{(N-1)!!p!!}{(N+p-1)!!} [[/math]]
and the rescaled variables [math]V_{ij}=\sqrt{N}U_{ij}[/math] become normal and independent with [math]N\to\infty[/math].


Show Proof

We use the basic fact that the rotations [math]U\in O_N[/math] act on the points of the real sphere [math]z\in S^{N-1}_\mathbb R[/math], with the stabilizer of [math]z=(1,0,\ldots,0)[/math] being the subgroup [math]O_{N-1}\subset O_N[/math]. In algebraic terms, this gives an identification as follows:

[[math]] S^{N-1}_\mathbb R=O_N/O_{N-1} [[/math]]


In functional analytic terms, this result provides us with an embedding as follows, for any [math]i[/math], which makes correspond the respective integration functionals:

[[math]] C(S^{N-1}_\mathbb R)\subset C(O_N)\quad,\quad x_i\to U_{1i} [[/math]]


With this identification made, the result follows from Theorem 14.31.

In the complex case, the analogues of the above results are as follows:

Theorem

The rescalings [math]\sqrt{N}z_i[/math] of the unit complex sphere coordinates

[[math]] z_i:S^{N-1}_\mathbb C\to\mathbb C [[/math]]
as well as the rescalings [math]\sqrt{N}U_{ij}[/math] of the unitary group coordinates

[[math]] U_{ij}:U_N\to\mathbb C [[/math]]
become complex Gaussian and independent with [math]N\to\infty[/math].


Show Proof

We have several assertions to be proved, the idea being as follows:


(1) According to the formulae in chapter 13, the polynomial integrals in [math]z_i,\bar{z}_i[/math] vanish, unless the number of [math]z_i,\bar{z}_i[/math] is the same. In this latter case these terms can be grouped together, by using [math]z_i\bar{z}_i=|z_i|^2[/math], and the relevant integration formula is:

[[math]] \int_{S^{N-1}_\mathbb C}|z_i|^{2k}\,dz=\frac{(N-1)!k!}{(N+k-1)!} [[/math]]


Now with [math]N\to\infty[/math], we obtain from this the following estimate:

[[math]] \int_{S^{N-1}_\mathbb C}|z_i|^{2k}dx \simeq N^{-k}\times k! [[/math]]


Thus, the rescaled variables [math]\sqrt{N}z_i[/math] become normal with [math]N\to\infty[/math], as claimed.


(2) As for the proof of the asymptotic independence, this is standard too, again by using the formulae in chapter 13. Indeed, the joint moments of [math]z_1,\ldots,z_N[/math] are given by:

[[math]] \begin{eqnarray*} \int_{S^{N-1}_\mathbb C}|z_1|^{2k_1}\ldots|z_N|^{2k_N}\,dz &=&\frac{(N-1)!k_1!\ldots k_N!}{(N+\sum k_i-1)!}\\ &\simeq&N^{-\Sigma k_i}\times k_1!\ldots k_N! \end{eqnarray*} [[/math]]


By rescaling, the joint moments of the variables [math]y_i=\sqrt{N}z_i[/math] are given by:

[[math]] \int_{S^{N-1}_\mathbb C}|y_1|^{2k_1}\ldots|y_N|^{2k_N}\,dz\simeq k_1!\ldots k_N! [[/math]]


Thus, we have multiplicativity, and so independence with [math]N\to\infty[/math], as claimed.


(3) Regarding the last assertion, we can use here the basic fact that the rotations [math]U\in U_N[/math] act on the points of the sphere [math]z\in S^{N-1}_\mathbb C[/math], with the stabilizer of [math]z=(1,0,\ldots,0)[/math] being the subgroup [math]U_{N-1}\subset U_N[/math]. In algebraic terms, this gives an equality as follows:

[[math]] S^{N-1}_\mathbb C=U_N/U_{N-1} [[/math]]


In functional analytic terms, this result provides us with an embedding as follows, for any [math]i[/math], which makes correspond the respective integration functionals:

[[math]] C(S^{N-1}_\mathbb C)\subset C(U_N)\quad,\quad x_i\to U_{1i} [[/math]]


With this identification made, the result follows from (1,2).

14d. Rotations, reflections

All the above is quite nice, and suggests that there are deep relations between group theory and probability. So, let us discuss this now, by starting with the discrete case, which is the simplest one. We have here the following beautiful, well-known result:

Theorem

The probability for a random [math]\sigma\in S_N[/math] to have no fixed points is

[[math]] P\simeq\frac{1}{e} [[/math]]
in the [math]N\to\infty[/math] limit, where [math]e=2.71828\ldots[/math] is the usual constant from analysis.


Show Proof

This is best viewed by using the inclusion-exclusion principle. Let us set:

[[math]] S_N^k=\left\{\sigma\in S_N\Big|\sigma(k)=k\right\} [[/math]]


The set of permutations having no fixed points, called derangements, is then:

[[math]] X_N=\left(\bigcup_kS_N^k\right)^c [[/math]]


Now the inclusion-exclusion principle tells us that we have:

[[math]] \begin{eqnarray*} |X_N| &=&\left|\left(\bigcup_kS_N^k\right)^c\right|\\ &=&|S_N|-\sum_k|S_N^k|+\sum_{k \lt l}|S_N^k\cap S_N^l|-\ldots+(-1)^N\sum_{k_1 \lt \ldots \lt k_N}|S_N^{k_1}\cap\ldots\cap S_N^{k_N}|\\ &=&N!-N(N-1)!+\binom{N}{2}(N-2)!-\ldots+(-1)^N\binom{N}{N}(N-N)!\\ &=&\sum_{r=0}^N(-1)^r\binom{N}{r}(N-r)! \end{eqnarray*} [[/math]]


Thus, the probability that we are interested in, for a random permutation [math]\sigma\in S_N[/math] to have no fixed points, is given by the following formula:

[[math]] P =\frac{|X_N|}{N!} =\sum_{r=0}^N\frac{(-1)^r}{r!} [[/math]]


Since on the right we have the expansion of [math]1/e[/math], this gives the result.
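
As a sanity check, here is a short Python sketch, computing the exact proportion of derangements by the inclusion-exclusion formula above, and comparing with [math]1/e[/math]:

```python
import math

def derangements(N):
    # inclusion-exclusion count of permutations with no fixed points
    return sum((-1) ** r * math.comb(N, r) * math.factorial(N - r) for r in range(N + 1))

for N in [5, 10, 15]:
    print(N, derangements(N) / math.factorial(N), 1 / math.e)   # rapidly converging
```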

The above result looks quite exciting. In order to further explore and refine it, we will need some notions from group theory, and more specifically, the following definition:

Definition

Given a closed subgroup [math]G\subset U_N[/math], the function

[[math]] \chi:G\to\mathbb C\quad,\quad \chi(g)=\sum_{i=1}^Ng_{ii} [[/math]]
is called the main character of [math]G[/math].

We should mention that there is a long story with these characters, the idea being that a compact group [math]G[/math] can have several representations [math]\pi:G\subset U_N[/math], which can be studied via their characters [math]\chi_\pi:G\to\mathbb C[/math], and with all this coming from Weyl [2], [3].


All this is quite heavy, but technically, we will not need it here. Indeed, in relation with our questions, we can formulate the following nice, elementary result:

Theorem

Consider the symmetric group [math]S_N[/math], regarded as the permutation group, [math]S_N\subset O_N[/math], of the [math]N[/math] coordinate axes of [math]\mathbb R^N[/math].

  • The main character [math]\chi\in C(S_N)[/math] counts the number of fixed points.
  • The law of [math]\chi\in C(S_N)[/math] becomes Poisson [math](1)[/math], in the [math]N\to\infty[/math] limit.


Show Proof

We have two things to be proved here, the idea being as follows:


(1) The permutation matrices [math]\sigma\in O_N[/math], which give the embedding [math]S_N\subset O_N[/math] in the statement, being given by [math]\sigma_{ij}=\delta_{i\sigma(j)}[/math], we have the following computation:

[[math]] \chi(\sigma) =\sum_i\delta_{\sigma(i)i} =\#\left\{i\in\{1,\ldots,N\}\Big|\sigma(i)=i\right\} [[/math]]


(2) In order to establish now the asymptotic result in the statement, we must prove the following formula, for any [math]r\in\mathbb N[/math], in the [math]N\to\infty[/math] limit:

[[math]] P(\chi=r)\simeq\frac{1}{r!e} [[/math]]


We already know, from Theorem 14.34, that this formula holds at [math]r=0[/math]. In the general case now, we have to count the permutations [math]\sigma\in S_N[/math] having exactly [math]r[/math] fixed points. Now since having such a permutation amounts to choosing [math]r[/math] points among [math]1,\ldots,N[/math], and then permuting the remaining [math]N-r[/math] points, with no fixed points allowed, we have:

[[math]] \begin{eqnarray*} \#\left\{\sigma\in S_N\Big|\chi(\sigma)=r\right\} &=&\binom{N}{r}\#\left\{\sigma\in S_{N-r}\Big|\chi(\sigma)=0\right\}\\ &=&\frac{N!}{r!(N-r)!}\#\left\{\sigma\in S_{N-r}\Big|\chi(\sigma)=0\right\}\\ &=&N!\times\frac{1}{r!}\times\frac{\#\left\{\sigma\in S_{N-r}\Big|\chi(\sigma)=0\right\}}{(N-r)!} \end{eqnarray*} [[/math]]


By dividing everything by [math]N![/math], we obtain from this the following formula:

[[math]] \frac{\#\left\{\sigma\in S_N\Big|\chi(\sigma)=r\right\}}{N!}=\frac{1}{r!}\times\frac{\#\left\{\sigma\in S_{N-r}\Big|\chi(\sigma)=0\right\}}{(N-r)!} [[/math]]


Now by using the computation at [math]r=0[/math], that we already have, from Theorem 14.34, it follows that with [math]N\to\infty[/math] we have the following estimate:

[[math]] P(\chi=r) \simeq\frac{1}{r!}\cdot P(\chi=0) \simeq\frac{1}{r!}\cdot\frac{1}{e} [[/math]]


Thus, we obtain as limiting measure the Poisson law of parameter 1, as stated.
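
Here is a small Monte Carlo sketch of this result, in Python with NumPy, comparing the empirical distribution of the number of fixed points of a random permutation with the Poisson (1) weights:

```python
import numpy as np
from math import factorial, e

rng = np.random.default_rng(4)
N, samples = 100, 100_000
idx = np.arange(N)
fixed = np.array([np.sum(rng.permutation(N) == idx) for _ in range(samples)])

for r in range(5):
    print(r, np.mean(fixed == r), 1 / (factorial(r) * e))   # approximately equal
```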

As a next step, let us try now to generalize what we have, namely Theorem 14.36, so as to reach the Poisson laws of arbitrary parameter [math]t \gt 0[/math]. Again we will need some notions from group theory, and more specifically, the following definition:

Definition

Given a closed subgroup [math]G\subset U_N[/math], the function

[[math]] \chi_t:G\to\mathbb C\quad,\quad \chi_t(g)=\sum_{i=1}^{[tN]}g_{ii} [[/math]]
is called the main truncated character of [math]G[/math], of parameter [math]t\in(0,1][/math].

As before with plain characters, there is some theory behind this definition, involving this time Random Matrix Theory (RMT), but technically, we will not need this here. Indeed, we can now formulate the following result, which is nice and elementary, generalizing Theorem 14.36, and which will be our final say on the subject:

Theorem

Consider the symmetric group [math]S_N[/math], regarded as the permutation group, [math]S_N\subset O_N[/math], of the [math]N[/math] coordinate axes of [math]\mathbb R^N[/math].

  • The variable [math]\chi_t[/math] counts the number of fixed points among [math]1,\ldots,[tN][/math].
  • The law of this variable [math]\chi_t[/math] becomes Poisson [math](t)[/math], in the [math]N\to\infty[/math] limit.


Show Proof

We already know from Theorem 14.36 that the results hold at [math]t=1[/math]. In general, the proof is similar, the idea being as follows:


(1) We have indeed the following computation, coming from definitions:

[[math]] \chi_t(\sigma) =\sum_{i=1}^{[tN]}\delta_{\sigma(i)i} =\#\left\{i\in\{1,\ldots,[tN]\}\Big|\sigma(i)=i\right\} [[/math]]


(2) Consider indeed the following sets, as in the proof of Theorem 14.34:

[[math]] S_N^k=\left\{\sigma\in S_N\Big|\sigma(k)=k\right\} [[/math]]


The set of permutations having no fixed points among [math]1,\ldots,[tN][/math] is then:

[[math]] X_N=\left(\bigcup_{k\leq[tN]}S_N^k\right)^c [[/math]]


As before in the proof of Theorem 14.34, we obtain by inclusion-exclusion that:

[[math]] \begin{eqnarray*} P(\chi_t=0) &=&\frac{1}{N!}\sum_{r=0}^{[tN]}(-1)^r\sum_{k_1 \lt \ldots \lt k_r\leq[tN]}|S_N^{k_1}\cap\ldots\cap S_N^{k_r}|\\ &=&\frac{1}{N!}\sum_{r=0}^{[tN]}(-1)^r\binom{[tN]}{r}(N-r)!\\ &=&\sum_{r=0}^{[tN]}\frac{(-1)^r}{r!}\cdot\frac{[tN]!(N-r)!}{N!([tN]-r)!} \end{eqnarray*} [[/math]]


Now with [math]N\to\infty[/math], we obtain from this the following estimate:

[[math]] \begin{eqnarray*} P(\chi_t=0) &\simeq&\sum_{r=0}^{[tN]}\frac{(-1)^r}{r!}\cdot t^r\\ &=&\sum_{r=0}^{[tN]}\frac{(-t)^r}{r!}\\ &\simeq&e^{-t} \end{eqnarray*} [[/math]]


More generally, by counting the permutations [math]\sigma\in S_N[/math] having exactly [math]r[/math] fixed points among [math]1,\ldots,[tN][/math], as in the proof of Theorem 14.36, we obtain:

[[math]] P(\chi_t=r)\simeq\frac{t^r}{r!e^t} [[/math]]


Thus, we obtain in the limit a Poisson law of parameter [math]t[/math], as stated.

It is possible to establish as well continuous analogues of the above results, involving the groups [math]O_N,U_N[/math], and the real and complex normal laws. However, the proofs here are more technical. For an elementary introduction to this, have a look at my book [4].


Finally, let us end all this with some low-dimensional magic, involving physicists' favorite group, [math]SU_2[/math], and one of their favorite laws, the Wigner law [math]\gamma_1[/math]. We have:

Theorem

The main character of [math]SU_2[/math] follows the following law,

[[math]] \gamma_1=\frac{1}{2\pi}\sqrt{4-x^2}dx [[/math]]
which is the Wigner law of parameter [math]1[/math].


Show Proof

The group [math]SU_2[/math] is by definition the group of unitary rotations [math]U\in U_2[/math] of determinant one, and by solving the equation [math]U^*=U^{-1}[/math], we are led to:

[[math]] SU_2=\left\{\begin{pmatrix}a+ib&c+id\\ -c+id&a-ib\end{pmatrix}\ \Big|\ a^2+b^2+c^2+d^2=1\right\} [[/math]]


In this picture, the main character is given by the following formula:

[[math]] \chi\begin{pmatrix}a+ib&c+id\\ -c+id&a-ib\end{pmatrix}=2a [[/math]]


We are therefore left with computing the law of the following variable:

[[math]] a\in C(S^3_\mathbb R) [[/math]]


But this is something very familiar, namely a hyperspherical variable at [math]N=4[/math], so we can use here Theorem 14.31. We obtain the following moment formula:

[[math]] \begin{eqnarray*} \int_{S^3_\mathbb R}a^{2k} &=&\frac{3!!(2k)!!}{(2k+3)!!}\\ &=&2\cdot\frac{(2k)!}{2^kk!2^{k+1}(k+1)!}\\ &=&\frac{1}{4^k}\cdot\frac{1}{k+1}\binom{2k}{k}\\ &=&\frac{C_k}{4^k} \end{eqnarray*} [[/math]]


Thus the variable [math]2a\in C(S^3_\mathbb R)[/math] follows the Wigner semicircle law [math]\gamma_1[/math], as claimed.
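
To numerically illustrate this, one can sample uniform points on [math]S^3_\mathbb R[/math], as normalized Gaussian 4-vectors, and check that the even moments of [math]2a[/math] are indeed the Catalan numbers, which are the even moments of [math]\gamma_1[/math]. A Python/NumPy sketch:

```python
import numpy as np
from math import comb

rng = np.random.default_rng(5)
g = rng.standard_normal((1_000_000, 4))
a = g[:, 0] / np.linalg.norm(g, axis=1)       # the coordinate a, for a uniform point on S^3

for k in range(1, 5):
    catalan = comb(2 * k, k) // (k + 1)
    print(k, np.mean((2 * a) ** (2 * k)), catalan)   # approximately 1, 2, 5, 14
```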

There are many more things that can be said, in relation with the above, all beautiful mathematics, useful in relation with physics. If interested, have a look at my book [4].

General references

Banica, Teo (2024). "Calculus and applications". arXiv:2401.00911 [math.CO].

References

  1. M.L. Mehta, Random matrices, Elsevier (2004).
  2. H. Weyl, The theory of groups and quantum mechanics, Princeton Univ. Press (1931).
  3. H. Weyl, The classical groups: their invariants and representations, Princeton Univ. Press (1939).
  4. T. Banica, Linear algebra and group theory (2024).