Random matrices
6a. Random matrices
We have seen so far the basics of von Neumann algebras [math]A\subset B(H)[/math], with a look into some interesting ramifications too, concerning random matrices and quantum spaces. In what regards these ramifications, the situation is as follows:
(1) The random matrix algebras, [math]A=M_N(L^\infty(X))[/math] acting on [math]H=\mathbb C^N\otimes L^2(X)[/math], are the simplest von Neumann algebras, from a variety of viewpoints. The main problem regarding them is of operator theoretic nature, regarding the computation of the law of individual elements [math]T\in A[/math] with respect to the random matrix trace [math]tr:A\to\mathbb C[/math].
(2) The quantum spaces are exciting abstract objects, obtained by looking at an arbitrary von Neumann algebra [math]A\subset B(H)[/math] coming with a trace [math]tr:A\to\mathbb C[/math], and formally writing the algebra as [math]A=L^\infty(X)[/math], and its trace as [math]tr=\int_X[/math]. In this picture, [math]X[/math] is our quantum probability space, and [math]\int_X[/math] is the integration over it, or expectation.
All this is quite interesting, and we will further explore these two topics, random matrices and quantum spaces, with some basic theory for them, in this chapter and in the next one. As a first observation, these two topics are closely related, due to:
\begin{fact}
A random matrix algebra can be written in the following way,
[math]M_N(L^\infty(X))=M_N(\mathbb C)\otimes L^\infty(X)=L^\infty(M_N\times X)[/math]
so the underlying quantum space is something very simple, [math]Y=M_N\times X[/math].
\end{fact}
With this understood, the philosophical problem is now: what to do with our quantum spaces, be they of random matrix type [math]Y=M_N\times X[/math], or more general? Good question, and do not expect a simple answer to it. Indeed, quantum spaces are more or less the same thing as operator algebras, and from this perspective, our question becomes “what are the operator algebras, and what is to be done with them”, which is obviously difficult.
And it gets even worse, because when remembering that operator algebras are more or less the same thing as quantum mechanics, our question becomes something of the type “what is quantum mechanics, and what is to be done with it”. So, modesty.
Getting back to Earth now, having settled our questions and philosophy for the whole remainder of this book, let us get into random matrices. Quite remarkably, these provide us with an epsilon of an answer to our philosophical questions, as follows:
\begin{answer}
The simplest quantum spaces are those coming from random matrix algebras, which are as follows, with [math]X[/math] being a usual probability space,
[math]L^\infty(Y)=M_N(L^\infty(X))[/math]
and what is to be done with them is the computation of the law of individual elements, the random matrices [math]T\in L^\infty(Y)=M_N(L^\infty(X))[/math], in the [math]N \gt \gt 0[/math] regime.
\end{answer}
Which looks very nice, we have eventually reached some concrete questions, and it is time now for mathematics and computations. Getting started, we must first further build on the material from chapter 5. We recall from there that given a von Neumann algebra [math]A\subset B(H)[/math] coming with a trace [math]tr:A\to\mathbb C[/math], any normal element [math]T\in A[/math] has a law, which is the complex probability measure [math]\mu\in\mathcal P(\mathbb C)[/math] given by the following formula:
[math]tr(f(T))=\int_{\mathbb C}f(z)\,d\mu(z)[/math]
In the non-normal case, [math]TT^*\neq T^*T[/math], the law does not exist as a complex probability measure [math]\mu\in\mathcal P(\mathbb C)[/math], as also explained in chapter 5. However, we can trick a bit, and talk about the law of non-normal elements as well, in the following abstract way:
Let [math]A[/math] be a von Neumann algebra, given with a trace [math]tr:A\to\mathbb C[/math].
- The elements [math]T\in A[/math] are called random variables.
- The moments of such a variable are the numbers [math]M_k(T)=tr(T^k)[/math].
- The law of such a variable is the functional [math]\mu:P\to tr(P(T))[/math].
Here [math]k=\circ\bullet\bullet\circ\ldots[/math] is by definition a colored integer, and the powers [math]T^k[/math] are defined by multiplicativity and the usual formulae, namely:
[math]T^\emptyset=1\quad,\quad T^\circ=T\quad,\quad T^\bullet=T^*[/math]
As for the polynomial [math]P[/math], this is a noncommuting [math]*[/math]-polynomial in one variable:
[math]P\in\mathbb C<X,X^*>[/math]
Observe that the law is uniquely determined by the moments, because:
[math]P=\sum_kc_kX^k\implies\mu(P)=\sum_kc_kM_k(T)[/math]
Generally speaking, the above definition, due to Voiculescu [1], is something quite abstract, but there is no other way of doing things, at least at this level of generality. However, in the special case where our variable [math]T\in A[/math] is self-adjoint, or more generally normal, the theory simplifies, and we recover more familiar objects, as follows:
The law of a normal variable [math]T\in A[/math] can be identified with the corresponding spectral measure [math]\mu\in\mathcal P(\mathbb C)[/math], according to the following formula,
[math]tr(P(T))=\int_{\mathbb C}P(z)\,d\mu(z)[/math]
valid for any noncommuting [math]*[/math]-polynomial [math]P[/math], with the convention [math]z^\circ=z[/math], [math]z^\bullet=\bar z[/math].
This is something that we know well, from chapter 5, coming from the spectral theorem for the normal operators, as developed in chapter 3.
Getting back now to the random matrices, we have all we need, as general formalism, and we are ready for doing some computations. As a first observation, we have:
The laws of basic random matrices [math]T\in M_N(L^\infty(X))[/math] are as follows:
- In the case [math]N=1[/math] the random matrix is a usual random variable, [math]T\in L^\infty(X)[/math], automatically normal, and its law as defined above is the usual law.
- In the case [math]X=\{.\}[/math] the random matrix is a usual scalar matrix, [math]T\in M_N(\mathbb C)[/math], and in the diagonalizable case, the law is [math]\mu=\frac{1}{N}\left(\delta_{\lambda_1}+\ldots+\delta_{\lambda_N}\right)[/math].
This is something that we know, once again, from chapter 5, and which is elementary. Indeed, the first assertion follows from definitions, and the above discussion. As for the second assertion, this follows by diagonalizing the matrix.
In general, what we have can only be a mixture of (1) and (2) above. Our plan will be that of discussing more in detail (1), and then getting into the general case, or rather into the case of the most interesting random matrices, with inspiration from (2).
6b. Probability theory
So, let us set [math]N=1[/math]. Here our algebra is [math]A=L^\infty(X)[/math], an arbitrary commutative von Neumann algebra. The most interesting elements [math]T\in A[/math], which we will rather denote as complex functions [math]f:X\to\mathbb C[/math] and call random variables, as is customary, are the normal, or Gaussian variables, which are defined as follows:
A variable [math]f:X\to\mathbb R[/math] is called standard normal when its law is:
[math]g_1=\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}dx[/math]
More generally, [math]f[/math] is called normal of parameter [math]t \gt 0[/math] when its law is:
[math]g_t=\frac{1}{\sqrt{2\pi t}}\,e^{-x^2/2t}dx[/math]
Observe that these normal laws have indeed mass 1, as they should, as shown by a quick change of variable, and the Gauss formula, namely:
[math]\int_{\mathbb R}e^{-x^2}dx=\sqrt\pi[/math]
Let us start with some basic results regarding the normal laws. We first have:
The normal law [math]g_t[/math] with [math]t \gt 0[/math] has the following properties:
- The variance is [math]V=t[/math].
- The density is even, so the odd moments vanish.
- The even moments are [math]M_k=t^{k/2}\times k!![/math], with [math]k!!=(k-1)(k-3)(k-5)\ldots\,[/math].
- Equivalently, the moments are [math]M_k=\sum_{\pi\in P_2(k)}t^{|\pi|}[/math], for any [math]k\in\mathbb N[/math].
- The Fourier transform [math]F_f(x)=\mathbb E(e^{ixf})[/math] is given by [math]F_{g_t}(x)=e^{-tx^2/2}[/math].
- We have the convolution semigroup formula [math]g_s*g_t=g_{s+t}[/math], for any [math]s,t \gt 0[/math].
All this is very standard, with the various notations used in the statement being explained below, the idea being as follows:
(1) The normal law [math]g_t[/math] being centered, its variance is the second moment, [math]V=M_2[/math]. Thus the result follows from (3), proved below, which gives in particular:
[math]M_2=t[/math]
(2) This is indeed something self-explanatory.
(3) We have indeed the following computation, by partial integration:
[math]M_k=\frac{1}{\sqrt{2\pi t}}\int_{\mathbb R}x^k\,e^{-x^2/2t}dx=\frac{1}{\sqrt{2\pi t}}\int_{\mathbb R}t(k-1)x^{k-2}\,e^{-x^2/2t}dx=t(k-1)M_{k-2}[/math]
The initial value being [math]M_0=1[/math], we obtain the result.
(4) We know from (2,3) that the moments of the normal law [math]g_t[/math] satisfy the following recurrence formula, with the initial data [math]M_0=1,M_1=0[/math]:
[math]M_k=t(k-1)M_{k-2}[/math]
Now let us look at [math]P_2(k)[/math], the set of pairings of [math]\{1,\ldots,k\}[/math]. In order to have such a pairing, we must pair 1 with a number chosen among [math]2,\ldots,k[/math], and then come up with a pairing of the remaining [math]k-2[/math] numbers. Thus, the number [math]N_k=|P_2(k)|[/math] of such pairings is subject to the following recurrence formula, with initial data [math]N_0=1,N_1=0[/math]:
[math]N_k=(k-1)N_{k-2}[/math]
But this solves our problem at [math]t=1[/math], because in this case we obtain the following formula, with [math]|.|[/math] standing as usual for the number of blocks of a partition:
[math]M_k=|P_2(k)|=\sum_{\pi\in P_2(k)}1^{|\pi|}[/math]
Now back to the general case, [math]t \gt 0[/math], our problem here is solved in fact too, because the number of blocks of a pairing [math]\pi\in P_2(k)[/math] being constant, [math]|\pi|=k/2[/math], we obtain:
[math]M_k=t^{k/2}|P_2(k)|=\sum_{\pi\in P_2(k)}t^{|\pi|}[/math]
(5) The Fourier transform formula can be established as follows:
[math]F_{g_t}(x)=\frac{1}{\sqrt{2\pi t}}\int_{\mathbb R}e^{-y^2/2t+ixy}dy=\frac{1}{\sqrt{2\pi t}}\int_{\mathbb R}e^{-(y-itx)^2/2t}\,e^{-tx^2/2}dy=e^{-tx^2/2}[/math]
(6) This follows indeed from (5), because [math]\log F_{g_t}[/math] is linear in [math]t[/math].
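For the reader wanting to experiment, here is a quick numerical check of the moment formula from (3), written in Python with NumPy assumed; the code and the helper names are of course ours, just a sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
t = 2.0
f = rng.normal(0, np.sqrt(t), size=10**6)  # samples from the normal law g_t

def double_factorial(k):
    # k!! = (k-1)(k-3)(k-5)..., as in the statement
    result = 1
    for j in range(k - 1, 0, -2):
        result *= j
    return result

# compare empirical moments with t^(k/2) * k!!
for k in [2, 4, 6, 8]:
    print(k, np.mean(f**k).round(3), t**(k // 2) * double_factorial(k))
```

The higher moments fluctuate more under Monte Carlo sampling, so only a rough agreement should be expected for them.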
We are now ready to establish the Central Limit Theorem (CLT), which is a key result, telling us why the normal laws appear a bit everywhere, in real life:
Given a sequence of real random variables [math]f_1,f_2,f_3,\ldots\in L^\infty(X)[/math], which are i.i.d., centered, and with variance [math]t \gt 0[/math], we have, with [math]n\to\infty[/math], in moments,
[math]\frac{1}{\sqrt n}\sum_{i=1}^nf_i\sim g_t[/math]
In terms of moments, the Fourier transform [math]F_f(x)=\mathbb E(e^{ixf})[/math] is given by:
[math]F_f(x)=\sum_{k=0}^\infty\frac{i^kM_k(f)}{k!}\,x^k[/math]
Thus, the Fourier transform of the variable in the statement is:
[math]F(x)=\left[F_{f_1}\left(\frac{x}{\sqrt n}\right)\right]^n=\left[1-\frac{tx^2}{2n}+o(n^{-1})\right]^n\simeq e^{-tx^2/2}[/math]
But this latter function being the Fourier transform of [math]g_t[/math], we obtain the result.
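As before, a quick numerical illustration of the CLT is possible, in Python with NumPy assumed; the choice of uniform variables below is ours, just for the sake of an example:

```python
import numpy as np

rng = np.random.default_rng(1)
t, n, samples = 1.0, 100, 50_000

# i.i.d. centered variables of variance t: here uniform on [-sqrt(3t), sqrt(3t)]
a = np.sqrt(3 * t)
f = rng.uniform(-a, a, size=(samples, n))
s = f.sum(axis=1) / np.sqrt(n)  # the rescaled sum from the CLT

# even moments of g_t are t^(k/2) * (k-1)!!
for k, kfact in [(2, 1), (4, 3), (6, 15)]:
    print(k, np.mean(s**k).round(3), t**(k // 2) * kfact)
```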
Let us discuss as well the “discrete” counterpart of the above results, which we will need a bit later too, in relation with the random matrices. We have:
The Poisson law of parameter [math]1[/math] is the following measure,
[math]p_1=\frac{1}{e}\sum_{k\geq0}\frac{\delta_k}{k!}[/math]
and more generally, the Poisson law of parameter [math]t \gt 0[/math] is the following measure:
[math]p_t=e^{-t}\sum_{k\geq0}\frac{t^k}{k!}\,\delta_k[/math]
We will see in a moment why these laws appear everywhere, in discrete probability, the reasons behind this coming from the Poisson Limit Theorem (PLT). Getting started now, in analogy with the normal laws, the Poisson laws have the following properties:
The Poisson law [math]p_t[/math] with [math]t \gt 0[/math] has the following properties:
- The variance is [math]V=t[/math].
- The moments are [math]M_k=\sum_{\pi\in P(k)}t^{|\pi|}[/math].
- The Fourier transform is [math]F(x)=\exp\left((e^{ix}-1)t\right)[/math].
- We have the semigroup formula [math]p_s*p_t=p_{s+t}[/math], for any [math]s,t \gt 0[/math].
We have four formulae to be proved, the idea being as follows:
(1) The variance is [math]V=M_2-M_1^2[/math], and by using the formulae [math]M_1=t[/math] and [math]M_2=t+t^2[/math], coming from (2), proved below, we obtain as desired, [math]V=t[/math].
(2) This is something more tricky. Consider indeed the set [math]P(k)[/math] of all partitions of [math]\{1,\ldots,k\}[/math]. At [math]t=1[/math], to start with, the formula that we want to prove is:
[math]M_k=B_k[/math]
with [math]B_k=|P(k)|[/math] being the Bell numbers.
We have the following recurrence formula for the moments of [math]p_1[/math]:
[math]M_{k+1}=\sum_{r=0}^k\binom{k}{r}M_{k-r}[/math]
Our claim is that the numbers [math]B_k=|P(k)|[/math] satisfy the same recurrence formula. Indeed, since a partition of [math]\{1,\ldots,k+1\}[/math] appears by choosing [math]r[/math] neighbors for [math]1[/math], among the [math]k[/math] numbers available, and then partitioning the [math]k-r[/math] elements left, we have:
[math]B_{k+1}=\sum_{r=0}^k\binom{k}{r}B_{k-r}[/math]
Thus we obtain by recurrence [math]M_k=B_k[/math], as desired. Regarding now the general case, [math]t \gt 0[/math], we can use here a similar method. We have the following recurrence formula for the moments of [math]p_t[/math], obtained by using the binomial formula:
[math]M_{k+1}=t\sum_{r=0}^k\binom{k}{r}M_{k-r}[/math]
On the other hand, consider the numbers in the statement, [math]S_k=\sum_{\pi\in P(k)}t^{|\pi|}[/math]. As before, since a partition of [math]\{1,\ldots,k+1\}[/math] appears by choosing [math]r[/math] neighbors for [math]1[/math], among the [math]k[/math] numbers available, and then partitioning the [math]k-r[/math] elements left, we have:
[math]S_{k+1}=t\sum_{r=0}^k\binom{k}{r}S_{k-r}[/math]
Thus we obtain by recurrence [math]M_k=S_k[/math], as desired.
(3) The Fourier transform formula can be established as follows:
[math]F_{p_t}(x)=e^{-t}\sum_{k\geq0}\frac{t^k}{k!}\,e^{ikx}=e^{-t}\exp\left(te^{ix}\right)=\exp\left((e^{ix}-1)t\right)[/math]
(4) This follows from (3), because [math]\log F_{p_t}[/math] is linear in [math]t[/math].
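Here is a small Python sketch checking (2), with the partition sums computed via the Stirling numbers of the second kind, so that [math]\sum_{\pi\in P(k)}t^{|\pi|}=\sum_jS(k,j)t^j[/math], and with the moments computed via the recurrence from the proof; all names are ours:

```python
from math import comb

t = 1.5
K = 8

# moments of p_t, via the recurrence M_{k+1} = t * sum_r binom(k, r) M_{k-r}
M = [1.0]
for k in range(K):
    M.append(t * sum(comb(k, r) * M[k - r] for r in range(k + 1)))

# Stirling numbers of the second kind, S(k, j) = number of partitions
# of {1, ..., k} into j blocks
S = [[0] * (K + 1) for _ in range(K + 1)]
S[0][0] = 1
for k in range(1, K + 1):
    for j in range(1, k + 1):
        S[k][j] = j * S[k - 1][j] + S[k - 1][j - 1]

# compare the moments with the partition sums sum_j S(k, j) t^j
for k in range(1, K + 1):
    print(k, M[k], sum(S[k][j] * t**j for j in range(k + 1)))
```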
We are now ready to establish the Poisson Limit Theorem (PLT), as follows:
We have the following convergence, in moments, with [math]n\to\infty[/math]:
[math]\left(\left(1-\frac{t}{n}\right)\delta_0+\frac{t}{n}\,\delta_1\right)^{*n}\to p_t[/math]
Let us denote by [math]\mu_n[/math] the Bernoulli measure appearing under the convolution sign. We have then the following computation:
[math]F_{\mu_n}(x)=\left(1-\frac{t}{n}\right)+\frac{t}{n}\,e^{ix}\implies F_{\mu_n^{*n}}(x)=\left(1+\frac{t(e^{ix}-1)}{n}\right)^n\simeq\exp\left((e^{ix}-1)t\right)[/math]
Thus, we obtain the Fourier transform of [math]p_t[/math], as desired.
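A quick Monte Carlo illustration of the PLT is possible too, with NumPy assumed, and using the fact that the [math]n[/math]-fold convolution in the statement is simply a binomial law [math]B(n,t/n)[/math]; a sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
t, n, samples = 2.0, 1000, 10**5

# the n-fold convolution of (1 - t/n) delta_0 + (t/n) delta_1 is Binomial(n, t/n)
b = rng.binomial(n, t / n, size=samples).astype(float)
p = rng.poisson(t, size=samples).astype(float)  # reference Poisson(t) samples

# the moments should agree, approximately, for n large
for k in [1, 2, 3, 4]:
    print(k, np.mean(b**k).round(3), np.mean(p**k).round(3))
```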
As a third and last topic from classical probability, let us discuss now the complex normal laws, that we will need too. To start with, we have the following definition:
The complex Gaussian law of parameter [math]t \gt 0[/math] is
[math]G_t={\rm law}\left(\frac{1}{\sqrt2}(a+ib)\right)[/math]
where [math]a,b[/math] are independent real variables, each following the normal law [math]g_t[/math].
As in the real case, these measures form convolution semigroups:
The complex Gaussian laws have the property
[math]G_s*G_t=G_{s+t}[/math]
for any [math]s,t \gt 0[/math], so they form a convolution semigroup.
This follows indeed from the real result, namely [math]g_s*g_t=g_{s+t}[/math], established above, simply by taking real and imaginary parts.
We have the following complex analogue of the CLT:
Given complex random variables [math]f_1,f_2,f_3,\ldots\in L^\infty(X)[/math] which are i.i.d., centered, and with variance [math]t \gt 0[/math], we have, with [math]n\to\infty[/math], in moments,
[math]\frac{1}{\sqrt n}\sum_{i=1}^nf_i\sim G_t[/math]
This follows indeed from the real CLT, established above, simply by taking the real and imaginary parts of all the variables involved.
Regarding now the moments, we use the general formalism from Definition 6.3, involving colored integer exponents [math]k=\circ\bullet\bullet\circ\ldots[/math] We say that a pairing [math]\pi\in P_2(k)[/math] is matching when it pairs [math]\circ-\bullet[/math] symbols. With this convention, we have the following result:
The moments of the complex normal law are the numbers
[math]M_k(G_t)=\sum_{\pi\in\mathcal P_2(k)}t^{|\pi|}[/math]
where [math]\mathcal P_2(k)[/math] stands for the matching pairings of [math]k[/math].
This is something well-known, which can be established as follows:
(1) As a first observation, by using a standard dilation argument, it is enough to do this at [math]t=1[/math]. So, let us first recall from the above that the moments of the real Gaussian law [math]g_1[/math], with respect to integer exponents [math]k\in\mathbb N[/math], are the following numbers:
[math]M_k=|P_2(k)|[/math]
Numerically, we have the following formula, explained as well in the above:
[math]M_k=\begin{cases}k!!&(k\ {\rm even})\\ 0&(k\ {\rm odd})\end{cases}[/math]
(2) We will show here that in what concerns the complex Gaussian law [math]G_1[/math], similar results hold. Numerically, we will prove that we have the following formula, where a colored integer [math]k=\circ\bullet\bullet\circ\ldots[/math] is called uniform when it contains the same number of [math]\circ[/math] and [math]\bullet[/math]\,, and where [math]|k|\in\mathbb N[/math] is the length of such a colored integer:
[math]M_k=\begin{cases}\left(\frac{|k|}{2}\right)!&(k\ {\rm uniform})\\ 0&(k\ {\rm not\ uniform})\end{cases}[/math]
Now since the matching pairings [math]\pi\in\mathcal P_2(k)[/math] are counted by exactly the same numbers, and this for trivial reasons, we will obtain the formula in the statement, namely:
[math]M_k=|\mathcal P_2(k)|[/math]
(3) This was for the plan. In practice now, we must compute the moments, with respect to colored integer exponents [math]k=\circ\bullet\bullet\circ\ldots[/math]\,, of the variable in the statement:
[math]c=\frac{1}{\sqrt2}(a+ib)[/math]
As a first observation, in the case where such an exponent [math]k=\circ\bullet\bullet\circ\ldots[/math] is not uniform in [math]\circ,\bullet[/math]\,, a rotation argument shows that the corresponding moment of [math]c[/math] vanishes. To be more precise, the variable [math]c'=wc[/math] can be shown to be complex Gaussian too, for any [math]w\in\mathbb C[/math], and from [math]M_k(c)=M_k(c')[/math] we obtain [math]M_k(c)=0[/math], in this case.
(4) In the uniform case now, where [math]k=\circ\bullet\bullet\circ\ldots[/math] consists of [math]p[/math] copies of [math]\circ[/math] and [math]p[/math] copies of [math]\bullet[/math]\,, the corresponding moment can be computed as follows:
[math]\mathbb E(|c|^{2p})=\frac{1}{2^p}\,\mathbb E\left((a^2+b^2)^p\right)=\frac{1}{2^p}\sum_{r=0}^p\binom{p}{r}(2r-1)!!\,(2p-2r-1)!!=\frac{p!}{4^p}\sum_{r=0}^p\binom{2r}{r}\binom{2p-2r}{p-r}[/math]
(5) In order to finish now the computation, let us recall that we have the following formula, coming from the generalized binomial formula, or from the Taylor formula:
[math]\frac{1}{\sqrt{1-4t}}=\sum_{p\geq0}\binom{2p}{p}t^p[/math]
By taking the square of this series, we obtain the following formula:
[math]\frac{1}{1-4t}=\sum_{p\geq0}t^p\sum_{r=0}^p\binom{2r}{r}\binom{2p-2r}{p-r}[/math]
Now by looking at the coefficient of [math]t^p[/math] on both sides, we conclude that the sum on the right equals [math]4^p[/math]. Thus, we can finish the moment computation in (4), as follows:
[math]\mathbb E(|c|^{2p})=\frac{p!}{4^p}\times4^p=p![/math]
(6) As a conclusion, if we denote by [math]|k|[/math] the length of a colored integer [math]k=\circ\bullet\bullet\circ\ldots[/math]\,, the moments of the variable [math]c[/math] in the statement are given by:
[math]M_k(c)=\begin{cases}\left(\frac{|k|}{2}\right)!&(k\ {\rm uniform})\\ 0&(k\ {\rm not\ uniform})\end{cases}[/math]
On the other hand, the numbers [math]|\mathcal P_2(k)|[/math] are given by exactly the same formula. Indeed, in order to have matching pairings of [math]k[/math], our exponent [math]k=\circ\bullet\bullet\circ\ldots[/math] must be uniform, consisting of [math]p[/math] copies of [math]\circ[/math] and [math]p[/math] copies of [math]\bullet[/math], with [math]p=|k|/2[/math]. But then the matching pairings of [math]k[/math] correspond to the permutations of the [math]\bullet[/math] symbols, as to be matched with [math]\circ[/math] symbols, and so we have [math]p![/math] such matching pairings. Thus, we have the same formula as for the moments of [math]c[/math], and we are led to the conclusion in the statement.
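As a numerical check of the key formula [math]\mathbb E(|c|^{2p})=p![/math] from the above proof, here is a small Python sketch, with NumPy assumed:

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(3)
n = 10**6
# standard complex Gaussian: c = (a + i b) / sqrt(2), with a, b independent g_1
c = (rng.normal(size=n) + 1j * rng.normal(size=n)) / np.sqrt(2)

# the uniform moments E(|c|^(2p)) should approach p!
for p in [1, 2, 3, 4]:
    print(p, np.mean(np.abs(c)**(2 * p)).round(3), factorial(p))
```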
This was for the basic probability theory, which is in a certain sense advanced operator theory, inside the commutative von Neumann algebras, [math]A=L^\infty(X)[/math]. We will be back to this, with some further limiting theorems, in chapter 8 below.
6c. Wigner matrices
Let us exit now the classical world, that of the commutative von Neumann algebras [math]A=L^\infty(X)[/math], and do as promised some random matrix theory. We recall that a random matrix algebra is a von Neumann algebra of type [math]A=M_N(L^\infty(X))[/math], and that we are interested in the computation of the laws of the operators [math]T\in A[/math], called random matrices. Regarding the precise classes of random matrices that we are interested in, first we have the complex Gaussian matrices, which are constructed as follows:
A complex Gaussian matrix is a random matrix of type
[math]Z\in M_N(L^\infty(X))[/math]
whose entries are i.i.d. complex normal variables, [math]Z_{ij}\sim G_t[/math], for some [math]t \gt 0[/math].
We will see that the above matrices have an interesting, and “central” combinatorics, among all kinds of random matrices, with the study of the other random matrices being usually obtained as a modification of the study of the Gaussian matrices.
As a somewhat surprising remark, using real normal variables in Definition 6.16, instead of the complex ones appearing there, leads nowhere. The correct real versions of the Gaussian matrices are the Wigner random matrices, constructed as follows:
A Wigner matrix is a random matrix of type
[math]Z\in M_N(L^\infty(X))[/math]
which is self-adjoint, [math]Z=Z^*[/math], and whose entries on and above the diagonal are independent, distributed as [math]g_t[/math] on the diagonal, and as [math]G_t[/math] above it, for some [math]t \gt 0[/math].
In other words, a Wigner matrix must be as follows, with the diagonal entries being real normal variables, [math]a_i\sim g_t[/math], for some [math]t \gt 0[/math], the upper diagonal entries being complex normal variables, [math]b_{ij}\sim G_t[/math], the lower diagonal entries being the conjugates of the upper diagonal entries, as indicated, and with all the variables [math]a_i,b_{ij}[/math] being independent:
[math]Z=\begin{pmatrix}a_1&b_{12}&\ldots&b_{1N}\\ \bar b_{12}&a_2&\ldots&b_{2N}\\ \vdots&\vdots&\ddots&\vdots\\ \bar b_{1N}&\bar b_{2N}&\ldots&a_N\end{pmatrix}[/math]
As a comment here, for many concrete applications the Wigner matrices are in fact the central objects in random matrix theory, and in particular, they are often more important than the Gaussian matrices. In fact, these are the random matrices which were first considered and investigated, a long time ago, by Wigner himself [2].
Finally, we will be interested as well in the complex Wishart matrices, which are the positive versions of the above random matrices, constructed as follows:
A complex Wishart matrix is a random matrix of type
[math]W=YY^*\in M_N(L^\infty(X))[/math]
with [math]Y[/math] being a complex Gaussian matrix, as in Definition 6.16.
As before with the Gaussian and Wigner matrices, there are many possible comments that can be made here, of technical or historical nature. First, using real Gaussian variables instead of complex ones leads to a less interesting combinatorics. Also, these matrices were introduced and studied by Marchenko-Pastur not long after Wigner, in [3], and so historically came second. Finally, in what regards their combinatorics and applications, these matrices quite often come first, before both the Gaussian and the Wigner ones, with all this being of course a matter of knowledge and taste.
Summarizing, we have three main types of random matrices, which can be somehow designated as “complex”, “real” and “positive”, and that we will study in what follows. Let us also mention that there are many other interesting classes of random matrices, usually appearing as modifications of the above. More on these later.
In order to compute the asymptotic laws of the above matrices, we will use the moment method. We have the following result, which will be our main tool here:
Given independent variables [math]X_i[/math], each following the complex normal law [math]G_t[/math], with [math]t \gt 0[/math] being a fixed parameter, we have the Wick formula
[math]\mathbb E\left(X_{i_1}^{k_1}\ldots X_{i_s}^{k_s}\right)=t^{s/2}\,\#\left\{\pi\in\mathcal P_2(k)\ \Big|\ \pi\leq\ker i\right\}[/math]
where [math]k=k_1\ldots k_s[/math] is a colored integer, and where [math]\pi\leq\ker i[/math] means that the indices of [math]i=(i_1,\ldots,i_s)[/math] must coincide on each of the blocks of [math]\pi[/math].
This is something well-known, and the basis for all possible computations with complex normal variables, which can be proved in two steps, as follows:
(1) Let us first discuss the case where we have a single complex normal variable [math]X[/math], which amounts in taking [math]X_i=X[/math] for any [math]i[/math] in the formula in the statement. What we have to compute here are the moments of [math]X[/math], with respect to colored integer exponents [math]k=\circ\bullet\bullet\circ\ldots\,[/math], and the formula in the statement tells us that these moments must be:
[math]M_k(X)=t^{|k|/2}\,|\mathcal P_2(k)|[/math]
But this is something that we know well from the above, the idea being that at [math]t=1[/math] this follows by doing some combinatorics and calculus, in analogy with the combinatorics and calculus from the real case, where the moment formula is identical, save for the matching pairings [math]\mathcal P_2[/math] being replaced by the usual pairings [math]P_2[/math], and then that the general case [math]t \gt 0[/math] follows from this, by rescaling. Thus, we are done with this case.
(2) In general now, the point is that we obtain the formula in the statement. Indeed, when expanding the product [math]X_{i_1}^{k_1}\ldots X_{i_s}^{k_s}[/math] and rearranging the terms, we are left with doing a number of computations as in (1), and then making the product of the expectations that we found. But this amounts precisely in counting the partitions in the statement, with the condition [math]\pi\leq\ker i[/math] there standing precisely for the fact that we are doing the various type (1) computations independently, and then making the product.
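For the reader wanting to experiment, here is a small Python sketch of the right-hand side of the Wick formula, with the matching pairings enumerated recursively; the function name and the encoding of the colors are of course ours:

```python
def wick_moment(colors, indices, t):
    """Right-hand side of the Wick formula: t^(s/2) times the number of
    pairings that match 'o' with '*' symbols and satisfy pi <= ker i."""
    s = len(colors)
    if s % 2 == 1:
        return 0.0

    def count(positions):
        if not positions:
            return 1
        first, rest = positions[0], positions[1:]
        total = 0
        for m, q in enumerate(rest):
            # pair 'first' with 'q': colors must differ, indices must agree
            if colors[first] != colors[q] and indices[first] == indices[q]:
                total += count(rest[:m] + rest[m + 1:])
        return total

    return t**(s // 2) * count(tuple(range(s)))

# E(X_1 X_1^* X_1 X_1^*) for a single G_t variable, at t = 1: two matching
# pairings, so the answer is 2, in agreement with E(|c|^4) = 2! above
print(wick_moment("o*o*", (1, 1, 1, 1), 1.0))
# mixing two independent variables kills the crossing pairing: answer 1
print(wick_moment("o*o*", (1, 1, 2, 2), 1.0))
```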
Now by getting back to the Gaussian matrices, we have the following result, with [math]\mathcal{NC}_2(k)=\mathcal P_2(k)\cap NC(k)[/math] standing for the noncrossing pairings of a colored integer [math]k[/math]:
Given a sequence of Gaussian random matrices
[math]Z_N\in M_N(L^\infty(X))[/math]
having i.i.d. [math]G_t[/math] variables as entries, we have, for any colored integer [math]k=\circ\bullet\bullet\circ\ldots[/math], in the [math]N\to\infty[/math] limit:
[math]M_k\left(\frac{Z_N}{\sqrt N}\right)\simeq\sum_{\pi\in\mathcal{NC}_2(k)}t^{|\pi|}[/math]
This is something standard, which can be done as follows:
(1) We fix [math]N\in\mathbb N[/math], and we let [math]Z=Z_N[/math]. Let us first compute the trace of [math]Z^k[/math]. With [math]k=k_1\ldots k_s[/math], and with the convention [math](ij)^\circ=ij,(ij)^\bullet=ji[/math], we have:
[math]Tr(Z^k)=\sum_{i_1\ldots i_s}Z^{k_1}_{(i_1i_2)^{k_1}}Z^{k_2}_{(i_2i_3)^{k_2}}\ldots Z^{k_s}_{(i_si_1)^{k_s}}[/math]
with the conventions [math]Z^\circ=Z[/math] and [math]Z^\bullet=\bar Z[/math], at the level of the matrix entries.
(2) Next, we rescale our variable [math]Z[/math] by a [math]\sqrt{N}[/math] factor, as in the statement, and we also replace the usual trace by its normalized version, [math]tr=Tr/N[/math]. Our formula becomes:
[math]tr\left(\left(\frac{Z}{\sqrt N}\right)^k\right)=\frac{1}{N^{s/2+1}}\sum_{i_1\ldots i_s}Z^{k_1}_{(i_1i_2)^{k_1}}Z^{k_2}_{(i_2i_3)^{k_2}}\ldots Z^{k_s}_{(i_si_1)^{k_s}}[/math]
Thus, the moment that we are interested in is given by:
[math]M_k\left(\frac{Z}{\sqrt N}\right)=\frac{1}{N^{s/2+1}}\sum_{i_1\ldots i_s}\mathbb E\left(Z^{k_1}_{(i_1i_2)^{k_1}}Z^{k_2}_{(i_2i_3)^{k_2}}\ldots Z^{k_s}_{(i_si_1)^{k_s}}\right)[/math]
(3) Let us apply now the Wick formula, from Theorem 6.19. We conclude that the moment that we are interested in is given by the following formula:
[math]M_k\left(\frac{Z}{\sqrt N}\right)=\frac{t^{s/2}}{N^{s/2+1}}\sum_{i_1\ldots i_s}\#\left\{\pi\in\mathcal P_2(k)\ \Big|\ \pi\leq\ker\left((i_1i_2)^{k_1},\ldots,(i_si_1)^{k_s}\right)\right\}[/math]
(4) Our claim now is that in the [math]N\to\infty[/math] limit the combinatorics of the above sum simplifies, with only the noncrossing partitions contributing to the sum, and with each of them contributing precisely with a 1 factor, so that we will have, as desired:
[math]M_k\left(\frac{Z_N}{\sqrt N}\right)\simeq\sum_{\pi\in\mathcal{NC}_2(k)}t^{|\pi|}[/math]
(5) In order to prove this, the first observation is that when [math]k[/math] is not uniform, in the sense that it contains a different number of [math]\circ[/math], [math]\bullet[/math] symbols, we have [math]\mathcal P_2(k)=\emptyset[/math], and so:
[math]M_k\left(\frac{Z_N}{\sqrt N}\right)=0[/math]
which agrees with the formula in the statement, whose right-hand side is [math]0[/math] too, in this case.
(6) Thus, we are left with the case where [math]k[/math] is uniform. Let us examine first the case where [math]k[/math] consists of an alternating sequence of [math]\circ[/math] and [math]\bullet[/math] symbols, as follows:
[math]k=\underbrace{\circ\bullet\circ\bullet\ldots\circ\bullet}_{2p}[/math]
In this case it is convenient to relabel our multi-index [math]i=(i_1,\ldots,i_s)[/math], with [math]s=2p[/math], in the form [math](j_1,l_1,j_2,l_2,\ldots,j_p,l_p)[/math]. With this done, our moment formula becomes:
[math]M_k\left(\frac{Z}{\sqrt N}\right)=\frac{1}{N^{p+1}}\sum_{j_1\ldots j_p}\sum_{l_1\ldots l_p}\mathbb E\left(Z_{j_1l_1}\bar Z_{j_2l_1}Z_{j_2l_2}\bar Z_{j_3l_2}\ldots Z_{j_pl_p}\bar Z_{j_1l_p}\right)[/math]
Now observe that, with [math]k[/math] being as above, we have an identification [math]\mathcal P_2(k)\simeq S_p[/math], obtained in the obvious way. With this done too, our moment formula becomes:
[math]M_k\left(\frac{Z}{\sqrt N}\right)=\frac{t^p}{N^{p+1}}\sum_{\pi\in S_p}\#\left\{(j,l)\ \Big|\ j_r=j_{\pi(r)+1},\ l_r=l_{\pi(r)},\ \forall r\right\}[/math]
(7) We are now ready to do our asymptotic study, and prove the claim in (4). Let indeed [math]\gamma\in S_p[/math] be the full cycle, which is by definition the following permutation:
[math]\gamma(r)=r+1[/math]
with the indices being taken modulo [math]p[/math].
In terms of [math]\gamma[/math], the conditions [math]j_r=j_{\pi(r)+1}[/math] and [math]l_r=l_{\pi(r)}[/math] found above read:
[math]j=j\circ\gamma\pi\quad,\quad l=l\circ\pi[/math]
Counting the number of free parameters in our moment formula, we obtain:
[math]M_k\left(\frac{Z}{\sqrt N}\right)=\frac{t^p}{N^{p+1}}\sum_{\pi\in S_p}N^{|\gamma\pi|+|\pi|}=t^p\sum_{\pi\in S_p}N^{|\gamma\pi|+|\pi|-p-1}[/math]
with [math]|.|[/math] standing here for the number of orbits of a permutation.
(8) The point now is that the last exponent is well-known to be [math]\leq 0[/math], with equality precisely when the permutation [math]\pi\in S_p[/math] is geodesic, which in practice means that [math]\pi[/math] must come from a noncrossing partition. Thus we obtain, in the [math]N\to\infty[/math] limit, as desired:
[math]M_k\left(\frac{Z_N}{\sqrt N}\right)\simeq\#\left\{\pi\in S_p\ \Big|\ |\gamma\pi|+|\pi|=p+1\right\}\cdot t^p=\sum_{\pi\in\mathcal{NC}_2(k)}t^{|\pi|}[/math]
This finishes the proof in the case of the exponents [math]k[/math] which are alternating, and the case where [math]k[/math] is an arbitrary uniform exponent is similar, by permuting everything.
As a conclusion to this, we have obtained as asymptotic law for the Gaussian matrices a certain mysterious distribution, having as moments some numbers which are similar to the moments of the usual normal laws, but with the “underlying matching pairings being now replaced by underlying matching noncrossing pairings”. More on this later.
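As a numerical illustration, here is a Python sketch, with NumPy assumed, checking that at [math]t=1[/math] the alternating moments of [math]Z_N/\sqrt N[/math] approach the numbers [math]|\mathcal{NC}_2(K)|[/math], which, as we will see below, are the Catalan numbers:

```python
import numpy as np

rng = np.random.default_rng(4)
N, trials = 300, 20
catalan = [1, 1, 2, 5, 14]

# alternating moments of Z/sqrt(N) at t = 1, i.e. tr((ZZ*/N)^k),
# which should approach the Catalan numbers C_k for N large
for k in [1, 2, 3, 4]:
    vals = []
    for _ in range(trials):
        Z = (rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))) / np.sqrt(2)
        A = (Z @ Z.conj().T) / N
        vals.append(np.trace(np.linalg.matrix_power(A, k)).real / N)
    print(k, round(np.mean(vals), 3), catalan[k])
```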
Regarding now the Wigner matrices, we have here the following result, coming as a consequence of Theorem 6.20, via some simple algebraic manipulations:
Given a sequence of Wigner random matrices
[math]Z_N\in M_N(L^\infty(X))[/math]
with parameter [math]t \gt 0[/math], as in Definition 6.17, we have, for any integer [math]k\in\mathbb N[/math], in the [math]N\to\infty[/math] limit:
[math]M_k\left(\frac{Z_N}{\sqrt N}\right)\simeq\sum_{\pi\in NC_2(k)}t^{|\pi|}[/math]
This can be deduced from a direct computation based on the Wick formula, similar to that from the proof of Theorem 6.20, but the best is to deduce this result from Theorem 6.20 itself. Indeed, we know from there that for Gaussian matrices [math]Y_N\in M_N(L^\infty(X))[/math] we have the following formula, valid for any colored integer [math]K=\circ\bullet\bullet\circ\ldots\,[/math], in the [math]N\to\infty[/math] limit, with [math]\mathcal{NC}_2[/math] standing for noncrossing matching pairings:
[math]M_K\left(\frac{Y_N}{\sqrt N}\right)\simeq\sum_{\pi\in\mathcal{NC}_2(K)}t^{|\pi|}[/math]
By doing some combinatorics, we deduce from this that we have the following formula for the moments of the matrices [math]Re(Y_N)[/math], with respect to usual exponents, [math]k\in\mathbb N[/math]:
[math]M_k\left(\frac{Re(Y_N)}{\sqrt N}\right)\simeq\left(\frac{1}{2}\right)^{k/2}\sum_{\pi\in NC_2(k)}t^{|\pi|}[/math]
Now since the matrices [math]Z_N=\sqrt{2}Re(Y_N)[/math] are of Wigner type, this gives the result.
Summarizing, all this brings us into counting noncrossing pairings. So, let us start with some preliminaries here. We first have the following well-known result:
The Catalan numbers, which are by definition given by
[math]C_k=|NC_2(2k)|[/math]
their generating series [math]f(z)=\sum_{k\geq0}C_kz^k[/math] satisfies the equation
[math]f=1+zf^2[/math]
and we have the following explicit formula for these numbers:
[math]C_k=\frac{1}{k+1}\binom{2k}{k}[/math]
We must count the noncrossing pairings of [math]\{1,\ldots,2k\}[/math]. Now observe that such a pairing appears by pairing 1 to an even number, [math]2a[/math], and then inserting a noncrossing pairing of [math]\{2,\ldots,2a-1\}[/math], and a noncrossing pairing of [math]\{2a+1,\ldots,2k\}[/math]. We conclude that we have the following recurrence formula for the Catalan numbers:
[math]C_k=\sum_{a+b=k-1}C_aC_b[/math]
In terms of the generating series [math]f(z)=\sum_{k\geq0}C_kz^k[/math], this recurrence formula reads:
[math]f=1+zf^2[/math]
Thus [math]f[/math] satisfies [math]zf^2-f+1=0[/math], and by solving this equation, and choosing the solution which is bounded at [math]z=0[/math], we obtain the following formula:
[math]f(z)=\frac{1-\sqrt{1-4z}}{2z}[/math]
In order to finish, we use the generalized binomial formula, which gives:
[math]\sqrt{1+t}=\sum_{k\geq0}\binom{1/2}{k}t^k=1-2\sum_{k\geq1}C_{k-1}\left(\frac{-t}{4}\right)^k[/math]
Now back to our series [math]f[/math], we obtain the following formula for it:
[math]f(z)=\frac{1-\sqrt{1-4z}}{2z}=\frac{1}{2z}\cdot2\sum_{k\geq1}C_{k-1}z^k=\sum_{k\geq0}C_kz^k[/math]
It follows that the Catalan numbers are given by:
[math]C_k=\frac{1}{k+1}\binom{2k}{k}[/math]
Thus, we are led to the conclusion in the statement.
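As a quick check of the above recurrence and closed formula, in Python, with the code being ours:

```python
from math import comb

# Catalan numbers via the recurrence C_{k+1} = sum_{a+b=k} C_a C_b
C = [1]
for k in range(9):
    C.append(sum(C[a] * C[k - a] for a in range(k + 1)))

# against the closed formula C_k = binom(2k, k) / (k + 1)
print(C)
print([comb(2 * k, k) // (k + 1) for k in range(10)])
```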
In order to recapture now the Wigner measure from its moments, we can use:
The Catalan numbers are the even moments of the measure
[math]\gamma_1=\frac{1}{2\pi}\sqrt{4-x^2}\,dx[/math]
called Wigner semicircle law on [math][-2,2][/math].
The even moments of the semicircle law in the statement can be computed with the change of variable [math]x=2\cos t[/math], and we are led to the following formula:
[math]M_{2k}=\frac{1}{2\pi}\int_{-2}^2\sqrt{4-x^2}\,x^{2k}\,dx=\frac{2\cdot4^k}{\pi}\int_0^\pi\cos^{2k}t\,\sin^2t\,dt=\frac{1}{k+1}\binom{2k}{k}=C_k[/math]
As for the odd moments, these all vanish, because the density of [math]\gamma_1[/math] is an even function. Thus, we are led to the conclusion in the statement.
More generally, we have the following result, involving a parameter [math]t \gt 0[/math]:
Given [math]t \gt 0[/math], the real measure having as even moments the numbers [math]M_{2k}=t^kC_k[/math] and having all odd moments [math]0[/math] is the measure
[math]\gamma_t=\frac{1}{2\pi t}\sqrt{4t-x^2}\,dx[/math]
called Wigner semicircle law on [math][-2\sqrt t,2\sqrt t][/math].
This follows indeed from Proposition 6.23, via a change of variables.
Now by putting everything together, we obtain the Wigner theorem, as follows:
Given a sequence of Wigner random matrices
[math]Z_N\in M_N(L^\infty(X))[/math]
with parameter [math]t \gt 0[/math], as in Definition 6.17, we have, with [math]N\to\infty[/math], in moments:
[math]\frac{Z_N}{\sqrt N}\sim\gamma_t[/math]
This follows indeed from all the above, and more specifically, by combining Theorem 6.21, Theorem 6.22 and Proposition 6.24.
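As an illustration, here is a Python sketch of the Wigner theorem, with NumPy assumed, constructing one big Wigner matrix as in Definition 6.17, and comparing the empirical eigenvalue moments with the semicircle moments [math]t^{k/2}C_{k/2}[/math]:

```python
import numpy as np

rng = np.random.default_rng(5)
N, t = 1000, 1.0
catalan = [1, 1, 2, 5, 14]

# one Wigner matrix: g_t variables on the diagonal, G_t above it, self-adjoint
a = rng.normal(0, np.sqrt(t), size=N)
b = (rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))) * np.sqrt(t / 2)
Z = np.triu(b, 1)
Z = Z + Z.conj().T + np.diag(a)

eig = np.linalg.eigvalsh(Z / np.sqrt(N))

# even moments of the semicircle law gamma_t are t^(k/2) C_(k/2)
for k in [2, 4, 6, 8]:
    print(k, np.mean(eig**k).round(3), t**(k // 2) * catalan[k // 2])
```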
Regarding now the complex Gaussian matrices, in view of this result, it is natural to think of the law found in Theorem 6.20 as being “circular”. But this is just a thought, and more on this later, in chapter 8 below, when doing free probability.
6d. Wishart matrices
Let us discuss now the Wishart matrices, which are the positive analogues of the Wigner matrices. Quite surprisingly, the computation here leads to the Catalan numbers, but not in the same way as for the Wigner matrices, the result being as follows:
Given a sequence of complex Wishart matrices
[math]W_N=Y_NY_N^*\in M_N(L^\infty(X))[/math]
with [math]Y_N[/math] being [math]N\times N[/math] complex Gaussian matrices of parameter [math]t \gt 0[/math], we have, for any [math]k\in\mathbb N[/math], in the [math]N\to\infty[/math] limit:
[math]M_k\left(\frac{W_N}{N}\right)\simeq t^kC_k[/math]
There are several possible proofs for this result, as follows:
(1) A first method is by using the formula that we have in Theorem 6.20, for the Gaussian matrices [math]Y_N[/math]. Indeed, we know from there that we have the following formula, valid for any colored integer [math]K=\circ\bullet\bullet\circ\ldots\,[/math], in the [math]N\to\infty[/math] limit:
[math]M_K\left(\frac{Y_N}{\sqrt N}\right)\simeq\sum_{\pi\in\mathcal{NC}_2(K)}t^{|\pi|}[/math]
With [math]K=\circ\bullet\circ\bullet\ldots\,[/math], alternating word of length [math]2k[/math], with [math]k\in\mathbb N[/math], this gives:
[math]M_K\left(\frac{Y_N}{\sqrt N}\right)\simeq t^k\,|\mathcal{NC}_2(K)|[/math]
Thus, in terms of the Wishart matrix [math]W_N=Y_NY_N^*[/math] we have, for any [math]k\in\mathbb N[/math]:
[math]M_k\left(\frac{W_N}{N}\right)\simeq t^k\,|\mathcal{NC}_2(K)|[/math]
The point now is that, by doing some combinatorics, we have:
[math]|\mathcal{NC}_2(K)|=|NC_2(2k)|=C_k[/math]
Thus, we are led to the formula in the statement.
(2) A second method, that we will explain now as well, is by proving the result directly, starting from definitions. The matrix entries of our matrix [math]W=W_N[/math] are given by:
[math]W_{ij}=\sum_rY_{ir}\bar Y_{jr}[/math]
Thus, the normalized traces of powers of [math]W[/math] are given by the following formula:
[math]tr(W^k)=\frac{1}{N}\sum_{i_1\ldots i_k}\sum_{r_1\ldots r_k}Y_{i_1r_1}\bar Y_{i_2r_1}Y_{i_2r_2}\bar Y_{i_3r_2}\ldots Y_{i_kr_k}\bar Y_{i_1r_k}[/math]
By rescaling now [math]W[/math] by a [math]1/N[/math] factor, as in the statement, we obtain:
[math]tr\left(\left(\frac{W}{N}\right)^k\right)=\frac{1}{N^{k+1}}\sum_{i_1\ldots i_k}\sum_{r_1\ldots r_k}Y_{i_1r_1}\bar Y_{i_2r_1}Y_{i_2r_2}\bar Y_{i_3r_2}\ldots Y_{i_kr_k}\bar Y_{i_1r_k}[/math]
By using now the Wick rule, we obtain the following formula for the moments, with [math]K=\circ\bullet\circ\bullet\ldots\,[/math], alternating word of length [math]2k[/math], and with [math]I=(i_1r_1,i_2r_1,\ldots,i_kr_k,i_1r_k)[/math]:
[math]M_k\left(\frac{W}{N}\right)=\frac{t^k}{N^{k+1}}\sum_{\pi\in\mathcal P_2(K)}\#\left\{I\ \Big|\ \pi\leq\ker I\right\}[/math]
In order to compute this quantity, we use the standard bijection [math]\mathcal P_2(K)\simeq S_k[/math]. By identifying the pairings [math]\pi\in\mathcal P_2(K)[/math] with their counterparts [math]\pi\in S_k[/math], we obtain:
[math]M_k\left(\frac{W}{N}\right)=\frac{t^k}{N^{k+1}}\sum_{\pi\in S_k}\#\left\{(i,r)\ \Big|\ i_m=i_{\pi(m)+1},\ r_m=r_{\pi(m)},\ \forall m\right\}[/math]
Now let [math]\gamma\in S_k[/math] be the full cycle, which is by definition the following permutation:
[math]\gamma(m)=m+1[/math]
with the indices being taken modulo [math]k[/math].
The general factor in the product computed above is then 1 precisely when the following two conditions are simultaneously satisfied:
[math]i=i\circ\gamma\pi\quad,\quad r=r\circ\pi[/math]
Counting the number of free parameters in our moment formula, we obtain:
[math]M_k\left(\frac{W}{N}\right)=\frac{t^k}{N^{k+1}}\sum_{\pi\in S_k}N^{|\gamma\pi|+|\pi|}=t^k\sum_{\pi\in S_k}N^{|\gamma\pi|+|\pi|-k-1}[/math]
The point now is that the last exponent is well-known to be [math]\leq 0[/math], with equality precisely when the permutation [math]\pi\in S_k[/math] is geodesic, which in practice means that [math]\pi[/math] must come from a noncrossing partition. Thus we obtain, in the [math]N\to\infty[/math] limit:
[math]M_k\left(\frac{W_N}{N}\right)\simeq\#\left\{\pi\in S_k\ \Big|\ |\gamma\pi|+|\pi|=k+1\right\}\cdot t^k=t^kC_k[/math]
Thus, we are led to the conclusion in the statement.
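Here is the corresponding numerical experiment for the Wishart matrices, again a Python sketch with NumPy assumed:

```python
import numpy as np

rng = np.random.default_rng(6)
N, t = 800, 1.0
catalan = [1, 1, 2, 5, 14]

# a complex Wishart matrix W = YY*, with Y complex Gaussian of parameter t
Y = (rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))) * np.sqrt(t / 2)
W = Y @ Y.conj().T

# the moments of W/N should approach t^k C_k
eig = np.linalg.eigvalsh(W / N)
for k in [1, 2, 3, 4]:
    print(k, np.mean(eig**k).round(3), t**k * catalan[k])
```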
As a consequence of the above result, we have a new look on the Catalan numbers, which is more adapted to our present Wishart matrix considerations, as follows:
The Catalan numbers [math]C_k=|NC_2(2k)|[/math] appear as well as
[math]C_k=|NC(k)|[/math]
with [math]NC(k)[/math] standing for the noncrossing partitions of [math]\{1,\ldots,k\}[/math].
This follows indeed from the proof of Theorem 6.26. Observe that we obtain as well a formula in terms of matching pairings of alternating colored integers.
The direct explanation for the above formula, relating noncrossing partitions and pairings, comes from the following result, which is very useful, and good to know:
We have a bijection between noncrossing partitions and pairings
[math]NC(k)\simeq NC_2(2k)[/math]
which is constructed as follows:
- The application [math]NC(k)\to NC_2(2k)[/math] is the “fattening” one, obtained by doubling all the legs, and doubling all the strings as well.
- Its inverse [math]NC_2(2k)\to NC(k)[/math] is the “shrinking” application, obtained by collapsing pairs of consecutive neighbors.
The fact that the two operations in the statement are indeed inverse to each other is clear, by computing the corresponding two compositions, with the remark that the construction of the fattening operation requires the partitions to be noncrossing.
Getting back now to probability, we are led to the question of finding the law having the Catalan numbers as moments, in the above way. The result here is as follows:
The real measure having the Catalan numbers as moments is
[math]\pi_1=\frac{1}{2\pi}\sqrt{4x^{-1}-1}\,dx[/math]
called Marchenko-Pastur law on [math][0,4][/math].
The moments of the law [math]\pi_1[/math] in the statement can be computed with the change of variable [math]x=4\cos^2t[/math], as follows:
[math]M_k=\frac{1}{2\pi}\int_0^4\sqrt{4x^{-1}-1}\,x^k\,dx=\frac{4^{k+1}}{\pi}\int_0^{\pi/2}\cos^{2k}t\,\sin^2t\,dt=\frac{1}{k+1}\binom{2k}{k}=C_k[/math]
Thus, we are led to the conclusion in the statement.
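As a numerical double-check of the above change of variable, here is a Python sketch, integrating by the midpoint rule, with the code being ours:

```python
import numpy as np

# the k-th moment of pi_1, after the change of variable x = 4 cos^2(t),
# becomes (4^(k+1)/pi) * integral of cos(t)^(2k) sin(t)^2 over [0, pi/2]
n = 200_000
ds = np.pi / 2 / n
s = (np.arange(n) + 0.5) * ds  # midpoint rule on [0, pi/2]

catalan = [1, 1, 2, 5, 14]
for k in range(5):
    integrand = 4**(k + 1) / np.pi * np.cos(s)**(2 * k) * np.sin(s)**2
    print(k, round(float(np.sum(integrand) * ds), 4), catalan[k])
```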
Now back to the Wishart matrices, we are led to the following result:
Given a sequence of complex Wishart matrices
[math]W_N=Y_NY_N^*\in M_N(L^\infty(X))[/math]
with [math]Y_N[/math] being [math]N\times N[/math] complex Gaussian matrices of parameter [math]t \gt 0[/math], we have, with [math]N\to\infty[/math], in moments:
[math]\frac{W_N}{tN}\sim\pi_1[/math]
This follows indeed from Theorem 6.26 and Proposition 6.29.
As a comment now, while the above result is definitely something interesting at [math]t=1[/math], at general [math]t \gt 0[/math] this looks more like a “fake” generalization of the [math]t=1[/math] result, because the law [math]\pi_1[/math] stays the same, modulo a trivial rescaling. The reasons behind this phenomenon are quite subtle, and skipping some discussion, the point is that Theorem 6.30 is indeed something “fake” at general [math]t \gt 0[/math], and the correct generalization of the [math]t=1[/math] computation, involving more general classes of complex Wishart matrices, is as follows:
Given a sequence of general complex Wishart matrices
[math]W_N=Y_NY_N^*\in M_N(L^\infty(X))[/math]
with [math]Y_N[/math] being [math]N\times M[/math] complex Gaussian matrices, we have, in the [math]M=tN\to\infty[/math] regime, with [math]t \gt 0[/math], in moments,
[math]\frac{W_N}{N}\sim\pi_t[/math]
where [math]\pi_t[/math] is the Marchenko-Pastur law of parameter [math]t[/math], given by:
[math]\pi_t=\max(1-t,0)\delta_0+\frac{\sqrt{4t-(x-1-t)^2}}{2\pi x}\,dx[/math]
This follows once again by using the moment method, the limiting moments in the [math]M=tN\to\infty[/math] regime being as follows, after doing the combinatorics:
[math]M_k\left(\frac{W_N}{N}\right)\simeq\sum_{\pi\in NC(k)}t^{|\pi|}[/math]
But these numbers are the moments of the Marchenko-Pastur law [math]\pi_t[/math], which in addition has the density given by the formula in the statement, and this gives the result.
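Finally, here is a Python sketch of the above Marchenko-Pastur convergence, with NumPy assumed; the comparison values, namely [math]2,6,22[/math] at [math]t=2[/math], are the partition sums [math]\sum_{\pi\in NC(k)}t^{|\pi|}[/math] for [math]k=1,2,3[/math]:

```python
import numpy as np

rng = np.random.default_rng(7)
N, t = 1000, 2.0
M = int(t * N)

# rectangular N x M complex Gaussian matrix, then W = YY*
Y = (rng.normal(size=(N, M)) + 1j * rng.normal(size=(N, M))) / np.sqrt(2)
W = Y @ Y.conj().T

eig = np.linalg.eigvalsh(W / N)

# limiting moments are sums over noncrossing partitions; at t = 2 these
# are 2, 6, 22 for k = 1, 2, 3
for k, exact in [(1, 2.0), (2, 6.0), (3, 22.0)]:
    print(k, np.mean(eig**k).round(3), exact)
```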
As a philosophical conclusion now, we have 4 main laws in what we have been doing so far, namely the Gaussian laws [math]g_t[/math], the Poisson laws [math]p_t[/math], the Wigner laws [math]\gamma_t[/math] and the Marchenko-Pastur laws [math]\pi_t[/math]. These laws naturally form a diagram, as follows:
[math]\begin{matrix}\pi_t&-&\gamma_t\\ \vert&&\vert\\ p_t&-&g_t\end{matrix}[/math]
with the free laws [math]\pi_t,\gamma_t[/math] upstairs, their classical counterparts [math]p_t,g_t[/math] downstairs, the discrete laws on the left, and the continuous laws on the right.
We will see in chapter 8 that [math]\pi_t,\gamma_t[/math] appear as “free analogues” of [math]p_t,g_t[/math], and that a full theory can be developed, with central limiting theorems for all 4 laws, convolution semigroup results for all 4 laws too, and Lie group type results for all 4 laws too. And also, we will be back to the random matrices as well, with further results about them.
General references
Banica, Teo (2024). "Principles of operator algebras". arXiv:2208.03600 [math.OA].
References
- D.V. Voiculescu, K.J. Dykema and A. Nica, Free random variables, AMS (1992).
- E. Wigner, Characteristic vectors of bordered matrices with infinite dimensions, Ann. of Math. 62 (1955), 548--564.
- V.A. Marchenko and L.A. Pastur, Distribution of eigenvalues in certain sets of random matrices, Mat. Sb. 72 (1967), 507--536.