Free probability
9a. Freeness
Welcome to free probability. We have already met some of it, and in this chapter and in the next three we discuss the foundations and main results of free probability, in analogy with the foundations and main results of classical probability.
The common framework for classical and free probability is “noncommutative probability”. This is something very general, that we already met in connection with the random matrices, in chapters 5-8. We first recall this material. Let us start with:
A [math]C^*[/math]-algebra is a complex algebra [math]A[/math], having a norm [math]||.||[/math] making it a Banach algebra, and an involution [math]*[/math], related to the norm by the formula
As a basic example, the algebra [math]B(H)[/math] of the bounded linear operators [math]T:H\to H[/math] on a complex Hilbert space [math]H[/math] is a [math]C^*[/math]-algebra, with the usual norm and involution:
More generally, any closed [math]*[/math]-subalgebra of [math]B(H)[/math] is a [math]C^*[/math]-algebra. It is possible to prove that any [math]C^*[/math]-algebra appears in this way, as explained in chapter 5:
In finite dimensions we have [math]H=\mathbb C^N[/math], and so the operator algebra [math]B(H)[/math] is the usual matrix algebra [math]M_N(\mathbb C)[/math], with the usual norm and involution, namely:
As explained in chapter 4, in the context of Peter-Weyl theory, some algebra shows that the finite dimensional [math]C^*[/math]-algebras are the direct sums of matrix algebras:
Summarizing, the [math]C^*[/math]-algebra formalism is something in between the [math]*[/math]-algebras, which are purely algebraic objects, and whose theory basically leads nowhere, and the fully advanced operator algebras, which are the von Neumann algebras. More on this later.
As yet another class of examples now, which are of particular importance for us, we have various algebras of functions [math]f:X\to\mathbb C[/math]. The theory here is as follows:
The commutative [math]C^*[/math]-algebras are the algebras of type [math]C(X)[/math], with [math]X[/math] being a compact space, the correspondence being as follows:
- Given a compact space [math]X[/math], the algebra [math]C(X)[/math] of continuous functions [math]f:X\to\mathbb C[/math] is a commutative [math]C^*[/math]-algebra, with norm and involution as follows:
[[math]] ||f||=\sup_{x\in X}|f(x)|\quad,\quad f^*(x)=\overline{f(x)} [[/math]]
- Conversely, any commutative [math]C^*[/math]-algebra can be written as [math]A=C(X)[/math], with its “spectrum” appearing as the space of Banach algebra characters of [math]A[/math]:
[[math]] X=\big\{\chi:A\to\mathbb C\big\} [[/math]]
In view of this, given an arbitrary [math]C^*[/math]-algebra [math]A[/math], not necessarily commutative, we agree to write [math]A=C(X)[/math], and call the abstract space [math]X[/math] a compact quantum space.
This is something that we know from chapter 5, the idea being as follows:
(1) First of all, the fact that [math]C(X)[/math] is a Banach algebra is clear, because a uniform limit of continuous functions must be continuous. As for the formula [math]||ff^*||=||f||^2[/math], this is something trivial for functions, because on both sides we obtain [math]\sup_{x\in X}|f(x)|^2[/math].
(2) Given a commutative [math]C^*[/math]-algebra [math]A[/math], the character space [math]X=\{\chi:A\to\mathbb C\}[/math] is indeed compact, and we have an evaluation morphism [math]ev:A\to C(X)[/math]. The tricky point, which follows from basic spectral theory, is to prove that [math]ev[/math] is indeed isometric.
The above result is quite interesting for us, because it allows one to formally write any [math]C^*[/math]-algebra as [math]A=C(X)[/math], with [math]X[/math] being a noncommutative compact space. This is certainly something very nice, and in order to do now some probability theory over such spaces [math]X[/math], we would need probability measures [math]\mu[/math]. But, the problem is that these measures [math]\mu[/math] are impossible to define, because our spaces [math]X[/math] have no points in general.
However, we can use a trick, and do probability theory just by using expectation functionals [math]E:A\to\mathbb C[/math], instead of the probability measures [math]\mu[/math] themselves. These expectations are called traces, and are denoted [math]tr:A\to\mathbb C[/math], and their axiomatization is as follows:
A trace, or expectation, or integration functional, on a [math]C^*[/math]-algebra [math]A[/math] is a linear form [math]tr:A\to\mathbb C[/math] having the following properties:
- [math]tr[/math] is unital, and continuous.
- [math]tr[/math] is positive, [math]a\geq0\implies tr(a)\geq0[/math].
- [math]tr[/math] has the trace property [math]tr(ab)=tr(ba)[/math].
We call [math]tr[/math] faithful when [math]a \gt 0\implies tr(a) \gt 0[/math].
In the commutative case, [math]A=C(X)[/math], the Riesz theorem shows that the positive traces [math]tr:A\to\mathbb C[/math] appear as integration functionals with respect to positive measures [math]\mu[/math]:
Moreover, the unitality of [math]tr[/math] corresponds to the fact that [math]\mu[/math] has mass one, and the faithfulness of [math]tr[/math] corresponds to the faithfulness of [math]\mu[/math]. Thus, in general, when [math]A[/math] is no longer commutative, in order to do probability theory on the underlying noncommutative compact space [math]X[/math], what we need is a faithful trace [math]tr:A\to\mathbb C[/math] as above.
So, this will be our philosophy in what follows, a noncommutative probability space [math](X,\mu)[/math] being something abstract, corresponding in practice to a pair [math](A,tr)[/math]. This is of course something a bit simplified, because associated to any space [math]X[/math], noncommutative or even classical, there are in fact many possible [math]C^*[/math]-algebras of functions [math]f:X\to\mathbb C[/math], such as [math]C(X)[/math], [math]L^\infty(X)[/math] and so on, and for a better theory, we would have to make a choice between these various [math]C^*[/math]-algebras associated to [math]X[/math]. But let us not worry about this for the moment, what we have is good for starting some computations, so let us just do these computations, see what we get, and we will come back later with more about the formalism.
Going ahead with definitions, everything in what follows will be based on:
Let [math]A[/math] be a [math]C^*[/math]-algebra, given with a trace [math]tr:A\to\mathbb C[/math].
- The elements [math]a\in A[/math] are called random variables.
- The moments of such a variable are the numbers [math]M_k(a)=tr(a^k)[/math].
- The law of such a variable is the functional [math]\mu:P\to tr(P(a))[/math].
Here [math]k=\circ\bullet\bullet\circ\ldots[/math] is by definition a colored integer, and the corresponding powers [math]a^k[/math] are defined by the following formulae, and multiplicativity:
As for the polynomial [math]P[/math], this is a noncommuting [math]*[/math]-polynomial in one variable:
Observe that the law is uniquely determined by the moments, because we have:
Generally speaking, the above definition is something quite abstract, but there is no other way of doing things, at least at this level of generality. However, in certain special cases, the formalism simplifies, and we recover more familiar objects, as follows:
Assuming that [math]a\in A[/math] is normal, [math]aa^*=a^*a[/math], its law corresponds to a probability measure on its spectrum [math]\sigma(a)\subset\mathbb C[/math], according to the following formula:
This is something very standard, coming from the continuous functional calculus in [math]C^*[/math]-algebras, explained in chapter 5. In fact, we can deduce from there that more is true, in the sense that the following formula holds, for any [math]f\in C(\sigma(a))[/math]:
In addition, assuming that we are in the case [math]A\subset B(H)[/math], the measurable functional calculus tells us that the above formula holds in fact for any [math]f\in L^\infty(\sigma(a))[/math].
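As an informal numerical illustration of this, not part of the text: in the simplest matrix model [math]A=M_N(\mathbb C)[/math] with its normalized trace, the law of a self-adjoint matrix is the uniform measure on its eigenvalues, so the moments [math]tr(a^k)[/math] must equal the averages of [math]\lambda^k[/math] over the spectrum. This is a quick sketch, the matrix below being an arbitrary choice:

```python
import numpy as np

# Toy check of tr(f(a)) = ∫ f dμ_a in A = M_N(C), with tr = Tr/N: for a
# self-adjoint matrix the law μ_a is the uniform measure on the eigenvalues.
# The matrix below is an arbitrary illustrative choice.
N = 4
rng = np.random.default_rng(0)
x = rng.standard_normal((N, N))
a = (x + x.T) / 2                       # self-adjoint, hence normal

def tr(m):
    return np.trace(m) / N              # normalized trace

eigvals = np.linalg.eigvalsh(a)
for k in range(1, 6):
    lhs = tr(np.linalg.matrix_power(a, k))   # moment tr(a^k)
    rhs = np.mean(eigvals ** k)              # average of λ^k over σ(a)
    assert abs(lhs - rhs) < 1e-10
```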
We have the following independence notion, generalizing the one from chapter 1:
Two subalgebras [math]A,B\subset C[/math] are called independent when the following condition is satisfied, for any [math]a\in A[/math] and [math]b\in B[/math]:
Observe that the above two independence conditions are indeed equivalent, as shown by the following computation, with the convention [math]a'=a-tr(a)[/math]:
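As an informal sketch of this independence condition in the simplest model, not part of the text: take [math]A=M_2(\mathbb C)[/math] and [math]B=M_3(\mathbb C)[/math] sitting inside [math]C=A\otimes B[/math] via [math]a\to a\otimes1[/math] and [math]b\to1\otimes b[/math], all with normalized traces; then [math]tr(ab)=tr(a)tr(b)[/math] holds identically, the matrices below being arbitrary choices:

```python
import numpy as np

# Independence in the simplest matrix model: a → a ⊗ 1 and b → 1 ⊗ b inside
# M_2(C) ⊗ M_3(C) = M_6(C), with normalized traces throughout.
rng = np.random.default_rng(1)
a = rng.standard_normal((2, 2))
b = rng.standard_normal((3, 3))

def tr(m):
    return np.trace(m) / m.shape[0]     # normalized trace, in any size

A = np.kron(a, np.eye(3))               # a ⊗ 1
B = np.kron(np.eye(2), b)               # 1 ⊗ b

assert np.isclose(tr(A @ B), tr(a) * tr(b))   # the independence condition
assert np.allclose(A @ B, B @ A)              # the two subalgebras commute
```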
The other remark is that the above notion generalizes indeed the usual notion of independence, from the classical case, the precise result here being as follows:
Given two compact measured spaces [math]X,Y[/math], the algebras
We have two assertions here, the idea being as follows:
(1) First of all, given two abstract compact spaces [math]X,Y[/math], we have embeddings of algebras as in the statement, defined by the following formulae:
In the measured space case now, the Fubini theorem tells us that we have:
Thus, the algebras [math]C(X),C(Y)[/math] are independent in the sense of Definition 9.6.
(2) Conversely, assume that [math]A,B\subset C[/math] are independent, with [math]C[/math] being commutative. Let us write our algebras as follows, with [math]X,Y,Z[/math] being certain compact spaces:
In this picture, the inclusions [math]A,B\subset C[/math] must come from quotient maps, as follows:
Regarding now the independence condition from Definition 9.6, in the above picture, this tells us that the following equality must happen:
Thus we are in a Fubini type situation, and we obtain from this:
Thus, the independence of the algebras [math]A,B\subset C[/math] appears as in (1) above.
It is possible to develop some theory here, but this is ultimately not very interesting. As a much more interesting notion now, we have Voiculescu's freeness [1]:
Two subalgebras [math]A,B\subset C[/math] are called free when the following condition is satisfied, for any [math]a_i\in A[/math] and [math]b_i\in B[/math]:
In short, freeness appears by definition as a kind of “free analogue” of usual independence, taking into account the fact that the variables do not necessarily commute. As a first observation, of theoretical nature, there is actually a certain lack of symmetry between Definition 9.6 and Definition 9.8, because in contrast to the former, the latter does not include an explicit formula for the quantities of the following type:
However, this is not an issue, and is simply due to the fact that the formula in the free case is something more complicated, the precise result being as follows:
Assuming that [math]A,B\subset C[/math] are free, the restriction of [math]tr[/math] to [math] \lt A,B \gt [/math] can be computed in terms of the restrictions of [math]tr[/math] to [math]A,B[/math]. To be more precise,
This is something a bit theoretical, so let us begin with an example. Our claim is that if [math]a,b[/math] are free then, exactly as in the case where we have independence:
Indeed, let us go back to the computation performed after Definition 9.6, which was as follows, with the convention [math]a'=a-tr(a)[/math]:
Our claim is that this computation perfectly works under the sole freeness assumption. Indeed, the only non-trivial equality is the last one, which follows from:
In general, the situation is of course more complicated than this, but the same trick applies. To be more precise, we can start our computation as follows:
Observe that we have used here the freeness condition, in the following form:
Now regarding the “other terms”, those which are left, each of them will consist of a product of traces of type [math]tr(a_i)[/math] and [math]tr(b_i)[/math], and then a trace of a product still remaining to be computed, which is of the following form, for some elements [math]\alpha_i\in A[/math] and [math]\beta_i\in B[/math]:
To be more precise, the variables [math]\alpha_i\in A[/math] appear as ordered products of those [math]a_i\in A[/math] not getting into individual traces [math]tr(a_i)[/math], and the variables [math]\beta_i\in B[/math] appear as ordered products of those [math]b_i\in B[/math] not getting into individual traces [math]tr(b_i)[/math]. Now since the length of each such alternating product [math]\alpha_1\beta_1\alpha_2\beta_2\ldots[/math] is smaller than the length of the original product [math]a_1b_1a_2b_2\ldots[/math], we are led into a recurrence, and this gives the result.
Let us discuss now some models for independence and freeness. We have the following result, from [1], which clarifies the analogy between independence and freeness:
Given two algebras [math](A,tr)[/math] and [math](B,tr)[/math], the following hold:
- [math]A,B[/math] are independent inside their tensor product [math]A\otimes B[/math], endowed with its canonical tensor product trace, given by [math]tr(a\otimes b)=tr(a)tr(b)[/math].
- [math]A,B[/math] are free inside their free product [math]A*B[/math], endowed with its canonical free product trace, given by the formulae in Proposition 9.9.
Both the above assertions are clear from definitions, as follows:
(1) This is clear with either of the definitions of the independence, from Definition 9.6, because we have by construction of the product trace:
Observe that there is a relation here with Theorem 9.7 as well, due to the following formula for compact spaces, with [math]\otimes[/math] being a topological tensor product:
To be more precise, the present statement generalizes the first assertion in Theorem 9.7, and the second assertion tells us that this generalization is more or less the same thing as the original statement. All this comes of course from basic measure theory.
(2) This is clear too from definitions, the only point being that of showing that the notion of freeness, or the recurrence formulae in Proposition 9.9, can be used in order to construct a canonical free product trace, on the free product of the algebras involved:
But this can be checked for instance by using a GNS construction. Indeed, consider the GNS constructions for the algebras [math](A,tr)[/math] and [math](B,tr)[/math]:
By taking the free product of these representations, we obtain a representation as follows, with the [math]*[/math] on the right being a free product of pointed Hilbert spaces:
Now by composing with the linear form [math]T\to \lt T\xi,\xi \gt [/math], where [math]\xi=1_A=1_B[/math] is the common distinguished vector of [math]l^2(A)[/math], [math]l^2(B)[/math], we obtain a linear form, as follows:
It is routine then to check that [math]tr[/math] is indeed a trace, and this is the “canonical free product trace” from the statement. Then, an elementary computation shows that [math]A,B[/math] are free inside [math]A*B[/math], with respect to this trace, and this finishes the proof. See [1].
9b. Free convolution
All the above was quite theoretical, and as a concrete application of the above results, bringing us into probability, we have the following result, from [2]:
We have a free convolution operation [math]\boxplus[/math] for the distributions
We have several verifications to be performed here, as follows:
(1) We first have to check that given two variables [math]a,b[/math] which live respectively in certain [math]C^*[/math]-algebras [math]A,B[/math], we can realize them inside some [math]C^*[/math]-algebra [math]C[/math], with exactly the same distributions [math]\mu_a,\mu_b[/math], so as to be able to sum them and talk about [math]\mu_{a+b}[/math]. But this comes from Theorem 9.10, because we can set [math]C=A*B[/math], as explained there.
(2) The other verification which is needed is that of the fact that if two variables [math]a,b[/math] are free, then the distribution [math]\mu_{a+b}[/math] depends only on the distributions [math]\mu_a,\mu_b[/math]. But for this purpose, we can use the general formula from Proposition 9.9, namely:
Now by plugging in arbitrary powers of [math]a,b[/math] as variables [math]a_i,b_j[/math], we obtain a family of formulae of the following type, with [math]Q[/math] being certain polynomials:
Thus the moments of [math]a+b[/math] depend only on the moments of [math]a,b[/math], with of course colored exponents in all this, according to our moment conventions, and this gives the result.
(3) Finally, in what regards the last assertion, about the real measures, this is clear from the fact that if the variables [math]a,b[/math] are self-adjoint, then so is their sum [math]a+b[/math].
Along the same lines, but with some technical subtleties this time, we can talk as well about multiplicative free convolution, following [3], as follows:
We have a free convolution operation [math]\boxtimes[/math] for the distributions
We have two statements here, the idea being as follows:
(1) The verifications for the fact that [math]\boxtimes[/math] as above is indeed well-defined at the general distribution level are identical to those done before for [math]\boxplus[/math], with the result basically coming from the formula in Proposition 9.9, and with Theorem 9.10 invoked as well, in order to say that we have a model, and so we can indeed use this formula.
(2) Regarding now the last assertion, about the real measures, this was something trivial for [math]\boxplus[/math], but is something trickier now for [math]\boxtimes[/math], because if we take [math]a,b[/math] to be self-adjoint, their product [math]ab[/math] will in general not be self-adjoint, and it definitely will not be if we want [math]a,b[/math] to be free, and so the formula [math]\mu_a\boxtimes\mu_b=\mu_{ab}[/math] will apparently make us exit the world of real probability measures. However, this is not exactly the case. Indeed, let us set:
This new variable is then self-adjoint, and its moments are given by:
Thus, we are led to the conclusion in the statement.
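As an informal check of this trick, not part of the text, with matrices in place of abstract variables: for positive matrices [math]a,b[/math], the element [math]\sqrt{a}b\sqrt{a}[/math] is self-adjoint, and has the same moments as [math]ab[/math], by the trace property. The matrices below are arbitrary choices:

```python
import numpy as np

# The trick behind μ_a ⊠ μ_b: for positive a, b the element √a b √a is
# self-adjoint, with the same moments as ab, by trace cyclicity.
rng = np.random.default_rng(2)
x = rng.standard_normal((4, 4))
y = rng.standard_normal((4, 4))
a = x @ x.T + np.eye(4)                 # positive matrix
b = y @ y.T + np.eye(4)                 # positive matrix

w, v = np.linalg.eigh(a)
sa = v @ np.diag(np.sqrt(w)) @ v.T      # the positive square root √a

def tr(m):
    return np.trace(m) / 4              # normalized trace

c = sa @ b @ sa
assert np.allclose(c, c.T)              # √a b √a is self-adjoint
for k in range(1, 6):
    assert np.isclose(tr(np.linalg.matrix_power(a @ b, k)),
                      tr(np.linalg.matrix_power(c, k)))
```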
We would like now to have linearization results for [math]\boxplus[/math] and [math]\boxtimes[/math], in the spirit of the known results for [math]*[/math] and [math]\times[/math]. We will do this slowly, in several steps. As a first objective, we would like to convert our one and only modeling result so far, namely Theorem 9.10, which is a rather abstract result, into something more concrete. Let us start with:
Let [math]\Gamma[/math] be a discrete group, and consider the complex group algebra [math]\mathbb C[\Gamma][/math], with involution given by the fact that all group elements are unitaries:
We have two assertions to be proved, the idea being as follows:
(1) In order to prove the first assertion, regarding the maximal seminorm which is a norm, we must find a [math]*[/math]-algebra embedding as follows, with [math]H[/math] being a Hilbert space:
For this purpose, consider the Hilbert space [math]H=l^2(\Gamma)[/math], having the family [math]\{h\}_{h\in\Gamma}[/math] as orthonormal basis. Our claim is that we have an embedding, as follows:
Indeed, since [math]\pi(g)[/math] maps the basis [math]\{h\}_{h\in\Gamma}[/math] into itself, this operator is well-defined and bounded, and is an isometry. It is also clear from the formula [math]\pi(g)(h)=gh[/math] that [math]g\to\pi(g)[/math] is a morphism of algebras, and since this morphism maps the unitaries [math]g\in\Gamma[/math] into isometries, this is a morphism of [math]*[/math]-algebras. Finally, the faithfulness of [math]\pi[/math] is clear.
(2) Regarding the second assertion, we can use here once again the above construction. Indeed, we can define a linear form on the image of [math]C^*(\Gamma)[/math], as follows:
This functional is then positive, and is easily seen to be a trace. Moreover, on the group elements [math]g\in\Gamma[/math], this functional is given by the following formula:
Thus, it remains to show that [math]tr[/math] is faithful on [math]\mathbb C[\Gamma][/math]. But this follows from the fact that [math]tr[/math] is faithful on the image of [math]C^*(\Gamma)[/math], which contains [math]\mathbb C[\Gamma][/math].
As an illustration, we have the following more precise result, in the abelian case:
Given a discrete abelian group [math]\Gamma[/math], we have an isomorphism
We have two assertions to be proved, the idea being as follows:
(1) Since [math]\Gamma[/math] is abelian, [math]A=C^*(\Gamma)[/math] is commutative, so by the Gelfand theorem we have [math]A=C(X)[/math]. The spectrum [math]X=Spec(A)[/math], consisting of the characters [math]\chi:C^*(\Gamma)\to\mathbb C[/math], can be then identified with the Pontrjagin dual [math]G=\widehat{\Gamma}[/math], and this gives the result.
(2) Regarding now the last assertion, we must prove here that we have:
But this is clear via the above identifications, for instance because the linear form [math]tr(g)=\delta_{g1}[/math], when viewed as a functional on [math]C(G)[/math], is left and right invariant.
Getting back now to our questions, we can now formulate a general modelling result for independence and freeness, providing us with large classes of examples, as follows:
We have the following results, valid for group algebras:
- [math]C^*(\Gamma),C^*(\Lambda)[/math] are independent inside [math]C^*(\Gamma\times\Lambda)[/math].
- [math]C^*(\Gamma),C^*(\Lambda)[/math] are free inside [math]C^*(\Gamma*\Lambda)[/math].
In order to prove these results, we have two possible methods:
(1) We can either use the general results in Theorem 9.10, along with the following two isomorphisms, which are both standard:
(2) Or, we can prove this directly, by using the fact that each algebra is spanned by the corresponding group elements. Indeed, this shows that it is enough to check the independence and freeness formulae on group elements, which is in turn trivial.
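As an informal illustration of method (2), not part of the text, one can verify the freeness formulae on group elements symbolically, in a small ad-hoc model of the group algebra [math]\mathbb C[\mathbb Z*\mathbb Z]=\mathbb C[F_2][/math]: elements are linear combinations of reduced words, and the trace picks out the coefficient of the unit. The elements below are arbitrary choices:

```python
from itertools import product

# Informal symbolic check of freeness in C[Z * Z] = C[F_2]: an element is a
# dict {reduced word: coefficient}, a word being a tuple of syllables
# (letter, exponent), and tr picks the coefficient of the empty word.
# This is a small ad-hoc model, not a library API.

def mul_words(w1, w2):
    # concatenate two reduced words, reducing at the junction
    w = list(w1)
    for s in w2:
        if w and w[-1][0] == s[0]:
            e = w[-1][1] + s[1]
            w.pop()
            if e != 0:
                w.append((s[0], e))
        else:
            w.append(s)
    return tuple(w)

def mul(x, y):
    z = {}
    for (w1, c1), (w2, c2) in product(x.items(), y.items()):
        w = mul_words(w1, w2)
        z[w] = z.get(w, 0) + c1 * c2
    return z

def tr(x):
    return x.get((), 0)                  # coefficient of the unit

def centered(x):
    y = dict(x)
    y[()] = y.get((), 0) - tr(x)
    return y

g = {(('a', 1),): 1}                     # generator of the first copy of Z
assert tr(mul(g, {(('a', -1),): 1})) == 1    # g is a unitary: tr(gg*) = 1

# arbitrary centered elements of C[<a>] and C[<b>] respectively
a1 = centered({(): 2, (('a', 1),): 1, (('a', -2),): 3})
a2 = centered({(('a', 3),): 1, (('a', -1),): -1})
b1 = centered({(): 1, (('b', 2),): 5})
b2 = centered({(('b', 1),): 1, (('b', -1),): 1})

# freeness: alternating products of centered elements have trace zero,
# since an alternating reduced word is never the unit
assert tr(mul(a1, b1)) == 0
assert tr(mul(mul(a1, b1), a2)) == 0
assert tr(mul(mul(mul(a1, b1), a2), b2)) == 0
```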
9c. Linearization
We have seen so far the foundations of free probability, in analogy with those of classical probability, taken with a functional analysis touch. The idea now is that with a bit of luck, the basic theory from the classical case, namely the Fourier transform, and then the CLT, should have free extensions. Let us begin our discussion with the following definition, from [2], coming from the theory developed in the above:
The real probability measures are subject to operations [math]*[/math] and [math]\boxplus[/math], called classical and free convolution, given by the formulae
The problem now is that of linearizing these operations [math]*[/math] and [math]\boxplus[/math]. In what regards [math]*[/math], we know from chapter 1 that this operation is linearized by the logarithm [math]\log F[/math] of the Fourier transform, which in the present setting, where [math]E=tr[/math], is given by:
In order to find a similar result for [math]\boxplus[/math], we need some efficient models for the pairs of free random variables [math](a,b)[/math]. This is a priori not a problem, because once we have [math]a\in A[/math] and [math]b\in B[/math], we can form the free product [math]A*B[/math], which contains [math]a,b[/math] as free variables.
However, the initial choice, that of the variables [math]a\in A[/math], [math]b\in B[/math] modeling some given laws [math]\mu,\nu\in\mathcal P(\mathbb R)[/math], matters a lot. Indeed, any kind of abstract choice here would lead us into an abstract algebra [math]A*B[/math], and so into the abstract combinatorics of the free convolution, that cannot be solved with bare hands, and that we want to avoid.
In short, we must be tricky, at least in what concerns the beginning of our computation. Following [2], the idea will be that of temporarily lifting the self-adjointness assumption on our variables [math]a,b[/math], and looking instead for random variables [math]\alpha,\beta[/math], not necessarily self-adjoint, modelling in integer moments our given laws [math]\mu,\nu\in\mathcal P(\mathbb R)[/math], as follows:
To be more precise, assuming that [math]\alpha,\beta[/math] are indeed not self-adjoint, the above formulae are not the general formulae for [math]\alpha,\beta[/math], simply because these latter formulae involve colored integers [math]k=\circ\bullet\bullet\circ\ldots[/math] as exponents. Thus, in the context of the above formulae, [math]\mu,\nu[/math] are not the distributions of [math]\alpha,\beta[/math], but just some “parts” of these distributions.
Now with this idea in mind, due to Voiculescu and quite tricky, the solution to the law modelling problem comes in a quite straightforward way, involving the good old Hilbert space [math]H=l^2(\mathbb N)[/math] and the good old shift operator [math]S\in B(H)[/math], as follows:
Consider the shift operator on the space [math]H=l^2(\mathbb N)[/math], given by [math]S(e_i)=e_{i+1}[/math]. The variables of the following type, with [math]f\in\mathbb C[X][/math] being a polynomial,
We have already met the shift [math]S[/math] in chapter 5, as the simplest example of an isometry which is not a unitary, [math]S^*S=1,SS^*\neq1[/math], with this coming from:
Consider now a variable as in the statement, namely:
The computation of the moments of [math]T[/math] is then as follows:
-- We first have [math]tr(T)=a_0[/math].
-- Then the computation of [math]tr(T^2)[/math] will involve [math]a_1[/math].
-- Then the computation of [math]tr(T^3)[/math] will involve [math]a_2[/math].
-- And so on.
Thus, we are led to a certain recurrence, that we will not attempt to solve now, with bare hands, but which definitely gives the conclusion in the statement.
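As an informal numerical check of the beginning of this recurrence, not part of the text: we can truncate the shift to a matrix, with the vector state at [math]e_0[/math] playing the role of the trace, moments of order below the truncation size being unaffected. The coefficients below are arbitrary choices, and the third-moment formula is obtained by counting lattice paths, as an assumption worked out along the lines of the proof:

```python
import numpy as np

# Moments of the model variable T = S* + f(S), with a truncated shift
# matrix; the trace is the vector state <. e_0, e_0>.
N = 12
S = np.zeros((N, N))
for i in range(N - 1):
    S[i + 1, i] = 1                     # S(e_i) = e_{i+1}

a0, a1, a2 = 0.5, 2.0, -1.0             # arbitrary illustrative coefficients
T = S.T + a0 * np.eye(N) + a1 * S + a2 * (S @ S)

def tr(m):
    return m[0, 0]                      # <m e_0, e_0>

assert np.isclose(tr(T), a0)                              # tr(T) = a_0
assert np.isclose(tr(T @ T), a0**2 + a1)                  # involves a_1
# path count for tr(T^3), hypothetical formula checked here numerically:
assert np.isclose(tr(T @ T @ T), a0**3 + 3*a0*a1 + a2)    # involves a_2
```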
Before getting further, with free products of such models, let us work out a very basic example, which is something fundamental, that we will need in what follows:
In the context of the above correspondence, the variable
In order to compute the law of the variable [math]T[/math] in the statement, we can use the moment method. The moments of this variable are as follows:
Now since [math]S[/math] shifts to the right on [math]\mathbb N[/math], and [math]S^*[/math] shifts to the left, with everything staying inside [math]\mathbb N[/math], we are left with counting the length [math]k[/math] paths on [math]\mathbb N[/math] starting and ending at 0. Since there are no such paths when [math]k=2r+1[/math] is odd, the odd moments vanish:
In the case where [math]k=2r[/math] is even, such paths on [math]\mathbb N[/math] are best represented as paths in the upper half-plane, starting at 0, and going at each step NE or SE, depending on whether the original path on [math]\mathbb N[/math] goes right or left, and finally ending at the point [math](k,0)[/math], back on the horizontal axis. With this picture we are led to the following formula for the number of such paths:
But this is exactly the recurrence formula for the Catalan numbers, and so:
Summarizing, the odd moments of [math]T[/math] vanish, and the even moments are the Catalan numbers. But these numbers being the moments of the Wigner semicircle law [math]\gamma_1[/math], as explained in chapter 3, we are led to the conclusion in the statement.
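Here is an informal numerical confirmation of this, not part of the text, again with a truncated shift matrix, the moments of order below the truncation size being unaffected:

```python
import numpy as np
from math import comb

# The even moments of T = S + S* are the Catalan numbers, the odd ones
# vanish: checked with a truncated shift and the vector state at e_0.
N = 16
S = np.zeros((N, N))
for i in range(N - 1):
    S[i + 1, i] = 1                     # S(e_i) = e_{i+1}
T = S + S.T

def catalan(r):
    return comb(2 * r, r) // (r + 1)

M = np.eye(N)
for k in range(1, 11):
    M = M @ T
    moment = M[0, 0]                    # tr(T^k) = <T^k e_0, e_0>
    if k % 2 == 1:
        assert moment == 0              # odd moments vanish
    else:
        assert moment == catalan(k // 2)   # even moments are Catalan
```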
Getting back now to our linearization program for [math]\boxplus[/math], the next step is that of taking a free product of the model found in Theorem 9.17 with itself. There are two approaches here, one being a bit abstract, and the other one being more concrete. We will explain in what follows both of them. The abstract approach, which is quite nice, making a link with our main modeling result so far, involving group algebras, is as follows:
We can talk about semigroup algebras [math]C^*(\Gamma)\subset B(l^2(\Gamma))[/math], exactly as we did for the group algebras, and at the level of examples:
- With [math]\Gamma=\mathbb N[/math] we recover the shift algebra [math]A= \lt S \gt [/math] on [math]H=l^2(\mathbb N)[/math].
- With [math]\Gamma=\mathbb N*\mathbb N[/math], we obtain the algebra [math]A= \lt S_1,S_2 \gt [/math] on [math]H=l^2(\mathbb N*\mathbb N)[/math].
We can talk indeed about semigroup algebras [math]C^*(\Gamma)\subset B(l^2(\Gamma))[/math], exactly as we did for the group algebras, the only difference coming from the fact that the semigroup elements [math]g\in\Gamma[/math] will now correspond to isometries, which are not necessarily unitaries. Now this construction in hand, both the assertions are clear, as follows:
(1) With [math]\Gamma=\mathbb N[/math] we recover indeed the shift algebra [math]A= \lt S \gt [/math] on the Hilbert space [math]H=l^2(\mathbb N)[/math], the shift [math]S[/math] itself being the isometry associated to the element [math]1\in\mathbb N[/math].
(2) With [math]\Gamma=\mathbb N*\mathbb N[/math] we recover the double shift algebra [math]A= \lt S_1,S_2 \gt [/math] on the Hilbert space [math]H=l^2(\mathbb N*\mathbb N)[/math], the two shifts [math]S_1,S_2[/math] themselves being the isometries associated to two copies of the element [math]1\in\mathbb N[/math], one for each of the two copies of [math]\mathbb N[/math] which are present.
In what follows we will rather use an equivalent, second approach to our problem, which is exactly the same thing, but formulated in a less abstract way, as follows:
We can talk about the algebra of creation operators
- With [math]H=\mathbb C[/math] we recover the shift algebra [math]A= \lt S \gt [/math] on [math]H=l^2(\mathbb N)[/math].
- With [math]H=\mathbb C^2[/math], we obtain the algebra [math]A= \lt S_1,S_2 \gt [/math] on [math]H=l^2(\mathbb N*\mathbb N)[/math].
We can talk indeed about the algebra [math]A(H)[/math] of creation operators on the free Fock space [math]F(H)[/math] associated to a real Hilbert space [math]H[/math], with the remark that, in terms of the abstract semigroup notions from Proposition 9.19, we have:
As for the assertions (1,2) in the statement, these are both clear, either directly, or by passing via (1,2) from Proposition 9.19, which were both clear as well.
The advantage with this latter model comes from the following result, from [2], which has a very simple formulation, without linear combinations or anything:
Given a real Hilbert space [math]H[/math], and two orthogonal vectors [math]x\perp y[/math], the corresponding creation operators [math]S_x[/math] and [math]S_y[/math] are free with respect to
In standard tensor product notation for the elements of the free Fock space [math]F(H)[/math], the formula of a creation operator associated to a vector [math]x\in H[/math] is as follows:
As for the formula of the adjoint of this creation operator, called annihilation operator associated to the vector [math]x\in H[/math], this is as follows:
We obtain from this the following formula, which holds for any two vectors [math]x,y\in H[/math]:
With these formulae in hand, the result follows by doing some elementary computations, in the spirit of those done for the group algebras, in the above.
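As an informal numerical illustration of this result, not part of the text: we can truncate the free Fock space [math]F(\mathbb C^2)[/math] at words of length [math]d[/math], the [math](\Omega,\Omega)[/math]-entries of products of degree at most [math]d[/math] being unaffected by the truncation, and check the freeness condition on centered elements:

```python
import numpy as np
from itertools import product

# Freeness of the creation operators S_x, S_y on the free Fock space
# F(C^2), truncated at word length d.
d = 6
words = [w for n in range(d + 1) for w in product((0, 1), repeat=n)]
index = {w: i for i, w in enumerate(words)}
dim = len(words)

def creation(letter):
    # S_letter maps a word w to letter·w, and kills the top level
    S = np.zeros((dim, dim))
    for w in words:
        if len(w) < d:
            S[index[(letter,) + w], index[w]] = 1
    return S

S1, S2 = creation(0), creation(1)

def tr(m):
    return m[0, 0]                      # the vacuum state <. Ω, Ω>

A = S1 + S1.T                           # self-adjoint element of <S1>
B = S2 + S2.T                           # self-adjoint element of <S2>
a = A @ A - tr(A @ A) * np.eye(dim)     # centered
b = B @ B - tr(B @ B) * np.eye(dim)     # centered

assert np.isclose(tr(a @ b), 0)         # alternating centered products...
assert np.isclose(tr(a @ b @ a), 0)     # ...have vanishing trace
# whereas the independence-type factorization fails in general:
assert not np.isclose(tr(A @ B @ A @ B), tr(A @ A) * tr(B @ B))
```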
With this technology in hand, let us go back to our linearization program for [math]\boxplus[/math]. We know from Theorem 9.17 how to model the individual distributions [math]\mu\in\mathcal P(\mathbb R)[/math], and by combining this with Theorem 9.10 and Proposition 9.21, we therefore know how to freely model pairs of distributions [math]\mu,\nu\in\mathcal P(\mathbb R)[/math], as required by the convolution problem. We are therefore left with doing the sum in the model, and then computing its distribution. And the point here is that, still following [2], we have:
Given two polynomials [math]f,g\in\mathbb C[X][/math], consider the variables
We have two assertions here, the idea being as follows:
(1) The freeness assertion comes from the general freeness result from Proposition 9.21, via the various identifications coming from the previous results.
(2) Regarding the second assertion, the idea is that this comes from a [math]45^\circ[/math] rotation trick. Let us write indeed the two variables in the statement as follows:
Now let us perform the following [math]45^\circ[/math] base change, on the real span of the vectors [math]s,t\in H[/math] producing our two shifts [math]S,T[/math], as follows:
The new shifts, associated to these vectors [math]r,u\in H[/math], are then given by:
By using now these two new shifts, which are free according to Proposition 9.21, we obtain the following equality of distributions:
To be more precise, here at the end we have used the freeness property of [math]R,U[/math] in order to cut [math]U[/math] from the computation, as it cannot bring anything, and then we did a basic rescaling at the very end. Thus, we are led to the conclusion in the statement.
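As an informal numerical check of this statement, not part of the text: with [math]X=S_1^*+f(S_1)[/math] and [math]Y=S_2^*+g(S_2)[/math] free on the Fock space [math]F(\mathbb C^2)[/math], the sum [math]X+Y[/math] must have the same moments as [math]S^*+f(S)+g(S)[/math] on [math]l^2(\mathbb N)[/math]. Degree-one polynomials [math]f(z)=a+bz[/math], [math]g(z)=c+ez[/math], with arbitrary coefficients, keep the truncations small:

```python
import numpy as np
from itertools import product

# Moment check of the linearization: X + Y on the truncated Fock space
# over C^2, versus S* + f(S) + g(S) on the truncated l^2(N).
a, b, c, e = 0.5, 2.0, -1.0, 1.5        # arbitrary illustrative coefficients

# truncated free Fock space over C^2: words over {0,1} of length <= d
d = 6
words = [w for n in range(d + 1) for w in product((0, 1), repeat=n)]
index = {w: i for i, w in enumerate(words)}
dim = len(words)

def creation(letter):
    S = np.zeros((dim, dim))
    for w in words:
        if len(w) < d:
            S[index[(letter,) + w], index[w]] = 1
    return S

S1, S2 = creation(0), creation(1)
X = S1.T + a * np.eye(dim) + b * S1     # S1* + f(S1)
Y = S2.T + c * np.eye(dim) + e * S2     # S2* + g(S2)

# truncated one-sided shift on l^2(N)
N = 16
S = np.zeros((N, N))
for i in range(N - 1):
    S[i + 1, i] = 1
Z = S.T + (a + c) * np.eye(N) + (b + e) * S     # S* + f(S) + g(S)

P, Q = np.eye(dim), np.eye(N)
for k in range(1, 6):
    P, Q = P @ (X + Y), Q @ Z
    assert np.isclose(P[0, 0], Q[0, 0])         # k-th moments agree
```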
As a conclusion, the operation [math]\mu\to f[/math] from Theorem 9.17 linearizes [math]\boxplus[/math]. In order to reach something concrete, we are left with a computation inside [math]C^*(\mathbb N)[/math], which is elementary, and whose conclusion is that [math]R_\mu=f[/math] can be recovered from [math]\mu[/math] via the Cauchy transform [math]G_\mu[/math]. The precise result, due to Voiculescu [2], is as follows:
Given a real probability measure [math]\mu[/math], define its [math]R[/math]-transform as follows:
This can be done by using the above results, in several steps, as follows:
(1) According to Theorem 9.22, the operation [math]\mu\to f[/math] from Theorem 9.17 linearizes the free convolution operation [math]\boxplus[/math]. We are therefore left with a computation inside [math]C^*(\mathbb N)[/math]. To be more precise, consider a variable as in Theorem 9.17:
In order to establish the result, we must prove that the [math]R[/math]-transform of [math]X[/math], constructed according to the procedure in the statement, is the function [math]f[/math] itself.
(2) In order to do so, we fix [math]|z| \lt 1[/math] in the complex plane, and we set:
The shift and its adjoint act then on this vector as follows:
It follows that the adjoint of our operator [math]X[/math] acts on this vector as follows:
Now observe that the above formula can be written as follows:
The point now is that when [math]|z|[/math] is small, the operator appearing on the right is invertible. Thus, we can rewrite the above formula as follows:
Now by applying the trace, we are led to the following formula:
(3) Let us now apply the procedure in the statement to the real probability measure [math]\mu[/math] modelled by [math]X[/math]. The Cauchy transform [math]G_\mu[/math] is then given by:
Now observe that, with the choice [math]\xi=z^{-1}+f(z)[/math] for our complex variable, the trace formula found in (2) above tells us that we have:
Thus, by definition of the [math]R[/math]-transform, we have the following formula:
But this finishes the proof, as explained before in step (1) above.
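Before moving on, here is a numerical sketch of the above machinery, in Python, not part of the algebraic proof. It uses the standard example stating that the free convolution square of the symmetric Bernoulli law [math](\delta_{-1}+\delta_1)/2[/math] is the arcsine law on [math](-2,2)[/math]; all function names are ours. We compute the Cauchy transforms, invert them numerically, and check that the [math]R[/math]-transform of the free convolution is twice that of the Bernoulli law, as linearization predicts:

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

# Cauchy transform G(xi) = int d mu(t) / (xi - t), for real xi to the
# right of the support of mu.
def G_bernoulli(xi):
    # mu = (delta_{-1} + delta_1) / 2
    return 0.5 / (xi - 1) + 0.5 / (xi + 1)

def G_arcsine(xi):
    # arcsine law on (-2, 2), computed via the substitution x = 2 sin(theta)
    val, _ = quad(lambda th: 1.0 / (np.pi * (xi - 2.0 * np.sin(th))),
                  -np.pi / 2, np.pi / 2)
    return val

def R(G, z, lo, hi=100.0):
    # the procedure in the statement: invert G, then subtract 1/z
    K = brentq(lambda xi: G(xi) - z, lo, hi)  # K = G^{-1}(z)
    return K - 1.0 / z

z = 0.2
R_bern = R(G_bernoulli, z, lo=1.01)
R_arc = R(G_arcsine, z, lo=2.01)
print(R_arc, 2 * R_bern)  # linearization: both ~ 0.385165
```

The inversion here is done by simple bracketing, using the fact that [math]G[/math] is decreasing on the real axis, to the right of the support.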
Summarizing, the situation in free probability is quite similar to the one in classical probability, the product spaces needed for the basic properties of the Fourier transform being replaced by something “noncommutative”, namely the free Fock space models. This is of course something quite surprising, and the credit for this remarkable discovery, which has drastically changed operator algebras, goes to Voiculescu's paper [2].
9d. Central limits
With the above linearization technology in hand, we can do many things. First, we have the following free analogue of the CLT, at variance 1, due to Voiculescu [2]:
Given self-adjoint variables [math]x_1,x_2,x_3,\ldots[/math] which are f.i.d., centered, with variance [math]1[/math], we have, with [math]n\to\infty[/math], in moments,
We follow the same idea as in the proof of the CLT, from chapter 1:
(1) The [math]R[/math]-transform of the variable in the statement on the left can be computed by using the linearization property from Theorem 9.23, and is given by:
(2) Regarding now the right term, our first claim here is that the Cauchy transform of the Wigner law [math]\gamma_1[/math] satisfies the following equation:
Indeed, we know from chapter 3 that the even moments of [math]\gamma_1[/math] are given by:
On the other hand, we also know from chapter 3 that the generating series of the Catalan numbers is given by the following formula:
By using this formula with [math]z=y^{-2}[/math], we obtain the following formula:
Now with [math]y=\xi+\xi^{-1}[/math], this formula becomes, as claimed above:
(3) We conclude from the formula found in (2) and from Theorem 9.23 that the [math]R[/math]-transform of the Wigner semicircle law [math]\gamma_1[/math] is given by the following formula:
Observe that this follows in fact as well from the following formula, coming from Proposition 9.18, and from the technical details of the [math]R[/math]-transform:
Thus, the laws in the statement have the same [math]R[/math]-transforms, so they are equal.
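As a quick sanity check of step (2) above, one can sum the moment series of [math]\gamma_1[/math] numerically, and compare it with the closed form of the Cauchy transform, including at points of the form [math]y=\xi+\xi^{-1}[/math]. A minimal Python sketch, with names of our own choosing:

```python
from math import comb, sqrt

def catalan(k):
    return comb(2 * k, k) // (k + 1)

def G_series(y, terms=80):
    # Cauchy transform of gamma_1 as a moment series: sum_k C_k / y^(2k+1)
    return sum(catalan(k) / y ** (2 * k + 1) for k in range(terms))

def G_closed(y):
    # closed form (y - sqrt(y^2 - 4)) / 2, coming from the Catalan series
    return (y - sqrt(y * y - 4)) / 2

print(G_series(3.0), G_closed(3.0))  # both ~ 0.381966

# with y = xi + 1/xi we get G(y) = xi, as in step (2)
xi = 0.25
print(G_series(xi + 1 / xi))  # ~ 0.25
```

The series converges geometrically for [math]y \gt 2[/math], at rate [math](4/y^2)^k[/math], so 80 terms are far more than enough here.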
Summarizing, we have proved the free CLT at [math]t=1[/math]. The passage to the general case, where [math]t \gt 0[/math] is arbitrary, is routine, and still following Voiculescu [2], we have:
Given self-adjoint variables [math]x_1,x_2,x_3,\ldots[/math] which are f.i.d., centered, with variance [math]t \gt 0[/math], we have, with [math]n\to\infty[/math], in moments,
We follow the above proof at [math]t=1[/math], by making changes where needed:
(1) The [math]R[/math]-transform of the variable in the statement on the left can be computed by using the linearization property from Theorem 9.23, and is given by:
(2) Regarding now the right term, our claim here is that we have:
Indeed, we know from chapter 5 that the even moments of [math]\gamma_t[/math] are given by:
On the other hand, we know from chapter 3 that we have the following formula:
By using this formula with [math]z=ty^{-2}[/math], we obtain the following formula:
Now with [math]y=t\xi+\xi^{-1}[/math], this formula becomes, as claimed above:
(3) We conclude from the formula found in (2) and from Theorem 9.23 that the [math]R[/math]-transform of the Wigner semicircle law [math]\gamma_t[/math] is given by the following formula:
Thus, the laws in the statement have the same [math]R[/math]-transforms, so they are equal.
Regarding the limiting measures [math]\gamma_t[/math], which we already met in the previous chapters, in relation with the Wigner matrices, one question left open was that of understanding how exactly [math]\gamma_t[/math] appears out of [math]\gamma_1[/math]. We can now answer this:
The Wigner semicircle laws have the property
This follows either from Theorem 9.25, or from Theorem 9.23, by using the fact that the [math]R[/math]-transform of [math]\gamma_t[/math], which is given by [math]R_{\gamma_t}(\xi)=t\xi[/math], is linear in [math]t[/math].
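This semigroup property can also be illustrated numerically, by using the asymptotic freeness of independent GUE matrices, from chapters 5-8. The Python sketch below is ours, with the normalization chosen so that the empirical eigenvalue distribution of gue(t) approximates [math]\gamma_t[/math]; it checks the second and fourth moments of a sum of independent GUE matrices against those of [math]\gamma_{s+t}[/math], namely [math]m_2=s+t[/math] and [math]m_4=2(s+t)^2[/math]:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500

def gue(t):
    # N x N GUE matrix, normalized so that its empirical eigenvalue
    # distribution approximates the Wigner law gamma_t
    X = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
    H = (X + X.conj().T) / 2
    return np.sqrt(t / N) * H

A, B = gue(1.0), gue(2.0)  # asymptotically free, ~ gamma_1 and gamma_2
M = A + B                  # should be ~ gamma_3 = gamma_{1+2}

m2 = np.trace(M @ M).real / N
m4 = np.trace(M @ M @ M @ M).real / N
print(m2, m4)  # ~ 3 and ~ 2 * 3^2 = 18
```

The agreement is only asymptotic, with finite-size corrections of order [math]1/N[/math], so the moments come out close to, but not exactly at, the predicted values.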
As a conclusion to what we have so far, we have:
The Gaussian laws [math]g_t[/math] and the Wigner laws [math]\gamma_t[/math], given by
- They appear via the CLT, and the free CLT.
- They form semigroups with respect to [math]*[/math] and [math]\boxplus[/math].
- Their transforms are [math]\log F_{g_t}(x)=-tx^2/2[/math], [math]R_{\gamma_t}(x)=tx[/math].
- Their moments are [math]M_k=\sum_{\pi\in D(k)}t^{|\pi|}[/math], with [math]D=P_2,NC_2[/math].
These are all results that we already know, the idea being as follows:
(1,2) These assertions follow from (3,4), via the general theory.
(3,4) These assertions follow by doing some combinatorics and calculus.
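Assertion (4) can be verified by brute force, at least for small [math]k[/math], by enumerating all pairings of [math]2k[/math] points and keeping the noncrossing ones; each pairing has [math]|\pi|=k[/math] blocks, so it contributes [math]t^k[/math]. A short Python sketch, with names of our own choosing:

```python
def pairings(points):
    # all pair partitions of a sorted list of points, of even length
    if not points:
        yield []
        return
    a, rest = points[0], points[1:]
    for i in range(len(rest)):
        for p in pairings(rest[:i] + rest[i + 1:]):
            yield [(a, rest[i])] + p

def crossing(p):
    # each pair (a, b) satisfies a < b, so this detects all crossings
    return any(a < c < b < d for (a, b) in p for (c, d) in p)

t, k = 0.5, 3
P2 = list(pairings(list(range(2 * k))))
NC2 = [p for p in P2 if not crossing(p)]

# M_{2k} = sum over pairings of t^{|pi|}, with |pi| = k blocks
print(len(P2) * t ** k)   # Gaussian moment (2k-1)!! t^k = 15 * 0.125
print(len(NC2) * t ** k)  # Wigner moment C_k t^k = 5 * 0.125
```

As expected, for [math]k=3[/math] the enumeration finds [math]15=5!![/math] pairings, of which [math]5=C_3[/math] are noncrossing.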
To summarize, our initial purpose for this chapter was to vaguely explore the basics of free probability, but all of a sudden, due to the power of Voiculescu's [math]R[/math]-transform [2], we are now into stating and proving results which are on par with what we have been doing in the first part of this book, namely reasonably advanced probability theory.
This is certainly quite encouraging, and we will keep developing free probability in the remainder of this book, with free analogues of almost everything that we have been doing in chapters 1-4, in relation with classical probability and its applications, and with some conceptual explanations and technical enhancements of what we have been doing in chapters 5-8, in relation with random matrices.
General references
Banica, Teo (2024). "Calculus and applications". arXiv:2401.00911 [math.CO].
References
- D.V. Voiculescu, Symmetries of some reduced free product [math]{\rm C}^*[/math]-algebras, in “Operator algebras and their connections with topology and ergodic theory”, Springer (1985), 556--588.
- D.V. Voiculescu, Addition of certain noncommuting random variables, J. Funct. Anal. 66 (1986), 323--346.
- D.V. Voiculescu, Multiplication of certain noncommuting random variables, J. Operator Theory 18 (1987), 223--235.