Compact operators
4a. Polar decomposition
We have seen so far the basic theory of bounded operators, in the arbitrary, normal and self-adjoint cases, and in a few other cases of interest. In this chapter we discuss a number of more specialized questions, for the most part dealing with the compact operators, which are particularly close, conceptually speaking, to the usual complex matrices.
We have in fact considerably many interesting things that we can talk about, in this final chapter on operator theory, and our choices will be as follows:
(1) Before anything, at the general level, we would like to understand the matrix and operator theory analogues of the various things that we know about the complex numbers [math]z\in M_1(\mathbb C)[/math], such as [math]z\bar{z}=|z|^2[/math], or [math]z=re^{it}[/math] and so on. We will discuss this first.
(2) Then, motivated by advanced linear algebra, we will go on a lengthy discussion on the algebra of compact operators [math]K(H)\subset B(H)[/math], which for many advanced operator theory purposes is the correct generalization of the matrix algebra [math]M_N(\mathbb C)[/math].
(3) Our discussion on the compact operators will feature as well some more specialized types of operators, [math]F(H)\subset B_1(H)\subset B_2(H)\subset K(H)[/math], with [math]F(H)[/math] being the finite rank ones, [math]B_1(H)[/math] being the trace class ones, and [math]B_2(H)[/math] being the Hilbert-Schmidt ones.
And that is pretty much it, all basic things, that must be known. Of course this will be just the tip of the iceberg, and more of an introduction to modern operator theory.
Getting started now, we would first like to systematically develop the theory of positive operators, and then establish polar decomposition results for the operators [math]T\in B(H)[/math]. We first have the following result, improving our knowledge from chapter 2:
For an operator [math]T\in B(H)[/math], the following are equivalent:
- [math] \lt Tx,x \gt \geq0[/math], for any [math]x\in H[/math].
- [math]T[/math] is normal, and [math]\sigma(T)\subset[0,\infty)[/math].
- [math]T=S^2[/math], for some [math]S\in B(H)[/math] satisfying [math]S=S^*[/math].
- [math]T=R^*R[/math], for some [math]R\in B(H)[/math].
If these conditions are satisfied, we call [math]T[/math] positive, and write [math]T\geq0[/math].
We have already seen some implications in chapter 2, but the best is to forget the few partial results that we know, and prove everything, as follows:
[math](1)\implies(2)[/math] Assuming [math] \lt Tx,x \gt \geq0[/math], with [math]S=T-T^*[/math] we have:
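[[math]] \lt Sx,x \gt = \lt Tx,x \gt - \lt T^*x,x \gt = \lt Tx,x \gt -\overline{ \lt Tx,x \gt }=0 [[/math]]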
The next step is to use a polarization trick, as follows:
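[[math]] 0= \lt S(x+y),x+y \gt = \lt Sx,y \gt + \lt Sy,x \gt = \lt Sx,y \gt -\overline{ \lt Sx,y \gt } [[/math]]

Here the terms [math] \lt Sx,x \gt [/math] and [math] \lt Sy,y \gt [/math] vanish by the above, and at the end we have used [math]S^*=-S[/math], which gives [math] \lt Sy,x \gt =-\overline{ \lt Sx,y \gt }[/math].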
Thus we must have [math] \lt Sx,y \gt \in\mathbb R[/math], and with [math]y\to iy[/math] we obtain [math] \lt Sx,y \gt \in i\mathbb R[/math] too, and so [math] \lt Sx,y \gt =0[/math]. Thus [math]S=0[/math], which gives [math]T=T^*[/math]. Now since [math]T[/math] is self-adjoint, it is normal as claimed. Moreover, by self-adjointness, we have:
In order to prove now that we have indeed [math]\sigma(T)\subset[0,\infty)[/math], as claimed, we must invert [math]T+\lambda[/math], for any [math]\lambda \gt 0[/math]. For this purpose, observe that we have:
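[[math]] \lt (T+\lambda)x,x \gt = \lt Tx,x \gt +\lambda||x||^2\geq\lambda||x||^2 [[/math]]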
But this shows that [math]T+\lambda[/math] is injective. In order to prove now the surjectivity, and the boundedness of the inverse, observe first that we have:
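[[math]] Im(T+\lambda)^\perp=\ker\left((T+\lambda)^*\right)=\ker(T+\lambda)=\{0\} [[/math]]

Here we have used the fact that [math]T+\lambda[/math] is self-adjoint, along with its injectivity, established above.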
Thus [math]Im(T+\lambda)[/math] is dense. On the other hand, observe that we have:
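[[math]] ||(T+\lambda)x||\cdot||x||\geq \lt (T+\lambda)x,x \gt \geq\lambda||x||^2 [[/math]]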
Thus for any vector in the image [math]y\in Im(T+\lambda)[/math] we have:
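[[math]] ||(T+\lambda)^{-1}y||\leq\frac{||y||}{\lambda} [[/math]]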
As a conclusion to what we have so far, [math]T+\lambda[/math] is bijective and invertible as a bounded operator from [math]H[/math] onto its image, with the following norm bound:
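[[math]] ||(T+\lambda)^{-1}||\leq\frac{1}{\lambda} [[/math]]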
But this shows that [math]Im(T+\lambda)[/math] is complete, hence closed, and since we already knew that [math]Im(T+\lambda)[/math] is dense, our operator [math]T+\lambda[/math] is surjective, and we are done.
[math](2)\implies(3)[/math] Since [math]T[/math] is normal, and with spectrum contained in [math][0,\infty)[/math], we can use the continuous functional calculus formula for the normal operators from chapter 3, with the function [math]f(x)=\sqrt{x}[/math], in order to construct a square root [math]S=\sqrt{T}[/math], which is self-adjoint, and satisfies [math]S^2=T[/math].
[math](3)\implies(4)[/math] This is trivial, because we can set [math]R=S[/math].
[math](4)\implies(1)[/math] This is clear, because we have the following computation:
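[[math]] \lt Tx,x \gt = \lt R^*Rx,x \gt = \lt Rx,Rx \gt =||Rx||^2\geq0 [[/math]]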
Thus, we have the equivalences in the statement.
In analogy with what happens in finite dimensions, where among the positive matrices [math]A\geq0[/math] we have the strictly positive ones, [math]A \gt 0[/math], given by the fact that the eigenvalues are strictly positive, we have as well a “strict” version of the above result, as follows:
For an operator [math]T\in B(H)[/math], the following are equivalent:
- [math]T[/math] is positive and invertible.
- [math]T[/math] is normal, and [math]\sigma(T)\subset(0,\infty)[/math].
- [math]T=S^2[/math], for some [math]S\in B(H)[/math] invertible, satisfying [math]S=S^*[/math].
- [math]T=R^*R[/math], for some [math]R\in B(H)[/math] invertible.
If these conditions are satisfied, we call [math]T[/math] strictly positive, and write [math]T \gt 0[/math].
Our claim is that the above conditions (1-4) are precisely the conditions (1-4) in Theorem 4.1, with the assumption “[math]T[/math] is invertible” added. Indeed:
(1) This is clear by definition.
(2) In the context of Theorem 4.1 (2), namely when [math]T[/math] is normal, and [math]\sigma(T)\subset[0,\infty)[/math], the invertibility of [math]T[/math], which means [math]0\notin\sigma(T)[/math], gives [math]\sigma(T)\subset(0,\infty)[/math], as desired.
(3) In the context of Theorem 4.1 (3), namely when [math]T=S^2[/math], with [math]S=S^*[/math], by using the basic properties of the functional calculus for normal operators, the invertibility of [math]T[/math] is equivalent to the invertibility of its square root [math]S=\sqrt{T}[/math], as desired.
(4) In the context of Theorem 4.1 (4), namely when [math]T=R^*R[/math], the invertibility of [math]T[/math] is equivalent to the invertibility of [math]R[/math]. This can be either checked directly, or deduced via the equivalence [math](3)\iff(4)[/math] from Theorem 4.1, by using the above argument (3).
As a subtlety now, we have the following complement to the above result:
For a strictly positive operator, [math]T \gt 0[/math], we have [math] \lt Tx,x \gt \gt 0[/math], for any [math]x\neq0[/math]. The converse holds in finite dimensions, but not in infinite dimensions.
We have several things to be proved, the idea being as follows:
(1) Regarding the main assertion, the inequality can be deduced as follows, by using the fact that the operator [math]S=\sqrt{T}[/math] is invertible, and in particular injective:
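[[math]] \lt Tx,x \gt = \lt S^2x,x \gt = \lt Sx,Sx \gt =||Sx||^2 \gt 0,\qquad\forall x\neq0 [[/math]]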
(2) In finite dimensions, assuming [math] \lt Tx,x \gt \gt 0[/math] for any [math]x\neq0[/math], we know from Theorem 4.1 that we have [math]T\geq0[/math]. Thus we have [math]\sigma(T)\subset[0,\infty)[/math], and assuming by contradiction [math]0\in\sigma(T)[/math], we obtain that [math]T[/math] has [math]\lambda=0[/math] as eigenvalue, and the corresponding eigenvector [math]x\neq0[/math] has the property [math] \lt Tx,x \gt =0[/math], contradiction. Thus [math]T \gt 0[/math], as claimed.
(3) Regarding now the counterexample, consider the following operator on [math]l^2(\mathbb N)[/math]:
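For instance, one can use here the following diagonal operator, written in terms of the standard orthonormal basis [math]\{e_n\}_{n\in\mathbb N}[/math] of [math]l^2(\mathbb N)[/math]:

[[math]] Te_n=\frac{e_n}{n+1} [[/math]]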
This operator [math]T[/math] is well-defined and bounded, and we have [math] \lt Tx,x \gt \gt 0[/math] for any [math]x\neq0[/math]. However [math]T[/math] is not invertible, and so the converse does not hold, as stated.
With this done, let us discuss now some decomposition results for the bounded operators [math]T\in B(H)[/math]. We know that any [math]z\in\mathbb C[/math] can be written as follows, with [math]a,b\in\mathbb R[/math]:
Also, we know that both the real and imaginary parts [math]a,b\in\mathbb R[/math], and more generally any real number [math]c\in\mathbb R[/math], can be written as follows, with [math]r,s\geq0[/math]:
Here are the operator theoretic generalizations of these results:
Given an operator [math]T\in B(H)[/math], the following happen:
- We can write [math]T=A+iB[/math], with [math]A,B\in B(H)[/math] being self-adjoint.
- When [math]T=T^*[/math], we can write [math]T=R-S[/math], with [math]R,S\in B(H)[/math] being positive.
- Thus, we can write any [math]T[/math] as a linear combination of [math]4[/math] positive elements.
All this follows from basic spectral theory, as follows:
(1) This is something that we have already met in chapter 3, when proving the spectral theorem in its general form, the decomposition formula being as follows:
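[[math]] T=\frac{T+T^*}{2}+i\cdot\frac{T-T^*}{2i} [[/math]]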
(2) This follows from the measurable functional calculus. Indeed, assuming [math]T=T^*[/math] we have [math]\sigma(T)\subset\mathbb R[/math], so we can use the following decomposition formula on [math]\mathbb R[/math]:
To be more precise, let us multiply by [math]z[/math], and rewrite this formula as follows:
Now by applying these measurable functions to [math]T[/math], we obtain a formula as follows, with both the operators [math]T_+,T_-\in B(H)[/math] being positive, as desired:
(3) This follows indeed by combining the results in (1) and (2) above.
Going ahead with our decomposition results, another basic thing that we know about complex numbers is that any [math]z\in\mathbb C[/math] appears as a real multiple of a unitary:
Finding the correct operator theoretic analogue of this is quite tricky, and this even for the usual matrices [math]A\in M_N(\mathbb C)[/math]. As a basic result here, we have:
Given an operator [math]T\in B(H)[/math], the following happen:
- When [math]T=T^*[/math] and [math]||T||\leq1[/math], we can write [math]T[/math] as an average of [math]2[/math] unitaries:
[[math]] T=\frac{U+V}{2} [[/math]]
- In the general [math]T=T^*[/math] case, we can write [math]T[/math] as a rescaled sum of unitaries:
[[math]] T=\lambda(U+V) [[/math]]
- Thus, in general, we can write [math]T[/math] as a rescaled sum of [math]4[/math] unitaries.
This follows from the results that we have, as follows:
(1) Assuming [math]T=T^*[/math] and [math]||T||\leq1[/math] we have [math]1-T^2\geq0[/math], and the decomposition that we are looking for is as follows, with both the components being unitaries:
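[[math]] T=\frac{1}{2}\left(T+i\sqrt{1-T^2}\right)+\frac{1}{2}\left(T-i\sqrt{1-T^2}\right) [[/math]]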
To be more precise, the square root can be extracted as in Theorem 4.1 (3), and the check of the unitarity of the components goes as follows:
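Indeed, with [math]U=T+i\sqrt{1-T^2}[/math] and [math]V=T-i\sqrt{1-T^2}[/math], the operators [math]T[/math] and [math]\sqrt{1-T^2}[/math] are self-adjoint and commute, so [math]U^*=V[/math], and we have:

[[math]] UV=VU=T^2+(1-T^2)=1 [[/math]]

Thus [math]U^*U=UU^*=1[/math], and similarly for [math]V[/math], so both components are indeed unitaries.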
(2) This simply follows by applying (1) to the operator [math]T/||T||[/math].
(3) Assuming first [math]||T||\leq1[/math], we know from Proposition 4.4 (1) that we can write [math]T=A+iB[/math], with [math]A,B[/math] being self-adjoint, and satisfying [math]||A||,||B||\leq1[/math]. Now by applying (1) to both [math]A[/math] and [math]B[/math], we obtain a decomposition of [math]T[/math] as follows:
In general, we can apply this to the operator [math]T/||T||[/math], and we obtain the result.
All this gets us into the multiplicative theory of the complex numbers, that we will attempt to generalize now. As a first construction, that we would like to generalize to the bounded operator setting, we have the construction of the modulus, as follows:
The point now is that we can indeed generalize this construction, as follows:
Given an operator [math]T\in B(H)[/math], we can construct a positive operator [math]|T|\in B(H)[/math] as follows, by using the fact that [math]T^*T[/math] is positive:
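[[math]] |T|=\sqrt{T^*T} [[/math]]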
We have several things to be proved, the idea being as follows:
(1) The first assertion follows from Theorem 4.1. Indeed, according to (4) there the operator [math]T^*T[/math] is indeed positive, and then according to (2) there we can extract the square root of this latter positive operator, by applying to it the function [math]\sqrt{z}[/math].
(2) By functional calculus we have then [math]|T|^2=T^*T[/math], as desired.
(3) In the case [math]H=\mathbb C[/math], we obtain indeed the absolute value of complex numbers.
(4) In the case where the space [math]H[/math] is finite dimensional, [math]H=\mathbb C^N[/math], we obtain indeed the usual moduli of the complex matrices [math]A\in M_N(\mathbb C)[/math].
As a comment here, it is possible to talk as well about [math]\sqrt{TT^*}[/math], which is in general different from [math]\sqrt{T^*T}[/math]. Note that when [math]T[/math] is normal, no issue, because we have:
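[[math]] T^*T=TT^*\implies\sqrt{T^*T}=\sqrt{TT^*} [[/math]]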
Regarding now the polar decomposition formula, let us start with a weak version of this statement, regarding the invertible operators, as follows:
We have the polar decomposition formula [math]T=U|T|[/math], valid for any invertible operator [math]T\in B(H)[/math], with [math]U\in B(H)[/math] being a unitary.
According to our definition of the modulus, [math]|T|=\sqrt{T^*T}[/math], we have:
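[[math]] ||\,|T|x\,||^2= \lt |T|x,|T|x \gt = \lt |T|^2x,x \gt = \lt T^*Tx,x \gt =||Tx||^2 [[/math]]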
Thus we can define a unitary operator [math]U\in B(H)[/math] by the following formula:
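[[math]] U=T|T|^{-1} [[/math]]

Indeed, [math]|T|=\sqrt{T^*T}[/math] is invertible because [math]T[/math] is, and [math]U[/math] is unitary because it is invertible, and satisfies [math]U^*U=|T|^{-1}T^*T|T|^{-1}=1[/math].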
But this formula shows that we have [math]T=U|T|[/math], as desired.
Observe that we have uniqueness in the above result, in what regards the choice of the unitary [math]U\in B(H)[/math], due to the fact that we can write this unitary as follows:
More generally now, we have the following result:
We have the polar decomposition formula [math]T=U|T|[/math], valid for any operator [math]T\in B(H)[/math], with [math]U\in B(H)[/math] being a partial isometry.
As before, we have the following equality, for any two vectors [math]x,y\in H[/math]:
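[[math]] \lt |T|x,|T|y \gt = \lt |T|^2x,y \gt = \lt T^*Tx,y \gt = \lt Tx,Ty \gt [[/math]]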
We conclude that the following linear application is well-defined, and isometric:
Now by continuity we can extend this isometry [math]U[/math] into an isometry between certain Hilbert subspaces of [math]H[/math], as follows:
Moreover, we can further extend [math]U[/math] into a partial isometry [math]U:H\to H[/math], by setting [math]Ux=0[/math], for any [math]x\in\overline{Im|T|}^\perp[/math], and with this convention, the result follows.
4b. Compact operators
We have seen so far the basic theory of the bounded operators, in the arbitrary, normal and self-adjoint cases, and in a few other cases of interest. We will keep building on this, with a number of more specialized results, regarding the finite rank operators and compact operators, and other special classes of related operators, namely the trace class operators, and the Hilbert-Schmidt operators. Let us start with a basic definition, as follows:
An operator [math]T\in B(H)[/math] is said to be of finite rank if its image
is finite dimensional. The set of such operators is denoted [math]F(H)[/math].
There are many interesting examples of finite rank operators, the most basic ones being the finite rank projections, on the finite dimensional subspaces [math]K\subset H[/math]. Observe also that in the case where [math]H[/math] is finite dimensional, any operator [math]T\in B(H)[/math] is automatically of finite rank. In general, this is of course wrong, but we have the following result:
The set of finite rank operators [math]F(H)\subset B(H)[/math] is a two-sided [math]*[/math]-ideal.
We have several assertions to be proved, the idea being as follows:
(1) It is clear from definitions that [math]F(H)[/math] is indeed a vector space, with this due to the following formulae, valid for any [math]S,T\in B(H)[/math], which are both clear:
(2) Let us prove now that [math]F(H)[/math] is stable under [math]*[/math]. Given [math]T\in F(H)[/math], we can regard it as an invertible operator between finite dimensional Hilbert spaces, as follows:
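[[math]] T:\ker(T)^\perp\to Im(T) [[/math]]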
We conclude from this that we have the following dimension equality:
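[[math]] \dim\left(\ker(T)^\perp\right)=\dim\left(Im(T)\right) [[/math]]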
Our claim now, in relation with our problem, is that we have equalities as follows:
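[[math]] \dim\left(Im(T^*)\right)=\dim\left(\overline{Im(T^*)}\right)=\dim\left(\ker(T)^\perp\right)=\dim\left(Im(T)\right) [[/math]]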
Indeed, the third equality is the one above, and the second equality is something that we know too, from chapter 2. Now by combining these two equalities we deduce that [math]Im(T^*)[/math] is finite dimensional, and so the first equality holds as well. Thus, our equalities are proved, and this shows that we have [math]T^*\in F(H)[/math], as desired.
(3) Finally, regarding the ideal property, this follows from the following two formulae, valid for any [math]S,T\in B(H)[/math], which are once again clear from definitions:
Thus, we are led to the conclusion in the statement.
Let us discuss now the compact operators, which will be the main topic of discussion, for the present chapter. These are best introduced as follows:
An operator [math]T\in B(H)[/math] is said to be compact if the closed set [math]\overline{T(B_1)}\subset H[/math] is compact, where [math]B_1\subset H[/math] denotes the unit ball. The set of such operators is denoted [math]K(H)[/math].
Equivalently, an operator [math]T\in B(H)[/math] is compact when for any sequence [math]\{x_n\}\subset B_1[/math], or more generally for any bounded sequence [math]\{x_n\}\subset H[/math], the sequence [math]\{T(x_n)\}[/math] has a convergent subsequence. We will see later some further criteria of compactness.
In finite dimensions any operator is compact. In general, as a first observation, any finite rank operator is compact. We have in fact the following result:
Any finite rank operator is compact, and the finite rank operators are in fact dense inside the compact operators: [math]\overline{F(H)}=K(H)[/math].
The first assertion is clear, because if [math]Im(T)[/math] is finite dimensional, then the following subset is closed and bounded, and so it is compact:
Regarding the second assertion, let us pick a compact operator [math]T\in K(H)[/math], and a number [math]\varepsilon \gt 0[/math]. By compactness of [math]T[/math] we can find a finite set [math]S\subset B_1[/math] such that:
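[[math]] T(B_1)\subset\bigcup_{x\in S}B_\varepsilon(Tx) [[/math]]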
Consider now the orthogonal projection [math]P[/math] onto the following finite dimensional space:
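[[math]] E=span\left\{Tx\ \Big|\ x\in S\right\} [[/math]]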
Since the set [math]S[/math] is finite, this space [math]E[/math] is finite dimensional, and so [math]P[/math] is of finite rank, [math]P\in F(H)[/math]. Now observe that for any norm one [math]y\in H[/math] and any [math]x\in S[/math] we have:
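[[math]] ||Ty-PTy||\leq||Ty-Tx|| [[/math]]

Indeed, this is because [math]PTy[/math] is the point of [math]E[/math] which is closest to [math]Ty[/math], and because [math]Tx\in E[/math].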
Now by picking [math]x\in S[/math] such that the ball [math]B_\varepsilon(Tx)[/math] covers the point [math]Ty[/math], we conclude from this that we have the following estimate:
Thus we have [math]||T-PT||\leq\varepsilon[/math], which gives the density result.
Quite remarkably, the set of compact operators is closed, and we have:
The set of compact operators [math]K(H)\subset B(H)[/math] is a closed two-sided [math]*[/math]-ideal.
We have several assertions here, the idea being as follows:
(1) It is clear from definitions that [math]K(H)[/math] is indeed a vector space, with this due to the following formulae, valid for any [math]S,T\in B(H)[/math], which are both clear:
(2) In order to prove now that [math]K(H)[/math] is closed, assume that a sequence [math]T_n\in K(H)[/math] converges to [math]T\in B(H)[/math]. Given [math]\varepsilon \gt 0[/math], let us pick [math]N\in\mathbb N[/math] such that:
By compactness of [math]T_N[/math] we can find a finite set [math]S\subset B_1[/math] such that:
We conclude that for any [math]y\in B_1[/math] there exists [math]x\in S[/math] such that:
Thus, we have an inclusion as follows, with [math]S\subset B_1[/math] being finite:
But this shows that our limiting operator [math]T[/math] is compact, as desired.
(3) Regarding the fact that [math]K(H)[/math] is stable under involution, this follows from Proposition 4.10, Proposition 4.12 and (2). Indeed, by using Proposition 4.12, given [math]T\in K(H)[/math] we can write it as a limit of finite rank operators, as follows:
Now by applying the adjoint, we obtain that we have as well:
We know from Proposition 4.10 that the operators [math]T_n^*[/math] are of finite rank, and so compact by Proposition 4.12, and by using (2) we obtain that [math]T^*[/math] is compact too, as desired.
(4) Finally, regarding the ideal property, this follows from the following two formulae, valid for any [math]S,T\in B(H)[/math], which are once again clear from definitions:
Thus, we are led to the conclusion in the statement.
Here is now a second key result regarding the compact operators:
A bounded operator [math]T\in B(H)[/math] is compact precisely when [math]Te_n\to0[/math], for any orthonormal system [math]\{e_n\}\subset H[/math].
We have two implications to be proved, the idea being as follows:
“[math]\implies[/math]” Assume that [math]T[/math] is compact. By contradiction, assume [math]Te_n\not\to0[/math]. This means that there exists [math]\varepsilon \gt 0[/math] and a subsequence satisfying [math]||Te_{n_k}|| \gt \varepsilon[/math], and by replacing [math]\{e_n\}[/math] with this subsequence, we can assume that the following holds, with [math]\varepsilon \gt 0[/math]:
Since [math]T[/math] was assumed to be compact, and the sequence [math]\{e_n\}[/math] is bounded, a certain subsequence [math]\{Te_{n_k}\}[/math] must converge. Thus, by replacing once again [math]\{e_n\}[/math] with a subsequence, we can assume that the following holds, with [math]x\neq0[/math]:
But this is a contradiction, because we obtain in this way:
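[[math]] \lt x,y \gt =\lim_{n\to\infty} \lt Te_n,y \gt =\lim_{n\to\infty} \lt e_n,T^*y \gt =0 [[/math]]

This holds for any [math]y\in H[/math], by Bessel's inequality applied to the vector [math]T^*y[/math], and so we obtain [math]x=0[/math], which contradicts [math]x\neq0[/math].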
Thus our assumption [math]Te_n\not\to0[/math] was wrong, and we obtain the result.
“[math]\Longleftarrow[/math]” Assume [math]Te_n\to0[/math], for any orthonormal system [math]\{e_n\}\subset H[/math]. In order to prove that [math]T[/math] is compact, we use the various results established above, which show that this is the same as proving that [math]T[/math] is in the closure of the space of finite rank operators:
We do this by contradiction. So, assume that the above is wrong, and so that there exists [math]\varepsilon \gt 0[/math] such that the following holds:
As a first observation, by using [math]S=0[/math] we obtain [math]||T|| \gt \varepsilon[/math]. Thus, we can find a norm one vector [math]e_1\in H[/math] such that the following holds:
Our claim, which will bring the desired contradiction, is that we can construct by recurrence vectors [math]e_1,\ldots,e_n[/math] such that the following holds, for any [math]i[/math]:
Indeed, assume that we have constructed such vectors [math]e_1,\ldots,e_n[/math]. Let [math]E\subset H[/math] be the linear space spanned by these vectors, and let us set:
Since the operator [math]TP[/math] has finite rank, our assumption above shows that we have:
Thus, we can find a vector [math]x\in H[/math] such that the following holds:
We have then [math]x\not\in E[/math], and so we can consider the following nonzero vector:
With this nonzero vector [math]y[/math] constructed in this way, let us now set:
This vector [math]e_{n+1}[/math] is then orthogonal to [math]E[/math], has norm one, and satisfies:
Thus we are done with our construction by recurrence, and this contradicts our assumption that [math]Te_n\to0[/math], for any orthonormal system [math]\{e_n\}\subset H[/math], as desired.
Summarizing, we have so far a number of results regarding the compact operators, in analogy with what we know about the usual complex matrices. Let us discuss now the spectral theory of the compact operators. We first have the following result:
Assuming that [math]T\in B(H)[/math], with [math]\dim H=\infty[/math], is compact and self-adjoint, the following happen:
- The eigenvalues of [math]T[/math] form a sequence [math]\lambda_n\to0[/math].
- All eigenvalues [math]\lambda_n\neq0[/math] have finite multiplicity.
We prove both the assertions at the same time. For this purpose, we fix a number [math]\varepsilon \gt 0[/math], we consider all the eigenvalues satisfying [math]|\lambda|\geq\varepsilon[/math], and for each such eigenvalue we consider the corresponding eigenspace [math]E_\lambda\subset H[/math]. Let us set:
Our claim, which will prove both (1) and (2), is that this space [math]E[/math] is finite dimensional. In order to prove this claim, we can proceed as follows:
(1) We know that we have [math]E\subset Im(T)[/math]. Our claim is that we have:
Indeed, assume that we have a sequence [math]g_n\in E[/math] which converges, [math]g_n\to g\in\bar{E}[/math]. Let us write [math]g_n=Tf_n[/math], with [math]f_n\in H[/math]. By definition of [math]E[/math], the following condition is satisfied:
Now since the sequence [math]\{g_n\}[/math] is Cauchy we obtain from this that the sequence [math]\{f_n\}[/math] is Cauchy as well, and with [math]f_n\to f[/math] we have [math]Tf_n\to Tf[/math], as desired.
(2) Consider now the projection [math]P\in B(H)[/math] onto the closure [math]\bar{E}[/math] of the above vector space [math]E[/math]. The composition [math]PT[/math] is then as follows, surjective on its target:
On the other hand, since [math]T[/math] is compact, so must be [math]PT[/math], and it follows from this that the space [math]\bar{E}[/math] is finite dimensional. Thus [math]E[/math] itself must be finite dimensional too, and as explained in the beginning of the proof, this gives (1) and (2), as desired.
In order to construct now eigenvalues, we will need:
If [math]T[/math] is compact and self-adjoint, one of the numbers [math]||T||,-||T||[/math] must be an eigenvalue of [math]T[/math].
We know from the spectral theory of the self-adjoint operators that the spectral radius [math]||T||[/math] of our operator [math]T[/math] is attained, and so one of the numbers [math]||T||,-||T||[/math] must be in the spectrum. In order to prove now that one of these numbers must actually appear as an eigenvalue, we must use the compactness of [math]T[/math], as follows:
(1) First, we can assume [math]||T||=1[/math]. By functional calculus this implies [math]||T^3||=1[/math] too, and so we can find a sequence of norm one vectors [math]x_n\in H[/math] such that:
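[[math]] \lt T^3x_n,x_n \gt \to\pm1 [[/math]]

Indeed, [math]T^3[/math] being self-adjoint, its norm [math]||T^3||=1[/math] is the supremum of the quantities [math]| \lt T^3x,x \gt |[/math], over norm one vectors [math]x\in H[/math].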
By using our assumption [math]T=T^*[/math], we can rewrite this formula as follows:
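[[math]] \lt T(Tx_n),Tx_n \gt \to\pm1 [[/math]]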
Now since [math]T[/math] is compact, and [math]\{x_n\}[/math] is bounded, we can assume, up to changing the sequence [math]\{x_n\}[/math] to one of its subsequences, that the sequence [math]Tx_n[/math] converges:
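[[math]] Tx_n\to y [[/math]]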
Thus, the convergence formula found above reformulates as follows, with [math]y\neq0[/math]:
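[[math]] \lt Ty,y \gt =\pm1 [[/math]]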
(2) Our claim now, which will finish the proof, is that this latter formula implies [math]Ty=\pm y[/math]. Indeed, by using Cauchy-Schwarz and [math]||T||=1[/math], we have:
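[[math]] 1=| \lt Ty,y \gt |\leq||Ty||\cdot||y||\leq||y||^2\leq1 [[/math]]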
We know that this must be an equality, so [math]Ty,y[/math] must be proportional. But since [math]T[/math] is self-adjoint the proportionality factor must be [math]\pm1[/math], and so we obtain, as claimed:
Thus, we have constructed an eigenvector for [math]\lambda=\pm1[/math], as desired.
We can further build on the above results in the following way:
If [math]T[/math] is compact and self-adjoint, there is an orthogonal basis of [math]H[/math] made of eigenvectors of [math]T[/math].
We use Proposition 4.15. According to the results there, we can arrange the nonzero eigenvalues of [math]T[/math], taken with multiplicities, into a sequence [math]\lambda_n\to0[/math]. Let [math]y_n\in H[/math] be the corresponding eigenvectors, and consider the following space:
The result follows then from the following observations:
(1) Since we have [math]T=T^*[/math], both [math]E[/math] and its orthogonal [math]E^\perp[/math] are invariant under [math]T[/math].
(2) On the space [math]E[/math], our operator [math]T[/math] is by definition diagonal.
(3) On the space [math]E^\perp[/math], our claim is that we have [math]T=0[/math]. Indeed, assuming that the restriction [math]S=T_{E^\perp}[/math] is nonzero, we can apply Proposition 4.16 to this restriction, and we obtain an eigenvalue for [math]S[/math], and so for [math]T[/math], contradicting the maximality of [math]E[/math].
With the above results in hand, we can now formulate a first spectral theory result for compact operators, which closes the discussion in the self-adjoint case:
Assuming that [math]T\in B(H)[/math], with [math]\dim H=\infty[/math], is compact and self-adjoint, the following happen:
- The spectrum [math]\sigma(T)\subset\mathbb R[/math] consists of a sequence [math]\lambda_n\to0[/math].
- All spectral values [math]\lambda\in\sigma(T)-\{0\}[/math] are eigenvalues.
- All eigenvalues [math]\lambda\in\sigma(T)-\{0\}[/math] have finite multiplicity.
- There is an orthogonal basis of [math]H[/math] made of eigenvectors of [math]T[/math].
This follows from the various results established above:
(1) In view of Proposition 4.15 (1), this will follow from (2) below.
(2) Assume that [math]\lambda\neq0[/math] belongs to the spectrum [math]\sigma(T)[/math], but is not an eigenvalue. By using Proposition 4.17, let us pick an orthonormal basis [math]\{e_n\}[/math] of [math]H[/math] consisting of eigenvectors of [math]T[/math], and then consider the following operator:
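[[math]] Sx=\sum_n\frac{1}{\lambda_n-\lambda} \lt x,e_n \gt e_n [[/math]]

Here [math]\lambda_n[/math] are the corresponding eigenvalues, [math]Te_n=\lambda_ne_n[/math], and this operator is indeed bounded, because the eigenvalues can only accumulate at [math]0[/math], and [math]\lambda\neq0[/math] is not among them, so the quantities [math]|\lambda_n-\lambda|[/math] are bounded from below.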
Then [math]S[/math] is an inverse for [math]T-\lambda[/math], and so we have [math]\lambda\notin\sigma(T)[/math], as desired.
(3) This is something that we know, from Proposition 4.15 (2).
(4) This is something that we know too, from Proposition 4.17.
Finally, we have the following result, regarding the general case:
The compact operators [math]T\in B(H)[/math], with [math]\dim H=\infty[/math], are the operators of the following form, with [math]\{e_n\}[/math], [math]\{f_n\}[/math] being orthonormal families, and with [math]\lambda_n\searrow0[/math]:
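[[math]] Tx=\sum_n\lambda_n \lt x,e_n \gt f_n [[/math]]

In addition, the polar decomposition of such an operator is given by [math]T=U|T|[/math], with [math]|T|x=\sum_n\lambda_n \lt x,e_n \gt e_n[/math], and with [math]U[/math] being given by [math]U(e_n)=f_n[/math].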
This basically follows from Theorem 4.8 and Theorem 4.18, as follows:
(1) Given two orthonormal families [math]\{e_n\}[/math], [math]\{f_n\}[/math], and a sequence of real numbers [math]\lambda_n\searrow0[/math], consider the linear operator given by the formula in the statement, namely:
Our first claim is that [math]T[/math] is bounded. Indeed, when assuming [math]|\lambda_n|\leq\varepsilon[/math] for any [math]n[/math], which is something that we can do if we want to prove that [math]T[/math] is bounded, we have:
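[[math]] ||Tx||^2=\sum_n\lambda_n^2| \lt x,e_n \gt |^2\leq\varepsilon^2\sum_n| \lt x,e_n \gt |^2\leq\varepsilon^2||x||^2 [[/math]]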
(2) The next observation is that this operator is indeed compact, because it appears as the norm limit, [math]T_N\to T[/math], of the following sequence of finite rank operators:
(3) Regarding now the polar decomposition assertion, for the above operator, this follows once again from definitions. Indeed, the adjoint is given by:
Thus, when composing [math]T^*[/math] with [math]T[/math], we obtain the following operator:
Now by extracting the square root, we obtain the formula in the statement, namely:
(4) Conversely now, assume that [math]T\in B(H)[/math] is compact. Then [math]T^*T[/math], which is self-adjoint, must be compact as well, and so by Theorem 4.18 we have a formula as follows, with [math]\{e_n\}[/math] being a certain orthonormal family, and with [math]\lambda_n\searrow0[/math]:
By extracting the square root we obtain the formula of [math]|T|[/math] in the statement, and then by setting [math]U(e_n)=f_n[/math] we obtain a second orthonormal family, [math]\{f_n\}[/math], such that:
Thus, our compact operator [math]T\in B(H)[/math] appears indeed as in the statement.
As a technical remark here, it is possible to slightly improve a part of the above statement. Consider indeed an operator of the following form, with [math]\{e_n\}[/math], [math]\{f_n\}[/math] being orthonormal families as before, and with [math]\lambda_n\to0[/math] being now complex numbers:
Then the same proof as before shows that [math]T[/math] is compact, and that the polar decomposition of [math]T[/math] is given by [math]T=U|T|[/math], with the modulus [math]|T|[/math] being as follows:
As for the partial isometry [math]U[/math], this is given by [math]Ue_n=w_nf_n[/math], and [math]U=0[/math] on the complement of [math]span(e_i)[/math], where [math]w_n\in\mathbb T[/math] are such that [math]\lambda_n=|\lambda_n|w_n[/math].
4c. Trace class operators
We have not talked so far about the trace of operators [math]T\in B(H)[/math], in analogy with the trace of the usual matrices [math]M\in M_N(\mathbb C)[/math]. This is because the trace can be finite or infinite, or even not well-defined, and we will discuss this now. Let us start with:
Given a positive operator [math]T\in B(H)[/math], the quantity [math]Tr(T)=\sum_n \lt Te_n,e_n \gt \in[0,\infty][/math] does not depend on the choice of the orthonormal basis [math]\{e_n\}\subset H[/math].
If [math]\{f_n\}[/math] is another orthonormal basis, we have:
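[[math]] \sum_n \lt Te_n,e_n \gt =\sum_n||\sqrt{T}e_n||^2=\sum_{nm}| \lt \sqrt{T}e_n,f_m \gt |^2 [[/math]]

Indeed, since [math]\sqrt{T}[/math] is self-adjoint, we have [math]| \lt \sqrt{T}e_n,f_m \gt |=| \lt e_n,\sqrt{T}f_m \gt |[/math], for any [math]n,m[/math].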
Since this quantity is symmetric in [math]e,f[/math], this gives the result.
We can now introduce the trace class operators, as follows:
An operator [math]T\in B(H)[/math] is said to be of trace class if:
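[[math]] Tr|T| \lt \infty [[/math]]

The set of such operators is denoted [math]B_1(H)[/math].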
In finite dimensions, any operator is of course of trace class. In arbitrary dimension, finite or not, we first have the following result, regarding such operators:
Any finite rank operator is of trace class, and any trace class operator is compact, so that we have embeddings as follows:
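[[math]] F(H)\subset B_1(H)\subset K(H) [[/math]]

Moreover, for a compact operator [math]T\in K(H)[/math], written as in Theorem 4.19, with singular values [math]\lambda_n\searrow0[/math], the trace of the modulus is given by [math]Tr|T|=\sum_n\lambda_n[/math].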
We have several assertions here, the idea being as follows:
(1) If [math]T[/math] is of finite rank, it is clearly of trace class.
(2) In order to prove now the second assertion, assume first that [math]T \gt 0[/math] is of trace class. For any orthonormal basis [math]\{e_n\}[/math] we have:
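[[math]] \sum_n||\sqrt{T}e_n||^2=\sum_n \lt Te_n,e_n \gt =Tr(T) \lt \infty [[/math]]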
But this shows that we have a convergence as follows:
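[[math]] \sqrt{T}e_n\to0 [[/math]]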
Thus the operator [math]\sqrt{T}[/math] is compact. Now since the compact operators form an ideal, it follows that [math]T=\sqrt{T}\cdot\sqrt{T}[/math] is compact as well, as desired.
(3) In order to prove now the second assertion in general, assume that [math]T\in B(H)[/math] is of trace class. Then [math]|T|[/math] is also of trace class, and so compact by (2), and since we have [math]T=U|T|[/math] by polar decomposition, it follows that [math]T[/math] is compact too.
(4) Finally, in order to prove the last assertion, assume that [math]T[/math] is compact. The singular value decomposition of [math]|T|[/math], from Theorem 4.19, is then as follows:
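[[math]] |T|x=\sum_n\lambda_n \lt x,e_n \gt e_n [[/math]]

Here [math]\{e_n\}[/math] is a certain orthonormal family, and [math]\lambda_n\searrow0[/math] are the singular values of [math]T[/math].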
But this gives the formula for [math]Tr|T|[/math] in the statement, and proves the last assertion.
Here is a useful reformulation of the above result, or rather of the above result coupled with Theorem 4.19, without reference to compact operators:
The trace class operators are precisely the operators of the form [math]Tx=\sum_n\lambda_n \lt x,e_n \gt f_n[/math], with [math]\{e_n\}[/math], [math]\{f_n\}[/math] being orthonormal families, and with the numbers [math]\lambda_n\geq0[/math] satisfying [math]\sum_n\lambda_n \lt \infty[/math].
This follows indeed from Proposition 4.22, or rather from step (4) in the proof of Proposition 4.22, coupled with Theorem 4.19.
Next, we have the following result, which comes as a continuation of Proposition 4.22, and is our central result here, regarding the trace class operators:
The space of trace class operators, which appears as an intermediate space between the finite rank operators and the compact operators, [math]F(H)\subset B_1(H)\subset K(H)[/math], is a two-sided [math]*[/math]-ideal of [math]B(H)[/math], and is a Banach space with respect to the norm [math]||T||_1=Tr|T|[/math], which satisfies [math]||T||\leq||T||_1[/math], as well as [math]||ST||_1\leq||S||\cdot||T||_1[/math], for any [math]S\in B(H)[/math].
There are several assertions here, the idea being as follows:
(1) In order to prove that [math]B_1(H)[/math] is a linear space, and that [math]||T||_1=Tr|T|[/math] is a norm on it, the only non-trivial point is that of proving the following inequality:
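[[math]] Tr|S+T|\leq Tr|S|+Tr|T| [[/math]]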
For this purpose, consider the polar decompositions of these operators:
Given an orthonormal basis [math]\{e_n\}[/math], we have the following formula:
The point now is that the first sum can be estimated as follows:
In order to estimate the terms on the right, we can proceed as follows:
The second sum in the above formula of [math]Tr|S+T|[/math] can be estimated in the same way, and in the end we obtain, as desired:
(2) The estimate [math]||T||\leq||T||_1[/math] can be established as follows:
(3) The fact that [math]B_1(H)[/math] is indeed a Banach space follows by constructing a limit for any Cauchy sequence, by using the singular value decomposition.
(4) The fact that [math]B_1(H)[/math] is indeed closed under the involution follows from:
(5) In order to prove now the ideal property of [math]B_1(H)[/math], we use the standard fact, that we know from Proposition 4.5, that any bounded operator [math]T\in B(H)[/math] can be written as a linear combination of 4 unitary operators, as follows:
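[[math]] T=\lambda(U_1+U_2+U_3+U_4) [[/math]]

Here [math]\lambda[/math] is a scalar, and [math]U_1,U_2,U_3,U_4[/math] are unitaries.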
Indeed, by taking the real and imaginary part we can first write [math]T[/math] as a linear combination of 2 self-adjoint operators, and then by functional calculus each of these 2 self-adjoint operators can be written as a linear combination of 2 unitary operators.
(6) With this trick in hand, we can now prove the ideal property of [math]B_1(H)[/math]. Indeed, it is enough to prove that we have:
But this latter result follows by using the polar decomposition theorem.
(7) With a bit more care, we obtain from this the estimate [math]||ST||_1\leq||S||\cdot||T||_1[/math] from the statement. As for the last assertion, this is clear as well.
This was for the basic theory of the trace class operators. Much more can be said, and we refer here to the literature, such as Lax [1]. In what concerns us, we will be back to these operators later in this book, in Part III, when discussing operator algebras.
4d. Hilbert-Schmidt operators
As a last topic of this chapter, let us discuss yet another important class of operators, namely the Hilbert-Schmidt ones. These operators, which we will need on several key occasions in what follows, when talking about operator algebras, are introduced as follows:
An operator [math]T\in B(H)[/math] is said to be Hilbert-Schmidt if:
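[[math]] Tr(T^*T) \lt \infty [[/math]]

The set of such operators is denoted [math]B_2(H)[/math].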
As before with other sets of operators, in finite dimensions we obtain in this way all the operators. In general, we have the following result, regarding such operators:
The space [math]B_2(H)[/math] of Hilbert-Schmidt operators, which appears as an intermediate space between the trace class operators and the compact operators, [math]B_1(H)\subset B_2(H)\subset K(H)[/math], is a two-sided [math]*[/math]-ideal of [math]K(H)[/math], consisting of the compact operators whose singular values satisfy [math]\sum_n\lambda_n^2 \lt \infty[/math]. Moreover, the product of two Hilbert-Schmidt operators is of trace class, and [math]B_2(H)[/math] is a Hilbert space, with scalar product [math] \lt S,T \gt =Tr(ST^*)[/math].
All this is quite standard, from the results that we have already, and more specifically from the singular value decomposition theorem, and its applications. To be more precise, the proof of the various assertions goes as follows:
(1) First of all, the fact that the space of Hilbert-Schmidt operators [math]B_2(H)[/math] is stable under taking sums, and so is a vector space, follows from:
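[[math]] (S+T)^*(S+T)\leq(S+T)^*(S+T)+(S-T)^*(S-T)=2(S^*S+T^*T) [[/math]]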
Regarding now multiplicative properties, we can use here the following inequality:
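[[math]] (ST)^*(ST)=T^*S^*ST\leq||S||^2\cdot T^*T [[/math]]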
Thus, the space [math]B_2(H)[/math] is a two-sided [math]*[/math]-ideal of [math]K(H)[/math], as claimed.
(2) In order to prove now that the product of any two Hilbert-Schmidt operators is a trace class operator, we can use the following formula, which is elementary:
Conversely, given an arbitrary trace class operator [math]T\in B_1(H)[/math], we have:
Thus, by using the polar decomposition [math]T=U|T|[/math], we obtain the following decomposition for [math]T[/math], with both components being Hilbert-Schmidt operators:
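[[math]] T=\left(U|T|^{1/2}\right)\cdot|T|^{1/2} [[/math]]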
(3) The condition for the singular values is clear.
(4) The fact that we have a scalar product is clear as well.
(5) The proof of the completeness property is routine as well.
We have as well the following key result, regarding the Hilbert-Schmidt operators:
We have the following formula,
We can prove this in two steps, as follows:
(1) Assume first that [math]|S|[/math] is trace class. Consider the polar decomposition [math]S=U|S|[/math], and choose an orthonormal basis [math]\{x_i\}[/math] for the image of [math]U[/math], suitably extended to an orthonormal basis of [math]H[/math]. We have then the following computation, as desired:
(2) Assume now that we are in the general case, where [math]S[/math] is only assumed to be Hilbert-Schmidt. For any finite rank operator [math]S'[/math] we have then:
Thus by choosing [math]S'[/math] with [math]||S-S'||_2\to0[/math], we obtain the result.
This was for the basic theory of bounded operators on a Hilbert space, [math]T\in B(H)[/math]. In the remainder of this book we will be rather interested in the operator algebras [math]A\subset B(H)[/math] that these operators can form. This is of course related to operator theory, because we can, at least in theory, take [math]A= \lt T \gt [/math], and then study [math]T[/math] via the properties of [math]A[/math]. Actually, this is something that we already did a few times, when doing spectral theory, and notably when talking about functional calculus for normal operators.
For further operator theory, however, nothing beats a good operator theory book, and various ad-hoc methods, depending on the type of operators involved, and especially, on what you want to do with them. As before, in relation with topics to be later discussed in this book, we recommend here the books of Lax [1] and Blackadar [2].
Let us mention as well that there is a lot of interesting theory regarding the unbounded operators [math]T\in\mathcal L(H)[/math] too, which is something quite technical, and here once again, we warmly recommend a good operator theory book. In addition, we recommend as well a good PDE book, because most of the questions involving unbounded operators have PDE formulations as well, which are extremely efficient.
General references
Banica, Teo (2024). "Principles of operator algebras". arXiv:2208.03600 [math.OA].