Spectral theorems
3a. Basic theory
We discuss in this chapter the diagonalization problem for the operators [math]T\in B(H)[/math], in analogy with the diagonalization problem for the usual matrices [math]A\in M_N(\mathbb C)[/math]. As a first observation, we can talk about eigenvalues and eigenvectors, as follows:
Given an operator [math]T\in B(H)[/math], assuming that we have [math]Tx=\lambda x[/math] for some nonzero vector [math]x\in H[/math], we say that [math]x[/math] is an eigenvector of [math]T[/math], with corresponding eigenvalue [math]\lambda\in\mathbb C[/math].
We know many things about eigenvalues and eigenvectors in the finite dimensional case. However, most of this will not extend to the infinite dimensional case, or at least not in a straightforward way, for a number of reasons:
- Most of basic linear algebra relies on the fact that [math]Tx=\lambda x[/math] is equivalent to [math](T-\lambda)x=0[/math], so that [math]\lambda[/math] is an eigenvalue when [math]T-\lambda[/math] is not invertible. In the infinite dimensional setting, however, [math]T-\lambda[/math] might be injective but not surjective, or vice versa, or bijective with [math](T-\lambda)^{-1}[/math] not bounded, and so on.
- Also, in linear algebra [math]T-\lambda[/math] is not invertible precisely when [math]\det(T-\lambda)=0[/math], and this leads to most of the advanced results about eigenvalues and eigenvectors. In infinite dimensions, however, it is impossible to construct a determinant function [math]\det:B(H)\to\mathbb C[/math], even for the diagonal operators on [math]l^2(\mathbb N)[/math].
Summarizing, we are in trouble with our extension program, right from the beginning. In order to get some theory started, however, let us forget about (2), which obviously leads nowhere, and focus on the difficulties in (1).
In order to cut short the discussion there, regarding the various properties of [math]T-\lambda[/math], we can just say that [math]T-\lambda[/math] is either invertible with bounded inverse, the “good case”, or not. We are led in this way to the following definition:
The spectrum of an operator [math]T\in B(H)[/math] is the set [math]\sigma(T)=\{\lambda\in\mathbb C\ |\ T-\lambda\notin B(H)^{-1}\}[/math], where [math]B(H)^{-1}\subset B(H)[/math] denotes the set of invertible operators.
As a basic example, in the finite dimensional case, [math]H=\mathbb C^N[/math], the spectrum of a usual matrix [math]A\in M_N(\mathbb C)[/math] is the collection of its eigenvalues, taken without multiplicities. We will see many other examples. In general, the spectrum has the following properties:
The spectrum of [math]T\in B(H)[/math] contains the eigenvalue set [math]\varepsilon(T)=\{\lambda\in\mathbb C\ |\ \ker(T-\lambda)\neq\{0\}\}[/math], and this inclusion [math]\varepsilon(T)\subset\sigma(T)[/math] is an equality in finite dimensions, but not in infinite dimensions.
We have several assertions here, the idea being as follows:
(1) First of all, the eigenvalue set is indeed the one in the statement, because [math]Tx=\lambda x[/math] for some [math]x\neq0[/math] tells us precisely that [math]T-\lambda[/math] is not injective. The fact that we have [math]\varepsilon(T)\subset\sigma(T)[/math] is clear as well, because if [math]T-\lambda[/math] is not injective, it is not bijective.
(2) In finite dimensions we have [math]\varepsilon(T)=\sigma(T)[/math], because [math]T-\lambda[/math] is injective if and only if it is bijective, with the boundedness of the inverse being automatic.
(3) In infinite dimensions we can take [math]H=l^2(\mathbb N)[/math], and the shift operator [math]S(e_i)=e_{i+1}[/math] is injective but not surjective. Thus [math]0\in\sigma(S)-\varepsilon(S)[/math].
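The phenomenon behind this counterexample can be checked numerically. Here is a small sketch, with an arbitrary truncation size, where the shift is cut down to a map [math]\mathbb C^N\to\mathbb C^{N+1}[/math], so that the isometry property [math]S^*S=1[/math] is exact:

```python
import numpy as np

N = 6
S = np.zeros((N + 1, N))
for i in range(N):
    S[i + 1, i] = 1.0          # S(e_i) = e_(i+1), truncated to the first N basis vectors

# S is isometric, hence injective: S*S = 1
assert np.allclose(S.T @ S, np.eye(N))

# but S is not surjective: e_0 is orthogonal to the range of S
e0 = np.zeros(N + 1)
e0[0] = 1.0
assert np.allclose(S.T @ e0, 0.0)
```

In the infinite dimensional setting the same computation gives [math]S^*S=1[/math], while [math]SS^*[/math] is the projection away from [math]e_0[/math], which is precisely the statement that [math]S[/math] is injective but not surjective.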
We will see more examples and counterexamples, and some general theory, in a moment. Philosophically speaking, the best way of thinking about all this is as follows:
-- The numbers [math]\lambda\notin\sigma(T)[/math] are good, because we can invert [math]T-\lambda[/math].
-- The numbers [math]\lambda\in\sigma(T)-\varepsilon(T)[/math] are bad.
-- The eigenvalues [math]\lambda\in\varepsilon(T)[/math] are evil.
Note that this is somewhat contrary to what happens in linear algebra, where the eigenvalues are highly valued, and cherished, and regarded as being the source of all good things on Earth. Welcome to operator theory, where some things are upside down.
Let us develop now some general theory for the spectrum, or perhaps for its complement, with the promise to come back to eigenvalues later. As a first result, we would like to prove that the spectra are non-empty. This is something tricky, and we will need:
The following happen:
- [math]||T|| \lt 1\implies(1-T)^{-1}=1+T+T^2+\ldots[/math]
- The set [math]B(H)^{-1}[/math] is open.
- The map [math]T\to T^{-1}[/math] is differentiable.
All these assertions are elementary, as follows:
(1) This follows as in the scalar case, the computation being as follows, provided that everything converges in norm, which amounts to saying that [math]||T|| \lt 1[/math]:
(2) Assuming [math]T\in B(H)^{-1}[/math], let us pick [math]S\in B(H)[/math] such that:
We have then the following estimate:
Thus we have [math]T^{-1}S\in B(H)^{-1}[/math], and so [math]S\in B(H)^{-1}[/math], as desired.
(3) In the scalar case, the derivative of [math]f(t)=t^{-1}[/math] is [math]f'(t)=-t^{-2}[/math]. In the present normed space setting the derivative is no longer a number, but rather a linear transformation, which can be found by developing [math]f(T)=T^{-1}[/math] at order 1, as follows:
Thus [math]f(T)=T^{-1}[/math] is indeed differentiable, with derivative [math]f'(T)S=-T^{-1}ST^{-1}[/math].
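As an illustration of (1), the Neumann series can be checked numerically, for a matrix rescaled to have norm below 1; the matrix, the rescaling to [math]||T||=1/2[/math], and the number of terms below are all arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.standard_normal((4, 4))
T *= 0.5 / np.linalg.norm(T, 2)        # rescale so that ||T|| = 1/2 < 1

# partial sums 1 + T + T^2 + ... approximate (1 - T)^(-1)
inv = np.zeros_like(T)
term = np.eye(4)
for _ in range(200):
    inv += term
    term = term @ T

assert np.allclose(inv, np.linalg.inv(np.eye(4) - T))
```

The geometric decay [math]||T^k||\leq||T||^k=2^{-k}[/math] is what makes the partial sums converge, exactly as in the scalar case.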
We can now formulate our first theorem about spectra, as follows:
The spectrum of a bounded operator [math]T\in B(H)[/math] is:
- Compact.
- Contained in the disc [math]D_0(||T||)[/math].
- Non-empty.
This can be proved by using Proposition 3.4, along with a bit of complex and functional analysis, for which we refer to Rudin [1] and Lax [2], as follows:
(1) In view of (2) below, it is enough to prove that [math]\sigma(T)[/math] is closed. But this follows from the following computation, with [math]|\varepsilon|[/math] being small:
(2) This follows from the following computation:
(3) Assume by contradiction [math]\sigma(T)=\emptyset[/math]. Given a linear form [math]f\in B(H)^*[/math], consider the following map, which is well-defined, due to our assumption [math]\sigma(T)=\emptyset[/math]:
By using the fact that [math]T\to T^{-1}[/math] is differentiable, that we know from Proposition 3.4, we conclude that this map is differentiable, and so holomorphic. Also, we have:
Thus by the Liouville theorem we obtain [math]\varphi=0[/math], and this for any linear form [math]f\in B(H)^*[/math]. But, in view of the definition of [math]\varphi[/math], by Hahn-Banach this gives [math](T-\lambda)^{-1}=0[/math], which is a contradiction, as desired.
Here is now a second basic result regarding the spectra, inspired from what happens in finite dimensions, for the usual complex matrices, and which shows that things do not necessarily extend without troubles to the infinite dimensional setting:
We have the following formula, valid for any operators [math]S,T[/math]: [math]\sigma(ST)\cup\{0\}=\sigma(TS)\cup\{0\}[/math]. In finite dimensions we have [math]\sigma(ST)=\sigma(TS)[/math], but this equality can fail in infinite dimensions.
There are several assertions here, the idea being as follows:
(1) This is something that we know in finite dimensions, coming from the fact that the characteristic polynomials of the associated matrices [math]A,B[/math] coincide:
Thus we obtain [math]\sigma(ST)=\sigma(TS)[/math] in this case, as claimed. Observe that this improves twice on the general formula in the statement: first because there are no issues at 0, and second because what we obtain is actually an equality of sets with multiplicities.
(2) In general now, let us first prove the main assertion, stating that [math]\sigma(ST),\sigma(TS)[/math] coincide outside 0. We first prove that we have the following implication:
Assume indeed that [math]1-ST[/math] is invertible, with inverse denoted [math]R[/math]:
We have then the following formulae, relating our variables [math]R,S,T[/math]:
By using [math]RST=R-1[/math], we have the following computation:
A similar computation, using [math]STR=R-1[/math], shows that we have:
Thus [math]1-TS[/math] is invertible, with inverse [math]1+TRS[/math], which proves our claim. Now by multiplying by scalars, we deduce from this that for any [math]\lambda\in\mathbb C-\{0\}[/math] we have:
But this leads to the conclusion in the statement.
(3) Regarding now the counterexample to the formula [math]\sigma(ST)=\sigma(TS)[/math], in general, let us take [math]S[/math] to be the shift on [math]H=l^2(\mathbb N)[/math], given by the following formula:
As for [math]T[/math], we can take it to be the adjoint of [math]S[/math], which is the following operator:
Let us compose now these two operators. In one sense, we have:
In the other sense, however, the situation is different, as follows:
Thus, the spectra do not match at [math]0[/math], and we have our counterexample, as desired.
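Both halves of this discussion can be illustrated numerically: for square matrices the characteristic polynomials of [math]AB[/math] and [math]BA[/math] coincide, while for the truncated shift the spectra of [math]S^*S[/math] and [math]SS^*[/math] differ precisely at 0. The matrices below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# square case: AB and BA have the same characteristic polynomial
assert np.allclose(np.poly(A @ B), np.poly(B @ A))

# truncated shift: S*S = 1, while SS* = 1 - (projection on e_0)
N = 5
S = np.eye(N + 1, N, k=-1)             # S(e_i) = e_(i+1)
spec_StS = set(np.round(np.linalg.eigvalsh(S.T @ S), 8))
spec_SSt = set(np.round(np.linalg.eigvalsh(S @ S.T), 8))
assert spec_StS == {1.0}
assert spec_SSt == {0.0, 1.0}
```

Comparing characteristic polynomial coefficients, rather than sorted eigenvalue lists, avoids any ambiguity in ordering nearly equal complex eigenvalues.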
3b. Spectral radius
Let us develop now some systematic theory for the computation of the spectra, based on what we know about the eigenvalues of the usual complex matrices. As a first result, which is well-known for the usual matrices, and extends well, we have:
We have the “polynomial functional calculus” formula [math]\sigma(P(T))=P(\sigma(T))[/math], valid for any polynomial [math]P\in\mathbb C[X][/math], and any operator [math]T\in B(H)[/math].
We pick a scalar [math]\lambda\in\mathbb C[/math], and we decompose the polynomial [math]P-\lambda[/math]:
We have then the following equivalences:
Thus, we are led to the formula in the statement.
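This formula can be checked numerically for matrices, where the spectrum is the set of eigenvalues; the polynomial [math]P(x)=x^3-2x+3[/math] below is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
ev = np.linalg.eigvals(A)

# P(x) = x^3 - 2x + 3, applied to the matrix and to its eigenvalues
PA = np.linalg.matrix_power(A, 3) - 2.0 * A + 3.0 * np.eye(4)

# sigma(P(A)) = P(sigma(A)), compared via characteristic polynomials
assert np.allclose(np.poly(PA), np.poly(ev ** 3 - 2.0 * ev + 3.0))
```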
The above result is something very useful, and generalizing it will be our next task. As a first ingredient here, assuming that [math]A\in M_N(\mathbb C)[/math] is invertible, we have [math]\sigma(A^{-1})=\sigma(A)^{-1}[/math].
It is possible to extend this formula to arbitrary operators, and we will do this in a moment. Before starting, however, we have to think in advance about how to unify this potential result with Theorem 3.7 itself.
What we have to do here is to find a class of functions generalizing both the polynomials [math]P\in\mathbb C[X][/math] and the inverse function [math]x\to x^{-1}[/math], and the answer to this question is provided by the rational functions, which are as follows:
A rational function [math]f\in\mathbb C(X)[/math] is a quotient of polynomials, [math]f=P/Q[/math] with [math]P,Q\in\mathbb C[X][/math]. When written in reduced form, the poles of [math]f[/math] are the zeros of [math]Q[/math].
Here the term “poles” comes from the fact that, if you want to imagine the graph of such a rational function [math]f[/math], in two complex dimensions, what you get is some sort of tent, supported by poles of infinite height, situated at the zeros of [math]Q[/math]. For more on all this, and on complex analysis in general, we refer as usual to Rudin [1], though a look at an abstract algebra book can be interesting as well.
Now that we have our class of functions, the next step consists in applying them to operators. Here we cannot expect [math]f(T)[/math] to make sense for any [math]f[/math] and any [math]T[/math], for instance because [math]T^{-1}[/math] is defined only when [math]T[/math] is invertible. We are led in this way to:
Given an operator [math]T\in B(H)[/math], and a rational function [math]f=P/Q[/math] having poles outside [math]\sigma(T)[/math], we can construct the operator [math]f(T)=P(T)Q(T)^{-1}[/math], denoted as usual [math]f(T)=P(T)/Q(T)[/math].
To be more precise, [math]f(T)[/math] is indeed well-defined, and the fraction notation is justified too. In more formal terms, we can say that we have a morphism of complex algebras as follows, with [math]\mathbb C(X)^T[/math] standing for the rational functions having poles outside [math]\sigma(T)[/math]:
Summarizing, we have now a good class of functions, generalizing both the polynomials and the inverse map [math]x\to x^{-1}[/math]. We can now extend Theorem 3.7, as follows:
We have the “rational functional calculus” formula [math]\sigma(f(T))=f(\sigma(T))[/math], valid for any rational function [math]f\in\mathbb C(X)[/math] having poles outside [math]\sigma(T)[/math].
We pick a scalar [math]\lambda\in\mathbb C[/math], we write [math]f=P/Q[/math], and we set [math]F=P-\lambda Q[/math].
By using now Theorem 3.7, for this polynomial, we obtain:
Thus, we are led to the formula in the statement.
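As an illustration, for an invertible matrix the rational calculus with [math]f(z)=1/z[/math] gives [math]\sigma(A^{-1})=\sigma(A)^{-1}[/math], which can be checked by comparing characteristic polynomials; the shift by 5 below is just a crude way of making the random matrix safely invertible:

```python
import numpy as np

rng = np.random.default_rng(11)
A = rng.standard_normal((4, 4)) + 5.0 * np.eye(4)   # shifted, so that A is safely invertible

ev = np.linalg.eigvals(A)

# sigma(A^(-1)) = sigma(A)^(-1), compared via characteristic polynomials
assert np.allclose(np.poly(np.linalg.inv(A)), np.poly(1.0 / ev))
```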
As an application of the above methods, we can investigate certain special classes of operators, such as the self-adjoint ones, and the unitary ones. Let us start with:
The following happen:
- We have [math]\sigma(T^*)=\overline{\sigma(T)}[/math], for any [math]T\in B(H)[/math].
- If [math]T=T^*[/math] then [math]X=\sigma(T)[/math] satisfies [math]X=\overline{X}[/math].
- If [math]U^*=U^{-1}[/math] then [math]X=\sigma(U)[/math] satisfies [math]X^{-1}=\overline{X}[/math].
We have several assertions here, the idea being as follows:
(1) The spectrum of the adjoint operator [math]T^*[/math] can be computed as follows:
(2) This is clear indeed from (1).
(3) For a unitary operator, [math]U^*=U^{-1}[/math], Theorem 3.10 and (1) give:
Thus, we are led to the conclusion in the statement.
In analogy with what happens for the usual matrices, we would like to improve now (2,3) above, with results stating that the spectrum [math]X=\sigma(T)[/math] satisfies [math]X\subset\mathbb R[/math] for self-adjoints, and [math]X\subset\mathbb T[/math] for unitaries. This will be tricky. Let us start with:
The spectrum of a unitary operator [math]U^*=U^{-1}[/math] is on the unit circle: [math]\sigma(U)\subset\mathbb T[/math].
Assuming [math]U^*=U^{-1}[/math], we have the following norm computation:
Now if we denote by [math]D[/math] the unit disk, we obtain from this:
On the other hand, once again by using [math]U^*=U^{-1}[/math], we have as well:
Thus, as before with [math]D[/math] being the unit disk in the complex plane, we have:
Now by using Theorem 3.10, we obtain [math]\sigma(U)\subset D\cap D^{-1}=\mathbb T[/math], as desired.
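Numerically, this can be observed on random unitary matrices, obtained here via a QR decomposition, which is a device of this sketch, not something used in the text:

```python
import numpy as np

rng = np.random.default_rng(3)
Z = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
U, _ = np.linalg.qr(Z)                                 # a random unitary matrix

assert np.allclose(U.conj().T @ U, np.eye(5))          # U* = U^(-1)
assert np.allclose(abs(np.linalg.eigvals(U)), 1.0)     # all eigenvalues on the unit circle
```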
We have as well a similar result for self-adjoints, as follows:
The spectrum of a self-adjoint operator [math]T=T^*[/math] is real: [math]\sigma(T)\subset\mathbb R[/math].
The idea is that we can deduce the result from Theorem 3.12, by using the following remarkable rational function, depending on a parameter [math]r\in\mathbb R[/math]:
Indeed, for [math]r \gt \gt 0[/math] the operator [math]f(T)[/math] is well-defined, and we have:
Thus [math]f(T)[/math] is unitary, and by using Theorem 3.12 we obtain:
Thus, we are led to the conclusion in the statement.
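This computation can be illustrated numerically. Assuming that the rational function in question is the Cayley-type transform [math]f(z)=(z+ir)/(z-ir)[/math], which is the standard choice, we can check that it indeed turns a self-adjoint matrix into a unitary one:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4))
T = (A + A.T) / 2                          # self-adjoint, so sigma(T) is real
r = 1.0                                    # any r != 0 keeps the poles ±ir outside sigma(T)
I = np.eye(4)

# f(T) = (T + ir)(T - ir)^(-1), with f(z) = (z + ir)/(z - ir)
fT = (T + 1j * r * I) @ np.linalg.inv(T - 1j * r * I)

assert np.allclose(fT.conj().T @ fT, I)                 # f(T) is unitary
assert np.allclose(abs(np.linalg.eigvals(fT)), 1.0)     # so its spectrum is on the circle
```

The key point, visible in the assertion, is that [math]f[/math] maps [math]\mathbb R[/math] to the unit circle, so it exchanges the self-adjoint and unitary worlds.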
As a theoretical remark, it is possible to deduce Theorem 3.12 from Theorem 3.13 as well, by performing the above computation in the other sense. Indeed, assuming that Theorem 3.13 holds, and starting with a unitary [math]U\in B(H)[/math], we obtain:
As a conclusion now, we have so far a beginning of spectral theory, with results allowing us to investigate the unitaries and the self-adjoints, and with the remark that these two classes of operators are related by a certain wizarding rational function, namely:
Let us keep building on this, with more complex analysis involved. One key thing that we know about matrices, and which follows for instance by using the fact that the diagonalizable matrices are dense, is the following formula:
We would like to have such formulae for the general operators [math]T\in B(H)[/math], but this is something quite technical. Consider the rational calculus morphism from Definition 3.9, which is as follows, with the exponent standing for “having poles outside [math]\sigma(T)[/math]”:
As mentioned before, the rational functions are holomorphic outside their poles, and this raises the question of extending this morphism, as follows:
Normally this can be done in several steps. Let us start with:
We can exponentiate any operator [math]T\in B(H)[/math], by setting [math]e^T=\sum_{k=0}^\infty\frac{T^k}{k!}[/math].
Similarly, we can define [math]f(T)[/math], for any entire function [math]f:\mathbb C\to\mathbb C[/math].
We must prove that the series defining [math]e^T[/math] converges, and this follows from:
The case of arbitrary entire functions [math]f:\mathbb C\to\mathbb C[/math] is similar, by using their power series expansions.
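As an illustration, the exponential series can be summed numerically, with the identity [math]e^Te^{-T}=1[/math], valid since [math]T[/math] and [math]-T[/math] commute, serving as a sanity check; the truncation at 60 terms is an arbitrary choice, justified by the factorial decay of the terms:

```python
import numpy as np

rng = np.random.default_rng(5)
T = 0.3 * rng.standard_normal((4, 4))

def exp_series(T, K=60):
    # e^T = sum_k T^k / k!, norm-convergent since ||T^k / k!|| <= ||T||^k / k!
    out, term = np.zeros_like(T), np.eye(len(T))
    for k in range(K):
        out += term
        term = term @ T / (k + 1)
    return out

E = exp_series(T)
assert np.allclose(E @ exp_series(-T), np.eye(4))   # e^T e^(-T) = 1, as T and -T commute
```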
In general, holomorphic functions are defined only on domains [math]D\subset\mathbb C[/math], and are not entire, so the above method does not cover the rational functions [math]f\in\mathbb C(X)^T[/math] that we want to generalize. Thus, we must use something else. And the answer here comes from the Cauchy formula:
Indeed, given a rational function [math]f\in\mathbb C(X)^T[/math], the operator [math]f(T)\in B(H)[/math], constructed in Definition 3.9, can be recaptured in an analytic way, as follows:
Now given an arbitrary function [math]f\in Hol(\sigma(T))[/math], we can define [math]f(T)\in B(H)[/math] by exactly the same formula, and we obtain in this way the desired correspondence:
This was for the plan. In practice now, all this needs a bit of care, with many verifications needed, and with the technical remark that a winding number must be added to the above Cauchy formulae, for things to be correct. The result is as follows:
We have the “holomorphic functional calculus” formula [math]\sigma(f(T))=f(\sigma(T))[/math], valid for any function [math]f[/math] which is holomorphic on a neighborhood of [math]\sigma(T)[/math].
This is something that we will not really need, for the purposes of the present book, which is more algebraic than analytic, but here is the general idea:
(1) As explained above, given a rational function [math]f\in\mathbb C(X)^T[/math], the corresponding operator [math]f(T)\in B(H)[/math] can be recaptured in an analytic way, as follows:
(2) Now given an arbitrary function [math]f\in Hol(\sigma(T))[/math], we can define [math]f(T)\in B(H)[/math] by exactly the same formula, and we obtain in this way the desired correspondence:
(3) In practice now, all this needs a bit of care, notably with the verification of the fact that the operator [math]f(T)\in B(H)[/math] does not depend on [math]\gamma[/math], and with the technical remark that a winding number must be added to the above Cauchy formulae, for things to be correct. But this can be done via a standard study, keeping in mind that in the case [math]H=\mathbb C[/math], where our operators are usual numbers, [math]B(H)=\mathbb C[/math], what we want to do is simply prove that the usual Cauchy formula holds.
(4) Now with this correspondence [math]f\to f(T)[/math] constructed, the formula in the statement, namely [math]\sigma(f(T))=f(\sigma(T))[/math], makes sense, and it remains to prove that it holds. But this follows as well via a careful use of the Cauchy formula, or by approximation by polynomials, or rational functions.
As already said, the above result is important for advanced operator theory and applications, and we will not get further into this subject. We will be back, however, to all this in the special case of the normal operators, which is of particular interest for us.
In order to formulate now our next result, we will need the following notion:
Given an operator [math]T\in B(H)[/math], its spectral radius is the number [math]\rho(T)=\sup_{\lambda\in\sigma(T)}|\lambda|\in[0,||T||][/math].
Here we have included for convenience two basic facts from Theorem 3.5, namely that the spectrum is non-empty, and contained in the disk [math]D_0(||T||)[/math], which provide us respectively with the inequalities [math]\rho(T)\geq0[/math], with the usual convention [math]\sup\emptyset=-\infty[/math], and [math]\rho(T)\leq||T||[/math]. Now with this notion in hand, we have the following key result, improving our main result so far, namely [math]\sigma(T)\neq\emptyset[/math] from Theorem 3.5:
The spectral radius of an operator [math]T\in B(H)[/math] is given by [math]\rho(T)=\lim_{n\to\infty}||T^n||^{1/n}[/math], with the sequence on the right being convergent.
We have several things to be proved, the idea being as follows:
(1) Our first claim is that the numbers [math]u_n=||T^n||^{1/n}[/math] satisfy:
Indeed, we have the following estimate, using the Young inequality [math]ab\leq a^p/p+b^q/q[/math], with exponents [math]p=(n+m)/n[/math] and [math]q=(n+m)/m[/math]:
(2) Our second claim is that the second assertion holds, namely:
For this purpose, we just need the inequality found in (1). Indeed, fix [math]m\geq1[/math], let [math]n\geq1[/math], and write [math]n=lm+r[/math] with [math]0\leq r\leq m-1[/math]. By using twice [math]u_{ab}\leq u_b[/math], we get:
It follows that we have [math]\lim\sup_nu_n\leq u_m[/math], which proves our claim.
(3) Summarizing, we are left with proving the main formula, which is as follows, and with the remark that we already know that the sequence on the right converges:
In one sense, we can use the polynomial calculus formula [math]\sigma(T^n)=\sigma(T)^n[/math]. Indeed, this gives the following estimate, valid for any [math]n[/math], as desired:
(4) For the reverse inequality, we fix a number [math]\rho \gt \rho(T)[/math], and we want to prove that we have [math]\rho\geq\lim_{n\to\infty}||T^n||^{1/n}[/math]. By using the Cauchy formula, we have:
By applying the norm we obtain from this formula:
Since the sup does not depend on [math]n[/math], by taking [math]n[/math]-th roots, we obtain in the limit:
Now recall that [math]\rho[/math] was by definition an arbitrary number satisfying [math]\rho \gt \rho(T)[/math]. Thus, we have obtained the following estimate, valid for any [math]T\in B(H)[/math]:
Thus, we are led to the conclusion in the statement.
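The spectral radius formula can be observed numerically: for a matrix, [math]\rho(T)[/math] is the largest modulus of an eigenvalue, and along a doubling sequence of exponents the numbers [math]u_n=||T^n||^{1/n}[/math] decrease towards it, by the inequality [math]u_{2n}\leq u_n[/math] implicit in the proof above. The matrix, the exponents, and the loose tolerance are all choices of this sketch:

```python
import numpy as np

rng = np.random.default_rng(6)
T = rng.standard_normal((5, 5))

rho = max(abs(np.linalg.eigvals(T)))               # spectral radius
u = [np.linalg.norm(np.linalg.matrix_power(T, n), 2) ** (1.0 / n)
     for n in (10, 20, 40, 80)]

# u_(2n) <= u_n, since ||T^(2n)|| <= ||T^n||^2, and u_n >= rho always
for a, b in zip(u, u[1:]):
    assert b <= a + 1e-9
assert u[-1] >= rho * 0.999
assert u[-1] <= rho * 1.5          # loose tolerance: the convergence can be slow
```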
In the case of the normal elements, we have the following finer result:
The spectral radius of a normal element, [math]TT^*=T^*T[/math], is given by [math]\rho(T)=||T||[/math].
We can proceed in two steps, as follows:
\underline{Step 1}. In the case [math]T=T^*[/math] we have [math]||T^n||=||T||^n[/math] for any exponent of the form [math]n=2^k[/math], by using the formula [math]||TT^*||=||T||^2[/math], and by taking [math]n[/math]-th roots we get:
Thus, we are done with the self-adjoint case, with the result [math]\rho(T)=||T||[/math].
\underline{Step 2}. In the general normal case [math]TT^*=T^*T[/math] we have [math]T^n(T^n)^*=(TT^*)^n[/math], and by using this, along with the result from Step 1, applied to [math]TT^*[/math], we obtain:
Thus, we are led to the conclusion in the statement.
As a first comment, the spectral radius formula [math]\rho(T)=||T||[/math] does not hold in general, the simplest counterexample being the following non-normal matrix:
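A standard such counterexample, quite possibly the one meant here, is the nilpotent Jordan block, whose spectral radius vanishes while its norm does not:

```python
import numpy as np

T = np.array([[0.0, 1.0],
              [0.0, 0.0]])                     # nilpotent, and not normal

rho = max(abs(np.linalg.eigvals(T)))           # spectral radius, here 0
norm = np.linalg.norm(T, 2)                    # operator norm, here 1

assert not np.allclose(T @ T.T, T.T @ T)       # T is indeed not normal
assert abs(rho) < 1e-12
assert abs(norm - 1.0) < 1e-12
```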
As another comment, we can combine the formula [math]\rho(T)=||T||[/math] for normal operators with the formula [math]||TT^*||=||T||^2[/math], and we are led to the following statement:
The norm of [math]B(H)[/math] is given by [math]||T||=\sqrt{\rho(TT^*)}[/math].
We have the following computation, using the formula [math]||TT^*||=||T||^2[/math], then the spectral radius formula for [math]TT^*[/math], and finally the definition of the spectral radius:
Thus, we are led to the conclusion in the statement.
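For matrices this is the familiar fact that the operator norm is the largest singular value; here is a quick numerical check, on an arbitrary random matrix:

```python
import numpy as np

rng = np.random.default_rng(7)
T = rng.standard_normal((5, 5))

lhs = np.linalg.norm(T, 2)                            # operator norm of T
rhs = np.sqrt(max(np.linalg.eigvalsh(T @ T.T)))       # sqrt of the spectral radius of TT*
assert np.isclose(lhs, rhs)
```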
The above result is quite interesting, philosophically speaking. We will be back to this, with further results and comments on [math]B(H)[/math], and other algebras of the same type.
3c. Normal operators
By using Theorem 3.18 we can say a number of non-trivial things concerning the normal operators, commonly known as “spectral theorem for normal operators”. As a first result here, we can improve the polynomial functional calculus formula:
Given [math]T\in B(H)[/math] normal, we have a morphism of algebras [math]\mathbb C[X]\to B(H)[/math], [math]P\to P(T)[/math], having the properties [math]\sigma(P(T))=P(\sigma(T))[/math] and [math]||P(T)||=\sup_{\lambda\in\sigma(T)}|P(\lambda)|[/math].
This is an improvement of Theorem 3.7 in the normal case, with the extra assertion being the norm estimate. Indeed, the element [math]P(T)[/math] being normal, we can apply to it the spectral radius formula for normal elements, and we obtain:
Thus, we are led to the conclusions in the statement.
We can improve as well the rational calculus formula, and the holomorphic calculus formula, in the same way. Importantly now, at a more advanced level, we have:
Given [math]T\in B(H)[/math] normal, we have a morphism of algebras [math]C(\sigma(T))\to B(H)[/math], [math]f\to f(T)[/math], which is isometric, and satisfies [math]\sigma(f(T))=f(\sigma(T))[/math].
The idea here is to “complete” the morphism in Theorem 3.20, namely:
Indeed, we know from Theorem 3.20 that this morphism is continuous, and is in fact isometric, when regarding the polynomials [math]P\in\mathbb C[X][/math] as functions on [math]\sigma(T)[/math]:
We conclude from this that we have a unique isometric extension, as follows:
It remains to prove [math]\sigma(f(T))=f(\sigma(T))[/math], and we can do this by double inclusion:
“[math]\subset[/math]” Given a continuous function [math]f\in C(\sigma(T))[/math], we must prove that we have:
For this purpose, consider the following function, which is well-defined:
We can therefore apply this function to [math]T[/math], and we obtain:
In particular [math]f(T)-\lambda[/math] is invertible, so [math]\lambda\notin\sigma(f(T))[/math], as desired.
“[math]\supset[/math]” Given a continuous function [math]f\in C(\sigma(T))[/math], we must prove that we have:
But this is the same as proving that we have:
For this purpose, we approximate our function by polynomials, [math]P_n\to f[/math], and we examine the following convergence, which follows from [math]P_n\to f[/math]:
We know from polynomial functional calculus that we have:
Thus, the operators [math]P_n(T)-P_n(\mu)[/math] are not invertible. On the other hand, we know that the set formed by the invertible operators is open, so its complement is closed. Thus the limit [math]f(T)-f(\mu)[/math] is not invertible either, and so [math]f(\mu)\in\sigma(f(T))[/math], as desired.
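For a self-adjoint matrix, the continuous calculus is simply computed by applying [math]f[/math] to the eigenvalues, via diagonalization. Here is a sketch with [math]f[/math] being the square root, on an arbitrarily chosen positive matrix, checking also the formula [math]\sigma(f(T))=f(\sigma(T))[/math]:

```python
import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((4, 4))
T = A @ A.T + np.eye(4)              # positive definite, so sigma(T) ⊂ (0, ∞)

# continuous calculus via diagonalization: f(T) = U f(D) U*
d, U = np.linalg.eigh(T)
sqrtT = U @ np.diag(np.sqrt(d)) @ U.T

assert np.allclose(sqrtT @ sqrtT, T)                    # with f(z) = sqrt(z), f(T)^2 = T
assert np.allclose(np.linalg.eigvalsh(sqrtT), np.sqrt(d))   # sigma(f(T)) = f(sigma(T))
```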
As an important comment, Theorem 3.21 is not exactly in final form, because it misses an important point, namely that our correspondence maps [math]\bar{z}\to T^*[/math].
However, this is something non-trivial, and we will be back to this later. Observe however that Theorem 3.21 is fully powerful for the self-adjoint operators, [math]T=T^*[/math], where the spectrum is real, and so where [math]z=\bar{z}[/math] on the spectrum. We will be back to this.
As a second result now, along the same lines, we can further extend Theorem 3.21 into a measurable functional calculus theorem, as follows:
Given [math]T\in B(H)[/math] normal, we have a morphism of algebras [math]L^\infty(\sigma(T))\to B(H)[/math], [math]f\to f(T)[/math], with [math]L^\infty[/math] standing for abstract measurable functions, or Borel functions, which extends the morphism in Theorem 3.21.
As before, the idea will be that of “completing” what we have. To be more precise, we can use the Riesz theorem and a polarization trick, as follows:
(1) Given a vector [math]x\in H[/math], consider the following functional:
By the Riesz theorem, this functional must be the integration with respect to a certain measure [math]\mu[/math] on the space [math]\sigma(T)[/math]. Thus, we have a formula as follows:
Now given an arbitrary Borel function [math]f\in L^\infty(\sigma(T))[/math], as in the statement, we can define a number [math] \lt f(T)x,x \gt \in\mathbb C[/math], by using exactly the same formula, namely:
Thus, we have managed to define numbers [math] \lt f(T)x,x \gt \in\mathbb C[/math], for all vectors [math]x\in H[/math], and in addition we can recover these numbers as follows, with [math]g_n\in C(\sigma(T))[/math]:
(2) In order to define now numbers [math] \lt f(T)x,y \gt \in\mathbb C[/math], for all vectors [math]x,y\in H[/math], we can use a polarization trick. Indeed, for any operator [math]S\in B(H)[/math] we have:
By replacing [math]y\to iy[/math], we have as well the following formula:
By multiplying this latter formula by [math]i[/math], we obtain the following formula:
Now by summing this latter formula with the first one, we obtain:
(3) But with this, we can now finish. Indeed, by combining (1,2), given a Borel function [math]f\in L^\infty(\sigma(T))[/math], we can define numbers [math] \lt f(T)x,y \gt \in\mathbb C[/math] for any [math]x,y\in H[/math], and it is routine to check, by using approximation by continuous functions [math]g_n\to f[/math] as in (1), that we obtain in this way an operator [math]f(T)\in B(H)[/math], having all the desired properties.
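The measure [math]\mu[/math] appearing in this proof can be made concrete for a self-adjoint matrix: it is the atomic measure putting mass [math]| \lt x,v_i \gt |^2[/math] at each eigenvalue [math]\lambda_i[/math], with [math]v_i[/math] being the eigenvectors. Here is a numerical sketch of the formula [math] \lt f(T)x,x \gt =\int f\,d\mu[/math], with [math]f(z)=z^2[/math] as an arbitrary test function:

```python
import numpy as np

rng = np.random.default_rng(12)
A = rng.standard_normal((4, 4))
T = (A + A.T) / 2                        # self-adjoint, so sigma(T) is real
d, V = np.linalg.eigh(T)

x = rng.standard_normal(4)
weights = (V.T @ x) ** 2                 # mass of the measure mu at each eigenvalue

f = lambda z: z ** 2                     # arbitrary test function
fT = V @ np.diag(f(d)) @ V.T             # f(T), via diagonalization

# <f(T)x,x> equals the integral of f against mu
assert np.isclose(x @ fT @ x, np.sum(weights * f(d)))
```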
The same comments as before apply. Theorem 3.22 is not exactly in final form, because it misses an important point, namely that our correspondence maps [math]\bar{z}\to T^*[/math].
However, this is something non-trivial, and we will be back to this later. Observe however that Theorem 3.22 is fully powerful for the self-adjoint operators, [math]T=T^*[/math], where the spectrum is real, and so where [math]z=\bar{z}[/math] on the spectrum. We will be back to this.
As another comment, the above result and its proof provide us with more than a Borel functional calculus, because what we got is a certain measure on the spectrum [math]\sigma(T)[/math], along with a functional calculus for the [math]L^\infty[/math] functions with respect to this measure. We will be back to this later, and for the moment we will only need Theorem 3.22 as formulated, with [math]L^\infty(\sigma(T))[/math] standing, a bit abusively, for the Borel functions on [math]\sigma(T)[/math].
3d. Diagonalization
We can now diagonalize the normal operators. We will do this in 3 steps, first for the self-adjoint operators, then for the families of commuting self-adjoint operators, and finally for the general normal operators, by using a trick of the following type:
The diagonalization in infinite dimensions is more tricky than in finite dimensions, and instead of writing a formula of type [math]T=UDU^*[/math], with [math]U,D\in B(H)[/math] being respectively unitary and diagonal, we will express our operator as [math]T=U^*MU[/math], with [math]U:H\to K[/math] being a certain unitary, and with [math]M\in B(K)[/math] being a certain diagonal operator.
This is indeed how the spectral theorem is best formulated, in view of applications. In practice, the explicit construction of [math]U,M[/math], which will be actually rather part of the proof, is also needed. For the self-adjoint operators, the statement and proof are as follows:
Any self-adjoint operator [math]T\in B(H)[/math] can be diagonalized, [math]T=U^*M_fU[/math], with [math]U:H\to L^2(X)[/math] being a unitary, and with [math]M_f[/math] being the multiplication by a certain real function [math]f\in L^\infty(X)[/math].
The construction of [math]U,f[/math] can be done in several steps, as follows:
(1) We first prove the result in the special case where our operator [math]T[/math] has a cyclic vector [math]x\in H[/math], with this meaning that the following holds:
For this purpose, let us go back to the proof of Theorem 3.22. We will use the following formula from there, with [math]\mu[/math] being the measure on [math]X=\sigma(T)[/math] associated to [math]x[/math]:
Our claim is that we can define a unitary [math]U:H\to L^2(X)[/math], first on the dense part spanned by the vectors [math]T^kx[/math], by the following formula, and then by continuity:
Indeed, the following computation shows that [math]U[/math] is well-defined, and isometric:
We can then extend [math]U[/math] by continuity into a unitary [math]U:H\to L^2(X)[/math], as claimed. Now observe that we have the following formula:
Thus our result is proved in the present case, with [math]U[/math] as above, and with [math]f(z)=z[/math].
(2) We discuss now the general case. Our first claim is that [math]H[/math] has a decomposition as follows, with each [math]H_i[/math] being invariant under [math]T[/math], and admitting a cyclic vector [math]x_i[/math]:
Indeed, this is something elementary, the construction being by recurrence in finite dimensions, in the obvious way, and by using the Zorn lemma in general. Now with this decomposition in hand, we can make a direct sum of the diagonalizations obtained in (1), for each of the restrictions [math]T_{|H_i}[/math], and we obtain the formula in the statement.
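For a self-adjoint matrix, this theorem reduces to the usual eigendecomposition, written in the [math]T=U^*MU[/math] form of the statement; here is a quick check, on an arbitrary symmetric matrix:

```python
import numpy as np

rng = np.random.default_rng(9)
A = rng.standard_normal((5, 5))
T = (A + A.T) / 2                    # self-adjoint

d, V = np.linalg.eigh(T)             # columns of V are orthonormal eigenvectors
U = V.conj().T                       # take U = V*, so that T = U* M U
M = np.diag(d)                       # the "multiplication" operator, here diagonal

assert np.allclose(U @ U.conj().T, np.eye(5))     # U is unitary
assert np.allclose(T, U.conj().T @ M @ U)         # T = U* M U
```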
We have the following technical generalization of the above result:
Any family of commuting self-adjoint operators [math]T_i\in B(H)[/math] can be jointly diagonalized, [math]T_i=U^*M_{f_i}U[/math], with [math]U:H\to L^2(X)[/math] being a single unitary, and with the [math]f_i\in L^\infty(X)[/math] being real functions.
This is similar to the proof of Theorem 3.23, by suitably modifying the measurable calculus formula, and the measure [math]\mu[/math] itself, as to have this formula working for all the operators [math]T_i[/math]. With this modification done, everything extends.
In order to discuss now the case of the arbitrary normal operators, we will need:
Any operator [math]T\in B(H)[/math] can be written as [math]T=Re(T)+iIm(T)[/math], with [math]Re(T),Im(T)\in B(H)[/math] being self-adjoint, and this decomposition is unique.
This is something elementary, the idea being as follows:
(1) As a first observation, in the case [math]H=\mathbb C[/math] our operators are usual complex numbers, and the formula in the statement corresponds to the following basic fact:
(2) In general now, we can use the same formulae for the real and imaginary part as in the complex number case, the decomposition formula being [math]T=\frac{T+T^*}{2}+i\cdot\frac{T-T^*}{2i}[/math].
To be more precise, both operators on the right are self-adjoint, and the summing formula indeed holds, so we have our decomposition result, as desired.
(3) Regarding now the uniqueness, by linearity it is enough to show that [math]R+iS=0[/math] with [math]R,S[/math] both self-adjoint implies [math]R=S=0[/math]. But this follows by applying the adjoint to [math]R+iS=0[/math], which gives [math]R-iS=0[/math], and so [math]R=S=0[/math], as desired.
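A quick numerical check of this decomposition, on an arbitrary complex matrix:

```python
import numpy as np

rng = np.random.default_rng(10)
T = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

ReT = (T + T.conj().T) / 2
ImT = (T - T.conj().T) / (2j)

assert np.allclose(ReT, ReT.conj().T)        # Re(T) is self-adjoint
assert np.allclose(ImT, ImT.conj().T)        # Im(T) is self-adjoint
assert np.allclose(T, ReT + 1j * ImT)        # T = Re(T) + i Im(T)
```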
As a comment here, the above result is just the “tip of the iceberg”, in what regards decomposition results for the operators [math]T\in B(H)[/math], in analogy with decomposition results for the complex numbers [math]z\in\mathbb C[/math]. As a sample result here, improving Proposition 3.25, we can write any operator [math]T\in B(H)[/math] as a linear combination of 4 positive operators, by writing both [math]Re(T),Im(T)[/math] as differences of positive operators. More on this later.
Good news: after all these preliminaries, which I hope you enjoyed as much as I did, we can finally discuss the case of arbitrary normal operators. We have here the following result, generalizing what we know from chapter 1 about the normal matrices:
Any normal operator [math]T\in B(H)[/math] can be diagonalized, [math]T=U^*M_fU[/math], with [math]U:H\to L^2(X)[/math] being a unitary, and with [math]M_f[/math] being the multiplication by a certain function [math]f\in L^\infty(X)[/math].
This is our main diagonalization theorem, the idea being as follows:
(1) Consider the decomposition of [math]T[/math] into its real and imaginary parts, as constructed in the proof of Proposition 3.25, namely:
We know that the real and imaginary parts are self-adjoint operators. Now since [math]T[/math] was assumed to be normal, [math]TT^*=T^*T[/math], these real and imaginary parts commute:
Thus Theorem 3.24 applies to these real and imaginary parts, and gives the result.
(2) Alternatively, we can use methods similar to those that we used in chapter 1, in order to deal with the usual normal matrices, involving the special relation between [math]T[/math] and the operator [math]TT^*[/math], which is self-adjoint. We will leave this as an instructive exercise.
This was for our series of diagonalization theorems. There is of course one more result here, regarding the families of commuting normal operators, as follows:
Any family of commuting normal operators [math]T_i\in B(H)[/math] can be jointly diagonalized, [math]T_i=U^*M_{f_i}U[/math], with [math]U:H\to L^2(X)[/math] being a single unitary, and with the [math]f_i\in L^\infty(X)[/math] being certain functions.
This is similar to the proof of Theorem 3.24 and Theorem 3.26, by combining the arguments there. To be more precise, this follows as Theorem 3.24, by using the decomposition trick from the proof of Theorem 3.26.
With the above diagonalization results in hand, we can now “fix” the continuous and measurable functional calculus theorems, with a key complement, as follows:
Given a normal operator [math]T\in B(H)[/math], the following hold, for both the functional calculus and the measurable calculus morphisms:
- These morphisms are [math]*[/math]-morphisms.
- The function [math]\bar{z}[/math] gets mapped to [math]T^*[/math].
- The functions [math]Re(z),Im(z)[/math] get mapped to [math]Re(T),Im(T)[/math].
- The function [math]|z|^2[/math] gets mapped to [math]TT^*=T^*T[/math].
- If [math]f[/math] is real, then [math]f(T)[/math] is self-adjoint.
These assertions are more or less equivalent, with (1) being the main one, which obviously implies everything else. And this assertion (1) follows from the diagonalization result for normal operators, from Theorem 3.26.
This was for the spectral theory of arbitrary and normal operators, or at least for the basics of this theory. As a conclusion here, our main results are as follows:
- Regarding the arbitrary operators, the main results here, or rather the most advanced results, are the holomorphic calculus formula from Theorem 3.15, and the spectral radius estimate from Theorem 3.17.
- For the self-adjoint operators, the main results are the spectral radius formula from Theorem 3.18, the measurable calculus formula from Theorem 3.22, and the diagonalization result from Theorem 3.23.
- For general normal operators, the main results are the spectral radius formula from Theorem 3.18, the measurable calculus formula from Theorem 3.22, complemented by Theorem 3.28, and the diagonalization result in Theorem 3.26.
There are of course many other things that can be said about the spectral theory of the bounded operators [math]T\in B(H)[/math], and on that of the unbounded operators too. As a complement, we recommend any good operator theory book, with the comment however that there is a bewildering choice here, depending on taste, and on what exactly you want to do with your operators [math]T\in B(H)[/math]. In what concerns us, who are rather into general quantum mechanics, but with our operators being bounded, good choices are the functional analysis book of Lax [2], or the operator algebra book of Blackadar [3].
General references
Banica, Teo (2024). "Principles of operator algebras". arXiv:2208.03600 [math.OA].