Infinite dimensions

[math] \newcommand{\mathds}{\mathbb}[/math]

8a. Hilbert spaces

We have seen so far the basics of linear algebra, concerning the relation between linear maps and matrices, the determinant, the diagonalization procedure, along with some applications. In this chapter we discuss what happens in infinite dimensions. Among our motivations is the fact that spaces of infinite dimensions are of great importance in various branches of theoretical physics, such as quantum mechanics.


To be more precise, among the main discoveries of the 1920s, due to Heisenberg, Schrödinger and others, was the fact that small particles like electrons cannot really be described by their position vectors [math]v\in\mathbb R^3[/math], and that we must instead use their so-called wave functions [math]\psi:\mathbb R^3\to\mathbb C[/math]. Thus, the natural space for quantum mechanics, or at least for the quantum mechanics of the 1920s, is not our usual [math]V=\mathbb R^3[/math], but rather the infinite dimensional space [math]H=L^2(\mathbb R^3)[/math] of such wave functions [math]\psi[/math]. And more recent versions of quantum mechanics are built on the same idea, namely infinite dimensional spaces.


For an introduction to all this, have a look at some good old books on quantum mechanics, such as Dirac [1], von Neumann [2] or Weyl [3], and also at Kumar [4] for the story. And for more recent aspects, including all sorts of particles, and the whole mess coming with them, both mathematical and physical, go with Griffiths [5].


Getting started now, we would like to do linear algebra over infinite dimensional spaces. However, this is not very interesting as such, due to a number of technical reasons, the idea being that the infinite dimensionality prevents us from doing many basic things, to the point that we cannot even get things started. So, the idea will be that of using infinite dimensional vector spaces with some extra structure, as follows:

Definition

A scalar product on a complex vector space [math]H[/math] is an operation

[[math]] H\times H\to\mathbb C [[/math]]
denoted [math](x,y)\to \lt x,y \gt [/math], satisfying the following conditions:

  • [math] \lt x,y \gt [/math] is linear in [math]x[/math], and antilinear in [math]y[/math].
  • [math]\overline{ \lt x,y \gt }= \lt y,x \gt [/math], for any [math]x,y[/math].
  • [math] \lt x,x \gt \gt 0[/math], for any [math]x\neq0[/math].

As a basic example here, we have the finite dimensional vector space [math]H=\mathbb C^N[/math], with its usual scalar product, which is as follows:

[[math]] \lt x,y \gt =\sum_ix_i\bar{y}_i [[/math]]

There are many other examples, and notably various spaces of [math]L^2[/math] functions, which naturally appear in problems coming from physics. We will discuss them later on.
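Before proceeding, here is a quick numerical sanity check of the above axioms, for the space [math]H=\mathbb C^2[/math]. This is just an illustration in Python, with numpy, and not part of the formal development; the helper `sp` is our own shorthand, with the argument swap accounting for the fact that numpy's `vdot` conjugates its first argument, while our scalar product is antilinear in the second:

```python
import numpy as np

# Our convention: <x,y> is linear in x, antilinear in y.
# numpy's vdot conjugates its FIRST argument, so we swap arguments.
def sp(x, y):
    return np.vdot(y, x)  # = sum_i x_i * conj(y_i)

x = np.array([1+2j, 3-1j])
y = np.array([2-1j, 0+1j])

# The three axioms, checked on these sample vectors:
assert np.isclose(sp(2*x, y), 2*sp(x, y))              # linear in x
assert np.isclose(sp(x, 2j*y), -2j*sp(x, y))           # antilinear in y
assert np.isclose(np.conj(sp(x, y)), sp(y, x))         # conjugate symmetry
assert sp(x, x).real > 0 and np.isclose(sp(x, x).imag, 0)  # positivity
```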


In order to study the scalar products, let us formulate the following definition:

Definition

The norm of a vector [math]x\in H[/math] is the following quantity:

[[math]] ||x||=\sqrt{ \lt x,x \gt } [[/math]]
We also call this number the length of [math]x[/math], or the distance from [math]x[/math] to the origin.

In analogy with what happens in finite dimensions, we have two important results regarding the norms. First is the Cauchy-Schwarz inequality, as follows:

Theorem

We have the Cauchy-Schwarz inequality

[[math]] | \lt x,y \gt |\leq||x||\cdot||y|| [[/math]]
and the equality case holds precisely when [math]x,y[/math] are proportional.


Show Proof

Consider the following quantity, depending on a real variable [math]t\in\mathbb R[/math], and on a variable on the unit circle, [math]w\in\mathbb T[/math]:

[[math]] f(t)=||twx+y||^2 [[/math]]

By developing [math]f[/math], we see that this is a degree 2 polynomial in [math]t[/math]:

[[math]] \begin{eqnarray*} f(t) &=& \lt twx+y,twx+y \gt \\ &=&t^2 \lt x,x \gt +tw \lt x,y \gt +t\bar{w} \lt y,x \gt + \lt y,y \gt \\ &=&t^2||x||^2+2tRe(w \lt x,y \gt )+||y||^2 \end{eqnarray*} [[/math]]


Since we have [math]f(t)\geq0[/math] for any [math]t[/math], the discriminant of this degree 2 polynomial must be nonpositive:

[[math]] 4Re(w \lt x,y \gt )^2-4||x||^2\cdot||y||^2\leq0 [[/math]]

But this is equivalent to the following condition:

[[math]] |Re(w \lt x,y \gt )|\leq||x||\cdot||y|| [[/math]]

Now the point is that we can arrange for the number [math]w\in\mathbb T[/math] to be such that the quantity [math]w \lt x,y \gt [/math] is real. Thus, we obtain the following inequality:

[[math]] | \lt x,y \gt |\leq||x||\cdot||y|| [[/math]]

Finally, the study of the equality case is straightforward, by using the fact that the discriminant of [math]f[/math] vanishes precisely when we have a root. But this leads to the conclusion in the statement, namely that the vectors [math]x,y[/math] must be proportional.
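As with any inequality, it is instructive to test Cauchy-Schwarz numerically, on random vectors. Here is a small Python sketch, with the seed, dimension and tolerance being our own arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
sp = lambda x, y: np.vdot(y, x)  # <x,y>, our usual convention

x = rng.standard_normal(5) + 1j*rng.standard_normal(5)
y = rng.standard_normal(5) + 1j*rng.standard_normal(5)

# The Cauchy-Schwarz inequality:
lhs = abs(sp(x, y))
rhs = np.linalg.norm(x) * np.linalg.norm(y)
assert lhs <= rhs + 1e-12

# The equality case, for proportional vectors:
z = (2 - 3j) * x
assert np.isclose(abs(sp(x, z)), np.linalg.norm(x) * np.linalg.norm(z))
```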

As a second main result now, we have the Minkowski inequality:

Theorem

We have the Minkowski inequality

[[math]] ||x+y||\leq||x||+||y|| [[/math]]
and the equality case holds precisely when [math]x,y[/math] are proportional, with a nonnegative proportionality factor.


Show Proof

This follows indeed from the Cauchy-Schwarz inequality, as follows:

[[math]] \begin{eqnarray*} &&||x+y||\leq||x||+||y||\\ &\iff&||x+y||^2\leq(||x||+||y||)^2\\ &\iff&||x||^2+||y||^2+2Re \lt x,y \gt \leq||x||^2+||y||^2+2||x||\cdot||y||\\ &\iff&Re \lt x,y \gt \leq||x||\cdot||y|| \end{eqnarray*} [[/math]]


As for the equality case, this requires [math]Re \lt x,y \gt =||x||\cdot||y||[/math], which by Cauchy-Schwarz forces the vectors [math]x,y[/math] to be proportional, with a nonnegative factor.
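The Minkowski inequality can be tested numerically too. A quick Python check, with our own random vectors, which illustrates as well the need for a nonnegative factor in the equality case:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(6) + 1j*rng.standard_normal(6)
y = rng.standard_normal(6) + 1j*rng.standard_normal(6)

# The triangle inequality for the Hilbert space norm:
assert np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y) + 1e-12

# Equality for a positive multiple of x:
w = 2.5 * x
assert np.isclose(np.linalg.norm(x + w), np.linalg.norm(x) + np.linalg.norm(w))

# Strict inequality for a negative multiple, e.g. w = -x:
assert np.linalg.norm(x + (-x)) < np.linalg.norm(x) + np.linalg.norm(-x)
```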

As a consequence of this, we have the following result:

Theorem

The following function is a distance on [math]H[/math],

[[math]] d(x,y)=||x-y|| [[/math]]
in the usual sense, that of the abstract metric spaces.


Show Proof

This follows indeed from the Minkowski inequality, which corresponds to the triangle inequality, the other two axioms for a distance being trivially satisfied.

The above result is quite important, because it shows that we can do geometry in our present setting, a bit as in the finite dimensional case.


Finally, in connection with doing geometry, we have the following key technical result, which shows that everything can be recovered in terms of distances:

Proposition

The scalar products can be recovered from distances, via the formula

[[math]] \begin{eqnarray*} 4 \lt x,y \gt &=&||x+y||^2-||x-y||^2\\ &+&i||x+iy||^2-i||x-iy||^2 \end{eqnarray*} [[/math]]
called complex polarization identity.


Show Proof

This is something that we have already met in finite dimensions. In arbitrary dimensions the proof is similar, as follows:

[[math]] \begin{eqnarray*} &&||x+y||^2-||x-y||^2+i||x+iy||^2-i||x-iy||^2\\ &=&||x||^2+||y||^2-||x||^2-||y||^2+i||x||^2+i||y||^2-i||x||^2-i||y||^2\\ &&+2Re( \lt x,y \gt )+2Re( \lt x,y \gt )+2iIm( \lt x,y \gt )+2iIm( \lt x,y \gt )\\ &=&4 \lt x,y \gt \end{eqnarray*} [[/math]]


Thus, we are led to the conclusion in the statement.
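The complex polarization identity is easy to verify by computer as well. Here is a Python sketch, on random vectors of our choosing, using our usual convention for the scalar product:

```python
import numpy as np

rng = np.random.default_rng(2)
sp = lambda x, y: np.vdot(y, x)       # <x,y>, linear in x
nsq = lambda v: np.linalg.norm(v)**2  # squared norm

x = rng.standard_normal(4) + 1j*rng.standard_normal(4)
y = rng.standard_normal(4) + 1j*rng.standard_normal(4)

# 4<x,y> = ||x+y||^2 - ||x-y||^2 + i||x+iy||^2 - i||x-iy||^2
rhs = nsq(x+y) - nsq(x-y) + 1j*nsq(x+1j*y) - 1j*nsq(x-1j*y)
assert np.isclose(4*sp(x, y), rhs)
```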

Let us discuss now some more advanced aspects. In order to do analysis on our spaces, we need the Cauchy sequences that we construct to converge. This is something which is automatic in finite dimensions, but in arbitrary dimensions, this can fail.


Thus, we must add an extra axiom, stating that [math]H[/math] is complete with respect to the norm. It is convenient here to formulate a detailed new definition, as follows, which will be the starting point for our various considerations to follow:

Definition

A Hilbert space is a complex vector space [math]H[/math] given with a scalar product [math] \lt x,y \gt [/math], satisfying the following conditions:

  • [math] \lt x,y \gt [/math] is linear in [math]x[/math], and antilinear in [math]y[/math].
  • [math]\overline{ \lt x,y \gt }= \lt y,x \gt [/math], for any [math]x,y[/math].
  • [math] \lt x,x \gt \gt 0[/math], for any [math]x\neq0[/math].
  • [math]H[/math] is complete with respect to the norm [math]||x||=\sqrt{ \lt x,x \gt }[/math].

In other words, we have taken here Definition 8.1, and added the condition that [math]H[/math] must be complete with respect to the norm [math]||x||=\sqrt{ \lt x,x \gt }[/math], that we know indeed to be a norm, according to the Minkowski inequality proved above.


As a basic example, we have the space [math]H=\mathbb C^N[/math], with its usual scalar product:

[[math]] \lt x,y \gt =\sum_ix_i\bar{y}_i [[/math]]

More generally now, we have the following construction of Hilbert spaces:

Proposition

The sequences of numbers [math]x=(x_i)[/math] which are square-summable,

[[math]] \sum_i|x_i|^2 \lt \infty [[/math]]
form a Hilbert space, denoted [math]l^2(\mathbb N)[/math], with the following scalar product:

[[math]] \lt x,y \gt =\sum_ix_i\bar{y}_i [[/math]]
In fact, given any index set [math]I[/math], we can construct a Hilbert space [math]l^2(I)[/math], in this way.


Show Proof

The fact that we have indeed a complex vector space with a scalar product is elementary, and the fact that this space is indeed complete is very standard too.

On the other hand, we can talk as well about spaces of functions, as follows:

Proposition

Given an interval [math]X\subset\mathbb R[/math], the quantity

[[math]] \lt f,g \gt =\int_Xf(x)\overline{g(x)}dx [[/math]]
is a scalar product, making [math]H=L^2(X)[/math] a Hilbert space.


Show Proof

Once again this is routine, coming this time from basic measure theory, with [math]H=L^2(X)[/math] being the space of square-integrable functions [math]f:X\to\mathbb C[/math], with the convention that two such functions are identified when they coincide almost everywhere.

We can unify the above two constructions, as follows:

Theorem

Given a measured space [math]X[/math], the quantity

[[math]] \lt f,g \gt =\int_Xf(x)\overline{g(x)}dx [[/math]]
is a scalar product, making [math]H=L^2(X)[/math] a Hilbert space.


Show Proof

Here the first assertion is clear, and the fact that the Cauchy sequences converge is clear as well, by taking the pointwise limit, and using a standard argument.

Observe that with [math]X=\{1,\ldots,N\}[/math] we obtain the space [math]H=\mathbb C^N[/math]. Also, with [math]X=\mathbb N[/math], with the counting measure, we obtain the space [math]H=l^2(\mathbb N)[/math]. In fact, with an arbitrary set [math]I[/math], once again with the counting measure, we obtain the space [math]H=l^2(I)[/math]. Thus, the construction in Theorem 8.10 unifies all the Hilbert space constructions that we have.


Quite remarkably, the converse of this holds, in the sense that any Hilbert space must be of the form [math]L^2(X)[/math]. This follows indeed from the following key result, which tells us that, in addition to this, we can always assume that [math]X=I[/math] is a discrete space:

Theorem

Let [math]H[/math] be a Hilbert space.

  • Any algebraic basis of this space [math]\{f_i\}_{i\in I}[/math] can be turned into an orthonormal basis [math]\{e_i\}_{i\in I}[/math], by using the Gram-Schmidt procedure.
  • Thus, [math]H[/math] has an orthonormal basis, and so we have [math]H\simeq l^2(I)[/math], with [math]I[/math] being the indexing set for this orthonormal basis.


Show Proof

This is standard, by induction in finite dimensions, using Gram-Schmidt, as stated, and by induction as well in infinite, countable dimensions. As for the case of infinite, uncountable dimensions, here the result holds as well, with the proof using transfinite induction arguments from logic, and more specifically, the Zorn lemma.
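The Gram-Schmidt procedure used in (1) can be implemented in a few lines, at least for finite families of vectors. Here is a sketch in Python, with the function name and the random test family being our own choices:

```python
import numpy as np

sp = lambda x, y: np.vdot(y, x)  # <x,y>, linear in x

def gram_schmidt(vectors):
    """Orthonormalize a linearly independent family, as in the statement."""
    basis = []
    for f in vectors:
        # Subtract the projections onto the previously constructed e_i:
        for e in basis:
            f = f - sp(f, e) * e
        basis.append(f / np.linalg.norm(f))
    return basis

rng = np.random.default_rng(3)
f = [rng.standard_normal(3) + 1j*rng.standard_normal(3) for _ in range(3)]
e = gram_schmidt(f)

# Check orthonormality, <e_i,e_j> = delta_ij:
for i in range(3):
    for j in range(3):
        assert np.isclose(sp(e[i], e[j]), 1.0 if i == j else 0.0)
```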

We have the following definition, based on the above:

Definition

A Hilbert space [math]H[/math] is called separable when the following equivalent conditions are satisfied:

  • [math]H[/math] has a countable algebraic basis [math]\{f_i\}_{i\in\mathbb N}[/math].
  • [math]H[/math] has a countable orthonormal basis [math]\{e_i\}_{i\in\mathbb N}[/math].
  • We have [math]H\simeq l^2(\mathbb N)[/math], isomorphism of Hilbert spaces.

In what follows we will be mainly interested in the separable Hilbert spaces, where most of the questions coming from physics take place. In view of the above, the following philosophical question appears: why not simply talk about [math]l^2(\mathbb N)[/math]?


In answer to this, we cannot really do so, because many of the separable spaces that we are interested in appear as spaces of functions, and such spaces do not necessarily have a very simple or explicit orthonormal basis, as shown by the following result:

Proposition

The Hilbert space [math]H=L^2[0,1][/math] is separable, having as orthonormal basis the orthonormalized version of the algebraic basis

[[math]] f_n=x^n [[/math]]
with [math]n\in\mathbb N[/math], coming from the Weierstrass density theorem.


Show Proof

The fact that the space [math]H=L^2[0,1][/math] is indeed separable is clear from the Weierstrass theorem, which provides us with the algebraic basis [math]f_n=x^n[/math], which can be orthogonalized by using the Gram-Schmidt procedure, as explained in Theorem 8.11. Working out the details here is actually an excellent exercise.
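As a hint for that exercise, here is a numerical sketch of the orthonormalization, with the integrals over [math][0,1][/math] approximated by midpoint sums. The outcome is, up to normalization, the shifted Legendre polynomials; the discretization parameters here are our own choices:

```python
import numpy as np

# Discretize L^2[0,1]: midpoint sampling, so <f,g> ~ mean of f*conj(g).
t = (np.arange(2000) + 0.5) / 2000
sp = lambda f, g: np.mean(f * np.conj(g))  # ~ int_0^1 f(x) conj(g(x)) dx

# Orthonormalize the monomials 1, x, x^2 by Gram-Schmidt:
basis = []
for n in range(3):
    f = t**n
    for e in basis:
        f = f - sp(f, e) * e
    basis.append(f / np.sqrt(sp(f, f).real))

# The result matches the shifted Legendre polynomials, suitably normalized:
# e_0 = 1, e_1 = sqrt(3)(2x-1), e_2 = sqrt(5)(6x^2-6x+1).
assert np.allclose(basis[1], np.sqrt(3)*(2*t - 1), atol=1e-2)
assert np.allclose(basis[2], np.sqrt(5)*(6*t**2 - 6*t + 1), atol=1e-2)
```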

As a conclusion to all this, we are interested in one space, namely the unique separable Hilbert space [math]H[/math], but due to various technical reasons, it is often better to forget that we have [math]H=l^2(\mathbb N)[/math], and say instead that we have [math]H=L^2(X)[/math], with [math]X[/math] being a separable measured space, or simply say that [math]H[/math] is an abstract separable Hilbert space.

8b. Linear operators

Let us get now into the study of linear operators [math]T:H\to H[/math], which will eventually lead us into the correct infinite dimensional version of linear algebra. We first have:

Theorem

Let [math]H[/math] be an arbitrary Hilbert space, coming with an orthonormal basis [math]\{e_i\}_{i\in I}[/math]. The algebra of all linear operators from [math]H[/math] to itself,

[[math]] \mathcal L(H)=\Big\{T:H\to H\ {\rm linear}\Big\} [[/math]]
embeds then into the space of the [math]I\times I[/math] complex matrices,

[[math]] M_I(\mathbb C)=\left\{(M_{ij})_{i,j\in I}\Big| M_{ij}\in\mathbb C\right\} [[/math]]
with an operator [math]T[/math] corresponding to the following matrix:

[[math]] M_{ij}= \lt Te_j,e_i \gt [[/math]]
In the case [math]H=\mathbb C^N[/math] we obtain in this way the usual isomorphism [math]\mathcal L(H)\simeq M_N(\mathbb C)[/math]. In the separable case we obtain in this way a proper embedding [math]\mathcal L(H)\subset M_\infty(\mathbb C)[/math].


Show Proof

We have three assertions to be proved, the idea being as follows:


(1) The correspondence [math]T\to M[/math] constructed in the statement is indeed linear, and its kernel is [math]\{0\}[/math], so we have indeed an embedding as follows, as claimed:

[[math]] \mathcal L(H)\subset M_I(\mathbb C) [[/math]]

(2) In finite dimensions we obtain an isomorphism, because any matrix [math]M\in M_N(\mathbb C)[/math] determines an operator [math]T:\mathbb C^N\to\mathbb C^N[/math], according to the formula [math] \lt Te_j,e_i \gt =M_{ij}[/math].


(3) In infinite dimensions, however, we do not have an isomorphism. For instance on [math]H=l^2(\mathbb N)[/math] the following matrix does not define an operator:

[[math]] M=\begin{pmatrix}1&1&\ldots\\ 1&1&\ldots\\ \vdots&\vdots \end{pmatrix} [[/math]]

Indeed, [math]T(e_1)[/math] should be the all-1 vector, but this vector is not square-summable.

In connection with our previous comments, the above result is something quite theoretical, because for basic Hilbert spaces like [math]L^2[0,1][/math], which do not have an obvious orthonormal basis, the embedding [math]\mathcal L(H)\subset M_\infty(\mathbb C)[/math] that we obtain is not something very useful. In short, while the operators [math]T:H\to H[/math] are basically some infinite matrices, it is better to think of these operators as being objects on their own.
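That said, in finite dimensions the correspondence [math]T\to M[/math] is perfectly concrete, and can be checked by computer. A small Python verification, with the matrix [math]A[/math] below being an arbitrary choice of ours:

```python
import numpy as np

sp = lambda x, y: np.vdot(y, x)  # <x,y>, our usual convention

# An operator on H = C^3, and its matrix M_ij = <T e_j, e_i>:
A = np.array([[1, 2j, 0], [0, 3, 1], [1j, 0, 2]])
T = lambda v: A @ v

e = np.eye(3)
M = np.array([[sp(T(e[j]), e[i]) for j in range(3)] for i in range(3)])
assert np.allclose(M, A)  # the matrix of T recovers A, as it should
```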


In what follows we will be interested in the operators [math]T:H\to H[/math] which are bounded. Regarding such operators, we have the following result:

Theorem

Given a Hilbert space [math]H[/math], the linear operators [math]T:H\to H[/math] which are bounded, in the sense that we have

[[math]] ||T||=\sup_{||x||\leq1}||Tx|| \lt \infty [[/math]]
form a complex algebra with unit [math]B(H)[/math], having the property

[[math]] ||ST||\leq||S||\cdot||T|| [[/math]]
and which is complete with respect to the norm.


Show Proof

The fact that we have indeed an algebra, satisfying the product condition in the statement, follows from the following estimates, which are all elementary:

[[math]] ||S+T||\leq||S||+||T||\quad,\quad ||\lambda T||=|\lambda|\cdot||T||\quad,\quad ||ST||\leq||S||\cdot||T|| [[/math]]

Regarding now the last assertion, if [math]\{T_n\}\subset B(H)[/math] is Cauchy then [math]\{T_nx\}[/math] is Cauchy for any [math]x\in H[/math], so we can define the limit [math]T=\lim_{n\to\infty}T_n[/math] by setting:

[[math]] Tx=\lim_{n\to\infty}T_nx [[/math]]

Let us first check that the map [math]x\to Tx[/math] is linear. We have:

[[math]] \begin{eqnarray*} T(x+y) &=&\lim_{n\to\infty}T_n(x+y)\\ &=&\lim_{n\to\infty}\left(T_n(x)+T_n(y)\right)\\ &=&\lim_{n\to\infty}T_n(x)+\lim_{n\to\infty}T_n(y)\\ &=&T(x)+T(y) \end{eqnarray*} [[/math]]


Similarly, we have as well the following computation:

[[math]] \begin{eqnarray*} T(\lambda x) &=&\lim_{n\to\infty}T_n(\lambda x)\\ &=&\lambda\lim_{n\to\infty}T_n(x)\\ &=&\lambda T(x) \end{eqnarray*} [[/math]]


Thus we have [math]T\in\mathcal L(H)[/math]. It remains now to prove that we have [math]T\in B(H)[/math], and that we have [math]T_n\to T[/math] in norm. For this purpose, observe that we have:

[[math]] \begin{eqnarray*} &&||T_n-T_m||\leq\varepsilon\ ,\ \forall n,m\geq N\\ &\implies&||T_nx-T_mx||\leq\varepsilon\ ,\ \forall||x||=1\ ,\ \forall n,m\geq N\\ &\implies&||T_nx-Tx||\leq\varepsilon\ ,\ \forall||x||=1\ ,\ \forall n\geq N\\ &\implies&||T_Nx-Tx||\leq\varepsilon\ ,\ \forall||x||=1\\ &\implies&||T_N-T||\leq\varepsilon \end{eqnarray*} [[/math]]


As a first consequence, we obtain [math]T\in B(H)[/math], because we have:

[[math]] \begin{eqnarray*} ||T|| &=&||T_N+(T-T_N)||\\ &\leq&||T_N||+||T-T_N||\\ &\leq&||T_N||+\varepsilon\\ & \lt &\infty \end{eqnarray*} [[/math]]


As a second consequence, we obtain [math]T_N\to T[/math] in norm, and we are done.
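Regarding the norm in the above statement, let us mention that in finite dimensions this is the largest singular value of the associated matrix, a standard fact from linear algebra. Here is a numerical sketch of this, with the supremum over the unit ball estimated by random sampling; the seed and sample count are our choices:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4)) + 1j*rng.standard_normal((4, 4))

# ||T|| = sup over the unit ball; estimate it by sampling unit vectors,
# and compare with the largest singular value, which equals it exactly.
samples = rng.standard_normal((5000, 4)) + 1j*rng.standard_normal((5000, 4))
samples /= np.linalg.norm(samples, axis=1, keepdims=True)
estimate = max(np.linalg.norm(A @ x) for x in samples)

exact = np.linalg.norm(A, 2)  # largest singular value of A
assert estimate <= exact + 1e-12
assert estimate > 0.8 * exact  # random sampling gets reasonably close
```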

In relation with the construction from Theorem 8.14, we have:

Proposition

We have embeddings as follows,

[[math]] B(H)\subset\mathcal L(H)\subset M_I(\mathbb C) [[/math]]
which are both proper, in the infinite dimensional case.


Show Proof

According to Theorem 8.14, the algebra [math]B(H)[/math] consists of the [math]I\times I[/math] complex matrices which define indeed linear maps [math]T:H\to H[/math], and which satisfy as well a second boundedness condition, namely the boundedness of the norm of [math]T[/math]:

[[math]] ||T|| \lt \infty [[/math]]

In finite dimensions we have equalities everywhere, but in general this is not true, the standard example of a matrix not producing an operator being as follows:

[[math]] M=\begin{pmatrix}1&1&\ldots\\ 1&1&\ldots\\ \vdots&\vdots \end{pmatrix} [[/math]]

As for the examples of linear operators which are not bounded, these are more complicated, coming from logic, and we will not need them in what follows.

As already mentioned after Theorem 8.14, all this is something quite theoretical, because for basic function spaces like [math]L^2[0,1][/math], which do not have a simple orthonormal basis, the embedding [math]B(H)\subset M_I(\mathbb C)[/math] that we obtain is not very useful. Thus, as a reiterated conclusion, while the bounded operators [math]T:H\to H[/math] are basically some infinite matrices, it is better to think of these operators as being objects on their own.

8c. Spectral theory

We will be interested in what follows in [math]B(H)[/math] and its closed subalgebras [math]A\subset B(H)[/math]. It is convenient to formulate the following definition:

Definition

A Banach algebra is a complex algebra with unit [math]A[/math], having a vector space norm [math]||.||[/math] satisfying

[[math]] ||ab||\leq||a||\cdot||b|| [[/math]]
and which makes it a Banach space, in the sense that the Cauchy sequences converge.

As said above, the basic examples of Banach algebras, or at least the basic examples that we will be interested in here, are the operator algebra [math]B(H)[/math], and its norm closed subalgebras [math]A\subset B(H)[/math], such as the algebras [math]A= \lt T \gt [/math] generated by a single operator [math]T\in B(H)[/math]. There are many other examples, and more on this later.


Generally speaking, the elements [math]a\in A[/math] of a Banach algebra can be thought of as being bounded operators on some Hilbert space, which is not present. With this idea in mind, we can emulate spectral theory in our setting, the starting point being:

Definition

The spectrum of an element [math]a\in A[/math] is the set

[[math]] \sigma(a)=\left\{\lambda\in\mathbb C\Big|a-\lambda\not\in A^{-1}\right\} [[/math]]
where [math]A^{-1}\subset A[/math] is the set of invertible elements.

As a basic example, the spectrum of a usual matrix [math]M\in M_N(\mathbb C)[/math] is the collection of its eigenvalues, taken of course without multiplicities. In the case of the trivial algebra [math]A=\mathbb C[/math], appearing at [math]N=1[/math], the spectrum of an element is the element itself.
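For the usual matrices, the spectrum can be computed by any linear algebra package. A quick Python illustration, on an upper triangular [math]2\times2[/math] matrix of our choosing:

```python
import numpy as np

# The spectrum of a matrix is its set of eigenvalues:
M = np.array([[2, 1], [0, 3]], dtype=complex)
spectrum = np.sort_complex(np.linalg.eigvals(M))
assert np.allclose(spectrum, [2, 3])

# lam is in the spectrum exactly when M - lam is not invertible:
for lam in [2, 3]:
    assert np.isclose(np.linalg.det(M - lam*np.eye(2)), 0)
assert not np.isclose(np.linalg.det(M - 5*np.eye(2)), 0)
```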


As a first, basic result regarding spectra, we have:

Proposition

We have the following formula, valid for any [math]a,b\in A[/math]:

[[math]] \sigma(ab)\cup\{0\}=\sigma(ba)\cup\{0\} [[/math]]
Also, there are examples where [math]\sigma(ab)\neq\sigma(ba)[/math].


Show Proof

We will first prove that we have the following implication:

[[math]] 1\notin\sigma(ab)\implies1\notin\sigma(ba) [[/math]]

For this purpose, assume that [math]1-ab[/math] is invertible, with inverse denoted [math]c[/math]:

[[math]] c=(1-ab)^{-1} [[/math]]

We have then the following formulae, relating our variables [math]a,b,c[/math]:

[[math]] abc=cab=c-1 [[/math]]

By using these formulae, we obtain the following equality:

[[math]] \begin{eqnarray*} (1+bca)(1-ba) &=&1+bca-ba-bcaba\\ &=&1+bca-ba-bca+ba\\ &=&1 \end{eqnarray*} [[/math]]


A similar computation shows that we have as well:

[[math]] (1-ba)(1+bca)=1 [[/math]]

Thus [math]1-ba[/math] is invertible, with inverse [math]1+bca[/math], which proves our claim. Now by multiplying by scalars, we deduce from this that for any [math]\lambda\in\mathbb C-\{0\}[/math] we have:

[[math]] \lambda\notin\sigma(ab)\implies\lambda\notin\sigma(ba) [[/math]]

But this leads to the conclusion in the statement, namely:

[[math]] \sigma(ab)\cup\{0\}=\sigma(ba)\cup\{0\} [[/math]]

Regarding now the last claim, we know from linear algebra that [math]\sigma(ab)=\sigma(ba)[/math] holds for the usual matrices, for instance because of the above, and because [math]ab[/math] is invertible if and only if [math]ba[/math] is. However, this latter fact fails for general operators on Hilbert spaces. Indeed, we can take our operator [math]a[/math] to be the shift on the space [math]l^2(\mathbb N)[/math], given by:

[[math]] S(e_i)=e_{i+1} [[/math]]

As for [math]b[/math], we can take the adjoint of [math]S[/math], which is the following operator:

[[math]] S^*(e_i)=\begin{cases} e_{i-1}&{\rm if}\ i \gt 0\\ 0&{\rm if}\ i=0 \end{cases} [[/math]]

Let us compose now these two operators. In one sense, we have:

[[math]] S^*S=1\implies 0\notin\sigma(S^*S) [[/math]]

In the other sense, however, the situation is different, as follows:

[[math]] SS^*=Proj(e_0^\perp)\implies 0\in\sigma(SS^*) [[/math]]

Thus, the spectra do not match at [math]0[/math], and we have our counterexample, as desired.
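The above can be illustrated numerically. For square matrices the spectra of [math]ab[/math] and [math]ba[/math] indeed agree, and truncating the shift to [math]\mathbb C^5[/math] shows why the counterexample needs infinite dimensions, with any finite truncation of [math]S[/math] failing to satisfy [math]S^*S=1[/math]; all numerical choices below are ours:

```python
import numpy as np

rng = np.random.default_rng(5)
a = rng.standard_normal((4, 4))
b = rng.standard_normal((4, 4))

# For square matrices, ab and ba have the same eigenvalues:
sab = np.sort_complex(np.linalg.eigvals(a @ b))
sba = np.sort_complex(np.linalg.eigvals(b @ a))
assert np.allclose(sab, sba, atol=1e-6)

# A finite truncation of the shift S(e_i) = e_{i+1}, on C^5:
S = np.eye(5, k=-1)
assert np.allclose(S.T @ S, np.diag([1, 1, 1, 1, 0]))  # not 1: truncation
assert np.allclose(S @ S.T, np.diag([0, 1, 1, 1, 1]))  # Proj(e_0 perp)
```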

Let us discuss now a second basic result about spectra, which is something very useful. Given an arbitrary Banach algebra element [math]a\in A[/math], and a rational function [math]f=P/Q[/math] having poles outside the spectrum [math]\sigma(a)[/math], we can construct the following element:

[[math]] f(a)=P(a)Q(a)^{-1} [[/math]]

For simplicity, and due to the fact that the elements [math]P(a),Q(a)[/math] commute, so that the order is irrelevant, we write this element as a usual fraction, as follows:

[[math]] f(a)=\frac{P(a)}{Q(a)} [[/math]]

With this convention, we have the following result:

Theorem

We have the “rational functional calculus” formula

[[math]] \sigma(f(a))=f(\sigma(a)) [[/math]]
valid for any rational function [math]f\in\mathbb C(X)[/math] having poles outside [math]\sigma(a)[/math].


Show Proof

In order to prove this result, we can proceed in two steps, as follows:


(1) Assume first that we are in the polynomial function case, [math]f\in\mathbb C[X][/math]. We pick a scalar [math]\lambda\in\mathbb C[/math], and we decompose the polynomial [math]f-\lambda[/math] into factors:

[[math]] f(X)-\lambda=c(X-r_1)\ldots(X-r_n) [[/math]]

By using this formula, we have then, as desired:

[[math]] \begin{eqnarray*} \lambda\notin\sigma(f(a)) &\iff&f(a)-\lambda\in A^{-1}\\ &\iff&c(a-r_1)\ldots(a-r_n)\in A^{-1}\\ &\iff&a-r_1,\ldots,a-r_n\in A^{-1}\\ &\iff&r_1,\ldots,r_n\notin\sigma(a)\\ &\iff&\lambda\notin f(\sigma(a)) \end{eqnarray*} [[/math]]


(2) Assume now that we are in the general rational function case, [math]f\in\mathbb C(X)[/math]. We pick a scalar [math]\lambda\in\mathbb C[/math], we write [math]f=P/Q[/math], and we set:

[[math]] F=P-\lambda Q [[/math]]

By using now what we found in (1), for this polynomial, we obtain:

[[math]] \begin{eqnarray*} \lambda\in\sigma(f(a)) &\iff&F(a)\notin A^{-1}\\ &\iff&0\in\sigma(F(a))\\ &\iff&0\in F(\sigma(a))\\ &\iff&\exists\mu\in\sigma(a),F(\mu)=0\\ &\iff&\lambda\in f(\sigma(a)) \end{eqnarray*} [[/math]]


Thus, we have obtained the formula in the statement.
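Here is a quick numerical check of the polynomial case (1), on a [math]2\times2[/math] matrix, with [math]f(X)=X^2+1[/math] being an arbitrary choice of ours:

```python
import numpy as np

# Check sigma(f(a)) = f(sigma(a)) for the polynomial f(X) = X^2 + 1:
a = np.array([[0, 1], [1, 0]], dtype=complex)   # eigenvalues -1, 1
f = lambda m: m @ m + np.eye(2)

spec_fa = np.sort_complex(np.linalg.eigvals(f(a)))
f_spec = np.sort_complex(np.array([lam**2 + 1 for lam in np.linalg.eigvals(a)]))
assert np.allclose(spec_fa, f_spec)   # both spectra consist of the value 2
```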

Summarizing, we have so far a beginning of theory. Let us prove now something that we do not know yet, namely that the spectra are non-empty:

[[math]] \sigma(a)\neq\emptyset [[/math]]

This is something that we know well for the usual matrices. However, a bit of thinking tells us that, even for the usual matrices, this is something rather advanced.


In the present Banach algebra setting, this is definitely something non-trivial. In order to establish this result, we will need a number of analytic preliminaries, as follows:

Proposition

Let [math]A[/math] be a Banach algebra.

  • [math]||a|| \lt 1\implies(1-a)^{-1}=1+a+a^2+\ldots[/math]
  • The set [math]A^{-1}[/math] is open.
  • The map [math]a\to a^{-1}[/math] is differentiable.


Show Proof

All these assertions are elementary, as follows:


(1) This follows as in the scalar case, the computation being as follows, provided that everything converges under the norm, which amounts to saying that [math]||a|| \lt 1[/math]:

[[math]] \begin{eqnarray*} (1-a)(1+a+a^2+\ldots) &=&1-a+a-a^2+a^2-a^3+\ldots\\ &=&1 \end{eqnarray*} [[/math]]


(2) Assuming [math]a\in A^{-1}[/math], let us pick [math]b\in A[/math] such that:

[[math]] ||a-b|| \lt \frac{1}{||a^{-1}||} [[/math]]

We have then the following norm estimate:

[[math]] \begin{eqnarray*} ||1-a^{-1}b|| &=&||a^{-1}(a-b)||\\ &\leq&||a^{-1}||\cdot||a-b||\\ & \lt &1 \end{eqnarray*} [[/math]]


Thus by (1) we obtain [math]a^{-1}b\in A^{-1}[/math], and so [math]b\in A^{-1}[/math], as desired.


(3) This follows as in the scalar case, where the derivative of [math]f(t)=t^{-1}[/math] is:

[[math]] f'(t)=-t^{-2} [[/math]]

To be more precise, in the present Banach algebra setting the derivative is no longer a number, but rather a linear transformation. But this linear transformation can be found by developing the function [math]f(a)=a^{-1}[/math] at order 1, as follows:

[[math]] \begin{eqnarray*} (a+h)^{-1} &=&((1+ha^{-1})a)^{-1}\\ &=&a^{-1}(1+ha^{-1})^{-1}\\ &=&a^{-1}(1-ha^{-1}+(ha^{-1})^2-\ldots)\\ &\simeq&a^{-1}(1-ha^{-1})\\ &=&a^{-1}-a^{-1}ha^{-1} \end{eqnarray*} [[/math]]


We conclude that the derivative that we are looking for is:

[[math]] f'(a)h=-a^{-1}ha^{-1} [[/math]]

Thus, we are led to the conclusion in the statement.
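The series in (1), usually called Neumann series, is easy to test numerically. A Python sketch, with the rescaling factor and the number of terms being our own choices:

```python
import numpy as np

rng = np.random.default_rng(6)
a = rng.standard_normal((3, 3))
a *= 0.5 / np.linalg.norm(a, 2)   # rescale so that ||a|| = 1/2 < 1

# Partial sums of the Neumann series 1 + a + a^2 + ...
s, term = np.eye(3), np.eye(3)
for _ in range(60):
    term = term @ a
    s = s + term

assert np.allclose(s, np.linalg.inv(np.eye(3) - a))  # converges to (1-a)^{-1}
```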

We can now formulate a key theorem about the Banach algebras, as follows:

Theorem

The spectrum of any Banach algebra element [math]\sigma(a)\subset\mathbb C[/math] is:

  • Compact.
  • Contained in the disc [math]D_0(||a||)[/math].
  • Non-empty.


Show Proof

This can be proved by using the above results, as follows:


(1) In view of (2) below, it is enough to prove that [math]\sigma(a)[/math] is closed. But this follows from the following computation, valid for [math]|\varepsilon|[/math] small enough, with the middle implication coming from the fact that [math]A^{-1}[/math] is open:

[[math]] \begin{eqnarray*} \lambda\notin\sigma(a) &\implies&a-\lambda\in A^{-1}\\ &\implies&a-\lambda-\varepsilon\in A^{-1}\\ &\implies&\lambda+\varepsilon\notin\sigma(a) \end{eqnarray*} [[/math]]


(2) This follows indeed from the following computation:

[[math]] \begin{eqnarray*} |\lambda| \gt ||a|| &\implies&\Big|\Big|\frac{a}{\lambda}\Big|\Big| \lt 1\\ &\implies&1-\frac{a}{\lambda}\in A^{-1}\\ &\implies&\lambda-a\in A^{-1}\\ &\implies&\lambda\notin\sigma(a) \end{eqnarray*} [[/math]]


(3) Assume by contradiction [math]\sigma(a)=\emptyset[/math]. Given a linear form [math]f\in A^*[/math], consider the following map, which is well-defined, due to our assumption [math]\sigma(a)=\emptyset[/math]:

[[math]] \varphi:\mathbb C\to\mathbb C\quad,\quad \lambda\to f((a-\lambda)^{-1}) [[/math]]

By using Proposition 8.21, this map is differentiable on all of [math]\mathbb C[/math], so it is an entire function, and thus has a power series expansion:

[[math]] \varphi(\lambda)=\sum_{k=0}^\infty c_k\lambda^k [[/math]]

On the other hand, we have the following estimate:

[[math]] \begin{eqnarray*} \lambda\to\infty &\implies&a-\lambda\to\infty\\ &\implies&(a-\lambda)^{-1}\to0\\ &\implies&\varphi(\lambda)\to0 \end{eqnarray*} [[/math]]


Thus by the Liouville theorem from complex analysis we obtain [math]\varphi=0[/math], and since [math]f\in A^*[/math] was arbitrary, this gives [math](a-\lambda)^{-1}=0[/math]. But an inverse can never be zero, so we have our contradiction, and we are done.
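For the usual matrices, all three statements can be seen directly, with the disc bound in (2) amounting to the standard fact that the spectral radius is dominated by the operator norm. A numerical illustration, on a random matrix of our choosing:

```python
import numpy as np

rng = np.random.default_rng(7)
a = rng.standard_normal((5, 5)) + 1j*rng.standard_normal((5, 5))

# (1) + (2): the spectrum is a (finite, hence compact) subset of the
# disc of radius ||a||, the operator norm being the top singular value:
radius = np.linalg.norm(a, 2)
eigenvalues = np.linalg.eigvals(a)
assert all(abs(lam) <= radius + 1e-10 for lam in eigenvalues)

# (3): non-emptiness, trivially visible here:
assert len(eigenvalues) > 0
```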

This was for the basic spectral theory in Banach algebras, which notably applies to the case [math]A=B(H)[/math]. It is possible to go beyond the above, for instance with a holomorphic function extension of the rational functional calculus formula [math]\sigma(f(a))=f(\sigma(a))[/math] from Theorem 8.20. Also, in the case of the algebras of operators, more can be said.

8d. Operator algebras

Let us get back now to the operator algebra [math]B(H)[/math], from Theorem 8.15. The point is that this Banach algebra is of a very special type, due to the following fact:

Theorem

The Banach algebra [math]B(H)[/math] has an involution [math]T\to T^*[/math], given by

[[math]] \lt Tx,y \gt = \lt x,T^*y \gt [[/math]]
which is antilinear, antimultiplicative, and is an isometry, in the sense that:

[[math]] ||T||=||T^*|| [[/math]]
Moreover, the norm and the involution are related by the formula [math]||TT^*||=||T||^2[/math].


Show Proof

We have several things to be proved, the idea being as follows:


(1) As a preliminary fact, that we will need in what follows, our claim is that any continuous linear form [math]\varphi:H\to\mathbb C[/math] must be of the following type, for a certain vector [math]z\in H[/math]:

[[math]] \varphi(x)= \lt x,z \gt [[/math]]

Indeed, this is something clear for any Hilbert space of type [math]H=l^2(I)[/math]. But, by using a basis, any Hilbert space is of this form, and so we have proved our claim.


(2) The existence of the adjoint operator [math]T^*[/math], given by the formula in the statement, comes from the fact that [math]\varphi(x)= \lt Tx,y \gt [/math] is a linear map [math]H\to\mathbb C[/math], which is continuous by Cauchy-Schwarz, so by (1) we must have a formula as follows, for a certain vector [math]T^*y\in H[/math]:

[[math]] \varphi(x)= \lt x,T^*y \gt [[/math]]

Moreover, since this vector is unique, [math]T^*[/math] is unique too, and we have as well:

[[math]] (S+T)^*=S^*+T^*\quad,\quad (\lambda T)^*=\bar{\lambda}T^* [[/math]]

[[math]] (ST)^*=T^*S^*\quad,\quad (T^*)^*=T [[/math]]

Observe also that we have indeed [math]T^*\in B(H)[/math], because:

[[math]] \begin{eqnarray*} ||T|| &=&\sup_{||x||=1}\sup_{||y||=1}| \lt Tx,y \gt |\\ &=&\sup_{||y||=1}\sup_{||x||=1}| \lt x,T^*y \gt |\\ &=&||T^*|| \end{eqnarray*} [[/math]]


(3) Regarding now the last assertion, observe that we have:

[[math]] ||TT^*|| \leq||T||\cdot||T^*|| =||T||^2 [[/math]]

On the other hand, we have as well the following estimate:

[[math]] \begin{eqnarray*} ||T||^2 &=&\sup_{||x||=1}| \lt Tx,Tx \gt |\\ &=&\sup_{||x||=1}| \lt x,T^*Tx \gt |\\ &\leq&||T^*T|| \end{eqnarray*} [[/math]]


By replacing [math]T\to T^*[/math] we obtain from this that we have as well [math]||T||^2\leq||TT^*||[/math]. Thus, we have obtained the needed inequality, and we are done.

As an observation here, in the context of the construction [math]T\to M[/math] from Theorem 8.14, the adjoint operation [math]T\to T^*[/math] takes a very simple form, namely:

[[math]] (M^*)_{ij}=\overline{M}_{ji} [[/math]]
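In finite dimensions this can be checked directly. The following sketch, assuming NumPy is available, verifies the defining relation of the adjoint and the identity [math]||TT^*||=||T||^2[/math] on a random complex matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# A linear operator on H = C^3, written as a complex matrix
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

# The adjoint is the conjugate transpose: (M*)_{ij} = conj(M_{ji})
M_star = M.conj().T

# Check <Mx, y> = <x, M*y>, with <u, v> = sum_i u_i conj(v_i) as in the
# text; note that np.vdot conjugates its FIRST argument
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)
lhs = np.vdot(y, M @ x)          # <Mx, y>
rhs = np.vdot(M_star @ y, x)     # <x, M*y>

# The C*-identity ||MM*|| = ||M||^2, with ||.|| the operator norm
# (largest singular value, computed by np.linalg.norm(., 2))
c_star_identity = (np.linalg.norm(M @ M_star, 2), np.linalg.norm(M, 2) ** 2)
```

Here the choice of a [math]3\times3[/math] matrix is of course arbitrary, the same check working in any finite dimension.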

The above result suggests the following key definition:

Definition

A [math]C^*[/math]-algebra is a unital complex algebra [math]A[/math], having:

  • A norm [math]a\to||a||[/math], making it a Banach algebra.
  • An involution [math]a\to a^*[/math], which satisfies [math]||aa^*||=||a||^2[/math], for any [math]a\in A[/math].

At the level of the basic examples, we know from Theorem 8.23 that the full operator algebra [math]B(H)[/math] is a [math]C^*[/math]-algebra, in the above sense. More generally, any closed [math]*[/math]-subalgebra [math]A\subset B(H)[/math] is a [math]C^*[/math]-algebra. We will see later on that any [math]C^*[/math]-algebra appears in fact in this way, as a closed [math]*[/math]-subalgebra [math]A\subset B(H)[/math], for a certain Hilbert space [math]H[/math].


For the moment, we are interested in developing the theory of [math]C^*[/math]-algebras, without reference to operators, or Hilbert spaces. As a first observation, we have:

Proposition

If [math]X[/math] is an abstract compact space, the algebra [math]C(X)[/math] of continuous functions [math]f:X\to\mathbb C[/math] is a [math]C^*[/math]-algebra, with structure as follows:

  • The norm is the usual sup norm, given by:
    [[math]] ||f||=\sup_{x\in X}|f(x)| [[/math]]
  • The involution is the usual involution, given by:
    [[math]] f^*(x)=\overline{f(x)} [[/math]]

This algebra is commutative, in the sense that [math]fg=gf[/math], for any [math]f,g[/math].


Show Proof

Almost everything here is trivial. Observe that we have indeed:

[[math]] \begin{eqnarray*} ||ff^*|| &=&\sup_{x\in X}|f(x)\overline{f(x)}|\\ &=&\sup_{x\in X}|f(x)|^2\\ &=&||f||^2 \end{eqnarray*} [[/math]]


Thus, the axioms are satisfied, and finally [math]fg=gf[/math] is clear.
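Numerically, the [math]C^*[/math]-identity for the sup norm can be seen on a sampled function; a small sketch, with the interval [math][0,1][/math] standing in for [math]X[/math]:

```python
import numpy as np

# A continuous function f on X = [0, 1], sampled on a fine grid
x = np.linspace(0, 1, 1001)
f = (1 + x) * np.exp(2j * np.pi * x)

sup_norm = np.max(np.abs(f))        # ||f|| = sup |f(x)|, here 2
f_star = np.conj(f)                 # f*(x) = conj(f(x))

# ||f f*|| = sup |f(x)|^2 = ||f||^2, the C*-identity
c_star = np.max(np.abs(f * f_star))
```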

Our claim now is that any commutative [math]C^*[/math]-algebra appears as above. This is something non-trivial, which requires a number of preliminaries. We will need:

Definition

Given an element [math]a\in A[/math], its spectral radius

[[math]] \rho(a)\in[0,||a||] [[/math]]
is the radius of the smallest disk centered at [math]0[/math] containing [math]\sigma(a)[/math].

Here we have included a number of results that we already know, from Theorem 8.22, namely the fact that the spectrum is nonempty, and contained in the disk [math]D_0(||a||)[/math].
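For matrices the spectrum is the set of eigenvalues, so these facts can be checked numerically; a sketch assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# Spectral radius: radius of the smallest disk centered at 0
# containing the eigenvalues
rho = max(abs(np.linalg.eigvals(A)))

# Operator norm = largest singular value; since sigma(A) is contained
# in the disk D_0(||A||), we must have rho(A) <= ||A||
norm_A = np.linalg.norm(A, 2)
```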


We have the following key result, extending our spectral theory knowledge, from the general Banach algebra setting, to the present [math]C^*[/math]-algebra setting:

Theorem

Let [math]A[/math] be a [math]C^*[/math]-algebra.

  • The spectrum of a unitary element ([math]a^*=a^{-1}[/math]) is on the unit circle.
  • The spectrum of a self-adjoint element ([math]a=a^*[/math]) consists of real numbers.
  • The spectral radius of a normal element ([math]aa^*=a^*a[/math]) is equal to its norm.


Show Proof

We use the various results established above, and notably the rational calculus formula from Theorem 8.20, and the various results from Theorem 8.22:


(1) Assuming [math]a^*=a^{-1}[/math], we have the following norm computations:

[[math]] ||a||=\sqrt{||aa^*||}=\sqrt{1}=1 [[/math]]

[[math]] ||a^{-1}||=||a^*||=||a||=1 [[/math]]

Now if we denote by [math]D[/math] the unit disk, we obtain from this:

[[math]] ||a||=1\implies\sigma(a)\subset D [[/math]]

[[math]] ||a^{-1}||=1\implies\sigma(a^{-1})\subset D [[/math]]

On the other hand, by using the rational function [math]f(z)=z^{-1}[/math], we have:

[[math]] \sigma(a^{-1})\subset D\implies \sigma(a)\subset D^{-1} [[/math]]

Now by putting everything together we obtain, as desired:

[[math]] \sigma(a)\subset D\cap D^{-1}=\mathbb T [[/math]]

(2) This follows by using the result (1), just established above, and Theorem 8.20, with the following rational function, depending on a parameter [math]t\in\mathbb R[/math]:

[[math]] f(z)=\frac{z+it}{z-it} [[/math]]

Indeed, for [math]t \gt \gt 0[/math] the element [math]f(a)[/math] is well-defined, and we have:

[[math]] \begin{eqnarray*} \left(\frac{a+it}{a-it}\right)^* &=&\frac{(a+it)^*}{(a-it)^*}\\ &=&\frac{a-it}{a+it}\\ &=&\left(\frac{a+it}{a-it}\right)^{-1} \end{eqnarray*} [[/math]]


Thus the element [math]f(a)[/math] is a unitary, and by using (1) its spectrum is contained in [math]\mathbb T[/math]. We conclude from this that we have the following inclusion:

[[math]] f(\sigma(a))=\sigma(f(a))\subset\mathbb T [[/math]]

But this shows, by applying the inverse of [math]f[/math], that we have, as desired:

[[math]] \sigma(a)\subset f^{-1}(\mathbb T)=\mathbb R [[/math]]

(3) We already know that we have the inequality in one sense, [math]\rho(a)\leq ||a||[/math], and this for any [math]a\in A[/math]. For the reverse inequality, when [math]a[/math] is normal, we fix a number as follows:

[[math]] \rho \gt \rho(a) [[/math]]

We have then the following computation, with the convention that the integration over the circle [math]|z|=\rho[/math] is normalized, so that the integral of the constant function [math]1[/math] equals [math]1[/math]:

[[math]] \begin{eqnarray*} \int_{|z|=\rho}\frac{z^n}{z -a}\,dz &=&\int_{|z|=\rho}\sum_{k=0}^\infty z^{n-k-1}a^k\,dz\\ &=&\sum_{k=0}^\infty\left(\int_{|z|=\rho}z^{n-k-1}dz\right)a^k\\ &=&\sum_{k=0}^\infty\delta_{n,k+1}a^k\\ &=&a^{n-1} \end{eqnarray*} [[/math]]


Here we have used the following formula, with [math]m\in\mathbb Z[/math], whose proof is elementary:

[[math]] \int_{|z|=\rho}z^m\,dz=\delta_{m0} [[/math]]

By applying now the norm and taking [math]n[/math]-th roots we obtain from the above formula, modulo some elementary manipulations, the following estimate:

[[math]] \rho\geq\lim_{n\to\infty}||a^n||^{1/n} [[/math]]

Now recall that [math]\rho[/math] was by definition an arbitrary number satisfying [math]\rho \gt \rho(a)[/math]. Thus, we have obtained the following estimate, valid for any [math]a\in A[/math]:

[[math]] \rho(a)\geq\lim_{n\to\infty}||a^n||^{1/n} [[/math]]

In order to finish, we must prove that when [math]a[/math] is normal, this estimate implies the missing estimate, namely [math]\rho(a)\geq||a||[/math]. We can proceed in two steps, as follows:


\underline{Step 1}. In the case [math]a=a^*[/math] we have [math]||a^n||=||a||^n[/math] for any exponent of the form [math]n=2^k[/math], by using the [math]C^*[/math]-algebra condition [math]||aa^*||=||a||^2[/math], and by taking [math]n[/math]-th roots we get:

[[math]] \rho(a)\geq||a|| [[/math]]

Thus, we are done with the self-adjoint case, with the result [math]\rho(a)=||a||[/math].


\underline{Step 2}. In the general normal case [math]aa^*=a^*a[/math] we have [math]a^n(a^n)^*=(aa^*)^n[/math], and by using this, along with the result from Step 1, applied to [math]aa^*[/math], we obtain:

[[math]] \begin{eqnarray*} \rho(a) &\geq&\lim_{n\to\infty}||a^n||^{1/n}\\ &=&\sqrt{\lim_{n\to\infty}||a^n(a^n)^*||^{1/n}}\\ &=&\sqrt{\lim_{n\to\infty}||(aa^*)^n||^{1/n}}\\ &=&\sqrt{\rho(aa^*)}\\ &=&\sqrt{||a||^2}\\ &=&||a|| \end{eqnarray*} [[/math]]


Thus, we are led to the conclusion in the statement.
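The three statements can be illustrated with matrices, where everything is computable; a sketch assuming NumPy, with a unitary obtained from a QR decomposition:

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

# (2) A self-adjoint matrix has real spectrum
a = (B + B.conj().T) / 2
eig_a = np.linalg.eigvals(a)

# (1) A unitary matrix has its spectrum on the unit circle
u, _ = np.linalg.qr(B)
eig_u = np.linalg.eigvals(u)

# (3) a is self-adjoint, hence normal, so rho(a) = ||a||
rho_a = max(abs(eig_a))
norm_a = np.linalg.norm(a, 2)
```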

As a first comment, the spectral radius formula [math]\rho(a)=||a||[/math] does not hold in general, the simplest counterexample being the following non-normal matrix:

[[math]] M=\begin{pmatrix}0&1\\0&0\end{pmatrix} [[/math]]
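A quick numerical check of this counterexample, assuming NumPy: both eigenvalues of [math]M[/math] vanish, while its operator norm is [math]1[/math].

```python
import numpy as np

M = np.array([[0.0, 1.0],
              [0.0, 0.0]])

rho = max(abs(np.linalg.eigvals(M)))    # spectral radius: 0
norm_M = np.linalg.norm(M, 2)           # operator norm: 1
```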

As another comment, we can combine the formula [math]\rho(a)=||a||[/math] for normal elements with the formula [math]||aa^*||=||a||^2[/math], and we are led to the following statement:

Proposition

In a [math]C^*[/math]-algebra, the norm is given by

[[math]] ||a||=\sqrt{\sup\left\{|\lambda|\ \Big|\ aa^*-\lambda\notin A^{-1}\right\}} [[/math]]
and so is an algebraic quantity.


Show Proof

We have the following computation, using the condition [math]||aa^*||=||a||^2[/math], then the spectral radius formula for [math]aa^*[/math], and finally the definition of the spectral radius:

[[math]] \begin{eqnarray*} ||a|| &=&\sqrt{||aa^*||}\\ &=&\sqrt{\rho(aa^*)}\\ &=&\sqrt{\sup\left\{|\lambda|\ \Big|\ \lambda\in\sigma(aa^*)\right\}}\\ &=&\sqrt{\sup\left\{|\lambda|\ \Big|\ aa^*-\lambda\notin A^{-1}\right\}} \end{eqnarray*} [[/math]]


Thus, we are led to the conclusion in the statement.
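A numerical illustration, assuming NumPy: the operator norm of a matrix is recovered from the spectrum of [math]aa^*[/math] alone.

```python
import numpy as np

rng = np.random.default_rng(3)
a = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# ||a|| = sqrt(rho(aa*)): the norm is a purely spectral quantity
aa_star = a @ a.conj().T
norm_from_spectrum = np.sqrt(max(abs(np.linalg.eigvals(aa_star))))
norm_a = np.linalg.norm(a, 2)
```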

The above result is quite interesting, because it raises the possibility of axiomatizing the [math]C^*[/math]-algebras as being the Banach [math]*[/math]-algebras having the property that the formula in Proposition 8.28 defines a norm, which must satisfy the usual [math]C^*[/math]-algebra conditions. However, this is something rather philosophical, and we will not follow this path.


Good news, we are now in a position to prove a key result, namely:

Theorem (Gelfand)

Any commutative [math]C^*[/math]-algebra is of the form

[[math]] A=C(X) [[/math]]
with [math]X[/math] being a compact space, called the spectrum of [math]A[/math], and denoted

[[math]] X=Spec(A) [[/math]]
appearing as the space of Banach algebra characters [math]\chi :A\to\mathbb C[/math].


Show Proof

This can be deduced from our spectral theory results, as follows:


(1) Given a commutative [math]C^*[/math]-algebra [math]A[/math], we can define indeed [math]X[/math] to be the set of characters [math]\chi :A\to\mathbb C[/math], with the topology making continuous all the evaluation maps:

[[math]] ev_a:\chi\to\chi(a) [[/math]]

Then [math]X[/math] is a compact space, and [math]a\to ev_a[/math] is a morphism of algebras:

[[math]] ev:A\to C(X) [[/math]]

(2) We first prove that [math]ev[/math] is involutive. We use the following formula:

[[math]] a=\frac{a+a^*}{2}-i\cdot\frac{i(a-a^*)}{2} [[/math]]

Thus it is enough to prove the following equality, for self-adjoint elements [math]a[/math]:

[[math]] ev_{a^*}=ev_a^* [[/math]]

But this is the same as proving that [math]a=a^*[/math] implies that [math]ev_a[/math] is a real function, which is in turn true, because [math]ev_a(\chi)=\chi(a)[/math] is an element of [math]\sigma(a)[/math], contained in [math]\mathbb R[/math].


(3) Since [math]A[/math] is commutative, each element is normal, so [math]ev[/math] is isometric:

[[math]] ||ev_a|| =\rho(a) =||a|| [[/math]]

(4) It remains to prove that [math]ev[/math] is surjective. But this follows from the Stone-Weierstrass theorem, because [math]ev(A)[/math] is a closed subalgebra of [math]C(X)[/math], which is stable under conjugation by (2), and which separates the points.

As a first consequence of the Gelfand theorem, we can extend the rational calculus formula from Theorem 8.20, in the case of the normal elements, as follows:

Theorem

We have the “continuous functional calculus” formula

[[math]] \sigma(f(a))=f(\sigma(a)) [[/math]]
valid for any normal element [math]a\in A[/math], and any continuous function [math]f\in C(\sigma(a))[/math].


Show Proof

Since our element [math]a[/math] is normal, the [math]C^*[/math]-algebra [math] \lt a \gt [/math] that it generates is commutative, and the Gelfand theorem gives an identification as follows:

[[math]] \lt a \gt =C(X) [[/math]]

In order to compute [math]X[/math], observe that the map [math]X\to\sigma(a)[/math] given by evaluation at [math]a[/math] is bijective. Thus, we have an identification of compact spaces, as follows:

[[math]] X=\sigma(a) [[/math]]

As a conclusion, the Gelfand theorem provides us with an identification as follows:

[[math]] \lt a \gt =C(\sigma(a)) [[/math]]

Now given [math]f\in C(\sigma(a))[/math], we can define indeed an element [math]f(a)\in A[/math], with [math]f\to f(a)[/math] being a morphism of [math]C^*[/math]-algebras, and we have [math]\sigma(f(a))=f(\sigma(a))[/math], as claimed.
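For a normal matrix the continuous functional calculus amounts to applying [math]f[/math] to the eigenvalues, through the diagonalization; a sketch assuming NumPy, with a self-adjoint matrix and [math]f=\cos[/math]:

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
a = (B + B.conj().T) / 2                # self-adjoint, hence normal

# Diagonalize: a = U diag(lam) U*, with lam = sigma(a), real
lam, U = np.linalg.eigh(a)

# f(a) = U diag(f(lam)) U*, for a continuous function f on sigma(a)
f = np.cos
fa = U @ np.diag(f(lam)) @ U.conj().T

# sigma(f(a)) = f(sigma(a))
spec_fa = np.sort(np.linalg.eigvalsh(fa))
f_spec = np.sort(f(lam))
```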

The above result adds to a series of similar statements, namely Theorem 8.20, dealing with rational calculus, and the known holomorphic calculus in Banach algebras, briefly mentioned after Theorem 8.22. However, the story is not over here, because in certain special [math]C^*[/math]-algebras, such as the matrix algebras [math]M_N(\mathbb C)[/math], or more generally the so-called von Neumann algebras, we can, if we want, apply arbitrary measurable functions to the normal elements, and we still have [math]\sigma(f(a))=f(\sigma(a))[/math]. We will not get into this here.


As another important remark, the above result, or rather the formula [math] \lt a \gt =C(\sigma(a))[/math] from its proof, when applied to the normal operators [math]T\in B(H)[/math], is more or less the spectral theorem for such operators. Once again, we will not get into this here.


As a last topic, let us discuss now the GNS representation theorem, providing us with embeddings [math]A\subset B(H)[/math]. We will need some more spectral theory, as follows:

Proposition

For a normal element [math]a\in A[/math], the following are equivalent:

  • [math]a[/math] is positive, in the sense that [math]\sigma(a)\subset[0,\infty)[/math].
  • [math]a=b^2[/math], for some [math]b\in A[/math] satisfying [math]b=b^*[/math].
  • [math]a=cc^*[/math], for some [math]c\in A[/math].


Show Proof

This is something very standard, as follows:


[math](1)\implies(2)[/math] Since [math]a[/math] is normal, we can use Theorem 8.30, and set [math]b=\sqrt{a}[/math].


[math](2)\implies(3)[/math] This is trivial, because we can set [math]c=b[/math].


[math](3)\implies(1)[/math] We proceed by contradiction. By multiplying [math]c[/math] by a suitable element of [math] \lt cc^* \gt [/math], we are led to the existence of an element [math]d\neq0[/math] satisfying [math]-dd^*\geq0[/math]. By writing now [math]d=x+iy[/math] with [math]x=x^*,y=y^*[/math] we have:

[[math]] dd^*+d^*d=2(x^2+y^2)\geq0 [[/math]]

Thus [math]d^*d=2(x^2+y^2)+(-dd^*)\geq0[/math], as a sum of positive elements. But this contradicts the elementary fact that [math]\sigma(dd^*),\sigma(d^*d)[/math] must coincide outside [math]\{0\}[/math], that we know from Proposition 8.19, because [math]-dd^*\geq0[/math] and [math]d^*d\geq0[/math] would then force [math]dd^*=0[/math], and so [math]d=0[/math].
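The implications [math](3)\implies(1)[/math] and [math](1)\implies(2)[/math] can be seen numerically for matrices; a sketch assuming NumPy, with a clip guarding against tiny negative eigenvalues from rounding:

```python
import numpy as np

rng = np.random.default_rng(5)
c = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# p = cc* is self-adjoint, with spectrum in [0, infinity)
p = c @ c.conj().T
lam, U = np.linalg.eigh(p)

# A self-adjoint square root b = sqrt(p), with b^2 = p
b = U @ np.diag(np.sqrt(np.clip(lam, 0, None))) @ U.conj().T
```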

Here is now the GNS representation theorem for the [math]C^*[/math]-algebras, due to Gelfand, Naimark and Segal, along with the idea of the proof:

Theorem (GNS theorem)

Let [math]A[/math] be a [math]C^*[/math]-algebra.

  • [math]A[/math] appears as a closed [math]*[/math]-subalgebra [math]A\subset B(H)[/math], for some Hilbert space [math]H[/math].
  • When [math]A[/math] is separable (usually the case), [math]H[/math] can be chosen to be separable.
  • When [math]A[/math] is finite dimensional, [math]H[/math] can be chosen to be finite dimensional.


Show Proof

This is something quite tricky, the idea being as follows:


(1) Let us first discuss the commutative case, [math]A=C(X)[/math]. Our claim here is that if we pick a probability measure with full support on [math]X[/math], we have an embedding as follows:

[[math]] C(X)\subset B(L^2(X))\quad,\quad f\to(g\to fg) [[/math]]

Indeed, given a function [math]f\in C(X)[/math], consider the operator [math]T_f(g)=fg[/math], acting on [math]H=L^2(X)[/math]. Observe that [math]T_f[/math] is indeed well-defined, and bounded as well, because:

[[math]] ||fg||_2 =\sqrt{\int_X|f(x)|^2|g(x)|^2dx} \leq||f||_\infty||g||_2 [[/math]]

The application [math]f\to T_f[/math] being linear, multiplicative, involutive, continuous, and injective as well, we obtain in this way a [math]C^*[/math]-algebra embedding [math]C(X)\subset B(H)[/math], as claimed.
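Discretizing the interval turns the multiplication operator [math]T_f[/math] into a diagonal matrix, and the equality [math]||T_f||=||f||_\infty[/math], which holds when the measure has full support, becomes visible; a sketch assuming NumPy:

```python
import numpy as np

# Discretize X = [0, 1] with the uniform measure; L^2(X) becomes C^n,
# and the multiplication operator T_f(g) = fg becomes a diagonal matrix
n = 500
x = np.linspace(0, 1, n)
f = np.sin(np.pi * x) + 0.5

T_f = np.diag(f)

# The operator norm of T_f is the sup norm of f
op_norm = np.linalg.norm(T_f, 2)
sup_norm = np.max(np.abs(f))
```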


(2) In general, we can use a similar idea, with the positivity issues being taken care of by Proposition 8.31. Indeed, assuming that a linear form [math]\varphi:A\to\mathbb C[/math] has suitable positivity properties, making it analogous to the integration functionals [math]\int_X:A\to\mathbb C[/math] from the commutative case, we can define a scalar product on [math]A[/math], by the following formula:

[[math]] \lt a,b \gt =\varphi(ab^*) [[/math]]

By completing we obtain a Hilbert space [math]H[/math], and we have an embedding as follows:

[[math]] A\subset B(H)\quad,\quad a\to(b\to ab) [[/math]]

Thus we obtain the assertion (1), and a careful examination of the construction [math]A\to H[/math], outlined above, shows that the assertions (2,3) are in fact proved as well.
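To make the construction concrete, here is a minimal GNS sketch, assuming NumPy, for [math]A=M_2(\mathbb C)[/math] with the trace state [math]\varphi(a)=tr(a)/2[/math]: the scalar product [math] \lt a,b \gt =\varphi(ab^*)[/math] makes [math]A[/math] a [math]4[/math]-dimensional Hilbert space, on which [math]A[/math] acts by left multiplication, and this representation is a [math]*[/math]-morphism, and is isometric.

```python
import numpy as np

rng = np.random.default_rng(6)

def phi(a):                          # the trace state on A = M_2(C)
    return np.trace(a) / 2

def inner(a, b):                     # <a, b> = phi(a b*)
    return phi(a @ b.conj().T)

# Orthonormal basis of H = A: rescaled matrix units sqrt(2) e_ij
on = []
for i in range(2):
    for j in range(2):
        e = np.zeros((2, 2), dtype=complex)
        e[i, j] = np.sqrt(2)
        on.append(e)

def pi(a):
    """4x4 matrix of the left-multiplication operator b -> ab on H."""
    return np.array([[inner(a @ on[l], on[k]) for l in range(4)]
                     for k in range(4)])

a = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
b = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
```

The checks below verify that [math]\pi[/math] is multiplicative, [math]*[/math]-preserving, and isometric, which is the content of assertion (1) in this finite dimensional case.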

There are of course many other things that can be said about the bounded operators and the operator algebras, but for our purposes here, the above material, and especially the Gelfand theorem, will be basically all that we will need, in what follows. For more on all this, we refer as usual to our favorite analysis authors, namely Rudin [6] and Lax [7]. And for even more, this time in relation with physics, go with Connes [8].


General references

Banica, Teo (2024). "Linear algebra and group theory". arXiv:2206.09283 [math.CO].

References

  1. P.A.M. Dirac, Principles of quantum mechanics, Oxford Univ. Press (1930).
  2. J. von Neumann, Mathematical foundations of quantum mechanics, Princeton Univ. Press (1955).
  3. H. Weyl, The theory of groups and quantum mechanics, Princeton Univ. Press (1931).
  4. M. Kumar, Quantum: Einstein, Bohr, and the great debate about the nature of reality, Norton (2009).
  5. D.J. Griffiths, Introduction to elementary particles, Wiley (2020).
  6. W. Rudin, Real and complex analysis, McGraw-Hill (1966).
  7. P. Lax, Functional analysis, Wiley (2002).
  8. A. Connes, Noncommutative geometry, Academic Press (1994).