Rotation groups

[math] \newcommand{\mathds}{\mathbb}[/math]

10a. Rotation groups

We have seen that there are many interesting examples of finite groups [math]G[/math], which usually appear as groups of unitary matrices, [math]G\subset U_N[/math]. In this chapter we discuss similar questions, in the continuous case. Things here are quite tricky, and can quickly escalate into complicated mathematics, and we have several schools of thought, as follows:


(1) Pure mathematicians are usually interested in classifying everything, and for the purpose of classifying the continuous groups [math]G\subset U_N[/math], a good method is that of looking at the tangent space at the unit, [math]\mathfrak g=T_1(G)[/math], called the Lie algebra of [math]G[/math]. This reduces the classification problem to a linear algebra problem, namely the classification of the Lie algebras [math]\mathfrak g[/math], and this latter problem can indeed be solved. Which is very nice.


(2) Applied mathematicians and physicists, on their side, already know what the groups [math]G\subset U_N[/math] that they are interested in are, and so are mildly enthusiastic about deep classification results. What they are interested in, however, are “tools” in order to deal with the groups [math]G\subset U_N[/math] that they have in mind. And here, surely the Lie algebra [math]\mathfrak g[/math] can be of help, but there are many other things, that can be of help too.


(3) To be more precise, a very efficient tool in order to deal with the groups [math]G\subset U_N[/math] is representation theory, with a touch of probability, and there is a whole string of results here, which are somehow rival to the standard Lie algebra theory, that can be developed, including the theory of the Haar measure, then Peter-Weyl theory, then Tannakian duality, then Brauer algebras and easiness, and then Weingarten calculus.


(4) So, this is the situation, at least at the beginner level, with two ways to be followed, on one hand the Lie algebra way, linearizing the group in the simplest way, by looking at the tangent space at the unit, [math]\mathfrak g=T_1(G)[/math], and on the other hand with the representation theory way, linearizing the group in a more complicated, yet equally natural way, via representation theory invariants, such as the associated Brauer algebras.


In this book we will rather follow the representation theory way, with this being a personal choice. Let me mention too that at the advanced level the theory of Lie algebras and Brauer algebras is more or less the same thing, so in the end, things are fine.


This being said, and before getting started, some references too. Algebra in general is a quite wide topic, and for the basics, of all kinds, you have the book of Lang [1]. Then, speaking algebra, you definitely need to learn some algebraic geometry, which is something extremely beautiful, useful and classical, with standard references here being the books of Shafarevich [2] and Harris [3]. And finally, in what regards groups, and various methods in order to deal with them, sometimes complementary to what we will be doing here, good references are the books of Humphreys [4] and Serre [5].


Back to work now, we will be mainly interested in the unitary group [math]U_N[/math] itself, in its real version, which is the orthogonal group [math]O_N[/math], and in various technical versions of these basic groups [math]O_N,U_N[/math]. So, let us start with some reminders, regarding [math]O_N,U_N[/math]:

Theorem

We have the following results:

  • The rotations of [math]\mathbb R^N[/math] form the orthogonal group [math]O_N[/math], which is given by:
    [[math]] O_N=\left\{U\in M_N(\mathbb R)\Big|U^t=U^{-1}\right\} [[/math]]
  • The rotations of [math]\mathbb C^N[/math] form the unitary group [math]U_N[/math], which is given by:
    [[math]] U_N=\left\{U\in M_N(\mathbb C)\Big|U^*=U^{-1}\right\} [[/math]]

In addition, we can restrict the attention to the rotations of the corresponding spheres.


Show Proof

This is something that we already know, the idea being as follows:


(1) We have seen in chapter 1 that a linear map [math]T:\mathbb R^N\to\mathbb R^N[/math], written as [math]T(x)=Ux[/math] with [math]U\in M_N(\mathbb R)[/math], is a rotation, in the sense that it preserves the distances and the angles, precisely when the associated matrix [math]U[/math] is orthogonal, in the following sense:

[[math]] U^t=U^{-1} [[/math]]

Thus, we obtain the result. As for the last assertion, this is clear as well, because an isometry of [math]\mathbb R^N[/math] is the same as an isometry of the unit sphere [math]S^{N-1}_\mathbb R\subset\mathbb R^N[/math].


(2) We have seen in chapter 3 that a linear map [math]T:\mathbb C^N\to\mathbb C^N[/math], written as [math]T(x)=Ux[/math] with [math]U\in M_N(\mathbb C)[/math], is a rotation, in the sense that it preserves the distances and the scalar products, precisely when the associated matrix [math]U[/math] is unitary, in the following sense:

[[math]] U^*=U^{-1} [[/math]]

Thus, we obtain the result. As for the last assertion, this is clear as well, because an isometry of [math]\mathbb C^N[/math] is the same as an isometry of the unit sphere [math]S^{N-1}_\mathbb C\subset\mathbb C^N[/math].

In order to introduce some further continuous groups [math]G\subset U_N[/math], we will need:

Proposition

We have the following results:

  • For an orthogonal matrix [math]U\in O_N[/math] we have [math]\det U\in\{\pm1\}[/math].
  • For a unitary matrix [math]U\in U_N[/math] we have [math]\det U\in\mathbb T[/math].


Show Proof

This is clear from the equations defining [math]O_N,U_N[/math], as follows:


(1) We have indeed the following implications:

[[math]] \begin{eqnarray*} U\in O_N &\implies&U^t=U^{-1}\\ &\implies&\det U^t=\det U^{-1}\\ &\implies&\det U=(\det U)^{-1}\\ &\implies&\det U\in\{\pm1\} \end{eqnarray*} [[/math]]


(2) We have indeed the following implications:

[[math]] \begin{eqnarray*} U\in U_N &\implies&U^*=U^{-1}\\ &\implies&\det U^*=\det U^{-1}\\ &\implies&\overline{\det U}=(\det U)^{-1}\\ &\implies&\det U\in\mathbb T \end{eqnarray*} [[/math]]


Here we have used the fact that [math]\bar{z}=z^{-1}[/math] means [math]z\bar{z}=1[/math], and so [math]z\in\mathbb T[/math].
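As a quick numerical illustration of the above, here is a pure-Python check of the two determinant statements, on a sample rotation, symmetry and unitary matrix; the particular matrices and the helper `det2` are just for illustration, not part of the text:

```python
import cmath, math

def det2(m):
    # determinant of a 2x2 matrix, given as a pair of rows
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

t = 0.7
rotation = ((math.cos(t), -math.sin(t)), (math.sin(t), math.cos(t)))   # in O_2
symmetry = ((math.cos(t), math.sin(t)), (math.sin(t), -math.cos(t)))   # in O_2
unitary = ((cmath.exp(1j * t), 0), (0, cmath.exp(2j * t)))             # in U_2

print(round(det2(rotation), 10))       # 1.0, so det lies in {-1, 1}
print(round(det2(symmetry), 10))       # -1.0, so det lies in {-1, 1}
print(round(abs(det2(unitary)), 10))   # 1.0, so det lies on the circle T
```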

We can now introduce the subgroups [math]SO_N\subset O_N[/math] and [math]SU_N\subset U_N[/math], as being the subgroups consisting of the rotations which preserve the orientation, as follows:

Theorem

The following are groups of matrices,

[[math]] SO_N=\left\{U\in O_N\Big|\det U=1\right\} [[/math]]

[[math]] SU_N=\left\{U\in U_N\Big|\det U=1\right\} [[/math]]
consisting of the rotations which preserve the orientation.


Show Proof

The fact that we have indeed groups follows from the properties of the determinant, or from the property of preserving the orientation, which is clear as well.

Summarizing, we have constructed so far 4 continuous groups of matrices, consisting of various rotations, with inclusions between them, as follows:

[[math]] \xymatrix@R=50pt@C=50pt{ SU_N\ar[r]&U_N\\ SO_N\ar[u]\ar[r]&O_N\ar[u]} [[/math]]

Observe that this is an intersection diagram, in the sense that:

[[math]] SO_N=SU_N\cap O_N [[/math]]

As an illustration, let us work out what happens at [math]N=1,2[/math]. At [math]N=1[/math] the situation is quite trivial, and we obtain very simple groups, as follows:

Proposition

The basic continuous groups at [math]N=1[/math], namely

[[math]] \xymatrix@R=45pt@C=45pt{ SU_1\ar[r]&U_1\\ SO_1\ar[u]\ar[r]&O_1\ar[u]} [[/math]]
are the following groups of complex numbers,

[[math]] \xymatrix@R=45pt@C=45pt{ \{1\}\ar[r]&\mathbb T\\ \{1\}\ar[u]\ar[r]&\{\pm1\}\ar[u]} [[/math]]
or, equivalently, are the following cyclic groups,

[[math]] \xymatrix@R=45pt@C=45pt{ \mathbb Z_1\ar[r]&\mathbb Z_\infty\\ \mathbb Z_1\ar[u]\ar[r]&\mathbb Z_2\ar[u]} [[/math]]
with the convention that [math]\mathbb Z_s[/math] is the group of [math]s[/math]-th roots of unity.


Show Proof

This is clear from definitions, because for a [math]1\times1[/math] matrix the unitarity condition reads [math]\bar{U}=U^{-1}[/math], and so [math]U\in\mathbb T[/math], and this gives all the results.

At [math]N=2[/math] now, let us first discuss the real case. The result here is as follows:

Theorem

We have the following results:

  • [math]SO_2[/math] is the group of usual rotations in the plane, which are given by:
    [[math]] R_t=\begin{pmatrix}\cos t&-\sin t\\ \sin t&\cos t\end{pmatrix} [[/math]]
  • [math]O_2[/math] consists in addition of the usual symmetries in the plane, given by:
    [[math]] S_t=\begin{pmatrix}\cos t&\sin t\\ \sin t&-\cos t\end{pmatrix} [[/math]]
  • Abstractly speaking, we have isomorphisms as follows:
    [[math]] SO_2\simeq\mathbb T\quad,\quad O_2=\mathbb T\rtimes\mathbb Z_2 [[/math]]
  • When discretizing all this, by replacing the unit circle [math]\mathbb T\subset\mathbb R^2[/math] by the regular [math]N[/math]-gon, the latter isomorphism discretizes as [math]D_N=\mathbb Z_N\rtimes\mathbb Z_2[/math].


Show Proof

This follows from some elementary computations, as follows:


(1) The first assertion is clear, because only the rotations of the plane in the usual sense preserve the orientation. As for the formula of [math]R_t[/math], this is something that we already know, from chapter 1, obtained by computing [math]R_t\binom{1}{0}[/math] and [math]R_t\binom{0}{1}[/math].


(2) The first assertion is clear, because rotations left aside, we are left with the symmetries of the plane, in the usual sense. As for the formula of [math]S_t[/math], this is something that we basically know too, obtained by computing [math]S_t\binom{1}{0}[/math] and [math]S_t\binom{0}{1}[/math].


(3) The first assertion is clear, because the angles [math]t\in\mathbb R[/math], taken as usual modulo [math]2\pi[/math], form the group [math]\mathbb T[/math]. As for the second assertion, the proof here is similar to the proof of the crossed product decomposition [math]D_N=\mathbb Z_N\rtimes\mathbb Z_2[/math] for the dihedral groups.


(4) This is something more speculative, the idea here being that the isomorphism [math]O_2=\mathbb T\rtimes\mathbb Z_2[/math] appears from [math]D_N=\mathbb Z_N\rtimes\mathbb Z_2[/math] by taking the [math]N\to\infty[/math] limit.
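The crossed product mechanism in (3) can also be seen by direct computation: two symmetries compose to a rotation, and rotating a symmetry gives another symmetry. A small pure-Python sketch, with arbitrary sample angles; the helpers `mat_mul` and `close` are mine, not from the text:

```python
import math

def mat_mul(A, B):
    # product of two 2x2 matrices, given as pairs of rows
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)) for i in range(2))

def R(t):
    # rotation of angle t
    return ((math.cos(t), -math.sin(t)), (math.sin(t), math.cos(t)))

def S(t):
    # symmetry of parameter t
    return ((math.cos(t), math.sin(t)), (math.sin(t), -math.cos(t)))

def close(A, B, eps=1e-9):
    return all(abs(A[i][j] - B[i][j]) < eps for i in range(2) for j in range(2))

s, t = 0.3, 1.1
print(close(mat_mul(R(s), R(t)), R(s + t)))   # True: rotations compose additively
print(close(mat_mul(S(s), S(t)), R(s - t)))   # True: two symmetries give a rotation
print(close(mat_mul(R(s), S(t)), S(s + t)))   # True: rotating a symmetry gives a symmetry
```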

In general, the structure of [math]O_N[/math] and [math]SO_N[/math], and the relation between them, is far more complicated than what happens at [math]N=1,2[/math]. We will be back to this later.

10b. Pauli matrices

Moving forward, let us keep working out what happens at [math]N=2[/math], but this time with a study in the complex case. We first have here the following key result:

Theorem

We have the following formula,

[[math]] SU_2=\left\{\begin{pmatrix}a&b\\ -\bar{b}&\bar{a}\end{pmatrix}\ \Big|\ |a|^2+|b|^2=1\right\} [[/math]]
which makes [math]SU_2[/math] isomorphic to the unit sphere [math]S^1_\mathbb C\subset\mathbb C^2[/math].


Show Proof

Consider indeed an arbitrary [math]2\times 2[/math] matrix, written as follows:

[[math]] U=\begin{pmatrix}a&b\\ c&d\end{pmatrix} [[/math]]

Assuming that we have [math]\det U=1[/math], the inverse must be given by:

[[math]] U^{-1}=\begin{pmatrix}d&-b\\ -c&a\end{pmatrix} [[/math]]

On the other hand, assuming [math]U\in U_2[/math], the inverse must be the adjoint:

[[math]] U^{-1}=\begin{pmatrix}\bar{a}&\bar{c}\\ \bar{b}&\bar{d}\end{pmatrix} [[/math]]

We are therefore led to the following equations, for the matrix entries:

[[math]] d=\bar{a}\quad,\quad c=-\bar{b} [[/math]]

Thus our matrix must be of the following special form:

[[math]] U=\begin{pmatrix}a&b\\ -\bar{b}&\bar{a}\end{pmatrix} [[/math]]

Moreover, since the determinant is 1, we must have, as stated:

[[math]] |a|^2+|b|^2=1 [[/math]]

Thus, we are done with one inclusion. As for the converse, this is clear, the matrices in the statement being unitaries, and of determinant 1, and so being elements of [math]SU_2[/math]. Finally, regarding the last assertion, recall that the unit sphere [math]S^1_\mathbb C\subset\mathbb C^2[/math] is given by:

[[math]] S^1_\mathbb C=\left\{(a,b)\ \Big|\ |a|^2+|b|^2=1\right\} [[/math]]

Thus, we have an isomorphism of compact spaces, as follows:

[[math]] SU_2\simeq S^1_\mathbb C\quad,\quad \begin{pmatrix}a&b\\ -\bar{b}&\bar{a}\end{pmatrix}\to (a,b) [[/math]]

We have therefore proved our theorem.
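As a sanity check on the above parametrization, one can verify numerically that a matrix of this special form is indeed a unitary of determinant [math]1[/math]. A pure-Python sketch, with arbitrary sample values of [math]a,b[/math] satisfying [math]|a|^2+|b|^2=1[/math]:

```python
import cmath

def mul(A, B):
    # product of two 2x2 complex matrices, given as pairs of rows
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)) for i in range(2))

def adj(A):
    # adjoint (conjugate transpose)
    return tuple(tuple(A[j][i].conjugate() for j in range(2)) for i in range(2))

# sample parameters, with |a|^2 + |b|^2 = 0.36 + 0.64 = 1
a = 0.6 * cmath.exp(0.4j)
b = 0.8 * cmath.exp(-1.2j)
U = ((a, b), (-b.conjugate(), a.conjugate()))

P = mul(adj(U), U)   # must be the identity, so U is unitary
print(all(abs(P[i][j] - (1 if i == j else 0)) < 1e-9 for i in range(2) for j in range(2)))  # True
detU = U[0][0] * U[1][1] - U[0][1] * U[1][0]
print(abs(detU - 1) < 1e-9)   # True: det U = 1, so U is in SU_2
```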

Regarding now the unitary group [math]U_2[/math], the result here is similar, as follows:

Theorem

We have the following formula,

[[math]] U_2=\left\{d\begin{pmatrix}a&b\\ -\bar{b}&\bar{a}\end{pmatrix}\ \Big|\ |a|^2+|b|^2=1,|d|=1\right\} [[/math]]
which makes [math]U_2[/math] appear as a quotient of a compact space, as follows,

[[math]] S^1_\mathbb C\times\mathbb T\to U_2 [[/math]]
but with this parametrization being no longer bijective.


Show Proof

In one sense, this is clear from Theorem 10.6, because we have:

[[math]] |d|=1\implies dSU_2\subset U_2 [[/math]]

In the other sense, let us pick an arbitrary matrix [math]U\in U_2[/math]. We have then:

[[math]] \begin{eqnarray*} |\det(U)|^2 &=&\det(U)\overline{\det(U)}\\ &=&\det(U)\det(U^*)\\ &=&\det(UU^*)\\ &=&\det(1)\\ &=&1 \end{eqnarray*} [[/math]]


Consider now the following complex number, defined up to a sign choice:

[[math]] d=\sqrt{\det U} [[/math]]

We know from Proposition 10.2 that we have [math]|d|=1[/math]. Thus the rescaled matrix [math]V=U/d[/math] is unitary, [math]V\in U_2[/math]. As for the determinant of this matrix, this is given by:

[[math]] \begin{eqnarray*} \det(V) &=&\det(U/d)\\ &=&\det(U)/d^2\\ &=&\det(U)/\det(U)\\ &=&1 \end{eqnarray*} [[/math]]


Thus we have [math]V\in SU_2[/math], and so we can write, with [math]|a|^2+|b|^2=1[/math]:

[[math]] V=\begin{pmatrix}a&b\\ -\bar{b}&\bar{a}\end{pmatrix} [[/math]]

Thus the matrix [math]U=dV[/math] appears as in the statement. Finally, observe that the result that we have just proved provides us with a quotient map as follows:

[[math]] S^1_\mathbb C\times\mathbb T\to U_2\quad,\quad ((a,b),d)\to d\begin{pmatrix}a&b\\ -\bar{b}&\bar{a}\end{pmatrix} [[/math]]

However, the parametrization is no longer bijective, because when we globally switch signs, the element [math]((-a,-b),-d)[/math] produces the same element of [math]U_2[/math].
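The decomposition in the above proof can be illustrated numerically, in pure Python. Here the sample unitary is built as [math]d[/math] times an [math]SU_2[/math] matrix, and the factors are then recovered, up to the global sign ambiguity discussed above; the particular values are arbitrary:

```python
import cmath

# build a sample unitary as d times an SU_2 matrix
a = 0.6 * cmath.exp(0.5j)
b = 0.8 * cmath.exp(-0.3j)           # |a|^2 + |b|^2 = 1
d = cmath.exp(1.7j)                  # |d| = 1
U = tuple(tuple(d * x for x in row) for row in ((a, b), (-b.conjugate(), a.conjugate())))

detU = U[0][0] * U[1][1] - U[0][1] * U[1][0]
print(abs(abs(detU) - 1) < 1e-9)     # True: |det U| = 1

dd = cmath.sqrt(detU)                # recovered d, defined up to sign
V = tuple(tuple(x / dd for x in row) for row in U)
# V has the SU_2 shape ((a, b), (-conj(b), conj(a))), up to the global sign
print(abs(V[1][1] - V[0][0].conjugate()) < 1e-9)   # True
print(abs(V[1][0] + V[0][1].conjugate()) < 1e-9)   # True
```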

Let us record now a few more results regarding [math]SU_2,U_2[/math], which are key groups in mathematics and physics. First, we have the following reformulation of Theorem 10.6:

Theorem

We have the formula

[[math]] SU_2=\left\{\begin{pmatrix}x+iy&z+it\\ -z+it&x-iy\end{pmatrix}\ \Big|\ x^2+y^2+z^2+t^2=1\right\} [[/math]]
which makes [math]SU_2[/math] isomorphic to the unit real sphere [math]S^3_\mathbb R\subset\mathbb R^4[/math].


Show Proof

We recall from Theorem 10.6 that we have:

[[math]] SU_2=\left\{\begin{pmatrix}a&b\\ -\bar{b}&\bar{a}\end{pmatrix}\ \Big|\ |a|^2+|b|^2=1\right\} [[/math]]

Now let us write our parameters [math]a,b\in\mathbb C[/math], which belong to the complex unit sphere [math]S^1_\mathbb C\subset\mathbb C^2[/math], in terms of their real and imaginary parts, as follows:

[[math]] a=x+iy\quad,\quad b=z+it [[/math]]

In terms of [math]x,y,z,t\in\mathbb R[/math], our formula for a generic matrix [math]U\in SU_2[/math] becomes the one in the statement. As for the condition to be satisfied by the parameters [math]x,y,z,t\in\mathbb R[/math], this comes from the condition [math]|a|^2+|b|^2=1[/math] to be satisfied by [math]a,b\in\mathbb C[/math], which reads:

[[math]] x^2+y^2+z^2+t^2=1 [[/math]]

Thus, we are led to the conclusion in the statement. Regarding now the last assertion, recall that the unit sphere [math]S^3_\mathbb R\subset\mathbb R^4[/math] is given by:

[[math]] S^3_\mathbb R=\left\{(x,y,z,t)\ \Big|\ x^2+y^2+z^2+t^2=1\right\} [[/math]]

Thus, we have an isomorphism of compact spaces, as follows:

[[math]] SU_2\simeq S^3_\mathbb R\quad,\quad \begin{pmatrix}x+iy&z+it\\ -z+it&x-iy\end{pmatrix}\to(x,y,z,t) [[/math]]

We have therefore proved our theorem.

As a philosophical comment here, the above parametrization of [math]SU_2[/math] is something very nice, because the parameters [math](x,y,z,t)[/math] range now over the sphere of space-time. Thus, we are probably doing some kind of physics here. More on this later.


Regarding now the group [math]U_2[/math], we have here a similar result, as follows:

Theorem

We have the following formula,

[[math]] U_2=\left\{(p+iq)\begin{pmatrix}x+iy&z+it\\ -z+it&x-iy\end{pmatrix}\ \Big|\ x^2+y^2+z^2+t^2=1,\ p^2+q^2=1\right\} [[/math]]
which makes [math]U_2[/math] appear as a quotient of a compact space, as follows,

[[math]] S^3_\mathbb R\times S^1_\mathbb R\to U_2 [[/math]]
but with this parametrization being no longer bijective.


Show Proof

We recall from Theorem 10.7 that we have:

[[math]] U_2=\left\{d\begin{pmatrix}a&b\\ -\bar{b}&\bar{a}\end{pmatrix}\ \Big|\ |a|^2+|b|^2=1,\ |d|=1\right\} [[/math]]

Now let us write our parameters [math]a,b\in\mathbb C[/math], which belong to the complex unit sphere [math]S^1_\mathbb C\subset\mathbb C^2[/math], and [math]d\in\mathbb T[/math], in terms of their real and imaginary parts, as follows:

[[math]] a=x+iy\quad,\quad b=z+it\quad,\quad d=p+iq [[/math]]

In terms of these new parameters [math]x,y,z,t,p,q\in\mathbb R[/math], our formula for a generic matrix [math]U\in U_2[/math], that we established before, reads:

[[math]] U=(p+iq)\begin{pmatrix}x+iy&z+it\\ -z+it&x-iy\end{pmatrix} [[/math]]

As for the conditions to be satisfied by the parameters [math]x,y,z,t,p,q\in\mathbb R[/math], these come from the conditions [math]|a|^2+|b|^2=1[/math] and [math]|d|=1[/math] to be satisfied by [math]a,b,d\in\mathbb C[/math], which read:

[[math]] x^2+y^2+z^2+t^2=1\quad,\quad p^2+q^2=1 [[/math]]

Thus, we are led to the conclusion in the statement. Regarding now the last assertion, recall that the unit spheres [math]S^3_\mathbb R\subset\mathbb R^4[/math] and [math]S^1_\mathbb R\subset\mathbb R^2[/math] are given by:

[[math]] S^3_\mathbb R=\left\{(x,y,z,t)\ \Big|\ x^2+y^2+z^2+t^2=1\right\} [[/math]]

[[math]] S^1_\mathbb R=\left\{(p,q)\ \Big|\ p^2+q^2=1\right\} [[/math]]

Thus, we have a quotient map of compact spaces, as follows:

[[math]] S^3_\mathbb R\times S^1_\mathbb R\to U_2 [[/math]]

[[math]] ((x,y,z,t),(p,q))\to(p+iq)\begin{pmatrix}x+iy&z+it\\ -z+it&x-iy\end{pmatrix} [[/math]]

However, the parametrization is no longer bijective, because when we globally switch signs, the element [math]((-x,-y,-z,-t),(-p,-q))[/math] produces the same element of [math]U_2[/math].

Here is now another reformulation of our main result so far, regarding [math]SU_2[/math], obtained by further building on the parametrization from Theorem 10.8:

Theorem

We have the following formula,

[[math]] SU_2=\left\{xc_1+yc_2+zc_3+tc_4\ \Big|\ x^2+y^2+z^2+t^2=1\right\} [[/math]]
where [math]c_1,c_2,c_3,c_4[/math] are the Pauli matrices, given by:

[[math]] c_1=\begin{pmatrix}1&0\\ 0&1\end{pmatrix}\qquad,\qquad c_2=\begin{pmatrix}i&0\\ 0&-i\end{pmatrix} [[/math]]

[[math]] c_3=\begin{pmatrix}0&1\\ -1&0\end{pmatrix}\qquad,\qquad c_4=\begin{pmatrix}0&i\\ i&0\end{pmatrix} [[/math]]


Show Proof

We recall from Theorem 10.8 that the group [math]SU_2[/math] can be parametrized by the real sphere [math]S^3_\mathbb R\subset\mathbb R^4[/math], in the following way:

[[math]] SU_2=\left\{\begin{pmatrix}x+iy&z+it\\ -z+it&x-iy\end{pmatrix}\ \Big|\ x^2+y^2+z^2+t^2=1\right\} [[/math]]

Thus, the elements [math]U\in SU_2[/math] are precisely the matrices as follows, depending on parameters [math]x,y,z,t\in\mathbb R[/math] satisfying [math]x^2+y^2+z^2+t^2=1[/math]:

[[math]] U=x\begin{pmatrix}1&0\\ 0&1\end{pmatrix} +y\begin{pmatrix}i&0\\ 0&-i\end{pmatrix} +z\begin{pmatrix}0&1\\ -1&0\end{pmatrix}+ t\begin{pmatrix}0&i\\ i&0\end{pmatrix} [[/math]]

But this gives the formula for [math]SU_2[/math] in the statement.

The above result is often the most convenient one, when dealing with [math]SU_2[/math]. This is because the Pauli matrices have a number of remarkable properties, which are very useful when doing computations. These properties can be summarized as follows:

Theorem

The Pauli matrices multiply according to the formulae

[[math]] c_2^2=c_3^2=c_4^2=-1 [[/math]]

[[math]] c_2c_3=-c_3c_2=c_4 [[/math]]

[[math]] c_3c_4=-c_4c_3=c_2 [[/math]]

[[math]] c_4c_2=-c_2c_4=c_3 [[/math]]
they conjugate according to the following rules,

[[math]] c_1^*=c_1\ ,\ c_2^*=-c_2\ ,\ c_3^*=-c_3\ ,\ c_4^*=-c_4 [[/math]]
and they form an orthonormal basis of [math]M_2(\mathbb C)[/math], with respect to the scalar product

[[math]] \lt a,b \gt =tr(ab^*) [[/math]]
with [math]tr:M_2(\mathbb C)\to\mathbb C[/math] being the normalized trace of [math]2\times 2[/math] matrices, [math]tr=Tr/2[/math].


Show Proof

The first two assertions, regarding the multiplication and conjugation rules for the Pauli matrices, follow from some elementary computations. As for the last assertion, this follows by using these rules. Indeed, the fact that the Pauli matrices are pairwise orthogonal follows from computations of the following type, for [math]i\neq j[/math]:

[[math]] \lt c_i,c_j \gt =tr(c_ic_j^*) =tr(\pm c_ic_j) =tr(\pm c_k) =0 [[/math]]

As for the fact that the Pauli matrices have norm 1, this follows from:

[[math]] \lt c_i,c_i \gt =tr(c_ic_i^*) =tr(\pm c_i^2) =tr(c_1) =1 [[/math]]

Thus, we are led to the conclusion in the statement.
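All these rules can of course be verified by brute force. Here is a pure-Python check of the multiplication table and of the orthonormality with respect to [math] \lt a,b \gt =tr(ab^*)[/math]; the helper functions are mine, for illustration only:

```python
# the Pauli matrices c1..c4, as 2x2 complex matrices given by pairs of rows
c1 = ((1, 0), (0, 1))
c2 = ((1j, 0), (0, -1j))
c3 = ((0, 1), (-1, 0))
c4 = ((0, 1j), (1j, 0))

def mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)) for i in range(2))

def adj(A):
    return tuple(tuple(A[j][i].conjugate() for j in range(2)) for i in range(2))

def tr(A):
    # normalized trace, tr = Tr/2
    return (A[0][0] + A[1][1]) / 2

def neg(A):
    return tuple(tuple(-x for x in row) for row in A)

def eq(A, B):
    return all(abs(A[i][j] - B[i][j]) < 1e-12 for i in range(2) for j in range(2))

print(eq(mul(c2, c2), neg(c1)), eq(mul(c3, c3), neg(c1)), eq(mul(c4, c4), neg(c1)))  # True True True
print(eq(mul(c2, c3), c4), eq(mul(c3, c4), c2), eq(mul(c4, c2), c3))                 # True True True

# Gram matrix of c1..c4 with respect to <a,b> = tr(a b*): must be the identity
gram = [[tr(mul(a, adj(b))) for b in (c1, c2, c3, c4)] for a in (c1, c2, c3, c4)]
print(all(abs(gram[i][j] - (1 if i == j else 0)) < 1e-12 for i in range(4) for j in range(4)))  # True
```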

We should mention here that the Pauli matrices are cult objects in physics, due to the fact that they describe the spin of the electron. Remember maybe the discussion from the beginning of chapter 8, when we were talking about the wave functions [math]\psi:\mathbb R^3\to\mathbb C[/math] of these electrons, and of the Hilbert space [math]H=L^2(\mathbb R^3)[/math] needed for understanding their quantum mechanics. Well, that was only half of the story, with the other half coming from the fact that, a bit like our Earth spins around its axis, the electrons spin too. And it took scientists a lot of skill in order to understand the physics and mathematics of the spin, the conclusion being that the wave function space [math]H=L^2(\mathbb R^3)[/math] has to be enlarged with a copy of [math]K=\mathbb C^2[/math], as to take into account the spin, and with this spin being described by the Pauli matrices, in some appropriate, quantum mechanical sense.


As usual, we refer to Feynman [6], Griffiths [7] or Weinberg [8] for more on all this. And with the remark that the Pauli matrices are actually subject to several possible normalizations, depending on formalism, but let us not get into all this here.

10c. Euler-Rodrigues

Back to mathematics, let us discuss now the basic unitary groups in 3 or more dimensions. The situation here becomes fairly complicated, but it is possible to explicitly compute the rotation groups [math]SO_3[/math] and [math]O_3[/math], and explaining this result, due to Euler and Rodrigues, which is something non-trivial and very useful, will be our next goal.


The proof of the Euler-Rodrigues formula is something quite tricky. Let us start with the following construction, whose usefulness will become clear in a moment:

Proposition

The adjoint action [math]SU_2\curvearrowright M_2(\mathbb C)[/math], given by

[[math]] T_U(M)=UMU^* [[/math]]
leaves invariant the following real vector subspace of [math]M_2(\mathbb C)[/math],

[[math]] E=span_\mathbb R(c_1,c_2,c_3,c_4) [[/math]]
and we obtain in this way a group morphism [math]SU_2\to GL_4(\mathbb R)[/math].


Show Proof

We have two assertions to be proved, as follows:


(1) We must first prove that, with [math]E\subset M_2(\mathbb C)[/math] being the real vector space in the statement, we have the following implication:

[[math]] U\in SU_2,M\in E\implies UMU^*\in E [[/math]]

But this is clear from the multiplication rules for the Pauli matrices, from Theorem 10.11. Indeed, let us write our matrices [math]U,M[/math] as follows:

[[math]] U=xc_1+yc_2+zc_3+tc_4 [[/math]]

[[math]] M=ac_1+bc_2+cc_3+dc_4 [[/math]]

We know that the coefficients [math]x,y,z,t[/math] and [math]a,b,c,d[/math] are real, due to [math]U\in SU_2[/math] and [math]M\in E[/math]. The point now is that when computing [math]UMU^*[/math], by using the various rules from Theorem 10.11, we obtain a matrix of the same type, namely a combination of [math]c_1,c_2,c_3,c_4[/math], with real coefficients. Thus, we have [math]UMU^*\in E[/math], as desired.


(2) In order to conclude, let us identify [math]E\simeq\mathbb R^4[/math], by using the basis [math]c_1,c_2,c_3,c_4[/math]. The result found in (1) shows that we have a correspondence as follows:

[[math]] SU_2\to M_4(\mathbb R)\quad,\quad U\to (T_U)_{|E} [[/math]]

Now observe that for any [math]U\in SU_2[/math] and any [math]M\in M_2(\mathbb C)[/math] we have:

[[math]] T_{U^*}T_U(M)=U^*UMU^*U=M [[/math]]

Thus [math]T_{U^*}=T_U^{-1}[/math], and so the correspondence that we found can be written as:

[[math]] SU_2\to GL_4(\mathbb R)\quad,\quad U\to (T_U)_{|E} [[/math]]

But this is a group morphism, due to the following computation:

[[math]] T_UT_V(M)=UVMV^*U^*=T_{UV}(M) [[/math]]

Thus, we are led to the conclusion in the statement.
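The invariance of [math]E[/math] can be tested numerically, by decomposing [math]UMU^*[/math] back in the basis [math]c_1,c_2,c_3,c_4[/math] and checking that the coefficients come out real. A pure-Python sketch, with arbitrary sample coefficients; the helpers `lin` and `coeffs` are mine, using the orthonormality from Theorem 10.11:

```python
# the Pauli matrices, as pairs of rows
c1 = ((1, 0), (0, 1))
c2 = ((1j, 0), (0, -1j))
c3 = ((0, 1), (-1, 0))
c4 = ((0, 1j), (1j, 0))
basis = (c1, c2, c3, c4)

def mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)) for i in range(2))

def adj(A):
    return tuple(tuple(A[j][i].conjugate() for j in range(2)) for i in range(2))

def lin(co):
    # real linear combination co[0]*c1 + ... + co[3]*c4
    return tuple(tuple(sum(co[k] * basis[k][i][j] for k in range(4)) for j in range(2)) for i in range(2))

def coeffs(X):
    # coefficients in the orthonormal basis, via <X, c_i> = tr(X c_i^*)
    return [sum(mul(X, adj(c))[i][i] for i in range(2)) / 2 for c in basis]

x, y, z, t = 0.5, 0.5, 0.5, 0.5       # satisfies x^2+y^2+z^2+t^2 = 1, so U is in SU_2
U = lin((x, y, z, t))
M = lin((1.0, -2.0, 0.5, 3.0))        # a sample element of E, with real coefficients
out = coeffs(mul(mul(U, M), adj(U)))
print(all(abs(c.imag) < 1e-9 for c in out))   # True: UMU* stays in E
print(abs(out[0].real - 1.0) < 1e-9)          # True: the c1-coefficient tr(M) is preserved
```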

The point now, which makes the link with [math]SO_3[/math], and which will ultimately elucidate the structure of [math]SO_3[/math], is that Proposition 10.12 can be improved as follows:

Theorem

The adjoint action [math]SU_2\curvearrowright M_2(\mathbb C)[/math], given by

[[math]] T_U(M)=UMU^* [[/math]]
leaves invariant the following real vector subspace of [math]M_2(\mathbb C)[/math],

[[math]] F=span_\mathbb R(c_2,c_3,c_4) [[/math]]
and we obtain in this way a group morphism [math]SU_2\to SO_3[/math].


Show Proof

We can do this in several steps, as follows:


(1) Our first claim is that the group morphism [math]SU_2\to GL_4(\mathbb R)[/math] constructed in Proposition 10.12 is in fact a morphism [math]SU_2\to O_4[/math]. In order to prove this, recall the following formula, valid for any [math]U\in SU_2[/math], from the proof of Proposition 10.12:

[[math]] T_{U^*}=T_U^{-1} [[/math]]

We want to prove that the matrices [math]T_U\in GL_4(\mathbb R)[/math] are orthogonal, and in view of the above formula, it is enough to prove that we have:

[[math]] T_{U^*}=(T_U)^t [[/math]]

So, let us prove this. For any two matrices [math]M,N\in E[/math], we have:

[[math]] \begin{eqnarray*} \lt T_{U^*}(M),N \gt &=& \lt U^*MU,N \gt \\ &=&tr(U^*MUN)\\ &=&tr(MUNU^*) \end{eqnarray*} [[/math]]


On the other hand, we have as well the following formula:

[[math]] \begin{eqnarray*} \lt (T_U)^t(M),N \gt &=& \lt M,T_U(N) \gt \\ &=& \lt M,UNU^* \gt \\ &=&tr(MUNU^*) \end{eqnarray*} [[/math]]


Thus we have indeed [math]T_{U^*}=(T_U)^t[/math], which proves our [math]SU_2\to O_4[/math] claim.


(2) In order now to finish, recall that we have by definition [math]c_1=1[/math], as a matrix. Thus, the action of [math]SU_2[/math] on the vector [math]c_1\in E[/math] is given by:

[[math]] T_U(c_1)=Uc_1U^*=UU^*=1=c_1 [[/math]]

We conclude that [math]c_1\in E[/math] is invariant under [math]SU_2[/math], and by orthogonality the following subspace of [math]E[/math] must be invariant as well under the action of [math]SU_2[/math]:

[[math]] c_1^\perp=span_\mathbb R(c_2,c_3,c_4) [[/math]]

Now if we call this subspace [math]F[/math], and we identify [math]F\simeq\mathbb R^3[/math] by using the basis [math]c_2,c_3,c_4[/math], we obtain by restriction to [math]F[/math] a morphism of groups as follows:

[[math]] SU_2\to O_3 [[/math]]

But since this morphism is continuous and [math]SU_2[/math] is connected, its image must be connected too. Now since the target group decomposes as [math]O_3=SO_3\sqcup(-SO_3)[/math], and [math]1\in SU_2[/math] gets mapped to [math]1\in SO_3[/math], the whole image must lie inside [math]SO_3[/math], and we are done.

The above result is quite interesting, because we will see in a moment that the morphism [math]SU_2\to SO_3[/math] constructed there is surjective. Thus, we will have a way of parametrizing the elements [math]V\in SO_3[/math] by elements [math]U\in SU_2[/math], and so ultimately by parameters [math](x,y,z,t)\in S^3_\mathbb R[/math]. In order to work out all this, let us start with the following result, coming as a continuation of Proposition 10.12, independently of Theorem 10.13:

Proposition

With respect to the standard basis [math]c_1,c_2,c_3,c_4[/math] of the vector space [math]\mathbb R^4=span(c_1,c_2,c_3,c_4)[/math], the morphism [math]T:SU_2\to GL_4(\mathbb R)[/math] is given by:

[[math]] T_U=\begin{pmatrix} 1&0&0&0\\ 0&x^2+y^2-z^2-t^2&2(yz-xt)&2(xz+yt)\\ 0&2(xt+yz)&x^2+z^2-y^2-t^2&2(zt-xy)\\ 0&2(yt-xz)&2(xy+zt)&x^2+t^2-y^2-z^2 \end{pmatrix} [[/math]]
Thus, when looking at [math]T[/math] as a group morphism [math]SU_2\to O_4[/math], what we have in fact is a group morphism [math]SU_2\to O_3[/math], and even [math]SU_2\to SO_3[/math].


Show Proof

With notations from Proposition 10.12 and its proof, let us first look at the action [math]L:SU_2\curvearrowright\mathbb R^4[/math] by left multiplication, which is by definition given by:

[[math]] L_U(M)=UM [[/math]]

In order to compute the matrix of this action, let us write, as usual:

[[math]] U=xc_1+yc_2+zc_3+tc_4 [[/math]]

[[math]] M=ac_1+bc_2+cc_3+dc_4 [[/math]]

By using the multiplication formulae in Theorem 10.11, we obtain:

[[math]] \begin{eqnarray*} UM &=&(xc_1+yc_2+zc_3+tc_4)(ac_1+bc_2+cc_3+dc_4)\\ &=&(xa-yb-zc-td)c_1\\ &+&(xb+ya+zd-tc)c_2\\ &+&(xc-yd+za+tb)c_3\\ &+&(xd+yc-zb+ta)c_4 \end{eqnarray*} [[/math]]


We conclude that the matrix of the left action considered above is:

[[math]] L_U=\begin{pmatrix} x&-y&-z&-t\\ y&x&-t&z\\ z&t&x&-y\\ t&-z&y&x \end{pmatrix} [[/math]]

Similarly, let us look now at the action [math]R:SU_2\curvearrowright\mathbb R^4[/math] by right multiplication, which is by definition given by the following formula:

[[math]] R_U(M)=MU^* [[/math]]

In order to compute the matrix of this action, let us write, as before:

[[math]] U=xc_1+yc_2+zc_3+tc_4 [[/math]]

[[math]] M=ac_1+bc_2+cc_3+dc_4 [[/math]]

By using the multiplication formulae in Theorem 10.11, we obtain:

[[math]] \begin{eqnarray*} MU^* &=&(ac_1+bc_2+cc_3+dc_4)(xc_1-yc_2-zc_3-tc_4)\\ &=&(ax+by+cz+dt)c_1\\ &+&(-ay+bx-ct+dz)c_2\\ &+&(-az+bt+cx-dy)c_3\\ &+&(-at-bz+cy+dx)c_4 \end{eqnarray*} [[/math]]


We conclude that the matrix of the right action considered above is:

[[math]] R_U=\begin{pmatrix} x&y&z&t\\ -y&x&-t&z\\ -z&t&x&-y\\ -t&-z&y&x \end{pmatrix} [[/math]]

Now by composing, the matrix of the adjoint action in the statement is:

[[math]] \begin{eqnarray*} T_U &=&R_UL_U\\ &=&\begin{pmatrix} x&y&z&t\\ -y&x&-t&z\\ -z&t&x&-y\\ -t&-z&y&x \end{pmatrix} \begin{pmatrix} x&-y&-z&-t\\ y&x&-t&z\\ z&t&x&-y\\ t&-z&y&x \end{pmatrix}\\ &=&\begin{pmatrix} 1&0&0&0\\ 0&x^2+y^2-z^2-t^2&2(yz-xt)&2(xz+yt)\\ 0&2(xt+yz)&x^2+z^2-y^2-t^2&2(zt-xy)\\ 0&2(yt-xz)&2(xy+zt)&x^2+t^2-y^2-z^2 \end{pmatrix} \end{eqnarray*} [[/math]]


Thus, we have indeed the formula in the statement. As for the remaining assertions, these are all clear either from this formula, or from Theorem 10.13.
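The formula can be double-checked numerically: the [math]j[/math]-th column of [math]T_U[/math] must be the coordinates of [math]Uc_jU^*[/math] in the basis [math]c_1,c_2,c_3,c_4[/math], and the lower-right [math]3\times3[/math] block must be a rotation. A pure-Python sketch, on an arbitrary sample point of the sphere; the helpers are mine, for illustration:

```python
import math

c1 = ((1, 0), (0, 1))
c2 = ((1j, 0), (0, -1j))
c3 = ((0, 1), (-1, 0))
c4 = ((0, 1j), (1j, 0))
basis = (c1, c2, c3, c4)

def mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)) for i in range(2))

def adj(A):
    return tuple(tuple(A[j][i].conjugate() for j in range(2)) for i in range(2))

def coeffs(X):
    # coordinates of X in the orthonormal basis c1..c4, via <X, c_i> = tr(X c_i^*)
    return [sum(mul(X, adj(c))[i][i] for i in range(2)) / 2 for c in basis]

# a sample point of S^3, obtained by normalizing an arbitrary vector
v = (0.3, 1.1, -0.7, 0.4)
n = math.sqrt(sum(a * a for a in v))
x, y, z, t = (a / n for a in v)
U = tuple(tuple(x*c1[i][j] + y*c2[i][j] + z*c3[i][j] + t*c4[i][j] for j in range(2)) for i in range(2))

# the matrix from the statement
T = [[1, 0, 0, 0],
     [0, x*x + y*y - z*z - t*t, 2*(y*z - x*t), 2*(x*z + y*t)],
     [0, 2*(x*t + y*z), x*x + z*z - y*y - t*t, 2*(z*t - x*y)],
     [0, 2*(y*t - x*z), 2*(x*y + z*t), x*x + t*t - y*y - z*z]]

# column j of T must be the decomposition of U c_j U* in the basis
ok = all(abs(coeffs(mul(mul(U, c), adj(U)))[i] - T[i][j]) < 1e-9
         for j, c in enumerate(basis) for i in range(4))
print(ok)   # True

# the lower-right 3x3 block is a rotation: orthogonal, with determinant 1
R = [row[1:] for row in T[1:]]
RRt = [[sum(R[i][k] * R[j][k] for k in range(3)) for j in range(3)] for i in range(3)]
print(all(abs(RRt[i][j] - (1 if i == j else 0)) < 1e-9 for i in range(3) for j in range(3)))  # True
det = (R[0][0] * (R[1][1]*R[2][2] - R[1][2]*R[2][1])
     - R[0][1] * (R[1][0]*R[2][2] - R[1][2]*R[2][0])
     + R[0][2] * (R[1][0]*R[2][1] - R[1][1]*R[2][0]))
print(abs(det - 1) < 1e-9)   # True
```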

We can now formulate the Euler-Rodrigues result, as follows:

Theorem

We have a double cover map, obtained via the adjoint representation,

[[math]] SU_2\to SO_3 [[/math]]
and this map produces the Euler-Rodrigues formula

[[math]] U=\begin{pmatrix} x^2+y^2-z^2-t^2&2(yz-xt)&2(xz+yt)\\ 2(xt+yz)&x^2+z^2-y^2-t^2&2(zt-xy)\\ 2(yt-xz)&2(xy+zt)&x^2+t^2-y^2-z^2 \end{pmatrix} [[/math]]
for the generic elements of [math]SO_3[/math].


Show Proof

We know from the above that we have a group morphism [math]SU_2\to SO_3[/math], given by the formula in the statement, and the problem now is that of proving that this is a double cover map, in the sense that it is surjective, and with kernel [math]\{\pm1\}[/math].


(1) Regarding the kernel, this is elementary to compute, as follows:

[[math]] \begin{eqnarray*} \ker(SU_2\to SO_3) &=&\left\{U\in SU_2\Big|T_U(M)=M,\forall M\in E\right\}\\ &=&\left\{U\in SU_2\Big|UM=MU,\forall M\in E\right\}\\ &=&\left\{U\in SU_2\Big|Uc_i=c_iU,\forall i\right\}\\ &=&\{\pm1\} \end{eqnarray*} [[/math]]


(2) Thus, we are done with this, and as a side remark here, this result shows that our morphism [math]SU_2\to SO_3[/math] is ultimately a morphism as follows:

[[math]] PU_2\subset SO_3\quad,\quad PU_2=SU_2/\{\pm1\} [[/math]]

Here [math]P[/math] stands for “projective”, and it is possible to say more about the construction [math]G\to PG[/math], which can be performed for any subgroup [math]G\subset U_N[/math]. But we will not get into this here, our next goal being anyway that of proving that we have [math]PU_2=SO_3[/math].


(3) We must prove now that the morphism [math]SU_2\to SO_3[/math] is surjective. This is something non-trivial, and there are several advanced proofs for this, as follows:


-- A first proof is by using Lie theory. To be more precise, the tangent spaces at [math]1[/math] of both [math]SU_2[/math] and [math]SO_3[/math] can be explicitly computed, by doing some linear algebra, and the morphism [math]SU_2\to SO_3[/math] turns out to be surjective around 1, and then globally.


-- Another proof is via representation theory. Indeed, the representations of [math]SU_2[/math] and [math]SO_3[/math] are subject to very similar formulae, called Clebsch-Gordan rules, and this shows that [math]SU_2\to SO_3[/math] is surjective. We will discuss this in chapter 14 below.


-- Yet another advanced proof, which is actually quite borderline for what can be called “proof”, is by using the ADE/McKay classification of the subgroups [math]G\subset SO_3[/math], which shows that there is no room strictly inside [math]SO_3[/math] for something as big as [math]PU_2[/math].


(4) In short, with some good knowledge of group theory, we are done. However, assuming no such knowledge, we will present in what follows a more pedestrian proof, which was actually the original proof, based on the fact that any rotation [math]U\in SO_3[/math] has an axis.


(5) As a first computation, let us prove that any rotation [math]U\in Im(SU_2\to SO_3)[/math] has an axis. We must look for fixed points of such rotations, and by linearity it is enough to look for fixed points belonging to the sphere [math]S^2_\mathbb R\subset\mathbb R^3[/math]. Now recall that in our picture for the quotient map [math]SU_2\to SO_3[/math], the space [math]\mathbb R^3[/math] appears as [math]F=span_\mathbb R(c_2,c_3,c_4)[/math], naturally embedded into the space [math]\mathbb R^4[/math] appearing as [math]E=span_\mathbb R(c_1,c_2,c_3,c_4)[/math]. Thus, we must look for fixed points belonging to the sphere [math]S^3_\mathbb R\subset\mathbb R^4[/math] whose first coordinate vanishes. But, in our [math]\mathbb R^4=E[/math] picture, this sphere [math]S^3_\mathbb R[/math] is the group [math]SU_2[/math]. Thus, we must look for fixed points [math]V\in SU_2[/math] whose first coordinate with respect to [math]c_1,c_2,c_3,c_4[/math] vanishes, which amounts to saying that the diagonal entries of [math]V[/math] must be purely imaginary numbers.


(6) Long story short, via our various identifications, we are led into solving the equation [math]UV=VU[/math] with [math]U,V\in SU_2[/math], and with [math]V[/math] having a purely imaginary diagonal. So, with standard notations for [math]SU_2[/math], we must solve the following equation, with [math]p\in i\mathbb R[/math]:

[[math]] \begin{pmatrix}a&b\\-\bar{b}&\bar{a}\end{pmatrix} \begin{pmatrix}p&q\\-\bar{q}&\bar{p}\end{pmatrix} =\begin{pmatrix}p&q\\-\bar{q}&\bar{p}\end{pmatrix} \begin{pmatrix}a&b\\-\bar{b}&\bar{a}\end{pmatrix} [[/math]]

(7) But this is something which is routine. Indeed, by identifying coefficients we obtain the following equations, each appearing twice:

[[math]] b\bar{q}=\bar{b}q\quad,\quad b(p-\bar{p})=(a-\bar{a})q [[/math]]

In the case [math]b=0[/math] the only equation which is left is [math]q=0[/math], and recalling that we must have [math]p\in i\mathbb R[/math], we do have solutions, namely two of them, as follows:

[[math]] V=\pm\begin{pmatrix}i&0\\0&i\end{pmatrix} [[/math]]

(8) In the remaining case [math]b\neq0[/math], the first equation reads [math]b\bar{q}\in\mathbb R[/math], so we must have [math]q=\lambda b[/math] with [math]\lambda\in\mathbb R[/math]. Now with this substitution made, the second equation reads [math]p-\bar{p}=\lambda(a-\bar{a})[/math], and since we must have [math]p\in i\mathbb R[/math], this gives [math]2p=\lambda(a-\bar{a})[/math]. Thus, our equations are:

[[math]] q=\lambda b\quad,\quad p=\lambda\cdot\frac{a-\bar{a}}{2} [[/math]]

Getting back now to our problem about finding fixed points, assuming [math]|a|^2+|b|^2=1[/math] we must find [math]\lambda\in\mathbb R[/math] such that the above numbers [math]p,q[/math] satisfy [math]|p|^2+|q|^2=1[/math]. But:

[[math]] \begin{eqnarray*} |p|^2+|q|^2 &=&|\lambda b|^2+\left|\lambda\cdot\frac{a-\bar{a}}{2}\right|^2\\ &=&\lambda^2(|b|^2+Im(a)^2)\\ &=&\lambda^2(1-Re(a)^2) \end{eqnarray*} [[/math]]


Thus, we have again two solutions to our fixed point problem, given by:

[[math]] \lambda=\pm\frac{1}{\sqrt{1-Re(a)^2}} [[/math]]

(9) Summarizing, we have proved that any rotation [math]U\in Im(SU_2\to SO_3)[/math] has an axis, and with the direction of this axis, corresponding to a pair of opposite points on the sphere [math]S^2_\mathbb R\subset\mathbb R^3[/math], being given by the above formulae, via [math]S^2_\mathbb R\subset S^3_\mathbb R=SU_2[/math].
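The computations in (5)-(8) can be verified numerically, as a sketch using numpy: for a random [math]U\in SU_2[/math], the matrix [math]V[/math] built from the above formulae for [math]p,q[/math] indeed commutes with [math]U[/math], lies in [math]SU_2[/math], and has purely imaginary diagonal:

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.standard_normal(4)
x, y, z, t = w / np.linalg.norm(w)
a, b = x + 1j*y, z + 1j*t                    # generic U in SU_2, |a|^2 + |b|^2 = 1
U = np.array([[a, b], [-b.conjugate(), a.conjugate()]])

# the fixed point V, as computed above (assumes Re(a) != ±1, i.e. U != ±1)
lam = 1 / np.sqrt(1 - a.real**2)
p, q = lam * (a - a.conjugate()) / 2, lam * b
V = np.array([[p, q], [-q.conjugate(), p.conjugate()]])

assert np.allclose(U @ V, V @ U)             # V is a fixed point: UV = VU
assert np.isclose(abs(p)**2 + abs(q)**2, 1)  # V lies in SU_2
assert np.isclose(p.real, 0)                 # purely imaginary diagonal
```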


(10) In order to finish, we must argue that any rotation [math]U\in SO_3[/math] has an axis. But this follows for instance from linear algebra, because [math]\det(U-1)=\det U\cdot\det(1-U^t)=\det(1-U)=-\det(U-1)[/math], so [math]\det(U-1)=0[/math], and [math]U[/math] has a fixed vector. Now since [math]U\in SO_3[/math] is uniquely determined by its rotation axis, which can be regarded as a point of [math]S^2_\mathbb R/\{\pm1\}[/math], plus its rotation angle [math]t\in[0,2\pi)[/math], by using [math]S^2_\mathbb R\subset S^3_\mathbb R=SU_2[/math] as in (9) we are led to the conclusion that [math]U[/math] is uniquely determined by an element of [math]SU_2/\{\pm 1\}[/math], and so appears indeed via the Euler-Rodrigues formula, as desired.
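As a numerical illustration of the double cover, not part of the text: the Euler-Rodrigues matrix of a random unit quaternion lies in [math]SO_3[/math], and opposite quaternions [math]\pm(x,y,z,t)[/math] produce the same rotation:

```python
import numpy as np

def euler_rodrigues(x, y, z, t):
    """The rotation associated to a unit quaternion (x,y,z,t), as in the theorem."""
    return np.array([
        [x*x + y*y - z*z - t*t, 2*(y*z - x*t),         2*(x*z + y*t)],
        [2*(x*t + y*z),         x*x + z*z - y*y - t*t, 2*(z*t - x*y)],
        [2*(y*t - x*z),         2*(x*y + z*t),         x*x + t*t - y*y - z*z]])

rng = np.random.default_rng(2)
v = rng.standard_normal(4)
x, y, z, t = v / np.linalg.norm(v)

Uq = euler_rodrigues(x, y, z, t)
assert np.allclose(Uq @ Uq.T, np.eye(3))                  # orthogonal
assert np.isclose(np.linalg.det(Uq), 1)                   # determinant 1, so Uq in SO_3
assert np.allclose(euler_rodrigues(-x, -y, -z, -t), Uq)   # kernel {±1}
```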

So much for the Euler-Rodrigues formula. As already mentioned, all the above is just the tip of the iceberg, and there are many more things that can be said, which are all interesting, and worth learning. In what concerns us, we will be back to this in chapter 14 below, when doing representation theory, with some further comments on all this.


Regarding now [math]O_3[/math], the extension from [math]SO_3[/math] is very simple, as follows:

Theorem

We have the Euler-Rodrigues formula

[[math]] U=\pm\begin{pmatrix} x^2+y^2-z^2-t^2&2(yz-xt)&2(xz+yt)\\ 2(xt+yz)&x^2+z^2-y^2-t^2&2(zt-xy)\\ 2(yt-xz)&2(xy+zt)&x^2+t^2-y^2-z^2 \end{pmatrix} [[/math]]
for the generic elements of [math]O_3[/math].


Show Proof

This follows from Theorem 10.15, because the determinant of an orthogonal matrix [math]U\in O_3[/math] must satisfy [math]\det U=\pm1[/math], and in the case [math]\det U=-1[/math], we have:

[[math]] \det(-U) =(-1)^3\det U =-\det U =1 [[/math]]

Thus, assuming [math]\det U=-1[/math], we can rescale [math]U[/math] into an element [math]-U\in SO_3[/math], and this leads to the conclusion in the statement.

With the above small [math]N[/math] examples worked out, let us discuss now the general theory, at arbitrary values of [math]N\in\mathbb N[/math]. In the real case, we have the following result:

Proposition

We have a decomposition as follows, with [math]SO_N^{-1}[/math] consisting by definition of the orthogonal matrices having determinant [math]-1[/math]:

[[math]] O_N=SO_N\cup SO_N^{-1} [[/math]]
Moreover, when [math]N[/math] is odd the set [math]SO_N^{-1}[/math] is simply given by [math]SO_N^{-1}=-SO_N[/math].


Show Proof

The first assertion is clear from definitions, because the determinant of an orthogonal matrix must be [math]\pm1[/math]. The second assertion is clear too, and we have seen this already at [math]N=3[/math], in the proof of Theorem 10.16. Finally, when [math]N[/math] is even the situation is more complicated, and requires complex numbers. We will be back to this.
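A small numerical sketch of the odd case, at [math]N=3[/math], using numpy: the determinant of an orthogonal matrix is [math]\pm1[/math], and negation flips it, which gives [math]SO_N^{-1}=-SO_N[/math]:

```python
import numpy as np

rng = np.random.default_rng(3)
# a random orthogonal matrix, via QR decomposition of a Gaussian matrix
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
d = np.linalg.det(Q)

assert np.isclose(abs(d), 1)                  # det of an orthogonal matrix is ±1
# at N odd, det(-Q) = (-1)^N det(Q) = -det(Q), so negation swaps the two components
assert np.isclose(np.linalg.det(-Q), -d)
```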

In the complex case now, the result is simpler, as follows:

Proposition

We have a decomposition as follows, with [math]SU_N^d[/math] consisting by definition of the unitary matrices having determinant [math]d\in\mathbb T[/math]:

[[math]] U_N=\bigcup_{d\in\mathbb T}SU_N^d [[/math]]
Moreover, the components are [math]SU_N^d=f\cdot SU_N[/math], where [math]f\in\mathbb T[/math] is such that [math]f^N=d[/math].


Show Proof

This is clear from definitions, and from the fact that the determinant of a unitary matrix belongs to [math]\mathbb T[/math], by extracting a suitable [math]N[/math]-th root of the determinant.
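The root extraction argument can be sketched numerically, with numpy: given a unitary [math]U[/math] with determinant [math]d[/math], any [math]N[/math]-th root [math]f[/math] of [math]d[/math] rescales [math]U[/math] into [math]U/f\in SU_N[/math]:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 4
# a random unitary, via QR decomposition of a complex Gaussian matrix
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
U, _ = np.linalg.qr(A)
d = np.linalg.det(U)
assert np.isclose(abs(d), 1)              # the determinant lies on the circle T

f = d ** (1 / N)                          # an N-th root of d, so f^N = d
V = U / f
assert np.isclose(np.linalg.det(V), 1)    # V in SU_N, and U = f·V in f·SU_N
assert np.allclose(V.conj().T @ V, np.eye(N))
```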

It is possible to use the decomposition in Proposition 10.18 in order to say more about what happens in the real case, in the context of Proposition 10.17, but we will not get into this. We will basically stop here with our study of [math]O_N,U_N[/math], and of their versions [math]SO_N,SU_N[/math]. As a last result on the subject, however, let us record:

Theorem

We have subgroups of [math]O_N,U_N[/math] constructed via the condition

[[math]] (\det U)^d=1 [[/math]]
with [math]d\in\mathbb N\cup\{\infty\}[/math], which generalize both [math]O_N,U_N[/math] and [math]SO_N,SU_N[/math].


Show Proof

This follows indeed from definitions, and from the multiplicativity property of the determinant. We will be back to these groups, which are quite specialized, later on.

10d. Symplectic groups

At a more specialized level now, we first have the groups [math]B_N,C_N[/math], consisting of the orthogonal and unitary bistochastic matrices. Let us start with:

Definition

A square matrix [math]M\in M_N(\mathbb C)[/math] is called bistochastic if each row and each column sum up to the same number:

[[math]] \begin{matrix} M_{11}&\ldots&M_{1N}&\to&\lambda\\ \vdots&&\vdots\\ M_{N1}&\ldots&M_{NN}&\to&\lambda\\ \downarrow&&\downarrow\\ \lambda&&\lambda \end{matrix} [[/math]]
If this happens only for the rows, or only for the columns, the matrix is called row-stochastic, respectively column-stochastic.
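The definition translates into a direct check on row and column sums; here is a small sketch, with a hypothetical helper `is_bistochastic`, not part of the text:

```python
import numpy as np

def is_bistochastic(M, tol=1e-10):
    """Check that all row sums and all column sums of M equal the same number."""
    rows, cols = M.sum(axis=1), M.sum(axis=0)
    lam = rows[0]
    return np.allclose(rows, lam, atol=tol) and np.allclose(cols, lam, atol=tol)

# the flat matrix, with all entries 1/N, is the basic example, with lambda = 1
N = 4
flat = np.full((N, N), 1 / N)
assert is_bistochastic(flat)
assert not is_bistochastic(np.diag([1.0, 2.0, 3.0, 4.0]))
```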

In what follows we will be interested in the unitary bistochastic matrices, which are quite interesting objects. As a first result, regarding such matrices, we have:

Proposition

For a unitary matrix [math]U\in U_N[/math], the following are equivalent:

  • [math]U[/math] is bistochastic, with sums [math]\lambda[/math].
  • [math]U[/math] is row-stochastic, with sums [math]\lambda[/math], and [math]|\lambda|=1[/math].
  • [math]U[/math] is column-stochastic, with sums [math]\lambda[/math], and [math]|\lambda|=1[/math].


Show Proof

This is something that we know from chapter 7, with [math](1)\iff(2)[/math] being elementary, and with the further equivalence with (3) coming by symmetry.

The unitary bistochastic matrices are stable under a number of operations, and in particular under taking products and inverses. Thus, these matrices form a group. We have:

Theorem

The real and complex bistochastic groups, which are the sets

[[math]] B_N\subset O_N\quad,\quad C_N\subset U_N [[/math]]
consisting of matrices which are bistochastic, are isomorphic to [math]O_{N-1}[/math], [math]U_{N-1}[/math].


Show Proof

This is something that we know too from chapter 7. To be more precise, let us pick a matrix [math]F\in U_N[/math], such as the Fourier matrix [math]F_N[/math], satisfying the following condition, where [math]e_0,\ldots,e_{N-1}[/math] is the standard basis of [math]\mathbb C^N[/math], and where [math]\xi[/math] is the all-one vector:

[[math]] Fe_0=\frac{1}{\sqrt{N}}\xi [[/math]]

We have then, by using the above property of [math]F[/math]:

[[math]] \begin{eqnarray*} u\xi=\xi &\iff&uFe_0=Fe_0\\ &\iff&F^*uFe_0=e_0\\ &\iff&F^*uF=diag(1,w) \end{eqnarray*} [[/math]]


Thus we have isomorphisms as in the statement, given by [math]w_{ij}\to(F^*uF)_{ij}[/math].
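Here is a numerical sketch of this isomorphism, in the complex case, using numpy: conjugating [math]diag(1,w)[/math] with [math]w\in U_{N-1}[/math] by the Fourier matrix [math]F_N[/math] produces a unitary bistochastic matrix:

```python
import numpy as np

N = 4
# the Fourier matrix F_N, satisfying F e_0 = xi / sqrt(N), with xi the all-one vector
w = np.exp(2j * np.pi / N)
F = np.array([[w ** (i * j) for j in range(N)] for i in range(N)]) / np.sqrt(N)

rng = np.random.default_rng(5)
A = rng.standard_normal((N-1, N-1)) + 1j * rng.standard_normal((N-1, N-1))
W, _ = np.linalg.qr(A)                        # a random element of U_{N-1}

# u = F diag(1, W) F*, which should be unitary and bistochastic
D = np.block([[np.ones((1, 1)), np.zeros((1, N-1))],
              [np.zeros((N-1, 1)), W]])
u = F @ D @ F.conj().T

xi = np.ones(N)
assert np.allclose(u.conj().T @ u, np.eye(N))  # u is unitary
assert np.allclose(u @ xi, xi)                 # all row sums equal 1
assert np.allclose(xi @ u, xi)                 # all column sums equal 1
```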

We will be back to [math]B_N,C_N[/math] later. Moving ahead now, as yet another basic example of a continuous group, we have the symplectic group [math]Sp_N[/math]. Let us begin with:

Definition

The “super-space” [math]\bar{\mathbb C}^N[/math] is the usual space [math]\mathbb C^N[/math], with its standard basis [math]\{e_1,\ldots,e_N\}[/math], with a chosen sign [math]\varepsilon=\pm 1[/math], and a chosen involution on the indices:

[[math]] i\to\bar{i} [[/math]]
The “super-identity” matrix is [math]J_{ij}=\delta_{i\bar{j}}[/math] for [math]i\leq j[/math] and [math]J_{ij}=\varepsilon\delta_{i\bar{j}}[/math] for [math]i\geq j[/math].

Up to a permutation of the indices, we have a decomposition [math]N=2p+q[/math], such that the involution is, in standard permutation notation:

[[math]] (12)\ldots (2p-1,2p)(2p+1)\ldots (N) [[/math]]

Thus, up to a base change, the super-identity is as follows, where [math]N=2p+q[/math] and [math]\varepsilon=\pm 1[/math], with the [math]1_q[/math] block at right disappearing if [math]\varepsilon=-1[/math]:

[[math]] J=\begin{pmatrix} 0&1\ \ \ \\ \varepsilon 1&0_{(1)}\\ &&\ddots\\ &&&0&1\ \ \ \\ &&&\varepsilon 1&0_{(p)}\\ &&&&&1_{(1)}\\ &&&&&&\ddots\\ &&&&&&&1_{(q)} \end{pmatrix} [[/math]]

In the case [math]\varepsilon=1[/math], the super-identity is the following matrix:

[[math]] J_+(p,q)=\begin{pmatrix} 0&1\ \ \ \\ 1&0_{(1)}\\ &&\ddots\\ &&&0&1\ \ \ \\ &&&1&0_{(p)}\\ &&&&&1_{(1)}\\ &&&&&&\ddots\\ &&&&&&&1_{(q)} \end{pmatrix} [[/math]]

In the case [math]\varepsilon=-1[/math] now, the involution [math]i\to\bar{i}[/math] cannot have fixed points, so [math]q=0[/math], and the super-identity is:

[[math]] J_-(p,0)=\begin{pmatrix} 0&1\ \ \ \\ -1&0_{(1)}\\ &&\ddots\\ &&&0&1\ \ \ \\ &&&-1&0_{(p)} \end{pmatrix} [[/math]]
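These standard-form matrices can be sketched in code, with a hypothetical constructor `super_identity`, not part of the text; note that [math]J_+(p,q)[/math] is symmetric with [math]J^2=1[/math], while [math]J_-(p,0)[/math] is antisymmetric with [math]J^2=-1[/math]:

```python
import numpy as np

def super_identity(p, q, eps):
    """Build J in the standard form N = 2p+q, with eps = ±1 (q = 0 when eps = -1)."""
    assert eps in (1, -1) and (eps == 1 or q == 0)
    blocks = [np.array([[0, 1], [eps, 0]])] * p + [np.eye(1)] * q
    N = 2 * p + q
    J = np.zeros((N, N))
    i = 0
    for B in blocks:          # place the 2x2 (and 1x1) blocks along the diagonal
        n = B.shape[0]
        J[i:i+n, i:i+n] = B
        i += n
    return J

Jp = super_identity(2, 1, 1)     # J_+(2,1)
Jm = super_identity(2, 0, -1)    # J_-(2,0)
assert np.allclose(Jp, Jp.T) and np.allclose(Jp @ Jp, np.eye(5))
assert np.allclose(Jm, -Jm.T) and np.allclose(Jm @ Jm, -np.eye(4))
```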

With the above notions in hand, we have the following result:

Theorem

The super-orthogonal group, which is by definition

[[math]] \bar{O}_N=\left\{U\in U_N\Big|U=J\bar{U}J^{-1}\right\} [[/math]]
with [math]J[/math] being the super-identity matrix, is as follows:

  • At [math]\varepsilon=1[/math] we have [math]\bar{O}_N=O_N[/math].
  • At [math]\varepsilon=-1[/math] we have [math]\bar{O}_N=Sp_N[/math].


Show Proof

These results are both elementary, as follows:


(1) At [math]\varepsilon=-1[/math] this follows from definitions.


(2) At [math]\varepsilon=1[/math] now, consider the eighth root of unity [math]\rho=e^{\pi i/4}[/math], and let:

[[math]] \Gamma=\frac{1}{\sqrt{2}}\begin{pmatrix}\rho&\rho^7\\ \rho^3&\rho^5\end{pmatrix} [[/math]]

Then this matrix [math]\Gamma[/math] is unitary, and we have the following formula:

[[math]] \Gamma\begin{pmatrix}0&1\\1&0\end{pmatrix}\Gamma^t=1 [[/math]]

Thus the following matrix is unitary as well, and satisfies [math]CJC^t=1[/math]:

[[math]] C=\begin{pmatrix}\Gamma^{(1)}\\&\ddots\\&&\Gamma^{(p)}\\&&&1_q\end{pmatrix} [[/math]]

Thus, in terms of [math]V=CUC^*[/math], the conditions “[math]U=J\bar{U}J^{-1}[/math], with [math]U[/math] unitary” simply read:

[[math]] V=\bar{V}={\rm unitary} [[/math]]

Thus we obtain an isomorphism [math]\bar{O}_N=O_N[/math] as in the statement.
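The key identity in step (2) is easy to check numerically, as a sketch with numpy: the matrix [math]\Gamma[/math] is unitary, and satisfies [math]\Gamma\begin{pmatrix}0&1\\1&0\end{pmatrix}\Gamma^t=1[/math]:

```python
import numpy as np

rho = np.exp(1j * np.pi / 4)            # the eighth root of unity used above
G = np.array([[rho,    rho**7],
              [rho**3, rho**5]]) / np.sqrt(2)
J = np.array([[0, 1], [1, 0]])

assert np.allclose(G.conj().T @ G, np.eye(2))   # Gamma is unitary
assert np.allclose(G @ J @ G.T, np.eye(2))      # Gamma J Gamma^t = 1
```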

Regarding now [math]Sp_N[/math], we have the following result:

Theorem

The symplectic group [math]Sp_N\subset U_N[/math], which is by definition

[[math]] Sp_N=\left\{U\in U_N\Big|U=J\bar{U}J^{-1}\right\} [[/math]]
consists of the [math]SU_2[/math] patterned matrices,

[[math]] U=\begin{pmatrix} a&b&\ldots\\ -\bar{b}&\bar{a}\\ \vdots&&\ddots \end{pmatrix} [[/math]]
which are unitary, [math]U\in U_N[/math]. In particular, we have [math]Sp_2=SU_2[/math].


Show Proof

This follows indeed from definitions, because the condition [math]U=J\bar{U}J^{-1}[/math] corresponds precisely to the fact that [math]U[/math] must be a [math]SU_2[/math]-patterned matrix.
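At [math]N=2[/math], this can be sketched numerically, with [math]J=J_-(1,0)[/math]: a random [math]SU_2[/math]-patterned matrix is unitary and satisfies [math]U=J\bar{U}J^{-1}[/math], illustrating [math]Sp_2=SU_2[/math]:

```python
import numpy as np

J = np.array([[0, 1], [-1, 0]])         # the super-identity J_-(1,0)

rng = np.random.default_rng(6)
v = rng.standard_normal(4)
x, y, z, t = v / np.linalg.norm(v)
a, b = x + 1j*y, z + 1j*t
U = np.array([[a, b], [-b.conjugate(), a.conjugate()]])   # generic element of SU_2

assert np.allclose(U.conj().T @ U, np.eye(2))             # U is unitary
assert np.allclose(U, J @ U.conj() @ np.linalg.inv(J))    # U = J·Ubar·J^{-1}
```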

We will be back later to the symplectic groups, towards the end of the present book, with more results about them. In the meantime, have a look at the mechanics book of Arnold [9], which explains what the symplectic groups and geometry are good for.


As a last topic of discussion, now that we have a decent understanding of the main continuous groups of unitary matrices [math]G\subset U_N[/math], let us go back to the finite groups from the previous chapter, and make a link with the material there. We first have:

Theorem

The full complex reflection group [math]K_N\subset U_N[/math], given by

[[math]] K_N=M_N(\mathbb T\cup\{0\})\cap U_N [[/math]]
has a wreath product decomposition as follows,

[[math]] K_N=\mathbb T\wr S_N [[/math]]
with [math]S_N[/math] acting on [math]\mathbb T^N[/math] in the standard way, by permuting the factors.


Show Proof

This is something that we know from chapter 9, appearing as the [math]s=\infty[/math] particular case of the results established there for the complex reflection groups [math]H_N^s[/math].
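The wreath product decomposition can be illustrated numerically: an element of [math]K_N=\mathbb T\wr S_N[/math] is a diagonal of phases times a permutation matrix, and such a product is unitary, with entries in [math]\mathbb T\cup\{0\}[/math]:

```python
import numpy as np

rng = np.random.default_rng(7)
N = 5
phases = np.exp(2j * np.pi * rng.random(N))   # an element of T^N
P = np.eye(N)[rng.permutation(N)]             # a permutation matrix, in S_N
U = np.diag(phases) @ P                        # an element of K_N = T wr S_N

assert np.allclose(U.conj().T @ U, np.eye(N))  # U is unitary
# each entry of U is either 0 or on the unit circle T
assert np.all(np.isclose(np.abs(U), 0) | np.isclose(np.abs(U), 1))
```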

By using the above full complex reflection group [math]K_N[/math], we can talk in fact about the reflection subgroup of any compact group [math]G\subset U_N[/math], as follows:

Definition

Given [math]G\subset U_N[/math], we define its reflection subgroup to be

[[math]] K=G\cap K_N [[/math]]
with the intersection taken inside [math]U_N[/math].

This notion is something quite interesting, leading us into the question of understanding what the subgroups of [math]K_N[/math] are. We have here the following construction:

Theorem

We have subgroups of the basic complex reflection groups,

[[math]] H_N^{sd}\subset H_N^s [[/math]]
constructed via the following condition, with [math]d\in\mathbb N\cup\{\infty\}[/math],

[[math]] (\det U)^d=1 [[/math]]
which generalize all the complex reflection groups that we have so far.


Show Proof

Here the first assertion is clear from definitions, and from the multiplicativity of the determinant. As for the second assertion, this is rather a remark, coming from the fact that the alternating group [math]A_N[/math], which is the only finite group so far not fitting into the series [math]\{H_N^s\}[/math], is indeed of this type, obtained from [math]H_N^1=S_N[/math] by using [math]d=1[/math].
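The remark about [math]A_N[/math] can be illustrated at [math]N=3[/math], numerically: among the [math]3!=6[/math] permutation matrices forming [math]H_3^1=S_3[/math], the condition [math]\det U=1[/math] cuts out the [math]3[/math] elements of [math]A_3[/math]:

```python
import numpy as np
from itertools import permutations

N = 3
perm_mats = [np.eye(N)[list(s)] for s in permutations(range(N))]
# H_N^1 = S_N has N! elements; the condition det U = 1 cuts out A_N
even = [P for P in perm_mats if np.isclose(np.linalg.det(P), 1)]

assert len(perm_mats) == 6 and len(even) == 3    # |S_3| = 6, |A_3| = 3
```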

The point now is that, by a well-known and deep result in group theory, the complex reflection groups consist of the series [math]\{H_N^{sd}\}[/math] constructed above, and of a number of exceptional groups, which can be fully classified. To be more precise, we have:

Theorem

The irreducible complex reflection groups are

[[math]] H_N^{sd}=\left\{U\in H_N^s\Big|(\det U)^d=1\right\} [[/math]]
along with [math]34[/math] exceptional examples.


Show Proof

This is something quite advanced, and we refer here to the paper of Shephard and Todd [10], and to the subsequent literature on the subject.

Getting back now to our goal, namely mixing continuous and finite subgroups [math]G\subset U_N[/math], consider the following diagram, formed by the main rotation and reflection groups:

[[math]] \xymatrix@R=50pt@C=50pt{ K_N\ar[r]&U_N\\ H_N\ar[r]\ar[u]&O_N\ar[u]} [[/math]]

We know from the above that this is an intersection and generation diagram. Now assume that we have an intermediate compact group, as follows:

[[math]] H_N\subset G_N\subset U_N [[/math]]

The point is that we can think of this group as living inside the above square, and so project it on the edges, so as to obtain information about it. Indeed, let us start with:

Definition

Associated to any closed subgroup [math]G_N\subset U_N[/math] are its discrete, real, unitary and smooth versions, given by the formulae

[[math]] G_N^d=G_N\cap K_N\quad,\quad G_N^r=G_N\cap O_N [[/math]]

[[math]] G_N^u= \lt G_N,K_N \gt \quad,\quad G_N^s= \lt G_N,O_N \gt [[/math]]
with [math] \lt \,, \gt [/math] being the topological generation operation.

Assuming now that we have an intermediate compact group [math]H_N\subset G_N\subset U_N[/math], as above, we are led in this way to the following notion:

Definition

A compact group [math]H_N\subset G_N\subset U_N[/math] is called oriented if

[[math]] \xymatrix@R=40pt@C=40pt{ K_N\ar[r]&G_N^u\ar[r]&U_N\\ G_N^d\ar[u]\ar[r]&G_N\ar[r]\ar[u]&G_N^s\ar[u]\\ H_N\ar[r]\ar[u]&G_N^r\ar[u]\ar[r]&O_N\ar[u]} [[/math]]
is an intersection and generation diagram.

This notion is quite interesting, because most of our basic examples of closed subgroups [math]G_N\subset U_N[/math], finite or continuous, are oriented. Moreover, the world of oriented groups is quite rigid, due to either of the following conditions, which must be satisfied:

[[math]] G_N= \lt G_N^d,G_N^r \gt \quad,\quad G_N=G_N^u\cap G_N^s [[/math]]

Summarizing, we are naturally led in this way to the following question, which is certainly interesting, and is related to all that has been said above, about groups: \begin{question} What are the oriented groups [math]H_N\subset G_N\subset U_N[/math]? What about the oriented groups coming in families, [math]G=(G_N)[/math], with [math]N\in\mathbb N[/math]? \end{question} And we will stop our discussion here: sometimes a good question is better as a hunting trophy than a final theorem, or at least that's what my cats say. We will be back to this in chapters 13-16 below, under a number of supplementary assumptions on the groups that we consider, which will allow us to derive a number of classification results.


General references

Banica, Teo (2024). "Linear algebra and group theory". arXiv:2206.09283 [math.CO].

References

  1. S. Lang, Algebra, Addison-Wesley (1993).
  2. I.R. Shafarevich, Basic algebraic geometry, Springer (1974).
  3. J. Harris, Algebraic geometry, Springer (1992).
  4. J.E. Humphreys, Introduction to Lie algebras and representation theory, Springer (1972).
  5. J.P. Serre, Linear representations of finite groups, Springer (1977).
  6. R.P. Feynman, R.B. Leighton and M. Sands, The Feynman lectures on physics III: quantum mechanics, Caltech (1966).
  7. D.J. Griffiths and D.F. Schroeter, Introduction to quantum mechanics, Cambridge Univ. Press (2018).
  8. S. Weinberg, Lectures on quantum mechanics, Cambridge Univ. Press (2012).
  9. V.I. Arnold, Mathematical methods of classical mechanics, Springer (1974).
  10. G.C. Shephard and J.A. Todd, Finite unitary reflection groups, Canad. J. Math. 6 (1954), 274--304.