Special matrices

[math] \newcommand{\mathds}{\mathbb}[/math]

7a. Fourier matrices

In this chapter we go back to basic linear algebra, and we discuss a number of more specialized questions. We will be interested in various classes of “special matrices”, and in the tools for dealing with them. Our motivations are as follows:


(1) There are many interesting groups of matrices [math]G\subset U_N[/math], that we will study later in this book, and the individual elements [math]U\in G[/math], which are defined by some particular conditions, can usually be thought of as being “special matrices”. Thus, our study of the special matrices will be a good introduction to the groups of matrices [math]G\subset U_N[/math].


(2) Less abstractly, we will see that the study of the special matrices leads us into Fourier analysis, and more specifically into “discrete Fourier analysis”. Although nothing beats classical Fourier analysis, which can be learned for instance from Rudin [1], [2], having something here on discrete Fourier analysis will certainly be a good thing.


(3) Our study will also come as a natural continuation of the linear algebra theory developed in chapters 1-4. We have seen there the foundations of the theory, basically reducing everything to the study of the matrices [math]A\in M_N(\mathbb C)[/math], and now if we want to learn more, we will certainly have to assume that our matrices [math]A[/math] are of “special type”.


(4) Finally, and totally away now from any abstraction, the special matrices that we will study here are related to all sorts of interesting physics and applications, such as quantum groups, quantum information, random matrices, coding theory, and many more. So, we will learn here interesting and useful theory, that can be applied afterwards.


Getting started now, and for having a taste of what we want to do, in this chapter, as a first and central example of a special matrix, we have the flat matrix:

Definition

The flat matrix [math]\mathbb I_N[/math] is the all-one [math]N\times N[/math] matrix:

[[math]] \mathbb I_N=\begin{pmatrix} 1&\ldots&1\\ \vdots&&\vdots\\ 1&\ldots&1\end{pmatrix} [[/math]]
Equivalently, [math]\mathbb I_N/N[/math] is the orthogonal projection onto the all-one vector [math]\xi\in\mathbb C^N[/math].

Observe that [math]\mathbb I_N[/math] has a lot of interesting properties, such as being circulant, and bistochastic. The idea will be that many techniques that can be applied to [math]\mathbb I_N[/math], with quite trivial results, apply to such special classes of matrices, with non-trivial consequences.


A first interesting question regarding [math]\mathbb I_N[/math] concerns its diagonalization. Since [math]\mathbb I_N[/math] is a multiple of a rank 1 projection, we have right away the following result:

Proposition

The flat matrix [math]\mathbb I_N[/math] diagonalizes as follows,

[[math]] \begin{pmatrix} 1&\ldots&\ldots&1\\ \vdots&&&\vdots\\ \vdots&&&\vdots\\ 1&\ldots&\ldots&1\end{pmatrix}=P \begin{pmatrix} N\\ &0\\ &&\ddots\\ &&&0\end{pmatrix}P^{-1} [[/math]]
where [math]P\in M_N(\mathbb C)[/math] can be any matrix formed by the all-one vector [math]\xi[/math], followed by [math]N-1[/math] linearly independent solutions [math]x\in\mathbb C^N[/math] of the equation [math]x_1+\ldots+x_N=0[/math].


Proof

This follows indeed from our linear algebra knowledge from chapters 1-4, by using the fact that [math]\mathbb I_N/N[/math] is the orthogonal projection onto [math]\mathbb C\xi[/math].

In practice now, the problem which is left is that of finding an explicit matrix [math]P\in M_N(\mathbb C)[/math], as above. To be more precise, there are plenty of solutions here, some of them being even real, [math]P\in M_N(\mathbb R)[/math], and the problem is that of finding a “nice” such solution, say having the property that [math]P_{ij}[/math] appears as an explicit function of [math]i,j[/math].


Long story short, we are led to the question of solving, in a somewhat canonical and elegant way, the following equation, over the real or the complex numbers:

[[math]] x_1+\ldots+x_N=0 [[/math]]

And this is more tricky than it seems. To be more precise, there is no hope of doing this over the real numbers. As for the complex numbers, there is a ray of light here coming from the roots of unity, and more specifically, from:

Proposition

We have the formula

[[math]] \frac{1}{N}\sum_{k=0}^{N-1}z^k= \begin{cases} 0&{\rm if}\ z\neq1\\ 1&{\rm if}\ z=1 \end{cases} [[/math]]
valid for any [math]N[/math]-th root of unity [math]z[/math].


Proof

This is something that we know from chapter 3, coming from the fact that the average in the statement computes the barycenter of the polygon formed by the numbers [math]1,z,z^2,\ldots,z^{N-1}[/math], in the complex plane. Indeed, since this polygon is regular and centered at 0, the barycenter in question is obviously 0, and this unless we are in the case [math]z=1[/math], where the polygon degenerates, and the barycenter is obviously 1.

Summarizing, we have a beginning of an answer to our question, with the idea being that of using a matrix [math]P\in M_N(\mathbb C)[/math] formed by the [math]N[/math]-th roots of unity. To be more precise, this leads us into the Fourier matrix, which is as follows:

Definition

The Fourier matrix [math]F_N[/math] is the following matrix, with [math]w=e^{2\pi i/N}[/math]:

[[math]] F_N= \begin{pmatrix} 1&1&1&\ldots&1\\ 1&w&w^2&\ldots&w^{N-1}\\ 1&w^2&w^4&\ldots&w^{2(N-1)}\\ \vdots&\vdots&\vdots&&\vdots\\ 1&w^{N-1}&w^{2(N-1)}&\ldots&w^{(N-1)^2} \end{pmatrix} [[/math]]
That is, [math]F_N=(w^{ij})_{ij}[/math], with indices [math]i,j\in\{0,1,\ldots,N-1\}[/math], taken modulo [math]N[/math].

Before getting further, observe that this matrix [math]F_N[/math] is “special” as well, but in a different sense, its main properties being the fact that it is a Vandermonde matrix, and a rescaled unitary. We will axiomatize later on the matrices of this type.
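As an illustration, the defining properties of the Fourier matrix can be checked numerically. Here is a quick sketch in Python, assuming that numpy is available:

```python
import numpy as np

def fourier_matrix(N):
    """Fourier matrix F_N = (w^{ij}) with w = e^{2*pi*i/N}, indices 0..N-1."""
    w = np.exp(2j * np.pi / N)
    idx = np.arange(N)
    return w ** np.outer(idx, idx)

N = 5
F = fourier_matrix(N)

# F_N / sqrt(N) is unitary, i.e. F F^* = N * Id
assert np.allclose(F @ F.conj().T, N * np.eye(N))

# F_N is Vandermonde: row i consists of the powers of w^i
w = np.exp(2j * np.pi / N)
assert np.allclose(F[2], (w ** 2) ** np.arange(N))
```

This is only a floating-point sanity check, of course, the exact statements being the ones proved in the text.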


Getting back now to the diagonalization problem for the flat matrix [math]\mathbb I_N[/math], this can be solved by using the Fourier matrix [math]F_N[/math], in the following elegant way:

Theorem

The flat matrix [math]\mathbb I_N[/math] diagonalizes as follows,

[[math]] \begin{pmatrix} 1&\ldots&\ldots&1\\ \vdots&&&\vdots\\ \vdots&&&\vdots\\ 1&\ldots&\ldots&1\end{pmatrix}=\frac{1}{N}\,F_N \begin{pmatrix} N\\ &0\\ &&\ddots\\ &&&0\end{pmatrix}F_N^* [[/math]]
with [math]F_N=(w^{ij})_{ij}[/math] being the Fourier matrix.


Proof

According to Proposition 7.2, we are left with finding the 0-eigenvectors of [math]\mathbb I_N[/math], which amounts to solving the following equation:

[[math]] x_0+\ldots+x_{N-1}=0 [[/math]]

For this purpose, we use the root of unity [math]w=e^{2\pi i/N}[/math], and more specifically, the following standard formula, coming from Proposition 7.3:

[[math]] \sum_{i=0}^{N-1}w^{ij}=N\delta_{j0} [[/math]]

This formula shows that for [math]j=1,\ldots,N-1[/math], the vector [math]v_j=(w^{ij})_i[/math] is a 0-eigenvector. Moreover, these vectors are pairwise orthogonal, because we have:

[[math]] \lt v_j,v_k \gt =\sum_iw^{ij-ik} =N\delta_{jk} [[/math]]

Thus, we have our basis [math]\{v_1,\ldots,v_{N-1}\}[/math] of 0-eigenvectors, and since the [math]N[/math]-eigenvector is [math]\xi=v_0[/math], the passage matrix [math]P[/math] that we are looking for is given by:

[[math]] P=\begin{bmatrix}v_0&v_1&\ldots&v_{N-1}\end{bmatrix} [[/math]]

But this is precisely the Fourier matrix, [math]P=F_N[/math]. In order to finish now, observe that the above computation of [math] \lt v_j,v_k \gt [/math] shows that [math]F_N/\sqrt{N}[/math] is unitary, and so:

[[math]] F_N^{-1}=\frac{1}{N}\,F_N^* [[/math]]

Thus, we are led to the diagonalization formula in the statement.

Generally speaking, the above result will be the template for what we will be doing here. On one hand we will have special matrices to be studied, of [math]\mathbb I_N[/math] type, and on the other hand we will have special matrices that can be used as tools, of [math]F_N[/math] type.
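The above diagonalization formula can be verified numerically as well. Here is a short sketch in Python, assuming numpy is available:

```python
import numpy as np

N = 6
w = np.exp(2j * np.pi / N)
idx = np.arange(N)
F = w ** np.outer(idx, idx)                     # the Fourier matrix F_N

flat = np.ones((N, N))                          # the flat matrix I_N
D = np.diag([N] + [0] * (N - 1)).astype(complex)  # diag(N, 0, ..., 0)

# I_N = (1/N) F_N D F_N^*
assert np.allclose(flat, F @ D @ F.conj().T / N)
```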


Let us begin with a discussion of the “tools”. The Fourier matrix [math]F_N[/math] certainly has many interesting properties, but what really stands out is the fact that its entries are on the unit circle, and its columns are pairwise orthogonal. Which leads us into:

Definition

A complex Hadamard matrix is a square matrix

[[math]] H\in M_N(\mathbb T) [[/math]]
where [math]\mathbb T[/math] is the unit circle, satisfying the following equivalent conditions:

  • The rows are pairwise orthogonal.
  • The columns are pairwise orthogonal.
  • The rescaled matrix [math]H/\sqrt{N}[/math] is unitary.
  • The rescaled matrix [math]H^t/\sqrt{N}[/math] is unitary.

Here the fact that the above conditions are equivalent comes from basic linear algebra, and more specifically from the fact that a matrix [math]U\in M_N(\mathbb C)[/math] is a unitary precisely when the rows, or the columns, have norm 1, and are pairwise orthogonal.
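For computational purposes, the above conditions translate into a simple test. Here is a sketch in Python, assuming numpy; the function name is of course ours, chosen for illustration:

```python
import numpy as np

def is_complex_hadamard(H, tol=1e-10):
    """Check: entries on the unit circle, and rows pairwise orthogonal."""
    N = H.shape[0]
    unimodular = np.allclose(np.abs(H), 1, atol=tol)
    # row orthogonality <=> H H^* = N * Id <=> H / sqrt(N) unitary
    orthogonal = np.allclose(H @ H.conj().T, N * np.eye(N), atol=tol)
    return unimodular and orthogonal

N = 4
w = np.exp(2j * np.pi / N)
F4 = w ** np.outer(np.arange(N), np.arange(N))
assert is_complex_hadamard(F4)

# the flat matrix has unimodular entries, but its rows are not orthogonal
assert not is_complex_hadamard(np.ones((2, 2)))
```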


We already know, from the proof of Theorem 7.5, that the Fourier matrix [math]F_N[/math] is a complex Hadamard matrix. There are many other examples of complex Hadamard matrices, and the basic theory of such matrices can be summarized as follows:

Theorem

The class of the [math]N\times N[/math] complex Hadamard matrices is as follows:

  • It contains the Fourier matrix [math]F_N[/math].
  • It is stable under taking tensor products.
  • It is stable under taking transposes, conjugates and adjoints.
  • It is stable under permuting rows, or permuting columns.
  • It is stable under multiplying rows or columns by numbers in [math]\mathbb T[/math].


Proof

All this is elementary, the idea being as follows:


(1) This is something that we already know, from the proof of Theorem 7.5, with the orthogonality coming from the following standard formula:

[[math]] \sum_iw^{ij}=N\delta_{j0} [[/math]]

(2) Assume that [math]H\in M_M(\mathbb T)[/math] and [math]K\in M_N(\mathbb T)[/math] are Hadamard matrices, and consider their tensor product, which in double index notation is as follows:

[[math]] (H\otimes K)_{ia,jb}=H_{ij}K_{ab} [[/math]]

We have then [math]H\otimes K\in M_{MN}(\mathbb T)[/math], and the rows [math]R_{ia}[/math] of this matrix are pairwise orthogonal, as shown by the following computation:

[[math]] \begin{eqnarray*} \lt R_{ia},R_{kc} \gt &=&\sum_{jb}H_{ij}K_{ab}\cdot\bar{H}_{kj}\bar{K}_{cb}\\ &=&\sum_jH_{ij}\bar{H}_{kj}\sum_bK_{ab}\bar{K}_{cb}\\ &=&MN\delta_{ik}\delta_{ac} \end{eqnarray*} [[/math]]


(3) We know that the set formed by the [math]N\times N[/math] complex Hadamard matrices appears as follows, with the intersection being taken inside [math]M_N(\mathbb C)[/math]:

[[math]] X_N=M_N(\mathbb T)\cap\sqrt{N}U_N [[/math]]

The set [math]M_N(\mathbb T)[/math] is stable under the operations in the statement. As for the set [math]\sqrt{N}U_N[/math], here we can use the well-known fact that if a matrix is unitary, [math]U\in U_N[/math], then so is its complex conjugate [math]\bar{U}=(\bar{U}_{ij})[/math], the inversion formulae being as follows:

[[math]] U^*=U^{-1}\quad,\quad U^t=\bar{U}^{-1} [[/math]]

Thus the unitary group [math]U_N[/math] is stable under the following operations:

[[math]] U\to U^t\quad,\quad U\to\bar{U}\quad,\quad U\to U^* [[/math]]

It follows that the above set [math]X_N[/math] is stable as well under these operations, as desired.


(4) This is clear from definitions, because permuting rows or permuting columns leaves invariant both the sets [math]M_N(\mathbb T)[/math] and [math]\sqrt{N}U_N[/math].


(5) This is once again clear from definitions, because multiplying rows or columns by numbers in [math]\mathbb T[/math] leaves invariant both the sets [math]M_N(\mathbb T)[/math] and [math]\sqrt{N}U_N[/math].

In the above result (1,2) are really important, and (3,4,5) are rather technical remarks. As a consequence, coming from (1,2), let us formulate:

Theorem

The following matrices, called generalized Fourier matrices,

[[math]] F_{N_1,\ldots,N_k}=F_{N_1}\otimes\ldots\otimes F_{N_k} [[/math]]
are Hadamard, for any choice of [math]N_1,\ldots,N_k[/math]. In particular the following matrices,

[[math]] W_N=\begin{pmatrix}1&1\\1&-1\end{pmatrix}^{\otimes k} [[/math]]
having size [math]N=2^k[/math], and called Walsh matrices, are all Hadamard.


Proof

This is a consequence of Theorem 7.7, as follows:


(1) The first assertion follows from (1,2) in Theorem 7.7.


(2) The second assertion is a consequence of (1), by taking [math]N_1=\ldots=N_k=2[/math]. Indeed, the generalized Fourier matrix that we obtain is:

[[math]] F_{2,\ldots,2} =F_2^{\otimes k} =\begin{pmatrix}1&1\\1&-1\end{pmatrix}^{\otimes k} [[/math]]

Thus, we are led to the conclusion in the statement.

As an illustration for the above result, the second Walsh matrix, which is an Hadamard matrix having real entries, as is the case with all the Walsh matrices, is as follows:

[[math]] W_4=\begin{pmatrix}1&1&1&1\\ 1&-1&1&-1\\ 1&1&-1&-1\\ 1&-1&-1&1\end{pmatrix} [[/math]]

The generalized Fourier matrices, and in particular the Walsh matrices, have many applications, to questions in coding, radio transmissions, quantum physics, and many more. We refer here for instance to the book of Bengtsson-Życzkowski [3], and to the papers of Björck [4], Haagerup [5], Idel-Wolf [6], Jones [7], Sylvester [8]. We will be back to these matrices later, on several occasions.

7b. Circulant matrices

Let us go back now to the general linear algebra considerations from the beginning of this chapter. We have seen that [math]F_N[/math] diagonalizes in an elegant way the flat matrix [math]\mathbb I_N[/math], and the idea in what follows will be that [math]F_N[/math], or other real or complex Hadamard matrices, can be used in order to deal with other matrices, of [math]\mathbb I_N[/math] type.


A first feature of the flat matrix [math]\mathbb I_N[/math] is that it is circulant, in the following sense:

Definition

A real or complex matrix [math]M[/math] is called circulant if

[[math]] M_{ij}=\xi_{j-i} [[/math]]
for a certain vector [math]\xi[/math], with the indices taken modulo [math]N[/math].

The circulant matrices are beautiful mathematical objects, which appear of course in many serious problems as well. As an example, at [math]N=4[/math], we must have:

[[math]] M=\begin{pmatrix} a&b&c&d\\ d&a&b&c\\ c&d&a&b\\ b&c&d&a \end{pmatrix} [[/math]]

The point now is that, while certainly gently looking, these matrices can be quite diabolic, when it comes to diagonalization, and other problems. For instance, when [math]M[/math] is real, the computations with [math]M[/math] are usually very complicated over the real numbers. Fortunately the complex numbers and the Fourier matrices are there, and we have:

Theorem

For a matrix [math]M\in M_N(\mathbb C)[/math], the following are equivalent:

  • [math]M[/math] is circulant, in the sense that we have, for a certain vector [math]\xi\in\mathbb C^N[/math]:
    [[math]] M_{ij}=\xi_{j-i} [[/math]]
  • [math]M[/math] is Fourier-diagonal, in the sense that, for a certain diagonal matrix [math]Q[/math]:
    [[math]] M=F_NQF_N^* [[/math]]

In addition, if these conditions hold, then [math]\xi,Q[/math] are related by the formula [math]\xi=F_N^*q[/math], where [math]q\in\mathbb C^N[/math] is the column vector formed by the diagonal entries of [math]Q[/math].


Proof

This follows from some basic computations with roots of unity, as follows:


[math](1)\implies(2)[/math] Assuming [math]M_{ij}=\xi_{j-i}[/math], the matrix [math]Q=F_N^*MF_N[/math] is indeed diagonal, as shown by the following computation:

[[math]] \begin{eqnarray*} Q_{ij} &=&\sum_{kl}w^{-ik}M_{kl}w^{lj}\\ &=&\sum_{kl}w^{jl-ik}\xi_{l-k}\\ &=&\sum_{kr}w^{j(k+r)-ik}\xi_r\\ &=&\sum_rw^{jr}\xi_r\sum_kw^{(j-i)k}\\ &=&N\delta_{ij}\sum_rw^{jr}\xi_r \end{eqnarray*} [[/math]]


[math](2)\implies(1)[/math] Assuming [math]Q=diag(q_1,\ldots,q_N)[/math], the matrix [math]M=F_NQF_N^*[/math] is indeed circulant, as shown by the following computation:

[[math]] M_{ij} =\sum_kw^{ik}Q_{kk}w^{-jk} =\sum_kw^{(i-j)k}q_k [[/math]]

To be more precise, in this formula the last term depends only on [math]j-i[/math], and so shows that we have [math]M_{ij}=\xi_{j-i}[/math], with [math]\xi[/math] being the following vector:

[[math]] \xi_i =\sum_kw^{-ik}q_k =(F_N^*q)_i [[/math]]

Thus, we are led to the conclusions in the statement.
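The above correspondence can be checked numerically. Here is a sketch in Python, assuming numpy; note that since [math]F_N/\sqrt{N}[/math] is unitary, the diagonal matrix associated to [math]M=F_NQF_N^*[/math] is recovered as [math]Q=F_N^*MF_N/N^2[/math]:

```python
import numpy as np

N = 5
rng = np.random.default_rng(0)
xi = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# circulant matrix M_{ij} = xi_{j-i}, with indices taken modulo N
i, j = np.meshgrid(np.arange(N), np.arange(N), indexing='ij')
M = xi[(j - i) % N]

w = np.exp(2j * np.pi / N)
F = w ** np.outer(np.arange(N), np.arange(N))

# Q = F^* M F / N^2 is diagonal, and satisfies M = F Q F^*
Q = F.conj().T @ M @ F / N ** 2
assert np.allclose(Q - np.diag(np.diag(Q)), 0)
assert np.allclose(M, F @ Q @ F.conj().T)

# and the first row vector xi is recovered as xi = F^* q
q = np.diag(Q)
assert np.allclose(xi, F.conj().T @ q)
```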

As a basic illustration for the above result, for the circulant matrix [math]M=\mathbb I_N[/math] we recover in this way the diagonalization result from Theorem 7.5, namely:

[[math]] \begin{pmatrix} 1&\ldots&\ldots&1\\ \vdots&&&\vdots\\ \vdots&&&\vdots\\ 1&\ldots&\ldots&1\end{pmatrix}=\frac{1}{N}\,F_N \begin{pmatrix} N\\ &0\\ &&\ddots\\ &&&0\end{pmatrix}F_N^* [[/math]]

The above result is something quite powerful, and very useful, and suggests doing everything in Fourier, when dealing with circulant matrices. And we can use here:

Theorem

The various basic sets of [math]N\times N[/math] circulant matrices are as follows, with the convention that associated to any [math]q\in\mathbb C^N[/math] is the matrix [math]Q=diag(q_1,\ldots,q_N)[/math]:

  • The set of all circulant matrices is:
    [[math]] M_N(\mathbb C)^{circ}=\left\{F_NQF_N^*\Big|q\in\mathbb C^N\right\} [[/math]]
  • The set of all circulant unitary matrices is:
    [[math]] U_N^{circ}=\left\{\frac{1}{N}F_NQF_N^*\Big|q\in\mathbb T^N\right\} [[/math]]
  • The set of all circulant orthogonal matrices is:
    [[math]] O_N^{circ}=\left\{\frac{1}{N}F_NQF_N^*\Big|q\in\mathbb T^N,\bar{q}_i=q_{-i},\forall i\right\} [[/math]]

In addition, in this picture, the first row vector of [math]F_NQF_N^*[/math] is given by [math]\xi=F_N^*q[/math].


Proof

All this follows from Theorem 7.10, as follows:


(1) This assertion, along with the last one, is Theorem 7.10 itself.


(2) This is clear from (1), and from the fact that the rescaled matrix [math]F_N/\sqrt{N}[/math] is unitary, because the eigenvalues of a unitary matrix must be on the unit circle [math]\mathbb T[/math].


(3) This follows from (2), because the matrix is real when [math]\xi_i=\bar{\xi}_i[/math], and in Fourier transform, [math]\xi=F_N^*q[/math], this corresponds to the condition [math]\bar{q}_i=q_{-i}[/math].

There are many other things that can be said about the circulant matrices, along these lines. Importantly, all this can be generalized to the setting of the matrices which are [math](N_1,\ldots,N_k)[/math] patterned, in a certain technical sense, and the matrix which does the job here is the corresponding generalized Fourier matrix, namely:

[[math]] F_{N_1,\ldots,N_k}=F_{N_1}\otimes\ldots\otimes F_{N_k} [[/math]]

We will be back to this later, towards the end of the present book. As a last topic now regarding the circulant matrices, which is somehow one level above the various considerations above, let us discuss the circulant Hadamard matrices. It is convenient to use the following notion, coming from the various results in Theorem 7.7:

Definition

Two complex Hadamard matrices are called equivalent, and we write [math]H\sim K[/math], when it is possible to pass from [math]H[/math] to [math]K[/math] via the following operations:

  • Permuting the rows, or permuting the columns.
  • Multiplying the rows or columns by numbers in [math]\mathbb T[/math].

Here we have not taken into account all the results in Theorem 7.7 when formulating the above definition, because the operations [math]H\to H^t,\bar{H},H^*[/math] are far more subtle than those in (1,2) above, and can complicate things, if included in the equivalence.


The point now is that, up to equivalence, we can put the Fourier matrix [math]F_N[/math] in circulant form. At small values of [math]N[/math], this can be done as follows:

Proposition

The following are circulant and symmetric Hadamard matrices,

[[math]] F_2'=\begin{pmatrix}i&1\\1&i\end{pmatrix}\qquad,\qquad F_3'=\begin{pmatrix}w&1&1\\1&w&1\\1&1&w\end{pmatrix} [[/math]]

[[math]] F_4''=\begin{pmatrix}-1&\nu&1&\nu\\\nu&-1&\nu&1\\1&\nu&-1&\nu\\ \nu&1&\nu&-1\end{pmatrix} [[/math]]
where [math]w=e^{2\pi i/3},\nu=e^{\pi i/4}[/math], equivalent to the Fourier matrices [math]F_2,F_3,F_4[/math].


Proof

The orthogonality between rows being clear, we have here complex Hadamard matrices. The fact that we have an equivalence [math]F_2\sim F_2'[/math] follows from:

[[math]] \begin{pmatrix}1&1\\1&-1\end{pmatrix} \sim\begin{pmatrix}i&i\\1&-1\end{pmatrix} \sim\begin{pmatrix}i&1\\1&i\end{pmatrix} [[/math]]

At [math]N=3[/math] now, the equivalence [math]F_3\sim F_3'[/math] can be constructed as follows:

[[math]] \begin{pmatrix}1&1&1\\1&w&w^2\\1&w^2&w\end{pmatrix} \sim\begin{pmatrix}1&1&w\\1&w&1\\w&1&1\end{pmatrix} \sim\begin{pmatrix}w&1&1\\1&w&1\\1&1&w\end{pmatrix} [[/math]]

As for the case [math]N=4[/math], here the equivalence [math]F_4\sim F_4''[/math] can be constructed as follows, where we use the logarithmic notation [math][k]_s=e^{2\pi ki/s}[/math], with respect to [math]s=8[/math]:

[[math]] \begin{bmatrix}0&0&0&0\\0&2&4&6\\0&4&0&4\\0&6&4&2\end{bmatrix}_8 \sim\begin{bmatrix}0&1&4&1\\1&4&1&0\\4&1&0&1\\1&0&1&4\end{bmatrix}_8 \sim\begin{bmatrix}4&1&0&1\\1&4&1&0\\0&1&4&1\\1&0&1&4\end{bmatrix}_8 [[/math]]

Thus, the Fourier matrices [math]F_2,F_3,F_4[/math] can be put indeed in circulant form.

We will explain later the reasons for denoting the above matrix [math]F_4''[/math], instead of [math]F_4'[/math], the idea being that [math]F_4'[/math] will be a matrix belonging to a certain series.


In order to discuss now the general case, we will use a technical method for dealing with the circulant matrices, namely Björck's cyclic root formalism [4], as follows:

Theorem

Assume that a matrix [math]H\in M_N(\mathbb T)[/math] is circulant, [math]H_{ij}=\gamma_{j-i}[/math]. Then [math]H[/math] is a complex Hadamard matrix if and only if the vector

[[math]] z=(z_0,z_1,\ldots,z_{N-1}) [[/math]]
given by [math]z_i=\gamma_i/\gamma_{i-1}[/math] satisfies the following equations:

[[math]] \begin{eqnarray*} z_0+z_1+\ldots+z_{N-1}&=&0\\ z_0z_1+z_1z_2+\ldots+z_{N-1}z_0&=&0\\ \ldots\\ z_0z_1\ldots z_{N-2}+\ldots+z_{N-1}z_0\ldots z_{N-3}&=&0\\ z_0z_1\ldots z_{N-1}&=&1 \end{eqnarray*} [[/math]]
If so is the case, we say that [math]z=(z_0,\ldots,z_{N-1})[/math] is a cyclic [math]N[/math]-root.


Proof

This follows from a direct computation, the idea being that, with [math]H_{ij}=\gamma_{j-i}[/math] as above, the orthogonality conditions between the rows are best written in terms of the variables [math]z_i=\gamma_i/\gamma_{i-1}[/math], and correspond to the equations in the statement.
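The cyclic root equations are straightforward to verify by computer. Here is a sketch in Python, assuming numpy; the function name is ours, for illustration. For instance, the circulant matrix [math]F_2'[/math] from before has first row [math]\gamma=(i,1)[/math], giving [math]z_0=\gamma_0/\gamma_{-1}=i[/math] and [math]z_1=\gamma_1/\gamma_0=-i[/math]:

```python
import numpy as np

def is_cyclic_root(z, tol=1e-10):
    """Check Bjorck's cyclic N-root equations for a vector z."""
    N = len(z)
    # for k = 1, ..., N-1: sums of products of k cyclically consecutive entries vanish
    for k in range(1, N):
        s = sum(np.prod([z[(i + t) % N] for t in range(k)]) for i in range(N))
        if abs(s) > tol:
            return False
    # and the full product equals 1
    return abs(np.prod(z) - 1) < tol

assert is_cyclic_root(np.array([1j, -1j]))       # cyclic 2-root from F_2'
assert not is_cyclic_root(np.array([1.0, 1.0]))  # fails the sum condition
```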

Now back to the Fourier matrices, we have the following result:

Theorem

Given [math]N\in\mathbb N[/math], construct the following complex numbers:

[[math]] \nu=e^{\pi i/N}\quad,\quad q=\nu^{N-1}\quad,\quad w=\nu^2 [[/math]]
We have then a cyclic [math]N[/math]-root, given by the following formula,

[[math]] (q,qw,qw^2,\ldots,qw^{N-1}) [[/math]]
and the corresponding complex Hadamard matrix [math]F_N'[/math] is circulant and symmetric, and equivalent to the Fourier matrix [math]F_N[/math].


Proof

Given two numbers [math]q,w\in\mathbb T[/math], let us find out when [math](q,qw,qw^2,\ldots,qw^{N-1})[/math] is a cyclic root. We have two conditions to be verified, as follows:


(1) In order for the [math]=0[/math] equations in Theorem 7.14 to be satisfied, the value of [math]q[/math] is irrelevant, and [math]w[/math] must be a primitive [math]N[/math]-root of unity.


(2) As for the [math]=1[/math] equation in Theorem 7.14, this states that we must have:

[[math]] q^Nw^{\frac{N(N-1)}{2}}=1 [[/math]]

Thus, we must have [math]q^N=(-1)^{N-1}[/math], so with the values of [math]q,w\in\mathbb T[/math] in the statement, we have a cyclic [math]N[/math]-root. Now construct [math]H_{ij}=\gamma_{j-i}[/math] as in Theorem 7.14. We have:

[[math]] \begin{eqnarray*} \gamma_k=\gamma_{-k} &\iff&q^{k+1}w^{\frac{k(k+1)}{2}}=q^{-k+1}w^{\frac{k(k-1)}{2}}\\ &\iff&q^{2k}w^k=1\\ &\iff&q^2=w^{-1} \end{eqnarray*} [[/math]]


But this latter condition holds indeed, because we have:

[[math]] q^2 =\nu^{2N-2} =\nu^{-2} =w^{-1} [[/math]]

We conclude that our circulant matrix [math]H[/math] is symmetric as well, as claimed. It remains to construct an equivalence [math]H\sim F_N[/math]. In order to do this, observe that, due to our conventions [math]q=\nu^{N-1},w=\nu^2[/math], the first row vector of [math]H[/math] is given by:

[[math]] \begin{eqnarray*} \gamma_k &=&q^{k+1}w^{\frac{k(k+1)}{2}}\\ &=&\nu^{(N-1)(k+1)}\nu^{k(k+1)}\\ &=&\nu^{(N+k-1)(k+1)} \end{eqnarray*} [[/math]]


Thus, the entries of [math]H[/math] are given by the following formula:

[[math]] \begin{eqnarray*} H_{-i,j} &=&H_{0,i+j}\\ &=&\nu^{(N+i+j-1)(i+j+1)}\\ &=&\nu^{i^2+j^2+2ij+Ni+Nj+N-1}\\ &=&\nu^{N-1}\cdot\nu^{i^2+Ni}\cdot\nu^{j^2+Nj}\cdot\nu^{2ij} \end{eqnarray*} [[/math]]


With this formula in hand, we can now finish. Indeed, the matrix [math]H=(H_{ij})[/math] is equivalent to the following matrix:

[[math]] H'=(H_{-i,j}) [[/math]]

Now regarding this latter matrix [math]H'[/math], observe that in the above formula, the factors [math]\nu^{N-1}[/math], [math]\nu^{i^2+Ni}[/math], [math]\nu^{j^2+Nj}[/math] correspond respectively to a global multiplication by a scalar, and to row and column multiplications by scalars. Thus [math]H'[/math] is equivalent to the matrix [math]H''[/math] obtained from it by deleting these factors. But this latter matrix, given by [math]H''_{ij}=\nu^{2ij}[/math] with [math]\nu=e^{\pi i/N}[/math], is precisely the Fourier matrix [math]F_N[/math], and we are done.

As an illustration, at [math]N=2,3[/math] we obtain the old matrices [math]F_2',F_3'[/math]. As for the case [math]N=4[/math], here we obtain the following matrix, with [math]\nu=e^{\pi i/4}[/math]:

[[math]] F_4'=\begin{pmatrix} \nu^3&1&\nu^7&1\\ 1&\nu^3&1&\nu^7\\ \nu^7&1&\nu^3&1\\ 1&\nu^7&1&\nu^3 \end{pmatrix} [[/math]]

This matrix is equivalent to the matrix [math]F_4''[/math] from Proposition 7.13, with the equivalence [math]F_4'\sim F_4''[/math] being obtained by multiplying everything by the number [math]\nu=e^{\pi i/4}[/math].
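The construction in the above theorem is concrete enough to be implemented directly. Here is a sketch in Python, assuming numpy, building the circulant matrix from its first row [math]\gamma_k=q^{k+1}w^{k(k+1)/2}[/math], and checking the claimed properties:

```python
import numpy as np

def circulant_fourier(N):
    """Circulant, symmetric form F_N' of the Fourier matrix, via gamma_k = q^{k+1} w^{k(k+1)/2}."""
    nu = np.exp(1j * np.pi / N)
    q, w = nu ** (N - 1), nu ** 2
    k = np.arange(N)
    gamma = q ** (k + 1) * w ** ((k * (k + 1)) // 2)   # first row
    i, j = np.meshgrid(k, k, indexing='ij')
    return gamma[(j - i) % N]                          # H_{ij} = gamma_{j-i}

for N in range(2, 8):
    H = circulant_fourier(N)
    assert np.allclose(H, H.T)                         # symmetric
    assert np.allclose(np.abs(H), 1)                   # entries in T
    assert np.allclose(H @ H.conj().T, N * np.eye(N))  # complex Hadamard
```

At [math]N=2,3,4[/math] this reproduces the matrices [math]F_2',F_3',F_4'[/math] discussed above.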


There are many other things that can be said about the circulant Hadamard matrices, and about the Fourier matrices, and we refer here to Björck [4] and Haagerup [5].

7c. Bistochastic matrices

Getting back now to the main idea behind what we are doing, namely building on the relation between [math]\mathbb I_N[/math] and [math]F_N[/math], let us study now the class of bistochastic matrices:

Definition

A square matrix [math]M\in M_N(\mathbb C)[/math] is called bistochastic if each row and each column sum up to the same number:

[[math]] \begin{matrix} M_{11}&\ldots&M_{1N}&\to&\lambda\\ \vdots&&\vdots\\ M_{N1}&\ldots&M_{NN}&\to&\lambda\\ \downarrow&&\downarrow\\ \lambda&&\lambda \end{matrix} [[/math]]
If this happens only for the rows, or only for the columns, the matrix is called row-stochastic, respectively column-stochastic.

As the name indicates, these matrices are useful in statistics, with the case of the matrices having entries in [math][0,1][/math], which sum up to [math]\lambda=1[/math], being the important one.


As a basic example of a bistochastic matrix, we have of course the flat matrix [math]\mathbb I_N[/math]. In fact, the various above notions of stochasticity are closely related to [math]\mathbb I_N[/math], or rather to the all-one vector [math]\xi[/math] that the matrix [math]\mathbb I_N/N[/math] projects on, in the following way:

Proposition

Let [math]M\in M_N(\mathbb C)[/math] be a square matrix.

  • [math]M[/math] is row stochastic, with sums [math]\lambda[/math], when [math]M\xi=\lambda\xi[/math].
  • [math]M[/math] is column stochastic, with sums [math]\lambda[/math], when [math]M^t\xi=\lambda\xi[/math].
  • [math]M[/math] is bistochastic, with sums [math]\lambda[/math], when [math]M\xi=M^t\xi=\lambda\xi[/math].


Proof

All these assertions are clear from definitions, because when multiplying a matrix by [math]\xi[/math], we obtain the vector formed by the row sums.
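The eigenvector characterization gives a one-line stochasticity test. Here is a sketch in Python, assuming numpy; the helper function is ours, for illustration:

```python
import numpy as np

def row_sums_constant(M):
    """Row-stochastic with sums lam <=> M @ xi = lam * xi, with xi the all-one vector."""
    xi = np.ones(M.shape[0])
    v = M @ xi                      # vector of row sums
    return np.allclose(v, v[0]), v[0]

M = np.array([[0.2, 0.8],
              [0.5, 0.5]])
ok, lam = row_sums_constant(M)
assert ok and np.isclose(lam, 1.0)  # row-stochastic, with sums 1

# applying the test to M^t checks the column sums instead
okT, _ = row_sums_constant(M.T)
assert not okT                      # M is not column-stochastic
```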

As an observation here, we can reformulate if we want the above statement in a purely matrix-theoretic form, by using the flat matrix [math]\mathbb I_N[/math], as follows:

Proposition

Let [math]M\in M_N(\mathbb C)[/math] be a square matrix.

  • [math]M[/math] is row stochastic, with sums [math]\lambda[/math], when [math]M\mathbb I_N=\lambda\mathbb I_N[/math].
  • [math]M[/math] is column stochastic, with sums [math]\lambda[/math], when [math]\mathbb I_NM=\lambda\mathbb I_N[/math].
  • [math]M[/math] is bistochastic, with sums [math]\lambda[/math], when [math]M\mathbb I_N=\mathbb I_NM=\lambda\mathbb I_N[/math].


Proof

This follows from Proposition 7.17, and from the fact that both the rows and the columns of the flat matrix [math]\mathbb I_N[/math] are copies of the all-one vector [math]\xi[/math].

In what follows we will be mainly interested in the unitary bistochastic matrices, which are quite interesting objects. These do not exactly cover the flat matrix [math]\mathbb I_N[/math], but cover instead the following related matrix, which appears in many linear algebra questions:

[[math]] K_N=\frac{1}{N}\begin{pmatrix} 2-N&&2\\ &\ddots\\ 2&&2-N \end{pmatrix} [[/math]]

As a first result, regarding such matrices, we have the following statement:

Theorem

For a unitary matrix [math]U\in U_N[/math], the following conditions are equivalent:

  • [math]U[/math] is bistochastic, with sums [math]\lambda[/math].
  • [math]U[/math] is row stochastic, with sums [math]\lambda[/math], and [math]|\lambda|=1[/math].
  • [math]U[/math] is column stochastic, with sums [math]\lambda[/math], and [math]|\lambda|=1[/math].


Proof

By using a symmetry argument we just need to prove [math](1)\iff(2)[/math], and both the implications are elementary, as follows:


[math](1)\implies(2)[/math] If we denote by [math]U_1,\ldots,U_N\in\mathbb C^N[/math] the rows of [math]U[/math], we have indeed:

[[math]] \begin{eqnarray*} 1 &=&\sum_i \lt U_1,U_i \gt \\ &=&\sum_jU_{1j}\sum_i\bar{U}_{ij}\\ &=&\sum_jU_{1j}\cdot\bar{\lambda}\\ &=&|\lambda|^2 \end{eqnarray*} [[/math]]


[math](2)\implies(1)[/math] Consider the all-one vector [math]\xi=(1)_i\in\mathbb C^N[/math]. The fact that [math]U[/math] is row-stochastic with sums [math]\lambda[/math] reads:

[[math]] \begin{eqnarray*} \sum_jU_{ij}=\lambda,\forall i &\iff&\sum_jU_{ij}\xi_j=\lambda\xi_i,\forall i\\ &\iff&U\xi=\lambda\xi \end{eqnarray*} [[/math]]


Also, the fact that [math]U[/math] is column-stochastic with sums [math]\lambda[/math] reads:

[[math]] \begin{eqnarray*} \sum_iU_{ij}=\lambda,\forall j &\iff&\sum_jU_{ij}\xi_i=\lambda\xi_j,\forall j\\ &\iff&U^t\xi=\lambda\xi \end{eqnarray*} [[/math]]


We must prove that the first condition implies the second one, provided that the row sum [math]\lambda[/math] satisfies [math]|\lambda|=1[/math]. But this follows from the following computation:

[[math]] \begin{eqnarray*} U\xi=\lambda\xi &\implies&U^*U\xi=\lambda U^*\xi\\ &\implies&\xi=\lambda U^*\xi\\ &\implies&\xi=\bar{\lambda}U^t\xi\\ &\implies&U^t\xi=\lambda\xi \end{eqnarray*} [[/math]]


Thus, we have proved both the implications, and we are done.

The unitary bistochastic matrices are stable under a number of operations, and in particular under taking products, and we have the following result:

Theorem

The real and complex bistochastic groups, which are the sets

[[math]] B_N\subset O_N\quad,\quad C_N\subset U_N [[/math]]
consisting of the matrices which are bistochastic, with sums [math]1[/math], are isomorphic to [math]O_{N-1}[/math], [math]U_{N-1}[/math].


Proof

Let us pick a unitary matrix [math]F\in U_N[/math] satisfying the following condition, where [math]e_0,\ldots,e_{N-1}[/math] is the standard basis of [math]\mathbb C^N[/math], and where [math]\xi[/math] is the all-one vector:

[[math]] Fe_0=\frac{1}{\sqrt{N}}\xi [[/math]]

Observe that such matrices [math]F\in U_N[/math] exist indeed, the basic example being the normalized Fourier matrix [math]F_N/\sqrt{N}[/math]. We have then, by using the above property of [math]F[/math]:

[[math]] \begin{eqnarray*} u\xi=\xi &\iff&uFe_0=Fe_0\\ &\iff&F^*uFe_0=e_0\\ &\iff&F^*uF=diag(1,w) \end{eqnarray*} [[/math]]


Thus, we have isomorphisms as in the statement, given by [math]u\to w[/math], where [math]w[/math] is defined by [math]F^*uF=diag(1,w)[/math].
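The proof can be made concrete numerically. The sketch below (assuming NumPy) takes [math]N=4[/math], with [math]F[/math] the normalized Fourier matrix, and with [math]u[/math] a bistochastic orthogonal matrix, here chosen circulant, so that the block [math]w[/math] comes out diagonal:

```python
import numpy as np

N = 4
w4 = np.exp(2j * np.pi / N)   # root of unity for the Fourier matrix
F = np.array([[w4 ** (j * k) for k in range(N)] for j in range(N)]) / np.sqrt(N)

# F sends e_0 to xi / sqrt(N), as required in the proof
assert np.allclose(F[:, 0], np.ones(N) / np.sqrt(N))

# A bistochastic orthogonal matrix: the circulant Hadamard matrix, rescaled
u = np.array([[-1, 1, 1, 1],
              [1, -1, 1, 1],
              [1, 1, -1, 1],
              [1, 1, 1, -1]]) / 2

# F^* u F should be diag(1, w), with w orthogonal; since u is circulant,
# w is in fact diagonal here, with the eigenvalues of u on the diagonal
D = F.conj().T @ u @ F
assert np.allclose(D, np.diag([1, -1, -1, -1]))
```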

We will be back to [math]B_N,C_N[/math] later in this book, when doing group theory. In relation now with the Hadamard matrices, as a first remark, the first Walsh matrix [math]W_2[/math] looks better in complex bistochastic form, modulo the standard equivalence relation:

[[math]] \begin{pmatrix}1&1\\1&-1\end{pmatrix} \sim\begin{pmatrix}i&i\\1&-1\end{pmatrix} \sim\begin{pmatrix}i&1\\1&i\end{pmatrix} [[/math]]

The second Walsh matrix [math]W_4=W_2\otimes W_2[/math] can be put as well in complex bistochastic form, and also looks better in this form, as shown by:

[[math]] \ \begin{pmatrix}1&1&1&1\\ 1&-1&1&-1\\ 1&1&-1&-1\\ 1&-1&-1&1\end{pmatrix} \sim\begin{pmatrix} 1&1&1&1\\ 1&1&-1&-1\\ 1&-1&1&-1\\ 1&-1&-1&1 \end{pmatrix} [[/math]]

[[math]] \sim\begin{pmatrix} 1&-1&-1&-1\\ 1&-1&1&1\\ 1&1&-1&1\\ 1&1&1&-1 \end{pmatrix} \sim\begin{pmatrix} -1&1&1&1\\ 1&-1&1&1\\ 1&1&-1&1\\ 1&1&1&-1 \end{pmatrix} [[/math]]

In fact, by using the above formulae, we are led to the following statement:

Proposition

All the Walsh matrices, [math]W_N=W_2^{\otimes n}[/math] with [math]N=2^n[/math], can be put in bistochastic form, up to the standard equivalence relation, as follows:

  • The matrices [math]W_N[/math] with [math]N=4^n[/math] admit a real bistochastic form, namely:
    [[math]] W_N\sim\begin{pmatrix} -1&1&1&1\\ 1&-1&1&1\\ 1&1&-1&1\\ 1&1&1&-1 \end{pmatrix}^{\otimes n} [[/math]]
  • The matrices [math]W_N[/math] with [math]N=2\times4^n[/math] admit a complex bistochastic form, namely:
    [[math]] W_N\sim\begin{pmatrix}i&1\\1&i\end{pmatrix}\otimes\begin{pmatrix} -1&1&1&1\\ 1&-1&1&1\\ 1&1&-1&1\\ 1&1&1&-1 \end{pmatrix}^{\otimes n} [[/math]]


Show Proof

This follows indeed from the above discussion.
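The stated properties of the two tensor formulae are easy to verify numerically. The following sketch (assuming NumPy) checks, at [math]n=1[/math], that both candidates are complex Hadamard matrices with constant row and column sums; the equivalence with the Walsh matrices themselves is not checked here:

```python
import numpy as np

K4 = np.array([[-1, 1, 1, 1],
               [1, -1, 1, 1],
               [1, 1, -1, 1],
               [1, 1, 1, -1]])          # real bistochastic form, sums = 2
C2 = np.array([[1j, 1],
               [1, 1j]])                # complex bistochastic form of W_2, sums = 1+i

def is_hadamard(H):
    N = H.shape[0]
    return (np.allclose(np.abs(H), 1)
            and np.allclose(H @ H.conj().T, N * np.eye(N)))

def has_constant_sums(H):
    s = H.sum(axis=1)
    return np.allclose(s, s[0]) and np.allclose(H.sum(axis=0), s[0])

W16 = np.kron(K4, K4)   # case (1) at n = 1, size 16
W8 = np.kron(C2, K4)    # case (2) at n = 1, size 8

assert is_hadamard(W16) and has_constant_sums(W16)
assert is_hadamard(W8) and has_constant_sums(W8)
```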

Regarding now the question of putting the general Hadamard matrices, real or complex, in complex bistochastic form, things here are tricky. We first have:

Theorem

The class of the bistochastic complex Hadamard matrices has the following properties:

  • It contains the circulant symmetric forms [math]F_N'[/math] of the Fourier matrices [math]F_N[/math].
  • It is stable under permuting rows and columns.
  • It is stable under taking tensor products.

In particular, any generalized Fourier matrix [math]F_{N_1,\ldots,N_k}=F_{N_1}\otimes\ldots\otimes F_{N_k}[/math] can be put in bistochastic and symmetric form, up to the equivalence relation.


Show Proof

We have several things to be proved, the idea being as follows:


(1) We know from the above that any Fourier matrix [math]F_N[/math] has a circulant and symmetric form [math]F_N'[/math]. But since circulant implies bistochastic, this gives the result.


(2) The claim regarding permuting rows and columns is clear.


(3) Assuming that [math]H,K[/math] are bistochastic, with sums [math]\lambda,\mu[/math], we have:

[[math]] \begin{eqnarray*} \sum_{ia}(H\otimes K)_{ia,jb} &=&\sum_{ia}H_{ij}K_{ab}\\ &=&\sum_iH_{ij}\sum_aK_{ab}\\ &=&\lambda\mu \end{eqnarray*} [[/math]]


We have as well the following computation:

[[math]] \begin{eqnarray*} \sum_{jb}(H\otimes K)_{ia,jb} &=&\sum_{jb}H_{ij}K_{ab}\\ &=&\sum_jH_{ij}\sum_bK_{ab}\\ &=&\lambda\mu \end{eqnarray*} [[/math]]


Thus, the matrix [math]H\otimes K[/math] is bistochastic as well.


(4) As for the last assertion, this follows from (1,2,3).
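The two sum computations in (3) can be instantiated numerically; here is a sketch (assuming NumPy), with [math]H[/math] a [math]4\times4[/math] bistochastic Hadamard matrix with sums [math]\lambda=2[/math], and [math]K[/math] a [math]2\times2[/math] complex one with sums [math]\mu=1+i[/math]:

```python
import numpy as np

H = np.array([[-1, 1, 1, 1],
              [1, -1, 1, 1],
              [1, 1, -1, 1],
              [1, 1, 1, -1]])      # bistochastic, with sums lam = 2
K = np.array([[1j, 1],
              [1, 1j]])            # bistochastic, with sums mu = 1 + i
lam, mu = 2, 1 + 1j

T = np.kron(H, K)
assert np.allclose(T.sum(axis=1), lam * mu)   # all row sums equal lam * mu
assert np.allclose(T.sum(axis=0), lam * mu)   # all column sums equal lam * mu
```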

In general now, putting an arbitrary complex Hadamard matrix in bistochastic form can be theoretically done, according to a general theorem of Idel-Wolf [6]. The proof of this latter theorem is however based on a quite advanced, and non-explicit argument, coming from symplectic geometry, and there are many interesting open questions here.

7d. Hadamard conjecture

As a final topic for this chapter, let us discuss the real Hadamard matrices. The definition here, going back to 19th century work of Sylvester [8], is as follows:

Definition

A real Hadamard matrix is a square binary matrix,

[[math]] H\in M_N(\pm1) [[/math]]
whose rows are pairwise orthogonal, with respect to the scalar product on [math]\mathbb R^N[/math].

Observe that we do not really need real numbers in order to talk about the Hadamard matrices, because the orthogonality condition tells us that, when comparing two rows, the number of matchings should equal the number of mismatchings.


As a first result regarding such matrices, we have:

Proposition

For a square matrix [math]H\in M_N(\pm1)[/math], the following are equivalent:

  • The rows of [math]H[/math] are pairwise orthogonal, and so [math]H[/math] is Hadamard.
  • The columns of [math]H[/math] are pairwise orthogonal, and so [math]H^t[/math] is Hadamard.
  • The rescaled matrix [math]U=H/\sqrt{N}[/math] is orthogonal, [math]U\in O_N[/math].


Show Proof

This is something that we already know for the complex Hadamard matrices, with the orthogonal group [math]O_N[/math] being replaced by the unitary group [math]U_N[/math]. In the real case the proof is similar, with everything coming from definitions, and linear algebra.
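The three equivalent conditions can be checked at once on a small example; the sketch below (assuming NumPy) does this for the Walsh matrix [math]W_4[/math]:

```python
import numpy as np

H = np.array([[1, 1, 1, 1],
              [1, -1, 1, -1],
              [1, 1, -1, -1],
              [1, -1, -1, 1]])     # the Walsh matrix W_4
N = H.shape[0]

assert np.allclose(H @ H.T, N * np.eye(N))   # (1) rows pairwise orthogonal
assert np.allclose(H.T @ H, N * np.eye(N))   # (2) columns pairwise orthogonal
U = H / np.sqrt(N)
assert np.allclose(U @ U.T, np.eye(N))       # (3) U = H / sqrt(N) is orthogonal
```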

As an abstract consequence of the above result, let us record:

Theorem

The set of the [math]N\times N[/math] Hadamard matrices is

[[math]] Y_N=M_N(\pm 1)\cap\sqrt{N}O_N [[/math]]
where [math]O_N[/math] is the orthogonal group, the intersection being taken inside [math]M_N(\mathbb R)[/math].


Show Proof

This follows from Proposition 7.24, which tells us that an arbitrary matrix [math]H\in M_N(\pm1)[/math] belongs to [math]Y_N[/math] if and only if it belongs to [math]\sqrt{N}O_N[/math].

As a conclusion here, the set [math]Y_N[/math] that we are interested in appears as a kind of set of “special rational points” of the real algebraic manifold [math]\sqrt{N}O_N[/math]. Moving now forward, as before in the complex matrix case, it is convenient to introduce:

Definition

Two real Hadamard matrices are called equivalent, and we write [math]H\sim K[/math], when it is possible to pass from [math]H[/math] to [math]K[/math] via the following operations:

  • Permuting the rows, or the columns.
  • Multiplying the rows or columns by [math]-1[/math].

Observe that we do not include the transposition operation [math]H\to H^t[/math] in our list of allowed operations. This is because Proposition 7.24, while looking quite elementary, rests on a deep linear algebra fact, namely that the transpose of an orthogonal matrix is orthogonal as well, and this can produce complications later on.


Let us do now some classification work. Here is the result at [math]N=4[/math]:

Proposition

There is only one Hadamard matrix at [math]N=4[/math], namely

[[math]] W_4=W_2\otimes W_2 [[/math]]
up to the standard equivalence relation for such matrices.


Show Proof

Consider an Hadamard matrix [math]H\in M_4(\pm1)[/math], assumed to be dephased:

[[math]] H=\begin{pmatrix}1&1&1&1\\ 1&a&b&c\\ 1&d&e&f\\ 1&g&h&i\end{pmatrix} [[/math]]

By orthogonality of the first 2 rows we must have [math]\{a,b,c\}=\{-1,-1,1\}[/math], and so by permuting the last 3 columns, we can further assume that our matrix is as follows:

[[math]] H=\begin{pmatrix}1&1&1&1\\ 1&-1&1&-1\\ 1&m&n&o\\ 1&p&q&r\end{pmatrix} [[/math]]

By orthogonality of the first 2 columns we must have [math]\{m,p\}=\{-1,1\}[/math], and so by permuting the last 2 rows, we can further assume that our matrix is as follows:

[[math]] H=\begin{pmatrix}1&1&1&1\\ 1&-1&1&-1\\ 1&1&x&y\\ 1&-1&z&t\end{pmatrix} [[/math]]

Now from the orthogonality of the rows and columns we obtain [math]x=y=-1[/math], and then [math]z=-1,t=1[/math]. Thus, up to equivalence we have [math]H=W_4[/math], as claimed.
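The uniqueness argument above can be cross-checked by brute force. The sketch below (assuming NumPy) enumerates all [math]2^9=512[/math] dephased candidates, and finds exactly 6 Hadamard matrices, corresponding to the [math]3![/math] orderings of the three non-constant rows [math](1,1,-1,-1)[/math], [math](1,-1,1,-1)[/math], [math](1,-1,-1,1)[/math], all equivalent to [math]W_4[/math]:

```python
import numpy as np
from itertools import product

def is_hadamard(H):
    N = H.shape[0]
    return np.array_equal(H @ H.T, N * np.eye(N, dtype=int))

# Enumerate all dephased candidates: first row and first column all 1,
# the remaining 3x3 block arbitrary in {-1, 1}
solutions = []
for signs in product([1, -1], repeat=9):
    H = np.ones((4, 4), dtype=int)
    H[1:, 1:] = np.array(signs).reshape(3, 3)
    if is_hadamard(H):
        solutions.append(H)

assert len(solutions) == 6   # the 3! orderings of the three non-constant rows
```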

The case [math]N=5[/math] is excluded, because the orthogonality condition forces [math]N\in 2\mathbb N[/math]. The point now is that the case [math]N=6[/math] is excluded as well, because we have:

Proposition

The size of an Hadamard matrix must be

[[math]] N\in\{2\}\cup 4\mathbb N [[/math]]
with this coming from the orthogonality condition between the first [math]3[/math] rows.


Show Proof

By permuting the rows and columns, or by multiplying them by [math]-1[/math], so as to rearrange the first 3 rows, we can always assume that our matrix looks as follows:

[[math]] H=\begin{pmatrix} 1\ldots\ldots 1&1\ldots\ldots 1&1\ldots\ldots 1&1\ldots\ldots 1\\ 1\ldots\ldots 1&1\ldots\ldots 1&-1\ldots -1&-1\ldots -1\\ 1\ldots\ldots 1&-1\ldots -1&1\ldots\ldots 1&-1\ldots -1\\ \underbrace{\ldots\ldots\ldots}_x&\underbrace{\ldots\ldots\ldots}_y&\underbrace{\ldots\ldots\ldots}_z&\underbrace{\ldots\ldots\ldots}_t \end{pmatrix} [[/math]]

Now if we denote by [math]x,y,z,t[/math] the sizes of the 4 block columns, as indicated, the orthogonality conditions between the first 3 rows give the following system of equations:

[[math]] (1\perp 2)\quad:\quad x+y=z+t [[/math]]

[[math]] (1\perp 3)\quad:\quad x+z=y+t [[/math]]

[[math]] (2\perp 3)\quad:\quad x+t=y+z [[/math]]

The numbers [math]x,y,z,t[/math] being such that the average of any two equals the average of the other two, and so equals the global average, the solution of our system is:

[[math]] x=y=z=t [[/math]]

Thus the matrix size [math]N=x+y+z+t[/math] must be a multiple of 4, as claimed.
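The block-size argument can be replayed by machine. The sketch below enumerates all nonnegative [math](x,y,z,t)[/math] with [math]x+y+z+t=N[/math] satisfying the three orthogonality equations, confirming that solutions exist only when [math]4|N[/math], and are then forced to [math]x=y=z=t=N/4[/math] (the case [math]N=2[/math], where there is no third row, is not covered by this argument):

```python
from itertools import product

def block_solutions(N):
    """Nonnegative (x, y, z, t) with x + y + z + t = N making the first
    three rows pairwise orthogonal, via the equations in the proof."""
    sols = []
    for x, y, z in product(range(N + 1), repeat=3):
        t = N - x - y - z
        if t < 0:
            continue
        if x + y == z + t and x + z == y + t and x + t == y + z:
            sols.append((x, y, z, t))
    return sols

assert block_solutions(6) == []               # no 6x6 Hadamard matrix
assert block_solutions(12) == [(3, 3, 3, 3)]  # the unique solution at N = 12
assert all(block_solutions(N) == [] for N in range(3, 20) if N % 4 != 0)
```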

The above result, and various other findings, suggest the following conjecture:

\begin{conjecture}[Hadamard Conjecture (HC)] There is at least one Hadamard matrix

[[math]] H\in M_N(\pm1) [[/math]]

for any integer [math]N\in 4\mathbb N[/math]. \end{conjecture} This conjecture, going back to the 19th century, is one of the most beautiful statements in combinatorics, linear algebra, and mathematics in general. Quite remarkably, the numeric verification so far goes up to the number of the beast:

[[math]] \mathfrak N=666 [[/math]]

Our purpose now will be that of gathering some evidence for this conjecture. At [math]N=4,8[/math] we have the Walsh matrices [math]W_4,W_8[/math]. Thus, the next existence problem comes at [math]N=12[/math]. And here, we can use the following key construction, due to Paley:

Theorem

Let [math]q=p^r[/math] be an odd prime power, define

[[math]] \chi:\mathbb F_q\to\{-1,0,1\} [[/math]]
by [math]\chi(0)=0[/math], [math]\chi(a)=1[/math] if [math]a=b^2[/math] for some [math]b\neq0[/math], and [math]\chi(a)=-1[/math] otherwise, and finally set

[[math]] Q_{ab}=\chi(a-b) [[/math]]
We have then constructions of Hadamard matrices, as follows:

  • Paley [math]1[/math]: if [math]q=3(4)[/math] we have a matrix of size [math]N=q+1[/math], as follows:
    [[math]] P_N^1=1+\begin{pmatrix} 0&1&\ldots&1\\ -1\\ \vdots&&Q\\ -1 \end{pmatrix} [[/math]]
  • Paley [math]2[/math]: if [math]q=1(4)[/math] we have a matrix of size [math]N=2q+2[/math], as follows:
    [[math]] P_N^2=\begin{pmatrix} 0&1&\ldots&1\\ 1\\ \vdots&&Q\\ 1 \end{pmatrix}\quad:\quad 0\to\begin{pmatrix}1&-1\\ -1&-1\end{pmatrix}\quad,\quad\pm1\to\pm\begin{pmatrix}1&1\\1&-1\end{pmatrix} [[/math]]

These matrices are skew-symmetric [math](H+H^t=2)[/math], respectively symmetric [math](H=H^t)[/math].


Show Proof

In order to simplify the presentation, we will denote by [math]1[/math] all the identity matrices, of any size, and by [math]\mathbb I[/math] all the rectangular all-one matrices, of any size as well. It is elementary to check that the matrix [math]Q_{ab}=\chi(a-b)[/math] has the following properties:

[[math]] QQ^t=q1-\mathbb I\quad,\quad Q\mathbb I=\mathbb IQ=0 [[/math]]

In addition, we have the following formulae, which are elementary as well, coming from the fact that [math]-1[/math] is a square in [math]\mathbb F_q[/math] precisely when [math]q=1(4)[/math]:

[[math]] q=1(4)\implies Q=Q^t\ \ \, [[/math]]

[[math]] q=3(4)\implies Q=-Q^t [[/math]]

With these observations in hand, the proof goes as follows:


(1) With our conventions for the symbols [math]1[/math] and [math]\mathbb I[/math], the matrix in the statement is:

[[math]] P_N^1=\begin{pmatrix}1&\mathbb I\\ -\mathbb I&1+Q\end{pmatrix} [[/math]]

With this formula in hand, the Hadamard matrix condition follows from:

[[math]] \begin{eqnarray*} P_N^1(P_N^1)^t &=&\begin{pmatrix}1&\mathbb I\\ -\mathbb I&1+Q\end{pmatrix}\begin{pmatrix}1&-\mathbb I\\ \mathbb I&1-Q\end{pmatrix}\\ &=&\begin{pmatrix}N&0\\ 0&\mathbb I+1-Q^2\end{pmatrix}\\ &=&\begin{pmatrix}N&0\\ 0&N\end{pmatrix} \end{eqnarray*} [[/math]]


(2) If we denote by [math]G,F[/math] the matrices in the statement, which replace respectively the [math]0,1[/math] entries, then we have the following formula for our matrix:

[[math]] P_N^2=\begin{pmatrix}0&\mathbb I\\ \mathbb I&Q\end{pmatrix}\otimes F+1\otimes G [[/math]]

With this formula in hand, the Hadamard matrix condition follows from:

[[math]] \begin{eqnarray*} (P_N^2)^2 &=&\begin{pmatrix}0&\mathbb I\\ \mathbb I&Q\end{pmatrix}^2\otimes F^2+\begin{pmatrix}1&0\\ 0&1\end{pmatrix}\otimes G^2+\begin{pmatrix}0&\mathbb I\\ \mathbb I&Q\end{pmatrix}\otimes(FG+GF)\\ &=&\begin{pmatrix}q&0\\ 0&q\end{pmatrix}\otimes 2+\begin{pmatrix}1&0\\ 0&1\end{pmatrix}\otimes 2+\begin{pmatrix}0&\mathbb I\\ \mathbb I&Q\end{pmatrix}\otimes0\\ &=&\begin{pmatrix}N&0\\ 0&N\end{pmatrix} \end{eqnarray*} [[/math]]


Finally, the last assertion is clear, from the above formulae relating [math]Q,Q^t[/math].
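Both constructions are easy to implement and test numerically, at least when [math]q[/math] is prime (prime powers would need genuine finite field arithmetic, which the sketch below, assuming NumPy, does not attempt). We verify the stated properties of [math]Q[/math], then the Hadamard condition for [math]P_{12}^1[/math], from [math]q=11[/math], and for [math]P_{12}^2[/math], from [math]q=5[/math]:

```python
import numpy as np

def chi_vector(q):
    """The quadratic character on F_q, for q an odd prime."""
    squares = {(b * b) % q for b in range(1, q)}
    return [0 if a == 0 else (1 if a in squares else -1) for a in range(q)]

def jacobsthal(q):
    chi = chi_vector(q)
    return np.array([[chi[(a - b) % q] for b in range(q)] for a in range(q)])

def paley1(q):   # q = 3 (mod 4), Hadamard matrix of size N = q + 1
    n = q + 1
    S = np.zeros((n, n), dtype=int)
    S[0, 1:] = 1
    S[1:, 0] = -1
    S[1:, 1:] = jacobsthal(q)
    return np.eye(n, dtype=int) + S

def paley2(q):   # q = 1 (mod 4), Hadamard matrix of size N = 2q + 2
    n = q + 1
    C = np.zeros((n, n), dtype=int)
    C[0, 1:] = 1
    C[1:, 0] = 1
    C[1:, 1:] = jacobsthal(q)
    F = np.array([[1, 1], [1, -1]])    # replaces the +-1 entries, with sign
    G = np.array([[1, -1], [-1, -1]])  # replaces the 0 entries, on the diagonal
    return np.kron(C, F) + np.kron(np.eye(n, dtype=int), G)

def is_hadamard(H):
    N = H.shape[0]
    return np.array_equal(H @ H.T, N * np.eye(N, dtype=int))

# Properties of Q, at q = 7: QQ^t = q1 - I, and QI = IQ = 0
Q = jacobsthal(7)
J = np.ones((7, 7), dtype=int)
assert np.array_equal(Q @ Q.T, 7 * np.eye(7, dtype=int) - J)
assert np.array_equal(Q @ J, 0 * J) and np.array_equal(J @ Q, 0 * J)

assert is_hadamard(paley1(11))                    # Paley 1, N = 12
H2 = paley2(5)                                    # Paley 2, N = 12
assert is_hadamard(H2) and np.array_equal(H2, H2.T)
```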

The above constructions allow us to get well beyond the Walsh matrix level:

Theorem

The HC is verified at least up to [math]N=88[/math], as follows:

  • At [math]N=4,8,16,32,64[/math] we have Walsh matrices.
  • At [math]N=12,20,24,28,44,48,60,68,72,80,84[/math] we have Paley [math]1[/math] matrices.
  • At [math]N=36,52,76[/math] we have Paley [math]2[/math] matrices.
  • At [math]N=40,56,88[/math] we have Paley [math]1[/math] matrices tensored with [math]W_2[/math].


Show Proof

First of all, the numbers in (1-4) are indeed all the multiples of 4, up to 88. As for the various assertions, the proof here goes as follows:


(1) This is clear from the definition of the Walsh matrices.


(2) Since [math]N-1[/math] takes the values [math]q=11,19,23,27,43,47,59,67,71,79,83[/math], all prime powers, we can indeed apply the Paley 1 construction, in all these cases.


(3) Since [math]N=4(8)[/math] here, and [math]N/2-1[/math] takes the values [math]q=17,25,37[/math], all prime powers, we can indeed apply the Paley 2 construction, in these cases.


(4) At [math]N=40[/math] we have indeed [math]P_{20}^1\otimes W_2[/math], at [math]N=56[/math] we have [math]P_{28}^1\otimes W_2[/math], and at [math]N=88[/math], where [math]87=3\times29[/math] is not a prime power, we have [math]P_{44}^1\otimes W_2[/math].

As a continuation of all this, at [math]N=92[/math] we have [math]92-1=7\times13[/math], so the Paley 1 construction does not work, and [math]92/2-1=45=9\times5[/math], so the Paley 2 construction does not work either. Nor can we tensor with [math]W_2[/math], because there is no Hadamard matrix of size [math]46[/math], this number not being a multiple of [math]4[/math]. However, we can use here the following result:

Theorem

Assuming that [math]A,B,C,D\in M_K(\pm1)[/math] are circulant and symmetric, pairwise commute, and satisfy the condition

[[math]] A^2+B^2+C^2+D^2=4K [[/math]]
the following [math]4K\times4K[/math] matrix is Hadamard, called of Williamson type:

[[math]] H=\begin{pmatrix} A&B&C&D\\ -B&A&-D&C\\ -C&D&A&-B\\ -D&-C&B&A \end{pmatrix} [[/math]]
Moreover, matrices [math]A,B,C,D[/math] as above exist at [math]K=23[/math], where [math]4K=92[/math].


Show Proof

Consider the quaternion units [math]1,i,j,k\in M_4(-1,0,1)[/math], which describe the positions and signs of the [math]A,B,C,D[/math] entries in the matrix [math]H[/math] from the statement. We have then:

[[math]] H=A\otimes 1+B\otimes i+C\otimes j+D\otimes k [[/math]]

Assuming now that [math]A,B,C,D[/math] are symmetric, we have:

[[math]] \begin{eqnarray*} HH^t &=&(A\otimes 1+B\otimes i+C\otimes j+D\otimes k)\\ &&(A\otimes 1-B\otimes i-C\otimes j-D\otimes k)\\ &=&(A^2+B^2+C^2+D^2)\otimes 1-([A,B]+[C,D])\otimes i\\ &&-([A,C]-[B,D])\otimes j-([A,D]+[B,C])\otimes k \end{eqnarray*} [[/math]]


Now assume that our matrices [math]A,B,C,D[/math] pairwise commute, and satisfy the condition in the statement. In this case, it follows from the above formula that we have:

[[math]] HH^t=4K [[/math]]

Thus, we obtain indeed an Hadamard matrix, as claimed. However, finding such matrices is in general a difficult task, and this is where Williamson's extra assumption in the statement, that [math]A,B,C,D[/math] should be taken circulant, comes from. Finally, regarding the [math]K=23[/math] and [math]N=92[/math] example, this comes via a computer search.
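As an illustration, here is a sketch (assuming NumPy) of the Williamson construction at [math]K=3[/math], with [math]A=circ(1,1,1)[/math] and [math]B=C=D=circ(-1,1,1)[/math]; this small choice of [math]A,B,C,D[/math] is our own example, not from the text, and the [math]K=23[/math] matrices found by computer search are of course beyond a few lines:

```python
import numpy as np

def circ(row):
    """Circulant matrix with prescribed first row."""
    row = np.array(row)
    return np.array([np.roll(row, j) for j in range(len(row))])

K = 3
A = circ([1, 1, 1])
B = C = D = circ([-1, 1, 1])

# Circulant, symmetric, commuting, and summing as required
for X in (A, B, C, D):
    assert np.array_equal(X, X.T)
assert np.array_equal(A @ B, B @ A)
assert np.array_equal(A @ A + B @ B + C @ C + D @ D,
                      4 * K * np.eye(K, dtype=int))

# The Williamson-type matrix, following the quaternionic block pattern
H = np.block([[A, B, C, D],
              [-B, A, -D, C],
              [-C, D, A, -B],
              [-D, -C, B, A]])
assert np.array_equal(H @ H.T, 4 * K * np.eye(4 * K, dtype=int))
```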

Things get even worse at higher values of [math]N[/math], where more and more complicated constructions are needed. The whole subject is quite technical, and, as already mentioned, human knowledge here stops so far at the number of the beast, namely:

[[math]] \mathfrak N=666 [[/math]]

Switching topics now, another well-known open question concerns the circulant case. Given a binary vector [math]\gamma\in(\pm 1)^N[/math], one can ask whether the matrix [math]H\in M_N(\pm 1)[/math] defined by [math]H_{ij}=\gamma_{j-i}[/math] is Hadamard or not. Here is a solution to the problem:

[[math]] K_4=\begin{pmatrix}-1&1&1&1\\ 1&-1&1&1\\ 1&1&-1&1\\ 1&1&1&-1\end{pmatrix} [[/math]]

More generally, any vector [math]\gamma\in(\pm 1)^4[/math] satisfying [math]\sum_i\gamma_i=\pm 2[/math], that is, having exactly one entry different from the other three, is a solution to the problem. The following conjecture, from the 50s, states that there are no other solutions:

\begin{conjecture}[Circulant Hadamard Conjecture (CHC)] The only Hadamard matrices which are circulant are

[[math]] K_4=\begin{pmatrix}-1&1&1&1\\ 1&-1&1&1\\ 1&1&-1&1\\ 1&1&1&-1\end{pmatrix} [[/math]]

and its conjugates, regardless of the value of [math]N\in\mathbb N[/math]. \end{conjecture} The fact that such a simple-looking problem is still open might seem quite surprising. Indeed, if we denote by [math]S\subset\{1,\ldots,N\}[/math] the set of positions of the [math]-1[/math] entries of [math]\gamma[/math], the Hadamard matrix condition is simply, for any [math]k\neq 0[/math], taken modulo [math]N[/math]:

[[math]] |S\cap(S+k)|=|S|-N/4 [[/math]]

Thus, the above conjecture simply states that at [math]N\neq 4[/math], such a set [math]S[/math] cannot exist. This is a well-known problem in combinatorics, raised by Ryser a long time ago.
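The CHC can at least be checked by brute force at small [math]N[/math]. The sketch below (assuming NumPy) enumerates all [math]2^N[/math] sign vectors, confirming that at [math]N=4[/math] the solutions are exactly the 8 vectors having one entry different from the other three, and that at [math]N=8[/math] there are none:

```python
import numpy as np
from itertools import product

def circulant_hadamard_solutions(N):
    """All gamma in {-1,1}^N whose circulant matrix H_ij = gamma_{j-i} is Hadamard."""
    sols = []
    for gamma in product([1, -1], repeat=N):
        H = np.array([[gamma[(j - i) % N] for j in range(N)] for i in range(N)])
        if np.array_equal(H @ H.T, N * np.eye(N, dtype=int)):
            sols.append(gamma)
    return sols

sols4 = circulant_hadamard_solutions(4)
assert len(sols4) == 8                          # one entry differing from the rest
assert all(abs(sum(g)) == 2 for g in sols4)
assert circulant_hadamard_solutions(8) == []    # consistent with the CHC
```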


Summarizing, we have many interesting questions in the real case. The situation is quite different from the one in the complex case, where at any [math]N\in\mathbb N[/math] we have the Fourier matrix [math]F_N[/math], which makes the HC problematics disappear. Since [math]F_N[/math] can be put in circulant form, the CHC disappears as well. There are however many interesting questions in the complex case, mostly in relation with questions in quantum physics.


General references

Banica, Teo (2024). "Linear algebra and group theory". arXiv:2206.09283 [math.CO].

References

  1. W. Rudin, Real and complex analysis, McGraw-Hill (1966).
  2. W. Rudin, Fourier analysis on groups, Dover (1972).
  3. I. Bengtsson and K. Życzkowski, Geometry of quantum states, Cambridge Univ. Press (2006).
  4. G. Björck, Functions of modulus [math]1[/math] on [math]{\rm Z}_n[/math] whose Fourier transforms have constant modulus, and cyclic [math]n[/math]-roots, NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci. 315 (1990), 131--140.
  5. U. Haagerup, Orthogonal maximal abelian [math]*[/math]-subalgebras of the [math]n\times n[/math] matrices and cyclic [math]n[/math]-roots, in “Operator algebras and quantum field theory”, International Press (1997), 296--323.
  6. M. Idel and M.M. Wolf, Sinkhorn normal form for unitary matrices, Linear Algebra Appl. 471 (2015), 76--84.
  7. V.F.R. Jones, Planar algebras I (1999).
  8. J.J. Sylvester, Thoughts on inverse orthogonal matrices, simultaneous sign-successions, and tesselated pavements in two or more colours, with applications to Newton's rule, ornamental tile-work, and the theory of numbers, Phil. Mag. 34 (1867), 461--475.