The Laplacian
5a. Laplace operator
Welcome to geometry. Many things can be said here, but going straight to the point, there is nothing more geometric than waves. Indeed, waves produce physics, with each major branch of physics being guided by its own wave equation. And then physics produces, as we have known since Newton, calculus, also known as “geometry and analysis”.
So, this will be our philosophy in what follows, waves, physics, calculus, geometry and analysis being more or less the same thing. And with our graphs being expected to help at some point, in relation with discretization questions, for waves, physics and so on.
Getting started now, we first need to talk about waves. We have here:
\begin{fact}
Waves can be of many types, as follows:
- Mechanical waves, such as the usual water waves, but also the sound waves, or the seismic waves. In all these cases, the wave propagates mechanically, via a certain medium, which can be solid, liquid or gaseous.
- Electromagnetic waves, coming via a more complicated mechanism, namely an accelerating charge in the context of electromagnetism. These are the radio waves, microwaves, IR, visible light, UV, X-rays and [math]\gamma[/math]-rays.
- Other waves, including the heat waves, appearing in the context of heat diffusion through a certain material, and also the waves from quantum mechanics, describing the movements, which are wave-like, of the elementary particles.
\end{fact} Quite remarkably, the behavior of all the above mechanical and electromagnetic waves is basically described by the same wave equation, which looks as follows, with [math]v \gt 0[/math] being the propagation speed:
[math]\ddot{\varphi}=v^2\Delta\varphi[/math]
As for the heat waves and quantum mechanical waves, these are described by some similar equations, namely [math]\dot{\varphi}=\alpha\Delta\varphi[/math], and versions of it. Which leads us into:
\begin{question}
What is the Laplace operator [math]\Delta[/math], appearing in the above, and making the world go round?
\end{question}
Which might look a bit scary, I know. Are we supposed now to stop mathematics, and take lots of physics classes, for a few years in a row, in order to answer our question?
Fortunately for us, this world was made with some built-in beauty, among others with basic mathematics corresponding to basic physics, and vice versa. So, on purely philosophical grounds, expect the answer to Question 5.2 to be something very simple, mathematically speaking. And simple that mathematical answer indeed is, as follows:
\begin{answer}
The second derivative of a function [math]\varphi:\mathbb R^N\to\mathbb R[/math], making the formula
[math]\varphi(x+h)\simeq\varphi(x)+\varphi'(x)h+\frac{\langle\varphi''(x)h,h\rangle}{2}[/math]
work, is its Hessian matrix [math]\varphi''(x)\in M_N(\mathbb R)[/math], given by the following formula:
[math]\varphi''(x)=\left(\frac{d^2\varphi}{dx_idx_j}(x)\right)_{ij}[/math]
However, when needing a number, as second derivative, the trace of [math]\varphi''(x)[/math], denoted
[math]\Delta\varphi(x)=\sum_{i=1}^N\frac{d^2\varphi}{dx_i^2}(x)[/math]
and called Laplacian of [math]\varphi[/math], is usually the correct quantity. \end{answer} Very nice all this, and as a plan now for the present chapter, we will first try to understand what Answer 5.3 says, from a purely mathematical perspective. Then, we will make a detour through lattices and graphs, further building on the graph theory developed in Part I. And finally, armed with all this knowledge, we will discuss waves.
Getting started for good now, as said in Answer 5.3, it is all about the Taylor formula at order 2. And here, in one variable, to start with, things are quite simple, and you certainly know them well. Indeed, given a function [math]\varphi:\mathbb R\to\mathbb R[/math], the first job is that of finding a quantity [math]\varphi'(x)\in\mathbb R[/math] making the following approximation formula work:
[math]\varphi(x+h)\simeq\varphi(x)+\varphi'(x)h[/math]
But here, there are not so many choices, and the solution is that of defining the number [math]\varphi'(x)\in\mathbb R[/math] by the following formula, provided that the limit converges indeed:
[math]\varphi'(x)=\lim_{h\to0}\frac{\varphi(x+h)-\varphi(x)}{h}[/math]
This number is called derivative of [math]\varphi[/math] at the point [math]x\in\mathbb R[/math], and as you surely know, geometrically, we have [math]\varphi'(x)=\tan\alpha[/math], with [math]\alpha[/math] being the slope of [math]\varphi[/math] at the point [math]x[/math]. Now still in one variable, we can talk as well about second derivatives, as follows:
The second derivative of a function [math]\varphi:\mathbb R\to\mathbb R[/math], making the formula
[math]\varphi(x+h)\simeq\varphi(x)+\varphi'(x)h+\frac{\varphi''(x)h^2}{2}[/math]
work, is the derivative of the derivative, [math]\varphi''=(\varphi')'[/math].
Assume indeed that [math]\varphi[/math] is twice differentiable at [math]x[/math], and let us try to construct an approximation of [math]\varphi[/math] around [math]x[/math] by a quadratic function, as follows:
We must have [math]a=\varphi(x)[/math], and we also know that [math]b=\varphi'(x)[/math] is the correct choice for the coefficient of [math]h[/math]. Thus, our approximation must be as follows:
In order to find the correct choice for [math]c\in\mathbb R[/math], observe that the function [math]\psi(h)=\varphi(x+h)[/math] matches with [math]P(h)=\varphi(x)+\varphi'(x)h+ch^2[/math] in what regards the value at [math]h=0[/math], and also in what regards the value of the derivative at [math]h=0[/math]. Thus, the correct choice of [math]c\in\mathbb R[/math] should be the one making match the second derivatives at [math]h=0[/math], and this gives:
[math]2c=\varphi''(x)\implies c=\frac{\varphi''(x)}{2}[/math]
We are therefore led to the formula in the statement. In order to prove now this formula, [math]\psi(h)\simeq P(h)[/math], we can use L'H\^opital's rule, and we obtain, as desired:
Thus, we are led to the conclusion in the statement.
Many more things can be said about second derivatives, and let us record here:
Intuitively speaking, the second derivative [math]\varphi''(x)\in\mathbb R[/math] computes how much different is [math]\varphi(x)[/math], compared to the average of [math]\varphi(y)[/math], with [math]y\simeq x[/math].
This is obviously something a bit heuristic, but which is good to know. Let us write the formula in Theorem 5.4, as such, and with [math]h\to-h[/math] too:
[math]\varphi(x+h)\simeq\varphi(x)+\varphi'(x)h+\frac{\varphi''(x)h^2}{2}\quad,\quad \varphi(x-h)\simeq\varphi(x)-\varphi'(x)h+\frac{\varphi''(x)h^2}{2}[/math]
By making the average, we obtain the following formula:
[math]\frac{\varphi(x+h)+\varphi(x-h)}{2}\simeq\varphi(x)+\frac{\varphi''(x)h^2}{2}[/math]
Thus, thinking a bit, we are led to the conclusion in the statement. It is of course possible to say more here, but we will not really need all this, in what follows.
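As an illustration of this principle, here is a quick numerical check, in Python (a sketch of mine, with numpy assumed, not coming from the text), using [math]\varphi=\sin[/math] as test function:

```python
import numpy as np

# The symmetric second difference from the above discussion: this quantity
# should approximate the second derivative phi''(x), for h small.
def second_difference(phi, x, h):
    return (phi(x + h) + phi(x - h) - 2 * phi(x)) / h**2

x = 1.0
approx = second_difference(np.sin, x, h=1e-4)
exact = -np.sin(x)   # indeed, (sin)'' = -sin
```

The approximation error here is of order [math]h^2[/math], as one can check by varying [math]h[/math].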
Moving now to several variables, as a first job, given a function [math]\varphi:\mathbb R^N\to\mathbb R[/math], we would like to find a quantity [math]\varphi'(x)[/math] making the following approximation formula work:
[math]\varphi(x+h)\simeq\varphi(x)+\varphi'(x)h[/math]
But here, again there are not so many choices, and the solution is that of defining [math]\varphi'(x)[/math] as being the row vector formed by the partial derivatives at [math]x[/math]:
[math]\varphi'(x)=\left(\frac{d\varphi}{dx_1}(x)\ \ \ldots\ \ \frac{d\varphi}{dx_N}(x)\right)[/math]
To be more precise, with this value for [math]\varphi'(x)[/math], our approximation formula [math]\varphi(x+h)\simeq\varphi(x)+\varphi'(x)h[/math] makes sense indeed, as an equality of real numbers, with [math]\varphi'(x)h\in\mathbb R[/math] being obtained as the matrix multiplication of the row vector [math]\varphi'(x)[/math], and the column vector [math]h[/math]. As for the fact that our formula holds indeed, this follows by putting together the approximation properties of each of the partial derivatives [math]d\varphi/dx_i[/math], which gives:
Moving now to second derivatives, we have here, generalizing Theorem 5.4:
The second derivative of a function [math]\varphi:\mathbb R^N\to\mathbb R[/math], making the formula
[math]\varphi(x+h)\simeq\varphi(x)+\varphi'(x)h+\frac{\langle\varphi''(x)h,h\rangle}{2}[/math]
work, is its Hessian matrix, [math]\varphi''(x)=\left(\frac{d^2\varphi}{dx_idx_j}(x)\right)_{ij}[/math], which is a symmetric matrix.
There are several things going on here, the idea being as follows:
(1) As a first observation, at [math]N=1[/math] the Hessian matrix constructed above is simply the [math]1\times1[/math] matrix having as entry the second derivative [math]\varphi''(x)[/math], and the formula in the statement is something that we know well from Theorem 5.4, namely:
[math]\varphi(x+h)\simeq\varphi(x)+\varphi'(x)h+\frac{\varphi''(x)h^2}{2}[/math]
(2) At [math]N=2[/math] now, we obviously need to differentiate [math]\varphi[/math] twice, and the point is that we come in this way upon the following formula, called Clairaut formula:
[math]\frac{d^2\varphi}{dxdy}=\frac{d^2\varphi}{dydx}[/math]
But, is this formula correct or not? As an intuitive justification for it, let us consider a product of power functions, [math]\varphi(x,y)=x^py^q[/math]. We have then:
[math]\frac{d}{dx}\frac{d}{dy}\,x^py^q=pq\,x^{p-1}y^{q-1}=\frac{d}{dy}\frac{d}{dx}\,x^py^q[/math]
Thus, we can see that the Clairaut formula holds indeed, due to the fact that the functions in [math]x,y[/math] commute. Of course, this does not really prove our formula, in general. But it is an exercise for you to have this idea fully working, by using linear combinations and density arguments, or to look up the standard proof, using the mean value theorem.
(3) Moving now to [math]N=3[/math] and higher, we can use here the Clairaut formula with respect to any pair of coordinates, and we obtain the Schwarz formula, namely:
[math]\frac{d^2\varphi}{dx_idx_j}=\frac{d^2\varphi}{dx_jdx_i}[/math]
Thus, the second derivative, or Hessian matrix, is symmetric, as claimed.
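For the reader wanting to double-check the Clairaut formula on the power functions above, here is a small Python sketch of mine (the encoding and helper functions are hypothetical, chosen for illustration), with a monomial [math]cx^py^q[/math] encoded as a triple [math](c,p,q)[/math]:

```python
# A monomial c * x^p * y^q is encoded as the triple (c, p, q); the two
# partial derivatives below implement the usual power rule.
def ddx(m):
    c, p, q = m
    return (c * p, p - 1, q)

def ddy(m):
    c, p, q = m
    return (c * q, p, q - 1)

phi = (1, 3, 2)              # phi = x^3 * y^2
mixed_xy = ddy(ddx(phi))     # d/dy of d/dx
mixed_yx = ddx(ddy(phi))     # d/dx of d/dy, should be the same
```

Both mixed derivatives come out as [math](6,2,1)[/math], that is, [math]6x^2y[/math], as the Clairaut formula predicts.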
(4) Getting now to the main topic, namely the approximation formula in the statement, let [math]y\in\mathbb R^N[/math], and consider the following function, with [math]r\in\mathbb R[/math]:
[math]f(r)=\varphi(x+ry)[/math]
We know from (1) that the Taylor formula for [math]f[/math], at the point [math]r=0[/math], reads:
[math]f(r)\simeq f(0)+f'(0)r+\frac{f''(0)r^2}{2}[/math]
And our claim is that, with [math]h=ry[/math], this is precisely the formula in the statement.
(5) So, let us see if our claim is correct. By using the chain rule, we have the following formula, with on the right, as usual, a row vector multiplied by a column vector:
[math]f'(r)=\varphi'(x+ry)y[/math]
By using again the chain rule, we can compute the second derivative as well:
[math]f''(r)=\langle\varphi''(x+ry)y,y\rangle[/math]
(6) Time now to conclude. We know that we have [math]f(r)=\varphi(x+ry)[/math], and according to our various computations above, we have the following formulae:
[math]f(0)=\varphi(x)\quad,\quad f'(0)=\varphi'(x)y\quad,\quad f''(0)=\langle\varphi''(x)y,y\rangle[/math]
But with this data in hand, the usual Taylor formula for our one variable function [math]f[/math], at order 2, at the point [math]r=0[/math], takes the following form, with [math]h=ry[/math]:
[math]\varphi(x+h)\simeq\varphi(x)+\varphi'(x)h+\frac{\langle\varphi''(x)h,h\rangle}{2}[/math]
Thus, we have obtained the formula in the statement.
Getting back now to what we wanted to do, namely understand Answer 5.3, it remains to talk about the Laplace operator [math]\Delta[/math]. Things are quite tricky here, basically requiring some physics that we still need to develop, but as something mathematical to start with, we have the following higher dimensional analogue of Proposition 5.5:
Intuitively, the following quantity, called Laplacian of [math]\varphi[/math],
[math]\Delta\varphi=\sum_{i=1}^N\frac{d^2\varphi}{dx_i^2}[/math]
computes how much different is [math]\varphi(x)[/math], compared to the average of [math]\varphi(y)[/math], with [math]y\simeq x[/math].
As before with Proposition 5.5, this is something a bit heuristic, but good to know. Let us write the formula in Theorem 5.6, as such, and with [math]h\to-h[/math] too:
[math]\varphi(x+h)\simeq\varphi(x)+\varphi'(x)h+\frac{\langle\varphi''(x)h,h\rangle}{2}\quad,\quad \varphi(x-h)\simeq\varphi(x)-\varphi'(x)h+\frac{\langle\varphi''(x)h,h\rangle}{2}[/math]
By making the average, we obtain the following formula:
[math]\frac{\varphi(x+h)+\varphi(x-h)}{2}\simeq\varphi(x)+\frac{\langle\varphi''(x)h,h\rangle}{2}[/math]
Thus, thinking a bit, we are led to the conclusion in the statement. It is of course possible to say more here, but we will not really need all the details, in what follows.
With this understood, the problem is now, what can we say about the mathematics of [math]\Delta[/math]? Inspired by linear algebra, let us formulate a basic question, as follows: \begin{question} The Laplace operator being a linear operator,
what can we say about it, inspired by usual linear algebra? \end{question} In answer now, the space of functions [math]\varphi:\mathbb R^N\to\mathbb R[/math], on which [math]\Delta[/math] acts, being infinite dimensional, the usual tools from linear algebra do not apply as such, and we must be extremely careful. For instance, we cannot really expect to diagonalize [math]\Delta[/math].
Thinking some more, there is actually a real bug too with our problem, because at [math]N=1[/math] this problem becomes “what can we say about the second derivatives [math]\varphi'':\mathbb R\to\mathbb R[/math] of the functions [math]\varphi:\mathbb R\to\mathbb R[/math], inspired by linear algebra”, with answer “not much”.
And by thinking even more, still at [math]N=1[/math], there is a second bug too, because if [math]\varphi:\mathbb R\to\mathbb R[/math] is twice differentiable, nothing will guarantee that its second derivative [math]\varphi'':\mathbb R\to\mathbb R[/math] is twice differentiable too. Thus, we have some issues with the domain and range of [math]\Delta[/math], regarded as linear operator, and these problems will persist at higher [math]N[/math].
So, shall we trash Question 5.8? Not so quick, because, very remarkably, we have:
\begin{fact}
The functions [math]\varphi:\mathbb R^N\to\mathbb R[/math] which are [math]0[/math]-eigenvectors of [math]\Delta[/math],
called harmonic functions, have the following properties:
- At [math]N=1[/math], nothing spectacular, these are just the linear functions.
- At [math]N=2[/math], these are, locally, the real parts of holomorphic functions.
- At [math]N\geq 3[/math], these still share many properties with the holomorphic functions.
\end{fact} In order to understand this, or at least get introduced to it, let us first look at the case [math]N=2[/math]. Here, any function [math]\varphi:\mathbb R^2\to\mathbb R[/math] can be regarded as function [math]\varphi:\mathbb C\to\mathbb R[/math], depending on [math]z=x+iy[/math]. But, in view of this, it is natural to enlarge the attention to the functions [math]\varphi:\mathbb C\to\mathbb C[/math], and ask which of these functions are harmonic, [math]\Delta\varphi=0[/math]. And here, we have the following remarkable result, making the link with complex analysis:
Any holomorphic function [math]\varphi:\mathbb C\to\mathbb C[/math], when regarded as function [math]\varphi:\mathbb R^2\to\mathbb C[/math], is harmonic, [math]\Delta\varphi=0[/math]. The same goes for the conjugates [math]\bar{\varphi}[/math] of such functions.
The first assertion comes from the following computation, with [math]z=x+iy[/math], valid for the power functions [math]\varphi(z)=z^n[/math], and then for arbitrary holomorphic functions too, by linearity and approximation:
[math]\Delta z^n=\frac{d^2(x+iy)^n}{dx^2}+\frac{d^2(x+iy)^n}{dy^2}=n(n-1)(x+iy)^{n-2}+i^2n(n-1)(x+iy)^{n-2}=0[/math]
As for the second assertion, this follows from [math]\Delta\bar{\varphi}=\overline{\Delta\varphi}[/math], which is clear from definitions, and which shows that if [math]\varphi[/math] is harmonic, then so is its conjugate [math]\bar{\varphi}[/math].
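As a quick numerical illustration (a sketch of mine, with numpy assumed, not coming from the text), one can check that the real part of the holomorphic function [math]z^3[/math], namely [math]\varphi(x,y)=x^3-3xy^2[/math], is indeed harmonic:

```python
import numpy as np

# phi(x, y) = Re(z^3), with z = x + iy, which equals x^3 - 3*x*y^2.
def phi(x, y):
    return (complex(x, y) ** 3).real

# Five-point finite-difference approximation of the Laplacian.
def laplacian(f, x, y, h=1e-3):
    return (f(x + h, y) + f(x - h, y)
            + f(x, y + h) + f(x, y - h) - 4 * f(x, y)) / h**2

val = laplacian(phi, 0.7, -0.4)        # should be close to 0
```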
Many more things can be said, along these lines, notably a proof of the assertion (2) in Fact 5.9, which is however a quite tough piece of mathematics, and then with a clarification of the assertion (3) too, from that same principle, which again requires some substantial mathematics. But, in what follows, we will not really need all this.
5b. Graphs, lattices
Back now to graphs, we have the following definition, inspired by the above:
We call Laplacian of a graph [math]X[/math] the matrix
[math]L=v-d[/math]
with [math]v[/math] being the diagonal matrix of valences, and [math]d[/math] being the adjacency matrix.
This definition is inspired by differential geometry, or just by multivariable calculus, and more specifically by the well-known Laplace operator there, given by:
[math]\Delta\varphi=\sum_{i=1}^N\frac{d^2\varphi}{dx_i^2}[/math]
More on this in a moment, but as a word regarding terminology, this is traditionally confusing in graph theory, and impossible to fix in a decent way, according to:
\begin{warning} The graph Laplacian above is in fact the negative Laplacian,
with our preference for it, negative, coming from the fact that it is positive, [math]L\geq0[/math]. \end{warning} Which sounds like a bad joke, but this is how things are, and more on this in a moment. In practice now, the graph Laplacian is given by the following formula:
[math](Lf)(i)=\sum_{j\sim i}\big(f(i)-f(j)\big)[/math]
Alternatively, we have the following formula, for the entries of the Laplacian:
[math]L_{ij}=\begin{cases}v_i&{\rm if}\ j=i\\ -1&{\rm if}\ j\sim i\\ 0&{\rm otherwise}\end{cases}[/math]
With these formulae in hand, we can formulate, as our first result on the subject:
A function on a graph is harmonic, [math]Lf=0[/math], precisely when its value at each vertex is the average of its values over the neighbors of that vertex:
[math]f(i)=\frac{1}{v_i}\sum_{j\sim i}f(j)[/math]
We have indeed the following computation, for any function [math]f[/math]:
[math](Lf)(i)=v_if(i)-\sum_{j\sim i}f(j)[/math]
Thus, we are led to the conclusions in the statement.
Summarizing, we have some good reasons for calling [math]L[/math] the Laplacian, because the solutions of [math]Lf=0[/math] satisfy what we would expect from a harmonic function, namely having the “average over the neighborhood” property. With the remark however that the harmonic functions on graphs are something trivial, due to the following fact:
A function on a graph [math]X[/math] is harmonic in the above sense precisely when it is constant over the connected components of [math]X[/math].
This is clear from the equation that we found in Proposition 5.13, namely:
[math]f(i)=\frac{1}{v_i}\sum_{j\sim i}f(j)[/math]
Indeed, based on this, we can say for instance that [math]f[/math] cannot have variations over a connected component, and so must be constant on these components, as stated.
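Here is a small numpy sketch of mine illustrating all this, on the 4-cycle: the graph Laplacian [math]L=v-d[/math] kills the constant functions, but not the non-constant ones:

```python
import numpy as np

# Adjacency matrix d of the 4-cycle 0-1-2-3-0, valence matrix v, and L = v - d.
d = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]])
v = np.diag(d.sum(axis=1))
L = v - d

harmonic = L @ np.ones(4)                        # constant function: Lf = 0
not_harmonic = L @ np.array([1.0, 2.0, 3.0, 4.0])  # non-constant: Lf != 0
```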
At a more advanced level now, let us try to understand the relation with the usual Laplacian from analysis [math]\Delta[/math], which is given by the following formula:
[math]\Delta f=\sum_{i=1}^N\frac{d^2f}{dx_i^2}[/math]
In one dimension, [math]N=1[/math], the Laplacian is simply the second derivative, [math]\Delta f=f''[/math]. Now let us recall that the first derivative of a one-variable function is given by:
[math]f'(x)=\lim_{t\to0}\frac{f(x+t)-f(x)}{t}[/math]
We deduce from this, or from the Taylor formula at order 2, to be fully correct, that the second derivative of a one-variable function is given by the following formula:
[math]f''(x)=\lim_{t\to0}\frac{f(x+t)-2f(x)+f(x-t)}{t^2}[/math]
Now since [math]\mathbb R[/math] can be thought of as appearing as the continuum limit, [math]t\to0[/math], of the graphs [math]t\mathbb Z\simeq\mathbb Z[/math], this suggests defining the Laplacian of [math]\mathbb Z[/math] by the following formula:
[math]\Delta f(x)=f(x+1)-2f(x)+f(x-1)[/math]
But this is exactly what we have in Definition 5.11, up to a sign switch, the graph Laplacian of [math]\mathbb Z[/math], as constructed there, being given by the following formula:
[math]Lf(x)=2f(x)-f(x+1)-f(x-1)[/math]
Summarizing, we have reached the formula in Warning 5.12, namely:
[math]L=-\Delta[/math]
In arbitrary dimensions now, everything generalizes well, and we have:
The Laplacian of graphs is compatible with the usual Laplacian, with [math]\mathbb R^N[/math] being regarded as the continuum limit, [math]t\to0[/math], of the lattice graphs [math]t\mathbb Z^N\simeq\mathbb Z^N[/math], up to the usual sign switch, [math]L=-\Delta[/math].
Indeed, at [math]N=2[/math], to start with, the formula that we need, coming from standard multivariable calculus, or just from the [math]N=1[/math] formula, is as follows:
[math]\Delta f(x,y)=\lim_{t\to0}\frac{f(x+t,y)+f(x-t,y)+f(x,y+t)+f(x,y-t)-4f(x,y)}{t^2}[/math]
Now since [math]\mathbb R^2[/math] can be thought of as appearing as the continuum limit, [math]t\to0[/math], of the graphs [math]t\mathbb Z^2\simeq\mathbb Z^2[/math], this suggests defining the Laplacian of [math]\mathbb Z^2[/math] as follows:
[math]\Delta f(x,y)=f(x+1,y)+f(x-1,y)+f(x,y+1)+f(x,y-1)-4f(x,y)[/math]
But this is exactly what we have in Definition 5.11, up to a sign switch, the graph Laplacian of [math]\mathbb Z^2[/math], as constructed there, being given by the following formula:
[math]Lf(x,y)=4f(x,y)-f(x+1,y)-f(x-1,y)-f(x,y+1)-f(x,y-1)[/math]
At higher [math]N\in\mathbb N[/math] the proof is similar, and we will leave this as an exercise.
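Numerically, the compatibility above can be seen as follows (a Python sketch of mine, with numpy assumed): the graph Laplacian of [math]\mathbb Z[/math], applied to a smooth function sampled with step [math]t[/math], approximates [math]-t^2\varphi''[/math]:

```python
import numpy as np

t = 1e-3                                # lattice spacing
x = 0.5
phi = np.cos

# Graph Laplacian of Z at the point x, then rescaled by t^2:
Lphi = 2 * phi(x) - phi(x - t) - phi(x + t)
approx = Lphi / t**2                    # should approach -phi''(x)
exact = np.cos(x)                       # indeed, -(cos)'' = cos
```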
All this is quite interesting, and suggests doing all sorts of geometric and analytic things, with our graphs and their Laplacians. For instance, we can try to review the above with [math]\mathbb R^N[/math] replaced by more general manifolds, having a certain curvature. Also, we can now do PDE over our graphs, by using the negative Laplacian constructed above.
But more on this later. Now back to our general graph questions, and to Definition 5.11 as it is, the Laplacian of graphs as constructed there has the following properties:
The graph Laplacian [math]L=v-d[/math] has the following properties:
- It is symmetric, [math]L=L^t[/math].
- It is positive, [math]L\geq 0[/math].
- It is bistochastic, with row and column sums [math]0[/math].
- It has [math]0[/math] as eigenvalue, with the other eigenvalues being positive.
- The multiplicity of [math]0[/math] is the number of connected components.
- In the connected case, the eigenvalues are [math]0 \lt \lambda_1\leq\ldots\leq\lambda_{N-1}[/math].
All this is straightforward, the idea being as follows:
(1) This is clear from [math]L=v-d[/math], both [math]v,d[/math] being symmetric.
(2) This follows from the following computation, valid for any function [math]f[/math] on the graph, with the sum being over the edges [math]i-j[/math] of the graph:
[math]\langle Lf,f\rangle=\sum_{i-j}\big(f(i)-f(j)\big)^2\geq0[/math]
(3) This is again clear from [math]L=v-d[/math], and from the definition of [math]v,d[/math].
(4) Here the first assertion comes from (3), and the second one, from (2).
(5) Given an arbitrary graph, we can label its vertices increasingly, over the connected components, and this makes the adjacency matrix [math]d[/math], so the Laplacian [math]L[/math] as well, block diagonal. Thus, we are left with proving that for a connected graph, the multiplicity of 0 is precisely 1. But this follows from the formula from the proof of (2), namely:
[math]\langle Lf,f\rangle=\sum_{i-j}\big(f(i)-f(j)\big)^2[/math]
Indeed, this formula shows in particular that we have the following equivalence:
[math]Lf=0\iff f(i)=f(j)\mbox{, for any edge }i-j[/math]
Now since our graph was assumed to be connected, as per the above beginning of proof, the condition on the right means that [math]f[/math] must be constant. Thus, the 0-eigenspace of the Laplacian turns out to be 1-dimensional, spanned by the all-1 vector, as desired.
(6) This follows indeed from (4) and (5), and with the remark that in fact we already proved this, in the proof of (5), with the formulae there being very useful in practice.
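The above multiplicity statement can be illustrated numerically, as follows (a numpy sketch of mine), on a graph with two connected components, an edge plus a triangle:

```python
import numpy as np

# Graph on 5 vertices: an edge 0-1, plus a triangle 2-3-4 (two components).
d = np.zeros((5, 5))
for i, j in [(0, 1), (2, 3), (3, 4), (2, 4)]:
    d[i, j] = d[j, i] = 1
L = np.diag(d.sum(axis=1)) - d

eigs = np.linalg.eigvalsh(L)                    # sorted, real eigenvalues
zero_mult = int(np.sum(np.abs(eigs) < 1e-9))    # multiplicity of 0
```

Here the eigenvalues come out as [math]0,0,2,3,3[/math], so the multiplicity of 0 is indeed 2, the number of components.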
5c. Discrete waves
Getting now into discretization and physics, we have the following key result:
The wave equation in [math]\mathbb R^N[/math] is
[math]\ddot{\varphi}=v^2\Delta\varphi[/math]
with [math]v \gt 0[/math] being the propagation speed.
The equation in the statement is of course what comes out of physics experiments. However, allowing us a bit of imagination, and trust in this imagination, we can mathematically “prove” this equation, by discretizing, as follows:
(1) Let us first consider the 1D case. In order to understand the propagation of waves, we will model [math]\mathbb R[/math] as a network of balls, with springs between them, as follows:
Now let us send an impulse, and see how the balls will be moving. For this purpose, we zoom on one ball. The situation here is as follows, [math]l[/math] being the spring length:
We have two forces acting at [math]x[/math]. First is the Newton motion force, mass times acceleration, which is as follows, with [math]m[/math] being the mass of each ball:
[math]F_N=m\cdot\ddot{\varphi}(x,t)[/math]
And second is the Hooke force, displacement of the spring, times spring constant. Since we have two springs at [math]x[/math], this is as follows, [math]k[/math] being the spring constant:
[math]F_H=k\big[\varphi(x+l,t)-\varphi(x,t)\big]+k\big[\varphi(x-l,t)-\varphi(x,t)\big][/math]
We conclude that the equation of motion, in our model, is as follows:
[math]m\cdot\ddot{\varphi}(x,t)=k\big[\varphi(x+l,t)-2\varphi(x,t)+\varphi(x-l,t)\big][/math]
(2) Now let us take the limit of our model, as to reach to continuum. For this purpose we will assume that our system consists of [math]B\gg0[/math] balls, having a total mass [math]M[/math], and spanning a total distance [math]L[/math]. Thus, our previous infinitesimal parameters are as follows, with [math]K[/math] being the spring constant of the total system, which is of course lower than [math]k[/math]:
[math]m=\frac{M}{B}\quad,\quad k=KB\quad,\quad l=\frac{L}{B}[/math]
With these changes, our equation of motion found in (1) reads:
[math]\ddot{\varphi}(x,t)=\frac{KB^2}{M}\big[\varphi(x+l,t)-2\varphi(x,t)+\varphi(x-l,t)\big][/math]
Now observe that this equation can be written, more conveniently, as follows:
[math]\ddot{\varphi}(x,t)=\frac{KL^2}{M}\cdot\frac{\varphi(x+l,t)-2\varphi(x,t)+\varphi(x-l,t)}{l^2}[/math]
With [math]B\to\infty[/math], and therefore [math]l\to0[/math], we obtain in this way:
[math]\ddot{\varphi}(x,t)=\frac{KL^2}{M}\cdot\frac{d^2\varphi}{dx^2}(x,t)[/math]
We are therefore led to the wave equation in the statement, which is [math]\ddot{\varphi}=v^2\varphi''[/math] in our present [math]N=1[/math] dimensional case, the propagation speed being [math]v=\sqrt{K/M}\cdot L[/math].
(3) In [math]2[/math] dimensions now, the same argument carries on. Indeed, we can use here a lattice model as follows, with all the edges standing for small springs:
As before in one dimension, we send an impulse, and we zoom on one ball. The situation here is as follows, with [math]l[/math] being the spring length:
We have two forces acting at [math](x,y)[/math]. First is the Newton motion force, mass times acceleration, which is as follows, with [math]m[/math] being the mass of each ball:
[math]F_N=m\cdot\ddot{\varphi}(x,y,t)[/math]
And second is the Hooke force, displacement of the spring, times spring constant. Since we have four springs at [math](x,y)[/math], this is as follows, [math]k[/math] being the spring constant:
[math]F_H=k\big[\varphi(x+l,y,t)+\varphi(x-l,y,t)+\varphi(x,y+l,t)+\varphi(x,y-l,t)-4\varphi(x,y,t)\big][/math]
We conclude that the equation of motion, in our model, is as follows:
[math]m\cdot\ddot{\varphi}(x,y,t)=k\big[\varphi(x+l,y,t)+\varphi(x-l,y,t)+\varphi(x,y+l,t)+\varphi(x,y-l,t)-4\varphi(x,y,t)\big][/math]
(4) Now let us take the limit of our model, as to reach to continuum. For this purpose we will assume that our system consists of [math]B^2\gg0[/math] balls, having a total mass [math]M[/math], and spanning a total area [math]L^2[/math]. Thus, our previous infinitesimal parameters are as follows, with [math]K[/math] being the spring constant of the total system, taken to be equal to [math]k[/math]:
[math]m=\frac{M}{B^2}\quad,\quad k=K\quad,\quad l=\frac{L}{B}[/math]
With these changes, our equation of motion found in (3) reads:
[math]\ddot{\varphi}(x,y,t)=\frac{KB^2}{M}\big[\varphi(x+l,y,t)+\varphi(x-l,y,t)+\varphi(x,y+l,t)+\varphi(x,y-l,t)-4\varphi(x,y,t)\big][/math]
Now observe that this equation can be written, more conveniently, as follows:
[math]\ddot{\varphi}(x,y,t)=\frac{KL^2}{M}\cdot\frac{\varphi(x+l,y,t)+\varphi(x-l,y,t)+\varphi(x,y+l,t)+\varphi(x,y-l,t)-4\varphi(x,y,t)}{l^2}[/math]
With [math]B\to\infty[/math], and therefore [math]l\to0[/math], we obtain in this way:
[math]\ddot{\varphi}=\frac{KL^2}{M}\left(\frac{d^2\varphi}{dx^2}+\frac{d^2\varphi}{dy^2}\right)[/math]
Thus, we are led in this way to the following wave equation in two dimensions, with [math]v=\sqrt{K/M}\cdot L[/math] being the propagation speed of our wave:
[math]\ddot{\varphi}=v^2\left(\frac{d^2\varphi}{dx^2}+\frac{d^2\varphi}{dy^2}\right)[/math]
But we recognize on the right the Laplace operator, and we are done. As before in 1D, there is of course some discussion to be made here, arguing that our spring model in (3) is indeed the correct one. But do not worry, experiments confirm our findings.
(5) In 3 dimensions now, which is the case of the main interest, corresponding to our real-life world, the same argument carries over, and the wave equation is as follows:
[math]\ddot{\varphi}=v^2\left(\frac{d^2\varphi}{dx^2}+\frac{d^2\varphi}{dy^2}+\frac{d^2\varphi}{dz^2}\right)[/math]
(6) Finally, the same argument, namely a lattice model, carries on in arbitrary [math]N[/math] dimensions, and the wave equation here is as follows:
[math]\ddot{\varphi}=v^2\sum_{i=1}^N\frac{d^2\varphi}{dx_i^2}=v^2\Delta\varphi[/math]
Thus, we are led to the conclusion in the statement.
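As a numerical double-check of all this (a Python sketch of mine, with numpy assumed), the traveling wave [math]\varphi(x,t)=\sin(x-vt)[/math] satisfies the discretized equation from the model above, with the second differences computed in both time and space:

```python
import numpy as np

v, l, dt = 2.0, 1e-3, 1e-3
phi = lambda x, t: np.sin(x - v * t)   # a traveling wave, with speed v

x, t = 0.3, 0.1
# Second differences in time and in space, as in the lattice model:
phi_tt = (phi(x, t + dt) - 2 * phi(x, t) + phi(x, t - dt)) / dt**2
phi_xx = (phi(x + l, t) - 2 * phi(x, t) + phi(x - l, t)) / l**2
residual = phi_tt - v**2 * phi_xx      # wave equation: should be close to 0
```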
In order to reach to some further insight into our spring models above, we must get deeper into elasticity. We have here the following result:
The wave equation can be understood as well directly, as a wave propagating through a linear elastic medium, via stress.
This is indeed something very standard, the idea being as follows:
(1) In the 1D case, assume that we have a bar of length [math]L[/math], made of linear elastic material. The stiffness of the bar is then the following quantity, with [math]A[/math] being the cross-sectional area, and with [math]E[/math] being the Young modulus of the material:
[math]K=\frac{EA}{L}[/math]
Now when sending a pulse, this propagates as follows, [math]M[/math] being the total mass:
[math]v=\sqrt{\frac{K}{M}}\cdot L=\sqrt{\frac{EAL}{M}}[/math]
But since [math]V=AL[/math] is the volume, with [math]\rho=M/V[/math] being the density, we have:
[math]v=\sqrt{\frac{EAL}{M}}=\sqrt{\frac{E}{\rho}}[/math]
Thus, as a conclusion, the wave propagates with speed [math]v=\sqrt{E/\rho}[/math].
(2) In two or more dimensions, the study, and final result, are similar.
Getting now to mathematics, with a bit of work, we can fully solve the 1D wave equation. In order to explain this, we will need a standard calculus result, as follows:
The derivative of a function of type
[math]\varphi(x)=\int_{g(x)}^{h(x)}f(s)\,ds[/math]
is given by the formula [math]\varphi'(x)=f(h(x))h'(x)-f(g(x))g'(x)[/math].
Consider a primitive of the function that we integrate, [math]F'=f[/math]. We have:
[math]\varphi(x)=F(h(x))-F(g(x))[/math]
By using now the chain rule for derivatives, we obtain from this:
[math]\varphi'(x)=F'(h(x))h'(x)-F'(g(x))g'(x)=f(h(x))h'(x)-f(g(x))g'(x)[/math]
Thus, we are led to the formula in the statement.
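As a numerical check of this rule (my own sketch, with numpy assumed), take [math]f=\cos[/math], [math]g(x)=x[/math], [math]h(x)=2x[/math], where the primitive [math]F=\sin[/math] gives [math]\varphi(x)=\sin(2x)-\sin(x)[/math]:

```python
import numpy as np

# phi(x) = integral of cos over [x, 2x], which equals sin(2x) - sin(x).
phi = lambda x: np.sin(2 * x) - np.sin(x)

x, eps = 0.4, 1e-6
numeric = (phi(x + eps) - phi(x - eps)) / (2 * eps)   # central difference
predicted = np.cos(2 * x) * 2 - np.cos(x) * 1         # f(h(x))h'(x) - f(g(x))g'(x)
```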
Now back to the 1D waves, the general result here, due to d'Alembert, along with a little more, in relation with our lattice models above, is as follows:
The solution of the 1D wave equation with initial value conditions [math]\varphi(x,0)=f(x)[/math] and [math]\dot{\varphi}(x,0)=g(x)[/math] is given by the d'Alembert formula, namely:
[math]\varphi(x,t)=\frac{f(x-vt)+f(x+vt)}{2}+\frac{1}{2v}\int_{x-vt}^{x+vt}g(s)\,ds[/math]
There are several things going on here, the idea being as follows:
(1) Let us first check that the d'Alembert solution is indeed a solution of the wave equation [math]\ddot{\varphi}=v^2\varphi''[/math]. The first time derivative is computed as follows, by using the differentiation rule established above:
[math]\dot{\varphi}(x,t)=\frac{-vf'(x-vt)+vf'(x+vt)}{2}+\frac{1}{2}\big[g(x+vt)+g(x-vt)\big][/math]
The second time derivative is computed as follows:
[math]\ddot{\varphi}(x,t)=\frac{v^2f''(x-vt)+v^2f''(x+vt)}{2}+\frac{v}{2}\big[g'(x+vt)-g'(x-vt)\big][/math]
Regarding now space derivatives, the first one is computed as follows:
[math]\varphi'(x,t)=\frac{f'(x-vt)+f'(x+vt)}{2}+\frac{1}{2v}\big[g(x+vt)-g(x-vt)\big][/math]
As for the second space derivative, this is computed as follows:
[math]\varphi''(x,t)=\frac{f''(x-vt)+f''(x+vt)}{2}+\frac{1}{2v}\big[g'(x+vt)-g'(x-vt)\big][/math]
Thus we have indeed [math]\ddot{\varphi}=v^2\varphi''[/math]. As for the initial conditions, [math]\varphi(x,0)=f(x)[/math] is clear from our definition of [math]\varphi[/math], and [math]\dot{\varphi}(x,0)=g(x)[/math] is clear from our above formula of [math]\dot{\varphi}[/math].
(2) Conversely now, we must show that our solution is unique, but instead of going here into abstract arguments, we will simply solve our equation, which among others will double-check our computations in (1). Let us make the following change of variables:
[math]\xi=x-vt\quad,\quad \eta=x+vt[/math]
With this change of variables, which is quite tricky, mixing space and time variables, our wave equation [math]\ddot{\varphi}=v^2\varphi''[/math] reformulates in a very simple way, as follows:
[math]\frac{d^2\varphi}{d\xi d\eta}=0[/math]
But this latter equation tells us that our new [math]\xi,\eta[/math] variables get separated, and we conclude from this that the solution must be of the following special form:
[math]\varphi(x,t)=F(x-vt)+G(x+vt)[/math]
Now by taking into account the initial conditions [math]\varphi(x,0)=f(x)[/math] and [math]\dot{\varphi}(x,0)=g(x)[/math], and then integrating, we are led to the d'Alembert formula in the statement.
(3) In regard now to our discretization questions, by using a 1D lattice model with balls and springs as before, what happens to all the above is more or less that the above d'Alembert integral gets computed via Riemann sums, in our model, as stated.
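For the reader wanting to double-check the d'Alembert formula numerically, here is a Python sketch of mine (numpy assumed), in the simplest case [math]g=0[/math], where the formula reads [math]\varphi(x,t)=\frac{f(x-vt)+f(x+vt)}{2}[/math]:

```python
import numpy as np

v = 1.5
f = lambda x: np.exp(-x**2)                        # initial profile f
phi = lambda x, t: (f(x - v * t) + f(x + v * t)) / 2   # d'Alembert, with g = 0

x, t, h = 0.2, 0.7, 1e-3
# Finite-difference check of the wave equation, and of the initial condition:
phi_tt = (phi(x, t + h) - 2 * phi(x, t) + phi(x, t - h)) / h**2
phi_xx = (phi(x + h, t) - 2 * phi(x, t) + phi(x - h, t)) / h**2
residual = phi_tt - v**2 * phi_xx
initial_error = abs(phi(x, 0) - f(x))
```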
In [math]N\geq2[/math] dimensions things are more complicated. We will be back to this.
5d. Discrete heat
The general heat equation is quite similar to the one for the waves, as follows:
Heat diffusion in [math]\mathbb R^N[/math] is described by the heat equation
[math]\dot{\varphi}=\alpha\Delta\varphi[/math]
with [math]\alpha \gt 0[/math] being a constant, depending on the material involved.
The study here is quite similar to the study of waves, as follows:
(1) To start with, as an intuitive explanation for the equation, since the second derivative [math]\varphi''[/math] in one dimension, or the quantity [math]\Delta\varphi[/math] in general, computes the average value of a function [math]\varphi[/math] around a point, minus the value of [math]\varphi[/math] at that point, the heat equation as formulated above tells us that the rate of change [math]\dot{\varphi}[/math] of the temperature of the material at any given point must be proportional, with proportionality factor [math]\alpha \gt 0[/math], to the average difference of temperature between that given point and the surrounding material.
(2) The heat equation as formulated above is of course something approximative, and several improvements can be made to it, first by incorporating a term accounting for heat radiation, and then doing several fine-tunings, depending on the material involved. But more on this later, for the moment let us focus on the heat equation above.
(3) In relation with our modelling questions, we can recover this equation a bit as we did for the wave equation before, by using a basic lattice model. Indeed, let us first assume, for simplifying, that we are in the one-dimensional case, [math]N=1[/math]. Here our model looks as follows, with distance [math]l \gt 0[/math] between neighbors:
In order to model heat diffusion, we have to implement the intuitive mechanism explained above, namely “the rate of change of the temperature of the material at any given point must be proportional, with proportionality factor [math]\alpha \gt 0[/math], to the average difference of temperature between that given point and the surrounding material”.
(4) In practice, this leads to a condition as follows, expressing the change of the temperature [math]\varphi[/math], over a small period of time [math]\delta \gt 0[/math]:
[math]\varphi(x,t+\delta)-\varphi(x,t)=\frac{\alpha\delta}{l^2}\sum_{y\sim x}\big[\varphi(y,t)-\varphi(x,t)\big][/math]
To be more precise, we have made several assumptions here, as follows:
-- General heat diffusion assumption: the change of temperature at any given point [math]x[/math] is proportional to the average over neighbors, [math]y\sim x[/math], of the differences [math]\varphi(y,t)-\varphi(x,t)[/math] between the temperatures at [math]x[/math], and at these neighbors [math]y[/math].
-- Infinitesimal time and length conditions: in our model, the change of temperature at a given point [math]x[/math] is proportional to the small period of time involved, [math]\delta \gt 0[/math], and is inversely proportional to the square of the distance between neighbors, [math]l^2[/math].
(5) Regarding these latter assumptions, the one regarding the proportionality with the time elapsed [math]\delta \gt 0[/math] is something quite natural, physically speaking, and mathematically speaking too, because we can rewrite our equation as follows, making it clear that we have here an equation regarding the rate of change of temperature at [math]x[/math]:
[math]\frac{\varphi(x,t+\delta)-\varphi(x,t)}{\delta}=\frac{\alpha}{l^2}\sum_{y\sim x}\big[\varphi(y,t)-\varphi(x,t)\big][/math]
As for the second assumption that we made above, namely inverse proportionality with [math]l^2[/math], this can be justified on physical grounds too, but again, perhaps the best is to do the math, which will show right away where this proportionality comes from.
(6) So, let us do the math. In the context of our 1D model the neighbors of [math]x[/math] are the points [math]x\pm l[/math], and so the equation that we wrote above takes the following form:
[math]\varphi(x,t+\delta)-\varphi(x,t)=\frac{\alpha\delta}{l^2}\big[\varphi(x+l,t)-2\varphi(x,t)+\varphi(x-l,t)\big][/math]
Now observe that we can write this equation as follows:
[math]\frac{\varphi(x,t+\delta)-\varphi(x,t)}{\delta}=\alpha\cdot\frac{\varphi(x+l,t)-2\varphi(x,t)+\varphi(x-l,t)}{l^2}[/math]
(7) As it was the case with the wave equation before, we recognize on the right the usual approximation of the second derivative, coming from calculus. Thus, when taking the continuous limit of our model, [math]l\to 0[/math], we obtain the following equation:
[math]\frac{\varphi(x,t+\delta)-\varphi(x,t)}{\delta}=\alpha\cdot\frac{d^2\varphi}{dx^2}(x,t)[/math]
Now with [math]\delta\to0[/math], we are led in this way to the heat equation, namely:
[math]\dot{\varphi}=\alpha\cdot\frac{d^2\varphi}{dx^2}[/math]
Summarizing, we are done with the 1D case, with our proof being quite similar to the one for the wave equation, from the previous section.
(8) In practice now, there are of course still a few details to be discussed, in relation with all this, for instance at the end, in relation with the precise order of the limiting operations [math]l\to0[/math] and [math]\delta\to0[/math] to be performed, but these remain minor aspects, because our equation makes it clear, right from the beginning, that time and space are separated, and so that there is no serious issue with all this. And so, fully done with 1D.
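The 1D lattice mechanism above can be sketched in a few lines of Python (numpy assumed, my own illustration), with a constant [math]c[/math] playing the role of the combined quantity [math]\alpha\delta/l^2[/math]; observe that each step conserves the total heat, and smooths the distribution:

```python
import numpy as np

def heat_step(phi, c=0.2):
    # One time step of discrete heat diffusion, with periodic boundary:
    # phi(x) <- phi(x) + c * (phi(x+l) + phi(x-l) - 2*phi(x))
    return phi + c * (np.roll(phi, 1) + np.roll(phi, -1) - 2 * phi)

phi0 = np.zeros(50)
phi0[25] = 1.0                  # initial hot spot
phi = phi0.copy()
for _ in range(100):
    phi = heat_step(phi)
```

One can check that `phi.sum()` stays constant over the steps, while the maximum temperature decreases, as expected from diffusion.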
(9) With this done, let us discuss now 2 dimensions. Here, as before for the waves, we can use a lattice model as follows, with all lengths being [math]l \gt 0[/math], for simplifying:
(10) We have to implement now the physical heat diffusion mechanism, namely “the rate of change of the temperature of the material at any given point must be proportional, with proportionality factor [math]\alpha \gt 0[/math], to the average difference of temperature between that given point and the surrounding material”. In practice, this leads to a condition as follows, expressing the change of the temperature [math]\varphi[/math], over a small period of time [math]\delta \gt 0[/math]:
[math]\varphi(x,y,t+\delta)-\varphi(x,y,t)=\frac{\alpha\delta}{l^2}\sum_{(x',y')\sim(x,y)}\big[\varphi(x',y',t)-\varphi(x,y,t)\big][/math]
In fact, we can rewrite our equation as follows, making it clear that we have here an equation regarding the rate of change of temperature at [math](x,y)[/math]:
[math]\frac{\varphi(x,y,t+\delta)-\varphi(x,y,t)}{\delta}=\frac{\alpha}{l^2}\sum_{(x',y')\sim(x,y)}\big[\varphi(x',y',t)-\varphi(x,y,t)\big][/math]
(11) So, let us do the math. In the context of our 2D model the neighbors of [math](x,y)[/math] are the points [math](x\pm l,y)[/math] and [math](x,y\pm l)[/math], so the equation above takes the following form:
[math]\varphi(x,y,t+\delta)-\varphi(x,y,t)=\frac{\alpha\delta}{l^2}\big[\varphi(x+l,y,t)+\varphi(x-l,y,t)+\varphi(x,y+l,t)+\varphi(x,y-l,t)-4\varphi(x,y,t)\big][/math]
Now observe that we can write this equation as follows:
[math]\frac{\varphi(x,y,t+\delta)-\varphi(x,y,t)}{\delta}=\alpha\left[\frac{\varphi(x+l,y,t)-2\varphi(x,y,t)+\varphi(x-l,y,t)}{l^2}+\frac{\varphi(x,y+l,t)-2\varphi(x,y,t)+\varphi(x,y-l,t)}{l^2}\right][/math]
(12) As it was the case when modeling the wave equation before, we recognize on the right the usual approximation of the second derivative, coming from calculus. Thus, when taking the continuous limit of our model, [math]l\to 0[/math], we obtain the following equation:
[math]\frac{\varphi(x,y,t+\delta)-\varphi(x,y,t)}{\delta}=\alpha\left[\frac{d^2\varphi}{dx^2}+\frac{d^2\varphi}{dy^2}\right](x,y,t)[/math]
Now with [math]\delta\to0[/math], we are led in this way to the heat equation, namely:
[math]\dot{\varphi}=\alpha\Delta\varphi[/math]
Finally, in arbitrary [math]N[/math] dimensions the same argument carries over, namely a straightforward lattice model, and gives the heat equation, as formulated in the statement.
Observe that we can use if we want different lengths [math]l \gt 0[/math] on the vertical and on the horizontal, because these will simplify anyway due to proportionality. Also, for some further mathematical fun, we can build our model on a cylinder, or a torus.
Also, as mentioned before, our heat equation above is something approximative, and several improvements can be made to it, first by incorporating a term accounting for heat radiation, and also by doing several fine-tunings, depending on the material involved. Some of these improvements can be implemented in the lattice model setting.
Regarding now the mathematics of the heat equation, many things can be said. As a first result here, often used by mathematicians, as to assume [math]\alpha=1[/math], we have:
Up to a time rescaling, we can assume [math]\alpha=1[/math], as to deal with the equation
[math]\dot{\varphi}=\Delta\varphi[/math]
This is clear physically speaking, because according to our model, changing the parameter [math]\alpha \gt 0[/math] will result in accelerating or slowing the heat diffusion, in time [math]t \gt 0[/math]. Mathematically, this follows via a change of variables, for the time variable [math]t[/math].
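In formulae, the change of variables can be sketched as follows:

```latex
\text{Assuming }\dot{\varphi}=\alpha\Delta\varphi,\text{ set }\psi(x,t)=\varphi(x,t/\alpha).
\text{ Then, by the chain rule: }\quad
\dot{\psi}(x,t)=\frac{1}{\alpha}\,\dot{\varphi}(x,t/\alpha)
=\Delta\varphi(x,t/\alpha)=\Delta\psi(x,t)
```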
Regarding now the resolution of the heat equation, we have here:
The heat equation, normalized as [math]\dot\varphi=\Delta\varphi[/math], and with initial condition [math]\varphi(x,0)=f(x)[/math], has as solution the function
[math]\varphi=K_t*f\quad,\quad K_t(x)=\frac{1}{(4\pi t)^{N/2}}\,e^{-||x||^2/4t}[/math]
According to the definition of the convolution operation [math]*[/math], we have to check that the following function satisfies [math]\dot\varphi=\Delta\varphi[/math], with initial condition [math]\varphi(x,0)=f(x)[/math]:
[math]\varphi(x,t)=\frac{1}{(4\pi t)^{N/2}}\int_{\mathbb R^N}e^{-||x-y||^2/4t}f(y)\,dy[/math]
But both checks are elementary, coming from definitions.
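As a numerical illustration of this (a Python sketch of mine, in 1D, with numpy assumed), convolving a centered Gaussian of variance [math]s[/math] with the heat kernel [math]K_t[/math] should produce a Gaussian of variance [math]s+2t[/math], with the total heat staying equal to 1:

```python
import numpy as np

dx = 0.01
x = np.arange(-20, 20, dx)
gauss = lambda x, var: np.exp(-x**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

s, t = 1.0, 0.5
f = gauss(x, s)                                        # initial condition, variance s
K = np.exp(-x**2 / (4 * t)) / np.sqrt(4 * np.pi * t)   # 1D heat kernel K_t

phi = np.convolve(f, K, mode='same') * dx              # phi = K_t * f, discretized
mass = np.sum(phi) * dx                                # total heat, should stay 1
var = np.sum(x**2 * phi) * dx / mass                   # should be close to s + 2t
```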
Regarding now our discretization questions, things here are quite tricky, and related to the Central Limit Theorem (CLT) from probability theory, which produces the normal laws, in dimension [math]N=1[/math], but also in general, in arbitrary [math]N\geq1[/math] dimensions.
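Regarding this CLT link, here is a quick Monte Carlo sketch of mine (numpy assumed): the position of a simple random walk on [math]\mathbb Z[/math] after [math]n[/math] steps, rescaled by [math]\sqrt{n}[/math], looks approximately like a standard normal variable:

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 1000, 20000
steps = rng.choice([-1, 1], size=(trials, n))     # +/-1 steps of the walk
positions = steps.sum(axis=1) / np.sqrt(n)        # rescaled final positions

mean, std = positions.mean(), positions.std()
within_one = np.mean(np.abs(positions) < 1)       # about 0.68 for a normal law
```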
General references
Banica, Teo (2024). "Graphs and their symmetries". arXiv:2406.03664 [math.CO].