78
arXiv:0907.4178v1 [math.PR] 23 Jul 2009 An Introduction to Stochastic PDEs July 23, 2009 Martin Hairer The University of Warwick / Courant Institute Contents 1 Introduction 1 1.1 Acknowledgements ...................................... 2 2 Some Motivating Examples 2 2.1 A model for a random string (polymer) ............................ 2 2.2 The stochastic Navier-Stokes equations ........................... 3 2.3 The stochastic heat equation ................................. 4 2.4 What have we learned? .................................... 6 3 Gaussian Measure Theory 7 3.1 A-priori bounds on Gaussian measures ............................ 10 3.2 The Cameron-Martin space .................................. 16 3.3 Images of Gaussian measures ................................. 21 3.4 Cylindrical Wiener processes and stochastic integration ................... 24 4 A Primer on Semigroup Theory 28 4.1 Strongly continuous semigroups ............................... 29 4.2 Semigroups with selfadjoint generators ............................ 34 4.3 Analytic semigroups ..................................... 35 4.4 Interpolation spaces ...................................... 39 5 Linear SPDEs / Stochastic Convolutions 44 5.1 Time and space regularity ................................... 47 5.2 Long-time behaviour ..................................... 50 5.3 Convergence in other topologies ............................... 55 6 Semilinear SPDEs 63 6.1 Local solutions ........................................ 63 6.2 Interpolation inequalities and Sobolev embeddings ..................... 66 6.3 Reaction-diffusion equations ................................. 70 6.4 The stochastic Navier-Stokes equations ........................... 73 1 Introduction These notes are based on a series of lectures given first at the University of Warwick in spring 2008 and then at the Courant Institute in spring 2009. It is an attempt to give a reasonably self-contained presentation of the basic theory of stochastic partial differential equations, taking for granted basic measure theory, functional analysis and probability theory, but nothing else. Since the aim was to present most of the material covered in these notes during a 30-hours series of postgraduate lecture, such an attempt is doomed to failure unless drastic choices are made. This is why many

arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

arX

iv:0

907.

4178

v1 [

mat

h.P

R]

23 J

ul 2

009

An Introduction to Stochastic PDEs

July 23, 2009

Martin Hairer

The University of Warwick / Courant Institute

Contents

1 Introduction 11.1 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . 2

2 Some Motivating Examples 22.1 A model for a random string (polymer) . . . . . . . . . . . . . . . . .. . . . . . . . . . . 22.2 The stochastic Navier-Stokes equations . . . . . . . . . . . . .. . . . . . . . . . . . . . 32.3 The stochastic heat equation . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . 42.4 What have we learned? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . 6

3 Gaussian Measure Theory 73.1 A-priori bounds on Gaussian measures . . . . . . . . . . . . . . . .. . . . . . . . . . . . 103.2 The Cameron-Martin space . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . 163.3 Images of Gaussian measures . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . 213.4 Cylindrical Wiener processes and stochastic integration . . . . . . . . . . . . . . . . . . . 24

4 A Primer on Semigroup Theory 284.1 Strongly continuous semigroups . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . 294.2 Semigroups with selfadjoint generators . . . . . . . . . . . . .. . . . . . . . . . . . . . . 344.3 Analytic semigroups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . 354.4 Interpolation spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . 39

5 Linear SPDEs / Stochastic Convolutions 445.1 Time and space regularity . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . 475.2 Long-time behaviour . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . 505.3 Convergence in other topologies . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . 55

6 Semilinear SPDEs 636.1 Local solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . 636.2 Interpolation inequalities and Sobolev embeddings . . .. . . . . . . . . . . . . . . . . . 666.3 Reaction-diffusion equations . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . 706.4 The stochastic Navier-Stokes equations . . . . . . . . . . . . .. . . . . . . . . . . . . . 73

1 Introduction

These notes are based on a series of lectures given first at theUniversity of Warwick in spring 2008and then at the Courant Institute in spring 2009. It is an attempt to give a reasonably self-containedpresentation of the basic theory of stochastic partial differential equations, taking for granted basicmeasure theory, functional analysis and probability theory, but nothing else. Since the aim wasto present most of the material covered in these notes duringa 30-hours series of postgraduatelecture, such an attempt is doomed to failure unless drasticchoices are made. This is why many

Page 2: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

2 SOME MOTIVATING EXAMPLES

important facets of the theory of stochastic PDEs are missing from these notes. In particular, wedo not treat equations with multiplicative noise, we donot treat equations driven Levy noise, wedo not consider equations with ‘rough’ (that is not locally Lipschitz, even in a suitable space)nonlinearities, we donot treat measure-valued processes, we donot consider hyperbolic or ellipticproblems, we donot cover Malliavin calculus and densities of solutions, etc. The reader who isinterested in a more detailed exposition of these more technically subtle parts of the theory mightbe advised to read the excellent works [DPZ92b, DPZ96, PZ07,PR07, SS05].

Instead, the approach taken in these notes is to focus on semilinearparabolicproblems drivenby additivenoise. These can be treated as stochastic evolution equations in some infinite-dimen-sional Banach or Hilbert space that usually have nice regularising properties and they alreadyform (in my humble opinion) a very rich class of problems withmany interesting properties.Furthermore, this class of problems has the advantage of allowing to completely pass under silencemany subtle problems arising from stochastic integration in infinite-dimensional spaces.

1.1 Acknowledgements

These notes would never have been completed, were it not for the enthusiasm of the attendants ofthe course. Hundreds of typos and mistakes were spotted and corrected. I am particularly indebtedto David Epstein and Jochen Voß who carefully worked their way through these notes when theywere still in a state of wilderness. Special thanks are also due to Pavel Bubak who was runningthe tutorials for the course given in Warwick.

2 Some Motivating Examples

2.1 A model for a random string (polymer)

TakeN + 1 particles with positionsun immersed in a fluid and assume that nearest-neighboursare connected by harmonic springs. If the particles are furthermore subject to an external forcingF , the equations of motion (in the overdamped regime where theforces acting on the particle aremore important than inertia, which can also formally be seenas the limit where the masses of theparticles go to zero) would be given by

du0

dt= k(u1 − u0) + F (u0) ,

dun

dt= k(un+1 + un−1 − 2un) + F (un) , n = 1, . . . , N − 1 ,

duN

dt= k(uN−1 − uN ) + F (uN ) .

This is a primitive model for a polymer chain consisting ofN + 1 monomers and without self-interaction. It does however not take into account the effect of the molecules of water that wouldrandomly ‘kick’ the particles that make up our string. Assuming that these kicks occur randomlyand independently at high rate, this effect can be modelled in first instance by independent whitenoises acting on all degrees of freedom of our model. We thus obtain a system of coupled stochas-tic differential equations:

du0 = k(u1 − u0) dt + F (u0) dt+ σ dw0(t) ,

dun = k(un+1 + un−1 − 2un) dt + F (un) dt + σ dwn(t) , n = 1, . . . , N − 1 ,

duN = k(uN−1 − uN ) dt + F (uN ) dt + σ dwN (t) .

Page 3: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

SOME MOTIVATING EXAMPLES 3

Formally taking the continuum limit (with the scalingsk ≈ νN2 andσ ≈√N ), we can infer that

if N is very large, this system is well-described by the solutionto a stochasticpartial differentialequation

du(x, t) = ν∂2xu(x, t) dt + F (u(x, t)) dt + dW (x, t) ,

endowed with the boundary conditions∂xu(0, t) = ∂xu(1, t) = 0. It is not so cleara priori whatthe meaning of the termdW (x, t) should be. We will see in the next section that, at least on aformal level, it is reasonable to assume thatEdW (x,t)

dtdW (y,s)

ds = δ(x − y)δ(t − s). The precisemeaning of this formula will be discussed later.

2.2 The stochastic Navier-Stokes equations

The Navier-Stokes equations describing the evolution of the velocity fieldu(x, t) of an incom-pressible viscous fluid are given by

du

dt= ν∆u− (u · ∇)u−∇p+ f , (2.1)

complemented with the (algebraic) incompressibility condition divu = 0. Here,f denotes someexternal force acting on the fluid, whereas the pressurep is given implicitly by the requirementthat divu = 0 at all times.

While it is not too difficult in general to show that solutionsto (2.1) exist in some weak sense,in the case wherex ∈ Rd with d ≥ 3, their uniquenessis an open problem with a $1,000,000prize. We will of course not attempt to solve this long-standing problem, so we are going torestrict ourselves to the cased = 2. (The cased = 1 makes no sense since there the conditiondivu = 0 would imply thatu is constant. However, one could also consider the Burger’s equationwhich has similar features to the Navier-Stokes equations.)

For simplicity, we consider solutions that are periodic in space, so that we viewu as a functionfrom T2 × R+ to R2. In the absence of external forcingf , one can use the incompressibilityassumption to see that

d

dt

T2

|u(x, t)|2 dx = −2ν

T2

trDu(x, t)∗Du(x, t) dx ≤ −2ν

T2

|u(x, t)|2 dx ,

where we used the Poincare inequality in the last line (assuming that∫

T2 u(x, t) dx = 0). There-fore, by Gronwall’s inequality, the solutions decay to0 exponentially fast. This shows that energyneeds to be pumped into the system continuously if one wishesto maintain an interesting regime.

One way to achieve this from a mathematical point of view is toadd a forcef that is randomlyfluctuating. We are going to show that if one takes a random force that is Gaussian and such that

Ef (x, t)f (y, s) = δ(t− s)C(x− y) ,

for some correlation functionC then, provided thatC is sufficiently regular, one can show that(2.1) has solutions for all times. Furthermore, these solutions do not blow up in the sense that onecan find a constantK such that, for any solution to (2.1), one has

lim supt→∞

E‖u(t)‖2 ≤ K ,

for some suitable norm‖ · ‖. This allows to provide a construction of a model for homogeneousturbulence which is amenable to mathematical analysis.

Page 4: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

4 SOME MOTIVATING EXAMPLES

2.3 The stochastic heat equation

In this section, we focus on the particular example of the stochastic heat equation. We will performa number of calculations that give us a feeling for what the solutions to this equation look like.These calculations will not be completely rigorous but could be made so with some extra effort.Most tools required to make them rigorous will be introducedlater in the course.

2.3.1 Setup

Recall that theheat equationis the partial differential equation:

∂tu = ∆u , u: R+ × Rn → R . (2.2)

Given any bounded continuous initial conditionu0: Rn → R, there exists a unique solutionu to(2.2) which is continuous onR+ × Rn and such thatu(0, x) = u0(x) for everyx ∈ Rn.

This solution is given by the formula

u(t, x) =1

(4πt)n/2

Rne−

|x−y|24t u0(y) dy .

We will denote this by the shorthandu(t, · ) = e∆tu0 by analogy with the solution to anRd-valuedlinear equation of the type∂tu = Au.

Let us now go one level up in difficulty by considering (2.2) with an additional ‘forcing term’f :

∂tu = ∆u+ f , u: R+ × Rn → R . (2.3)

From the variations of constants formula, we obtain that thesolution to (2.3) is given by

u(t, · ) = e∆tu0 +

∫ t

0

e∆(t−s)f (s, · ) ds . (2.4)

Since the kernel defininge∆t is very smooth, this expression actually makes sense for a largeclass of distributionsf . Suppose now thatf is ‘space-time white noise’. We do not define thisrigorously for the moment, but characterise it as a (distribution-valued) centred Gaussian processξ such thatEξ(s, x)ξ(t, y) = δ(t− s)δ(x− y).

The stochastic heat equation is then the stochastic partialdifferential equation

∂tu = ∆u+ ξ , u: R+ × Rn → R . (2.5)

Consider the simplest caseu0 = 0, so that its solution is given by

u(t, x) =

∫ t

0

1

(4π|t − s|)n/2

Rne−

|x−y|24(t−s) ξ(s, y) dy ds (2.6)

This is again a centred Gaussian process, but its covariancefunction is more complicated. The aimof this section is to get some idea about the space-time regularity properties of (2.6). While thesolutions to ordinary stochastic differential equations are in generalα-Holder continuous (in time)for everyα < 1/2 but not forα = 1/2, we will see that in dimensionn = 1, u as given by (2.6)is only ‘almost’1/4-Holder continuous in time and ‘almost’1/2-Holder continuous in space. Inhigher dimensions, it is not even function-valued... The reason for this lower time-regularity isthat the driving space-time white noise is not only very singular as a function of time, but also asa function of space. Therefore, some of the regularising effect of the heat equation is required toturn it into a continuous function in space.

Page 5: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

SOME MOTIVATING EXAMPLES 5

Heuristically, the appearance of the Holder exponents1/2 for space and1/4 for time in di-mensionn = 1 can be understood by the following argument. If we were to remove the term∂tu in (2.5), thenu would have the same time-regularity asξ, but two more derivatives of spaceregularity. If on the other hand we were to remove the term∆u, thenu would have the samplespace regularity asξ, but one more derivative of time regularity. The consequence of keeping bothterms is that we can ‘trade’ space-regularity against time-regularity at a cost of one time derivativefor two space derivatives. Now we know that white noise (thatis the centred Gaussian processη with Eη(t)η(s) = δ(t − s)) is the time derivative of Brownian motion, which itself is‘almost’1/2-Holder continuous. Therefore, the regularity ofη requires ‘a bit more than half a derivative’of improvement if we wish to obtain a continuous function.

Turning back toξ, we see that it is expected to behave likeη both in the space direction andin the time direction. So, in order to turn it into a continuous function of time, roughly half of atime derivative is required. This leaves over half of a time derivative, which we trade against onespatial derivative, thus concluding that for fixed time,u will be almost1/2-Holder continuous inspace. Concerning the time regularity, we note that half of aspace derivative is required to turnξ into a continuous function of space, thus leaving one and a half space derivative. These can betraded against3/4 of a time derivative, thus explaining the1/4-Holder continuity in time.

In Section 5.1, we are going to see more precisely how the space-regularity and the time-regularity interplay in the solutions to linear SPDEs, thusallowing us to justify rigorously thistype of heuristic arguments. For the moment, let us justify it by a calculation in the particular caseof the stochastic heat equation.

2.3.2 A formal calculation

Define the covariance for the solution to the stochastic heatequation by

C(s, t, x, y) = Eu(s, x)u(t, y) , (2.7)

whereu is given by (2.6).By (statistical) translation invariance, it is clear thatC(s, t, x, y) = C(s, t, 0, x − y). Using

(2.6) and the expression for the covariance ofξ, one has

C(s, t, 0, x)

=1

(4π)nE

∫ t

0

∫ s

0

Rn

Rn

1

|s− r′|n/2|t− r|n/2e− |x−y|2

4(t−r) −|y′|2

4(s−r′) ξ(r, y)ξ(r′, y′) dy dy′ dr′ dr

=1

(4π)n

∫ s∧t

0

Rn

1

|s− r|n/2|t− r|n/2e−

|x−y|24(t−r) −

|y|24(s−r) dy dr

=1

(4π)n

∫ s∧t

0

Rn

1

|s− r|n/2|t− r|n/2

× exp(− |x|2

4(t− r)− 〈x, y〉

2(t− r)− |y|2

4(s− r)− |y|2

4(t− r)

)dy dr .

The integral overy can be performed explicitly by ‘completing the square’ and one obtains

C(s, t, 0, x) = 2−n∫ s∧t

0

(s+ t− 2r)−n/2 exp(− |x|2

4(s+ t− 2r)

)dr

= 2−(n+1)∫ s+t

|s−t|ℓ−n/2 exp

(−|x|2

4ℓ

)dℓ . (2.8)

We notice that the singularity atℓ = 0 is integrable if and only ifn < 2, so thatC(t, t, 0, 0) isfinite only in the one-dimensional case. We therefore limit ourselves to this case in the sequel.

Page 6: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

6 SOME MOTIVATING EXAMPLES

Remark 2.1 Even though the random variableu defined by (2.6) is not function-valued in dimen-sion2, it is ‘almost’ the case since the singularity in (2.8) diverges only logarithmically. The sta-tionary solution to (2.5) is called theGaussian free fieldand has been the object of intense studiesover the last few years, especially in dimension2. Its interest stems from the fact that many of itsfeatures are conformally invariant (as a consequence of theconformal invariance of the Laplacian),thus linking probability theory to quantum field theory on one hand and to complex geometry onthe other hand. The Gaussian free field also relates directlyto the Schramm-Loewner evolutions(SLEs) for the study of which Werner was awarded the Fields medal in 2006, see [Law04, SS06].For more information on the Gaussian free field, see for example the review article by Sheffield[She07].

The regularity ofu is determined by the behaviour ofC near the ‘diagonal’s = t, x = y. Wefirst consider the time regularity. We therefore setx = 0 and compute

C(s, t, 0, 0) =1

4

∫ s+t

|s−t|ℓ−1/2 dℓ = 1

2(|s+ t| 12 − |s− t| 12 ) .

This shows that, in the casen = 1 and fors ≈ t, one has the asymptotic behaviour

E|u(s, 0) − u(t, 0)|2 ≈ |t− s| 12 .

Comparing this with the standard Brownian motion for whichE|B(s) − B(t)|2 = |t − s|, weconclude that the processt 7→ u(t, x) is, for fixedx, almost surelyα-Holder continuous for anyexponentα < 1/4 but not for α = 1/4. This argument is a special case of Kolmogorov’s cele-brated continuity test, of which we will see a version adapted to Gaussian measures in Section 3.1.

If, on the other hand, we fixs = t, we obtain (still in the casen = 1) via the change ofvariablesz = |x|2/4ℓ, the expression

C(t, t, 0, x) =|x|8

∫ ∞

|x|28t

z−3

2 e−z dz .

Integrating by parts, we get

C(t, t, 0, x) =

√t

4e−

|x|28t +

|x|4

∫ ∞

|x|28t

z−1

2 e−z dz ,

So that to leading order we have for small values ofx:

C(t, t, 0, x) ≈√t

4+

|x|4

∫ ∞

0

z−1

2 e−z dz =√t+

√π|x|4

+ O(|x|2/8√t) .

This shows that, at any fixed instantt, the solution to (2.5) looks like a Brownian motion in spaceover lengthscales of ordert1/2. Note that over such a lengthscale the Brownian motion fluctuatesby aboutt1/4, which is precisely the order of magnitude ofE|u(t, x)|.

2.4 What have we learned?

1. At a ‘hand-waving’ level, we have forced our equation witha term that has a temporalevolution resembling white noise, so that one would naivelyexpect its solutions to havea temporal regularity resembling Brownian motion. However, for any fixed location inspace, the solution to the stochastic heat equation has a time-regularity which is only almostHolder-1

4, as opposed to the almost Holder-1

2time-regularity of Brownian motion.

Page 7: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

GAUSSIAN MEASURE THEORY 7

2. Unlike the solutions to an ordinary parabolic PDE, the solutions to a stochastic PDE tend tobe spatially ‘rough’. It is therefore not obviousa priori how the formal expression that weobtained is to be related to the original equation (2.5), since even for positive times, the mapx 7→ u(t, x) is certainly not twice differentiable.

3. Even though the deterministic heat equation has the property thate∆tu → 0 ast → ∞ foreveryu ∈ L2, the solution to the stochastic heat equation has the property thatE|u(x, t)|2 →∞ for fixedx ast→ ∞. This shows that in this particular case, the stochastic forcing termpumps energy into the system faster than the deterministic evolution can dissipate it.

Exercise 2.2Perform the same calculation, but for the equation

∂tu = ∆u− au+ ξ , u: R+ × R → R .

Show that ast→ ∞, the law of its solution converges to the law of an Ornstein-Uhlenbeck process(if the space variable is viewed as ‘time’):

limt→∞

Eu(t, x)u(t, y) = Ce−c|x−y| .

Compute the constantsC andc as functions of the parametera.

3 Gaussian Measure Theory

This section is devoted to the study of Gaussian measures on general Banach spaces. Throughoutthis section and throughout most of the remainder of these notes, we will denote byB an arbitraryseparable Banach space. Recall that a space is separable if it contains a countable dense subset,see for example the monograph [Yos95]. This separability assumption turns out to be crucial formeasures onB to behave in a non-pathological way. It can be circumvented by trickery in mostnatural situations where non-separable spaces arise, but we choose not to complicate our lives byconsidering overly general cases in these notes. Another convention that will be used throughoutthese notes is that all of the measures that we consider are Borel measures, meaning that weconsider every open set to be measurable.

One additional assumption that would appear to be natural inthe context of Gaussian measuretheory is thatB be reflexive (that isB∗∗ = B). This is for example because the mean of a measureµ appears at first sight to be an element ofB∗∗ rather than ofB, since the natural1 way of definingthe meanm of µ is to setm(ℓ) =

∫B ℓ(x)µ(dx) for anyℓ ∈ B∗. This turns out not to be a problem,

since the mean of a Gaussian measure on aseparableBanach spaceB is always an element ofB itself, see the monograph [Bog98]. However this result is not straightforward to prove, so wewill take here the more pragmatic approach that whenever we consider Gaussian measures withnon-zero mean, we simply take the meanm ∈ B as given.

Example 3.1 Before we proceed, let us just mention a few examples of Banach spaces. ThespacesLp(M, ν) (with (M, ν) any countably generated measure space like for example anyPolishspace equipped with a Radon measureν) for p ∈ (1,∞) are both reflexive and separable. However,reflexivity fails in general forL1 spaces and both properties fail to hold in general forL∞ spaces[Yos95]. The space of bounded continuous functions on a compact space is separable, but notreflexive. The space of bounded continuous functions fromRn to R is neither separable nor

1Without further assumption, we do not know a priori whetherx 7→ ‖x‖ is integrable, so that the more naturaldefinitionm =

∫B x µ(dx) is prohibited.

Page 8: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

8 GAUSSIAN MEASURE THEORY

reflexive, but the space of continuous functions fromRn to R vanishing at infinity is separable.(The last two statements are still true if we replaceRn by any locally compact complete separablemetric space.) Hilbert spaces are obviously reflexive sinceH∗ = H for every Hilbert spaceH bythe Riesz representation theorem [Yos95]. There exist non-separable Hilbert spaces, but they haverather pathological properties and do not appear very oftenin practice.

We start with the definition of a Gaussian measure on a Banach space. Since there is noequivalent to Lebesgue measure in infinite dimensions (one could never expect it to beσ-additive),we cannot define it by prescribing the form of its density. However, it turns out that Gaussianmeasures onRn can be characterised by prescribing that the projections ofthe measure onto anyone-dimensional subspace ofRn are all Gaussian. This is a property that can readily be generalisedto infinite-dimensional spaces:

Definition 3.2 A Gaussian probability measureµ on a Banach spaceB is a Borel measure suchthatℓ∗µ is a real Gaussian probability measure onR for every linear functionalℓ:B → R. (Diracmeasures are also considered to be Gaussian measures, but with zero covariance.) We call itcentredif ℓ∗µ is centred for everyℓ.

Remark 3.3 We used here the notationf∗µ for the push-forward of a measureµ under a mapf .This is defined by the relation(f∗µ)(A) = µ(f−1(A)).

Remark 3.4 We could also have defined Gaussian measures by imposing thatT ∗µ is Gaussian forevery bounded linear mapT :B → Rn and everyn. These two definitions are equivalent becauseprobability measures onRn are characterised by their Fourier transforms and these areconstructedfrom one-dimensional marginals, see Proposition 3.9 below.

Exercise 3.5Let ξn be a sequence of i.i.d.N (0, 1) random variables and letan be a sequenceof real numbers. Show that the law of (a0ξ0, a1ξ1, . . .) determines a Gaussian measure onℓ2 ifand only if

∑n≥0 a

2n <∞.

One first question that one may ask is whether this is indeed a reasonable definition. After all, itonly makes a statement about the one-dimensional projections of the measureµ, which itself liveson a huge infinite-dimensional space. However, this turns out to be reasonable since, providedthatB is separable, the one-dimensional projections of any probability measure carry sufficientinformation to characterise it. This statement can be formalised as follows:

Proposition 3.6 Let B be a separable Banach space and letµ and ν be two probability Borelmeasures onB. If ℓ∗µ = ℓ∗ν for everyℓ ∈ B∗, thenµ = ν.

Proof. Denote by Cyl(B) the algebra of cylindrical sets onB, that isA ∈ Cyl(B) if and onlyif there existsn > 0, a continuous linear mapT :B → Rn, and a Borel setA ⊂ Rn such thatA = T−1A. It follows from the fact that measures onRn are determined by their one-dimensionalprojections thatµ(A) = ν(A) for everyA ∈ Cyl(B) and therefore, by a basic uniqueness resultin measure theory (see Lemma II.4.6 in [RW94] or Theorem 1.5.6 in [Bog07] for example), foreveryA in theσ-algebraE(B) generated by Cyl(B). It thus remains to show thatE(B) coincideswith the Borelσ-algebra ofB. Actually, since every cylindrical set is a Borel set, it suffices toshow that all open (and therefore all Borel) sets are contained inE(B).

SinceB is separable, every open setU can be written as a countable union of closed balls. (Fixany dense countable subsetxn of B and check that one has for exampleU =

⋃xn∈U B(xn, rn),

wherern = 12

supr > 0 : B(xn, r) ⊂ U and B(x, r) denotes the closed ball of radiusr

Page 9: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

GAUSSIAN MEASURE THEORY 9

centred atx.) SinceE(B) is invariant under translations and dilations, it remainsto check thatB(0, 1) ∈ E(B). Let xn be a countable dense subset ofx ∈ B : ‖x‖ = 1 and letℓn beany sequence inB∗ such that‖ℓn‖ = 1 andℓn(xn) = 1 (such elements exist by the Hahn-Banachextension theorem [Yos95]). Define nowK =

⋂n≥0x ∈ B : |ℓn(x)| ≤ 1. It is clear that

K ∈ E(B), so that the proof is complete if we can show thatK = B(0, 1).Since obviouslyB(0, 1) ⊂ K, it suffices to show that the reverse inclusion holds. Lety ∈

B with ‖y‖ > 1 be arbitrary and sety = y/‖y‖. By the density of thexn’s, there exists asubsequencexkn such that‖xkn − y‖ ≤ 1

n , say, so thatℓkn(y) ≥ 1 − 1n . By linearity, this implies

thatℓkn(y) ≥ ‖y‖(1− 1n), so that there exists a sufficiently largen so thatℓkn(y) > 1. This shows

thaty 6∈ K and we conclude thatK ⊂ B(0, 1) as required.

From now on, we will mostly consider centred Gaussian measures, since one can always re-duce oneself to the general case by a simple translation. Given a centred Gaussian measureµ, wedefine a mapCµ:B∗ × B∗ → R by

Cµ(ℓ, ℓ′) =

Bℓ(x)ℓ′(x)µ(dx) . (3.1)

Remark 3.7 In the caseB = Rn, this is just the covariance matrix, provided that we perform theusual identification ofRn with its dual.

Remark 3.8 One can identify in a canonical wayCµ with an operatorCµ:B∗ → B∗∗ via theidentity Cµ(ℓ)(ℓ′) = Cµ(ℓ, ℓ′).

The mapCµ will be called theCovariance operatorof µ. It follows immediately from thedefinitions that the operatorCµ is bilinear and positive definite, although there might in generalexist someℓ such thatCµ(ℓ, ℓ) = 0. Furthermore, the Fourier transform ofµ is given by

µ(ℓ) def=

Beiℓ(x) µ(dx) = exp(−1

2Cµ(ℓ, ℓ)) , (3.2)

whereℓ ∈ B∗. This can be checked by using the explicit form of the one-dimensional Gaussianmeasure. Conversely, (3.2) characterises Gaussian measures in the sense that ifµ is a measuresuch that there existsCµ satisfying (3.2) for everyℓ ∈ B∗, thenµ must be centred Gaussian. Thereason why this is so is that two distinct probability measures necessarily have distinct Fouriertransforms:

Proposition 3.9 Letµ andν be any two probability measures on a separable Banach spaceB. Ifµ(ℓ) = ν(ℓ) for everyℓ ∈ B∗, thenµ = ν.

Proof. In the particular caseB = Rn, if ϕ is a smooth function with compact support, it followsfrom Fubini’s theorem and the invertibility of the Fourier transform that one has the identity

Rnϕ(x)µ(dx) =

1

(2π)n

Rn

Rnϕ(k)e−ikx dk µ(dx) =

1

(2π)n

Rnϕ(k) µ(−k) dk ,

so that, since bounded continuous functions can be approximated by smooth functions,µ is indeeddetermined byµ. The general case then follows immediately from Proposition 3.6.

As a simple consequence, we have the following trivial but useful property:

Page 10: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

10 GAUSSIAN MEASURE THEORY

Proposition 3.10 Letµ be a Gaussian measure onB and, for everyϕ ∈ R, define the ‘rotation’Rϕ:B2 → B2 by

Rϕ(x, y) = (x sinϕ+ y cosϕ, x cosϕ− y sinϕ) .

Then, one hasR∗ϕ(µ⊗ µ) = µ⊗ µ.

Proof. Since we just showed in Proposition 3.9 that a measure is characterised by its Fouriertransform, it suffices to check thatµ⊗ µ Rϕ = µ⊗ µ, which is an easy exercise.

3.1 A-priori bounds on Gaussian measures

We are going to show now that the operatorCµ has to be bounded, as a straightforward conse-quence of the fact thatx 7→ ‖x‖2 is integrable for any Gaussian measure. Actually, we are goingto show much more, namely that there always exists a constantα > 0 such that exp(α‖x‖2) isintegrable! In other words, the norm of any Banach-space valued Gaussian random variable hasGaussian tails, just like in the finite-dimensional case. Amazingly, this result uses the Gaussianityof the measure only indirectly through the rotation invariance shown in Proposition 3.10, and eventhis property is only used for rotations by the angleϕ = π/4. This is the content of the followingfundamental result in the theory of Gaussian measures:

Theorem 3.11 (Fernique, 1970)Letµ be any probability measure on a separable Banach spaceB such that the conclusion of Proposition 3.10 holds forϕ = π/4. Then, there existsα > 0 suchthat

∫B exp(α‖x‖2)µ(dx) <∞.

Proof. Note first that, from Proposition 3.10, one has for any two positive numberst andτ thebound

µ(‖x‖ ≤ τ )µ(‖x‖ > t) =

‖x‖≤τ

‖y‖>tµ(dx)µ(dy) =

‖x−y√2‖≤τ

‖x+y√2‖>t

µ(dx)µ(dy)

≤∫

‖x‖> t−τ√2

‖y‖> t−τ√2

µ(dx)µ(dy) = µ(‖x‖ > t−τ√

2

)2

. (3.3)

In order to go from the first to the second line, we have used thefact that the triangle inequalityimplies

min‖x‖, ‖y‖ ≥ 12(‖x+ y‖ − ‖x− y‖) ,

so that‖x + y‖ >√

2t and‖x − y‖ ≤√

2τ do indeed imply that both‖x‖ and‖y‖ are greaterthan t−τ√

2. Since‖x‖ is µ-almost surely finite, there exists someτ > 0 such thatµ(‖x‖ ≤ τ ) ≥ 3

4.

Set nowt0 = τ and definetn for n > 0 recursively by the relationtn = tn+1−τ√2

. It follows from(3.3) that

µ(‖x‖ > tn+1) ≤ µ(‖x‖ > tn+1−τ√

2

)2

/µ(‖x‖ ≤ τ ) ≤ 43µ(‖x‖ > tn)2 .

Settingyn = 43µ(‖x‖ > tn+1), this yields the recursionyn+1 ≤ y2

n with y0 ≤ 1/3. Applying thisinequality repeatedly, we obtain

µ(‖x‖ > tn) =3

4yn ≤ 3

4y2n

0 ≤ 1

43−1−2n ≤ 3−2n

.

On the other hand, one can check explicitly thattn =√

2n+1−1√2−1

τ ≤ 2n/2 · (2 +√

2)τ , so that in

particulartn+1 ≤ 2n/2 · 5τ . This shows that one has the bound

µ(‖x‖ > tn) ≤ 3−t2n+1

25τ2 ,

Page 11: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

GAUSSIAN MEASURE THEORY 11

implying that there exists auniversal constantα > 0 such that the boundµ(‖x‖ > t) ≤exp(−2αt2/τ2) holds for everyt ≥ τ . Integrating by parts, we finally obtain

Bexp

(α‖x‖2

τ2

)µ(dx) ≤ eα +

τ2

∫ ∞

τteα

t2

τ2 µ(‖x‖ > t) dt

≤ eα + 2α

∫ ∞

1

te−αt2 dt <∞ , (3.4)

which is the desired result.

As an immediate corollary of Fernique’s theorem, we have

Corollary 3.12 There exists a constant‖Cµ‖ < ∞ such thatCµ(ℓ, ℓ′) ≤ ‖Cµ‖‖ℓ‖‖ℓ′‖ for anyℓ, ℓ′ ∈ B∗. Furthermore, the operatorCµ defined in Remark 3.8 is a continuous operator fromB∗

toB.

Proof. The boundedness ofCµ implies thatCµ is continuous fromB∗ toB∗∗. However,B∗∗ mightbe strictly larger thanB in general. The fact that the range ofCµ actually belongs toB followsfrom the fact that one has the identity

Cµℓ =

Bx ℓ(x)µ(dx) . (3.5)

Here, the right-hand side is well-defined as a Bochner integral [Boc33, Hil53] becauseB is as-sumed to be separable and we know from Fernique’s theorem that ‖x‖2 is integrable with respectto µ.

Remark 3.13 In Theorem 3.11, one can actually take forα any value smaller than1/(2‖Cµ‖).Furthermore, this value happens to be sharp, see [Led96, Thm4.1].

Another consequence of the proof of Fernique’s theorem is aneven stronger result, namelyall moments (including exponential moments!) of the norm ofa Banach-space valued Gaussianrandom variable can be estimated in a universal way in terms of its first moment. More precisely,we have

Proposition 3.14 There exist universal constantsα,K > 0 with the following properties. Letµbe a Gaussian measure on a separable Banach spaceB and letf : R+ → R+ be any measurablefunction such thatf (x) ≤ Cf exp(αx2) for everyx ≥ 0. Define furthermore the first moment ofµbyM =

∫B ‖x‖µ(dx). Then, one has the bound

∫B f (‖x‖/M )µ(dx) ≤ KCf .

In particular, the higher moments ofµ are bounded by∫B ‖x‖2n µ(dx) ≤ n!Kα−nM2n.

Proof. It suffices to note that the bound (3.4) is independent ofτ and that by Chebychev’s in-equality, one can choose for exampleτ = 4M . The last claim then follows from the fact thateαx2 ≥ αnx2n

n!.

Actually, the covariance operatorCµ is more than just bounded. IfB happens to be a Hilbertspace, one has indeed the following result, which allows us to characterise in a very precise waythe set of all centred Gaussian measures on a Hilbert space:

Page 12: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

12 GAUSSIAN MEASURE THEORY

Proposition 3.15 If B = H is a Hilbert space, then the operatorCµ:H → H defined as beforeby the identity〈Cµh, k〉 = Cµ(h, k) is trace class and one has the identity

H‖x‖2 µ(dx) = tr Cµ . (3.6)

(Here, we used Riesz’s representation theorem to identifyH with its dual.)Conversely, for every positive trace class symmetric operator K:H → H, there exists a Gaus-

sian measureµ onH such thatCµ = K.

Proof. Fix an arbitrary orthonormal basisen of H. We know from Theorem 3.11 that the secondmoment of the norm is finite:

∫H ‖h‖2 µ(dh) <∞. On the other hand, one has

H‖h‖2 µ(dh) =

∞∑

n=1

H〈h, en〉2 µ(dh) =

∞∑

n=1

〈en, Cµen〉 = tr Cµ ,

which is precisely (3.6). To pull the sum out of the integral in the first equality, we used Lebegue’sdominated convergence theorem.

In order to prove the converse statement, sinceK is compact, we can find an orthonormalbasisen of H such thatKen = λnen andλn ≥ 0,

∑n λn < ∞. Let furthermoreξn be

a collection of i.i.d.N (0, 1) Gaussian random variables (such a family exists by Kolmogorov’sextension theorem). Then, since

∑n λnEξ2n = trK < ∞, the series

∑n

√λnξnen converges in

mean square, so that it has a subsequence converging almost surely in H. One can easily checkthat the law of the limiting random variable is Gaussian and has the requested covariance.

No such precise characterisation of the covariance operators of Gaussian measures exists inBanach spaces. One can however show thatCµ is at least a little bit better than bounded, namelythat it is always a compact operator. We leave this statementas an exercise for the interestedreader, since we will not make any use of it in these notes:

Exercise 3.16Show that in the case of a Gaussian measureµ on a general separable Banach spaceB, the covariance operatorCµ:B∗ → B is compact in the sense that it maps the unit ball onB∗

into a compact subset ofB. Hint: Proceed by contradiction by first showing that ifCµ wasn’tcompact, then it would be possible to find a constantc > 0 and a sequence of elementsℓkk≥0

such that‖ℓk‖ = 1, Cµ(ℓk, ℓj) = 0 for k 6= j, andCµ(ℓk, ℓk) ≥ c for everyk. Conclude that ifthis was the case, then the law of large numbers applied to thesequence of random variablesℓn(x)would imply that supk≥0 ℓk(x) = ∞ µ-almost surely, thus obtaining a contradiction with the factthat supk≥0 ℓk(x) ≤ ‖x‖ <∞ almost surely.

In many situations, it is furthermore helpful to find out whether a given covariance structurecan be realised as a Gaussian measure on some space of Holdercontinuous functions. This can beachieved through the following version of Kolmogorov’s continuity criterion, which can be foundfor example in [RY94, p. 26]:

Theorem 3.17 (Kolmogorov) For d > 0, let C: [0, 1]d × [0, 1]d → R be a symmetric functionsuch that, for every finite collectionxim

i=1 of points in[0, 1]d, the matrixCij = C(xi, xj) ispositive definite. If furthermore there existsα > 0 and a constantK > 0 such thatC(x, x) +C(y, y) − 2C(x, y) ≤ K|x − y|2α for any two pointsx, y ∈ [0, 1]d then there exists a uniquecentred Gaussian measureµ onC([0, 1]d,R) such that

C([0,1]d,R)f (x)f (y)µ(df ) = C(x, y) , (3.7)

Page 13: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

GAUSSIAN MEASURE THEORY 13

for any two pointsx, y ∈ [0, 1]d. Furthermore, for everyβ < α, one hasµ(Cβ([0, 1]d,R)) = 1,whereCβ([0, 1]d,R) is the space ofβ-Holder continuous functions.

Proof. SetB = C([0, 1]d,R) andB∗ its dual, which consists of the set of Borel measures withfinite total variation [Yos95, p. 119]. Since convex combinations of Dirac measures are dense (inthe topology of weak convergence) in the set of probability measures, it follows that the set oflinear combinations of point evaluations is weakly dense inB∗. Therefore, the claim follows ifwe are able to construct a measureµ onB such that (3.7) holds and such that, iff is distributedaccording toµ, then for any finite collection of pointsxi ⊂ [0, 1]d, the joint law of thef (xi) isGaussian.

By Kolmogorov’s extension theorem, we can construct a measureµ0 onX = R[0,1]d endowedwith the productσ-algebra such that the laws of all finite-dimensional marginals are Gaussian andsatisfy (3.7). We denote byX anX -valued random variable with lawµ0. At this stage, one wouldthink that the proof is complete if we can show thatX almost surely has finiteβ-Holder norm.The problem with this statement is that theβ-Holder norm is not a measurable function onX !The reason for this is that it requires point evaluations ofX at uncountably many locations, whilefunctions that are measurable with respect to the productσ-algebra onX are allowed to dependon at most countably many function evaluations.

This problem can be circumvented very elegantly in the following way. Denote byD ⊂ [0, 1]d

the subset of dyadic numbers (actually any countable dense subset would do for now, but thedyadic numbers will be convenient later on) and define the event Ωβ by

Ωβ =X : X(x) def

= limy→xy∈D

X(y) exists for everyx ∈ [0, 1]d andX belongs toCβ([0, 1]d,R).

Since the eventΩβ can be constructed from evaluatingX at only countably many points, it is ameasurable set. For the same reason, the mapι:X → Cβ([0, 1]d,R) given by

ι(X) =

X if X ∈ Ωβ,0 otherwise

is measurable with respect to the productσ-algebra onX (and the Borelσ-algebra onCβ), so thatthe claim follows if we can show thatµ0(Ωβ) = 1 for everyβ < α. (Takeµ = ι∗µ0.) Denotingtheβ-Holder norm ofX restricted to the dyadic numbers byMβ(X) = supx 6=y :x,y∈D|X(x) −X(y)|/|x−y|β, we see thatΩβ can alternatively be characterised asΩβ = X : Mβ(X) <∞,so that the claim follows if we can show for example thatEMβ(X) <∞.

Denote byDm ⊂ D the set of those numbers whose coordinates are integer multiples of2−m

and denote by∆m the set of pairsx, y ∈ Dm such that|x− y| = 2−m. In particular, note that∆m

contains at most2(m+2)d such pairs.We are now going to make use of our simplifying assumption that we are dealing with Gaus-

sian random variables, so thatpth moments can be bounded in terms of second moments. Moreprecisely, for everyp ≥ 1 there exists a constantCp such that ifX is a Gaussian random variable,

then one has the boundE|X|p ≤ Cp(E|X|2)p/2. SettingKm(X) = supx,y∈∆m|X(x)−X(y)| and

fixing some arbitraryβ′ ∈ (β, α), we see that forp ≥ 1 large enough, there exists a constantKp

such that

EKpm(X) ≤

x,y∈∆m

E|X(x) −X(y)|p ≤ Cp

x,y∈∆m

(E|X(x) −X(y)|2)p/2

= Cp

x,y∈∆m

(C(x, x) + C(y, y) − 2C(x, y))p/2 ≤ Cp2(m+2)d−αmp

Page 14: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

14 GAUSSIAN MEASURE THEORY

≤ Cp2−β′mp ,

for some constantsCp. (In order to obtain the last inequality, we had to assume that p ≥ dα−β′

m+2

m

which can always be achieved by some value ofp independent ofm since we assumed thatβ′ <α.) Using Jensen’s inequality, this shows that there exists aconstantK such that the bound

EKm(X) ≤ K2−β′m (3.8)

holds uniformly inm. Fix now any two pointsx, y ∈ D with x 6= y and denote bym0 the largestm such that|x − y| < 2−m. One can then find sequencesxnn≥m0

andynn≥m0with the

following properties:

1. One has limn→∞ xn = x and limn→∞ yn = y.

2. Eitherxm0= ym0

or both points belong to the vertices of the same ‘dyadic hypercube’ inDm0

, so that they can be connected by at mostd ‘bonds’ in∆m0.

3. For everyn ≥ m0, xn andxn+1 belong to the vertices of the same ‘dyadic hypercube’in Dn+1, so that they can be connected by at mostd ‘bonds’ in ∆n+1 and similarly for(yn, yn+1).

x

xn

One way of constructing this sequence is to order elements inDm bylexicographic order and to choosexn = maxx ∈ Dn : xj ≤ xj ∀j, asillustrated in the picture to the right. This shows that one has the bound

|X(x) −X(y)| ≤ |X(xm0) − Y (xm0

)| +∞∑

n=m0

|X(xn+1) −X(xn)|

+∞∑

n=m0

|X(yn+1) −X(yn)|

≤ dKm0(X) + 2d

∞∑

n=m0

Kn+1(X) ≤ 2d∞∑

n=m0

Kn(X) .

Sincem0 was chosen in such a way that|x− y| ≥ 2−m0−1, one has the bound

Mβ(X) ≤ 2d supm≥0

2β(m+1)∞∑

n=m

Kn(X) ≤ 2β+1d∞∑

n=0

2βnKn(X) .

It follows from this and from the bound (3.8) that

E|Mβ(X)| ≤ 2β+1d∞∑

n=0

2βnEKn(X) ≤ 2β+1dK∞∑

n=0

2(β−β′)n <∞ ,

sinceβ′ was chosen strictly larger thanβ.

Combining Kolmogorov’s continuity criterion with Fernique’s theorem, we note that we canapply it not only to real-valued processes, but to any Gaussian Banach-space valued process:

Proposition 3.18 Let B be a separable Banach space and letX(x)x∈[0,1]d be a collection ofB-valued Gaussian random variables such that

E‖X(x) −X(y)‖ ≤ C|x− y|α ,

for someC > 0 and someα ∈ (0, 1]. Then, there exists a unique Gaussian measureµ onC([0, 1]d,B) such that, ifY is a random variable with lawµ, thenY (x) is equal in law toX(x)for everyx. Furthermore,µ(Cβ([0, 1]d,B)) = 1 for everyβ < α.

Page 15: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

GAUSSIAN MEASURE THEORY 15

Proof. The proof is identical to that of Theorem 3.17, noting that the boundE‖X(x) −X(y)‖p ≤Cp|x− y|αp follows from the assumption and Proposition 3.14.

Remark 3.19 The spaceCβ([0, 1]d,R) is not separable. However, the spaceCβ0 ([0, 1]d,R) of

Holder continuous functions that furthermore satisfy limy→x|f (x)−f (y)||x−y|β = 0 uniformly in x is

separable (polynomials with rational coefficients are dense in it). This is in complete analogywith the fact that the space of bounded measurable functionsis not separable, while the space ofcontinuous functions is.

It is furthermore possible to check thatCβ′ ⊂ Cβ0 for everyβ′ > β, so that Exercise 3.39 below

shows thatµ can actually be realised as a Gaussian measure onCβ0 ([0, 1]d,R).

Exercise 3.20Try to find conditions onG ⊂ Rd that are as weak as possible and such thatKolmogorov’s continuity theorem still holds if the cube [0, 1]d is replaced byG. Hint: Onepossible strategy is to embedG into a cube and then to try to extendC(x, y) to that cube.

Exercise 3.21Show that ifG is as in the previous exercise,H is a Hilbert space, andC:G×G→L(H,H) is such thatC(x, y) positive definite, symmetric, and trace class for any twox, y ∈G, then Kolmogorov’s continuity theorem still holds if its condition is replaced by trC(x, x) +trC(y, y)− 2 trC(x, y) ≤ K|x− y|α. More precisely, one can construct a measureµ on the spaceCβ([0, 1]d,H) such that

Cβ ([0,1]d,R)〈h, f (x)〉〈f (y), k〉µ(df ) = 〈h,C(x, y)k〉 ,

for anyx, y ∈ G andh, k ∈ H.

A very useful consequence of Kolmogorov’s continuity criterion is the following result:

Corollary 3.22 Let ηkk≥0 be countably many i.i.d. standard Gaussian random variables (realor complex). Moreover letfkk≥0 ⊂ Lip(G,C) where the domainG ⊂ Rd is sufficiently regularfor Kolomgorov’s continuity theorem to hold. Suppose thereis someδ ∈ (0, 2) such that

S21 =

k∈I

‖fk‖2L∞ <∞ and S2

2 =∑

k∈I

‖fk‖2−δL∞ Lip(fk)δ <∞ , (3.9)

and definef =∑

k∈I ηkfk. Thenf is almost surely bounded and Holder continuous for everyHolder exponent smaller thanδ/2.

Proof. From the assumptions we immediately derive thatf (x) and f (x) − f (y) are a centredGaussian for anyx, y ∈ G. Moreover, the corresponding series converge absolutely.Using thattheηk are i.i.d., we obtain

E|f (x) − f (y)|2 =∑

k∈I

|fk(x) − fk(y)|2 ≤∑

k∈I

min2‖fk‖2L∞ ,Lip(fk)2|x− y|2

≤ 2∑

k∈I

‖fk‖2−δL∞ Lip(fk)δ|x− y|δ = 2S2

2 |x− y|δ ,

where we used that mina, bx2 ≤ a1−δ/2bδ/2|x|δ for anya, b ≥ 0. The claim now follows fromKolmogorov’s continuity theorem.

Remark 3.23 One should really think of thefk’s in Corollary 3.22 as being an orthonormal basisof the Cameron-Martin space of some Gaussian measure. (See Section 3.2 below for the definitionof the Cameron-Martin space associate to a Gaussian measure.) The criterion (3.9) then providesan effective way of deciding whether the measure in questioncan be realised on a space of Holdercontinuous functions.

Page 16: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

16 GAUSSIAN MEASURE THEORY

3.2 The Cameron-Martin space

Given a Gaussian measureµ on a separable Banach spaceB, it is possible to associate to it ina canonical way a Hilbert spaceHµ ⊂ B, called the Cameron-Martin space ofµ. The mainimportance of the Cameron-Martin space is that it characterises precisely those directions inB inwhich translations leave the measureµ ‘quasi-invariant’ in the sense that the translated measurehas the same null sets as the original measure. In general, the spaceHµ will turn out to be strictlysmaller thanB. Actually, this is always the case as soon as dimHµ = ∞ and, even worse, wewill see that in this case one necessarily hasµ(Hµ) = 0! Contrast this to the case of finite-dimensional Lebesgue measure which is invariant under translations in any direction! This is astriking illustration of the fact that measures in infinite-dimensional spaces have a strong tendencyof being mutually singular.

The definition of the Cameron-Martin space is the following,where we postpone to Re-mark 3.26 and Proposition 3.30 the verification that‖h‖µ is well-defined and that‖h‖µ > 0for h 6= 0:

Definition 3.24 The Cameron-Martin spaceHµ of µ is the completion of the linear subspaceHµ ⊂ B defined by

Hµ = h ∈ B : ∃h∗ ∈ B∗ with Cµ(h∗, ℓ) = ℓ(h) ∀ℓ ∈ B∗ ,

under the norm‖h‖2µ = 〈h, h〉µ = Cµ(h∗, h∗). It is a Hilbert space when endowed with the scalar

product〈h, k〉µ = Cµ(h∗, k∗).

Exercise 3.25Convince yourself that the spaceHµ is nothing but the range of the operatorCµ

defined in Remark 3.8.

Remark 3.26 Even though the maph 7→ h∗ may not be one to one, the norm‖h‖µ is well-defined. To see this, assume that for a givenh ∈ Hµ, there are two corresponding elementsh∗1 andh∗2 in B∗. Then, definingk = h∗1 + h∗2, one has

Cµ(h∗1, h∗1) − Cµ(h∗2, h

∗2) = Cµ(h∗1, k) − Cµ(h∗2, k) = k(h) − k(h) = 0 ,

showing that‖h‖µ does indeed not depend on the choice ofh∗.

Exercise 3.27The Wiener measureµ is defined onB = C([0, 1],R) as the centred Gaussian mea-sure with covariance operator given byCµ(δs, δt) = s ∧ t. Show that the Cameron-Martin spacefor the Wiener measure onB = C([0, 1],R) is given by the spaceH1,2

0 ([0, 1]) of all absolutelycontinuous functionsh such thath(0) = 0 and

∫ 1

0h2(t) dt <∞.

Exercise 3.28Show that in the caseB = Rn, the Cameron-Martin space is given by the range ofthe covariance matrix. Write an expression for‖h‖µ in this case.

Exercise 3.29Show that the Cameron-Martin space of a Gaussian measure determines it. Moreprecisely, ifµ andν are two Gaussian measures onB such thatHµ = Hν and such that‖h‖µ =‖h‖ν for everyh ∈ Hµ, then they are identical.

For this reason, a Gaussian measure onB is sometimes given by specifying the Hilbert spacestructure (Hµ, ‖ · ‖µ). Such a specification is then usually called anabstract Wiener space.

Let us discuss a few properties of the Cameron-Martin space.First of all, we show that it is asubspace ofB despite the completion procedure and that all non-zero elements ofHµ have strictlypositive norm:

Page 17: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

GAUSSIAN MEASURE THEORY 17

Proposition 3.30 One hasHµ ⊂ B. Furthermore, one has the bound

〈h, h〉µ ≥ ‖Cµ‖−1‖h‖2 , (3.10)

where the norms on the right hand side are understood to be taken inB.

Proof. One has the chain of inequalities

‖h‖2 = supℓ∈B∗\0

ℓ(h)2

‖ℓ‖2= sup

ℓ∈B∗\0

Cµ(h∗, ℓ)2

‖ℓ‖2≤ sup

ℓ∈B∗\0

Cµ(h∗, h∗)Cµ(ℓ, ℓ)‖ℓ‖2

≤ ‖Cµ‖〈h, h〉µ ,

which yields the bound on the norms. The fact thatHµ is a subset ofB (or rather that it can beinterpreted as such) then follows from the fact thatB is complete and that Cauchy sequences inHµ are also Cauchy sequences inB by (3.10).

A simple example showing that the correspondenceh 7→ h∗ in the definition ofHµ is notnecessarily unique is the caseµ = δ0, so thatCµ = 0. If one choosesh = 0, then anyh∗ ∈ Bhas the required property thatCµ(h∗, ℓ) = ℓ(h), so that this is an extreme case of non-uniqueness.However, if we viewB∗ as a subset ofL2(B, µ) (by identifying linear functionals that agreeµ-almost surely), then the correspondenceh 7→ h∗ is always an isomorphism. One has indeed∫B h

∗(x)2 µ(dx) = Cµ(h∗, h∗) = ‖h‖2µ. In particular, ifh∗1 andh∗2 are two distinct elements ofB∗

associated to the same elementh ∈ B, thenh∗1 − h∗2 is associated to the element0 and therefore∫B(h∗1 − h∗2)

2(x)µ(dx) = 0, showing thath∗1 = h∗2 as elements ofL2(B, µ). We have:

Proposition 3.31 There is a canonical isomorphismι:h 7→ h∗ betweenHµ and the closureRµ

ofB∗ in L2(B, µ). In particular,Hµ is separable.

Proof. We have already shown thatι:Hµ → L2(B, µ) is an isomorphism onto its image, so itremains to show that all ofB∗ belongs to the image ofι. Forh ∈ B∗, defineh∗ ∈ B as in (3.5) by

h∗ =

Bxh(x)µ(dx) .

This integral converges since‖x‖2 is integrable by Fernique’s theorem. Since one has the identityℓ(h∗) = Cµ(ℓ, h), it follows thath∗ ∈ Hµ andh = ι(h∗), as required to conclude the proof.

The separability ofHµ then follows immediately from the fact thatL2(B, µ) is separable when-everB is separable, since its Borelσ-algebra is countably generated.

Remark 3.32 The spaceRµ is called thereproducing kernel Hilbert spacefor µ (or just repro-ducing kernelfor short). However, since it is isomorphic to the Cameron-Martin space in a naturalway, there is considerable confusion between the two in the literature. We retain in these notes theterminology from [Bog98], but we urge the reader to keep in mind that there are authors who usea slightly different terminology.

Remark 3.33 In general, there do exist Gaussian measures with non-separable Cameron-Martinspace, but they are measures on more general vector spaces. One example would be the measureon RR (yes, the space ofall functions fromR to R endowed with the productσ-algebra) givenby the uncountable product of one-dimensional Gaussian measures. The Cameron-Martin spacefor this somewhat pathological measure is given by those functionsf that are non-zero on at mostcountably points and such that

∑t∈R |f (t)|2 < ∞. This is a prime example of a non-separable

Hilbert space.

Page 18: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

18 GAUSSIAN MEASURE THEORY

Exercise 3.34Letµ be a Gaussian measure on a Hilbert spaceH with covarianceK and considerthe spectral decomposition ofK: Ken = λnen with

∑n≥1 λn < ∞ anden an orthonormal

basis of eigenvectors. Such a decomposition exists since wealready know thatK must be traceclass from Proposition 3.15.

Assume now thatλn > 0 for everyn. Show thatHµ is given by the range ofK and thatthe correspondenceh 7→ h∗ is given byh∗ = K−1h. Show furthermore that the Cameron-Martin spaceHµ consists of those elementsh of H such that

∑n≥1 λ

−1n 〈h, en〉2 < ∞ and that

〈h, k〉µ = 〈K−1/2h,K−1/2k〉.

Exercise 3.35Show that one has the alternative characterisation

‖h‖µ = supℓ(h) : Cµ(ℓ, ℓ) ≤ 1 , (3.11)

andHµ = h ∈ B : ‖h‖µ < ∞. Hint: Use the fact that in any Hilbert spaceH, one has‖h‖ = sup〈k, h〉 : ‖k‖ ≤ 1.

Since elements inRµ are built from the space of all bounded linear functionals onB, it shouldcome as little surprise that its elements are ‘almost’ linear functionals onB in the following sense:

Proposition 3.36 For everyℓ ∈ Rµ there exists a measurable linear subspaceVℓ of B such thatµ(Vℓ) = 1 and a linear mapℓ:Vℓ → R such thatℓ = ℓ µ-almost surely.

Proof. Fix ℓ ∈ Rµ. By the definition ofRµ and Borel-Cantelli, we can find a sequenceℓn ∈ B∗

such that limn→∞ ℓn(x) = ℓ(x) for µ-almost everyx ∈ B. (Take for exampleℓn such that‖ℓn − ℓ‖2

µ ≤ n−4.) It then suffices to define

Vℓ =x : lim

n→∞ℓn(x) exists

,

and to setℓ(x) = limn→∞ ℓn(x) onVℓ.

Another very useful fact about the reproducing kernel spaceis given by:

Proposition 3.37 The law of any elementh∗ = ι(h) ∈ Rµ is a centred Gaussian with variance‖h‖2

µ. Furthermore, any two elementsh∗, k∗ have covariance〈h, k〉µ.

Proof. We already know from the definition of a Gaussian measure thatthe law of any element ofB∗ is a centred Gaussian. Let nowh∗ be any element ofRµ and lethn be a sequence inRµ ∩ B∗

such thathn → h∗ in Rµ. We can furthermore choose this approximating sequence such that‖hn‖Rµ = ‖h∗‖Rµ = ‖h‖µ, so that the law of each of thehn is equal toN (0, ‖h‖2

µ).SinceL2-convergence implies convergence in law, we conclude that the law ofh∗ is also given

by N (0, ‖h‖2µ). The statement about the covariance then follows by polarisation, since

Eh∗k∗ =1

2(E(h∗ + k∗)2 − E(h∗)2 − E(k∗)2) =

1

2(‖h+ k‖2

µ − ‖h‖2µ − ‖k‖2

µ) = 〈h, k〉µ ,

by the previous statement.

Remark 3.38 Actually, the converse of Proposition 3.36 is also true: ifℓ:B → R is measurableand linear on a measurable linear subspaceV of full measure, thenℓ belongs toRµ. This isnot an obvious statement. It can be viewed for example as a consequence of the highly non-trivial fact that every Borel measurable linear map betweentwo sufficiently ‘nice’ topological

Page 19: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

GAUSSIAN MEASURE THEORY 19

vector spaces is bounded, see for example [Sch66, Kat82]. (The point here is that the map mustbe linear on the whole space and not just on some “large” subspace as is usually the case withunbounded operators.) This implies by Proposition 3.42 that ℓ is a measurable linear extensionof some bounded linear functional onHµ. Since such extensions are unique (up to null sets) byTheorem 3.47 below, the claim follows from Proposition 3.31.

Exercise 3.39Show that ifB ⊂ B is a continuously embedded Banach space withµ(B) = 1,then the embeddingB∗ → Rµ extends to an embeddingB∗ → Rµ. Deduce from this that therestriction ofµ to B is again a Gaussian measure. In particular, Kolmogorov’s continuity criterionyields a Gaussian measure onCβ

0 ([0, 1]d,R).

The properties of the reproducing kernel space of a Gaussianmeasure allow us to give anotherillustration of the fact that measures on infinite-dimensional spaces behave in a rather differentway from measures onRn:

Proposition 3.40 Letµ be a centred Gaussian measure on a separable Banach spaceB such thatdimHµ = ∞. Denote byDc the dilatation by a real numberc onB, that isDc(x) = cx. Then,µandD∗

cµ are mutually singular for everyc 6= ±1.

Proof. Since the reproducing Kernel spaceRµ is a separable Hilbert space, we can find an or-thonormal basisenn≥0. Consider the sequence of random variablesXN (x) = 1

N

∑Nn=1 |en(x)|2

overB. If B is equipped with the measureµ then, since theen are independent underµ, we canapply the law of large numbers and deduce that

limN→∞

XN (x) = 1 , (3.12)

for µ-almost everyx. On the other hand, it follows from the linearity of theen that when we equipB with the measureD∗

cµ, theen are still independent, but have variancec2, so that

limN→∞

XN (x) = c2 ,

for D∗cµ-almost everyx. This shows that ifc 6= ±1, the set on which the convergence (3.12) takes

place must be ofD∗cµ-measure0, which implies thatµ andD∗

cµ are mutually singular.

As already mentioned earlier, the importance of the Cameron-Martin space is that it representsprecisely those directions in which one can translate the measureµ without changing its null sets:

Theorem 3.41 (Cameron-Martin) For h ∈ B, define the mapTh:B → B by Th(x) = x + h.Then, the measureT ∗

hµ is absolutely continuous with respect toµ if and only ifh ∈ Hµ.

Proof. Fix h ∈ Hµ and leth∗ ∈ L2(B, µ) be the corresponding element of the reproducing kernel.Since the law ofh∗ is Gaussian by Proposition 3.37, the mapx 7→ exp(h∗(x)) is integrable. Sincefurthermore the variance ofh∗ is given by‖h‖2

µ, the function

Dh(x) = exp(h∗(x) − 12‖h‖2

µ) (3.13)

is strictly positive, belongs toL1(B, µ), and integrates to1. It is therefore the Radon-Nikodymderivative of a measureµh that is absolutely continuous with respect toµ. To check that onehas indeedµh = T ∗

hµ, it suffices to show that their Fourier transforms coincide.Assuming thath∗ ∈ B∗, one has

µh(ℓ) =

Bexp(iℓ(x) + h∗(x) − 1

2‖h‖2

µ)µ(dx) = exp(1

2Cµ(iℓ+ h∗, iℓ+ h∗) − 1

2‖h‖2

µ)

Page 20: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

20 GAUSSIAN MEASURE THEORY

= exp(−12Cµ(ℓ, ℓ) − iCµ(ℓ, h∗)) = exp(−1

2Cµ(ℓ, ℓ) + iℓ(h)) .

Using Proposition 3.37 for the joint law ofℓ andh∗, it is an easy exercise to check that this equalitystill holds for arbitraryh ∈ Hµ.

On the other hand, we have

T ∗hµ(ℓ) =

Bexp(iℓ(x))T ∗

hµ(dx) =

Bexp(iℓ(x+ h))µ(dx) = eiℓ(h)

Bexp(iℓ(x))µ(dx)

= exp(−12Cµ(ℓ, ℓ) + iℓ(h)) ,

showing thatµh = T ∗hµ.

To show the converse, note first that one can check by an explicit calculation that‖N (0, 1) −N (h, 1)‖TV ≥ 2 − 2 exp(−h2

8). Fix now some arbitraryn > 0. If h 6∈ Hµ then, by Exercise 3.35,

there existsℓ ∈ B∗ with Cµ(ℓ, ℓ) = 1 such thatℓ(h) ≥ n. Since the imageℓ∗µ of µ underℓ isN (0, 1) and the image ofT ∗

hµ underℓ isN (−ℓ(h), 1), this shows that

‖µ− T ∗hµ‖TV ≥ ‖ℓ∗µ− ℓ∗T ∗

hµ‖TV = ‖N (0, 1) −N (−ℓ(h), 1)‖TV ≥ 2 − 2 exp(−n2

8) .

Since this is true for everyn, we conclude that‖µ − T ∗hµ‖TV = 2, thus showing that they are

mutually singular.

As a consequence, we have the following characterisation ofthe Cameron-Martin space

Proposition 3.42 The spaceHµ ⊂ B is the intersection of all (measurable) linear subspaces offull measure. However, ifHµ is infinite-dimensional, then one hasµ(Hµ) = 0.

Proof. Take an arbitrary linear subspaceV ⊂ B of full measure and take an arbitraryh ∈ Hµ. Itfollows from Theorem 3.41 that the affine spaceV −h also has full measure. Since (V −h)∩V = φunlessh ∈ V , one must haveh ∈ V , so thatHµ ⊂ ⋂V ⊂ B : µ(V ) = 1.

Conversely, take an arbitraryx 6∈ Hµ and let us construct a linear spaceV ⊂ B of full measure,but not containingx. Sincex 6∈ Hµ, one has‖x‖µ = ∞ with ‖ · ‖µ extended toB as in (3.11).Therefore, we can find a sequenceℓn ∈ B∗ such thatCµ(ℓn, ℓn) ≤ 1 andℓn(x) ≥ n. Defining thenorm |y|2 =

∑n n

−2(ℓn(y))2, we see that

B|y|2 µ(dy) =

∞∑

n=1

1

n2

B(ℓn(y))2 µ(dy) ≤ π2

6,

so that the linear spaceV = y : |y| <∞ has full measure. However,|x| = ∞ by construction,so thatx 6∈ V .

To show thatµ(Hµ) = 0 if dim Hµ = ∞, consider an orthonormal sequenceen ∈ Rµ so thatthe random variablesen(x) are i.i.d.N (0, 1) distributed. By the second Borel-Cantelli lemma, itfollows that supn |en(x)| = ∞ for µ-almost everyx, so that in particular‖x‖2

µ ≥ ∑n e

2n(x) = ∞

almost surely.

Exercise 3.43Recall that the (topological) support suppµ of a Borel measure on a complete sep-arable metric space consists of those pointsx such thatµ(U ) > 0 for every neighbourhoodU ofx. Show that, ifµ is a Gaussian measure, then its support is the closureHµ of Hµ in B.

Page 21: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

GAUSSIAN MEASURE THEORY 21

3.3 Images of Gaussian measures

It follows immediately from the definition of a Gaussian measure and the expression for its Fouriertransform that ifµ is a Gaussian measure on some Banach spaceB andA:B → B2 is a boundedlinear map forB2 some other Banach space, thenν = A∗µ is a Gaussian measure onB2 withcovariance

Cν(ℓ, ℓ′) = Cµ(A∗ℓ,A∗ℓ′) ,

whereA∗:B∗2 → B∗ is the adjoint toA, that is the operator such that(A∗ℓ)(x) = ℓ(Ax) for every

x ∈ B and everyℓ ∈ B∗2.

Recall now thatHµ is the intersection over all linear subspaces ofB that have full measureunderµ. This suggests that in order to determine the image ofµ under a linear map, it is sufficientto know how that map acts on elements ofHµ. This intuition is made precise by the followingtheorem:

Theorem 3.44 Let µ be a centred Gaussian probability measure on a separable Banach spaceB. Let furthermoreH be a separable Hilbert space and letA:Hµ → H be a Hilbert-Schmidtoperator. (That isAA∗:H → H is trace class.) Then, there exists a measurable mapA:B → Hsuch thatν = A∗µ is Gaussian with covarianceCν(h, k) = 〈A∗h,A∗k〉µ. Furthermore, thereexists a measurable linear subspaceV ⊂ B of full µ-measure such thatA restricted toV is linearandA restricted toHµ ⊂ V agrees withA.

Proof. Let enn≥1 be an orthonormal basis forHµ and denote bye∗n the corresponding elementsin Rµ ⊂ L2(B, µ) and defineSN (x) =

∑Nn=0 e

∗n(x)Aen. Recall from Proposition 3.36 that we

can find subspacesVen of full measure such thate∗n is linear onVen . Define now a linear subspaceV ⊂ B by

V =x ∈

n≥0

Ven : the sequenceSN (x) converges inH

,

(the fact thatV is linear follows from the linearity of each of thee∗n) and set

A(x) =

limN→∞ SN (x) for x ∈ V ,

0 otherwise.

Since the random variablese∗n are i.i.d.N (0, 1)-distributed underµ, the sequenceSN formsanH-valued martingale and one has

supN

Eµ‖SN (x)‖2 =∞∑

n=0

‖Aen‖2 ≤ trA∗A <∞ ,

where the last inequality is a consequence ofA being Hilbert-Schmidt. It follows thatµ(V ) = 1by Doob’s martingale convergence theorem.

To see thatν = A∗µ has the stated property, fix an arbitraryh ∈ H and note that the series∑n≥1 e

∗n〈Aen, h〉 converges inRµ to an element with covariance‖A∗h‖2. The statement then

follows from Proposition 3.37 and the fact thatCν(h, h) determinesCν by polarisation. To checkthatν is Gaussian, we can compute its Fourier transform in a similar way.

Remark 3.45 Similarly to Proposition 3.36, the converse is again true: if A:B → H is a measur-able map which is linear on a measurable subspace of full measure and agrees withA onHµ, thenit agreesµ-almost surely with the extension constructed in Theorem 3.44.

Page 22: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

22 GAUSSIAN MEASURE THEORY

The proof of Theorem 3.44 can easily be extended to the case where the image space is aBanach space rather than a Hilbert space. However, in this case we cannot give a straightforwardcharacterisation of those mapsA that are ‘admissible’, since we have no good complete character-isation of covariance operators for Gaussian measures on Banach spaces. However, we can takethe pragmatic approach and simply assume that the new covariance determines a Gaussian mea-sure on the target Banach space. With this approach, we can formulate the following version forBanach spaces:

Proposition 3.46 LetB1 andB2 be two separable Banach space and letµ be a centred Gaussianprobability measure onB1. LetA:Hµ → B2 be a bounded linear operator such that there existsa centred Gaussian measureν onB2 with covarianceCν(h, k) = 〈A∗h,A∗k〉µ. Then, there existsa measurable mapA:B1 → B2 such thatν = A∗µ and such that there exists a measurablelinear subspaceV ⊂ B of full µ-measure such thatA restricted toV is linear andA restricted toHµ ⊂ V agrees withA.

Proof. As a first step, we construct a Hilbert spaceH2 such thatB2 ⊂ H2 as a Borel subset.Denote byHν ⊂ B2 the Cameron-Martin space ofν and leten ⊂ Hν be an orthonormal basisof elements such thate∗n ∈ B∗

2 for everyn. (Such an orthonormal basis can always be found byusing the Grahm-Schmidt procedure.) We then define a norm onB2 by

‖x‖22 =

n≥1

e∗n(x)2

n2‖e∗n‖2,

where‖e∗n‖ is the norm ofe∗n in B∗2. It is immediate that‖x‖2 <∞ for everyx ∈ B2, so that this

turnsB2 into a pre-Hilbert space. We finally defineH2 as the completion ofB2 under‖ · ‖2.Denote byν ′ the image of the measureν under the inclusion mapι:B2 → H2. It follows

that the mapA′ = ι A satisfies the assumptions of Theorem 3.44, so that there exists a mapA:B1 → H2 which is linear on a subset of fullµ-measure and such thatA∗µ = ν ′. On the otherhand, we know by construction thatν ′(B2) = 1, so that the setx : Ax ∈ B2 is of full measure.Modifying A outside of this set by for example setting it to0 and using Exercise 3.39 then yieldsthe required statement.

3.3.1 Uniqueness of measurable extensions and the isoperimetric inequality

This section is devoted to a proof of the converse of Theorem 3.44 and Proposition 3.46, namely

Theorem 3.47 Let µ be a Gaussian measure on a separable Banach spaceB1 with Cameron-Martin spaceHµ and letA:Hµ → B2 be a linear map satisfying the assumptions of Proposi-tion 3.46. Then the linear measurable extensionA ofA is unique, up to sets of measure0.

Remark 3.48 As a consequence of this result, the precise Banach spacesB1 andB2 are com-pletely irrelevant when one considers the image of a Gaussian measure under a linear transforma-tion. The only thing that matters is the Cameron-Martin space for the starting measure and theway in which the linear transformation acts on this space. This fact will be used repeatedly in thesequel.

This is probably one of the most remarkable results in Gaussian measure theory. At first sight,it appears completely counterintuitive: the Cameron-Martin spaceHµ has measure0, so how canthe specification of a measurable map on a set of measure0 be sufficient to determine it on a setof measure1? Part of the answer lies of course in the requirement that theextensionA should be

Page 23: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

GAUSSIAN MEASURE THEORY 23

linear on a set of full measure. However, even this requirement would not be sufficient by itselfto determineA since the Hahn-Banach theorem provides a huge number of different extension ofA that do not coincide anywhere except onHµ. The missing ingredient that solves this mysteryis the requirement thatA is not just any linear map, but ameasurablelinear map. This additionalconstraint rules out all of the non-constructive extensions of A provided by the Hahn-Banachtheorem and leaves only one (constructive) extension ofA.

The main ingredient in the proof of Theorem 3.47 is the Borell-Sudakov-Cirel’son inequality[SC74, Bor75], a general form of isoperimetric inequality for Gaussian measures which is veryinteresting and useful in its own right. In order to state this result, we first introduce the notationBε for theHµ-ball of radiusε centred at the origin. We also denote byA+B the sum of two setsdefined by

A+B = x+ y : x ∈ A , y ∈ B ,

and we denote byΦ the distribution function of the normal Gaussian:Φ(t) = 1√2π

∫ t−∞ e−s2/2 ds.

With these notations at hand, we have the following:

Theorem 3.49 (Borell-Sudakov-Cirel’son) Letµ be a Gaussian measure on a separable BanachspaceB with Cameron-Martin spaceHµ and letA ⊂ B be a measurable subset with measureµ(A) = Φ(α) for someα ∈ R. Then, for everyε > 0, one has the boundµ(A+Bε) ≥ Φ(α+ ε).

Remark 3.50 Theorem 3.49 is remarkable since it implies that even thoughHµ itself has measure0, wheneverA is a set of positive measure, no matter how small, the setA+Hµ has full measure!

Remark 3.51 The bound given in Theorem 3.49 is sharp wheneverA is a half space, in the sensethatA = x ∈ B : ℓ(x) ≥ c for someℓ ∈ Rµ and c ∈ R. In the case whereε is small,(A+Bε) \A is a fattened boundary for the setA, so thatµ(A+Bε) − µ(A) can be interpreted asa kind of ‘perimeter’ forA. The statement can then be interpreted as stating that in thecontext ofGaussian measures, half-spaces are the sets of given perimeter that have the largest measure. Thisjustifies the statement that Theorem 3.49 is an isoperimetric inequality.

We are not going to give a proof of Theorem 3.49 in these notes because this would lead ustoo far astray from our main object of study. The interested reader may want to look into themonograph [LT91] for a more exhaustive treatment of probability theory in Banach spaces ingeneral and isoperimetric inequalities in particular. Letus nevertheless remark shortly on how theargument of the proof goes, as it can be found in the original papers [SC74, Bor75]. In a nutshell,it is a consequence of the two following remarks:

• Let νM be the uniform measure on a sphere of radius√M in RM and letΠM,n be the

orthogonal projection fromRM toRn. Then, the sequence of measuresΠM,NνM convergesasM → ∞ to the standard Gaussian measure onRn. This remark is originally due toPoincare.

• A claim similar similar to that of Theorem 3.49 holds for the uniform measure on thesphere, in the sense that the volume of a fattened setA+Bε on the sphere is bounded frombelow by the volume of a fattened ‘cap’ of volume identical tothat ofA. Originally, thisfact was discovered by Levy, and it was then later generalised by Schmidt, see [Sch48] orthe review article [Gar02].

These two facts can then be combined in order to show that half-spaces are optimal for finite-dimensional Gaussian measures. Finally, a clever approximation argument is used in order togeneralise this statement to infinite-dimensional measures as well.

An immediate corollary is given by the following type of zero-one law for Gaussian measures:

Page 24: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

24 GAUSSIAN MEASURE THEORY

Corollary 3.52 LetV ⊂ B be a measurable linear subspace. Then, one has eitherµ(V ) = 0 orµ(V ) = 1.

Proof. Let us first consider the case whereHµ 6⊂ V . In this case, just as in the proof of Proposi-tion 3.42, we conclude thatµ(V ) = 0, for otherwise we could construct an uncountable collectionof disjoint sets with positive measure.

If Hµ ⊂ V , then we haveV + Bε = V for everyε > 0, so that ifµ(V ) > 0, one must haveµ(V ) = 1 by Theorem 3.49.

We have now all the necessary ingredients in place to be able to give a proof of Theorem 3.47:

Proof of Theorem 3.47.Assume by contradiction that there exist two measurable extensionsA1

and A2 of A. In other words, we haveAix = Ax for x ∈ Hµ and there exist measurablesubspacesVi with µ(Vi) = 1 such that the restriction ofAi to Vi is linear. DenoteV = V1 ∩ V2

and∆ = A2 − A1, so that∆ is linear onV and∆|Hµ = 0.Let ℓ ∈ B∗

2 be arbitrary and consider the eventsV cℓ = x : ℓ(∆x) ≤ c. By the linearity

of ∆, each of these events is invariant under translations inHµ, so that by Theorem 3.49 wehaveµ(V c

ℓ ) ∈ 0, 1 for every choice ofℓ andc. Furthermore, for fixedℓ, the mapc 7→ µ(V cℓ )

is increasing and it follows from theσ-additivity of µ that we have limc→−∞ µ(V cℓ ) = 0 and

limc→∞ µ(V cℓ ) = 1. Therefore, there exists a uniquecℓ ∈ R such thatµ(V c

ℓ ) jumps from0 to 1at c = cℓ. In particular, this implies thatℓ(∆x) = cℓ µ-almost surely. However, the measureµ isinvariant under the mapx 7→ −x, so that we must havecℓ = −cℓ, implying thatcℓ = 0. Sincethis is true for everyℓ ∈ B∗

2, we conclude from Proposition 3.6 that the law of∆x is given by theDirac measure at0, so that∆x = 0 µ-almost surely, which is precisely what we wanted.

In the next section, we will see how we can take advantage of this fact to construct a the-ory of stochastic integration with respect to a “cylindrical Wiener process”, which is the infinite-dimensional analogue of a standardn-dimensional Wiener process.

3.4 Cylindrical Wiener processes and stochastic integration

Central to the theory of stochastic PDEs is the notion of acylindrical Wiener process. Recall thatin general a stochastic processX taking values in a separable Banach spaceB is nothing but acollectionX(t) of B-valued random variables indexed by timet ∈ R (or taking values in somesubset ofR). A notable special case which will be of interest here is thecase where the probabilityspace is taken to be for exampleΩ = C([0, T ],B) (or some other space ofB-valued continuousfunctions) endowed with some Gaussian measureP and where the processX is given by

X(t)(ω) = ω(t) , ω ∈ Ω .

In this case,X is called thecanonical processonΩ.Recall that the usual (one-dimensional) Wiener process is areal-valued centred Gaussian

processB(t) such thatB(0) = 0 and E|B(t) − B(s)|2 = |t − s| for any pair of timess, t.From our point of view, the Wiener process on any finite time interval I can always be re-alised as the canonical process for the Gaussian measure onC(I,R) with covariance functionC(s, t) = s ∧ t = mins, t. (Note that such a measure exists by Kolmogorov’s continuity crite-rion.)

Since the spaceC(R,R) is not a Banach space and we have not extended our study of Gaussianmeasures to Frechet spaces, we refrain from defining a measure on it. However, one can defineWiener measure on a separable Banach space of the type

C(R+,R) =f ∈ C(R+,R) : lim

t→∞f (t)/(t) exists

, ‖f‖ = sup

t∈R

|f (t)|(t)

,

Page 25: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

GAUSSIAN MEASURE THEORY 25

for a suitable weight function : R → [1,∞). For example, we will see that(t) = 1 + t2 issuitable, and we will therefore defineCW = C for this particular choice.

Proposition 3.53 There exists a Gaussian measureµ on CW with covariance functionC(s, t) =s ∧ t.

Proof. We use the fact thatf ∈ C([0, π],R) if and only if the functionT (f ) given byT (f )(t) =(1 + t2)f (arctant) belongs toCW . Our aim is then to construct a Gaussian measureµ0 onC([0, π],R) which is such thatT ∗µ0 has the required covariance structure.

The covarianceC0 for µ0 is then given by

C0(x, y) =tanx ∧ tany

(1 + tan2 x)(1 + tan2 y).

It is now a straightforward exercise to check that this covariance function does indeed satisfy theassumption of Kolmogorov’s continuity theorem.

Let us now fix a (separable) Hilbert spaceH, as well as a larger Hilbert spaceH′ containingH as a dense subset and such that the inclusion mapι:H → H′ is Hilbert-Schmidt. GivenH, it isalways possible to construct a spaceH′ with this property: choose an orthonormal basisen ofH and takeH′ to be the closure ofH under the norm

‖x‖2H′ =

∞∑

n=1

1

n2〈x, en〉2 .

One can check that the mapιι∗ is then given byιι∗en = 1

n2 en, so that it is indeed trace class.

Definition 3.54 Let H andH′ be as above. We then call acylindrical Wiener process onH anyH′-valued Gaussian processW such that

E〈h,W (s)〉H′〈W (t), k〉H′ = (s ∧ t)〈ι∗h, ι∗k〉 = (s ∧ t)〈ιι∗h, k〉H′ , (3.14)

for any two timess andt and any two elementsh, k ∈ H′. By Kolmogorov’s continuity theorem,this can be realised as the canonical process for some Gaussian measure onCW (R,H′).

Alternatively, we could have defined the cylindrical Wienerprocess onH as the canonicalprocess for any Gaussian measure with Cameron-Martin spaceH1,2

0 ([0, T ],H), see Exercise 3.27.

Proposition 3.55 In the same setting as above, the Gaussian measureµ on H′ with covarianceιι∗ hasH as its Cameron-Martin space. Furthermore,‖h‖2

µ = ‖h‖2 for everyh ∈ H.

Proof. It follows from the definition ofHµ that this is precisely the range ofιι∗ and that the maph 7→ h∗ is given byh∗ = (ιι∗)−1h. In particular,Hµ is contained in the range ofι. Therefore, foranyh, k ∈ Hµ, there existh, g ∈ H such thath = ιh andk = ιk. Using this, we have

〈h, k〉µ = 〈(ιι∗)h∗, k∗〉H′ = 〈h, (ιι∗)−1k〉H′ = 〈ιh, (ιι∗)−1ιk〉H′ = 〈h, ι∗(ιι∗)−1ιk〉 = 〈h, k〉 ,

from which the claim follows.

Page 26: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

26 GAUSSIAN MEASURE THEORY

The name ‘cylindrical Wiener process onH’ may sound confusing at first, since it is actuallynot anH-valued process. (A better terminology may have been ‘cylindrical Wiener process overH’, but we choose to follow the convention that is found in the literature.) Note however thatif h is an element inH that is in the range ofι∗ (so thatιh belongs to the range ofιι∗ andι∗(ιι∗)−1ιh = h), then

〈h, k〉 = 〈ι∗(ιι∗)−1ιh, k〉 = 〈(ιι∗)−1ιh, ιk〉H′ .

In particular, if we justpretendfor a moment thatW (t) belongs toH for every t (which is ofcourse not true!), then we get

E〈h,W (s)〉〈W (t), k〉 = E〈(ιι∗)−1ιh, ιW (s)〉H′〈(ιι∗)−1ιk, ιW (t)〉H′

= (s ∧ t)〈ιι∗(ιι∗)−1ιh, (ιι∗)−1ιk〉H′

= (s ∧ t)〈ιh, (ιι∗)−1ιk〉H′ = (s ∧ t)〈h, ι∗(ιι∗)−1ιk〉H′

= (s ∧ t)〈h, k〉 .Here we used (3.14) to go from the first to the second line. Thisshows thatW (t) should be thoughtof as anH-valued random variable with covariancet times the identity operator (which is of coursenot trace class ifH is infinite-dimensional, so that such an object cannot existif dim H = ∞).Combining Proposition 3.55 with Theorem 3.44, we see however that if K is some Hilbert spaceandA:H → K is a Hilbert-Schmidt operator, then theK-valued random variableAW (t) is well-defined. (Here we made an abuse of notation and also used the symbol A for the measurableextension ofA to H′.) Furthermore, its law does not depend on the choice of the larger spaceH′.

Example 3.56 (White noise)Recall that we informally defined ‘white noise’ as a Gaussianpro-cessξ with covarianceEξ(s)ξ(t) = δ(t− s). In particular, if we denote by〈·, ·〉 the scalar productin L2(R), this suggests that

E〈g, ξ〉〈h, ξ〉 = E∫∫

g(s)h(t)ξ(s)ξ(t) ds dt =

∫∫g(s)h(t)δ(t − s) ds dt = 〈g, h〉 . (3.15)

This calculation shows that white noise can be constructed as a Gaussian random variable on anyspace of distributions containingL2(R) and such the embedding is Hilbert-Schmidt. Furthermore,by Theorem 3.44, integrals of the form

∫g(s)ξ(s) ds are well-defined random variables, provided

that g ∈ L2(R). Taking forg the indicator function of the interval [0, t], we can check that theprocessB(t) =

∫ t0 ξ(s) ds is a Brownian motion, thus justifying the statement that ‘white noise is

the derivative of Brownian motion’.The interesting fact about this construction is that we can use it to define space-time white

noise in exactly the same way, simply replacingL2(R) byL2(R2).

This will allow us to define a Hilbert space-valued stochastic integral against a cylindricalWiener process in pretty much the same way as what is usually done in finite dimensions. In thesequel, we fix a cylindrical Wiener processW on some Hilbert spaceH ⊂ H′, which we realiseas the canonical coordinate process onΩ = CW (R+,H′) equipped with the measure constructedabove. We also denote byFs theσ-field onΩ generated byWr : r ≤ s.

Consider now a finite collection ofdisjoint intervals (sn, tn] ⊂ R+ with n = 1, . . . , N anda corresponding finite collection ofFsn-measurable random variablesΦn taking values in thespaceL2(H,K) of Hilbert-Schmidt operators fromH into some other fixed Hilbert spaceK. LetfurthermoreΦ be theL2(R+ × Ω,L2(H,K))-valued function defined by

Φ(t, ω) =N∑

n=1

Φn(ω) 1(sn,tn](t) ,

Page 27: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

GAUSSIAN MEASURE THEORY 27

where we denoted by1A the indicator function of a setA. We call such aΦ anelementary processonH.

Definition 3.57 Given an elementary processΦ and a cylindrical Wiener processW on H, wedefine theK-valued stochastic integral

∫ ∞

0

Φ(t) dW (t) def=

N∑

n=1

Φn(W ) (W (tn) −W (sn)) .

Note that sinceΦn is Fsn-measurable,Φn(W ) is independent ofW (tn) −W (sn), therefore eachterm on the right hand side can be interpreted in the sense of the construction of Theorems 3.44and 3.47.

Remark 3.58 Thanks to Theorem 3.47, this construction is well-posed without requiring to spec-ify the larger Hilbert spaceH′ on whichW can be realised as anH′-valued process. This justifiesthe terminology ofW being “the cylindrical Wiener process onH” without any mentioning ofH′,since the value of stochastic integrals againstW is independent of the choice ofH′.

It follows from Theorem 3.44 and (3.6) that one has the identity

E∥∥∥∫ ∞

0

Φ(t) dW (t)∥∥∥2

K=

N∑

n=1

E tr(Φn(W )Φ∗n(W ))(tn − sn) = E

∫ ∞

0

tr Φ(t)Φ∗(t) dt , (3.16)

which is an extension of the usual Ito isometry to the Hilbert space setting. It follows that thestochastic integral is an isometry from the subset of elementary processes inL2(R+×Ω,L2(H,K))toL2(Ω,K).

Let nowFpr be the ‘predictable’σ-field, that is theσ-field overR+×Ω generated by all subsetsof the form (s, t] ×A with t > s andA ∈ Fs. This is the smallestσ-algebra with respect to whichall elementary processes areFpr-measurable. One furthermore has:

Proposition 3.59 The set of elementary processes is dense in the spaceL2pr(R+ × Ω,L2(H,K))

of all predictableL2(H,K)-valued processes.

Proof. Denote byFpr the set of all sets of the form (s, t] × A with A ∈ Fs. Denote furthermoreby L2

pr the closure of the set of elementary processes inL2. One can check thatFpr is closed under

intersections, so that1G ∈ L2pr for every setG in the algebra generated byFpr. It follows from the

monotone class theorem that1G ∈ L2pr for every setG ∈ Fpr. The claim then follows from the

definition of the Lebesgue integral, just as for the corresponding statement inR.

By using the Ito isometry (3.16) and the completeness ofL2(Ω,K), it follows that:

Corollary 3.60 The stochastic integral∫ ∞0

Φ(t) dW (t) can be uniquely defined for every processΦ ∈ L2

pr(R+ × Ω,L2(H,K)).

This concludes our presentation of the basic properties of Gaussian measures on infinite-dimensional spaces. The next section deals with the other main ingredient to solving stochasticPDEs, which is the behaviour of deterministic linear PDEs.

Page 28: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

28 A PRIMER ON SEMIGROUP THEORY

4 A Primer on Semigroup Theory

This section is strongly based on Davies’s excellent monograph [Dav80] for the first part onstrongly continuous semigroups and very loosely follows [Yos95] and [Lun95] for the secondpart on analytic semigroups. Another good reference on someof the material covered here is themonograph [Paz83]. Its aim is to give a rigorous meaning to solutions to linear equations of thetype

∂tx = Lx , x(0) = x0 ∈ B , (4.1)

wherex takes values in some Banach spaceB andL is a possibly unbounded operator onB. Froma formal point of view, if such a solution exists, one expectsthe existence of a linear operatorS(t)that maps the initial conditionx0 onto the solutionx(t) of (4.1) at timet. If such a solution isunique, then the family of operatorsS(t) should satisfyS(0) = 1 andS(t) S(s) = S(t+ s). Thisis called thesemigroupproperty.

Furthermore, such a family of solution operatorsS(t) should have some regularity ast → 0in order to give a meaning to the notion of an initial condition. (The family given byS(t) = 0for t > 0 andS(0) = 1 does satisfy the semigroup property but clearly doesn’t define a family ofsolution operators to an equation of the type (4.1).)

This motivates the following definition:

Definition 4.1 A semigroupS(t) on a Banach spaceB is a family of bounded linear operatorsS(t)t≥0 with the properties thatS(t) S(s) = S(t + s) for anys, t ≥ 0 and thatS(0) = Id. Asemigroup is furthermore called

• strongly continuousif the map (x, t) 7→ S(t)x is strongly continuous.

• analytic if there existsθ > 0 such that the operator-valued mapt 7→ S(t) has an analyticextension toλ ∈ C : | arg λ| < θ, satisfies the semigroup property there, and is suchthatt 7→ S(eiϕt) is a strongly continuous semigroup for every angleϕ with |ϕ| < θ.

A strongly continuous semigroup is also sometimes called aC0-semigroup.

Exercise 4.2Show that being strongly continuous is equivalent tot 7→ S(t)x being continuous att = 0 for everyx ∈ B and the operator norm ofS(t) being bounded byMeat for some constantsM anda. Show then that the first condition can be relaxed tot 7→ S(t)x being continuous forall x in some dense subset ofB. (However, the second condition cannot be relaxed in general.See Exercise 5.19 on how to construct a semigroup of bounded operators such that‖S(t)‖ isunbounded neart = 0.)

Remark 4.3 Some authors, like [Lun95], do not impose strong continuityin the definition of ananalytic semigroup. This can result in additional technical complications due to the fact that thegenerator may then not have dense domain. The approach followed here has the slight drawbackthat with our definitions the heat semigroup is not analytic on L∞(R). (It lacks strong continuityas can be seen by applying it to a step function.) It is howeveranalytic for example onC0(R), thespace of continuous functions vanishing at infinity.

This section is going to assume some familiarity with functional analysis. All the necessaryresults can be found for example in the classical monograph by Yosida [Yos95]. Recall that anunbounded operatorL on a Banach spaceB consists of a linear subspaceD(L) ⊂ B called thedomainof L and a linear mapL:D(L) → B. Thegraph of an operator is the subset ofB × Bconsisting of all elements of the form (x,Lx) with x ∈ D(L). An operator isclosedif its graph is

Page 29: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

A PRIMER ON SEMIGROUP THEORY 29

a closed subspace ofB × B. It is closableif the closure of its graph is again the graph of a linearoperator and that operator is called theclosureof L.

The domainD(L∗) of the adjoint L∗ of an unbounded operatorL:D(L) → B is defined asthe set of all elementsℓ ∈ B∗ such that there exists an elementL∗ℓ ∈ B∗ with the property that(L∗ℓ)(x) = ℓ(Lx) for everyx ∈ D(L). It is clear that in order for the adjoint to be well-defined,we have to require that the domain ofL is dense inB. Fortunately, this will be the case for all theoperators that will be considered in these notes.

Exercise 4.4Show thatL being closed is equivalent to the fact that ifxn ⊂ D(L) is Cauchy inB andLxn is also Cauchy, thenx = limn→∞ xn belongs toD(L) andLx = limn→∞Lxn.

Exercise 4.5Show that the adjoint of an operator with dense domain is always closed.

Theresolvent set (L) of an operatorL is defined by

(L) = λ ∈ C : range(λ− L) is dense inB andλ− L has a continuous inverse. ,

and theresolventRλ is given forλ ∈ (L) byRλ = (λ− L)−1. (Here and in the sequel we viewB as a complex Banach space. If an operator is defined on a real Banach space, it can always beextended to its complexification in a canonical way and we will identify the two without furthernotice in the sequel.) The spectrum ofL is the complement of the resolvent set.

The most important results regarding the resolvent of an operator that we are going to use arethat any closed operatorLwith non-empty resolvent set is defined in a unique way by its resolvent.Furthermore, the resolvent set is open and the resolvent is an analytic function from (L) to thespaceL(B) of bounded linear operators onB. Finally, the resolvent operators for different valuesof λ all commute and satisfy the resolvent identity

Rλ −Rµ = (µ− λ)RµRλ ,

for any twoλ, µ ∈ (L).The fact that the resolvent is operator-valued should not bea conceptual obstacle to the use of

notions from complex analysis. Indeed, forD ⊂ C an open domain, a functionf :D → B whereB is any complex Banach space (typically the complexificationof a real Banach space which weidentify with the original space without further ado) is said to be analytic in exactly the same wayas usual by imposing that its Taylor series at any pointa ∈ D converges tof uniformly in Bon a neighbourhood ofa. The same definition applies ifD ⊂ R and analytic continuation thenworks in exactly the same way as for complex-valued functions. In particular, Cauchy’s residuetheorem, which is the main result from complex analysis thatwe are going to use later on, worksfor Banach-space valued functions in exactly the same way asfor complex-valued functions.

4.1 Strongly continuous semigroups

We start our investigation of semigroup theory with a discussion of the main results that can beobtained for strongly continuous semigroups. Given aC0-semigroup, one can associate to it a‘generator’, which is essentially the derivative ofS(t) at t = 0:

Definition 4.6 ThegeneratorL of aC0-semigroup is given by

Lx = limt→0

t−1(S(t)x− x) , (4.2)

on the setD(L) of all elementsx ∈ B such that this limit exists (in the sense of strong convergencein B).

Page 30: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

30 A PRIMER ON SEMIGROUP THEORY

The following result shows that ifL is the generator of aC0-semigroupS(t), thenx(t) =S(t)x0 is indeed the solution to (4.1) in a weak sense.

Proposition 4.7 The domainD(L) of L is dense inB, invariant underS, and the identities∂tS(t)x = LS(t)x = S(t)Lx hold for everyx ∈ D(L) and everyt ≥ 0. Furthermore, forevery ℓ ∈ D(L∗) and everyx ∈ B, the mapt 7→ 〈ℓ, S(t)x〉 is differentiable and one has∂t〈ℓ, S(t)x〉 = 〈L∗ℓ, S(t)x〉.

Proof. Fix some arbitraryx ∈ B and setxt =∫ t0 S(s)x ds. One then has

limh→0

h−1(S(h)xt − xt) = limh→0

h−1(∫ t+h

hS(s)x ds −

∫ t

0

S(s)x ds)

= limh→0

h−1(∫ t+h

tS(s)x ds −

∫ h

0

S(s)x ds)

= S(t)x− x ,

where the last equality follows from the strong continuity of S. This shows thatxt ∈ D(L). Sincet−1xt → x ast → 0 and sincex was arbitrary, it follows thatD(L) is dense inB. To show that itis invariant underS, note that forx ∈ D(L) one has

limh→0

h−1(S(h)S(t)x− S(t)x) = S(t) limh→0

h−1(S(h)x− x) = S(t)Lx ,

so thatS(t)x ∈ D(L) andLS(t)x = S(t)Lx. To show that it this is equal to∂tS(t)x, it suffices tocheck that the left derivative of this expression exists andis equal to the right derivative. This isleft as an exercise.

To show that the second claim holds, it is sufficient (using the strong continuity ofS) to checkthat it holds forx ∈ D(L). Since one then hasS(t)x ∈ D(L) for every t, it follows from thedefinition (4.2) ofD(L) thatt 7→ S(t)x is differentiable and that its derivative is equal toLS(t)x.

It follows as a corollary that no two semigroups can have the same generator (unless the semi-groups coincide of course), which justifies the notationS(t) = eLt that we are occasionally goingto use in the sequel.

Corollary 4.8 If a function x: [0, 1] → D(L) satisfies∂txt = Lxt for everyt ∈ [0, 1], thenxt = S(t)x0. In particular, no two distinctC0-semigroups can have the same generator.

Proof. It follows from an argument almost identical to that given inthe proof of Proposition 4.7that the mapt 7→ S(t)xT−t is continuous on [0, T ] and differentiable on (0, T ). Computing itsderivative, we obtain∂tS(t)xT−t = LS(t)xT−t − S(t)LxT−t = 0, so thatxT = S(T )x0.

Exercise 4.9Show that the semigroupS(t) onL2(R) given by

(S(t)f)(ξ) = f (ξ + t) ,

is strongly continuous and that its generator is given byL = ∂ξ with D(L) = H1. Similarly, showthat the heat semigroup onL2(R) given by

(S(t)f)(ξ) =1√4πt

∫exp

(−|ξ − η|2

4t

)f (η) dη ,

is strongly continuous and that its generator is given byL = ∂2ξ with D(L) = H2. Hint: Use

Exercise 4.2 to show strong continuity.

Page 31: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

A PRIMER ON SEMIGROUP THEORY 31

Remark 4.10 We did not make any assumption on the structure of the Banach spaceB. However,it is a general rule of thumb (although this isnot a theorem) that semigroups on non-separableBanach spaces tend not to be strongly continuous. For example, neither the heat semigroup northe translation semigroup from the previous exercise are strongly continuous onL∞(R) or evenonCb(R), the space of all bounded continuous functions.

Recall now that the resolvent set for an operatorL consists of thoseλ ∈ C such that theoperatorλ − L is one to one. Forλ in the resolvent set, we denote byRλ = (λ − L)−1 theresolvent ofL. It turns out that the resolvent of the generator of aC0-semigroup can easily becomputed:

Proposition 4.11 Let S(t) be aC0-semigroup such that‖S(t)‖ ≤ Meat for some constantsMand a. If Reλ > a, thenλ belongs to the resolvent set ofL and one has the identityRλx =∫ ∞0e−λtS(t)x dt.

Proof. By the assumption on the bound onS, the expressionZλ =∫ ∞0 e−λtS(t)x dt is well-

defined for everyλ with Reλ > a. In order to show thatZλ = Rλ, we first show thatZλx ∈ D(L)for everyx ∈ B and that (λ− L)Zλx = x. We have

LZλx = limh→0

h−1(S(h)Zλx− Zλx) = limh→0

h−1

∫ ∞

0

e−λt(S(t+ h)x− S(t)x) dt

= limh→0

(eλh − 1

h

∫ ∞

0

e−λtS(t)x dt− eλh

h

∫ h

0

e−λtS(t)x dt)

= λZλx− x ,

which is the required identity. To conclude, it remains to show thatλ−L is an injection onD(L).If it was not, we could findx ∈ D(L) \ 0 such thatLx = λx. Settingxt = eλtx and applyingCorollary 4.8, this yieldsS(t)x = eλtx, thus contradicting the bound‖S(t)‖ ≤Meat if Reλ > a.

We can deduce from this that:

Proposition 4.12 The generatorL of aC0-semigroup is a closed operator.

Proof. We are going to use the characterisation of closed operatorsgiven in Exercise 4.4. ShiftingL by a constant if necessary (which does not affect it being closed or not), we can assume thata = 0. Take now a sequencexn ∈ D(L) such thatxn andLxn are both Cauchy inB and setx = limn→∞ xn andy = limn→∞Lxn. Settingzn = (1 − L)xn, we have limn→∞ zn = x− y.

On the other hand, we know that1 belongs to the resolvent set, so that

x = limn→∞xn = lim

n→∞R1zn = R1(x− y) .

By the definition of the resolvent, this implies thatx ∈ D(L) and thatx − Lx = x − y, so thatLx = y as required.

We are now ready to give a full characterisation of the generators ofC0-semigroups. This isthe content of the following theorem:

Theorem 4.13 (Hille-Yosida) A closed densely defined operatorL on the Banach spaceB is thegenerator of aC0-semigroupS(t) with ‖S(t)‖ ≤ Meat if and only if allλ with Reλ > a lie in itsresolvent set and the bound‖Rn

λ‖ ≤M (Reλ− a)−n holds there for everyn ≥ 1.

Page 32: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

32 A PRIMER ON SEMIGROUP THEORY

Proof. The generatorL of aC0-semigroup is closed by Proposition 4.12. The fact that its resolventsatisfies the stated bound follows immediately from the factthat

Rnλx =

∫ ∞

0

· · ·∫ ∞

0

e−λ(t1+...+tn)S(t1 + . . . + tn)x dt1 · · · dtn

by Proposition 4.11.To show that the converse also holds, we are going to construct the semigroupS(t) by using

the so-called ‘Yosida approximations’Lλ = λLRλ for L. Note first that limλ→∞ LRλx = 0 foreveryx ∈ B: it obviously holds forx ∈ D(L) since then‖LRλx‖ = ‖RλLx‖ ≤ ‖Rλ‖‖Lx‖ ≤M (Reλ − a)−1‖Lx‖. Furthermore,‖LRλx‖ = ‖λRλx − x‖ ≤ (Mλ(λ − a)−1 + 1)‖x‖ ≤(M + 2)‖x‖ for λ large enough, so that limλ→∞ LRλx = 0 for everyx by a standard densityargument.

Using this fact, we can show that the Yosida approximation ofL does indeed approximateLin the sense that limλ→∞ Lλx = Lx for everyx ∈ D(L). Fixing an arbitraryx ∈ D(L), we have

limλ→∞

‖Lλx− Lx‖ = limλ→∞

‖(λRλ − 1)Lx‖ = limλ→∞

‖LRλLx‖ = 0 . (4.3)

Define now a family of bounded operatorsSλ(t) by Sλ(t) = eLλt =∑

n≥0

tnLnλ

n!. This series

converges in the operator norm sinceLλ is bounded and one can easily check thatSλ is indeed aC0-semigroup (actually a group) with generatorLλ. SinceLλ = −λ+ λ2Rλ, one has forλ > athe bound

‖Sλ(t)‖ = e−λt∑

n≥0

tnλ2n‖Rnλ‖

n!= M exp

(−λt+

λ2

λ− at)

= M exp( λat

λ− a

), (4.4)

so that lim supλ→∞ ‖Sλ(t)‖ ≤ Meat. Let us show next that the limit limλ→∞ Sλ(t)x exists forevery t ≥ 0 and everyx ∈ B. Fixing λ andµ large enough so that max‖Sλ(t)‖, ‖Sµ(t)‖ ≤Me2at, and fixing some arbitraryt > 0, we have fors ∈ [0, t]

‖∂sSλ(t− s)Sµ(s)x‖ = ‖Sλ(t− s)(Lµ − Lλ)Sµ(s)x‖ = ‖Sλ(t− s)Sµ(s)(Lµ − Lλ)x‖≤M2e2at‖(Lµ − Lλ)x‖ .

Integrating this bound between0 andt, we obtain

‖Sλ(t)x− Sµ(t)x‖ ≤M2te2at‖Lµx− Lλx‖ , (4.5)

which converges to0 for everyx ∈ D(L) asλ, µ → ∞ since one then hasLλx → Lx. We canthereforedefinea family of linear operatorsS(t) by S(t)x = limλ→∞ Sλ(t)x.

It is clear from (4.4) that‖S(t)‖ ≤Meat and it follows from the semigroup property ofSλ thatS(s)S(t) = S(s + t). Furthermore, it follows from (4.5) and (4.3) that for every fixedx ∈ D(L),the convergenceSλ(t)x → S(t)x is uniform in bounded intervals oft, so that the mapt 7→ S(t)xis continuous. Combining this wit oura priori bounds on the operator norm ofS(t), it followsfrom Exercise 4.2 thatS is indeed aC0-semigroup. It remains to show that the generatorL of Scoincides withL. Taking first the limitλ→ ∞ and then the limitt → 0 in the identity

t−1(Sλ(t)x− x) = t−1

∫ t

0

Sλ(s)Lλx ds ,

we see thatx ∈ D(L) impliesx ∈ D(L) andLx = Lx, so thatL is an extension ofL. However,for λ > a, bothλ − L andλ − L are one-to-one between their domain andB, so that they mustcoincide.

Page 33: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

A PRIMER ON SEMIGROUP THEORY 33

One might think that the resolvent bound in the Hille-Yosidatheorem is a consequence of thefact that the spectrum ofL is assumed to be contained in the half planeλ : Reλ ≤ a. Thishowever isn’t the case, as can be seen by the following example:

Example 4.14 We takeB =⊕

n≥1 C2 (equipped with the usual Euclidean norms) and we defineL =

⊕n≥1 Ln, whereLn: C2 → C2 is given by the matrix

Ln =

(in n0 in

).

In particular, the resolventR(n)λ of Ln is given by

R(n)λ =

1

(λ− in)2

(λ− in n

0 λ− in

),

so that one has the upper and lower bounds

n

|λ− in|2 ≤ ‖R(n)λ ‖ ≤ n

|λ− in|2 +

√2

|λ− in| .

Note now that the resolventRλ of L satisfies‖Rλ‖ = supn≥1 ‖R(n)λ ‖. On one hand, this shows

that the spectrum ofL is given by the setin2 : n ≥ 1, so that it does indeed lie in a half plane.On the other hand, for every fixed valuea > 0, we have‖Ra+in‖ ≥ n

a2 , so that the resolventbound of the Hille-Yosida theorem is certainly not satisfied.

It is therefore not surprising thatL does not generate aC0-semigroup onB. Even worse,trying to defineS(t) = ⊕n≥1Sn(t) with Sn(t) = eLnt results in‖Sn(t)‖ ≥ nt, so thatS(t) is anunbounded operator for everyt > 0!

4.1.1 Adjoint semigroups

It will be very useful in the sequel to have a good understanding of the behaviour of the adjointsof strongly continuous semigroups. The reason why this is not a completely trivial topic is that, ingeneral, it is simply not true that the adjoint semigroupS∗(t):B∗ → B∗ of a strongly continuoussemigroup is again strongly continuous. This is probably best illustrated by an example.

TakeB = C([0, 1],R) and letS(t) be the heat semigroup (with Neumann boundary conditions,say). ThenS∗(t) acts on finite signed measures by convolving them with the heat kernel. While itis true thatS∗(t)µ → µ weakly ast → 0, it is not true in general that this convergence is strong.For example,S∗(t)δx doesnot converge toδx in the total variation norm (which is the dual to thesupremum norm onC([0, 1],R)). However, this difficulty can always be overcome by restrictingS∗(t) to a slightly smaller space thanB∗. This is the content of the following result:

Proposition 4.15 If S(t) is aC0-semigroup onB, thenS∗(t) is a C0-semigroup on the closureB†

of D(L∗) in B∗ and its generatorL† is given by the restriction ofL∗ to the setD(L†) = x ∈D(L∗) : L∗x ∈ B†.

Proof. We first show thatS∗(t) is strongly continuous onB† and we will then identify its generator.Note first that it follows from Proposition 4.7 thatS∗(t) mapsD(L∗) into itself, so that it doesindeed define a family of bounded operators onB†. Since the norm ofS∗(t) is O(1) ast → 0 andsinceD(L∗) is dense inB† by definition, it is sufficient to show that limt→0 S

∗(t)x = x for everyx ∈ D(L∗). It follows immediately from Proposition 4.7 that forx ∈ D(L∗) one has the identity

S∗(t)x− x =

∫ t

0

S∗(s)L∗x ds ,

Page 34: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

34 A PRIMER ON SEMIGROUP THEORY

from which we conclude thatS∗(t)x→ x.It follows from Proposition 4.11 that the resolventR†

λ for S∗(t) on B† is nothing but therestriction ofR∗

λ toB†. This immediately implies thatD(L†) is given by the stated expression.

Remark 4.16 As we saw in the example of the heat semigroup,B† is in general strictly smallerthanB∗. This fact was first pointed out by Phillips in [Phi55]. In ourexample,B∗ consists ofall finite signed Borel measures on [0, 1], whereasB† only consists of those measures that have adensity with respect to Lebesgue measure.

Even thoughB† is in general a proper closed subspace ofB∗, it is large enough to be dense inB∗, when equipped with the (much weaker) weak-* topology. Thisthe content of our last result inthe theory of strongly continuous semigroups:

Proposition 4.17 For everyℓ ∈ B∗ there exists a sequenceℓn ∈ B† such thatℓn(x) → ℓ(x) foreveryx ∈ B.

Proof. It suffices to chooseℓn = nR∗nℓ. Since we haveℓn ∈ D(L∗), it is clear thatℓn ∈ B†. On

the other hand, we know from the proof of the Hille-Yosida theorem that limn→∞ ‖nRnx−x‖ = 0for everyx ∈ B, from which the claim follows at once.

4.2 Semigroups with selfadjoint generators

In this section, we consider the particular case of stronglycontinuous semigroups consisting ofself-adjoint operators on a Hilbert spaceH. The reason why this is an interesting case is that itimmediately implies very strong smoothing properties of the operatorsS(t) in the sense that forevery t > 0, they mapH into the domain of arbitrarily high powers ofL. Furthermore, it isvery easy to obtain explicit bounds on the norm ofS(t) as an operator fromH into one of thesedomains. We will then see later in Section 4.3 on analytic semigroups that most of these propertiesstill hold true for a much larger class of semigroups.

LetL be a selfadjoint operator onH which is bounded from above. Without loss of generality,we are going to assume that it is actually negative definite, so that〈x,Lx〉 ≤ 0 for anyx ∈ H.In this case, we can use functional calculus (see for example[RS80], in particular chapter VIII involume I) to define selfadjoint operatorsf (L) for any measurable mapf : R → R. This is becausethe spectral decomposition theorem can be formulated as:

Theorem 4.18 (Spectral decomposition)LetL be a selfadjoint operator on a separable HilbertspaceH. Then, there exists a measure space(M, µ), an isomorphismK:H → L2(M, µ), anda functionfL:M → R such that viaK, L is equivalent to the multiplication operator byfL onL2(M, µ). In other words, one hasL = K−1fLK andKD(L) = g : fLg ∈ L2(M, µ).

In particular, this allows one to definef (L) = K−1(f fL)K, which has all the nice prop-erties that one would expect from functional calculus, likefor example (fg)(L) = f (L)g(L),‖f (L)‖ = ‖f‖L∞(M,µ), etc. DefiningS(t) = eLt, it is an exercise to check thatS is indeed aC0-semigroup with generatorL (either use the Hille-Yosida theorem and make sure that the semi-group constructed there coincides withS or check ‘by hand’ thatS(t) is indeedC0 with generatorL).

The important property of semigroups generated by selfadjoint operators is that they do notonly leaveD(L) invariant, but they have a regularising effect in that theymapH into the domainof any arbitrarily high power ofL. More precisely, one has:

Page 35: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

A PRIMER ON SEMIGROUP THEORY 35

Proposition 4.19 LetL be self-adjoint and negative definite and letS(t) be the semigroup onHgenerated byL. Then,S(t) mapsH into the domain of(1 − L)α for anyα, t > 0 and there existconstantsCα such that‖(1 − L)αS(t)‖ ≤ Cα(1 + t−α).

Proof. By functional calculus, it suffices to show that supλ≥0(1 + λ)αe−λt ≤ Cα(1 + t−α). Onehas

supλ≥0

λαe−λt = t−α supλ≥0

(λt)αe−λt = t−α supλ≥0

λαe−λ = ααe−αt−α .

The claim now follows from the fact that there exists a constant C ′α such that (1 − λ)α ≤ C ′

α(1 +(−λ)α) for everyλ ≤ 0.

4.3 Analytic semigroups

Obviously, the conclusion of Proposition 4.19 does not holdfor arbitraryC0-semigroups since thegroup of translations from Example 4.9 does not have any smoothing properties. It does howeverhold for a very large class of semigroups, the so-called analytic semigroups. The study of thesesemigroups is the object of the remainder of this section, and the equivalent of Proposition 4.19is going to be one of our two main results. The other result is acharacterisation of generatorsfor analytic semigroups that is analogous to the Hille-Yosida theorem forC0-semigroups. Thedifference will be that the role of the half-plane Reλ > a will be played by the complement of asector of the complex plane with an opening angle strictly smaller thanπ.

Recall that a semigroupS on a Banach spaceB is analytic if there existsθ > 0 such that themapt 7→ S(t) (taking values inL(B)) admits an analytic extension to the sectorSθ = λ ∈ C :| arg λ| < θ, satisfies the semigroup property there, and is such thatt 7→ Sϕ(t) = S(eiϕt) isa strongly continuous semigroup for every|ϕ| < θ. If θ is the largest angle such that the aboveproperty holds, we callS analytic with angleθ. The strong continuity oft 7→ S(eiϕt) implies thatthere exist constantsM (ϕ) anda(ϕ) such that

‖Sϕ(t)‖ ≤M (ϕ)ea(ϕ)t .

Using the semigroup property, it is not difficult to show thatM anda can be chosen bounded overcompact intervals:

Proposition 4.20 LetS be an analytic semigroup with angleθ. Then, for everyθ′ < θ, there existM anda such that‖Sϕ(t)‖ ≤Meat for everyt > 0 and every|ϕ| ≤ θ′.

Proof. Fix θ′ ∈ (0, θ), so that in particularθ < π/2. Then there exists a constantC such that,for everyt > 0 and everyϕ with |ϕ| < θ′, there exist numberst+, t− ∈ [0, Ct] such thatteiϕ =t+e

iθ′ + t−e−iθ′ . It follows that one has the bound‖Sϕ(t)‖ ≤ M (θ′)M (−θ′)ea(θ′)Ct+a(−θ′)Ct,thus proving the claim.

We next compute the generators of the semigroupsSϕ obtained by evaluatingS along a ‘ray’extending out of the origin into the complex plane:

Proposition 4.21 LetS be an analytic semigroup with angleθ. Then, for|ϕ| < θ, the generatorLϕ of Sϕ is given byLϕ = eiϕL, whereL is the generator ofS.

Proof. Recall Proposition 4.11 showing that for Reλ large enough the resolventRλ for L is givenby

Rλx =

∫ ∞

0

e−λtS(t)x dt .

Page 36: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

36 A PRIMER ON SEMIGROUP THEORY

Since the mapt 7→ e−λtS(t) is analytic inSθ by assumption and since, provided again that Reλ islarge enough, it decays exponentially to0 as|t| → ∞, we can deform the contour of integrationto obtain

Rλx = eiϕ∫ ∞

0

e−λeiϕtS(eiϕt)x dt .

Denoting byRϕλ the resolvent for the generatorLϕ of Sϕ, we thus have the identityRλ =

eiϕRϕλeiϕ , which is equivalent to (λ − L)−1 = (λ− e−iϕLϕ)−1, thus showing thatLϕ = eiϕL as

stated.

We now use this to show that ifS is an analytic semigroup, then the resolvent set of its gener-atorL not only contains the right half plane, but it contains a larger sector of the complex plane.Furthermore, this characterises the generators of analytic semigroups, providing a statement simi-lar to the Hille-Yosida theorem:

Theorem 4.22 A closed densely defined operatorL on a Banach spaceB is the generator of ananalytic semigroup if and only if there existsθ ∈ (0, π

2) anda ≥ 0 such that the spectrum ofL is

contained in the sector

Sθ,a = λ ∈ C : arg(a− λ) ∈ [−π2

+ θ, π2− θ] ,

and there existsM > 0 such that the resolventRλ satisfies the bound‖Rλ‖ ≤Md(λ, Sθ,a)−1 foreveryλ 6∈ Sθ,a.

Proof. The fact that generators of analytic semigroups are of the prescribed form is a consequenceof Proposition 4.21 and the Hille-Yosida theorem.

a

b

ϕ

θ

Sa,θ

γb,ϕ

Re

Im

To show the converse statement, letL be such anoperator, letϕ ∈ (0, θ), let b > a, and letγϕ,b bethe curve in the complex plane obtained by going ina counterclockwise way around the boundary ofSϕ,b

(see the figure on the right). Fort with | arg t| < ϕ,defineS(t) by

S(t) =1

2πi

γϕ,b

etzRz dz (4.6)

=1

2πi

γϕ,b

etz(z − L)−1 dz .

It follows from the resolvent bound that‖Rz‖ is uni-formly bounded forz ∈ γϕ,b. Furthermore, since| arg t| < ϕ, it follows thatetz decays exponentiallyas|z| → ∞ alongγϕ,b, so that this expression is well-defined, does not depend on the choice ofb andϕ,and (by choosingϕ arbitrarily close toθ) determines an analytic functiont 7→ S(t) on the sectort : | arg t| < θ. As in the proof of the Hille-Yosida theorem, the function (x, t) 7→ S(t)x isjointly continuous because the convergence of the integraldefiningS is uniform over boundedsubsets oft : | arg t| < ϕ for any |ϕ| < θ.

It therefore remains to show thatS satisfies the semigroup property on the sectort : | arg t| <θ and that its generator is indeed given byL. Choosings and t such that| arg s| < θ and| arg t| < θ and using the resolvent identityRz −Rz′ = (z′ − z)RzRz′ , we have

S(s)S(t) = − 1

4π2

γϕ,b′

γϕ,b

etz+sz′RzRz′ dz dz′ = − 1

4π2

γϕ,b′

γϕ,b

etz+sz′Rz −Rz′

z′ − zdz dz′

Page 37: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

A PRIMER ON SEMIGROUP THEORY 37

= − 1

4π2

γϕ,b

etzRz

γϕ,b′

esz′

z′ − zdz′ dz − 1

4π2

γϕ,b′eszRz

γϕ,b

etz′

z′ − zdz′ dz .

Here, the choice ofb and b′ is arbitrary, as long asb 6= b′ so that the inner integrals are well-defined, sayb′ > b for definiteness. In this case, since the contourγϕ,b can be ‘closed up’ to the

left but not to the right, the integral∫γϕ,b′

esz′

z′−z dz′ is equal to2iπesz for everyz ∈ γϕ,b, whereas

the integral withb andb′ inverted vanishes, so that

S(s)S(t) =1

2iπ

γϕ,b

e(t+s)zRz = S(s+ t) ,

as required. The continuity of the mapt 7→ S(t)x is a straightforward consequence of the resolventbound, noting that it arises as a uniform limit of continuousfunctions. ThereforeS is a stronglycontinuous semigroup; let us call its generatorL andRλ the corresponding resolvent.

To show thatL = L, it suffices to show thatRλ = Rλ, so we make use again of Proposi-tion 4.11. Choosing Reλ > b so that Re(z − λ) < 0 for everyz ∈ γϕ,b, we have

Rλ =

∫ ∞

0

e−λtS(t) dt =1

2πi

∫ ∞

0

γϕ,b

et(z−λ)Rz dz dt

=1

2πi

γϕ,b

∫ ∞

0

et(z−λ) dtRz dz =1

2πi

γϕ,b

Rz

z − λdz = Rλ .

The last inequality was obtained by using the fact that‖Rz‖ decays like1/|z| for large enoughzwith | arg z| ≤ π

2+ ϕ, so that the contour can be ‘closed’ to enclose the pole atz = λ.

As a consequence of this characterisation theorem, we can study perturbations of generatorsof analytic semigroups. The idea is to give a constructive criterion which allows to make sure thatan operator of the typeL = L0 + B is the generator of an analytic semigroup, provided thatL0

is such a generator andB satisfies a type of ‘relative total boundedness’ condition.The precisestatement of this result is:

Theorem 4.23 Let L0 be the generator of an analytic semigroup and letB:D(B) → B be anoperator such that

• The domainD(B) containsD(L0).

• For everyε > 0 there existsC > 0 such that‖Bx‖ ≤ ε‖L0x‖+C‖x‖ for everyx ∈ D(L0).

Then the operatorL = L0 +B (with domainD(L) = D(L0)) is also the generator of an analyticsemigroup.

Proof. In view of Theorem 4.22 it suffices to show that there exists a sectorSθ,a containing thespectrum ofL and such that the resolvent boundRλ ≤Md(λ, Sθ,a)−1 holds away from it.

Denote byR0λ the resolvent forL0 and consider the resolvent equation forL:

(λ− L0 −B)x = y , x ∈ D(L0) .

Since (at least forλ outside of some sector)x belongs to the range ofR0λ, we can setx = R0

λz sothat this equation is equivalent to

z −BR0λz = y .

Page 38: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

38 A PRIMER ON SEMIGROUP THEORY

The claim therefore follows if we can show that there exists asectorSθ,a and a constantc < 1such that‖BR0

λ‖ ≤ c for λ 6∈ Sθ,a. This is because one then has the bound

‖Rλy‖ = ‖R0λz‖ ≤ ‖R0

λ‖1 − c

‖y‖ .

Using our assumption onB, we have the bound

‖BR0λz‖ ≤ ε‖L0R

0λz‖ + C‖R0

λz‖ . (4.7)

Furthermore, one has the identityL0R0λ = λR0

λ − 1 and, sinceL0 is the generator of an analyticsemigroup by assumption, the resolvent bound‖R0

λ‖ ≤Md(λ, Sα,b)−1 for some parametersα, b.Inserting this into (4.7), we obtain the bound

‖BR0λ‖ ≤ (ε|λ| + C)M

d(λ, Sα,b)+ ε .

Note now that by choosingθ ∈ (0, α), we can find someδ > 0 such thatd(λ, Sα,b) > δ|λ| for allλ 6∈ Sθ,a and alla > 1 ∨ (b + 1). We fix such aθ and we makeε sufficiently small such that onehas bothε < 1/4 andεδ−1 < 1/4.

We can then makea large enough so thatd(λ, Sα,b) ≥ 4CM for λ 6∈ Sθ,a, so that‖BR0λ‖ ≤

3/4. for these values ofλ, as requested.

Remark 4.24 As one can see from the proof, one actually needs the bound‖Bx‖ ≤ ε‖L0x‖ +C‖x‖ only for some particular value ofε that depends on the characteristics ofL0.

As a consequence, we have:

Proposition 4.25 Letf ∈ L∞(R). Then, the operator

(Lg)(x) = g′′(x) + f (x)g′(x) ,

onL2(R) with domainD(L) = H2 is the generator of an analytic semigroup.

Proof. It is well-known that the operator(L0g)(x) = g′′(x) with domainD(L) = H2 is self-adjoint and negative definite, so that it is the generator of an analytic semigroup with angleθ =π/2.

SettingBg = fg′, we have forg ∈ H2 the bound

‖Bg‖2 =

Rf2(x)(g′(x))2 dx ≤ ‖f‖2

L∞〈g′, g′〉 = −‖f‖2L∞〈g, g′′〉 ≤ ‖f‖L∞‖g‖‖L0g‖ .

It now suffices to use the fact that2|xy| ≤ εx2 + ε−1y2 to conclude that the assumptions ofTheorem 4.23 are satisfied.

Similarly, one can show:

Exercise 4.26Show that the generator of an elliptic diffusion with smoothcoefficients on a com-pact Riemannian manifoldM generates an analytic semigroup onL2(M, ), where is the vol-ume measure given by the Riemannian structure.

Page 39: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

A PRIMER ON SEMIGROUP THEORY 39

4.4 Interpolation spaces

The remainder of this section will be devoted to the study of the domains of fractional powersof the generatorL of an analytic semigroupS(t). For simplicity, we will assumethroughout thissectionthat there existM > 0 andw > 0 such that‖S(t)‖ ≤ Me−wt, thus making sure thatthe resolvent set ofL contains all the right half of the complex plane. The generalcase can berecovered easily by ‘shifting the generator to the left’. For α > 0, we definenegative fractionalpowers ofL by

(−L)−α def=

1

Γ(α)

∫ ∞

0

tα−1S(t) dt , (4.8)

which is a bounded operator by the decay assumption on‖S(t)‖. SinceΓ(1) = 1, note that ifα = 1one does indeed recover the resolvent ofL evaluated at0. Furthermore, it is straightforward tocheck that one has the identity (−L)−α(−L)−β = (−L)−α−β , which together justify the definition(4.8).

Note that it follows from this identity that (−L)−α is injective for everyα > 0. Indeed, givensomeα > 0, one can find an integern > 0 such that (−L)−n = (−L)−n+α(−L)−α. A failurefor (−L)−α to be injective would therefore result in a failure for (−L)n and therefore (−L)−1

to be injective. This is ruled out by the fact that0 belongs to the resolvent set ofL. We cantherefore define (−L)α as the unbounded operator with domainD((−L)α) = range(−L)−α givenby the inverse of (−L)−α. This definition is again consistent with the usual definition of (−L)α

for integer values ofα. This allows us to set:

Definition 4.27 For α > 0 and given an analytic semigroupS on a Banach spaceB, we definethe interpolation spaceBα as the domain of (−L)α endowed with the norm‖x‖α = ‖(−L)αx‖.We similarly defineB−α as the completion ofB for the norm‖x‖−α = ‖(−L)−αx‖.

Remark 4.28 If the norm ofS(t) grows instead of decaying witht, then we useλ− L instead of−L for someλ sufficiently large. The choice of different values ofλ leads to equivalent norms onBα.

Exercise 4.29Show that the inclusionBα ⊂ Bβ for α ≥ β hold, whatever the signs ofα andβ.

Exercise 4.30Show that forα ∈ (0, 1) andx ∈ D(L), one has the identity

(−L)αx =sinαππ

∫ ∞

0

tα−1(t− L)−1(−L)x dt . (4.9)

Hint: Write the resolvent appearing in (4.9) in terms of the semigroup and apply the resultingexpression to (−L)−αx, as defined in (4.8). The aim of the game is then to perform a smartchange of variables.

Exercise 4.31Use (4.9) to show that, for everyα ∈ (0, 1), there exists a constantC such that thebound‖(−L)αx‖ ≤ C‖Lx‖α‖x‖1−α holds for everyx ∈ D(L).Hint: Split the integral as

∫ ∞0 =

∫ K0 +

∫ ∞K and optimise overK. (The optimal value forK will

turn out to be proportional to‖Lx‖/‖x‖.) In the first integral, the identity (t − L)−1(−L) =1 − t(t− L)−1 might come in handy.

Exercise 4.32LetL be the generator of an analytic semigroup onB and denote byBα the corre-sponding interpolation spaces. LetB be a (possibly unbounded) operator onB. Using the results

Page 40: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

40 A PRIMER ON SEMIGROUP THEORY

from the previous exercise, show that if there existsα ∈ [0, 1) such thatBα ⊂ D(B) so thatB isa bounded operator fromBα toB, then one has the bound

‖Bx‖ ≤ C(ε‖Lx‖ + ε−α/(1−α)‖x‖) ,

for some constantC > 0 and for allε ≤ 1. In particular,L+B is also the generator of an analyticsemigroup onB.Hint: The assumption onB implies that there exists a constantC such that‖Bx‖ ≤ C‖x‖α.

Exercise 4.33LetL andB be as in Exercise 4.32 and denote bySB the analytic semigroup withgeneratorL+B. Use the relationRλ −R0

λ = R0λBRλ to show that one has the identity

SB(t)x = S(t)x+

∫ t

0

S(t− s)BSB(s)x ds .

Hint: Start from the right hand side of the equation and use an argument similar to that of theproof of Theorem 4.22.

Exercise 4.34Show that (−L)α commutes withS(t) for everyt > 0 and everyα ∈ R. DeducethatS(t) leavesBα invariant for everyα > 0.

Exercise 4.35 It follows from Theorem 4.22 that the restrictionL† of the adjointL∗ of the gener-ator of an analytic semigroup onB to the ‘semigroup dual’ spaceB† is again the generator of ananalytic semigroup onB†. Denote byB†

α the corresponding interpolation spaces. Show that onehasB†

α = D((−L†)α) ⊂ D(((−L)α)∗) = (B−α)∗ for everyα ≥ 0.

We now show that an analytic semigroupS(t) always mapsB intoBα for t > 0, so that it has a‘smoothing effect’. Furthermore, the norm in the domains ofinteger powers ofL can be boundedby:

Proposition 4.36 For everyt > 0 and every integerk > 0, S(t) mapsB into D(Lk) and thereexists a constantCk such that

‖LkS(t)x‖ ≤ Ck

tk

for everyt ∈ (0, 1].

Proof. In order to show thatS mapsB into the domain of every power ofL, we use (4.6), togetherwith the identityLRλ = λRλ − 1 which is an immediate consequence of the definition of theresolventRλ of L. Since

∫γϕ,b

etzdz = 0 for everyt such that| arg t| < ϕ and since the domain

of Lk is complete under the graph norm, this shows thatS(t)x ∈ D(Lk) and

LkS(t) =1

2πi

γϕ,b

zketzRz dz .

It follows that there exist positive constantsci such that

‖LkS(t)‖ ≤ 1

γϕ,b

|z|k|etz|‖Rz‖ d|z| ≤ c1

∫ ∞

0

(1 + x)ke−c2t(x−c3)(1 + x)−1dx .

Integrating by partsk − 1 times, we obtain

‖LkS(t)‖ ≤ c4tk−1

∫ ∞

0

e−c2t(x−c4) dx =c5e

c6t

tk,

which implies the announced bound.

Page 41: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

A PRIMER ON SEMIGROUP THEORY 41

It turns out that a similar bound also holds for interpolation spaces with non-integer indices:

Proposition 4.37 For everyt > 0 and everyα > 0, S(t) mapsB into Bα and there exists aconstantCα such that

‖(−L)αS(t)x‖ ≤ Cα

tα(4.10)

for everyt ∈ (0, 1].

Proof. The fact thatS(t) mapsB into Bα follows from Proposition 4.36 since there existsn suchthatD(Ln) ⊂ Bα. We assume again that the norm ofS(t) decays exponentially for larget. Theclaim for integer values ofα is known to hold by Proposition 4.36, so we fix someα > 0 which isnot an integer. Note first that (−L)α = (−L)α−[α]−1(−L)[α]+1, were we denote by [α] the integerpart ofα. We thus obtain from (4.8) the identity

(−L)αS(t) =(−1)[α]+1

Γ([α] − α+ 1)

∫ ∞

0

s[α]−αL[α]+1S(t+ s) ds .

Using the previous bound fork = [α], we thus get for someC > 0 the bound

‖(−L)αS(t)‖ ≤ C

∫ ∞

0

s[α]−α e−w(t+s)

(t+ s)[α]+1ds ≤ Ct−α

∫ ∞

0

s[α]−α

(1 + s)[α]+1ds ,

where we used the substitutions 7→ ts. Since the last function is integrable for everyα > 0, theclaim follows at once.

Exercise 4.38Using the fact thatS(t) commutes with any power of its generator, show thatS(t)mapsBα into Bβ for everyα, β ∈ R and that, forβ > α, there exists a constantCα,β such that‖S(t)x‖Bβ

≤ Cα,β‖x‖Bα tα−β for all t ∈ (0, 1].

Exercise 4.39Using the bound from the previous exercise and the definitionof the resolvent,show that for everyα ∈ R and everyβ ∈ [α,α + 1) there exists a constantC such that the bound‖(t− L)−1x‖Bβ

≤ C(1 + t)β−α−1‖x‖Bα holds for allt ≥ 0.

Exercise 4.40Consider an analytic semigroupS(t) on B and denote byBα the correspondinginterpolation spaces. Fix someγ ∈ R and denote byS(t) the semigroupS viewed as a semigroupon Bγ . Denoting byBα the interpolation spaces corresponding toS(t), show that one has theidentity Bα = Bγ+α for everyα ∈ R.

Another question that can be answered in a satisfactory way with the help of interpolationspaces is the speed of convergence ofS(t)x to x as t → 0. We know that ifx ∈ D(L), thent 7→ S(t)x is differentiable att = 0, so that‖S(t)x−x‖ = t‖Lx‖+ o(t). Furthermore, one can ingeneral find elementsx ∈ B so that the convergenceS(t)x → x is arbitrarily slow. This suggeststhat if x ∈ D((−L)α) for α ∈ (0, 1), one has‖S(t)x− x‖ = O(tα). This is indeed the case:

Proposition 4.41 LetS be an analytic semigroup with generatorL on a Banach spaceB. Then,for everyα ∈ (0, 1), there exists a constantCα, so that the bound

‖S(t)x− x‖ ≤ Cαtα‖x‖Bα (4.11)

holds for everyx ∈ Bα and everyt ∈ (0, 1].

Page 42: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

42 A PRIMER ON SEMIGROUP THEORY

Proof. By density, it is sufficient to show that (4.11) holds for every x ∈ D(L). For such anx, onehas indeed the chain of inequalities

‖S(t)x− x‖ =∥∥∥∫ t

0

S(s)Lxdx∥∥∥ =

∥∥∥∫ t

0

(−L)1−αS(s)(−L)αx dx∥∥∥

≤ C‖x‖Bα

∫ t

0

‖(−L)1−αS(s)‖ dx ≤ C‖x‖Bα

∫ t

0

sα−1 ds = C‖x‖Bαtα .

Here, the constantC depends only onα and changes from one expression to the next.

We conclude this section with a discussion on the interpolation spaces arising from a perturbedanalytic semigroup. As a consequence of Exercises 4.31, 4.32, and 4.39, we have the followingresult:

Proposition 4.42 LetL0 be the generator of an analytic semigroup onB and denote byB0γ the

corresponding interpolation spaces. LetB be a bounded operator fromB0α to B for someα ∈

[0, 1). Let furthermoreBγ be the interpolation spaces associated toL = L0 + B. Then, one hasBγ = B0

γ for everyγ ∈ [0, 1].

Proof. The statement is clear forγ = 0 andγ = 1. For intermediate values ofγ, we will showthat there exists a constantC such thatC−1‖(−L0)γx‖ ≤ ‖(−L)γx‖ ≤ C‖(−L0)γx‖ for everyx ∈ D(L0).

Since the domain ofL is equal to the domain ofL0, we know that the operatorBRt is boundedfor everyt > 0, whereRt is the resolvent ofL. Making use of the identity

Rt = R0t +R0

tBRt , (4.12)

(where we similarly denoted byR0t the resolvent ofL0) it then follows from Exercise 4.39 and the

assumption onB that one has for everyx ∈ B0γ the bound

‖BRtx‖ ≤ ‖BR0tx‖ + ‖BR0

tBRtx‖ ≤ C(‖R0tx‖B0

α+ ‖R0

tBRtx‖B0α)

≤ C(1 + t)α−γ−1‖x‖B0γ

+ C(1 + t)α−1‖BRtx‖ .

It follows that, fort sufficiently large, one has the bound

‖BRtx‖ ≤ C(1 + t)α−γ−1‖x‖B0γ. (4.13)

(Note that this bound is also valid forγ = 0.) Since one furthermore has the resolvent identityRs = Rt + (t− s)RsRt, this bound can be extended to allt > 0 by possibly changing the valueof the constantC.

We now show that‖(−L)γx‖ can be bounded by‖(−L0)γx‖. We make use of Exercise 4.31to get, forx ∈ D(L0), the bound

‖x‖Bγ = C∥∥∥∫ ∞

0

tγ−1LRtx dt∥∥∥

≤ C∥∥∥∫ ∞

0

tγ−1L0R0tx dt

∥∥∥ + C

∫ ∞

0

tγ−1‖(L0R0t + 1)BRtx‖ dt

≤ ‖x‖B0γ

+ C

∫ ∞

0

tγ−1‖BRtx‖ dt

≤ ‖x‖B0γ

+ C

∫ ∞

0

tγ−1(1 + t)α−γ−1 dt‖x‖B0γ.

Page 43: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

A PRIMER ON SEMIGROUP THEORY 43

Here, we used again the identity (4.12) to obtain the first inequality and we used (4.13) in the laststep. Since this integral converges, we have obtained the required bound.

In order to obtain the converse bound, we have similarly to before

‖x‖B0γ≤ ‖x‖Bγ + C

∫ ∞

0

tγ−1‖BRtx‖ dt .

Making use of the resolvent identity, this yields for arbitraryK > 0 the bound

‖x‖B0γ≤ ‖x‖Bγ + C

∫ ∞

0

tγ−1‖BRt+Kx‖ dt + CK

∫ ∞

0

tγ−1‖BRt+KRtx‖ dt

≤ ‖x‖Bγ + C

∫ ∞

0

tγ−1(t+K)α−γ−1 dt‖x‖B0γ

+ CK

∫ ∞

0

tγ−1(1 + t)−1 dt‖x‖

≤ ‖x‖Bγ + CKα−1‖x‖B0γ

+ CK‖x‖ .

By makingK sufficiently large, the prefactor of the second term can be made smaller than12, say,

so that the required bound follows by the usual trick of moving the term proportional to‖x‖B0γ

tothe left hand side of the inequality.

Exercise 4.43Assume thatB = H is a Hilbert space and that the antisymmetric part ofL is‘small’ in the sense thatD(L∗) = D(L) and, for everyε > 0 there exists a constantC such that‖(L−L∗)x‖ ≤ ε‖Lx‖+C‖x‖ for everyx ∈ D(L). Show that in this case the spaceH−α can beidentified with the dual ofHα (under the pairing given by the scalar product ofH) for α ∈ [0, 1].

It is interesting to note that the range [0, 1] appearing in the statement of Proposition 4.42is not just a restriction of the technique of proof employed here. There are indeed examplesof perturbations of generators of analytic semigroups of the type considered here which inducechanges in the corresponding interpolation spacesBα for α 6∈ [0, 1].

Consider for example the caseB = L2([0, 1]) and L0 = ∆, the Laplacian with periodicboundary conditions. Denote byB0

α the corresponding interpolation spaces. Let nowδ ∈ (0, 1) besome arbitrary index and letg ∈ B be such thatg 6∈ B0

δ . Such an elementg exists since∆ is anunbounded operator. DefineB as the operator with domainC1([0, 1]) ⊂ B given by

(Bf)(x) = f ′(1/2)g(x) . (4.14)

It turns out thatB0α ⊂ C1([0, 1]) for α > 3/4 (see for example Lemma 6.13 below), so that the

assumptions of Proposition 4.42 are indeed satisfied. Consider now the interpolation spaces ofindex1 + δ. Since we know thatBδ = B0

δ , we have the characterisations

B1+δ = f ∈ D(∆) : ∆f + f ′(1/2)g ∈ B0δ ,

B01+δ = f ∈ D(∆) : ∆f ∈ B0

δ .

Since on the other handg 6∈ B0δ by assumption, it follows thatB1+δ ∩ B0

1+δ consists precisely ofthose functions inD(∆) that have a vanishing derivative at1/2. In particular,B1+δ 6= B0

1+δ.One can also show thatB−1/4 6= B0

−1/4in the following way. Letfn ⊂ D(L) be an arbitrary

sequence of elements that form a Cauchy sequence inB3/4. Since we have already shown thatB3/4 = B0

3/4, this implies thatfn is Cauchy inB0

3/4as well. It then follows from the definition

of the interpolation spaces that the sequence∆fn is Cauchy inB0−1/4

and that the sequence

(∆ +B)fn is Cauchy inB−1/4. Assume now by contradiction thatB−1/4 = B0−1/4

.

Page 44: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

44 L INEAR SPDES / STOCHASTIC CONVOLUTIONS

This would entail that both∆fn and∆fn+Bfn are Cauchy inB−1/4, so thatf ′n(1/2)gis Cauchy inB−1/4. This in turn immediately implies that the sequencef ′n(1/2) must be Cauchyin R. Define nowfn by

fn(x) =n∑

k=1

sin(4πkx)k2 logk

.

It is then straightforward to check that, since∑

k(k log2 k)−1 converges, this sequence is Cauchyin B0

3/4. On the other hand, we havef ′n(1/2) =

∑nk=1(k logk)−1 which diverges, thus leading to

the required contradiction.

Exercise 4.44Show, again in the same setting as above, that ifg ∈ B0δ for someδ > 0, then one

hasBα = B0α for everyα ∈ [0, 1 + δ).

Remark 4.45 The operatorB defined in (4.14) is not a closed operator onB. In fact, it is not evenclosable! This is however of no consequence for Proposition4.42 since the operatorL = L0 +Bis closed and this is all that matters.

5 Linear SPDEs / Stochastic Convolutions

We now apply the knowledge gathered in the previous sectionsto discuss the solution to linearstochastic PDEs. Most of the material from this section can also be found in one way or the otherin the monographs [DPZ92b, DPZ96] by Da Prato and Zabczyk. The aim of this section is todefine what we mean by the solution to a linear stochastic PDE of the form

dx = Lxdt+QdW (t) , x(0) = x0 , (5.1)

where we wantx to take values in a separable Banach spaceB,L is the generator of aC0 semigrouponB, W is a cylindrical Wiener process on some Hilbert spaceK, andQ:K → B is a boundedlinear operator.

We do not in general expectx to take values inD(L) and we do not even in general expectQW (t) to be aB-valued Wiener process, so that the usual way of defining solutions to (5.1) bysimply integrating both sides of the identity does not work.However, if we apply someℓ ∈ D(L∗)to both sides of (5.1), then there is much more hope that the usual definition makes sense. Thismotivates the following definition:

Definition 5.1 A B-valued processx(t) is said to be aweak solutionto (5.1) if, for everyt > 0,∫ t0‖x(s)‖ ds <∞ almost surely and the identity

〈ℓ, x(t)〉 = 〈ℓ, x0〉 +

∫ t

0

〈L∗ℓ, x(s)〉 ds +

∫ t

0

〈Q∗ℓ, dW (s)〉 , (5.2)

holds almost surely for everyℓ ∈ D(L∗).

Remark 5.2 (Very important!) The term ‘weak’ refers to the PDE notion of a weak solutionandnot to the probabilistic notion of a weak solution to a stochastic differential equation. From aprobabilistic point of view, we are always going to be dealing with strong solutions in these notes,in the sense that (5.1) can be solved pathwise for almost every realisation of the cylindrical WienerprocessW .

Just as in the case of stochastic ordinary differential equations, there are examples of stochasticPDEs that are sufficiently irregular so that they can only be solved in the probabilistic weak sense.We will however not consider any such example in these notes.

Page 45: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

L INEAR SPDES / STOCHASTIC CONVOLUTIONS 45

Remark 5.3 The stochastic integral in (5.2) can be interpreted in the sense of Section 3.4 sincethe mapQ∗ℓ:K → R is Hilbert-Schmidt for everyℓ ∈ B∗.

Remark 5.4 Although separability ofB was not required in the previous section on semigrouptheory, it is again needed in this section, since many of the results from the section on Gaussianmeasure theory would not hold otherwise.

On the other hand, suppose thatf : R+ → D(L) is a continuous function and consider thefunctionx: R+ → D(L) given byx(t) = S(t)x0+

∫ t0S(t−s)f (s) ds, whereS is theC0-semigroup

generated byL. If x0 ∈ D(L) as well, then this function is differentiable and it is easyto check,using Proposition 4.7, that it satisfies the differential equation∂tx = Lx+ f . Formally replacingf (s) ds byQdW (s), this suggests the following alternative definition of a solution to (5.1):

Definition 5.5 A B-valued processx(t) is said to be amild solutionto (5.1) if the identity

x(t) = S(t)x0 +

∫ t

0

S(t− s)QdW (s) , (5.3)

holds almost surely for everyt > 0. The right hand side of (5.3) is also sometimes called astochastic convolution.

Remark 5.6 By the results from Section 3.4, the right hand side of (5.3) makes sense in anyHilbert spaceH containingB and such that

∫ t0

tr ιS(t−s)QQ∗S(t−s)∗ι∗ ds <∞, whereι:B → His the inclusion map. The statement should then be interpreted as saying that the right hand sidebelongs toB ⊂ H almost surely. In the case whereB is itself a Hilbert space, (5.3) makes sense ifand only if

∫ t0 trS(t− s)QQ∗S(t− s)∗ ds <∞.

It turns out that these two notions of solutions are actuallyequivalent:

Proposition 5.7 If the mild solution is almost surely integrable, then it is also a weak solution.Conversely, every weak solution is a mild solution.

Proof. Note first that, by considering the processx(t) − S(t)x0 and using Proposition 4.7, we canassume without loss of generality thatx0 = 0.

We now assume that the processx(t) defined by (5.3) takes values inB almost surely and weshow that this implies that it satisfies (5.2). Fixing an arbitrary ℓ ∈ D(L†), applyingL∗ℓ to bothsides of (5.3), and integrating the result between0 andt, we obtain:

∫ t

0

〈L∗ℓ, x(s)〉 ds =

∫ t

0

∫ s

0

〈L∗ℓ, S(s− r)QdW (r)〉 ds =

∫ t

0

⟨∫ t

rS∗(s− r)L∗ℓ ds,Q dW (r)

⟩.

Using Proposition 4.7 and the fact that, by Proposition 4.15, S∗ is a strongly continuous semigrouponB†, the closure ofD(L∗) in B∗, we obtain

∫ t

0

〈L∗ℓ, x(s)〉 ds =

∫ t

0

〈S∗(t− r)ℓ,Q dW (r)〉 −∫ t

0

〈ℓ,Q dW (r)〉

=⟨ℓ,

∫ t

0

S(t− r)QdW (r)⟩−

∫ t

0

〈ℓ,Q dW (r)〉

= 〈ℓ, x(t)〉 −∫ t

0

〈ℓ,Q dW (r)〉 ,

Page 46: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

46 L INEAR SPDES / STOCHASTIC CONVOLUTIONS

thus showing that (5.2) holds for everyℓ ∈ D(L†). To show thatx is indeed a weak solution to(5.1), we have to extend this to everyℓ ∈ D(L∗). This however follows immediately from the factthatB† is weak-* dense inB∗, which was the content of Proposition 4.17.

To show the converse, let nowx(t) be any weak solution to (5.1) (again withx0 = 0). Fix anarbitraryℓ ∈ D(L†), some final timet > 0, and consider the functionf (s) = S∗(t − s)ℓ. Sinceℓ ∈ D(L†), it follows from Proposition 4.7 that this function belongs toE def

= C([0, t],D(L†)) ∩C1([0, t],B†). We are going to show that one has for such functions the almost sure identity

〈f (t), x(t)〉 =

∫ t

0

〈f (s) + L∗f (s), x(s)〉 ds +

∫ t

0

〈f (s), Q dW (s)〉 . (5.4)

Since in our casef (s) + L∗f (s) = 0, this implies that the identity

〈ℓ, x(t)〉 =

∫ t

0

〈ℓ, S(t− s)QdW (s)〉 , (5.5)

holds almost surely for allℓ ∈ D(L†). By the closed graph theorem,B† is large enough to separatepoints inB.2 SinceD(L†) is dense inB† and sinceB is separable, it follows that countably manyelements ofD(L†) are already sufficient to separate points inB. This then immediately impliesfrom (5.5) thatx is indeed a mild solution.

It remains to show that (5.4) holds for allf ∈ E . Since linear combinations of functions ofthe typeϕℓ(s) = ℓϕ(s) for ϕ ∈ C1([0, t],R) and ℓ ∈ D(L†) are dense inE (see Exercise 5.9below) and sincex is almost surely integrable, it suffices to show that (5.4) holds for f = ϕℓ.Since〈ℓ,QW (s)〉 is a standard one-dimensional Brownian motion, we can applyIto’s formula toϕ(s)〈ℓ, x(s)〉, yielding

ϕ(t)〈ℓ, x(t)〉 =

∫ t

0

ϕ(s)〈L∗ℓ, x(s)〉 +

∫ t

0

ϕ(s)〈ℓ, x(s)〉 +

∫ t

0

ϕ(s)〈ℓ,Q dW (s)〉 ,

which coincides with (5.4) as required.

Remark 5.8 It is actually possible to show that if the right hand side of (5.3) makes sense for somet, then it makes sense for allt and the resulting process belongs almost surely toLp([0, T ],B) foreveryp. Therefore, the concepts of mild and weak solution actuallyalways coincide. This followsfrom the fact that the covariance ofx(t) increases witht (which is a concept that can easily bemade sense of in Banach spaces as well as Hilbert spaces), seefor example [DJT95].

Exercise 5.9Consider the setting of the proof of Proposition 5.7. Letf ∈ E = C([0, 1],D(L†))∩C1([0, 1],B†) and, forn > 0, definefn on the intervals ∈ [k/n, (k + 1)/n] by cubic splineinterpolation:

fn(s) = f (k/n)(k + 1 − ns)2(1 + 2ns− 2k) + f ((k + 1)/n)(ns − k)2(3 − 2ns+ 2k)

+ (ns− k)(k + 1 − ns)2n(f ((k + 12)/n) − f ((k − 1

2)/n))

+ (ns− k)2(ns− k − 1)n(f ((k + 32)/n) − f ((k + 1

2)/n)) .

Show thatfn is a finite linear combinations of functions of the formℓϕ(s) with ϕ ∈ C1([0, 1],R)and thatfn → f in C([0, 1],D(L†)) ∩ C1([0, 1],B†).

2Assume that, for somex, y ∈ B, we have〈ℓ, x〉 = 〈ℓ, y〉 for everyℓ ∈ D(L∗). We can also assume without lossof generality that the range ofL is B, so thatx = Lx′ andy = Ly′, thus yielding〈L∗ℓ, x′〉 = 〈L∗ℓ, y′〉. SinceL isinjective and has dense domain, the closed graph theorem states that the range ofL∗ is all of B∗, so thatx′

= y′ andthus alsox = y.

Page 47: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

L INEAR SPDES / STOCHASTIC CONVOLUTIONS 47

5.1 Time and space regularity

In this subsection, we are going to study the space and time regularity of solutions to linear stochas-tic PDEs. For example, we are going to see how one can easily derive the fact that the solutions tothe stochastic heat equation are ‘almost’1

4-Holder continuous in time and ‘almost’1

2-Holder con-

tinuous in space. Since we are often going to use the Hilbert-Schmidt norm of a linear operator,we introduce the notation

‖A‖2HS = trAA∗ .

For most of this section, we are going to make use of the theoryof analytic semigroups. How-ever, we start with a very weak regularity result for the solutions to stochastic PDEs whose linearoperatorL generates an arbitraryC0-semigroup:

Theorem 5.10 LetH andK be separable Hilbert spaces, letL be the generator of aC0-semigroupon H, let Q:K → H be a bounded operator and letW be a cylindrical Wiener process onK.Assume furthermore that‖S(t)Q‖HS < ∞ for everyt > 0 and that there existsα ∈ (0, 1

2) such

that∫ 1

0t−2α‖S(t)Q‖2

HSdt <∞. Then the solutionx to (5.1) has almost surely continuous samplepaths inH.

Proof. Note first that‖S(t+s)Q‖HS ≤ ‖S(s)‖‖S(t)Q‖HS, so that the assumptions of the theoremimply that

∫ T0t−2α‖S(t)Q‖2

HSdt < ∞ for everyT > 0. Let us fix an arbitrary terminal timeTfrom now on. Defining the processy by

y(t) =

∫ t

0

(t− s)−αS(t− s)QdW (s) ,

we obtain the existence of a constantC such that

E‖y(t)‖2 =

∫ t

0

(t− s)−2α‖S(t− s)Q‖2HSds =

∫ t

0

s−2α‖S(s)Q‖2HSds ≤ C ,

uniformly for t ∈ [0, T ]. It therefore follows from Fernique’s theorem that for every p > 0 thereexist a constantCp such that

E∫ T

0

‖y(t)‖p dt < Cp . (5.6)

Note now that there exists a constantcα (actuallycα = (sin2πα)/π) such that the identity

∫ t

s(t− r)α−1(r − s)−α dr =

1

cα,

holds for everyt > s. It follows that one has the identity

x(t) = S(t)x0 + cα

∫ t

0

∫ t

s(t− r)α−1(r − s)−αS(t− s) dr QdW (s)

= S(t)x0 + cα

∫ t

0

∫ r

0

(t− r)α−1(r − s)−αS(t− s)QdW (s) dr

= S(t)x0 + cα

∫ t

0

S(t− r)∫ r

0

(r − s)−αS(r − s)QdW (s) (t− r)α−1 dr

= S(t)x0 + cα

∫ t

0

S(t− r)y(r) (t− r)α−1 dr . (5.7)

Page 48: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

48 L INEAR SPDES / STOCHASTIC CONVOLUTIONS

The claim thus follows from (5.6) if we can show that for everyα ∈ (0, 12) there existsp > 0 such

that the map

y 7→ Fy , Fy(t) =

∫ t

0

(t− r)α−1S(t− r)y(r) dr

mapsLp([0, T ],H) into C([0, T ],H). Since the semigroupt 7→ S(t) is uniformly bounded (in theusual operator norm) on any bounded time interval and sincet 7→ (t − r)α−1 belongs toLq forq ∈ [1, 1/(1−α)), we deduce from Holder’s inequality that there exists a constantCT such that onedoes indeed have the bound supt∈[0,T ] ‖Fy(t)‖p ≤ CT

∫ T0 ‖y(t)‖p dt, provided thatp > 1

α . Sincecontinuous functions are dense inLp, the proof is complete if we can show thatFy is continuousfor every continuous functiony with y(0) = 0.

Fixing such ay, we first show thatFy is right-continuous and then that it is left continuous.Fixing t > 0, we have forh > 0 the bound

‖Fy(t+ h) − Fy(t)‖ ≤∫ t

0

‖((t+ h− r)α−1S(h) − (t− r)α−1)S(t− r)y(r)‖ dr

+

∫ t+h

t(t+ h− r)α−1‖S(t+ h− r)y(r)‖ dr

The second term is bounded byO(hδ) for someδ > 0 by Holder’s inequality. It follows fromthe strong continuity ofS that the integrand of the first term converges to0 pointwise ash → 0.Since on the other hand the integrand is bounded byC(t − r)α−1‖y(r)‖ for some constantC,this term also converges to0 by the dominated convergence theorem. This shows thatFy is rightcontinuous.

To show thatFy is also left continuous, we write

‖Fy(t) − Fy(t− h)‖ ≤∫ t−h

0

‖((t− r)α−1S(h) − (t− h− r)α−1)S(t− h− r)y(r)‖ dr

+

∫ t

t−h(t− r)α−1‖S(t− r)y(r)‖ dr .

We bound the second term by Holder’s inequality as before. The second term can be rewritten as

∫ t

0

‖((t+ h− r)α−1S(h) − (t− r)α−1)S(t− r)y(r − h)‖ dr ,

with the understanding thaty(r) = 0 for r < 0. Since we assumed thaty is continuous, we canagain use the dominated convergence theorem to show that this term tends to0 ash→ 0.

Remark 5.11 The trick employed in (5.7) is sometimes called the “factorisation method” andwas introduced in the context of stochastic convolutions byDa Prato, Kwapien, and Zabczyk[DPKZ87, DPZ92a].

This theorem is quite sharp in the sense that, without any further assumption onQ andL, itis not possible in general to deduce thatt 7→ x(t) has more regularity than just continuity, even ifwe start with a very regular initial condition, sayx0 = 0. We illustrate this fact with the followingexercise:

Exercise 5.12Consider the caseH = L2(R), K = R, L = ∂x andQ = g for someg ∈ L2(R)such thatg ≥ 0 andg(x) = |x|−β for someβ ∈ (0, 1

2) and all|x| < 1. This satisfies the conditions

of Theorem 5.10 for anyα < 1.

Page 49: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

L INEAR SPDES / STOCHASTIC CONVOLUTIONS 49

SinceL generates the translation group, the solution to

du(x, t) = ∂xu(x, t) dt + g(x) dW (t) , u(x, 0) = 0 ,

is given by

u(x, t) =

∫ t

0

g(x+ t− s) dW (s) .

Convince yourself that for fixedt, the mapx 7→ u(x, t) is in generalγ-Holder continuous forγ < 1

2− β, but no better. Deduce from this that the mapt 7→ u(·, t) is in general alsoγ-Holder

continuous forγ < 12− β (if we consider it either as anH-valued map or as aCb(R)-valued map),

but cannot be expected to have more regularity than that. Sinceβ can be chosen arbitrarily closeto 1

2, it follows that the exponentα appearing in Theorem 5.10 is in general independent of the

Holder regularity of the solution.

One of the main insights of regularity theory for parabolic PDEs (both deterministic andstochastic) is that space regularity is intimately linked to time regularity in several ways. Veryoften, the knowledge that a solution has a certain spatial regularity for fixed time implies that italso has a certain temporal regularity at a given spatial location.

From a slightly different point of view, if we consider time-regularity of the solution to a PDEviewed as an evolution equation in some infinite-dimensional space of functions, then the amountof regularity that one obtains depends on the functional space under consideration. As a generalrule, the smaller the space (and therefore the more spatial regularity it imposes) the lower theregularity of the solution, viewed as a function with valuesin that space.

We start by giving a general result that tells us precisely inwhich interpolation space one canexpect to find the solution to a linear SPDE associated with ananalytic semigroup. This providesus with the optimal spatial regularity for a given SPDE:

Theorem 5.13 Consider (5.1) on a Hilbert spaceH, assume thatL generates an analytic semi-group, and denote byHα the corresponding interpolation spaces. If there existsα ≥ 0 such thatQ:K → Hα is bounded andβ ∈ (0, 1

2+α] such that‖(−L)−β‖HS <∞ then the solutionx takes

values inHγ for everyγ < γ0 = 12

+ α− β.

Proof. As usual, we can assume without loss of generality that0 belongs to the resolvent set ofL.It suffices to show that

I(T ) def=

∫ T

0

‖(−L)γS(t)Q‖2HSdt <∞ , ∀T > 0 .

SinceQ is assumed to be bounded fromK to Hα, there exists a constantC such that

I(T ) ≤ C

∫ T

0

‖(−L)γS(t)(−L)−α‖2HSdt = C

∫ T

0

‖(−L)γ−αS(t)‖2HSdt .

Since (−L)−β is Hilbert-Schmidt, we have the bound

‖(−L)γ−αS(t)‖HS ≤ ‖(−L)−β‖HS‖(−L)β+γ−αS(t)‖ ≤ C(1 ∨ tα−γ−β) .

For this expression to be square integrable neart = 0, we needα−γ−β > −12, which is precisely

the stated condition.

Exercise 5.14Show that if we are in the setting of Theorem 5.13 andL is selfadjoint, then thesolutions to (5.1) actually belong toHγ for γ = γ0.

Page 50: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

50 L INEAR SPDES / STOCHASTIC CONVOLUTIONS

Exercise 5.15Show that the solution to the stochastic heat equation on [0, 1] with periodic bound-ary conditions has solutions in the fractional Sobolev spaceHs for everys < 1/2. Recall thatHs

is the Hilbert space with scalar product〈f, g〉s =∑

k fkgk(1 + k2)s, wherefk denotes thekthFourier coefficient off .

Exercise 5.16Consider the following modified stochastic heat equation on[0, 1]d with periodicboundary conditions:

dx = ∆x dt+ (1 − ∆)−γ dW ,

whereW is a cylindrical Wiener process onL2([0, 1]d). For any givens ≥ 0, how large doesγneed to be forx to take values inHs ?

Using this knowledge about the spatial regularity of solutions, we can now turn to the time-regularity. We have:

Theorem 5.17 Consider the same setting as in Theorem 5.13 and fixγ < γ0. Then, at all timest > 0, the processx is almost surelyδ-Holder continuous inHγ for everyδ < 1

2∧ (γ0 − γ).

Proof. It follows from Kolmogorov’s continuity criteria, Proposition 3.18, that it suffices to checkthat the bound

E‖x(t) − x(s)‖2γ ≤ C|t− s|1∧2(γ−γ)

holds uniformly ins, t ∈ [t0, T ] for every t0, T > 0 and for everyγ < γ0. Here and below,C isan unspecified constant that changes from expression to expression. Assume thatt > s from nowon. It follows from the semigroup property and the independence of the increments ofW that theidentity

x(t) = S(t− s)x(s) +

∫ t

sS(t− r)QdW (r) , (5.8)

holds almost surely, where the two terms in the sum are independent. This property is also calledthe Markov property. Loosely speaking, it states that the future ofx depends on its present, butnot on its past. This transpires in (5.8) through the fact that the right hand side depends onx(s)and on the increments ofW between timess andt, but it does not depend onx(r) for anyr < s.

Furthermore,x(s) is independent of the increments ofW over the interval [s, t], so that Propo-sition 4.41 allows us to get the bound

E‖x(t) − x(s)‖2γ = E‖S(t− s)x(s) − x(s)‖2

γ +

∫ t−s

0

‖(−L)γS(r)Q‖2HSdr

≤ C|t− s|2(γ−γ)∧2E‖x(s)‖2γ + C

∫ t−s

0

(1 ∨ rα−γ−β)2 dr .

Here, we obtained the bound on the second term in exactly the same way as in the proof ofTheorem 5.13. The claim now follows from the fact thatα− γ − β = (γ0 − γ) − 1

2.

5.2 Long-time behaviour

This section is devoted to the behaviour of the solutions to (5.1) for large times. Let’s again startwith an example that illustrates some of the possible behaviours.

Example 5.18 Let x 7→ V (x) be some smooth ‘potential’ and letH = L2(R,exp(−V (x)) dx).LetS denote the translation semigroup (to the right) onH and denote its generator by−∂x. Let usfirst discuss which conditions onV ensure thatS is a strongly continuous semigroup onH. It is

Page 51: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

L INEAR SPDES / STOCHASTIC CONVOLUTIONS 51

clear that it is a semigroup and thatS(t)u → u for u any smooth function with compact support.It therefore only remains to show that‖S(t)‖ is uniformly bounded fort ∈ [0, 1] say. We have

‖S(t)u‖2 =

∫u2(x− t)e−V (x) dx =

∫u2(x)e−V (x)eV (x)−V (x+t) dx . (5.9)

This shows that a necessary and sufficient condition forS to be a strongly continuous semigroupon H is that, for everyt > 0, there existsCt such that supx∈R(V (x) − V (x + t)) ≤ Ct andsuch thatCt remains bounded ast → 0. Examples of potentials leading to aC0-semigroup arex,√

1 + x2, log(1 + x2), etc or any increasing function. Note however that the potential V (x) = x2

doesnot lead to a strongly continuous semigroup. One different way of interpreting this is toconsider the unitary transformationK:u 7→ exp(1

2V )u from the ‘flat’ spaceL2 intoH. Under this

transformation, the generator−∂x is turned into

−(K−1∂xKu)(x) = −∂xu(x) − 1

2V ′(x)u(x) .

Considering the characterisation of generators ofC0-semigroups given by the Hille-Yosida theo-rem, one would expect this to be the generator of a strongly continuous semigroup ifV ′ is boundedfrom below, which is indeed a sufficient condition.

Let nowV be such thatS is aC0-semigroup and consider the SPDE onH given by

du(x, t) = −∂xu(x, t) dt + f (x) dW (t) , (5.10)

whereW is a one-dimensional Wiener process andf is some function inH. The solution to (5.10)with initial conditionu0 = 0 is given as before by

u(x, t) =

∫ t

0

f (x+ s− t) dW (s) . (5.11)

If we fix the time t, we can make the change of variables 7→ t − s, so thatu(x, t) is equal indistribution to

∫ t0f (x− s) dW (s).

We see that iff happens to be also square integrable (we will assume that this is the case in thesequel and we will also assume thatf is not identically zero), then (5.11) has a limit in distributionast → ∞ given by

u(x) =

∫ ∞

0

f (x− s) dW (s) . (5.12)

It is however not cleara priori that u does belong toH. On one hand, we have the bound

E∫

Ru(x)2e−V (x) dx =

R

∫ ∞

0

f2(x− t) dt e−V (x) dx ≤∫

Rf2(t) dt

Re−V (x) dx ,

thus showing thatu definitely belongs toH if e−V has finite mass. On the other hand, there areexamples whereu ∈ H even thoughe−V has infinite mass. For example, iff (x) = 0 for x ≤ 0,then it is necessary and sufficient to have

∫ ∞0e−V (x) dx <∞. Denote byν the law ofu for further

reference.Furthermore, ife−V is integrable, there are many measures onH that are invariant under the

action of the semigroupS. For example, given a functionh ∈ H which is periodic with periodτ (that isS(τ )h = h), we can check that the push-forward of the Lebesgue measureon [0, τ ]under the mapt 7→ S(t)h is invariant under the action ofS. This is simply a consequence of theinvariance of Lebesgue measure under the shift map. Given any invariant probability measureµh

of this type, letv be anH-valued random variable with lawµh that is independent ofW . We canthen consider the solution to (5.10) with initial conditionv. Since the law ofS(t)v is equal to the

Page 52: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

52 L INEAR SPDES / STOCHASTIC CONVOLUTIONS

law of v by construction, it follows that the law of the solution converges to the distribution of therandom variableu+ v, with the understanding thatu andv are independent.

This shows that in the case∫e−V (x) dx < ∞, it is possible to construct solutionsu to (5.10)

such that the law ofu(· , t) converges toµh ⋆ ν for any periodic functionh.

Exercise 5.19Construct an example of a potentialV such that the semigroupS from the previousexample isnot strongly continuous by choosing it such that limt→0 ‖S(t)‖ = +∞, even thougheach of the operatorsS(t) for t > 0 is bounded! Hint: ChooseV of the formV (x) = x3 −∑

n>0 nW (x−cnn ), whereW is an isolated ‘spike’ andcn are suitably chosen constants.

This example shows that in general, the long-time behaviourof solutions to (5.1) may dependon the choice of initial condition. It also shows that depending on the behaviour ofH, L andQ,the law of the solutions may or may not converge to a limiting distribution in the space in whichsolutions are considered.

In order to formalise the concept of ‘long-time behaviour ofsolutions’ for (5.1), it is convenientto introduce theMarkov semigroupassociated to (5.1). Given a linear SPDE with solutions inB,we can define a familyPt of bounded linear operators onBb(B), the space of Borel measurablebounded functions fromB to R by

(Ptϕ)(x) = Eϕ(S(t)x+

∫ t

0

S(t− s)QdW (s)). (5.13)

The operatorsPt areMarkov operatorsin the sense that the mapA 7→ Pt1A(x) is a probabilitymeasure onB for every fixedx. In particular, one hasPt1 = 1 andPtϕ ≥ 0 if ϕ ≥ 0, that isthe operatorsPt preserve positivity. It follows furthermore from (5.8) andthe independence ofthe increments ofW over disjoint time intervals thatPt satisfies the semigroup propertyPt+s =Pt Ps for any two timess, t ≥ 0.

Exercise 5.20Show thatPt maps the spaceCb(B) of continuous bounded functions fromB to Rinto itself. (Recall that we assumedB to be separable.)

If we denote byPt(x, · ) the law ofS(t)x+∫ t0S(t− s)QdW (s), thenPt can alternatively be

represented as

(Ptϕ)(x) =

Bϕ(y)Pt(x, dy) .

It follows that its dualP∗t acts on measures with finite total variation by

(P∗t µ)(A) =

BPt(x,A)µ(dx) .

Since it preserves the mass of positive measures,P∗t is a continuous map from the spaceP1(B) of

Borel probability measures onB (endowed with the total variation topology) into itself. Itfollowsfrom (5.13) and the definition of the dual thatPtµ is nothing but the law at timet of the solutionto (5.1) with its initial conditionu0 distributed according toµ, independently of the increments ofW over [0, t]. With these notations in place, we define:

Definition 5.21 A Borel probability measureµ onB is aninvariant measurefor (5.1) if P∗t µ = µ

for everyt > 0, wherePt is the Markov semigroup associated to solutions of (5.1) via(5.13).

In the caseB = H where we consider (5.1) on a Hilbert spaceH, the situations in which suchan invariant measure exists are characterised in the following theorem:

Page 53: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

L INEAR SPDES / STOCHASTIC CONVOLUTIONS 53

Theorem 5.22 Consider (5.1) with solutions in a Hilbert spaceH and define the self-adjointoperatorQt:H → H by

Qt =

∫ t

0

S(t)QQ∗S∗(t) dt .

Then there exists an invariant measureµ for (5.1) if and only if one of the following two equivalentconditions are satisfied:

1. There exists a positive definite trace class operatorQ∞:H → H such that the identity2Re〈Q∞L∗x, x〉 + ‖Q∗x‖2 = 0 holds for everyx ∈ D(L∗).

2. One hassupt>0 trQt <∞.

Furthermore, any invariant measure is of the formν ⋆ µ∞, whereν is a measure onH that isinvariant under the action of the semigroupS and µ∞ is the centred Gaussian measure withcovarianceQ∞.

Proof. The proof goes as follows. We first show thatµ being invariant implies that 2. holds. Thenwe show that 2. implies 1., and we conclude the first part by showing that 1. implies the existenceof an invariant measure.

Let us start by showing that ifµ is an invariant measure for (5.1), then 2. is satisfied. Bychoosingϕ(x) = ei〈h,x〉 for arbitraryh ∈ H, it follows from (5.13) that the Fourier transform ofPtµ satisfies the equation

Ptµ(x) = µ(S∗(t)x)e−1

2〈x,Qtx〉 . (5.14)

Taking logarithms and using the fact that|µ(x)| ≤ 1 for everyx ∈ H and every probabilitymeasureµ, It follows that ifµ is invariant, then

〈x,Qtx〉 ≤ −2 log |µ(x)| , ∀x ∈ H , ∀t > 0 . (5.15)

Choose now a sufficiently large value ofR > 0 so thatµ(‖x‖ > R) < 1/8 (say) and define asymmetric positive definite operatorAR:H → H by

〈h,ARh〉 =

‖x‖≤R|〈x, h〉|2 µ(dx) .

Since, for any orthonormal basis, one has‖x‖2 =∑

n |〈x, en〉|2, it follows thatAR is trace classand that trAR ≤ R2. Furthermore, one has the bound

|1 − µ(h)| ≤∫

H|1 − ei〈h,x〉|µ(dx) ≤

√〈h,ARh〉 +

1

4.

Combining this with (5.15), it follows that〈x,Qtx〉 is bounded by2 log4 for everyx ∈ H suchthat〈x,ARx〉 ≤ 1/4 so that, by homogeneity,

〈x,Qtx〉 ≤ (8 log4)〈x,ARx〉 .

It follows that trQt ≤ (8 log4)R2, so that 2. is satisfied. To show that 2. implies 1., note thatsup trQt <∞ implies that

Q∞ =

∫ ∞

0

S(t)QQ∗S∗(t) dt ,

is a well-defined positive definite trace class operator (since t 7→ Q1/2

t forms a Cauchy sequencein the space of Hilbert-Schmidt operators). Furthermore, one has the identity

〈x,Q∞x〉 = 〈S∗(t)x,Q∞S∗(t)x〉 +

∫ t

0

‖Q∗S∗(s)x‖2 ds .

Page 54: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

54 L INEAR SPDES / STOCHASTIC CONVOLUTIONS

for x ∈ D(L∗), both terms on the right hand side of this expression are differentiable. Taking thederivative att = 0, we get

0 = 2Re〈Q∞L∗x, x〉 + ‖Q∗x‖2 ,

which is precisely the identity in 1.Let nowQ∞ be a given operator as in 1., we want to show that the centred Gaussian measure

µ∞ with covarianceQ∞ is indeed invariant forPt. Forx ∈ D(L∗), it follows from Proposition 4.7that the mapFx: t 7→ 〈Q∞S∗(t)x, S∗(t)x〉 is differentiable with derivative given by∂tFx(t) =2Re〈Q∞L∗S∗(t)x, S∗(t)x〉. It follows that

Fx(t) − Fx(0) = 2

∫ t

0

Re〈Q∞L∗S∗(s)x, S∗(s)x〉 ds = −

∫ t

0

‖Q∗S∗(s)x‖2 ds ,

so that one has the identity

Q∞ = S(t)Q∞S∗(t) +

∫ t

0

S(s)QQ∗S∗(s) ds = S(t)Q∞S∗(t) +Qt .

Inserting this into (5.14), the claim follows. Here, we usedthe fact thatD(L∗) is dense inH,which is always the case on a Hilbert space, see [Yos95, p. 196].

Since it is obvious from (5.14) that every measure of the typeν ⋆ µ∞ with ν invariant forSis also invariant forPt, it remains to show that the converse also holds. Letµ be invariant forPt

and defineµt as the push-forward ofµ under the mapS(t). Sinceµt(x) = µ(S∗(t)x), it followsfrom (5.14) and the invariance ofµ that there exists a functionψ:H → R such thatµt(x) → ψ(x)uniformly on bounded sets,ψ S(t)∗ = ψ, and such thatµ(x) = ψ(x) exp(−1

2〈x,Q∞x〉). It

therefore only remains to show that there exists a probability measureν onH such thatψ = ν.In order to show this, it suffices to show that the family of measuresµt is tight, that is for

everyε > 0 there exists a compact setK such thatµt(K) ≥ 1−ε for everyt. Prokhorov’s theorem[Bil68, p. 37] then ensures the existence of a sequencetn increasing to∞ and a measureν suchthatµtn → ν weakly. In particular,µtn(x) → ν(x) for everyx ∈ H, thus concluding the proof.

To show tightness, denote byνt the centred Gaussian measure onH with covarianceQt andnote that one can find a sequence of bounded linear operatorsAn:H → H with the followingproperties:

a. One has‖An+1x‖ ≥ ‖Anx‖ for everyx ∈ H and everyn ≥ 0.

b. The setBR = x : supn ‖Anx‖ ≤ R is compact for everyR > 0.

c. One has supn trAnQ∞A∗n <∞.

(By diagonalisingQ∞, the construction of such a family of operators is similar tothe construction,given a positive sequenceλn with

∑n λn < ∞, of a positive sequencean with limn→∞ an =

+∞ and∑

n anλn < ∞.) Let nowε > 0 be arbitrary. It follows from Prokhorov’s theorem thatthere exists a compact setK ⊂ H such thatµ(H\ K) ≤ ε

2. Furthermore, it follows from property

c. above and the fact thatQ∞ ≥ Qt that there existsR > 0 such thatνt(H \ BR) ≤ ε2. Define a

setK ⊂ H byK = z − y : z ∈ K ,y ∈ BR .

It is straightforward to check, using the Heine-Borel theorem, thatK is precompact.If we now takeX andY to be independentH-valued random variables with lawsµt andνt

respectively, then it follows from the definition of a mild solution and the invariance ofµ thatZ =X + Y has lawµ. Since one has the obvious implicationZ ∈ K&Y ∈ BR ⇒ X ∈ K, itfollows that

µt(H \K) = P(X 6∈ K) ≤ P(Z 6∈ K) + P(Y 6∈ BR) ≤ ε ,

thus showing that the sequenceµt is tight as requested.

Page 55: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

L INEAR SPDES / STOCHASTIC CONVOLUTIONS 55

It is clear from Theorem 5.22 that if (5.1) does have a solution in some Hilbert spaceH andif ‖S(t)‖ → 0 as t → ∞ in that same Hilbert space, then it also possesses a unique invariantmeasure onH. It turns out that as far as the “uniqueness” part of this statement is concerned, it issufficient to have limt→∞ ‖S(t)x‖ = 0 for everyx ∈ H:

Proposition 5.23 If limt→∞ ‖S(t)x‖ = 0 for everyx ∈ H, then (5.1) can have at most oneinvariant measure. Furthermore, if an invariant measureµ∞ exists in this situation, then one hasP∗

t ν → µ∞ weakly for every probability measureν onH.

Proof. In view of Theorem 5.22, the first claim follows if we show thatδ0 is the only measure thatis invariant under the action of the semigroupS. Let ν be an arbitrary probability measure onHsuch thatS(t)∗ν = ν for everyt > 0 and letϕ:H → R be a bounded continuous function. Onthen has indeed ∫

Hϕ(x)ν(dx) = lim

t→∞

Hϕ(S(t)x)ν(dx) = ϕ(0) , (5.16)

where we first used the invariance ofν and then the dominated convergence theorem.To show thatP∗

t ν → µ∞ whenever an invariant measure exists we use the fact that in thiscase, by Theorem 5.22, one hasQt ↑ Q∞ in the trace class topology. Denoting byµt the centredGaussian measure with covarianceQt, the fact thatL2 convergence implies weak convergencethen implies that there exists a measureµ∞ such thatµt → µ∞ weakly. Furthermore, the samereasoning as in (5.16) shows thatS(t)∗ν → δ0 weakly ast → ∞. The claim then follows fromthe fact thatP∗

t ν = (S(t)∗ν) ⋆ µt and from the fact that convolving two probability measures is acontinuous operation in the topology of weak convergence.

Note that the condition limt→∞ ‖S(t)x‖ = 0 for everyx is not sufficient in general to guaran-tee the existence of an invariant measure for (5.1). This canbe seen again with the aid of Exam-ple 5.18. Take an increasing functionV with limx→∞ V (x) = ∞, but such that

∫ ∞0e−V (x) dx =

∞. Then, since exp(V (x) − V (x + t)) ≤ 1 and limt→∞ exp(V (x) − V (x + t)) = 0 for everyx ∈ R, it follows from (5.9) and the dominated convergence theorem that limt→∞ ‖S(t)u‖ = 0 foreveryu ∈ H. However, the fact that

∫ ∞0e−V (x) dx = ∞ prevents the random processu defined in

(5.12) from belonging toH, so that (5.10) has no invariant measure in this particular situation.

Exercise 5.24Show that if (5.1) has an invariant measureµ∞ but there existsx ∈ H such thatlim supt→∞ ‖S(t)x‖ > 0, then one cannot haveP∗

t δx → µ∞ weakly. In this sense, the statementof Proposition 5.23 is sharp.

5.3 Convergence in other topologies

Proposition 5.23 shows that if (5.1) has an invariant measureµ∞, one can in many cases expect tohaveP∗

t ν → µ∞ weakly for every initial measureν. It is however not cleara priori whether sucha convergence also holds in some stronger topologies on the space of probability measures. If weconsider the finite-dimensional case (that isH = Rn for somen > 0), the situation is clear: thecondition limt→∞ ‖S(t)x‖ = 0 for everyx ∈ H then implies that limt→∞ ‖S(t)‖ = 0, so thatLhas to be a matrix whose eigenvalues all have strictly negative real parts. One then has:

Proposition 5.25 In the finite-dimensional case, assume that all eigenvaluesofL strictly negativereal parts and thatQ∞ has full rank. Then, there existsT > 0 such thatP∗

t δx has a smoothdensitypt,x with respect to Lebesgue measure for everyt > T . Furthermore,µ∞ has a smooth

Page 56: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

56 L INEAR SPDES / STOCHASTIC CONVOLUTIONS

densityp∞ with respect to Lebesgue measure and there existsc > 0 such that, for everyλ > 0,one has

limt→∞

ect supy∈Rn

eλ|y||p∞(y) − pt,x(y)| = 0 .

In other words,pt,x converges top∞ exponentially fast in any weighted norm with exponentiallyincreasing weight.

The proof of Proposition 5.25 is left as an exercise. It follows in a straightforward way fromthe explicit expression for the density of a Gaussian measure.

In the infinite-dimensional case, the situation is much lessstraightforward. The reason is thatthere exists no natural reference measure (the equivalent of the Lebesgue measure) with respect towhich one could form densities.

In particular, even though one always has‖µt − µ∞‖∞ → 0 in the finite-dimensional case(provided thatµ∞ exists and that all eigenvalues ofL have strictly negative real part), one cannotexpect this to be true in general. Consider for example the SPDE

dx = −x dt+QdW (t) , x(t) ∈ H ,

whereW is a cylindrical process onH andQ:H → H is a Hilbert-Schmidt operator. One thenhas

Qt =1 − e−2t

2QQ∗ , Q∞ =

1

2QQ∗ .

Combining this with Proposition 3.40 (dilates of an infinite-dimensional Gaussian measure aremutually singular) shows that ifQQ∗ has infinitely many non-zero eigenvalues, thenµt andµ∞are mutually singular in this case.

ν

µOne question that one may ask oneself is under which condi-tions the convergencePν

t → µ∞ takes place in the total variationnorm. The total variation distance between two probabilitymea-sures is determined by their ‘overlap’ as depicted in the figure onthe right: the total variation distance betweenµ andν is given bythe dark gray area, which represents the parts that do not overlap.If µ andν have densitiesDµ andDν with respect to some commonreference measureπ (one can always takeπ = 1

2(µ+ν)), then one

has‖µ− ν‖TV =∫ |Dµ(x) −Dν (x)|π(dx).

Exercise 5.26Show that this definition of the total variation distance does not depend on theparticular choice of a reference measure.

The total variation distance between two probability measuresµ andν on a separable BanachspaceB can alternatively be characterised as

‖µ− ν‖TV = 2 infπ∈C(µ,ν)

π(x 6= y) , (5.17)

where the infimum runs over the setC(µ, ν) of all probability measuresπ onB×B with marginalsµ andν. (This set is also called the set of allcouplingsof µ andν.) In other words, if the totalvariation distance betweenµ andν is smaller than2ε, then it is possible to constructB-valuedrandom variablesX andY with respective lawsµ andν such thatX = Y with probability largerthan1 − ε.

This yields a straightforward interpretation to the total variation convergencePνt → µ∞: for

large times, a sample drawn from the invariant distributionis with high probability indistinguish-able from a sample drawn from the Markov process at timet. Compare this with the notion of

Page 57: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

L INEAR SPDES / STOCHASTIC CONVOLUTIONS 57

weak convergence which relies on the topology of the underlying space and only asserts that thetwo samples are close with high probability in the sense determined by the topology in question.For example,‖δx − δy‖ is always equal to2 if x 6= y, whereasδx → δy weakly if x→ y.

Exercise 5.27Show that the two definitions of the total variation distancegiven above are indeedequivalent by constructing a coupling that realises the infimum in (5.17). It is useful for this toconsider the measureµ ∧ ν which, in µ andν have densitiesDµ andDν with respect to somecommon reference measureπ, is given by(Dµ(x) ∧ Dν(x))π(dx).

An alternative characterisation of the total variation norm is as the dual norm to the supremumnorm on the spaceBb(B) of bounded Borel measurable functions onB:

‖µ− ν‖TV = sup∫

ϕ(x)µ(dx) −∫ϕ(x)ν(dx) : sup

x∈B|ϕ(x)| ≤ 1

.

It turns out that, instead of showing directly thatP∗t ν → µ∞ in the total variation norm, it is

somewhat easier to show that one hasP∗t ν → µ∞ in a type of ‘weighted total variation norm’,

which is slightly stronger than the usual total variation norm. Given a weight functionV :B → R+,we define a weighted supremum norm on measurable functions by

‖ϕ‖V = supx∈B

|ϕ(x)|1 + V (x)

,

as well as the dual norm on measures by

‖µ− ν‖TV,V = sup∫

ϕ(x)µ(dx) −∫ϕ(x)ν(dx) : ‖ϕ‖V ≤ 1

. (5.18)

Since we assumed thatV > 0, it is obvious that one has the relation‖µ− ν‖TV ≤ ‖µ− ν‖TV,V ,so that convergence in the weighted norm immediately implies convergence in the usual totalvariation norm. By considering the Jordan decomposition ofµ− ν = + − −, it is clear that thesupremum in (5.18) is attained at functionsϕ such thatϕ(x) = 1 + V (x) for +-almost everyxandϕ(x) = −1 − V (x) for −-almost everyx. In other words, an alternative expression for theweighted total variation norm is given by

‖µ− ν‖TV,V =

X(1 + V (x)) |µ− ν|(dx) , (5.19)

just like the total variation norm is given by‖µ− ν‖TV = |µ− ν|(X ).The reason why it turns out to be easier to work in a weighted norm is the following: For a

suitable choice ofV , we are going to see that in a large class of examples, one can construct aweight functionV and find constantsc < 1 andT > 0 such that

‖P∗Tµ− P∗

T ν‖TV,V ≤ c‖µ− ν‖TV,V , (5.20)

for any two probability measuresµ andν. This implies that the mapPT is a contraction on thespace of probability measures, which must therefore have exactly one fixed point, yielding boththe existence of an invariant measureµ∞ and the exponential convergence ofP∗

t ν toµ∞ for everyinitial probability measureν which integratesV .

This argument is based on the following abstract result thatworks for arbitrary Markov semi-groups on Polish (that is separable, complete, metric) spaces:

Page 58: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

58 L INEAR SPDES / STOCHASTIC CONVOLUTIONS

Theorem 5.28 (Harris) LetPt be a Markov semigroup on a Polish spaceX such that there existsa timeT0 > 0 and a functionV :X → R+ such that:

• The exist constantsγ < 1 andK > 0 such thatPT0V (x) ≤ γV (x) +K for everyx ∈ X .

• For everyK ′ > 0, there existsδ > 0 such that‖P∗T0δx − P∗

T0δy‖TV ≤ 2 − δ for every pair

x, y such thatV (x) + V (y) ≤ K ′.

Then, there existsT > 0 such that (5.20) holds for somec < 1.

In a nutshell, the argument for the proof of Theorem 5.28 is the following. There are twomechanisms that allow to decrease the weighted total variation distance between two probabilitymeasures:

2. The mass of the two measures moves into regions where the weight V (x) becomes smaller.

1. The two measures ‘spread out’ in such a way that there is an increase in the overlap betweenthem.

The two conditions of Theorem 5.28 are tailored such as to combine these two effects in order toobtain an exponential convergence ofP∗

t µ to the unique invariant measure forPt ast → ∞.

Remark 5.29 The condition that there existsδ > 0 such that‖P∗T0δx − P∗

T0δy‖TV ≤ 2 − δ for

anyx, y ∈ A is sometimes referred to in the literature as the setA being asmall set.

Remark 5.30 Traditional proofs of Theorem 5.28 as given for example in [MT93] tend to makeuse of coupling arguments and estimates of return times of the Markov process described byPt tolevel sets ofV . The basic idea is to make use of (5.17) to get a bound on the total variation betweenP∗

Tµ andP∗T ν by constructing an explicit coupling between two instancesxt andyt of a Markov

process with transition semigroupPt. Because of the second assumption in Theorem 5.28,one can construct this coupling in such a way that every time the process (xt, yt) returns to somesufficiently large level set ofV , there is a probabilityδ that xt′ = yt′ for t′ ≥ t + T0. Thefirst assumption then guarantees that these return times have exponential tails and a renewal-typeargument allows to conclude.

Such proofs are quite involved at a technical level and are byconsequent not so easy to follow,especially if one wishes to get a spectral gap bound like (5.20) and not “just” an exponential decaybound like

‖P∗T δx − P∗

T δy‖TV ≤ Ce−γT ,

with a constantC depending onx andy. Furthermore, they require more background in advancedprobability theory than what is assumed for the scope of these notes.

The elementary proof given here is taken from [HM08b] and is based on the arguments firstexposed in [HM08a]. It has the disadvantage of being less intuitively appealing than proofs basedon coupling arguments, but this is more than offset by the advantage of fitting into less than twopages without having to appeal to advanced mathematical concepts. It also has the advantage ofbeing generalisable to situations where level sets of the Lyapunov function are not small sets, see[HMS09].

Before we turn to the proof of Theorem 5.28, we define for everyβ > 0 the distance function

dβ(x, y) =

0 if x = y

2 + βV (x) + βV (y) if x 6= y.

Page 59: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

L INEAR SPDES / STOCHASTIC CONVOLUTIONS 59

One can check that the positivity ofV implies that this is indeed a distance function, albeit a ratherstrange one. We define the corresponding ‘Lipschitz’ seminorm on functionsϕ:X → R by

‖ϕ‖Lipβ= sup

x 6=y

|ϕ(x) − ϕ(y)|dβ(x, y)

.

We are going to make use of the following lemma:

Lemma 5.31 With the above notations, one has‖ϕ‖Lipβ= infc∈R ‖ϕ+ c‖βV .

Proof. It is obvious that‖ϕ‖Lipβ≤ ‖ϕ + c‖βV for everyc ∈ R. On the other hand, ifx0 is any

fixed point inX , one has

|ϕ(x)| ≤ |ϕ(x0)| + ‖ϕ‖Lipβ(2 + βV (x) + βV (x0)) , (5.21)

for all x ∈ X . Set now

c = − supx∈X

(ϕ(x) − ‖ϕ‖Lipβ(1 + βV (x))) .

It follows from (5.21) thatc is finite. Furthermore, one has

ϕ(y) + c ≤ ϕ(y) − (ϕ(y) − ‖ϕ‖Lipβ(1 + βV (y))) = ‖ϕ‖Lipβ

(1 + βV (y)) ,

and

ϕ(y) + c = infx∈X

(ϕ(y) − ϕ(x) + ‖ϕ‖Lipβ(1 + βV (x)))

≥ infx∈X

‖ϕ‖Lipβ(1 + βV (x) − dβ(x, y)) = −‖ϕ‖Lipβ

(1 + βV (y)) ,

which implies that‖ϕ + c‖βV ≤ ‖ϕ‖Lipβ.

Proof of Theorem 5.28.During this proof, we use the notationP def= PT0

for simplicity. We aregoing to show that there exists a choice ofβ ∈ (0, 1) such that there isα < 1 satisfying the bound

|Pϕ(x) − Pϕ(y)| ≤ αdβ(x, y)‖ϕ‖Lipβ, (5.22)

uniformly over all measurable functionsϕ:X → R and all pairsx, y ∈ X . Note that this isequivalent to the bound‖Pϕ‖Lipβ

≤ α‖ϕ‖Lipβ. Combining this with Lemma 5.31 and (5.19), we

obtain that, forT = nT0, one has the bound

‖P∗Tµ− P∗

T ν‖TV,V = inf‖ϕ‖V ≤1

X(PTϕ)(x) (µ− ν)(dx)

= inf‖ϕ‖V ≤1

infc∈R

X((PTϕ)(x) + c) (µ− ν)(dx)

≤ inf‖ϕ‖V ≤1

infc∈R

‖PTϕ+ c‖V

X(1 + V (x)) |µ− ν|(dx)

= inf‖ϕ‖V ≤1

β−1 infc∈R

‖PTϕ+ c‖βV ‖µ− ν‖TV,V

= β−1 inf‖ϕ‖V ≤1

‖PTϕ‖Lipβ‖µ− ν‖TV,V

≤ αn

βinf

‖ϕ‖V ≤1‖ϕ‖Lipβ

‖µ− ν‖TV,V ≤ αn

β2‖µ− ν‖TV,V .

Page 60: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

60 L INEAR SPDES / STOCHASTIC CONVOLUTIONS

Sinceα < 1, the result (5.20) then follows at once by choosingn sufficiently large.Let us turn now to (5.22). Ifx = y, there is nothing to prove, so we assume thatx 6= y.

Fix an arbitrary non-constant functionϕ and assume without loss of generality that‖ϕ‖Lipβ= 1.

It follows from Lemma 5.31 that, by adding a constant to it if necessary, we can assume that|ϕ(x) + c| ≤ (1 + βV (x)).

This immediately implies the bound

|Pϕ(x) − Pϕ(y)| ≤ (2 + βPV (x) + βPV (y))

≤ 2 + 2βK + βγV (x) + βγV (y) .

Suppose now thatx andy are such thatV (x) + V (y) ≥ 2K+21−γ . A straightforward calculation

shows that in this case, for everyβ > 0 there existsα1 < 1 such that (5.22) holds. One can choosefor example

α1 = 1 − 1

2

β

1 − γ + βK + β.

Take now a pairx, y such thatV (x) + V (y) ≤ 2K+21−γ . Note that we can writeϕ = ϕ1 + ϕ2

with |ϕ1(x)| ≤ 1 and |ϕ2(x)| ≤ βV (x). (Setϕ1(x) = (ϕ(x) ∧ 1) ∨ (−1).) It follows from theassumptions onV that there exists someδ > 0 such that

|Pϕ(x) − Pϕ(y)| ≤ |Pϕ1(x) − Pϕ1(y)| + |Pϕ2(x) − Pϕ2(y)|≤ ‖P∗δx − P∗δy‖TV + β(PV )(x) + β(PV )(y)

≤ 2 − δ + β(2K + γV (x) + γV (y)) ≤ 2 − δ + 2βK1 + γ

1 − γ.

If we now chooseβ < δ4K

1−γ1+γ , (5.22) holds withα = 1 − 1

2δ < 1. Combining this estimate with

the one obtained previously shows that one can indeed findα andβ such that (5.22) holds for allx andy in X , thus concluding the proof of Theorem 5.28.

One could argue that this theorem does not guarantee the existence of an invariant measuresince the fact thatP∗

Tµ = µ does not guarantee thatPtµ = µ for everyt. However, one has:

Lemma 5.32 If there exists a probability measure such thatP∗Tµ = µ for some fixed timeT > 0,

then there also exists a probability measureµ∞ such thatP∗t µ∞ = µ∞ for all t > 0.

Proof. Define the measureµ∞ by

µ∞(A) =1

T

∫ T

0

P∗t µ(A) dt .

It is then a straightforward exercise to check that it does have the requested property.

We are now able to use this theorem to obtain the following result on the convergence of thesolutions to (5.1) to an invariant measure in the total variation topology:

Theorem 5.33 Assume that (5.1) has a solution in some Hilbert spaceH and that there exists atimeT such that‖S(T )‖ < 1 and such thatS(T ) mapsH into the image ofQ1/2

T . Then (5.1)admits a unique invariant measureµ∞ and there existsγ > 0 such that

‖P∗t ν − µ∞‖TV ≤ C(ν)e−γt ,

for every probability measureν onH with finite second moment.

Page 61: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

L INEAR SPDES / STOCHASTIC CONVOLUTIONS 61

Proof. Let V (x) = ‖x‖ and denote byµt the centred Gaussian measure with covarianceQt. Wethen have

PtV (x) ≤ ‖S(t)x‖ +

H‖x‖µt(dx) ,

which shows that the first assumption of Theorem 5.28 is satisfied. A simple variation of Exer-cise 3.34 (use the decompositionH = H ⊕ kerK) shows that the Cameron-Martin space ofµT isgiven by ImQ1/2

T endowed with the norm

‖h‖T = inf‖x‖ : h = Q1/2

T x .

Since we assumed thatS(T ) mapsH into the image ofQ1/2

T , it follows from the closed graphtheorem that there exists a constantC such that‖S(T )x‖T ≤ C‖x‖ for everyx ∈ H. It followsfrom the Cameron-Martin formula that the total variation distance betweenP∗

T δx andP∗T δy is

equal to the total variation distance betweenN (0, 1) andN (‖S(T )x − S(T )y‖T , 1), so that thesecond assumption of Theorem 5.28 is also satisfied.

Both the existence ofµ∞ and the exponential convergence ofP∗t ν towards it is then a con-

sequence of Banach’s fixed point theorem in the dual to the space of measurable functions with‖ϕ‖V <∞.

Remark 5.34 The proof of Theorem 5.33 shows that if its assumptions are satisfied, then the mapx 7→ P∗

t δx is continuous in the total variation distance fort ≥ T .

Remark 5.35 Since ImS(t) decreases witht and ImQ1/2

t increases witht, it follows that if

ImS(t) ⊂ ImQ1/2

t for somet, then this also holds for any subsequent time. This is consis-tent with the fact that Markov operators are contraction operators in the supremum norm, so thatif x 7→ P∗

t δx is continuous in the total variation distance for somet, the same must be true for allsubsequent times.

While Theorem 5.33 is very general, it is sometimes not straightforward at all to verify itsassumptions for arbitrary linear SPDEs. In particular, it might be complicateda priori to determinethe image ofQ1/2

t . The task of identifying this subspace can be made easier by the following result:

Proposition 5.36 The image ofQ1/2

t is equal to the image of the mapAt given by

At:L2([0, t],K) → H , At:h 7→

∫ t

0

S(s)Qh(s) ds .

Proof. SinceQt = AtA∗t , we can use polar decomposition [RS80, Thm VI.10] to find an isometry

Jt of (kerAt)⊥ ⊂ H (which extends toH by settingJtx = 0 for x ∈ kerAt) such thatQ1/2

t =AtJt.

Alternatively, one can show that, in the situation of Theorem 3.44, the Cameron-Martin spaceof µ = A∗µ is given by the image underA of the Cameron-Martin space ofµ. This follows fromProposition 3.31 since, as a consequence of the definition ofthe push-forward of a measure, thecomposition withA yields an isometry betweenL2(B, µ) andL2(B, µ).

One case where it is straightforward to check whetherS(t) mapsH into the image ofQ1/2

t isthe following:

Page 62: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

62 L INEAR SPDES / STOCHASTIC CONVOLUTIONS

Example 5.37 Consider the case whereK = H, L is selfadjoint, and there exists a functionf : R → R+ such thatQ = f (L). (This identity should be interpreted in the sense of the functionalcalculus already mentioned in Theorem 4.18.)

If we assume furthermore thatf (λ) > 0 for everyλ ∈ R, then the existence of an invariantmeasure is equivalent to the existence of a constantc > 0 such that〈x,Lx〉 ≤ −c‖x‖2 for everyx ∈ H. Using functional calculus, we see that the operatorQT is then given by

QT =f2(L)2L

(1 − e−2LT ) ,

and, for everyT > 0, the Cameron-Martin norm forµT is equivalent to the norm

‖x‖f =∥∥∥√L

f (L)x∥∥∥ .

In order to obtain convergenceP∗t ν → µ∞ in the total variation topology, it is therefore sufficient

that there exist constantsc, C > 0 such thatf (λ) ≥ Ce−cλ for λ ≥ 0.

This shows that one cannot expect convergence in the total variation topology to take placeunder similarly weak conditions as in Proposition 5.23. In particular, convergence in the totalvariation topology requires some non-degeneracy of the driving noise which was not the case forweak convergence.

Exercise 5.38Consider again the caseK = H andL selfadjoint with〈x,Lx〉 ≤ −c‖x‖2 for somec > 0. Assume furthermore thatQ is selfadjoint and thatQ andL commute, so that there existsa spaceL2(M, µ) isometric toH and such that bothQ andL can be realised as multiplicationoperators (sayf andg respectively) on that space. Show that:

• In order for there to exist solutions inH, the setAQdef= λ ∈ M : f (λ) 6= 0 must be

‘essentially countable’ in the sense that it can be written as the union of a countable set anda set ofµ-measure0.

• If there existsT > 0 such that ImS(T ) ⊂ ImQ1/2

T , thenµ is purely atomic and there existssome possibly different timet > 0 such thatS(t) is trace class.

Exercise 5.37 suggests that there are many cases where, ifS(t) mapsH to ImQ1/2

t for somet > 0, then it does so for allt > 0. It also shows that, in the case whereL andQ are selfadjoint andcommute,Q must have an orthnormal basis of eigenvectors with all eigenvalues non-zero. Bothstatements are certainly not true in general. We see from thefollowing example that there can beinfinite-dimensional situations whereS(t) mapsH to ImQ

1/2

t even thoughQ is of rank one!

Example 5.39 Consider the spaceH = R⊕L2([0, 1],R) and denote elements ofH by (a, u) witha ∈ R. Consider the semigroupS onH given by

S(T )(a, u) = (a, u) , u(x) =

a for x ≤ t

u(x− t) for x > t.

It is easy to check thatS is indeed a strongly continuous semigroup onH and we denote itsgenerator by (0, ∂x). We drive this equation by adding noise only on the first component ofH. Inother words, we setK = R andQ1 = (1, 0) so that, formally, we are considering the equation

da = dW (t) , du = ∂xu dt .

Page 63: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

SEMILINEAR SPDES 63

Even though, at a formal level, the equations fora andu look decoupled, they are actually coupledvia the domain of the generator ofS. In order to check whetherS(t) mapsH into Ht

def= ImQ

1/2

t ,we make use of Proposition 5.36. This shows thatHt consists of elements of the form

∫ t

0

h(s)χs ds ,

whereh ∈ L2([0, t]) andχs is the image of (1, 0) underS(s), which is given by (1,1[0,s∧1] ). Onthe other hand, the image ofS(t) consists of all elements (a, u) such thatu(x) = a for x ≤ t.Since one hasχs(x) = 0 for x > s, it is obvious that ImS(t) 6⊂ Ht for t < 1.

On the other hand, fort > 1, given anya > 0, we can find a functionh ∈ L2([0, t]) such thath(x) = 0 for x ≤ 1 and

∫ t0h(x) dx = a. Since, fors ≥ 1, one hasχs(x) = 1 for everyx ∈ [0, 1],

it follows that one does have ImS(t) ⊂ Ht for t < 1.

6 Semilinear SPDEs

Now that we have a good working knowledge of the behaviour of solutions to linear stochasticPDEs, we are prepared to turn to nonlinear SPDEs. In these notes, we will restrict ourselves to thestudy ofsemilinearSPDEs withadditivenoise.

In this context, asemilinearSPDE is one such that the nonlinearity can be treated as a pertur-bation of the linear part of the equation. The wordadditivefor the noise refers to the fact that, asin (5.1), we will only consider noises described by a fixed operatorQ:K → B, rather than by anoperator-valued function of the solution. We will therefore consider equations of the type

dx = Lxdt+ F (x) dt +QdW (t) , x(0) = x0 ∈ B , (6.1)

whereL is the generator of a strongly continuous semigroupS on a separable Banach spaceB,W is a cylindrical Wiener process on some separable Hilbert spaceK, andQ:K → B is bounded.Furthermore,F is a measurable function from some linear subspaceD(F ) ⊂ B into B. We willsay that a processt 7→ x(t) ∈ D(F ) is amild solutionto (6.1) if the identity

x(t) = S(t)x0 +

∫ t

0

S(t− s)F (x(s)) ds +

∫ t

0

S(t− s)QdW (s) . (6.2)

holds almost surely for everyt > 0.

6.1 Local solutions

Throughout this section, we will make the standing assumption that the linearisation to (6.1) (thatis the corresponding equation withF = 0) does have a continuous solution with values inB. Inorder to simplify notations, we are going to write

WL(t) def=

∫ t

0

S(t− s)QdW (s) ,

In the nonlinear case, there are situations where solutionsexplode after a finite (but possiblyrandom) time interval. In order to be able to account for sucha situation, we introduce the notionof a local solution. Recall first that, given a cylindrical Wiener processW defined on some proba-bility space (Ω,P), we can associate to it the natural filtrationFtt≥0 generated by the incrementsof W . In other words, for everyt > 0, Ft is the smallestσ-algebra such that the random variablesW (s) −W (r) for s, r ≤ t are allFt-measurable.

In this context, astopping timeis a positive random variableτ such that the eventτ ≤ T isFT -measurable for everyT ≥ 0. With this definition at hand, we have:

Page 64: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

64 SEMILINEAR SPDES

Definition 6.1 A local mild solutionto (6.1) is aD(F )-valued stochastic processx together witha stopping timeτ such thatτ > 0 almost surely and such that the identity

x(t) = S(t)x0 +

∫ t

0

S(t− s)F (x(s)) ds +WL(t) , (6.3)

holds almost surely for every stopping timet such thatt < τ almost surely.

Remark 6.2 In some situations, it might be of advantage to allowF to be a map fromD(F ) toB′

for some superspaceB′ such thatB ⊂ B′ densely and such thatS(t) extends to a continuous linearmap fromB′ to B. The prime example of such a spaceB′ is an interpolation space with negativeindex in the case where the semigroupS is analytic. The definition of a mild solution carries overto this situation without any change.

A local mild solution (x, τ ) is calledmaximalif, for every mild solution (x, τ ), one hasτ ≤ τalmost surely.

Exercise 6.3Show that mild solutions to (6.1) coincide with mild solutions to (6.1) withL re-placed byL = L− c andF replaced byF = F + c for any constantc ∈ R.

Our first result on the existence and uniqueness of mild solutions to nonlinear SPDEs makesthe strong assumption that the nonlinearityF is defined on the whole spaceB and that it is locallyLipschitz there:

Theorem 6.4 Consider (6.1) on a Banach spaceB and assume thatWL is a continuousB-valuedprocess. Assume furthermore thatF :B → B is such that it restriction to every bounded set is Lip-schitz continuous. Then, there exists a unique maximal mildsolution(x, τ ) to (6.1). Furthermore,this solution has continuous sample paths and one haslimt↑τ ‖x(t)‖ = ∞ almost surely on the setτ <∞.

If F is globally Lipschitz continuous, thenτ = ∞ almost surely.

Proof. Given any realisationWL ∈ C(R+,B) of the stochastic convolution, we are going to showthat there exists a timeτ > 0 depending only onWL up to timeτ and a unique continuous functionx: [0, τ ) → B such that (6.3) holds for everyt < τ . Furthermore, the construction will be suchthat eitherτ = ∞, or one has limt↑τ ‖x(t)‖ = ∞, thus showing that (x, τ ) is maximal.

The proof relies on the Banach fixed point theorem. Given a terminal timeT > 0 and acontinuous functiong: R+ → B, we define the mapMg,T : C([0, T ],B) → C([0, T ],B) by

(Mg,Tu)(t) =

∫ t

0

S(t− s)F (u(s)) ds + g(t) .

The proof then works in almost exactly the same way as the usual proof of uniqueness of a maximalsolution for ordinary differential equations with Lipschitz coefficients. Note that we can assumewithout loss of generality that the semigroupS is bounded, since we can always subtract a constanttoL and add it back toF . Using the fact that‖S(t)‖ ≤ M for some constantM , one has for anyT > 0 the bound

supt∈[0,T ]

‖Mg,Tu(t) −Mg,Tv(t)‖ ≤MT supt∈[0,T ]

‖F (u(t)) − F (v(t))‖ . (6.4)

Furthermore, one has

supt∈[0,T ]

‖Mg,Tu(t) − g(t)‖ ≤MT supt∈[0,T ]

‖F (u(t))‖ . (6.5)

Page 65: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

SEMILINEAR SPDES 65

Fix now an arbitrary constantR > 0. SinceF is locally Lipschitz, it follows from (6.4) and(6.5) that there exists a maximalT > 0 such thatMg,T maps the ball of radiusR aroundg inC([0, T ],B) into itself and is a contraction with contraction constant1

2there. This shows that

Mg,T has a unique fixed point forT small enough and the choice ofT was obviously performedby using knowledge ofg only up to timeT . Settingg(t) = S(t)x0 +WL(t), the pair (x, T ), whereT is as just constructed andx is the unique fixed point ofMg,T thus yields a local mild solution to(6.1).

In order to construct the maximal solution, we iterate this construction in the same way asin the finite-dimensional case. Uniqueness and continuity in time also follows as in the finite-dimensional case. In the case whereF is globally Lipschitz continuous, denote its Lipschitzconstant byK. We then see from (6.4) thatMg,T is a contraction on the whole space forT <1/(KM ), so that the choice ofT can be made independently of the initial condition, thus showingthat the solution exists for all times.

While this setting is very straightforward and did not make use of any PDE theory, it never-theless allows to construct solutions for an important class of examples, since every compositionoperator of the form(N (u))(ξ) = (f u)(ξ) is locally Lipschitz onC(K,Rd) (for K a compactsubset ofRn, say), provided thatf : Rd → Rd is locally Lipschitz continuous.

A much larger class of examples can be found if we restrict theregularity properties ofF , butassume thatL generates an analytic semigroup:

Theorem 6.5 LetL generate an analytic semigroup onB (denote byBα,α ∈ R the correspondinginterpolation spaces) and assume thatQ is such that the stochastic convolutionWL has almostsurely continuous sample paths inBα for someα ≥ 0. Assume furthermore that there existsγ ≥ 0andδ ∈ [0, 1) such that, for everyβ ∈ [0, γ], the mapF extends to a locally Lipschitz continuousmap fromBβ to Bβ−δ that grows at most polynomially.

Then, (6.1) has a unique maximal mild solution(x, τ ) with x taking values inBβ for every

β < β⋆def= α ∧ (γ + 1 − δ).

Proof. In order to show that (6.1) has a unique mild solution, we proceed in a way similar to theproof of Theorem 6.4 and we make use of Exercise 4.38 to bound‖S(t − s)F (u(s))‖ in terms of‖F (u(s))‖−δ . This yields instead of (6.4) the bound

supt∈[0,T ]

‖Mg,Tu(t) −Mg,T v(t)‖ ≤MT 1−δ supt∈[0,T ]

‖F (u(t)) − F (v(t))‖ , (6.6)

and similarly for (6.5), thus showing that (6.1) has a uniqueB-valued maximal mild solution (x, τ ).In order to show thatx(t) actually belongs toBβ for t < τ andβ ≤ α ∧ γ, we make use of a‘bootstrapping’ argument, which is essentially an induction onβ.

For notational convenience, we introduce the family of processesW aL (t) =

∫ tat S(t −

r)QdW (r), wherea ∈ [0, 1) is a parameter. Note that one has the identity

W aL (t) = WL(t) − S((1 − a)t)WL(at)

so that ifWL is continuous with values inBα, then the same is true forW aL .

We are actually going to show the following stronger statement. Fix an arbitrary timeT > 0.Then, for everyβ ∈ [0, β⋆) there exist exponentspβ ≥ 1, qβ ≥ 0, and constantsa ∈ (0, 1), C > 0such that the bound

‖xt‖β ≤ Ct−qβ

(1 + sup

s∈[at,t]‖xs‖ + sup

0≤s≤t‖W a

L (s)‖β

)pβ, (6.7)

Page 66: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

66 SEMILINEAR SPDES

holds almost surely for allt ∈ (0, T ].The bound (6.7) is obviously true forβ = 0 with pβ = 1 andqβ = 0. Assume now that, for

someβ = β0 ∈ [0, γ], the bound (6.7) holds. We will then argue that, for anyε ∈ (0, 1 − δ), thestatement (6.7) also holds forβ = β0 + ε (and therefore also for all intermediate values), providedthat we adjust the constants appearing in the expression. Since it is possible to go fromβ = 0 toany value ofβ < γ + 1 − δ in a finite number of such steps, the claim then follows at once.

From the definition of a mild solution, we have the identity

xt = S((1 − a)t)xat +

∫ t

atS(t− s)F (x(s)) ds +W a

L (t) .

Sinceβ ≤ γ, it follows from our polynomial growth assumption onF that there existsn > 0 suchthat, fort ∈ (0, T ],

‖xt‖β+ε ≤ Ct−ε‖xat‖β + ‖W aL (t)‖β+ε + C

∫ t

at(t− s)−(ε+δ)(1 + ‖xs‖β)n ds

≤ C(t−ε + t1−ε−δ) supat≤s≤t

(1 + ‖xs‖nβ) + ‖W a

L (t)‖β+ε

≤ Ct−ε supat≤s≤t

(1 + ‖xs‖nβ) + ‖W a

L (t)‖β+ε .

Here, the constantC depends on everything butt andx0. Using the induction hypothesis, thisyields the bound

‖xt‖β+ε ≤ Ct−ε−nqβ(1 + sups∈[a2t,t]

‖xs‖ + sup0≤s≤t

‖W aL (s)‖β)npβ + ‖W a

L (t)‖β+ε ,

thus showing that (6.7) does indeed hold forβ = β0 + ε, provided that we replacea by a2 and setpβ+ε = npβ andqβ+ε = ε+ nqβ. This concludes the proof of Theorem 6.5.

6.2 Interpolation inequalities and Sobolev embeddings

The kind of bootstrapping arguments used in the proof of Theorem 6.5 above are extremely usefulto obtain regularity properties of the solutions to semilinear parabolic stochastic PDEs. However,they rely on obtaining bounds on the regularity ofF from one interpolation space into another.In many important situations, the interpolation spaces turn out to be given by fractional Sobolevspaces. For the purpose of these notes, we are going to restrict ourselves to the analytically easiersituation where the space variable of the stochastic PDE under consideration takes values in thed-dimensional torusTd. In other words, we restrict ourselves to situations where the operatordescribing the linearised evolution is endowed with periodic boundary conditions.

This will make the proof of the embedding theorems presentedin these notes technically morestraightforward. For the corresponding embeddings with more general boundary conditions oreven on more general manifolds or unbounded domains, we refer for example the comprehensiveseries of monographs [Tri83, Tri92, Tri06].

Recall that, given a distributionu ∈ L2(Td), we can decompose it as a Fourier series:

u(x) =∑

k∈Zd

ukei〈k,x〉 ,

where the identity holds for (Lebesgue) almost everyx ∈ Td. Furthermore, theL2 norm ofu isgiven by Parseval’s identity‖u‖2 =

∑ |uk|2. We have

Page 67: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

SEMILINEAR SPDES 67

Definition 6.6 The fractional Sobolev spaceHs(Td) for s ≥ 0 is given by the subspace of func-tionsu ∈ L2(Td) such that

‖u‖2Hs

def=

k∈Zd

(1 + |k|2)s|uk|2 <∞ . (6.8)

Note that this is a separable Hilbert space and thatH0 = L2. For s < 0, we defineHs as theclosure ofL2 under the norm (6.8).

Remark 6.7 By the spectral decomposition theorem,Hs for s > 0 is the domain of (1 − ∆)s/2

and we have‖u‖Hs = ‖(1 − ∆)s/2u‖L2 .

A very important situation is the case whereL is a differential operator with constant coef-ficients (formallyL = P (∂x) for some polynomialP : Rd → R) andH is either anL2 space orsome Sobolev space. In this case, one has

Lemma 6.8 Assume thatP : Rd → R is a polynomial of degree2m such that there exist positiveconstantsc, C such that the bound

(−1)m+1c|k|2m ≤ P (k) ≤ (−1)m+1C|k|2m ,

holds for all k outside of some compact set. Then, the operatorP (∂x) generates an analyticsemigroup onH = Hs for everys ∈ R and the corresponding interpolation spaces are given byHα = Hs+2mα.

Proof. By inspection, noting thatP (∂x) is conjugate to the multiplication operator byP (ik) viathe Fourier decomposition.

Note first that for any two positive real numbersa andb and any pair of positive conjugateexponentsp andq, one has Young’s inequality

ab ≤ ap

p+bq

q,

1

p+

1

q= 1 . (6.9)

As a corollary of this elementary bound, we obtain Holder’sinequality, which can be viewed as ageneralisation of the Cauchy-Schwartz inequality:

Proposition 6.9 (Holder’s inequality) Let (M, µ) be a measure space and letp andq be a pairof positive conjugate exponents. Then, for any pair of measurable functionsu, v:M → R, onehas ∫

M|u(x)v(x)|µ(dx) ≤ ‖u‖p ‖v‖q ,

for any pair(p, q) of conjugate exponents.

Proof. It follows from (6.9) that, for everyε > 0, one has the bound

M|u(x)v(x)|µ(dx) ≤ εp‖u‖p

p

p+

‖v‖qq

qεq,

Settingε = ‖v‖1

pq ‖u‖

1

p−1

p concludes the proof.

Page 68: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

68 SEMILINEAR SPDES

One interesting consequence of Holder’s inequality is thefollowing interpolation inequalityfor powers of selfadjoint operators:

Proposition 6.10 LetA be a positive definite selfadjoint operator on a separable Hilbert spaceH and letα ∈ [0, 1]. Then, the bound‖Aαu‖ ≤ ‖Au‖α‖u‖1−α holds for everyu ∈ D(Aα) ⊂ H.

Proof. The extreme casesα ∈ 0, 1 are obvious, so we assumeα ∈ (0, 1). By the spectraltheorem, we can assume thatH = L2(M, µ) and thatA is the multiplication operator by somepositive functionf . Applying Holder’s inequality withp = 1/α andq = 1/(1 − α), one then has

‖Aαu‖2 =

∫f2α(x)u2(x)µ(dx) =

∫|fu|2α(x) |u|2−2α(x)µ(dx)

≤(∫

f2(x)u2(x)µ(dx))α (∫

u2(x)µ(dx))1−α

,

which is exactly the bound we wanted to show.

An immediate corollary is:

Corollary 6.11 For anyt > s and anyr ∈ [s, t], the bound

‖u‖t−sHr ≤ ‖u‖r−s

Ht ‖u‖t−rHs (6.10)

is valid for everyu ∈ Ht.

Proof. Apply Proposition 6.10 withH = Hs,A = (1 − ∆)t−s2 , andα = (r − s)/(t− s).

Exercise 6.12As a consequence of Holder’s inequality, show that for any collection ofnmeasur-able functions and any exponentspi > 1 such that

∑ni=1 p

−1i = 1, one has the bound

M|u1(x) · · · un(x)|µ(dx) ≤ ‖u1‖p1

· · · ‖un‖pn .

Following our earlier discussion regarding fractional Sobolev spaces, it would be convenientto be able to bound theLp norm of a function in terms of one of the fractional Sobolev norms. Itturns out that bounding theL∞ norm is rather straightforward:

Lemma 6.13 For everys > d2, the spaceHs(Td) is contained in the space of continuous functions

and there exists a constantC such that‖u‖L∞ ≤ C‖u‖Hs .

Proof. It follows from Cauchy-Schwarz that

k∈Zd

|uk| ≤( ∑

k∈Zd

(1 + |k|2)s|uk|2)1/2( ∑

k∈Zd

(1 + |k|2)−s)1/2

.

Since the sum in the second factor converges if and only ifs > d2, the claim follows.

Exercise 6.14 In dimensiond = 2, find an example of an unbounded functionu such that‖u‖H1 <∞.

Exercise 6.15Show that fors > d2,Hs is contained in the spaceCα(Td) for everyα < s− d

2.

Page 69: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

SEMILINEAR SPDES 69

As a consequence of Lemma 6.13, we are able to obtain a more general Sobolev embeddingfor all Lp spaces:

Theorem 6.16 (Sobolev embeddings)Let p ∈ [2,∞]. Then, for everys > d2− d

p , the space

Hs(Td) is contained in the spaceLp(Td) and there exists a constantC such that‖u‖Lp ≤C‖u‖Hs .

Proof. The casep = 2 is obvious and the casep = ∞ has already been shown, so it remains toshow the claim forp ∈ (2,∞). The idea is to divide Fourier space into ‘blocks’ corresponding todifferent length scales and to estimate separately theLp norm of every block. More precisely, wedefine a sequence of functionsu(n) by

u−1(x) = u0 , u(n)(x) =∑

2n≤|k|<2n+1

ukei〈k,x〉 ,

so that one hasu =∑

n≥−1 u(n). Forn ≥ 0, one has

‖u(n)‖pLp ≤ ‖u(n)‖2

L2‖u(n)‖2−pL∞ . (6.11)

Choose nows′ = d2

+ ε and note that the construction ofu(n), together with Lemma 6.13, guaran-tees that one has the bounds

‖u(n)‖L2 ≤ 2−ns‖u(n)‖Hs , ‖u(n)‖L∞ ≤ C‖u(n)‖Hs′ ≤ C2n(s′−s)‖u(n)‖Hs .

Inserting this into (6.11), we obtain

‖u(n)‖Lp ≤ C‖u(n)‖Hs2n((s′−s) 2−pp

− 2sp) = C‖u(n)‖Hs2n( d

p− d

2−s) ≤ C‖u‖Hs2n( d

p− d

2−s) .

It follows that‖u‖Lp ≤ |u0| +∑

n≥0 ‖u(n)‖Lp ≤ C‖u‖Hs , provided that the exponent appearingin this expression is negative, which is precisely the case whenevers > d

2− d

p .

Remark 6.17 Forp 6= ∞, one actually hasHs(Td) ⊂ Lp(Td) with s = d2− d

p , but this borderlinecase is more difficult to obtain.

Combining the Sobolev embedding theorem and Holder’s inequality, it is eventually possibleto estimate in a similar way the fractional Sobolev norm of a product of two functions:

Theorem 6.18 Let r, s andt be positive exponents such thats ∧ r > t ands+ r > t+ d2. Then,

if u ∈ Hr andv ∈ Hs, the productw = uv belongs toHt.

Proof. Defineu(n) andv(m) as in the proof of the Sobolev embedding theorem and setw(m,n) =u(m)v(n). Note that one hasw(m,n)

k = 0 if |k| > 23+(m∨n). It then follows from Holder’s inequalitythat if p, q ≥ 2 are such thatp−1 + q−1 = 1

2, one has the bound

‖w(m,n)‖Ht ≤ C2t(m∨n)‖w(m,n)‖L2 ≤ C2t(m∨n)‖u(m)‖Lp‖v(n)‖Lq .

Assume now thatm > n. The conditions onr, s andt are such that there exists a pair (p, q) asabove with

r > t+d

2− d

p, s >

d

2− d

q.

Page 70: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

70 SEMILINEAR SPDES

In particular, we can find someε > 0 such that

‖u(m)‖Lp ≤ C‖u(m)‖Hr−t−ε ≤ C2−m(t+ε)‖u‖Hr , ‖v(n)‖Lp ≤ C‖v(n)‖Hs−ε ≤ C2−mε‖v‖Hs .

Inserting this into the previous expression, we find that

‖w(m,n)‖Ht ≤ C2−mε−nε‖u‖Hr‖u‖Hs .

Since our assumptions are symmetric inu andv, we obtain a similar bound for the casem ≤ n,so that

‖w‖Ht ≤∑

m,n>0

‖w(m,n)‖Ht ≤ C‖u‖Hr‖u‖Hs

m,n>0

2−mε−nε ≤ C‖u‖Hr‖u‖Hs ,

as requested.

Exercise 6.19Show that the conclusion of Theorem 6.18 still holds ifs = t = r is a positiveinteger, provided thats > d

2.

Exercise 6.20Similarly to Exercise 6.12, show that one can iterate this bound so that ifsi > s ≥ 0are exponents such that

∑i si > s+ (n−1)d

2, then one has the bound

‖u1 · · · un‖s ≤ C‖u1‖s1· · · ‖un‖sn .

Hint: The cases ≥ d2

is simple, so it suffices to consider the cases < d2.

The functional inequalities from the previous section allow to check that the assumptions ofTheorems 6.4 and 6.5 are verified by a number of interesting equations.

6.3 Reaction-diffusion equations

This is a class of partial differential equations that modelthe evolution of reactants in a gel, de-scribed by a spatial domainD. They are of the type

du = ∆u dt + f u dt+QdW (t) , (6.12)

whereu(x, t) ∈ Rd, x ∈ D, describes the density of the various components of the reaction attime t and locationx. The nonlinearityf : Rd → Rd describes the reaction itself and the term∆udescribes the diffusion of the reactants in the gel. The noise termQdW should be interpreted asa crude attempt to describe the fluctuations in the quantities of reactant due both to the discretenature of the underlying particle system and the interaction with the environment3.

Equations of the type (6.12) also appear in the theory of amplitude equations, where they ap-pear as a kind of ‘normal form’ near a change of linear instability. In this particular case, oneoften hasd = 2 andf (u) = κu − u|u|2 for someκ ∈ R, see [BHP05]. A natural choice for theBanach spaceB in which to consider solutions to (6.12) is the space of bounded continuous func-tionsB = C(D,Rd) since the composition operatoru 7→ f u (also sometimes called Nemitskiioperator) then mapsB into itself and inherits the regularity properties off . If the domainD issufficiently regular then the semigroup generated by the Laplacian∆ is the Markov semigroup fora Brownian motion inD. The precise description of the domain of∆ is related to the behaviourof the corresponding Brownian motion when it hits the boundary of D. In order to avoid techni-calities, let us assume from now on thatD consists of the torusTn, so that there is no boundary toconsider.

3A more realistic description of these fluctuations would result in a covarianceQ that depends on the solutionu.Since we have not developed the tools necessary to treat thistype of equations, we restrict ourselves to the simple caseof a constant covariance operatorQ.

Page 71: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

SEMILINEAR SPDES 71

Exercise 6.21Show that in this case,∆ generates an analytic semigroup onB = C(Tn,Rd) andthat forα ∈ N, the interpolation spaceBα is given byBα = C2α(Tn,Rd).

If Q is such that the stochastic convolution has continuous sample paths inB almost surelyandf is locally Lipschitz continuous, we can directly apply Theorem 6.4 to obtain the existenceof a unique local solution to (6.12) inC(Tn,Rd). We would like to obtain conditions onf thatensure that this local solution is also a global solution, that is the blow-up timeτ is equal to infinityalmost surely.

If f happens to be a globally Lipschitz continuous function, then the existence and uniquenessof global solutions follows from Theorem 6.4. Obtaining global solutions whenf is not globallyLipschitz continuous is slightly more tricky. The idea is toobtain somea priori estimate on somefunctional of the solution which dominates the supremum norm and ensures that it cannot blow upin finite time.

Let us first consider the deterministic part of the equation alone. The structure we are goingto exploit is the fact that the Laplacian generates a Markov semigroup. We have the followinggeneral result:

Lemma 6.22 LetPt be a Feller4 Markov semigroup over a Polish spaceX . Extend it toCb(X ,Rd)by applying it to each component independently. LetV : Rd → R+ be convex (that isV (αx+ (1−α)y) ≤ αV (x) + (1 − α)V (y) for all x, y ∈ Rd andα ∈ [0, 1]) and defineV : Cb(X ,Rd) → R+

by V (u) = supx∈X V (u(x)). ThenV (Ptu) ≤ V (u) for everyt ≥ 0 and everyu ∈ Cb(X ,Rd).

Proof. Note first that ifV is convex, then it is continuous and, for every probability measureµ onRd, one has the inequality

V(∫

Rdxµ(dx)

)≤

RdV (x)µ(dx) . (6.13)

One can indeed check by induction that (6.13) holds ifµ is a ‘simple’ measure consisting of aconvex combination of finitely many Dirac measures. The general case then follows from thecontinuity of V and the fact that every probability measure onRd can be approximated (in thetopology of weak convergence) by a sequence of simple measures.

Denote now byPt(x, · ) the transition probabilities forPt, so thatPtu is given by the formula(Ptu)(x) =

∫X u(y)Pt(x, dy). One then has

V (Ptu) = supx∈X

V(∫

Xu(y)Pt(x, dy)

)= sup

x∈XV

(∫

Rdv (u∗Pt(x, · ))(dv)

)

≤ supx∈X

RdV (v) (u∗Pt(x, · ))(dv) = sup

x∈X

XV (u(y))Pt(x, dy)

≤ supy∈X

V (u(y)) = V (u) ,

as required.

In particular, this result can be applied to the semigroupS(t) generated by the Laplacian in(6.12), so thatV (S(t)u) ≤ V (u) for every convexV and everyu ∈ C(Tn,Rd). This is the mainingredient allowing us to obtain a priori estimates on the solution to (6.12):

4A Markov semigroup is Feller if it maps continuous functionsinto continuous functions.

Page 72: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

72 SEMILINEAR SPDES

Proposition 6.23 Consider the setting for equation (6.12) described above. Assume thatQ issuch thatW∆ has continuous sample paths inB = C(Tn,Rd) and that there exists a convex twicedifferentiable functionV : Rd → R+ such thatlim|x|→∞ V (x) = ∞ and such that, for everyR > 0, there exists a constantC such that〈∇V (x), f (x + y)〉 ≤ CV (x) for everyx ∈ Rd andeveryy with |y| ≤ R. Then (6.12) has a global solution inB.

Proof. We denote byu(t) the local mild solution to (6.12). Our aim is to obtaina priori bounds onV (u(t)) that are sufficiently good to show that one cannot have limt→τ ‖u(t)‖ = ∞ for any finite(stopping) timeτ .

Settingv(t) = u(t)−W∆(t), the definition of a mild solution shows thatv satisfies the equation

v(t) = e∆tv(0) +

∫ t

0

e∆(t−s)(f (v(s) +W∆(s))) ds def= e∆tv(0) +

∫ t

0

e∆(t−s)F (s) ds .

Sincet 7→ v(t) is continuous by Theorem 6.4 and the same holds forW∆ by assumption, thefunction t 7→ F (t) is continuous inB. Therefore, one has

limh→0

1

h

(∫ h

0

e∆(h−s)F (s) ds − he∆hF (0))

= 0 .

We therefore obtain forV (v) the bound

lim suph→0

h−1(V (v(t+ h)) − V (v(t))) = lim suph→0

h−1(V (v(t) + hF (t)) − V (v(t))) .

SinceV belongs toC2 by assumption, we have

V (v(t) + hF (t)) = supx∈Tn

(V (v(x, t)) + h〈∇V (v(x, t)), F (x, t)〉) + O(h2) .

Using the definition ofF and the assumptions onV , it follows that for everyR > 0 there exists aconstantC such that, provided that‖W∆(t)‖ ≤ R, one has

lim suph→0

h−1(V (v(t+ h)) − V (v(t))) ≤ CV (v(t)) .

A standard comparison argument for ODEs then shows thatV (v(t)) cannot blow up as long as‖W∆(t)‖ does not blow up, thus concluding the proof.

Exercise 6.24 In the cased = 1, show that the assumptions of Proposition 6.23 are satisfiedforV (u) = u2 if f is any polynomial of odd degree with negative leading coefficient.

Exercise 6.25Show that in the cased = 3, (6.12) has a unique global solution when we take forf the right-hand side of the Lorentz attractor:

f (u) =

σ(u2 − u1)u1(− u3) − u2

u1u2 − βu3

,

where, σ andβ are three arbitrary positive constants.

Page 73: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

SEMILINEAR SPDES 73

6.4 The stochastic Navier-Stokes equations

The Navier-Stokes equations govern the motion of an idealised incompressible fluid and are oneof the most studied models in the theory of partial differential equations, as well as in theoreticaland mathematical physics. We are going to use the symbolu(x, t) to denote the instantaneousvelocity of the fluid at positionx ∈ Rd and timet, so thatu(x, t) ∈ Rd. With these notations, thedeterministic Navier-Stokes equations are given by

∂tu = ν∆u− (u · ∇)u−∇p , divu = 0 , (6.14)

where the (scalar) pressurep is determined implicitly by the incompressibility condition divu = 0andν > 0 denotes the kinematic viscosity of the fluid. In principle, these equations make sensefor any value of the dimensiond. However, even the deterministic equations (6.14) are known tohave global smooth solutions for arbitrary smooth initial data only in dimensiond = 2. We aretherefore going to restrict ourselves to the two-dimensional case in the sequel. As we saw alreadyin the introduction, solutions to (6.14) tend to0 as times goes to∞, so that an external forcing isrequired in order to obtain an interesting stationary regime.

One natural way of adding an external forcing is given by a stochastic force that is white intime and admits a translation invariant correlation function in space. In this way, it is possibleto maintain the translation invariance of the equations (ina statistical sense), even though theforcing is not constant in space. We are furthermore going torestrict ourselves to solutions thatare periodic in space in order to avoid the difficulties arising from partial differential equations inunbounded domains. The incompressible stochastic Navier-Stokes equations on the torusR2 aregiven by

du = ν∆u dt− (u · ∇)u dt−∇p dt+QdW (t) , divu = 0 , (6.15)

wherep andν > 0 are as above. In order to put these equations into the more familiar form(6.1), we denote byΠ the orthogonal projection onto the space of divergence-free vector fields. InFourier components,Π is given by

(Πu)k = uk − k〈k, uk〉|k|2 . (6.16)

(Note here that the Fourier coefficients of a vector field are themselves vectors.) With this notation,one has

du = ν∆u dt+ Π(u · ∇)u dt +QdW (t) def= ∆u dt+ F (u) dt +QdW (t) .

It is clear from (6.16) thatΠ is a contraction in any fractional Sobolev space. Fort ≥ 0, it thereforefollows that

‖F (u)‖Ht ≤ ‖u‖Hs‖∇u‖Hr ≤ C‖u‖2Hs , (6.17)

provided thats > t ∨ ( t2

+ 12

+ d4) = t ∨ ( t

2+ 1). In particular, this bound holds fors = t+ 1,

provided thatt > 0.Furthermore, in this setting, sinceL is just the Laplacian, if we chooseH = Hs, then the

interpolation spacesHα are given byHα = Hs+2α. This allows us to apply Theorem 6.5 to showthat the stochastic Navier-Stokes equations admit local solutions for any initial condition inHs,provided thats > 1, and that the stochastic convolution takes values in that space. Furthermore,these solutions will immediately lie in any higher order Sobolev space, all the way up to the spacein which the stochastic convolution lies.

This line of reasoning does however not yield anya priori bounds on the solution, so that itmay blow up in finite time. The Navier-Stokes nonlinearity satisfies 〈u, F (u)〉 = 0 (the scalar

Page 74: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

74 SEMILINEAR SPDES

product is theL2 scalar product), so one may expect bounds inL2, but we do not know at thisstage whether initial conditions inL2 do indeed lead to local solutions. We would therefore liketo obtain bounds onF (u) in negative Sobolev spaces. In order to do this, we exploit the fact thatH−s can naturally be identified with the dual ofHs, so that

‖F (u)‖H−s = sup∫

F (u)(x) v(x) dx , v ∈ C∞ , ‖v‖Hs ≤ 1.

Making use of the fact that we are working with divergence-free vector fields, one has (usingEinstein’s convention of summation over repeated indices):

∫F (u) v dx = −

∫vjui∂iuj dx ≤ ‖v‖Lp‖∇u‖L2‖u‖Lq ,

provided thatp, q > 2 and 1p + 1

q = 12. We now make use of the fact that‖u‖Lq ≤ Cq‖∇u‖2 for

everyq ∈ [2,∞) (but q = ∞ is excluded) to conclude that for everys > 0 there exists a constantC such that

‖F (u)‖−s ≤ C‖∇u‖2L2 . (6.18)

In order to geta priori bounds for the solution to the 2D stochastic Navier-Stokes equations,one can then make use of the following trick: introduce the vorticity w = ∇∧ u = ∂1u2 − ∂2u1.Then, provided that

∫u dx = 0 (which, provided that the range ofQ consists of vector fields with

mean0, is a condition that is preserved under the solutions to (6.15)), the vorticity is sufficient todescribeu completely by making use of the incompressibility assumption divu = 0. Actually, themapw 7→ u can be described explicitly by

uk = (Kw)k =k⊥wk

|k|2 , (k1, k2)⊥ = (−k2, k1) .

This shows in particular thatK is actually a bounded operator fromHs intoHs+1 for everys. Itfollows that one can rewrite (6.15) as

dw = ν∆w dt+ (Kw · ∇)w dt+ Q dW (t) def= ∆w dt+ F (w) dt + Q dW (t) . (6.19)

SinceF (w) = ∇∧ F (Kw), it follows from (6.18) that one has the bounds

‖F (w)‖−1−s ≤ C‖w‖2L2 ,

so thatF is a locally Lipschitz continuous map fromL2 into Hs for everys < −1. This showsthat (6.19) has unique local solutions for every initial condition in L2 and that these solutionsimmediately become as regular as the corresponding stochastic convolution.

Denote now byWL the stochastic convolution

WL(t) =

∫ t

0

e∆(t−s)Q dW (s) ,

and define the processv(t) = w(t) −WL(t). With this notation,v is the unique solution to therandom PDE

∂tv = ν∆v + F (v + WL) .

It follows from (6.17) that‖F (w)‖H−s ≤ C‖w‖2Hs , provided thats > 1/3. For the sake of

simplicity, assume from now on thatWL takes values inH1/2 almost surely. Using the fact that〈v, F (v)〉 = 0, we then obtain for theL2-norm ofv the followinga priori bound:

∂t‖v‖2 = −2ν‖∇v‖2 − 2〈WL, F (v + WL)〉

Page 75: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

SEMILINEAR SPDES 75

≤ −2ν‖∇v‖2 + 2‖WL‖H1/2‖v + WL‖2

H1/2

≤ −2ν‖∇v‖2 + 4‖WL‖H1/2(‖v‖2

H1/2 + ‖WL‖2

H1/2)

≤ −2ν‖∇v‖2 + 4‖WL‖H1/2(‖v‖‖∇v‖ + ‖WL‖2

H1/2)

≤ 8

ν‖WL‖2

H1/2‖v‖2 + 2‖WL‖3

H1/2 , (6.20)

so that global existence of solutions then follows from Gronwall’s inequality.This calculation is only formal, since it is not known in general whether theL2-norm ofv is

differentiable as a function of time. The bound that one obtains from (6.20) can however be maderigorous in a very similar way as for the example of the stochastic reaction-diffusion equation, sothat we will not reproduce this argument here.

Page 76: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

76 SEMILINEAR SPDES

References

[BHP05] D. BLOMKER, M. HAIRER, and G. A. PAVLIOTIS . Modulation equations: stochastic bifurca-tion in large domains.Comm. Math. Phys.258, no. 2, (2005), 479–512.

[Bil68] P. BILLINGSLEY . Convergence of probability measures. John Wiley & Sons Inc., New York,1968.

[Boc33] S. BOCHNER. Integration von Funktionen, deren Werte die Elemente eines Vektorraumes sind.Fund. Math.20, (1933), 262–276.

[Bog98] V. I. BOGACHEV. Gaussian measures, vol. 62 of Mathematical Surveys and Monographs.American Mathematical Society, Providence, RI, 1998.

[Bog07] V. I. BOGACHEV. Measure theory. Vol. I, II. Springer-Verlag, Berlin, 2007.

[Bor75] C. BORELL. The Brunn-Minkowski inequality in Gauss space.Invent. Math.30, no. 2, (1975),207–216.

[Dav80] E. B. DAVIES. One-parameter semigroups, vol. 15 of London Mathematical Society Mono-graphs. Academic Press Inc. [Harcourt Brace Jovanovich Publishers], London, 1980.

[DJT95] J. DIESTEL, H. JARCHOW, and A. TONGE. Absolutely summing operators, vol. 43 ofCam-bridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 1995.

[DPKZ87] G. DA PRATO, S. KWAPIEN, and J. ZABCZYK . Regularity of solutions of linear stochasticequations in Hilbert spaces.Stochastics23, no. 1, (1987), 1–23.

[DPZ92a] G. DA PRATO and J. ZABCZYK . A note on stochastic convolution.Stochastic Anal. Appl.10,no. 2, (1992), 143–153.

[DPZ92b] G. DA PRATO and J. ZABCZYK . Stochastic equations in infinite dimensions, vol. 44 ofEncy-clopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 1992.

[DPZ96] G. DA PRATO and J. ZABCZYK . Ergodicity for infinite-dimensional systems, vol. 229 ofLon-don Mathematical Society Lecture Note Series. Cambridge University Press, Cambridge, 1996.

[Gar02] R. J. GARDNER. The Brunn-Minkowski inequality.Bull. Amer. Math. Soc. (N.S.)39, no. 3,(2002), 355–405 (electronic).

[Hil53] T. H. H ILDEBRANDT. Integration in abstract spaces.Bull. Amer. Math. Soc.59, (1953), 111–139.

[HM08a] M. HAIRER and J. C. MATTINGLY . Spectral gaps in Wasserstein distances and the 2D stochas-tic Navier-Stokes equations.Ann. Probab.36, no. 6, (2008), 2050–2091.

[HM08b] M. HAIRER and J. C. MATTINGLY . Yet another look at Harris ergodic theorem for Markovchains, 2008. Preprint.

[HMS09] M. HAIRER, J. C. MATTINGLY , and M. SCHEUTZOW. A weak form of Harris’s theorem withapplications to stochastic delay equations, 2009. Preprint.

[Kat82] M. P. KATS. Continuity of universally measurable linear mappings.Sibirsk. Mat. Zh.23, no. 3,(1982), 83–90, 221.

[Law04] G. F. LAWLER. An introduction to the stochastic Loewner evolution. InRandom walks andgeometry, 261–293. Walter de Gruyter GmbH & Co. KG, Berlin, 2004.

[Led96] M. LEDOUX. Isoperimetry and Gaussian analysis. InLectures on probability theory andstatistics (Saint-Flour, 1994), vol. 1648 ofLecture Notes in Math., 165–294. Springer, Berlin,1996.

[LT91] M. L EDOUX and M. TALAGRAND . Probability in Banach spaces, vol. 23 of Ergebnisseder Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics and Related Areas (3)].Springer-Verlag, Berlin, 1991. Isoperimetry and processes.

[Lun95] A. LUNARDI. Analytic semigroups and optimal regularity in parabolic problems. Progress inNonlinear Differential Equations and their Applications,16. Birkhauser Verlag, Basel, 1995.

Page 77: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

SEMILINEAR SPDES 77

[MT93] S. P. MEYN and R. L. TWEEDIE. Markov chains and stochastic stability. Communicationsand Control Engineering Series. Springer-Verlag London Ltd., London, 1993.

[Paz83] A. PAZY . Semigroups of linear operators and applications to partialdifferential equations,vol. 44 ofApplied Mathematical Sciences. Springer-Verlag, New York, 1983.

[Phi55] R. S. PHILLIPS. The adjoint semi-group.Pacific J. Math.5, (1955), 269–283.

[PR07] C. PREVOT and M. ROCKNER. A concise course on stochastic partial differential equations,vol. 1905 ofLecture Notes in Mathematics. Springer, Berlin, 2007.

[PZ07] S. PESZAT and J. ZABCZYK . Stochastic partial differential equations with Levy noise, vol. 113of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge,2007. An evolution equation approach.

[RS80] M. REED and B. SIMON. Methods of Modern Mathematical Physics I–IV. Academic Press,San Diego, California, 1980.

[RW94] L. C. G. ROGERSand D. WILLIAMS . Diffusions, Markov processes, and martingales. Vol. 1.Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statis-tics. John Wiley & Sons Ltd., Chichester, second ed., 1994. Foundations.

[RY94] D. REVUZ and M. YOR. Continuous martingales and Brownian motion, vol. 293 ofGrundlehren der Mathematischen Wissenschaften [Fundamental Principles of MathematicalSciences]. Springer-Verlag, Berlin, second ed., 1994.

[SC74] V. N. SUDAKOV and B. S. CIREL′SON. Extremal properties of half-spaces for sphericallyinvariant measures.Zap. Naucn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI)41, (1974),14–24, 165. Problems in the theory of probability distributions, II.

[Sch48] E. SCHMIDT. Die Brunn-Minkowskische Ungleichung und ihr Spiegelbildsowie die isoperi-metrische Eigenschaft der Kugel in der euklidischen und nichteuklidischen Geometrie. I.Math.Nachr.1, (1948), 81–157.

[Sch66] L. SCHWARTZ. Sur le theoreme du graphe ferme.C. R. Acad. Sci. Paris Ser. A-B263, (1966),A602–A605.

[She07] S. SHEFFIELD. Gaussian free fields for mathematicians.Probab. Theory Related Fields139,no. 3-4, (2007), 521–541.

[SS05] M. SANZ-SOLE. Malliavin calculus. Fundamental Sciences. EPFL Press, Lausanne, 2005.With applications to stochastic partial differential equations.

[SS06] O. SCHRAMM and S. SHEFFIELD. Contour lines of the two-dimensional discrete gaussian freefield, 2006. Preprint.

[Tri83] H. TRIEBEL. Theory of function spaces, vol. 38 of Mathematik und ihre Anwendungen inPhysik und Technik [Mathematics and its Applications in Physics and Technology]. Akademis-che Verlagsgesellschaft Geest & Portig K.-G., Leipzig, 1983.

[Tri92] H. TRIEBEL. Theory of function spaces. II, vol. 84 ofMonographs in Mathematics. BirkhauserVerlag, Basel, 1992.

[Tri06] H. TRIEBEL. Theory of function spaces. III, vol. 100 of Monographs in Mathematics.Birkhauser Verlag, Basel, 2006.

[Yos95] K. YOSIDA. Functional analysis. Classics in Mathematics. Springer-Verlag, Berlin, 1995.Reprint of the sixth (1980) edition.

Page 78: arXiv:0907.4178v1 [math.PR] 23 Jul 2009 · 2009-07-23 · SOME MOTIVATING EXAMPLES 3 Formally taking the continuum limit (with the scalings k≈ νN2 and σ≈ N), we can infer that

78 INDEX

Index

adjointoperator, 21, 28semigroup, 33

approximationspatial, 74temporal, 74

Borell-Sudakov-Cirel’son inequality, seeisoperimetric inequality

Cameron-Martinnorm, 16space, 16theorem, 19

canonical process, 24closed

operator, 28coupling, 56, 58covariance operator, 9

Feller semigroup, 70Fernique’s theorem, 10filtration, 63Fourier transform, 9

Gaussian measure, 8

Holder inequality, 67Harris’s theorem, 57Hille-Yosida theorem, 31

interpolation, 39interpolation space, 38invariant measure, 52isoperimetric inequality, 22

Kolmogorovcontinuity criterion, 12extension theorem, 12

Markovoperator, 52property, 50semigroup, 51

mild solution, 44, 45, 63

polarisation, 18, 21

reproducing kernel, 17

resolvent, 28identity, 29set, 28

semigroupanalytic, 34selfadjoint, 34strongly continuous (C0), 27

Sobolevembedding, 66, 68space, 49space (fractional), 66, 68

spectral decomposition theorem, 34spectral gap, 58stochastic

heat equation, 4Navier-Stokes equations, 3, 72reaction-diffusion equation, 70

stochastic convolution, 45, 64, 65, 70stochastic integral, 26stopping time, 63

total variation distance, 56

weak convergence, 13, 56, 62weak solution, 44white noise, 26Wiener

measure, 16process (cylindrical), 24, 25space (abstract), 16

Young inequality, 67

zero-one law, 23