Istituto Nazionale di Fisica ... moby.mib.infn.it/~zaffaron/distribuzioni.pdf

Chapter 1

Distributions

The concept of distribution generalises and extends the concept of function.

A distribution is basically defined by its action on a set of test functions.

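It is indeed a linear functional $T$

\[ T : \mathcal{F} \to \mathbb{C}, \qquad \varphi(x) \mapsto T(\varphi), \tag{1.1} \]

from a space of functions $\mathcal{F}$ of real variable, called the test functions, to $\mathbb{C}$.

For example, to any integrable real function $f(x)$ we can associate the linear functional $T_f : \mathcal{F} \to \mathbb{R}$ defined as

\[ \varphi(x) \mapsto T_f(\varphi) = \int_{-\infty}^{\infty} dx\, f(x)\,\varphi(x) . \tag{1.2} \]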

The idea is that, knowing the value of the integral Tf (φ) on a large enough

number of test functions φ, we can reconstruct the original function f . Al-

lowing for general linear functionals we extend the concept of function.

We can define operations on distributions even in cases where they are

not well defined on functions. For example, we will be able to define the

derivative of a discontinuous function when it is seen as a distribution. In

the world of distributions, everything is infinitely differentiable and limits

always commute with other operations. The distributions are also called

generalized functions.


1.1 Test functions

In this section we discuss the space F of test functions and its properties.

A distribution will be a linear functional on F . For simplicity, we consider

test functions of real variable. We want to choose test functions φ ∈ F

that are as smooth as possible, in particular that are infinitely differentiable,

φ ∈ C∞(R). Moreover, in order to guarantee the existence of the integral in

(1.2) we require that φ vanishes sufficiently rapidly at infinity. There are two

canonical choices for F , the space of bump functions D(R) and the space of

rapidly decreasing functions S(R), which we now discuss.

1.1.1 The space D(R)

The support of a function of real variable is the closure of the set of points

where the function is non-zero

\[ \operatorname{supp}(f) = \overline{\{x \in \mathbb{R} \mid f(x) \neq 0\}} . \tag{1.3} \]

A function with compact support is a function whose support is compact (a

closed and bounded subset of R).

We call D(R) the space of real functions of real variable that are infinitely

differentiable and with compact support,

D(R) = {φ(x) |φ ∈ C∞(R) with compact support} (1.4)

D(R) is obviously a vector space. It is also a topological space, but the

notion of limit that is useful for the theory of distribution is not induced by

any norm or metric. For our purposes, it is enough to define the notion of

convergence of a sequence in the topology of D(R). We say that the sequence

{φn ∈ D(R)} converges to the function φ ∈ D(R) if the following conditions

are satisfied,

• there exists a bounded interval $I \subset \mathbb{R}$ which contains the supports of all the functions $\varphi_n$ and $\varphi$. In other words, there exists a real number $R$ such that

\[ \varphi_n(x) = 0 \ \text{ and } \ \varphi(x) = 0 \qquad \forall\, |x| \ge R \,; \tag{1.5} \]


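• the sequence of the $p$-th derivatives of $\varphi_n(x)$ converges uniformly to the $p$-th derivative of $\varphi(x)$,

\[ \sup_{x\in\mathbb{R}} \left| \frac{d^p}{dx^p}\varphi_n(x) - \frac{d^p}{dx^p}\varphi(x) \right| \xrightarrow{\,n\to\infty\,} 0 \qquad \forall\, p = 0, 1, \dots . \tag{1.6} \]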

In particular, φn(x) converges uniformly to φ(x).

The generalization of the space D(R) to the case of complex-valued func-

tions or of functions defined in Rn is straightforward.

Example 1.1. The function

\[ f(x) = \begin{cases} e^{-\frac{1}{1-x^2}} & |x| < 1 \\ 0 & |x| \ge 1 \end{cases} \tag{1.7} \]

belongs to D(R) since it is zero outside the interval [−1, 1], infinitely differ-

entiable in (−1, 1) and with vanishing derivatives of all orders in x = ±1.

Notice that the function is C∞(R) but not analytic. The Taylor series in

x = ±1 is identically zero since all derivatives are zero, but the function

is not zero in the neighborhood of x = ±1. All the functions in D(R), with the exception of the zero function, are necessarily non-analytic.
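As a numerical illustration of Example 1.1, the following minimal sketch evaluates the bump function (1.7) and a difference quotient at the edge of its support. The helper names `bump` and `difference_quotient` and the sample points are illustrative choices, not part of the text.

```python
import math


def bump(x):
    """Bump function of (1.7): exp(-1/(1 - x^2)) inside (-1, 1), zero outside."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))


def difference_quotient(f, x, h):
    """Symmetric difference quotient, a crude approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    # Values approach zero extremely fast as |x| -> 1 from inside.
    for x in (0.0, 0.9, 0.99, 0.999, 1.0, 1.5):
        print(x, bump(x))
    # The slope at the edge of the support is numerically indistinguishable from 0.
    print(difference_quotient(bump, 1.0, 1e-3))
```

The function is strictly positive just inside x = 1 yet flatter there than any power of (1 − x), which is how a vanishing Taylor series at x = ±1 coexists with a non-zero function.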

1.1.2 The space S(R)

A function $f$ of real variable is called a rapidly decreasing function, or a Schwartz function, if it is infinitely differentiable on $\mathbb{R}$, $f \in C^\infty(\mathbb{R})$, and

\[ x^p \frac{d^q}{dx^q} f(x) \xrightarrow{\,x\to\pm\infty\,} 0 \qquad \forall\, p, q = 0, 1, \dots . \tag{1.8} \]

In other words, f is infinitely differentiable and f and all its derivatives vanish at infinity more rapidly than any inverse power of x.

The vector space of rapidly decreasing functions is denoted

S(R) = {φ(x) |φ ∈ C∞(R) and rapidly decreasing} . (1.9)

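We define a notion of convergence in $\mathcal{S}(\mathbb{R})$ as follows. We say that the sequence of functions $\{\varphi_n \in \mathcal{S}(\mathbb{R})\}$ converges to the function $\varphi \in \mathcal{S}(\mathbb{R})$ if $x^p \frac{d^q}{dx^q}\varphi_n(x)$ converges uniformly to $x^p \frac{d^q}{dx^q}\varphi(x)$ for all $p$ and $q$,

\[ \sup_{x\in\mathbb{R}} \left| x^p \frac{d^q}{dx^q}\varphi_n(x) - x^p \frac{d^q}{dx^q}\varphi(x) \right| \xrightarrow{\,n\to\infty\,} 0 \qquad \forall\, p, q = 0, 1, \dots . \tag{1.10} \]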


Example 1.2. The functions

\[ \varphi(x) = P(x)\, e^{-x^2} , \qquad \varphi(x) = \sin x \; e^{-x^2} , \tag{1.11} \]

where $P(x)$ is an arbitrary polynomial, are elements of $\mathcal{S}(\mathbb{R})$. On the other hand, the functions

\[ \varphi(x) = \frac{1}{1+x^2} , \qquad \varphi(x) = e^{-|x|} \tag{1.12} \]

are not, for the first vanishes only as an inverse polynomial at infinity and the second, while vanishing faster than any inverse power of $x$ at infinity, is not $C^\infty(\mathbb{R})$.

Example 1.3. Since every function in S(R) is square integrable, S(R) is

a vector subspace of L2(R). S(R) is dense in L2(R). In fact, any element

of L2(R) is the limit of a sequence of elements of S(R). To see this, it is

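enough to expand $f \in L^2(\mathbb{R})$ in the Hermite basis, $f = \sum_{n=0}^{\infty} c_n u_n$, where the functions $u_n(x) \sim H_n(x)\, e^{-x^2/2}$ are obviously rapidly decreasing: the partial sums of the series belong to $\mathcal{S}(\mathbb{R})$ and converge to $f$ in $L^2(\mathbb{R})$.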

1.2 Distributions

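We call distribution a functional

\[ T : \mathcal{F} \to \mathbb{C}, \qquad \varphi(x) \mapsto T(\varphi), \tag{1.13} \]

where $\mathcal{F} = \mathcal{D}(\mathbb{R})$ or $\mathcal{F} = \mathcal{S}(\mathbb{R})$, with the properties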

• T is linear:

T (λ1φ1 + λ2φ2) = λ1T (φ1) + λ2T (φ2) , (1.14)

for all λ1, λ2 ∈ C and φ1, φ2 ∈ F ;

• $T$ is continuous in the following sense: if the sequence $\varphi_n(x)$ converges to $\varphi(x)$ in $\mathcal{F}$, $\lim_{n\to\infty} \varphi_n(x) = \varphi(x)$, then

\[ \lim_{n\to\infty} T(\varphi_n) = T(\varphi) . \tag{1.15} \]


It is also customary to use the notation

(T, φ) ≡ T (φ) (1.16)

for the action of a distribution on a test function.

In the theory of distributions, the functions φ in D(R) and S(R) are called

test functions. The functionals T acting on F = D(R) are simply called

distributions and form a space denoted D′(R). The functionals T acting on

F = S(R) are called tempered distributions and form a space denoted S ′(R).

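We say that two distributions, $T_1$ and $T_2$, are equal if they act in the same way on all test functions,

\[ T_1(\varphi) = T_2(\varphi) \qquad \forall\, \varphi \in \mathcal{D}(\mathbb{R}) \ (\mathcal{S}(\mathbb{R})) . \tag{1.17} \]

The spaces of distributions $\mathcal{D}'(\mathbb{R})$ and $\mathcal{S}'(\mathbb{R})$ are vector spaces if we define sum and multiplication by scalars by

\[ (T_1 + T_2)(\varphi) = T_1(\varphi) + T_2(\varphi) , \tag{1.18} \]

\[ (\lambda T)(\varphi) = \lambda\, T(\varphi) , \tag{1.19} \]

for all test functions $\varphi \in \mathcal{D}(\mathbb{R})$ ($\mathcal{S}(\mathbb{R})$) and all complex numbers $\lambda$.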

The functions with compact support are a subset of the rapidly decreasing

functions

D(R) ⊂ S(R) . (1.20)

This implies the opposite inclusion for the space of distributions

S ′(R) ⊂ D′(R) . (1.21)

since every functional that is defined on all the functions in S(R) is a fortiori

defined on the functions in D(R).

It is straightforward to extend the previous definitions to the case of

multi-variable distributions.

In the following we discuss the case of distributions in D′(R), since all

properties we will introduce in this chapter are also valid for tempered dis-

tributions. More sophisticated operations, like the Fourier transform, can be

only defined for tempered distributions.


1.2.1 Regular distributions

We say that a function $f(x)$ is locally integrable, and we write $f \in L^1_{\rm loc}(\mathbb{R})$, if the integral of $|f|$ on any compact subset $K$ of $\mathbb{R}$ is finite,

\[ \int_K |f(x)|\, dx < \infty . \tag{1.22} \]

Notice that we are not requiring any nice behavior at infinity. A function like $f(x) = e^{|x|}$ is not an element of $L^1(\mathbb{R})$ but it is in $L^1_{\rm loc}(\mathbb{R})$. To any locally integrable function $f$ we associate the linear functional

\[ T_f(\varphi) = (f, \varphi) = \int_{-\infty}^{\infty} dx\, f^*(x)\, \varphi(x) . \tag{1.23} \]

$T_f$ is a distribution, $T_f \in \mathcal{D}'(\mathbb{R})$.

Proof. In order to see that $T_f$ is a distribution we need to check that its value on a test function is well defined and that $T_f$ is linear and continuous. Call $K_\varphi$ the support of $\varphi \in \mathcal{D}(\mathbb{R})$. The integral in (1.23) restricts to $\int_{K_\varphi} dx\, f^*(x)\varphi(x)$, which is finite since $\varphi$ is continuous, hence bounded, on $K_\varphi$ and $f$ is integrable on any compact set. $T_f$ is clearly linear in $\varphi$. To see that it is continuous take a sequence of test functions $\varphi_n$ converging to $\varphi$ in the topology of $\mathcal{D}(\mathbb{R})$. Recall that this means that the supports of $\varphi_n$ and $\varphi$ are contained in a compact set $K$, and that $\varphi_n$ and all its derivatives converge uniformly to $\varphi$ and its derivatives. We have

\[ |T_f(\varphi) - T_f(\varphi_n)| \le \int_K dx\, |f(x)|\, |\varphi(x) - \varphi_n(x)| \le \sup_{x\in K} |\varphi(x) - \varphi_n(x)| \int_K |f| , \]

which converges to zero since $f$ is locally integrable and $\varphi_n$ converges uniformly to $\varphi$: $\lim_{n\to\infty} \sup_{x\in K} |\varphi(x) - \varphi_n(x)| = 0$.

Distributions of the form Tf for a locally integrable function f are called

regular distributions. Tf is not in general a tempered distribution, because

the integral (1.23) is not necessarily convergent at infinity. Since $\varphi \in \mathcal{S}(\mathbb{R})$ vanishes at infinity more rapidly than any inverse power of $x$, $T_f$ will be a tempered distribution if $f$ has at most algebraic growth, which means that $|f(x)| < c\,|x|^k$ for some $k$ and $c$ and for large enough $|x|$.

We can use definition (1.23) to identify the space of functions $L^1_{\rm loc}(\mathbb{R})$ with a subset of the space of distributions $\mathcal{D}'(\mathbb{R})$. In this sense, distributions generalize functions. When no confusion can arise we will make use of the identification $f \sim T_f$ and write $f$ instead of $T_f$.


The presence of the complex conjugate of f in the definition (1.23) can

be startling but it is dictated by convenience. In particular, with this choice,

Tf (φ) can be formally written as the scalar product (f, φ) of f with φ. This

definition also explains the notation (1.16) which is often used for the action

of a distribution on a test function.

Example 1.4. The function f(x) = sign(x) is not continuous, it is not

differentiable and it is not integrable on R. It is however integrable on any

finite interval [a, b] and thus defines the distribution

\[ T_f(\varphi) = \int_{-\infty}^{\infty} dx\, f(x)\varphi(x) = -\int_{-\infty}^{0} dx\, \varphi(x) + \int_{0}^{\infty} dx\, \varphi(x) . \]

Since f has an algebraic growth, Tf is a tempered distribution. The previous

expression is indeed finite for all φ ∈ S(R).

Example 1.5. The Heaviside function

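\[ \theta(x) = \begin{cases} 1 & x \ge 0 \\ 0 & x < 0 , \end{cases} \tag{1.24} \]

is similarly associated to the tempered distribution

\[ T_\theta(\varphi) = \int_{-\infty}^{\infty} dx\, \theta(x)\varphi(x) = \int_{0}^{\infty} dx\, \varphi(x) . \]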

Example 1.6. By identifying locally integrable functions with the associ-

ated distribution, we have the following inclusion: C∞(R) ⊂ D′(R). The

elementary functions sin(x), cos(x), log(x) and P(x), where P(x) is a polynomial, are tempered distributions. The function $e^x$ is a distribution but, growing faster than any polynomial, it is not tempered.

Example 1.7. A notable chain of inclusions is

D(R) ⊂ S(R) ⊂ L2(R) ⊂ S ′(R) ⊂ D′(R) . (1.25)

In particular we will see that each space in this chain is dense in those

containing it.


1.2.2 Singular distributions

Most distributions are not of the form (1.23) for any locally integrable func-

tion. They are called singular and are the truly new objects. We give various

examples. In order to see that a functional is a distribution we always need

to check that its value on a test function is well defined and that it is linear and continuous. We will often leave the verification of continuity as an exercise to the reader. This can be done with the same method used for regular distributions.

Example 1.8. The Dirac delta function.

The most famous example of singular distribution is given by the Dirac delta

function which is not a function, as the name would suggest, but the distri-

bution defined by

δx0(φ) = φ(x0) . (1.26)

In other words, δx0 associates to the test function φ(x) its value at the point

x0. It is obvious that the functional δx0 is well defined and linear. It is also

continuous since for a sequence φn → φ in D(R)

|δx0(φn)− δx0(φ)| = |φn(x0)− φ(x0)| (1.27)

converges to zero for n→∞ because φn converges uniformly, and therefore

also pointwise, to φ. δx0 is obviously well defined also for φ ∈ S(R) and it

is therefore a tempered distribution. In physics it is customary to use the following notation

\[ \delta_{x_0}(\varphi) \equiv \int_{-\infty}^{\infty} \delta(x - x_0)\,\varphi(x)\, dx , \tag{1.28} \]

as if there were a function $\delta(x - x_0)$ such that

\[ \int_{-\infty}^{\infty} \delta(x - x_0)\,\varphi(x)\, dx = \varphi(x_0) . \tag{1.29} \]

Unfortunately a function $\delta(x - x_0)$ which satisfies the previous equation for all $\varphi$ does not exist. Even though we will use the notation (1.29), we must keep in mind that $\delta_{x_0}$ is not a function but a truly singular distribution. When $x_0 = 0$ we simply write $\delta$ or $\delta(x)$.


Example 1.9. The Principal Value.

The function 1/x is not locally integrable in the neighborhood of x = 0 and

it cannot be considered as a distribution. We can however define a tempered

distribution T which is close enough to 1/x using the Principal Value

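\[ T(\varphi) = \lim_{\varepsilon\to 0} \left( \int_{-\infty}^{-\varepsilon} \frac{\varphi(x)}{x}\, dx + \int_{\varepsilon}^{\infty} \frac{\varphi(x)}{x}\, dx \right) . \tag{1.30} \]

The limit always exists since the sum of integrals can be written as

\[ \int_{-\infty}^{-\varepsilon} \frac{\varphi(x)}{x}\, dx + \int_{\varepsilon}^{\infty} \frac{\varphi(x)}{x}\, dx = \int_{\varepsilon}^{\infty} \frac{\varphi(x) - \varphi(-x)}{x}\, dx , \tag{1.31} \]

and the integrand $\frac{\varphi(x) - \varphi(-x)}{x}$ is continuous in the neighborhood of $x = 0$ for all $C^\infty(\mathbb{R})$ functions. We can thus take the $\varepsilon \to 0$ limit on the right hand side of (1.31) and write

\[ T(\varphi) = \int_{0}^{\infty} \frac{\varphi(x) - \varphi(-x)}{x}\, dx . \tag{1.32} \]

The distribution $T$ is also written, with some abuse of notation, as $P\frac{1}{x}$ and can be viewed as a regularization of the function $\frac{1}{x}$ in the neighborhood of $x = 0$.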
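The equivalence of the symmetric cut-off (1.30) and the regular integral (1.32) can be checked numerically. The sketch below assumes a particular Schwartz test function (a shifted Gaussian), a midpoint quadrature and a truncation of the real line to [-L, L]; all of these, and the helper names, are illustrative assumptions rather than part of the text.

```python
import math


def midpoint(f, a, b, n):
    """Composite midpoint rule on [a, b]; the endpoints are never evaluated."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def phi(x):
    """Illustrative test function (assumption): a Gaussian centred at x = 1."""
    return math.exp(-(x - 1.0) ** 2)


def pv_symmetric_cutoff(eps, L=10.0, n=100_000):
    """Bracket of (1.30) at finite eps, with the real line truncated to [-L, L]."""
    left = midpoint(lambda x: phi(x) / x, -L, -eps, n)
    right = midpoint(lambda x: phi(x) / x, eps, L, n)
    return left + right


def pv_regularised(L=10.0, n=100_000):
    """Right-hand side of (1.32), truncated to [0, L]."""
    return midpoint(lambda x: (phi(x) - phi(-x)) / x, 0.0, L, n)


if __name__ == "__main__":
    print("regularised form:", pv_regularised())
    for eps in (1e-1, 1e-2, 1e-3):
        print("cut-off", eps, "->", pv_symmetric_cutoff(eps))
```

The two one-sided integrals are individually large for small eps; only their sum stays finite and approaches the value of (1.32), with a discrepancy that shrinks roughly linearly in eps.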

Example 1.10. Other regularizations of the function 1/x are given by the

distributions

\[ T_\pm(\varphi) = \lim_{\varepsilon\to 0} \int_{-\infty}^{+\infty} \frac{\varphi(x)}{x \pm i\varepsilon}\, dx . \tag{1.33} \]

The original singularity of the integrand is avoided by shifting the position of the singularity in the complex plane by a small imaginary part. The action of $T_\pm$ on a test function is often denoted, with some abuse of notation, as

\[ T_\pm(\varphi) = \int \frac{\varphi(x)}{x \pm i0}\, dx , \tag{1.34} \]

and the distributions $T_\pm$ themselves as $1/(x \pm i0)$. The distributions $T_\pm$ are related to $P\frac{1}{x}$ by the identities

\[ \frac{1}{x \pm i0} = P\frac{1}{x} \mp \pi i\, \delta(x) . \tag{1.35} \]


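We can indeed write

\[ T_\pm(\varphi) = \lim_{\varepsilon\to 0} \left( \int_{-\infty}^{+\infty} \frac{x}{x^2 + \varepsilon^2}\,\varphi(x)\, dx \mp i \int_{-\infty}^{+\infty} \frac{\varepsilon}{x^2 + \varepsilon^2}\,\varphi(x)\, dx \right) . \tag{1.36} \]

The first integral on the right hand side can be written as

\[ \int_{0}^{\infty} \frac{x^2}{x^2 + \varepsilon^2} \left( \frac{\varphi(x) - \varphi(-x)}{x} \right) dx , \tag{1.37} \]

and thus converges for $\varepsilon \to 0$ to the principal value (1.32). The second integral can be written, by the change of variables $x = \varepsilon y$, as

\[ \mp i \int_{-\infty}^{+\infty} \frac{\varepsilon}{x^2 + \varepsilon^2}\,\varphi(x)\, dx = \mp i \int_{-\infty}^{+\infty} \frac{1}{y^2 + 1}\,\varphi(\varepsilon y)\, dy , \tag{1.38} \]

which converges for $\varepsilon \to 0$ to $\mp i \varphi(0) \int_{\mathbb{R}} \frac{dy}{y^2+1} = \mp i \pi \varphi(0)$. We then have

\[ \int \frac{\varphi(x)}{x + i0}\, dx = P\!\int \frac{\varphi(x)}{x}\, dx - \pi i \int \delta(x)\varphi(x)\, dx , \tag{1.39} \]

which, being valid for all test functions $\varphi$, gives us the identity (1.35).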
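The delta-function term in (1.35) comes entirely from the second integral in (1.36). A minimal numerical sketch of that limit follows, assuming the Gaussian test function φ(x) = exp(-x²) (so φ(0) = 1), a midpoint quadrature and a truncation to [-L, L]; these choices and the helper names are illustrative.

```python
import math


def midpoint(f, a, b, n):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def phi(x):
    """Illustrative test function (assumption), normalised so that phi(0) = 1."""
    return math.exp(-x * x)


def lorentzian_action(eps, L=10.0, n=200_000):
    """Integral of eps / (x^2 + eps^2) * phi(x) over [-L, L], cf. (1.38)."""
    return midpoint(lambda x: eps / (x * x + eps * eps) * phi(x), -L, L, n)


if __name__ == "__main__":
    for eps in (1e-1, 1e-2):
        print(eps, lorentzian_action(eps), "target:", math.pi * phi(0.0))
```

As eps decreases, the printed values move towards π φ(0); the quadrature step has to stay well below eps for the narrow peak to be resolved.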

Example 1.11. The finite part.

The principal value for a function which behaves as $1/x^2$ is not well defined. We can modify the expression (1.32) as follows

\[ T(\varphi) = \int_{0}^{\infty} \frac{\varphi(x) + \varphi(-x) - 2\varphi(0)}{x^2}\, dx , \tag{1.40} \]

which is now well defined since the numerator is $O(x^2)$ in the neighborhood of $x = 0$ by Taylor's theorem. $T$ in (1.40) defines a distribution, called the finite part and denoted as $\mathrm{fp}\,\frac{1}{x^2}$, which regularizes the function $1/x^2$ in the neighborhood of $x = 0$.

A distribution is only defined by its action on test functions and, even if we often denote distributions as "functions", as in the case of δ(x) or 1/(x + i0), we must remember that these are formal expressions and distributions are not defined pointwise. The concept of value of the distribution T at the point x is meaningless. Only for regular distributions Tf can we identify the distribution with the function f and define the value of T at a point x as the value f(x), if it exists. More generally, we can study the behavior of a distribution T on a subset I of R by considering the action of T on test functions with support in I. If, on all test functions with support in I, the action of T is the same as the action of a regular distribution Tf, we can say that T is equal to f in the interval I and we can talk about the values of T at points of I. For example, δ(x) is identified with the function zero, and P 1/x and 1/(x ± i0) with the function 1/x, on all intervals that do not contain x = 0. We can define the support of T as the complement of the largest open subset of R where T can be identified with the zero function. The support of δx0 is the point x0.

Example 1.12. Compactly supported distributions.

In general, the action of a distribution $T \in \mathcal{D}'(\mathbb{R})$ is only defined on $C^\infty$ functions with compact support. However, if $T$ itself has compact support, it effectively truncates the domain of any function it acts on, so that a distribution $T$ with compact support can act on any $C^\infty$ function. In practice, if $K = \operatorname{supp}(T)$ and $\varphi \in C^\infty(\mathbb{R})$, we can always find $\tilde\varphi \in \mathcal{D}(\mathbb{R})$ which coincides with $\varphi$ in a neighborhood of $K$ and define $T(\varphi) \equiv T(\tilde\varphi)$.

1.3 Limits of distributions

We can define a notion of convergence for distributions. We say that the

sequence $\{T_n\} \subset \mathcal{D}'(\mathbb{R})$ converges to $T$ in the sense of distributions, and we write $\lim_{n\to\infty} T_n = T$, if, for all test functions $\varphi \in \mathcal{D}(\mathbb{R})$, we have

\[ \lim_{n\to\infty} T_n(\varphi) = T(\varphi) . \tag{1.41} \]

When the distributions $T_n$ are regular and associated with locally integrable functions $f_n$, the distributional limit of $T_n$ is not necessarily the same as the pointwise (or uniform or $L^p$) limit of the $f_n$. The limit in $\mathcal{D}'(\mathbb{R})$ is often called limit in the weak sense, since it is defined by considering the action of $T_n$ on test functions.

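Example 1.13. The sequence of locally integrable functions $f_n(x) = e^{inx}$ for $n = 1, 2, 3, \dots$ converges pointwise to 1 for $x = 2\pi k$ with $k \in \mathbb{Z}$ and is not convergent for all other $x$. On the other hand, the sequence of distributions $T_{f_n}$ converges to 0, as we see with an integration by parts,

\[ (T_{f_n}, \varphi) = \int_{-\infty}^{+\infty} e^{-inx}\varphi(x)\, dx = \frac{1}{in} \int_{-\infty}^{+\infty} e^{-inx}\varphi'(x)\, dx \xrightarrow{\,n\to\infty\,} 0 , \tag{1.42} \]

since the last integral is bounded in modulus by $\int |\varphi'(x)|\, dx$, independently of $n$.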
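The decay in (1.42) can be observed numerically. This sketch assumes the Schwartz test function φ(x) = exp(-x²), for which the integral is known in closed form (√π exp(-n²/4)); the quadrature, the cut-off L and the helper names are illustrative assumptions.

```python
import cmath
import math


def midpoint(f, a, b, n):
    """Composite midpoint rule on [a, b]; works for complex-valued integrands."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def phi(x):
    """Illustrative test function (assumption): a Gaussian."""
    return math.exp(-x * x)


def action_of_f_n(n, L=8.0, steps=20_000):
    """(T_{f_n}, phi) for f_n(x) = e^{inx}: the integral of e^{-inx} phi(x), as in (1.42)."""
    return midpoint(lambda x: cmath.exp(-1j * n * x) * phi(x), -L, L, steps)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, abs(action_of_f_n(n)))
```

The functions f_n never become small pointwise (their modulus is always 1), yet their action on a fixed smooth test function is washed out by the oscillations.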

Example 1.14. The sequence of locally integrable functions

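\[ f_k(x) = \begin{cases} k & \frac{1}{k} < x < \frac{2}{k} \\ 0 & \text{elsewhere} \end{cases} \qquad k \ge 1 \tag{1.43} \]

converges pointwise to 0. The associated distributions converge instead to the Dirac delta function,

\[ \lim_{k\to\infty} (T_{f_k}, \varphi) = \lim_{k\to\infty} k \int_{1/k}^{2/k} dx\, \varphi(x) = \lim_{k\to\infty} \varphi(x_k) = \varphi(0) = (\delta_0, \varphi) , \tag{1.44} \]

where $x_k$ is a point of the interval $[\frac{1}{k}, \frac{2}{k}]$ and we used the mean value theorem for the $C^\infty$ function $\varphi$.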
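The contrast between the pointwise limit and the distributional limit in Example 1.14 can be made concrete with a short computation. The sketch assumes the test function φ(x) = exp(-x²) and a midpoint quadrature; the helper names are illustrative.

```python
import math


def midpoint(f, a, b, n):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def phi(x):
    """Illustrative test function (assumption) with phi(0) = 1."""
    return math.exp(-x * x)


def f(k, x):
    """f_k of (1.43): equal to k on (1/k, 2/k) and zero elsewhere."""
    return float(k) if 1.0 / k < x < 2.0 / k else 0.0


def action(k, steps=1000):
    """(T_{f_k}, phi) = k times the integral of phi over (1/k, 2/k), cf. (1.44)."""
    return k * midpoint(phi, 1.0 / k, 2.0 / k, steps)


if __name__ == "__main__":
    for k in (1, 10, 100, 1000):
        # Pointwise value at a fixed x versus the action on the test function.
        print(k, f(k, 0.5), action(k))
```

At any fixed point the values f_k(x) are eventually zero, while the action on φ approaches φ(0), which is the statement that the sequence converges to δ only in the weak sense.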

Example 1.15. The Dirac delta function δ(x) can be obtained as a limit of

any of the following sequences of functions

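\[ f_n(x) = \left\{ \frac{n}{\sqrt{\pi}}\, e^{-n^2 x^2} , \quad \frac{1}{\pi} \frac{\sin nx}{x} , \quad \frac{1}{n\pi} \left( \frac{\sin nx}{x} \right)^2 , \quad \frac{1}{n\pi} \frac{1}{x^2 + 1/n^2} \right\} \tag{1.45} \]

All these functions are obtained by rescaling: $f_n(x) = n f(nx)$, where $f(x)$ is a locally integrable function of unit area, $\int_{-\infty}^{\infty} f(x)\, dx = 1$. Given a test function $\varphi$, we have

\[ \int_{-\infty}^{\infty} f_n(x)\varphi(x)\, dx \overset{x = y/n}{=} \int_{-\infty}^{\infty} f(y)\varphi(y/n)\, dy \xrightarrow{\,n\to\infty\,} \varphi(0) \int_{-\infty}^{\infty} f(y)\, dy = \varphi(0) , \]

if $f$ is sufficiently regular. It follows that $\lim f_n(x) = \delta(x)$ in the distributional sense. Notice that the pointwise limit of these functions (with the exception of $\frac{1}{\pi}\frac{\sin nx}{x}$, which keeps oscillating for $x \neq 0$) is 0 for $x \neq 0$ and $+\infty$ for $x = 0$. It is sometimes said that the Dirac delta function $\delta(x)$ vanishes for $x \neq 0$ and goes to infinity for $x = 0$, and it is not uncommon in physics textbooks to find the expression

\[ \delta(x) = \begin{cases} 0 & x \neq 0 \\ \infty & x = 0 \end{cases} \tag{1.46} \]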


We stress again that the concept of pointwise value of a distribution is not well defined. In particular the value of δ at x = 0 makes no sense. What is true is that δ(x) is the limit of functions fn which become more and more concentrated around x = 0 and whose value at the origin increases when n becomes large.

The previous example shows that the weak limit of locally integrable func-

tions can be a singular distribution. In fact, one can show that all singular

distributions can be obtained as a limit of locally integrable functions. One

can show indeed that D(R) is dense in D′(R).

1.4 Operations on distributions

Many operations that are defined for functions can be extended to distributions. The idea is to let the operation act on the test function. One usually defines the operation for regular distributions, moves the operation onto the test function and extends the definition to the singular distributions. Since test functions are C∞(R), it follows, for example, that all distributions are differentiable.

1.4.1 Change of variables

Distributions are not defined point-wise but we can nevertheless define the

concept of change of variables. This is easy to do for a regular distribution

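$T_f$, associated with the locally integrable function $f(x)$. We want to consider now the function $g(x) = f(u(x))$, or in short $g = f \circ u$, where $u \in C^\infty(\mathbb{R})$ is a one-to-one map of $\mathbb{R}$ onto itself with nowhere vanishing derivative. We have

\[ T_g(\varphi) = \int_{-\infty}^{+\infty} dx\, f(u(x))\,\varphi(x) \overset{y = u(x)}{=} \int_{-\infty}^{+\infty} dy\, \frac{f(y)}{|u'(u^{-1}(y))|}\, \varphi(u^{-1}(y)) , \tag{1.47} \]

or

\[ T_{f\circ u}(\varphi) = T_f\!\left( \frac{\varphi \circ u^{-1}}{|u' \circ u^{-1}|} \right) . \tag{1.48} \]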

What we did was to move the operation (the change of variable) from T to

the test function. Now that the operation is only acting on the test function


we can generalize the previous formula and define, for any distribution T ,

the distribution T ◦ u, defined as

T ◦ u(φ) := T (φ ◦ u−1/|u′ ◦ u−1|) . (1.49)

The definition makes sense since φ ◦ u−1/|u′ ◦ u−1| is a test function, being

C∞(R) and with compact support. One can easily check that the functional

T ◦ u is continuous. The above definition may look complicated and it is

better to demonstrate it by examples.

Example 1.16. Translation

Given x0 ∈ R and the distribution δx0 we want to find the distribution

translated by x→ Ta(x) = x+ a. Using (1.49) we have

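\[ (\delta_{x_0} \circ T_a)(\varphi) = \delta_{x_0}(\varphi \circ T_a^{-1}) = \int_{-\infty}^{\infty} \delta(x - x_0)\,\varphi(x - a)\, dx = \varphi(x_0 - a) \tag{1.50} \]

and we find, not unexpectedly, that the translation $\delta_{x_0} \circ T_a$ is $\delta_{x_0 - a}$.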

Example 1.17. Change of scale

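Given $\lambda \in \mathbb{R}$, $\lambda \neq 0$, and the distribution $\delta$ we want to find the distribution obtained by a dilatation $x \to D_\lambda(x) = \lambda x$. Using (1.49) we have

\[ (\delta \circ D_\lambda)(\varphi) = \delta\!\left( \frac{\varphi \circ D_\lambda^{-1}}{|\lambda|} \right) = \int_{-\infty}^{\infty} \delta(x)\, \frac{\varphi(x/\lambda)}{|\lambda|}\, dx = \frac{\varphi(0)}{|\lambda|} \tag{1.51} \]

and we find that $\delta \circ D_\lambda = \delta/|\lambda|$. With the usual abuse of notation we can write the more transparent expression $\delta(\lambda x) = \frac{\delta(x)}{|\lambda|}$.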

Example 1.18. Change of variables in the delta function

For a more general change of variables $x \to u(x)$, with $u$ monotonic and with a single zero at $x = x_0$, we would obtain from (1.49)

\[ \delta(u(x)) = \frac{\delta(x - u^{-1}(0))}{|u'(u^{-1}(0))|} = \frac{\delta(x - x_0)}{|u'(x_0)|} . \tag{1.52} \]

Notice that $\delta(u(x))$ evaluates the test function at the zero of $u(x)$, as expected, but it also multiplies it by a constant. Since the delta function vanishes outside $x = 0$, we can generalize the construction by using functions $u(x)$ which are not necessarily monotonic but have a finite number of zeros $x_i$ with $u'(x_i) \neq 0$. In this case we have

\[ \delta(u(x)) = \sum_i \frac{\delta(x - x_i)}{|u'(x_i)|} . \tag{1.53} \]

For example, $\delta(x^2 - 1) = \frac{\delta(x-1)}{2} + \frac{\delta(x+1)}{2}$.
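Formula (1.53) can be tested numerically by replacing δ with a narrow Gaussian from (1.45) and composing it with u(x) = x² − 1. The sketch below assumes a particular Schwartz test function, a midpoint quadrature on [-L, L] and illustrative helper names; it is a plausibility check, not a proof.

```python
import math


def midpoint(f, a, b, n):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def phi(x):
    """Illustrative test function (assumption): a Gaussian centred at x = 0.5."""
    return math.exp(-(x - 0.5) ** 2)


def nascent_delta(n, u):
    """Unit-area Gaussian from (1.45): n / sqrt(pi) * exp(-n^2 u^2)."""
    return n / math.sqrt(math.pi) * math.exp(-((n * u) ** 2))


def delta_of_u_action(n, L=3.0, steps=60_000):
    """Integral of f_n(x^2 - 1) * phi(x) over [-L, L]."""
    return midpoint(lambda x: nascent_delta(n, x * x - 1.0) * phi(x), -L, L, steps)


def predicted():
    """Right-hand side of (1.53) for u(x) = x^2 - 1: zeros at +1 and -1, |u'| = 2."""
    return 0.5 * phi(1.0) + 0.5 * phi(-1.0)


if __name__ == "__main__":
    print("prediction:", predicted())
    for n in (10, 50, 200):
        print(n, delta_of_u_action(n))
```

Both zeros of u contribute, each weighted by 1/|u'(x_i)| = 1/2, which is why the result is the average of φ(1) and φ(−1) rather than their sum.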

1.4.2 Multiplication by a C∞ function

We define the product of a distribution T by a function g ∈ C∞ as

(gT )(φ) := T (gφ) . (1.54)

The definition makes sense since gφ is a test function, being C∞(R) and with

compact support. Notice that it is important to have g ∈ C∞ in order to

have gφ ∈ D(R). One can easily check that the functional gT is continuous.

The definition is motivated by the case of a regular distribution Tf where

indeed

\[ (T_{gf}, \varphi) = \int_{-\infty}^{\infty} dx\, g(x) f(x) \varphi(x) = (T_f, g\varphi) . \tag{1.55} \]

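Example 1.19. We want to show that

\[ x\, P\frac{1}{x} = 1 \tag{1.56} \]

in $\mathcal{D}'(\mathbb{R})$. Notice that 1 is an element of $\mathcal{D}'(\mathbb{R})$, being locally integrable. Denoting $T = P\frac{1}{x}$ we have indeed

\[ (xT)(\varphi) = T(x\varphi) = \lim_{\varepsilon\to 0} \int_{|x|\ge\varepsilon} dx\, \frac{x\varphi(x)}{x} = \int_{-\infty}^{\infty} dx\, \varphi(x) = T_1(\varphi) \tag{1.57} \]

since the integral in principal value of the $C^\infty$ function $\varphi$ is the same as the ordinary integral. Similarly we prove

\[ x\, \frac{1}{x + i0} = 1 . \tag{1.58} \]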

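Example 1.20. The general solution of the algebraic equation $xT = 1$ for $T$ in $\mathcal{D}'(\mathbb{R})$ is

\[ T = P\frac{1}{x} + c\,\delta(x) \tag{1.59} \]

where $c$ is an arbitrary constant. In fact $x\delta = 0$ identically in $\mathcal{D}'(\mathbb{R})$ since

\[ (x\delta)(\varphi) = \delta(x\varphi) = \int_{-\infty}^{\infty} \delta(x)\, x\, \varphi(x)\, dx = 0 . \tag{1.60} \]

For $c = \mp i\pi$ we have $T = \frac{1}{x \pm i0}$, as Example 1.10 shows. Notice that the inverse in the space of distributions $\mathcal{D}'(\mathbb{R})$ is not unique.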

1.4.3 Derivative of a distribution

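Given a locally integrable function $f$ whose derivative $f'$ exists and is locally integrable, we can consider the distribution $T_{f'} \in \mathcal{D}'(\mathbb{R})$ and write

\[ \begin{aligned} T_{f'}(\varphi) &= \int_{-\infty}^{+\infty} dx\, f'(x)\varphi(x) = \big[ f(x)\varphi(x) \big]_{-\infty}^{+\infty} - \int_{-\infty}^{+\infty} dx\, f(x)\varphi'(x) \\ &= - \int_{-\infty}^{+\infty} dx\, f(x)\varphi'(x) = -(T_f, \varphi') , \end{aligned} \tag{1.61} \]

where we moved the operation (the derivative) onto the test function by integrating by parts. It is important to notice that we can do this since $\varphi$ is $C^\infty$, and that the boundary terms vanish since $\varphi \in \mathcal{D}(\mathbb{R})$ has compact support.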

This computation suggests defining the derivative $T'$ of an arbitrary distribution $T \in \mathcal{D}'(\mathbb{R})$ as

\[ T'(\varphi) := -(T, \varphi') . \tag{1.62} \]

The previous definition makes sense since $\varphi'$ is still a test function. It is easy to check that $T'$ is continuous.

Since test functions are $C^\infty$, we can iterate the differentiation any number of times: the $n$-th derivative of $T$ is given by

\[ T^{(n)}(\varphi) = (-1)^n\, T(\varphi^{(n)}) , \tag{1.63} \]

where $\varphi^{(n)}$ denotes the $n$-th derivative of the test function $\varphi$. Distributions are infinitely differentiable!

The derivative of distributions has properties similar to those of functions.

If T1 and T2 are two distributions we have

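\[ (\lambda_1 T_1 + \lambda_2 T_2)' = \lambda_1 T_1' + \lambda_2 T_2' \qquad \forall\, \lambda_1, \lambda_2 \in \mathbb{C} , \tag{1.64} \]

\[ (T(x - x_0))' = T'(x - x_0) \qquad \forall\, x_0 \in \mathbb{R} , \tag{1.65} \]

\[ (T(ax))' = a\, T'(ax) . \tag{1.66} \]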

All locally integrable functions f , even if not differentiable in ordinary

sense in one or more points, have a derivative T ′f when considered as distri-

butions.

Example 1.21. The derivative of the distribution associated with the Heav-

iside function θ(x) of Example 1.5 is

\[ T_\theta'(\varphi) = -T_\theta(\varphi') = -\int_{-\infty}^{\infty} dx\, \theta(x)\varphi'(x) = -\int_{0}^{\infty} dx\, \varphi'(x) = \varphi(0) . \tag{1.67} \]

We thus find T ′θ = δ. The standard derivative of the Heaviside function does

not exist, but the distributional derivative does and it is given by the delta

function. As expected, the distributional derivative of θ(x) vanishes outside

x = 0. The discontinuity (jump) in x = 0 gives rise to a delta function. With

some abuse of language we will also write θ′ instead of T ′θ.

More generally, the distributional derivative allows us to make sense of the derivative of discontinuous functions. Given a function $f$ which is continuous and differentiable with continuous derivative everywhere except at the point $x_0$, where it has a jump discontinuity, we define the element $f' \in L^1_{\rm loc}(\mathbb{R})$ as the (discontinuous) locally integrable function that coincides with the ordinary derivative of $f(x)$ in $\mathbb{R} - \{x_0\}$ ¹. We have

\[ \begin{aligned} T_f'(\varphi) &= -\int_{-\infty}^{\infty} dx\, f(x)\varphi'(x) \\ &= - f(x)\varphi(x) \Big|_{-\infty}^{x_0} + \int_{-\infty}^{x_0} dx\, f'(x)\varphi(x) - f(x)\varphi(x) \Big|_{x_0}^{\infty} + \int_{x_0}^{\infty} dx\, f'(x)\varphi(x) \\ &= \int_{-\infty}^{\infty} dx\, f'(x)\varphi(x) + \big[ f(x_0^+) - f(x_0^-) \big] \varphi(x_0) , \end{aligned} \tag{1.68} \]

where we have integrated by parts in the intervals $(-\infty, x_0]$ and $[x_0, +\infty)$. By defining the discontinuity $\mathrm{disc}(f, x_0) = f(x_0^+) - f(x_0^-)$, we can write

\[ T_f' = T_{f'} + \mathrm{disc}(f, x_0)\, \delta_{x_0} . \tag{1.69} \]

¹ We can define the value of $f'$ at $x_0$ arbitrarily: the value at $x_0$ does not matter for integration, and functions which differ at a point are identified in $L^1$.
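The jump formula (1.69) lends itself to a direct numerical check. The sketch below assumes a specific piecewise function (x² with a jump of size 3 at x0 = 1), the test function φ(x) = exp(-x²) and a midpoint quadrature on [-L, L]; the names `lhs`, `rhs`, `X0` and `JUMP` are illustrative.

```python
import math

X0 = 1.0    # location of the jump (assumption for this example)
JUMP = 3.0  # disc(f, x0) = f(x0+) - f(x0-)


def midpoint(f, a, b, n):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def phi(x):
    """Illustrative test function (assumption)."""
    return math.exp(-x * x)


def dphi(x):
    """Derivative of phi."""
    return -2.0 * x * math.exp(-x * x)


def f(x):
    """Piecewise smooth function: x^2 for x < X0 and x^2 + JUMP for x >= X0."""
    return x * x + (JUMP if x >= X0 else 0.0)


def df(x):
    """Ordinary derivative of f away from X0."""
    return 2.0 * x


def lhs(L=8.0, steps=160_000):
    """T'_f(phi) = -(T_f, phi'), the definition (1.62) applied to T_f."""
    return -midpoint(lambda x: f(x) * dphi(x), -L, L, steps)


def rhs(L=8.0, steps=160_000):
    """T_{f'}(phi) + disc(f, x0) * phi(x0), the right-hand side of (1.69)."""
    return midpoint(lambda x: df(x) * phi(x), -L, L, steps) + JUMP * phi(X0)


if __name__ == "__main__":
    print(lhs(), rhs())
```

Without the delta term the two sides would differ by exactly JUMP · φ(X0), which is the contribution of the boundary terms at x0 in (1.68).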


Example 1.22. The function ln |x| is locally integrable in the neighborhood

of x = 0 and thus defines a regular distribution. As a function it is not

differentiable in ordinary sense in x = 0. In D′(R) we have the identity

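\[ \frac{d \ln|x|}{dx} = P\frac{1}{x} \tag{1.70} \]

as the following computation shows

\[ \begin{aligned} T_{\ln|x|}'(\varphi) &= -\int_{-\infty}^{+\infty} dx\, \ln|x|\, \varphi'(x) = -\lim_{\varepsilon\to 0} \int_{|x|\ge\varepsilon} dx\, \ln|x|\, \varphi'(x) \\ &= \lim_{\varepsilon\to 0} \left[ \ln\varepsilon\, \big( \varphi(\varepsilon) - \varphi(-\varepsilon) \big) + \int_{|x|\ge\varepsilon} dx\, \frac{1}{x}\varphi(x) \right] = P\!\int_{-\infty}^{\infty} dx\, \frac{1}{x}\varphi(x) . \end{aligned} \]

The boundary terms vanish since

\[ \lim_{\varepsilon\to 0}\, (\varepsilon \ln\varepsilon)\, \frac{\varphi(\varepsilon) - \varphi(-\varepsilon)}{\varepsilon} \sim 0 \cdot \big( 2\varphi'(0) \big) = 0 . \tag{1.71} \]

Similarly we can show that in $\mathcal{D}'(\mathbb{R})$

\[ \frac{d^2 \ln|x|}{dx^2} = -\mathrm{fp}\,\frac{1}{x^2} . \tag{1.72} \]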

1.4.4 Convolution of distributions

Distributions are not defined pointwise and therefore we cannot, in general, multiply two distributions. For example, the product $\delta(x)^2$ does not make any sense: formally we would write

\[ \int_{-\infty}^{\infty} \delta(x)\delta(x)\varphi(x)\, dx = \delta(0)\varphi(0) = \infty . \tag{1.73} \]

The natural product on distributions is the convolution.

The convolution of two functions $f$ and $g$ is defined as the integral

\[ (f * g)(x) = \int_{-\infty}^{\infty} dx'\, f(x')\, g(x - x') . \tag{1.74} \]

One can show that if $f, g \in L^1(\mathbb{R})$ then $f * g \in L^1(\mathbb{R})$. More generally, if $f$ and $g$ are locally integrable and one of them has compact support, then $f * g$ is again locally integrable. It is easy to see that the convolution of functions satisfies the following properties. Given three functions $f_1$, $f_2$ and $f_3$, at least two of which have compact support, and two complex numbers $\lambda$, $\mu$ we have

\[ f_1 * f_2 = f_2 * f_1 , \tag{1.75} \]

\[ (f_1 * f_2) * f_3 = f_1 * (f_2 * f_3) , \tag{1.76} \]

\[ f_1 * (\lambda f_2 + \mu f_3) = \lambda (f_1 * f_2) + \mu (f_1 * f_3) , \tag{1.77} \]

\[ (f_1 * f_2)' = f_1' * f_2 = f_1 * f_2' , \tag{1.78} \]

the last equation being a consequence of the obvious identity

\[ \int_{-\infty}^{\infty} dx'\, f(x')\, g(x - x') = \int_{-\infty}^{\infty} dx'\, f(x - x')\, g(x') , \tag{1.79} \]

obtained by changing variables $x' \to x - x'$ in the integral.

As usual, we try to find a convenient definition for the convolution of two

distributions by considering first the case of regular distributions. Consider

then two locally integrable functions f and g with at least one of them with

compact support. The regular distribution Tf∗g acts as

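\[ \begin{aligned} T_{f*g}(\varphi) &= \int_{-\infty}^{\infty} dx \left( \int_{-\infty}^{\infty} dy\, f(y)\, g(x - y) \right) \varphi(x) \\ &= \int_{-\infty}^{\infty} dy\, f(y) \left( \int_{-\infty}^{\infty} dx\, g(x)\, \varphi(x + y) \right) . \end{aligned} \tag{1.80} \]

We can interpret the last expression as follows. We first act with the distribution $T_g$ of variable $x$ on the function $\varphi(x + y)$, considering $y$ as a parameter, and then we act on the result, which is a function of $y$, with the distribution $T_f$. Using the notation (1.16) and abusing it we can write

\[ \big( T_f(y), (T_g(x), \varphi(x + y)) \big) , \tag{1.81} \]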

where we indicated explicitly the variables the distributions are acting on.

Notice that $(T_g(x), \varphi(x + y))$ is a $C^\infty$ function of $y$, since $\varphi$ is $C^\infty$. If $g$ has compact support, so does $(T_g(x), \varphi(x + y))$, which is then a test function in $\mathcal{D}(\mathbb{R})$ to which we can apply $T_f$. If $g$ is not compactly supported, $(T_g(x), \varphi(x + y))$ is not a test function and the previous expression does not make sense as it stands. However, we assumed that if $g$ is not compactly supported, $f$ is. $T_f$ can then act on any $C^\infty$ function because the support of $T_f$ effectively reduces the integral to a compact set (see Example 1.12).

We are now ready to generalize the construction to distributions. Recall

from Section 1.2.2 and Example 1.12 that the concept of support of a distri-

bution is well defined and that distributions with compact support can act

on any $C^\infty$ function. Given two distributions $T$ and $S$, one of them with compact support, we define the convolution $T * S \in \mathcal{D}'(\mathbb{R})$ as

\[ (T * S)(\varphi) = \big( T(y), (S(x), \varphi(x + y)) \big) . \tag{1.82} \]

One can show that (S(x), φ(x + y)) is a C∞ function of y and that it is

compactly supported if S is. If S has compact support, (S(x), φ(x+y)) is then

a test function and the previous definition makes sense for any distribution T .

If S is not compactly supported T will be and its action on (S(x), φ(x+ y))

is well defined even if the function is only C∞.

One can show that the convolution for distributions satisfies the proper-

ties (1.75)-(1.78).

Example 1.23. The delta function is an identity element for the convolution

product

δ ∗ T = T ∗ δ = T . (1.83)

With an obvious abuse of notation we have indeed

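\[ (\delta * T, \varphi) = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} dx\, dy\, \delta(x)\, T(y)\, \varphi(x + y) = \int_{-\infty}^{\infty} dy\, T(y)\, \varphi(y) = (T, \varphi) . \]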

1.4.5 Fourier Transform of Distributions

We can actually extend the Fourier Transform to tempered distributions.

As usual, we first look at the case of regular distributions. The regular

distribution Tg corresponding to the Fourier Transform of a function g ∈S(R) acts on an element ϕ ∈ S(R) as

(Tg, ϕ) =

∫ ∞−∞

dp g(p) ϕ(p) . (1.84)

By inserting the explicit form of the Fourier Transform of g, and doing formal

manipulations, one can also write

(Tg, ϕ) =1√2π

∫ ∞−∞

dx g(x)

∫ ∞−∞

dp ϕ(p) e−ipx , (1.85)


which can be interpreted as the action of the regular distribution Tg on the

Fourier Transform of ϕ. In formulae,

(Tg, ϕ) =

∫ ∞−∞

dp g(p) ϕ(p) =

∫ ∞−∞

dx g(x) ϕ(x) = (Tg, ϕ). (1.86)

A generalization of (1.86) leads to the definition of the Fourier Transfom of

a generic distribution T ∈ S ′(R)

T (ϕ) = T (ϕ) . (1.87)

This definition extends the Fourier Transform from S(R) to S ′(R). It is

not hard to prove that the extension is a linear, bijective and bicontinuous

operator from S ′(R) to S ′(R). Notice that we cannot define the Fourier

Transform for all distributions: the definition (1.87) would fail for T ∈ D′(R)

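since the Fourier Transform of a nonzero φ ∈ D(R) is never in D(R): it extends to an entire analytic function of p, which cannot vanish on an interval of the real line unless it vanishes identically.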
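As a numerical illustration of (1.86), the following Python sketch compares the two pairings $(T_{\hat g}, \phi)$ and $(T_g, \hat\phi)$ for a pair of Gaussians. The particular choice of $g$ and $\phi$, the helper names (`trapezoid`, `fourier_transform`, `pairing`) and the grid parameters are assumptions made only for this illustration. All integrals are truncated to $[-10, 10]$ and approximated with the trapezoidal rule, so the agreement is only up to numerical error.

```python
import cmath
import math

HALF_WIDTH = 10.0   # assumed truncation of the real line to [-10, 10]
N_POINTS = 401      # assumed number of grid points
STEP = 2.0 * HALF_WIDTH / (N_POINTS - 1)
GRID = [-HALF_WIDTH + k * STEP for k in range(N_POINTS)]


def trapezoid(values):
    """Composite trapezoidal rule for samples taken on GRID."""
    return STEP * (sum(values) - 0.5 * (values[0] + values[-1]))


def fourier_transform(f, p):
    """Approximate (1/sqrt(2 pi)) * integral of f(x) exp(-i p x) dx."""
    samples = [f(x) * cmath.exp(-1j * p * x) for x in GRID]
    return trapezoid(samples) / math.sqrt(2.0 * math.pi)


def pairing(f, h):
    """Approximate the integral of f(x) h(x) dx over the real line."""
    return trapezoid([f(x) * h(x) for x in GRID])


def g(x):
    """Illustrative function g in S(R)."""
    return math.exp(-0.5 * x * x)


def phi(p):
    """Illustrative test function in S(R), shifted so that it is not even."""
    return math.exp(-(p - 1.0) ** 2)


# Left-hand side of (1.86): the transform of g paired with phi.
lhs = pairing(lambda p: fourier_transform(g, p), phi)
# Right-hand side of (1.86): g paired with the transform of phi.
rhs = pairing(g, lambda x: fourier_transform(phi, x))

print("difference between the two pairings:", abs(lhs - rhs))
```

For these two Gaussians both sides can also be computed in closed form and equal $\sqrt{2\pi/3}\, e^{-1/3}$, which gives an independent check of the quadrature.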

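Example 1.24. The Fourier Transform of the Dirac delta function is a constant: $\hat\delta = \frac{1}{\sqrt{2\pi}}$. Indeed

$$ (\hat\delta, \varphi) = (\delta, \hat\varphi) = \hat\varphi(0) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dx\, \varphi(x) = \Big( \frac{1}{\sqrt{2\pi}}, \varphi \Big) . \tag{1.88} $$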

Notice that we could have obtained the same result by the formal manipulation

$$ \hat\delta(p) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} dx\, \delta(x)\, e^{-ipx} = \frac{1}{\sqrt{2\pi}} . \tag{1.89} $$

Consequently, the inversion theorem tells us that the Fourier Transform of the function 1 is $\mathcal{F}(1)(p) = \sqrt{2\pi}\, \delta(p)$. This can be interpreted as an integral representation of the delta function, namely

$$ \delta(x) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} dp\, e^{ipx} . \tag{1.90} $$

This integral obviously does not exist in the conventional sense; however, it makes sense in the distributional sense, namely when applied to a test function.
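One way to see the distributional meaning of (1.90) concretely is to cut the $p$ integral off at $|p| \le \Lambda$, which gives the ordinary function $\sin(\Lambda x)/(\pi x)$, and to apply it to a test function. The sketch below does this in Python for the Gaussian $\varphi(x) = e^{-x^2}$; the cutoff $\Lambda$ (called `cutoff`), the choice of test function, and the quadrature parameters are assumptions of this illustration. For this particular $\varphi$ the truncated pairing equals $\mathrm{erf}(\Lambda/2)$, which tends to $\varphi(0) = 1$ as $\Lambda \to \infty$.

```python
import math


def phi(x):
    """Illustrative test function (a Gaussian), with phi(0) = 1."""
    return math.exp(-x * x)


def truncated_kernel(x, cutoff):
    """(1 / 2 pi) * integral over |p| <= cutoff of exp(i p x) dp = sin(cutoff x) / (pi x)."""
    if x == 0.0:
        return cutoff / math.pi
    return math.sin(cutoff * x) / (math.pi * x)


def smeared_delta(cutoff, half_width=8.0, n=40001):
    """Trapezoidal approximation of the integral of truncated_kernel(x) * phi(x) dx."""
    h = 2.0 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        x = -half_width + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * truncated_kernel(x, cutoff) * phi(x)
    return total * h


for cutoff in (1.0, 2.0, 5.0, 10.0):
    # The pairing approaches phi(0) = 1 as the cutoff grows.
    print(cutoff, smeared_delta(cutoff), math.erf(cutoff / 2.0))
```

The kernel itself does not converge pointwise as the cutoff grows; only its action on the test function does, which is the content of the statement above.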

Example 1.25. Any locally integrable function with algebraic growth is

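in $\mathcal{S}'(\mathbb{R})$ and has a well defined Fourier Transform. Take for example a polynomial $P(x) = \sum_{k=0}^{n} a_k x^k$. We find

$$ \mathcal{F}(P)(p) = \sum_{k=0}^{n} a_k\, i^k\, \partial_p^k\, \mathcal{F}(1)(p) = \sqrt{2\pi} \sum_{k=0}^{n} a_k\, i^k\, \delta^{(k)}(p) , \tag{1.91} $$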


which is a distribution with support in the point p = 0. Vice-versa, the

Fourier Transform of a linear combination of δ(x) and its derivatives is a

polynomial. One can show more generally that the Fourier Transform of any

distribution with compact support is a C∞ function with algebraic growth.

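Example 1.26. Consider the Fourier Transform of the distribution $\frac{1}{x+i0}$. We can easily compute it with the residue theorem

$$ \lim_{\varepsilon \to 0} \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dx\, \frac{e^{-ixp}}{x + i\varepsilon} = -\sqrt{2\pi}\, i \lim_{\varepsilon \to 0} e^{-\varepsilon p}\, \theta(p) = -\sqrt{2\pi}\, i\, \theta(p) , \tag{1.92} $$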

where we close the contour in the upper (lower) half plane for p < 0 (p > 0).

The inverse Fourier Transform gives us a useful integral representation

$$ \frac{i}{x + i0} = \int_{0}^{\infty} e^{ixp}\, dp . \tag{1.93} $$

It is interesting to observe that this formula can be obtained as a formal limit for $\varepsilon \to 0$ of

$$ \frac{i}{x + i\varepsilon} = \int_{0}^{\infty} e^{ixp - \varepsilon p}\, dp , \tag{1.94} $$

where now the integral converges in the traditional sense for all $\varepsilon > 0$.
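Since the integral in (1.94) converges for $\varepsilon > 0$, it can be checked directly by quadrature. The following Python sketch compares a trapezoidal approximation of the right-hand side with the closed form $i/(x + i\varepsilon)$. The function names, the upper limit `p_max` and the number of grid points are assumptions of this illustration; `p_max` has to be much larger than $1/\varepsilon$ for the neglected tail $e^{-\varepsilon p_{\max}}$ to be small.

```python
import cmath


def damped_exponential_integral(x, eps, p_max=60.0, n=60001):
    """Trapezoidal approximation of the integral over [0, p_max] of exp(i x p - eps p) dp."""
    step = p_max / (n - 1)
    total = 0j
    for k in range(n):
        p = k * step
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * cmath.exp((1j * x - eps) * p)
    return total * step


def closed_form(x, eps):
    """Left-hand side of (1.94)."""
    return 1j / (x + 1j * eps)


x, eps = 2.0, 0.5
numeric = damped_exponential_integral(x, eps)
exact = closed_form(x, eps)
print("quadrature :", numeric)
print("closed form:", exact)
print("difference :", abs(numeric - exact))
```

As $\varepsilon$ decreases the damping weakens and `p_max` must grow accordingly; at $\varepsilon = 0$ the integral no longer converges as an ordinary integral and only the distributional statement (1.93) survives.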

Some of the properties of Fourier Transform of functions extend to the

Fourier Transform of distributions. For instance

$$ \mathcal{F}(T^{(n)})(p) = (ip)^n\, \mathcal{F}T(p) \tag{1.95} $$

$$ \mathcal{F}(x^n T)(p) = i^n \frac{d^n}{dp^n}\, \mathcal{F}T(p) \tag{1.96} $$

$$ \mathcal{F}(T_1 * T_2) = \sqrt{2\pi}\, \mathcal{F}T_1\, \mathcal{F}T_2 \tag{1.97} $$

The last property deserves some comments. First of all, we need to assume

that at least one of the two distributions T1 and T2 has compact support,

since, as we saw in Section 1.4.4, this is a necessary condition for the convolution

T1∗T2 to exist. Under this hypothesis, one can show that T1∗T2 is a tempered

distribution, so that the left hand side of (1.97) is well defined. Moreover,

as we already mentioned, the Fourier Transform of a tempered distribution

with compact support is actually a C∞ function with algebraic growth. It is

only this property that allows us to make sense of the right hand side of (1.97)

as the product of a tempered distribution with a C∞ function. Remember

that the product of two distributions is generically not defined.
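Properties (1.95) and (1.96) can be checked numerically in the simplest case, where $T$ is the regular distribution of a Schwartz function and $n = 1$. The Python sketch below does this for the Gaussian $f(x) = e^{-x^2/2}$, whose transform with the normalization used here is $e^{-p^2/2}$. The choice of $f$, the helper names, the truncation to $[-10, 10]$ and the finite-difference step are all assumptions of this illustration.

```python
import cmath
import math

HALF_WIDTH = 10.0   # assumed truncation of the real line
N_POINTS = 2001     # assumed number of grid points
STEP = 2.0 * HALF_WIDTH / (N_POINTS - 1)
GRID = [-HALF_WIDTH + k * STEP for k in range(N_POINTS)]


def fourier_transform(f, p):
    """Trapezoidal approximation of (1/sqrt(2 pi)) * integral of f(x) exp(-i p x) dx."""
    samples = [f(x) * cmath.exp(-1j * p * x) for x in GRID]
    integral = STEP * (sum(samples) - 0.5 * (samples[0] + samples[-1]))
    return integral / math.sqrt(2.0 * math.pi)


def f(x):
    """Illustrative Schwartz function."""
    return math.exp(-0.5 * x * x)


def f_prime(x):
    """Derivative of f, computed by hand."""
    return -x * math.exp(-0.5 * x * x)


def x_times_f(x):
    """The function x f(x)."""
    return x * f(x)


def derivative_in_p(transform, p, h=1e-4):
    """Central finite difference in p of a function of p."""
    return (transform(p + h) - transform(p - h)) / (2.0 * h)


p = 1.3
# (1.95) with n = 1: F(f')(p) should equal i p F(f)(p).
lhs_95 = fourier_transform(f_prime, p)
rhs_95 = 1j * p * fourier_transform(f, p)
# (1.96) with n = 1: F(x f)(p) should equal i d/dp F(f)(p).
lhs_96 = fourier_transform(x_times_f, p)
rhs_96 = 1j * derivative_in_p(lambda q: fourier_transform(f, q), p)

print("(1.95) mismatch:", abs(lhs_95 - rhs_95))
print("(1.96) mismatch:", abs(lhs_96 - rhs_96))
```

The sketch covers only regular distributions; for a general $T \in \mathcal{S}'(\mathbb{R})$ the same identities follow from the definition (1.87) by moving the derivative or the factor of $x$ onto the test function.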


1.5 Exercises

Exercise 1.1. Show that the k-th derivative of the delta function acts as $(\delta^{(k)}, \varphi) = (-1)^k \varphi^{(k)}(0)$.

Exercise 1.2. Define a $\mathcal{D}'(\mathbb{R}^2)$ distribution by $(f, \varphi) = \int_0^\infty \int_0^\infty \varphi(x, y)\, dx\, dy$. Compute $\partial^2 f / \partial x\, \partial y$.

Exercise 1.3. Show that $\frac{d^2 \ln|x|}{dx^2} = -\,\mathrm{fp}\, \frac{1}{x^2}$ in $\mathcal{D}'(\mathbb{R})$.

Exercise 1.4. Define the distribution $F = \log(x + i0)$ in $\mathcal{D}'(\mathbb{R})$ by $(F, \varphi) = \lim_{\varepsilon \to 0} \int_{-\infty}^{\infty} \log(x + i\varepsilon)\, \varphi(x)\, dx$.

1) Show that $\log(x + i0) = \log|x| + i\pi\, \theta(-x)$. Hint: analyze the argument of the logarithm in the complex plane.

2) Show that $\frac{d}{dx} \log(x + i0) = 1/(x + i0)$.

3) Check consistency of 1) and 2) with (1.35) and Example 1.22.

Exercise 1.5. Solve for $T$ the algebraic equation $P(x)\, T = Q(x)$ in $\mathcal{D}'(\mathbb{R})$, where

P and Q are real polynomials with distinct zeros.

Exercise 1.6. Solve for $T$ the differential equation $x^2\, dT/dx = 1$ in $\mathcal{D}'(\mathbb{R})$. Hint:

solve it first for functions T and add delta-function like ambiguities.

Exercise 1.7. Show that $T = \sum_{n=-\infty}^{\infty} \delta(x - n)$ is a tempered distribution.

1) Expand formally the periodic distribution T in Fourier series in [0, 2π].

2) Show that the result in 1) is equivalent to the Poisson resummation formula valid for functions in $\mathcal{S}(\mathbb{R})$: $\sum_{k=-\infty}^{\infty} \hat\varphi(k) = \sum_{n=-\infty}^{\infty} \varphi(n)$, where $\hat\varphi(p) = \int_{-\infty}^{\infty} \varphi(x)\, e^{-2\pi i p x}\, dx$ is the Fourier transform of $\varphi$.
