34
The exact modulus of the generalized Kurdyka-Lojasiewicz property Xianfu Wang * and Ziyuan Wang April 13, 2021 Abstract This work is twofold. On one hand, we introduce the generalized Kurdyka- Lojasiewicz property, a new concept that generalizes the classic Kurdyka- Lojasiewicz property by employing nonsmooth desingularizing functions. On the other hand, through intro- ducing the exact modulus of the generalized KL property, we provide an answer to the open question: “What is the optimal desingularizing function? ”, which fills a gap in the current literature. The exact modulus is designed to be the smallest among all possible desingularizing functions. Examples are also given to illustrate this pleasant property. In turn, by using the exact modulus, we found the sharpest upper bound for the total length of iterates generated by the celebrated Bolte-Sabach-Teboulle PALM algorithm. 2010 Mathematics Subject Classification: Primary 49J52, 26D10, 90C26; Secondary 26A51, 26B25. Keywords: Generalized Kurdyka- Lojasiewicz property, optimal desingularizing function, Bolte-Daniilidis-Ley-Mazet desingularizing function, proximal alternating linearized mini- mization, nonconvex optimization. * Mathematics, University of British Columbia, Kelowna, B.C. V1V 1V7, Canada. E-mail: [email protected]. Mathematics, University of British Columbia, Kelowna, B.C. V1V 1V7, Canada. E-mail: [email protected]. 1 arXiv:2008.13257v2 [math.OC] 11 Apr 2021

The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

The exact modulus of the generalizedKurdyka-Łojasiewicz property

Xianfu Wang∗and Ziyuan Wang†

April 13, 2021

Abstract

This work is twofold. On one hand, we introduce the generalized Kurdyka- Lojasiewiczproperty, a new concept that generalizes the classic Kurdyka- Lojasiewicz property byemploying nonsmooth desingularizing functions. On the other hand, through intro-ducing the exact modulus of the generalized KL property, we provide an answer tothe open question: “What is the optimal desingularizing function?”, which fills a gapin the current literature. The exact modulus is designed to be the smallest among allpossible desingularizing functions. Examples are also given to illustrate this pleasantproperty. In turn, by using the exact modulus, we found the sharpest upper bound forthe total length of iterates generated by the celebrated Bolte-Sabach-Teboulle PALMalgorithm.

2010 Mathematics Subject Classification: Primary 49J52, 26D10, 90C26; Secondary26A51, 26B25.

Keywords: Generalized Kurdyka- Lojasiewicz property, optimal desingularizing function,Bolte-Daniilidis-Ley-Mazet desingularizing function, proximal alternating linearized mini-mization, nonconvex optimization.

∗Mathematics, University of British Columbia, Kelowna, B.C. V1V 1V7, Canada. E-mail:[email protected].

†Mathematics, University of British Columbia, Kelowna, B.C. V1V 1V7, Canada. E-mail:[email protected].

1

arX

iv:2

008.

1325

7v2

[m

ath.

OC

] 1

1 A

pr 2

021

Page 2: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

1 Introduction

Throughout this paper,Rn is the standard Euclidean space

with inner product 〈x, y〉 = xTy and the Euclidean norm ‖x‖ =√〈x, x〉 for x, y ∈ Rn.

Denote by N the set of positive natural numbers, i.e., N = 1, 2, 3, . . .. The open ballcentered at x with radius r is denoted by B(x; r). The distance function of a subset K ⊆ Rn

is dist(·, K) : Rn → R = (−∞,∞],

x 7→ dist(x,K) = inf‖x− y‖ : y ∈ K,

where dist(x,K) ≡ ∞ if K = ∅. For f : Rn → R and r1, r2 ∈ [−∞,∞], we set [r1 < f <r2] = x ∈ Rn : r1 < f(x) < r2. For η ∈ (0,∞], denote by Kη the class of functionsϕ : [0, η)→ R+ satisfying: (i) ϕ : [0, η)→ R+ is concave and continuous with ϕ(0) = 0; (ii)ϕ is C1 on (0, η); (iii) ϕ′(t) > 0 for all t ∈ (0, η). Before stating the goal of this paper, let usrecall the definition of the Kurdyka- Lojasiewicz property.

Definition 1.1 Let f : Rn → R be proper and lsc. We say f has the Kurdyka- Lojasiewicz(KL) property at x ∈ dom ∂f , if there exist neighborhood U 3 x, η ∈ (0,∞] and a functionϕ ∈ Kη such that for all x ∈ U ∩ [0 < f − f(x) < η],

ϕ′(f(x)− f(x)

)· dist

(0, ∂f(x)

)≥ 1, (1)

where ∂f(x) denotes the limiting subdifferential of f at x, see Definition 2.1. The functionϕ is called a desingularizing function of f at x with respect to U and η. We say f is a KLfunction if it has the KL property at every x ∈ dom ∂f .

Originated from algebraic geometry, the KL property is a powerful regularity condition formany proximal-type algorithms, see, e.g., [1,2,8,11,19] and references therein. The pioneeringwork of Lojasiewicz [12] and Kurdyka [9] within the framework of differentiable functionslaid the foundation of this object. Their work was then extended to nonsmooth functions byBolte, Daniilidis, Lewis and Shiota, see [4, 5]. The term “KL property” appeared in Bolte,Daniilidis, Ley and Mazet’s work [6] for the first time, which characterizes the KL propertyand is a pillar of this area. In terms of algorithmic applications, the milestone work by Bolte,Sabach and Teboulle [8] showed that under the KL property and other mild assumptions, thesequence generated by the proximal alternating linearized minimization (PALM) algorithmconverges globally to a stationary point of a nonconvex and nonsmooth objective function.Furthermore, the trajectory of iterates has a finite length property, i.e.,

∑∞k=1 ‖zk+1 − zk‖ <

∞, where (zk)k∈N is a sequence generated by PALM. This pleasant convergence mechanismowing to the KL property has gained increasing attention, see, e.g., [2, 11, 14, 19, 20, 22] andreferences therein. These results, despite devoting to different algorithms, share a common

2

Page 3: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

theme: Employing the KL property to ensure that the sequence generated by the respectivealgorithm has a finite length property, see, e.g., [14, Lemma 3.5], [20, Theorem 3.1] and [2,Theorem 1].

This paper is devoted to answering the open question:

What is the optimal desingularizing function for the KL property?

When verifying the KL property of f at x, one needs to find a desingularizing functionϕ ∈ Kη for some η ∈ (0,∞] to “sharpen” the given function f around x. On one hand,desingularizing functions of f at x may be various. For example, both ϕ1(t) = arcsin(t −1) + π/2 and ϕ2(t) =

√t serve as desingularizing functions of the function f(x) = sin(x)

at x = −π/2 with U = (−3π/4,−π/4) and η = 1. On the other hand, the “optimality”of desingularizing function is still blurry from the current literature. Given all possibledesingularizing functions, an intuitive way to obtain the “optimal” one is to take theirinfimum (the supremum is always infinity). Nevertheless, the differentiability assumption ondesingularizing functions excludes their infimum from staying within the same class, sincethe infimum of differentiable functions may be nondifferentiable. Such observation naturallyleads to the aforementioned question, which, to the best of our knowledge, has receivedlittle attention. Bolte, Daniilidis, Ley and Mazet provided an integrability condition in theirfundamental work [6, Theorem 18], which is closely related to our pursuit, see Section 2.However, we shall see in Section 3.3 that their integrability condition fails to capture thesmallest desingularizing function, even for convex functions on the real line.

The main contributions of this paper are listed below:

• Definition 3.2 generalizes the KL property. Compared to the classic KL property, themain difference is that we allow the desingularizing function to be non-differentiable.

• Proposition 3.9 shows that the exact modulus of the generalized KL property, given inDefinition 3.6, is the optimal desingularizing function. This result provides an answerto the aforementioned question, which fills a gap in the current literature.

• Theorem 4.5 provides the sharpest upper bound of∑∞

k=1 ‖zk+1 − zk‖, where (zk)k∈N isa sequence generated by the PALM algorithm, which is an improvement of [8, Theorem1].

Unlike the classic theory, where most published articles emphasize on desingularizing func-tions of the form ϕ(t) = c · t1−θ for c > 0 and θ ∈ [0, 1), the exact modulus has variousforms depending on the given function. Proposition 3.11 gives an explicit formula for theoptimal desingularizing function of locally convex and C1 functions on the real line, inwhich case the exact modulus coincides with the desingularizing function obtained from the

3

Page 4: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

Bolte-Daniilidis-Ley-Mazet integrability condition. However, examples are given to showthat the exact modulus is indeed the smaller one, even for nondifferentiable convex func-tions on the real line, see Examples 3.19 and 3.21. More examples comparing these twoobjects are provided in Section 3.3, which indicate that the exact modulus is smaller andalways concave. It is worth mentioning that, by using our technique in Theorem 4.5, onemay improve other algorithms adopting the KL property assumption. Moreover, there isa by-product concerning intersections of convex functions. We show in Example 3.14 thatthere exists strictly increasing convex C2 functions f : R+ → R+ and g : R+ → R+ suchthat infx > 0 : f(x) = g(x) = 0.

The structure of this paper is as the following: Elements in variational analysis, classicalanalysis and facts of the classical KL property are collected in Section 2. The generalizedKL property, the exact modulus and their properties are studied in Section 3. Besides, var-ious examples and comparisons to the Bolte-Daniilidis-Ley-Mazet (BDLM) desingularizingfunctions are given. We revisit the celebrated PALM algorithm in Section 4. Finally, someconcluding remarks and directions for the future work are presented in Section 5.

2 Preliminaries

2.1 Elements of variational and classical analysis

In this paper, we will use frequently the following subgradients in the nonconvex setting,see, e.g., [13, 17].

Definition 2.1 Let f : Rn → R be a proper function. We say that

(i) v ∈ Rn is a Frechet subgradient of f at x ∈ dom f , denoted by v ∈ ∂f(x), if for everyx ∈ dom f ,

f(x) ≥ f(x) + 〈v, x− x〉+ o(‖x− x‖). (2)

(ii) v ∈ Rn is a limiting subgradient of f at x ∈ dom f , denoted by v ∈ ∂f(x), if

v ∈ v ∈ Rn : ∃xkf−→ x, ∃vk ∈ ∂f(xk), vk → v, (3)

where xkf−→ x ⇔ xk → x and f(xk) → f(x). Moreover, we set dom ∂f = x ∈ Rn :

∂f(x) 6= ∅. We say that x ∈ dom ∂f is a stationary point, if 0 ∈ ∂f(x).

We follow the definition of proximal mapping in [8].

4

Page 5: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

Definition 2.2 Let f : Rn → R be proper and lsc and let λ be a positive real. The proximalmapping Proxfλ is defined by

Proxfλ(x) = argminy∈Rn

f(y) +

λ

2‖x− y‖2

, ∀x ∈ Rn.

The following fact follows from [17, Theorem 1.25].

Fact 2.3 Let f : Rn → R be proper and lsc with infRn f > −∞. Then for λ ∈ (0,∞),Proxfλ(x) is nonempty for every x ∈ Rn. Moreover, for every v ∈ Rn and x ∈ Rn we have

Proxfλ

(x− 1

λv

)= argmin

y

〈y − x, v〉+

λ

2‖x− y‖2 + f(y)

.

Some well-known properties of convex functions on the real line are given in the followingfact.

Fact 2.4 ( [16, Section 24], [3, Chapter 17]) Let I ⊆ R be an open interval and let ϕ : I → Rbe convex. Then

(i) The side derivatives ϕ′−(t) and ϕ′+(t) are finite at every t ∈ I. Moreover, ϕ′−(t) andϕ′+(t) are increasing.

(ii) ϕ is differentiable except at countably many points of I, and ϕ(s) − ϕ(t) =∫ stϕ′−(x)dx =

∫ stϕ′+(x)dx for all s, t ∈ I.

(iii) Let t ∈ I. Then for every s ∈ I, ϕ(s)− ϕ(t) ≥ ϕ′−(t) · (s− t).

The following result concerns the absolute continuity of integrals.

Fact 2.5 [18, Theorem 6.79] Let f ∈ L1. Then for each ε > 0, there exists δ > 0 such that∫E

|f(x)|ds < ε,

whenever m(E) < δ, where E is a Lebesgue-measurable set and m(E) denotes its Lebesguemeasure.

2.2 The Kurdyka- Lojasiewicz property and known desingularizingfunctions

In this subsection, we collect several facts about the KL property and desingularizing func-tions, beginning with a classic result asserts that the KL property at non-stationary pointsis automatic, which means that the KL property is only interesting at stationary points.

5

Page 6: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

Fact 2.6 [10, Lemma 2.1] Let f : Rn → R be proper and lsc. Let x ∈ dom ∂f be a non-stationary point and let θ ∈ [0, 1). Then there exists ε ∈ (0, 1] such that f has the KL

property at x with respect to U = B(x; ε), η = ε and ϕ(t) = t1−θ

ε(1−θ) .

Another celebrated result states that semialgebraic functions have the KL property.

Definition 2.7 (i) A set E ⊆ Rn is called semialgebraic, if there exist finitely many poly-nomials gij, hij : Rn → R such that

E =

p⋃j=1

q⋂i=1

x ∈ Rn : gij(x) = 0 and hij(x) < 0.

(ii) A function f : Rn → R is called semialgebraic, if its graph

gph f = (x, y) ∈ Rn+1 : f(x) = y

is semialgebraic.

Fact 2.8 [5, Corollary 16] Let f : Rn → R be a proper and lsc function and let x ∈ dom ∂f .If f is semialgebraic, then it has the KL property at x with ϕ(t) = c · t1−θ for some c > 0and θ ∈ [0, 1).

Remark 2.9 Despite it is well-known that real-polynomials are semialgebraic thus have theKL property, only until very recently did Bolte, Nguyen, Peypouquet and Suter [7, Corollary9] provide an explicit formula for desingularizing functions of convex piecewise polynomials.

We now recall some desingularizing functions by Bolte, Daniilidis, Ley and Mazet [6],which will be compared with our main results later. For η ∈ (0,∞], denote by Dη the classof functions ϕ : [0, η) → R+ satisfying: (i) ϕ is continuous of [0, η) with ϕ(0) = 0; (ii) ϕis continuously differentiable on (0, η); (iii) ϕ is strictly increasing on [0, η). In the contextof [6], desingularizing functions belong to this larger class of functions Dη instead of Kη1.Recall that a proper and lsc function f : Rn → R is semiconvex, if there exists α > 0 suchthat f + α

2‖·‖2 is convex. The following integrability condition can be extracted from [6]:

Fact 2.10 [6, Lemma 45, Theorem 18] Let f : Rn → R be lsc and semiconvex. Let x ∈[f = 0] and assume that there exist r, ε > 0 such that

x ∈ B(x; ε) ∩ [0 < f ≤ r]⇒ 0 /∈ ∂f(x). (4)

1To the best of our knowledge, the concavity assumption on desingularizing functions appeared in [1] forthe first time, and has become standard since then.

6

Page 7: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

Suppose there exist r0 ∈ (0, r) and ε ∈ (0, ε) such that the function

u(r) =1

infx∈B(x;ε)∩[f=r]

dist(0, ∂f(x)), r ∈ (0, r0] (5)

is finite-valued and belongs to L1(0, r0). Then the following statements hold:

(i) There exists a continuous majorant u : (0, r0] → (0,∞) such that u ∈ L1(0, r0) andu(r) ≥ u(r) for all r ∈ (0, r0].

(ii) Define for t ∈ (0, r)

ϕ(t) =

∫ t

0

u(s)ds.

Then ϕ ∈ Dr0. For every x ∈ B(x; ε) ∩ [0 < f ≤ r0], one has

ϕ′ (f(x)) dist(0, ∂f(x)) ≥ 1.

Remark 2.11 Fact 2.10 is extracted from the implication (v) ⇒ (i) in the proof of [6,Theorem 18], where the above desingularizing function ϕ(t) was not stated explicitly intheir theorem statement. Since results in this paper are on Rn, we restrict Fact 2.10 to Rn,in which case Assumption (24) of [6, Theorem 18] becomes superfluous.

With additional assumption on convexity, the following fact asserts that the desingularizingfunction given by Fact 2.10 can be taken to be concave with an enlarged domain [0,∞).

Fact 2.12 [6, Lemma 45, Theorem 29] Let f : Rn → R be a proper, lsc and convex functionwith inf f = 0. Suppose that there exist r0 > 0 and ϕ ∈ Dr0 such that for all x ∈ [0 < f ≤ r0],

ϕ′(f(x)) dist(0, ∂f(x)) ≥ 1.

Then the following statements hold:

(i) Define for r ∈ (0,∞) the function

u(r) =1

infx∈[f=r]

dist(0, ∂f(x)).

Then u is finite-valued, decreasing and u ∈ L1(0, r0). Moreover, there exists a decreasingcontinuous function u ∈ L1(0, r0) such that u ≥ u.

(ii) Pick r ∈ (0, r0) and define for r ∈ (0,∞)

ϕ(r) =

∫ r0u(s)ds, if r ≤ r;∫ r

0u(s)ds+ u(r)(r − r), otherwise.

7

Page 8: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

Then ϕ ∈ D∞ is concave and for every x /∈ [f = 0],

ϕ′(f(x)) dist(0, ∂f(x)) ≥ 1.

Another way to determine the desingularizing function of the KL property for convexfunctions is through the following growth condition:

Fact 2.13 [6, Theorem 30] Let f : Rn → R be a proper lsc convex function with f(0) =min f . Let S ⊆ Rn. Assume that there exists a function m : R+ → R+ that is continuous,increasing, m(0) = 0, f ≥ m (dist(·, argmin f)) on S ∩ dom f and

∃ρ > 0,

∫ ρ

0

m−1(s)

sds <∞.

Then for all x ∈ S\ argmin f ,

ϕ′(f(x)) dist (0, ∂f(x)) ≥ 1,

where for t ∈ (0, ρ),

ϕ(t) =

∫ t

0

m−1(s)

sds.

Remark 2.14 Facts 2.10, 2.12 and 2.13 fail to capture the smallest concave desingularizingfunction, even for convex functions on the real line, see Section 3.3.

The following fact is a special case of Fact 2.13, which emphasizes on desingularizingfunctions of the form ϕ(t) = c · t1−θ for c > 0 and θ ∈ [0, 1).

Fact 2.15 [8, Example 5] Let f : Rn → R be a proper, lsc and convex function. Letx ∈ argmin f 6= ∅. Assume that f satisfies the following growth condition: There existneighborhood U 3 x, η > 0, c > 0 and r ≥ 1 such that for x ∈ U ∩ [min f < f < min f + η]

f(x) ≥ f(x) + c · dist(x, argmin f)r. (6)

Then f has the KL property at x with respect to U , η and ϕ(t) = rc−1/rt1/r.

Example 2.16 Let f1(x) = |x|p, f2(x) = − ln(1 − |x|p) and f3(x) = tan(|x|p) for p ≥ 1.For each i ∈ 1, 2, 3, fi(x) satisfies the KL property at x = 0 with respect to U = dom fi,η =∞ and ϕ(t) = p · t1/p.

Proof. Note that (− ln(1− t))′ = 1/(1− t) ≥ 1 for 0 ≤ t < 1 and (tan(t))′ = 1/ cos2(t) ≥ 1for t ∈ [0, π/2). Hence for i ∈ 1, 2, 3 we have fi(x) ≥ |x|p on dom fi. Furthermore one hasfor x ∈ R ∩ [0 < fi − fi(x) <∞]

fi(x) ≥ fi(x) + |x|p = dist(x, argmin fi)p,

8

Page 9: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

where the last equality holds because fi has unique minimizer x = 0 for each i ∈ 1, 2, 3.Hence by applying Fact 2.15 we conclude that fi has the KL property with respect to U = R,η =∞ and ϕ(t) = rc−1/rt1/r with c = 1 and r = p.

3 The generalized KL property and its exact modulus

In this section, we provide an answer to the open question: “What is the optimal desingu-larizing function for the KL property?”. To this end, the generalized Kurdyka- Lojasiewiczproperty and its exact modulus are introduced. We will show that the exact modulus is theoptimal desingularizing function, in the sense that it is the smallest among all desingularizingfunctions.

3.1 The generalized KL property

For η ∈ (0,∞], denote by Φη the class of functions ϕ : [0, η) → R+ satisfying the followingconditions: (i) ϕ(t) is right-continuous at t = 0 with ϕ(0) = 0; (ii) ϕ is concave and strictlyincreasing on [0, η). Recall that the left derivative of ϕ : [0,∞)→ R at t ∈ (0,∞) is definedby

ϕ′−(t) = lims→t−

ϕ(s)− ϕ(t)

s− t.

Some useful properties of ϕ ∈ Φη are collected below.

Lemma 3.1 For η ∈ (0,∞] and ϕ ∈ Φη, the following assertions hold:

(i) Let t > 0. Then ϕ(t) = limu→0+∫ tuϕ′−(s)ds =

∫ t0ϕ′−(s)ds.

(ii) The function t 7→ ϕ′−(t) is decreasing and ϕ′−(t) > 0 for t ∈ (0, η).

(iii) For 0 ≤ s < t < η, ϕ′−(t) ≤ ϕ(t)−ϕ(s)t−s .

Proof. (i) Invoking Fact 2.4(ii) yields

ϕ(t) = limu→0+

(ϕ(t)− ϕ(u)

)= lim

u→0+

∫ t

u

ϕ′−(s)ds <∞,

where the first equality holds because ϕ is right-continuous at 0 with ϕ(0) = 0. Let (un)n∈Nbe a decreasing sequence with u1 < t such that un → 0+ as n → ∞. For each n, define

9

Page 10: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

hn : (0, t]→ R+ by hn(s) = ϕ′−(s) if s ∈ (un, t] and hn(s) = 0 otherwise. Then the sequence(hn)n∈N satisfies: (a) hn ≤ hn+1 for every n ∈ N; (b) hn(s) → ϕ′−(s) pointwise on (0, t); (c)

The integral∫ t

0hn(s)ds =

∫ tunϕ′−(s)ds = ϕ(t) − ϕ(un) ≤ ϕ(t) − ϕ(0) < ∞ for every n ∈ N.

Hence the monotone convergence theorem implies that

limu→0+

∫ t

u

ϕ′−(s)ds = limn→∞

∫ t

un

ϕ′−(s)ds = limn→∞

∫ t

0

hn(s)ds =

∫ t

0

ϕ′−(s)ds.

(ii) According to Fact 2.4(i), the function t 7→ ϕ′−(t) is decreasing. Suppose that ϕ′−(t0) = 0for some t0 ∈ (0, η). Then by the monotonicity of ϕ′− and (i), we would have ϕ(t)− ϕ(t0) =∫ tt0ϕ′−(s)ds ≤ (t − t0)ϕ′−(t0) = 0 for t > t0, which contradicts to the assumption that ϕ is

strictly increasing.

(iii) For 0 < s < t < η, applying Fact 2.4(iii) to the convex function −ϕ yields that−ϕ(s) +ϕ(t) ≥ −ϕ′−(t)(s− t)⇔ ϕ′−(t) ≤

(ϕ(t)−ϕ(s)

)/(t− s). The desired inequality then

follows from the right-continuity of ϕ at 0.

Definition 3.2 Let f : Rn → R be proper and lsc. Let x ∈ dom ∂f and µ ∈ R, and letV ⊆ dom ∂f be a nonempty subset.

(i) We say that f has the generalized Kurdyka- Lojasiewicz property (g-KL) at x ∈ dom ∂f ,if there exist neighborhood U 3 x, η ∈ (0,∞] and ϕ ∈ Φη, such that for all x ∈ U ∩ [0 <f − f(x) < η],

ϕ′−(f(x)− f(x)

)· dist

(0, ∂f(x)

)≥ 1. (7)

(ii) Suppose that f(x) = µ on V . We say f has the uniform generalized Kurdyka- Lojasiewicz property on V , if there exist U ⊃ V , η ∈ (0,∞] and ϕ ∈ Φη such that forevery x ∈ U ∩ [0 < f − µ < η],

ϕ′−(f(x)− µ

)· dist

(0, ∂f(x)

)≥ 1. (8)

Remark 3.3 (i) It is easy to see that the uniform generalized KL property on V reducesto the generalized KL property at x if V = x. This uniformized notion will be used inSection 4.

(ii) The KL property (Definition 1.1) implies the generalized KL property because Kη ⊆Φη. However, the generalized notion allows desingularizing functions to be non-differentiable.

In the rest of this subsection, we work towards generalizing a result by Bolte, Sabachand Teboulle [8, Lemma 6], whose proof we will follow. For nonempty subset Ω ⊆ Rn andε ∈ (0,∞], define Ωε = x ∈ Rn : dist(x,Ω) < ε. Let us recall the Lebesgue numberlemma [15, Theorem 55].

10

Page 11: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

Lemma 3.4 Let Ω ⊆ Rn be a nonempty compact subset. Suppose that Uipi=1 is a finiteopen cover of Ω. Then there exists ε > 0, which is called the Lebesgue number of Ω, suchthat

Ω ⊆ Ωε ⊆p⋃i=1

Ui.

Proposition 3.5 below connects the generalized KL property to its uniform counterpart,generalizes [8, Lemma 6], and will play a key role in Section 4.

Proposition 3.5 Let f : Rn → R be proper lsc and let µ ∈ R. Let Ω ⊆ dom ∂f bea nonempty compact set on which f(x) = µ for all x ∈ Ω. Suppose that f satisfies thegeneralized KL property at each x ∈ Ω. Then there exist ε > 0, η ∈ (0,∞] and ϕ(t) ∈ Φη

such that f has the uniform generalized KL property on Ω with respect to U = Ωε, η and ϕ.

Proof. For each x ∈ Ω, there exist ε = ε(x) > 0, η = η(x) ∈ (0,∞] and ϕ(t) = ϕx(t) ∈ Φη

such that for y ∈ B(x; ε) ∩ [0 < f − f(x) < η],

ϕ′−(f(y)− f(x)

)· dist

(0, ∂f(y)

)≥ 1.

Note that Ω ⊆⋃x∈Ω B(x; ε). By the compactness, there exist x1, . . . , xp ∈ Ω such that

Ω ⊆⋃pi=1 B(xi; εi). Moreover, for each i and x ∈ B(xi; εi) ∩ [0 < f − f(xi) < ηi] =

B(xi; εi) ∩ [0 < f − µ < ηi], one has

(ϕi)′−(f(x)− µ

)· dist

(0, ∂f(x)

)≥ 1. (9)

Define ϕ(t) =∑p

i=1 ϕi(t) and η = min1≤i≤p ηi. It is easy to see that ϕ belongs to Φη. Byusing Lemma 3.4, there exists ε > 0 such that Ω ⊆ Ωε ⊆

⋃pi=1 B(xi; εi), which by the fact

that η ≤ ηi for every i further implies that

x ∈ Ωε ∩ [0 < f − µ < η]⇒ ∃i0, s.t., x ∈ B(xi0 ; εi0) ∩ [0 < f − µ < ηi0 ].

Hence for every x ∈ Ωε ∩ [0 < f − µ < η], one has

ϕ′−(f(x)− µ

)· dist

(0, ∂f(x)

)≥ (ϕi0)

′−(f(x)− µ

)· dist

(0, ∂f(x)

)≥ 1,

where the first inequality holds because (ϕi)′−(t) > 0, see Lemma 3.1, and the last one is

implied by (9).

3.2 The exact modulus of the generalized KL property

Definition 3.6 Let f : Rn → R be proper and lsc. Let x ∈ dom ∂f and let U ⊆ dom ∂f bea neighborhood of x. Let η ∈ (0,∞]. Furthermore, define h : (0, η)→ R by

h(s) = sup

dist−1(0, ∂f(x)

): x ∈ U ∩ [0 < f − f(x) < η], s ≤ f(x)− f(x)

.

11

Page 12: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

Suppose that h(s) <∞ for s ∈ (0, η). The exact modulus of the generalized KL property off at x with respect to U and η is the function ϕ : [0, η)→ R+,

t 7→∫ t

0

h(s)ds, ∀t ∈ (0, η), (10)

and ϕ(0) = 0. If U ∩ [0 < f − f(x) < η] = ∅ for given U 3 x and η > 0, then we set theexact modulus with respect to U and η to be ϕ(t) ≡ 0.

Some comments on the definition of the exact modulus ϕ are in order.

Remark 3.7 (i) The exact modulus may look similar to the integrability condition (5) atthe first glance, however there are two major differences. On one hand, the exact modulusis always concave, see Proposition 3.9. In contrast, (5) only ensures concavity when f isconvex (recall Fact 2.12). On the other hand, we shall see in Proposition 3.9 that the exactmodulus is indeed the infimum of all possible desingularizing functions. A more detaileddiscussion is carried out in Section 3.3.

(ii) Note that lims→0+ h(s) could be infinity, in which case the function ϕ(t) represents alimit of Riemann or Lebesgue integrals. For instance, let f(x) = x2 and consider the exactmodulus of the generalized KL property of f at 0 with respect to U = R and η =∞. Thenwe have dist

(0, ∂f(x)

)= |f ′(x)| = |2x|, and

h(s) = sup|2x|−1 : x ∈ R ∩ [0 < f <∞], s ≤ x2

= sup

|2x|−1 : |x| ≥

√s

= 1/(2√s).

Hence lims→0+ h(s) =∞.

(iii) The assumption that h(s) <∞ for s ∈ (0, η) is necessary. For example, consider theexact modulus of the generalized KL property of the function f(x) = 1 − e−|x| at 0. Thenone has for x 6= 0, dist−1

(0, ∂f(x)

)= e|x|. Let U = R and η1 = 1. Then

h1(s) = hU,η1(s) = sup

dist−1(0, ∂f(x)

): R ∩ [0 < f < 1], s ≤ f(x)

= sup

e|x| : |x| > − ln(1− s)

=∞.

This kind of pathological behavior can be avoided by shrinking the set U∩[0 < f−f(x) < η].Let η2 ∈ (0, 1). Consequently we have

h2(s) = hU,η2(s) = sup

dist−1(0, ∂f(x)) : x ∈ R ∩ [0 < f < η2], s ≤ f(x)

= supe|x| : − ln(1− s) ≤ |x| < − ln(1− η2) =1

1− η2

.

The exact modulus ϕ is designed to be the optimal desingularizing function. To show this,a lemma helps.

12

Page 13: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

Lemma 3.8 Let η ∈ (0,∞] and let h : (0, η)→ R+ be a positive-valued decreasing function.Define ϕ(t) =

∫ t0h(s)ds for t ∈ (0, η) and set ϕ(0) = 0. Suppose that ϕ(t) <∞ for t ∈ (0, η).

Then ϕ is a strictly increasing concave function on [0, η) with

ϕ′−(t) ≥ h(t)

for t ∈ (0, η), and right-continuous at 0. If in addition h is a continuous function, then ϕ isC1 on (0, η).

Proof. Let 0 < t0 < t1 < η. Then ϕ(t1) − ϕ(t0) =∫ t1t0h(s)ds ≥ (t1 − t0) · h(t1) > 0, which

means ϕ is strictly increasing. Applying Fact 2.5, one concludes that ϕ(t) → ϕ(0) = 0 ast→ 0+. The concavity of ϕ and the inequality ϕ′−(t) ≥ h(t) follow from a similar argumentas in [16, Theorem 24.2]. If in addition h is continuous, then by applying the fundamentaltheorem of calculus, one concludes that ϕ is C1 on (0, η).

Proposition 3.9 Let f : Rn → R be proper lsc and let x ∈ dom ∂f . Let U be a nonemptyneighborhood of x and η ∈ (0,∞]. Let ϕ ∈ Φη and suppose that f has the generalized KLproperty at x with respect to U , η and ϕ. Then the exact modulus of the generalized KLproperty of f at x with respect to U and η, denoted by ϕ, is well-defined and satisfies

ϕ(t) ≤ ϕ(t), ∀t ∈ [0, η).

Moreover, the function f has the generalized KL property at x with respect to U , η and ϕ.Furthermore, the exact modulus ϕ satisfies

ϕ = infϕ ∈ Φη : ϕ is a desingularizing function of f at x with respect to U and η

.

Proof. Let us show first that ϕ(t) ≤ ϕ(t) on [0, η), from which the well-definedness of ϕreadily follows. If U ∩ [0 < f − f(x) < η] = ∅, then by our convention ϕ(t) = 0 ≤ ϕ(t)for every t ∈ [0, η). Therefore we proceed with assuming U ∩ [0 < f − f(x) < η] 6= ∅. Byassumption, one has for x ∈ U ∩ [0 < f − f(x) < η],

ϕ′−(f(x)− f(x)

)· dist

(0, ∂f(x)

)≥ 1.

which guarantees that dist(0, ∂f(x)

)> 0. Fix s ∈ (0, η) and recall from Lemma 3.1(ii) that

ϕ′−(t) is decreasing. Then for x ∈ U ∩ [0 < f − f(x) < η] with s ≤ f(x)− f(x) we have

dist−1(0, ∂f(x)

)≤ ϕ′−

(f(x)− f(x)

)≤ ϕ′−(s).

Taking the supremum over all x ∈ U ∩ [0 < f − f(x) < η] satisfying s ≤ f(x)− f(x) yields

h(s) ≤ ϕ′−(s),

13

Page 14: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

where h(s) = sup

dist−1(0, ∂f(x)

): x ∈ U ∩ [0 < f − f(x) < η], s ≤ f(x) − f(x)

. If

lims→0+ h(s) =∞, then one needs to treat ϕ(t) as an improper integral. For t ∈ (0, η), onehas

ϕ(t) = limu→0+

∫ t

u

h(s)ds ≤ limu→0+

∫ t

u

ϕ′−(s)ds = ϕ(t) <∞,

where the last equality follows from Lemma 3.1. If lims→0+ h(s) < ∞, then the aboveargument still applies.

Recall that dist(0, ∂f(x)

)> 0 for every x ∈ U ∩ [0 < f − f(x) < η]. Hence h(s) is

positive-valued. Take s1, s2 ∈ (0, η) with s1 ≤ s2. Then for x ∈ U ∩ [0 < f − f(x) < η], onehas

s2 ≤ f(x)− f(x)⇒ s1 ≤ f(x)− f(x),

implying that h(s2) ≤ h(s1). Therefore h(s) is decreasing. Invoking Lemma 3.8, one con-cludes that ϕ ∈ Φη and ϕ′−(t) ≥ h(t) for every t ∈ (0, η).

Let t ∈ (0, η). Then for x ∈ U ∩ [0 < f − f(x) < η] with t = f(x)− f(x) we have

ϕ′−(f(x)− f(x)

)≥ h(t) ≥ dist−1

(0, ∂f(x)

),

where the last inequality is implied by the definition of h(s), from which the generalized KLproperty readily follows because t is arbitrary.

Recall that ϕ is an arbitrary desingularizing function of f at x with respect to U and η,and ϕ(t) ≤ ϕ(t) for all t ∈ [0, η). Hence one has

ϕ ≤ infϕ ∈ Φη : ϕ is a desingularizing function of f at x with respect to U and η

.

On the other hand, the converse inequality holds as ϕ is a desingularizing function of f atx with respect to U and η.

Our next set of examples shows that the optimal desingularizing function for the gener-alized KL property is not necessarily differentiable, which justifies the nonsmooth extensionof desingularizing functions in Definition 3.2.

Example 3.10 The following statements are true:

(i) Let ρ > 0. Consider the function given by

f(x) =

2ρ|x| − 3ρ2/2, if |x| > ρ;

|x|2/2, if |x| ≤ ρ.

Then the function

ϕ1(t) =

√2t, if 0 ≤ t ≤ ρ2/2;

t/(2ρ) + 3ρ/4, if t > ρ2/2,

14

Page 15: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

is the exact modulus of the generalized KL property of f at x = 0 with respect toU = R and η =∞.

(ii) Consider the following function

g(x) =

1− e−|x|, if |x| ≤ 1;

(1− e−1)|x|, if |x| > 1.

Then the function

ϕ2(t) =

e · t, if t < 1− e−1;ee−1· t+ e− 2, if t ≥ 1− e−1,

is the exact modulus of generalized KL property of g at x = 0 with respect to U = Rand η =∞.

Proof. (i) Note that for x 6= 0, one has

dist(0, ∂f(x)

)=

|x|, if 0 < |x| ≤ ρ;

2ρ, if |x| > ρ.⇒ dist−1

(0, ∂f(x)

)=

1/|x|, if 0 < |x| ≤ ρ;

1/2ρ, if |x| > ρ.

It follows that for s ∈ (0, ρ2/2],

h1(s) = sup

dist−1(0, ∂f(x)

): x ∈ R ∩ [0 < f <∞], s ≤ f(x)

= sup

dist−1

(0, ∂f(x)

): |x| ≥

√2s

= 1/√

2s,

and for s > ρ2/2

h1(s) = sup

dist−1(0, ∂f(x)

): x ∈ R ∩ [0 < f <∞], s ≤ f(x)

= sup

dist−1

(0, ∂f(x)

): x 6= 0, |x| ≥ s/(2ρ) + 3ρ/4

= 1/(2ρ).

Hence ϕ(t) =∫ t

0h1(s)ds =

√2t for t ≤ ρ2/2 and ϕ1(t) =

∫ tρ2/2

h1(s)ds +∫ ρ2/2

0h1(s)ds =

t/(2ρ)− ρ/4 + ρ = t/(2ρ) + 3ρ/4.

(ii) For nonzero x, we have

dist(0, ∂g(x)

)=

e−1e, if |x| > 1;

e−|x|, if 0 < |x| ≤ 1.⇒ dist−1

(0, ∂g(x)

)=

ee−1

, if |x| > 1;

e|x|, if 0 < |x| ≤ 1.

Hence for s ∈ (0, 1− e−1],

h2(s) = sup

dist−1(0, ∂g(x)

): x ∈ R ∩ [0 < g <∞], s ≤ g(x)

= sup

dist−1

(0, ∂g(x)

): x 6= 0, |x| ≥ − ln(1− s)

= e,

15

Page 16: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

while for s ∈ (1− e−1,∞),

h2(s) = sup

dist−1(0, ∂g(x)

): x ∈ R ∩ [0 < g <∞], s ≤ g(x)

= sup

dist−1

(0, ∂g(x)

): x 6= 0, x ≥ e

e− 1s

=e

e− 1.

Then ϕ2(t) =∫ t

0eds = e · t for t ∈ (0, 1 − e−1] and ϕ2(t) = e · (1 − e−1) +

∫ t1−e−1

ee−1

ds =ee−1· t+ e− 2 for t ∈ (1− e−1,∞).

It is difficult to compute directly the exact modulus of the generalized KL for multi-variablefunctions, due to its complicated definition. However, on the real line, we have the followingpleasing formula.

Proposition 3.11 Let f : R → R be proper and lsc. Let x be a stationary point. Supposethat there exists an interval (a, b) ⊆ int dom f , where −∞ ≤ a < b ≤ ∞, on which f isconvex on (a, b) and C1 on (a, b)\x. Set η = min

f(a) − f(x), f(b) − f(x)

, f1(x) =

f(x+ x)− f(x) for x ∈ (a− x, 0] and f2(x) = f(x+ x)− f(x) for [0, b− x). Furthermore,define ϕ : [0, η)→ R+,

t 7→∫ t

0

max

(−f−11 )′(s), (f−1

2 )′(s)ds, ∀t ∈ (0, η) (11)

and ϕ(0) = 0. Then ϕ(t) is the exact modulus of the KL property at x with respect toU = (a, b) and η. Note that we set f(x) = ∞ if x = ±∞ and (f−1

i )′ ≡ 0 if f−1i does not

exist.

Proof. Replacing f(x) by g(x) = f(x + x) − f(x) if necessary, we assume without loss ofgenerality that x = 0 and f(x) = 0. Then by the assumption that x = 0 is a stationarypoint, we have 0 ∈ ∂f(0) = [f ′−(0), f ′+(0)], meaning that f ′−(0) ≤ 0 ≤ f ′+(0). We learn fromFact 2.4 that f ′−(x) and f ′+(x) are increasing functions. Combining the C1 assumption, wehave f ′(x) = f ′−(x) ≤ f ′−(0) ≤ 0 on (a, 0) and f ′(x) = f ′+(x) ≥ f ′+(0) ≥ 0 on (0, b). Henceone gets that for x ∈ (a, b)\0

dist(0, ∂f(x)

)= |f ′(x)| =

−f ′(x) = −f ′1(x), if x ∈ (a, 0);

f ′(x) = f ′2(x), if x ∈ (0, b),

meaning that the function x 7→ dist(0, ∂f(x)

)is decreasing on (a, 0) and increasing on (0, b).

Now we work towards showing that h(s) = max

(−f−11 )′(s), (f−1

2 )′(s)

, where h(s) is thefunction given in Definition 3.6. Recall that f ′(x) is increasing on (a, b)\0 with f ′(x) ≤ 0on (a, 0) and f ′(x) ≥ 0 on (0, b). Shrinking the interval (a, b) if necessary, we only need toconsider the following cases:

16

Page 17: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

Case 1: Consider first the case where f ′1(x) < 0 for x ∈ (a, 0) and f ′2(x) > 0 for x ∈ (0, b).Then both f1 and f2 are invertible and we have

dist−1(0, ∂f(x)

)=

−1/f ′1(x), if a < x < 0;

1/f ′2(x), if 0 < x < b.

Fix s ∈ (0, η). For x ∈ (a, 0), on which f1 is decreasing, we have

s ≤ f(x) = f1(x)⇔ f−11 (s) ≥ x. (12)

Similarly for x ∈ (0, b) we have

s ≤ f(x) = f2(x)⇔ f−12 (s) ≤ x. (13)

Hence one concludes that for x ∈ (a, b),

s ≤ f(x)⇔ x ∈ (a, f−11 (s)] ∪ [f−1

2 (s), b).

On the other hand, we have 0 < f(x) < η ⇔ x ∈ (f−11 (η), f−1

2 (η))\0, where f−11 (η) > a

and f−12 (η) < b, which means (a, b) ∩ [0 < f < η] = (f−1

1 (η), f−12 (η))\0.

Altogether, we conclude that the function h : (0, η)→ R given in Definition 3.6 satisfies

h(s) = sup

dist−1(0, ∂f(x)

): x ∈ (a, b) ∩ [0 < f < η], s ≤ f(x)

= sup

dist−1

(0, ∂f(x)

): x ∈ (f−1

1 (η), f−11 (s)] ∪ [f−1

2 (s), f−12 (η))

= max

− 1/(f ′1)(f−1

1 (s)), 1/(f ′2)(f−12 (s))

= max

(−f−1

1 )′(s), (f−12 )′(s)

,

where the third equality is implied the fact that x 7→ dist−1(0, ∂f(x)

)is increasing on (a, 0)

and decreasing on (0, b).

Case 2: If f ′(x) = 0 on (a, 0) and f ′(x) > 0 on (0, b), then f2 is invertible and (f−12 )′(s) =

1/f ′(f−12 (s)) > 0 on (0, η). Note that by our convention (f−1

1 )′(s) is set to be zero for all s.Hence it suffices to prove h(s) = (f−1

2 )′(s). For s ∈ (0, η)

h(s) = sup

dist−1(0, ∂f(x)

): x ∈ U ∩ [0 < f < η], s ≤ f(x)

= sup

1/f ′2(x) : x ∈ [f−1

2 (s), b)

= 1/f ′2(f−12 (s)) = (f−1

2 )′(s),

where the second equality is implied by (13), U ∩ [0 < f < η] = (0, b) and the fact that1/f ′2(x) is decreasing on (0, b).

Case 3: If f ′(x) < 0 on (a, 0) and f ′(x) = 0 on (0, b), then f1 is invertible. A similarargument proves that h(s) = (−f−1

1 )′(s).

17

Page 18: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

Case 4: Now we consider the case where f ′(x) = 0 on (a, b), in which case U ∩ [0 <f < η] = ∅ and the corresponding exact modulus is ϕ ≡ 0 by our convention. Moreover,(−f−1

1 )′(s) and (f−12 )′(s) are set to be constant 0. Hence we have ϕ(t) =

∫ t0

0 ds = 0, whichcompletes the proof.

Remark 3.12 In the setting of Proposition 3.11, it is easy to see that the exact modulussatisfies

ϕ(t) =

∫ t

0

ds

inf(a,b)∩[f=s]

dist(0, ∂f(x)),

which means that the desingularizing function given by Fact 2.10 coincides with the exactmodulus. However, this is not true without the C1 assumption in Proposition 3.11, seeExample 3.19 and Example 3.21.

Combining Fact 2.6 and Proposition 3.11, we immediately obtain:

Corollary 3.13 Let f : (a, b) → R be a differentiable convex function. Then f is a KLfunction, i.e., f satisfies the KL-property at every point of (a, b).

When proving the above Proposition 3.11, our initial attempt is to take η > 0 sufficientlysmall so that t 7→ max−f−1

1 (t), f−12 (t) becomes either −f−1

1 or f−12 on [0, η). This attempt

leads to a question of independent interest: Let f and g be two smooth strictly increasingconvex functions defined on [0,∞) with f(0) = g(0).

Is infx > 0 : f(x) = g(x) always positive?

The answer is negative, as our next example shows.

Example 3.14 There exist strictly increasing convex C2 functions f, g : R+ → R+ withf(0) = g(0) = 0 such that infx > 0 : f(x) = g(x) = 0. To be specific, let h : R → R begiven by

h(x) =

sin(

1x

)e−

1x2 , if x 6= 0;

0, if x = 0.

Define h′′+(s) = maxh′′(s), 0 and h′′−(s) = −minh′′(s), 0. Furthermore, set f1(x) =∫ x0h′′−(t)dt and g1(x) =

∫ x0h′′+(t)dt. Then the functions f, g : R+ → R+ given by

g(x) =

∫ x

0

g1(t)dt, f(x) =

∫ x

0

f1(t)dt

are strictly increasing convex and C2 functions with f(0) = g(0) = 0, and satisfy g(x) −f(x) = h(x). Hence infx > 0 : f(x) = g(x) = infx > 0 : h(x) = 0 = 0.

18

Page 19: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

Proof. Note that h(x) ∈ C∞. We now show that h is a difference of convex functions.Observe from the definition that h′′+(s) and h′′−(s) are positive-valued and continuous. Thenby the fundamental theorem of calculus, g1(x) and f1(x) are both increasing C1 functionswith f ′1(x) = h′′−(x) and g′1(x) = h′′+(x). Since h′′(x) = h′′+(x)− h′′−(x), one has

g1(x)− f1(x) =

∫ x

0

h′′(s)ds = h′(x)− h′(0) = h′(x).

Suppose that there exists x0 > 0 such that g1(x0) = 0. Then g1(x) = 0 for x ∈ (0, x0), whichimplies h′′(x) ≤ 0 on (0, x0). This is impossible because h′′(x) oscillates between positiveand negative infinitely many times when x → 0+. Hence g1 is strictly positive. A similarargument shows that f1 is also strictly positive. Then applying the fundamental theorem ofcalculus, one concludes that f and g are strictly increasing C2 functions with f ′(x) = f1(x)and g′(x) = g1(x). Functions f and g are both convex because f ′′ = f ′1 = h′′− ≥ 0 andg′′ = g′1 = h′′+ ≥ 0. Furthermore, we have

g(x)− f(x) =

∫ x

0

h′(s)ds = h(x)− h(0) = h(x).

Hence infx > 0 : f(x) = g(x) = infx > 0 : h(x) = 0 = inf 1nπ

: n ∈ N = 0.

We end this section by showing that there exists f : R → R, which has the generalizedKL property at 0 but desingularizing functions cannot have the form ϕ(t) = c · t1−θ for c > 0and θ ∈ [0, 1).

Proposition 3.15 Define f : R → R by f(x) = e−1/x2 for x 6= 0 and f(0) = 0. Then thefollowing statements hold:

(i) The exact modulus of the generalized KL property of f at 0 with respect to U =(−√

2/3,√

2/3) and η = e−3/2 is ϕ(t) =√−1/ ln(t) for t > 0 and ϕ(0) = 0.

(ii) For every c > 0 and θ ∈ [0, 1), the function ϕ(t) = c · t1−θ cannot be a desingularizingfunction of the generalized KL property of f at 0 with respect to any neighborhoodU 3 0 and η ∈ (0,∞].

Proof. (i) Note that f is convex on [−√

2/3,√

2/3]. Define f1(x) = f(x) for x ∈ (−√

2/3, 0]

and f2(x) = f(x) for x ∈ [0,√

2/3). Then we have h(s) = max(−f−11 )′(s), (f−1

2 )′(s) =

(f−12 )′(s) because f is even. For 0 < y ≤ exp(−3/2), y = exp(−x−2) ⇔ |x| =

√−1/ ln y.

Hence f−12 (y) =

√−1/ ln y. By applying Proposition 3.11, one concludes that ϕ(t) =√

−1/ ln t for t > 0.

(ii) Suppose to the contrary that there were c > 0 and θ ∈ [0, 1) such that f has thegeneralized KL property at 0 with respect to some U 3 0 and η > 0 and ϕ(t) = c · t1−θ.

19

Page 20: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

Taking the intersection if necessary, assume without loss of generality that U ∩ [0 < f < η] ⊆(−√

2/3,√

2/3) ∩ [0 < f < e−3/2]. Then f is convex and C1 on U ∩ [0 < f < η]. By usingProposition 3.11, one concludes that the exact modulus of the generalized KL property of fat x with respect to U and minη, e−3/2 is also ϕ(t). Hence Proposition 3.9 implies that

ϕ(t) ≤ ϕ(t) = c · t1−θ, ∀t ∈ (0,minη, e−3/2). (14)

Let s > 0. Then one has s = ϕ(t)⇔ t = e−1/s2 , which further implies that

lim supt→0+

ϕ(t)

t1−θ= lim sup

s→0+

s

e−(1−θ)/s2 = lim sups→0+

e(1−θ)/s2

s−1=∞,

which contradicts to (14).

Remark 3.16 Proposition 3.15(ii) is known, however no proof was given, see [4, Section 1].

3.3 Comparison to the Bolte-Daniilidis-Ley-Mazet desingulariz-ing functions

In this subsection, we compare the exact modulus to the Bolte-Daniilidis-Ley-Mazet (BDLM)desingularizing functions in Facts 2.10 and 2.12. A comparison with the growth conditionin Fact 2.13 is also carried out. Examples will be given to highlight two major differences:

- The exact modulus is always concave for any proper and lsc function f , while theBDLM desingularizing function only guarantees concavity with additional convexityassumption on f .

- The exact modulus is not necessarily differentiable, which allows us to capture thesmallest possible concave desingularizing function.

Below, we use ϕ for the BDLM desingularizing function. Our first example shows that theBDLM desingularizing function ϕ in Fact 2.10 is not necessarily concave, while the exactmodulus is.

Example 3.17 Let f(x) = −12x2 + 2|x| and x = 0. Then f is semiconvex and satisfies

condition (4) in Fact 2.10 with r ∈ (0, 2) and ε = 2. Applying Fact 2.10 with an arbitraryr0 ∈ (0, r) and ε = ε, one has for x ∈ B(x; ε) ∩ [0 < f ≤ r0],

ϕ′(f(x)) dist(0, ∂f(x)) ≥ 1,

where ϕ(t) = 2−√

4− 2t. Clearly the BDLM desingularizing function ϕ(t) is not concave.In contrast, the exact modulus with respect to U = B(x; ε) and η = r0 is ϕ(t) = t√

4−2r0,

which is concave.

20

Page 21: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

Proof. For x ∈ B(x; ε) and r ∈ (0, r0), we have f(x) = r ⇔ |x| = 2−√

4− 2r. Therefore

u(r) =1

infx∈U∩[f=r]

dist(0, ∂f(x))=

1√4− 2r

⇒ ϕ(t) =

∫ t

0

u(s)ds = 2−√

4− 2t.

On the other hand, for x ∈ B(x; ε), r ≤ f(x)−f(x) < r0 ⇔ 2−√

4− 2r ≤ |x| < 2−√

4− 2r0.Therefore

h(r) = supdist−1(0, ∂f(x)) : x ∈ B(x; ε) ∩ [0 < f − f(x) < r0], r ≤ f(x)− f(x)

= sup|f ′(x)|−1 : 2−√

4− 2r ≤ |x| < 2−√

4− 2r0 =1√

4− 2r0

,

which means the exact modulus ϕ(t) =∫ t

0h(s)ds = 1√

4−2r0t.

By picking a decreasing and continuous majorant of u in the integrability condition (5)and integrating, one can get a concave BDLM desingularizing function in the absence ofconvexity, which may not be the smallest:

Example 3.18 Let U = R and η =∞. Define f : R→ R+ by

f(x) =

12x, if 0 ≤ x ≤ 1

4;

32(x− 1

4) + 1

8, if 1

4< x ≤ 1

2;

x, if x > 12;

0 otherwise.

Let u : (0, η)→ R+ be the function given in the integrability condition (5) with ε = r =∞,and let h be given in Definition 3.6. Then the following statements hold:

(i) Functions u and h are given by

u(s) =

2, if 0 < s ≤ 1

8;

23, if 1

8< s < 1

2;

1, if s ≥ 12.

and h(s) =

2, if 0 < s ≤ 1

8;

1, if s > 18.

(ii) The function h satisfies

h = infu : u : (0, η)→ R+ is a continuous and decreasing function with u ≥ u.

(iii) The exact modulus of f at x = 0 with respect to U and η is

ϕ(t) =

2t, if 0 ≤ t ≤ 1

8;

t+ 18, if t > 1

8.

21

Page 22: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

Let u be a continuous and decreasing majorant of u and define ϕ(t) =∫ t

0u(s)ds. Then

ϕ ≤ ϕ on (0, 18] and ϕ < ϕ on (1

8,∞).

Proof. (i) The desired results follow from simple calculation. (ii) Define w = infu : u :(0, η) → R+ is a continuous and decreasing function with u ≥ u. Let n ∈ N and definewn : (0, η)→ R+ by

wn(s) =

2, if 0 < s ≤ 1

8;

−n(s− 1

8

)+ 2, if 1

8< s ≤ 1

8+ 1

n;

1, if s > 18

+ 1n.

Then wn is a decreasing and continuous majorant of u and wn ≥ w. Pick s > 18

and notethat limn→∞wn(s) = 1. Then w(s) ≤ limn→∞wn(s) = 1, which together with the fact thatw ≥ 1 yields w(s) = 1. On the other hand, for s ≤ 1

8, we have 2 ≤ w(s) ≤ wn(s) = 2, which

means w(s) = 2. Therefore, w = h by (i).

(iii) Integrating h yields the desired formula of ϕ. Statement(ii) implies that any contin-uous and decreasing majorant u of u satisfies u ≥ h and there exists some ε > 0 such thatu(s) > h(s) on (1

8, 1

8+ ε). If there was t0 >

18

such that ϕ(t0) = ϕ(t0), then we would haveu = h almost everywhere on (0, t0), which is absurd.

Following examples in nonconvex setting, we now compare the exact modulus withFacts 2.12 and 2.13 by recycling Example 3.10(i). On one hand, we shall see that theexact modulus is smaller than any BDLM desingularizing function given by Fact 2.12. Onthe other hand, we will show that the smallest desingularizing function obtained from thegrowth condition in Fact 2.13 is still bigger than the exact modulus.

Example 3.19 Consider the function f given in Example 3.10(i) with ρ = 1. Recall fromExample 3.10 that the exact modulus of f at x with respect to U = R and η =∞ is

ϕ(t) =

√2t, if 0 ≤ t ≤ 1/2;

t/2 + 3/4, if t > 1/2.

Moreover, the following statements hold:

(i) Applying Fact 2.12 with r0 = 12

and r ∈ (0, r0) gives that f satisfies the KL propertyat x = 0 with respect to U = R, η =∞ and

ϕ1(t) =

√2t, if t ≤ r;√2r + 1√

2r(t− r), if t > r.

Evidently ϕ ≤ ϕ1, even in the limiting case where r = r0, see the left plot in Figure 1. Thereare other constructions of ϕ1, however they are all bigger than the exact modulus ϕ, seeRemark 3.20 for a detailed discussion.

22

Page 23: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

(ii) Define m : R+ → R+ by

m(t) =

12t2, if 0 ≤ t ≤ 1;

2t− 32, if t > 1.

Then Fact 2.13 implies that f has the KL property at 0 with respect to U = R, η =∞ and

ϕ2(t) =

2√

2t, if t ≤ 12;

t2

+ 34

ln(t) + 74

+ 34

ln(2), if t > 12.

The desingularizing function ϕ2 is the smallest possible one can get from Fact 2.13. However,ϕ ≤ ϕ2, see the right plot in Figure 1.

Proof. (i) For r ∈ (0, 12], f(x) = r ⇔ |x| =

√2r and for r ∈ (1

2,∞) we have f(x) = r ⇔

|x| = 34

+ 12r. Then we have

u(r) =1

infx∈[f=r]

dist(0, ∂f(x))=

1√2r, if 0 < r ≤ 1

2;

12, if r > 1

2.

Noticing that u is continuous on (0, r0), we set the continuous majorant u in Fact 2.12 to beu. The desired ϕ1 then follows from applying Fact 2.12.

(ii) Clearly all conditions in Fact 2.13 are satisfied. In particular, the equality f(x) =m(dist(x, argmin f)) = m(|x|) holds for all x, which means m is the largest possible modulus

of the growth condition. The larger m the smaller its inverse. Hence ϕ2(t) =∫ t

0m−1(s)

sds is

the smallest possible desingularizing function that one can get from Fact 2.13. The rest ofthe statement follows from simple calculation.

Remark 3.20 The function ϕ1 given in Example 3.19 is indeed ϕ1(t) =∫ t

0u(s)ds, where

u(r) =

u(r), if r ≤ r;

u(r), if r > r,

which is a continuous and decreasing majorant of u, where u is given in the proof above.Replacing u by other such majorant of u certainly yields a different ϕ1. However, notice thatfor this example, we have ϕ(t) =

∫ t0u(s)ds. Therefore ϕ(t) ≤

∫ t0u(s)ds = ϕ1(t), no matter

which majorant u we choose.

23

Page 24: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

Figure 1: Plots of Example 3.19. Left: The exact modulus ϕ and ϕ1 in the limiting casewhere r = r0. Right: The exact modulus ϕ and ϕ2.

Despite the exact modulus ϕ in Example 3.19(i) is smaller, there is still some overlapbetween ϕ and the desingularizing function obtained from Fact 2.12. In what follows, weconstruct an example where the exact modulus of a non-differentiable convex function is thestrictly smaller one everywhere, except at the origin. Note that we shall compare the exactmodulus to Fact 2.10 instead of Fact 2.12, as the former is more general.

Example 3.21 Let r1 = π2

6− 1 and rk+1 = rk − 1

k2(k+1)for k ∈ N. Define for k ∈ N and

x > 0

f(x) =1

k

(x− 1

k

)+ rk,∀x ∈

(1

k + 1,

1

k

].

Let f(−x) = f(x) for x < 0 and f(0) = 0. Then the following statements hold:

(i) The function f : [−1, 1]→ [0, r1] is continuous and convex with argmin f = 0.

(ii) The exact modulus of f at 0 with respect to U = [−1, 1] and η = r1 is a piecewiselinear function ϕ : [0, r1]→ R+ satisfying

ϕ(t) = k(t− rk+1) +∞∑

i=k+1

i(ri − ri+1),∀t ∈ (rk+1, rk],∀k ∈ N,

and ϕ(0) = 0. Furthermore, every desingularizing function ϕ : [0, r1] → R+ obtained fromFact 2.10 satisfies ϕ(t) > ϕ(t) on (0, r1].

24

Page 25: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

Proof. (i) Note that we have

limk→∞

rk+1 = r1 −∞∑i=1

1

i2(i+ 1)= r1 −

∞∑i=1

(1

i2− 1

i(i+ 1)

)= r1 − r1 = 0,

which implies that f is well-defined and continuous at 0. To see the continuity at x = 1k+1

for k ∈ N, it suffices to observe that

limx→ 1

k+1

+f(x) =

1

k

(1

k + 1− 1

k

)+ rk = rk+1 = f

(1

1 + k

),

where the second last equality follows from the definition of rk+1. Moreover, f is piecewiselinear with increasing slope then it is convex.

(ii) Clearly we have ∂f(

1k

)=[

1k, 1k−1

]for every k ∈ N with k ≥ 2. It follows easily

that for x ∈ [−1, 1] with 1k+1

< |x| ≤ 1k, one has dist (0, ∂f(x)) = 1

k. For r ∈ (rk+1, rk],

r = f(x)⇔ |x| = k(r − rk) + 1k. Elementary calculation yields

u(r) =1

infx∈U∩[f=r]

dist(0, ∂f(x))= k.

Note that u ∈ L1(0, r1) hence all conditions in Fact 2.10 are satisfied. Indeed, one has∫ r1

0

u(s)ds =∞∑k=1

k · (rk − rk+1) =∞∑k=1

1

k(k + 1)≤

∞∑k=1

1

k2<∞.

Then [6, Lemma 44] implies there exists a continuous and decreasing majorant u of u. Thecontinuity of u ensures that for every k there exists εk > 0 such that u > u on (rk, rk + εk).It is also easy to observe that the exact modulus ϕ satisfies ϕ(t) =

∫ t0u(s)ds. Altogether,

we conclude that the desingularizing function given by Fact 2.10 satisfies

ϕ(t) =

∫ t

0

u(s)ds >

∫ t

0

u(s)ds = ϕ(t),∀t ∈ (0, r1].

Indeed, if there was t0 ∈ (0, r1] such that ϕ(t0) = ϕ(t0), then we would have u = u almosteverywhere on (0, t0], which is absurd.

4 The PALM algorithm revisited

In this section, we revisit the celebrated proximal alternating linearized minimiza-tion (PALM) algorithm. We will show that the exact modulus of the generalized KL propertyleads to the sharpest upper bound for the total length of trajectory of iterates generated byPALM.

25

Page 26: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

4.1 The PALM algorithm

Consider the following nonconvex and nonsmooth optimization model:

min(x,y)∈Rn×Rm

Ψ(x, y) = f(x) + g(y) + F (x, y),

where f : Rn → R and g : Rm → R are proper and lsc functions and F : Rn × Rm → R isC1. This model covers many optimization problems in practice, see [8]. Bolte, Sabach andTeboulle [8] proposed the following PALM algorithm to solve the aforementioned problem:

PALM: Proximal Alternating Linearized Minimization1. Initialization: Start with arbitrary z0 = (x0, y0) ∈ Rn × Rm.2. For each k = 0, 1, . . ., generate a sequence (zk)k∈N = (xk, yk)k∈N as follows, where quanti-ties L1(yk) and L2(xk+1) will be given in (A2):

2.1. Take γ1 > 1, set ck = γ1L1(yk) and compute

xk+1 ∈ Proxfck

(xk −

1

ck∇xF (xk, yk)

). (15)

2.2. Take γ2 > 1, set dk = γ2L2(xk+1) and compute

yk+1 ∈ Proxgdk

(yk −

1

dk∇yF (xk+1, yk)

). (16)

The PALM algorithm is analyzed under the following blanket assumptions in [8].

(A1) infRn×Rm Ψ > −∞, infRn f > −∞ and infRm g > −∞.

(A2) For every fixed y ∈ Rm, the function x 7→ F (x, y) is C1,1L1(y), i.e.,

‖∇xF (x1, y)−∇xF (x2, y)‖ ≤ L1(y) ‖x1 − x2‖ ,∀x1, x2 ∈ Rn.

Assume similarly that for every x ∈ Rn, y 7→ F (x, y) is C1,1L2(x).

(A3) For i = 1, 2 there exist λ−i , λ+i > 0 such that

infL1(yk) : k ∈ N ≥ λ−1 and infL2(xk) : k ∈ N ≥ λ−2 ,

supL1(yk) : k ∈ N ≤ λ+1 and supL2(xk) : k ∈ N ≤ λ+

2 .

26

Page 27: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

(A4) ∇F is Lipschitz continuous on bounded subsets of Rn × Rm, i.e., on every boundedsubset B1 × B2 of Rn × Rm, there exists M > 0 such that for all (xi, yi) ∈ B1 × B2,i = 1, 2,

‖∇F (x1, y1)−∇F (x2, y2)‖ ≤M ‖(x1 − x2, y1 − y2)‖ .

Fact 2.3 shows that PALM is well defined. Bolte, Sabach and Teboulle showed that thePALM algorithm enjoys the following properties:

Lemma 4.1 [8, Lemma 3] Suppose that (A1)-(A4) hold. Let (zk)k∈N be a sequence generatedby PALM. Then the following hold:

(i) The sequence(Ψ(zk)

)k∈N is decreasing and in particular

ρ1

2‖zk+1 − zk‖2 ≤ Ψ(zk)−Ψ(zk+1), ∀k ≥ 0, (17)

where ρ1 = min(γ1 − 1)λ−1 , (γ2 − 1)λ−2 .

(ii)∑∞

k=1 ‖zk+1 − zk‖2 <∞, and hence limk→∞ ‖zk+1 − zk‖ = 0.

Lemma 4.2 [8, Lemma 4] Suppose that (A1)-(A4) hold, and that M > 0 is the Lipschitzconstant given in (A4). Let (zk)k∈N be a sequence generated by PALM which is assumed tobe bounded. For k ∈ N, define

Akx = ck−1(xk−1 − xk) +∇xF (xk, yk)−∇xF (xk−1, yk−1),

Aky = dk−1(yk−1 − yk) +∇yF (xk, yk)−∇yF (xk, yk−1).

Then (Akx, Aky) ∈ ∂Ψ(xk, yk), and∥∥(Akx, A

ky)∥∥ ≤ ∥∥Akx∥∥+

∥∥Aky∥∥ ≤ (2M + 3ρ2) ‖zk − zk−1‖ ,∀k ∈ N,

where ρ2 = maxγ1λ+1 , γ2λ

+2 .

Below the set of limit points of sequence (zk)k∈N is denoted by ω(z0) = z ∈ Rn ×Rm : ∃(zkq)q∈N ⊆ (zk)k∈N, zkq → z, as q → ∞. The following lemma summarizes usefulproperties of (zk)k∈N and ω(z0), where Lemma 4.3(i) is extracted from the proof of [8, Lemma5(i)].

Lemma 4.3 [8, Lemma 5] Suppose that (A1)-(A4) hold. Let (zk)k∈N be a sequence generatedby PALM which is assumed to be bounded. Then the following assertions hold:

27

Page 28: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

(i) For every z∗ ∈ ω(z0) and (zkq)q∈N converging to z∗, one has

limq→∞

Ψ(zkq) = Ψ(z∗).

Moreover, ω(z0) ⊆ stat Ψ, where stat Ψ denotes the set of stationary points of Ψ.

(ii) limk→∞ dist(zk, ω(z0)) = 0.

(iii) The set ω(z0) is nonempty, compact and connected.

(iv) The objective function is constant on ω(z0).

4.2 The sharpest upper bound for the total length of trajectoryof iterates

In this subsection, we improve a result by Bolte, Sabach and Teboulle [8, Theorem 1]. Webegin with a technical lemma, which is a sharper version of [8, Lemma 6].

Lemma 4.4 Let f : Rn → R be proper lsc and let µ ∈ R. Let Ω ⊆ dom ∂f be a nonemptycompact set on which f(x) = µ for all x ∈ Ω. Suppose that f has the generalized KL propertyat each x ∈ Ω. Let ε, η > 0 and ϕ ∈ Φη be those given in Proposition 3.5. Set U = Ωε anddefine h : (0, η)→ R+ by

h(s) = sup

dist−1(0, ∂f(x)

): x ∈ U ∩ [0 < f − µ < η], s ≤ f(x)− µ

.

Then the function ϕ : [0, η)→ R+,

t 7→∫ t

0

h(s)ds, ∀t ∈ (0, η),

and ϕ(0) = 0, is well-defined and belongs to Φη. The function f has the uniform generalizedKL property on Ω with respect to U , η and ϕ. Moreover, one has

ϕ = infϕ ∈ Φη : ϕ is a desingularizing function of f on Ω with respect to U and η

.

We say ϕ is the exact modulus of the uniformized generalized KL property of f on Ω withrespect to U and η.

Proof. Apply a similar argument as in Proposition 3.9.

The following theorem provides the “sharpest” upper bound for the total length of thetrajectory of iterates generated by PALM, which improves Bolte, Sabach and Teboulle [8,Theorem 1]. The notion of “sharpest” will be specified later in Remark 4.6. Our prooffollows a similar approach as in [8, Theorem 1], but makes use of the exact modulus of theuniformized generalized KL property.

28

Page 29: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

Theorem 4.5 Suppose that the objective function Ψ is a generalized KL function such that(A1)-(A4) hold. Let (zk)k∈N be a sequence generated by PALM which is assumed to bebounded. Then the following assertions hold:

(i) The sequence (zk)k∈N converges to a stationary point z∗ of objective function Ψ.

(ii) The sequence (zk)k∈N has finite length. To be specific, there exist l ∈ N, η ∈ (0,∞] andϕ ∈ Φη such that for p ≥ l + 1 and every q ∈ N

p+q∑k=p

‖zk+1 − zk‖ ≤ C · ϕ(Ψ(zp)−Ψ(z∗)

)+ ‖zp − zp−1‖ . (18)

Therefore∞∑k=1

‖zk+1 − zk‖ ≤ A+ C · ϕ(Ψ(zl+1)−Ψ(z∗)

)<∞, (19)

where A = ‖zl+1 − zl‖+∑l

k=1 ‖zk+1 − zk‖ <∞ and C = 2(2M + 3ρ2)/ρ1.

Proof. Because (zk)k∈N is bounded, there exists a convergent subsequence, say zkq → z∗ ∈ω(z0). Then Lemma 4.3(i) implies limq→∞Ψ(zkq) = Ψ(z∗) and z∗ ∈ crit Ψ. Since (Ψ(zk))k∈Nis a decreasing sequence by Lemma 4.1, we have limk→∞Ψ(zk) = limq→∞Ψ(zkq) = Ψ(z∗).

We will show that (zk)k∈N converges to z∗, and along the way we also establish (18) and(19). We proceed by considering two cases.

Case 1: If there exits l such that Ψ(zl) = Ψ(z∗), then by the decreasing property of(Ψ(zk))k∈N, one has Ψ(zl+1) = Ψ(zl) and therefore zl = zl+1 by (17). Hence by induction,we conclude that limk→∞ zk = z∗. The desired assertion follows immediately.

Case 2: Now we consider the case where Ψ(z∗) < Ψ(zk) for all k ∈ N. By Lemma 4.3 andassumption, Ψ is a generalized KL function that is constant on compact set ω(z0). InvokingLemma 4.4 shows that there exist ε > 0 and η > 0 such that the exact modulus of theuniform generalized KL property on Ω = ω(z0) with respect to U = Ωε and η exists, whichis denoted by ϕ. Hence for every z ∈ Ωε ∩ [0 < Ψ−Ψ(z∗) < η],

ϕ′−(Ψ(z)−Ψ(z∗)

)· dist

(0, ∂Ψ(z)

)≥ 1. (20)

By the fact that limk→∞Ψ(zk) = Ψ(z∗), there exists some l1 > 0 such that 0 < Ψ(zk) −Ψ(z∗) < η for k > l1. On the other hand, Lemma 4.3(ii) shows that there exists l2 > 0such that dist(zk, w(z0)) < ε for k > l2. Altogether, we conclude that for k > l, wherel = maxl1, l2, zk ∈ Ωε ∩ [0 < Ψ−Ψ(z∗) < η] and

ϕ′−(Ψ(zk)−Ψ(z∗)

)· dist

(0, ∂Ψ(zk)

)≥ 1. (21)

29

Page 30: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

It follows from Lemma 4.2 that dist(0, ∂Ψ(zk)

)≤∥∥(Akx, Aky)∥∥ ≤ (2M + 3ρ2) ‖zk − zk−1‖.

Hence one has from (21) that for k > l,

ϕ′−(Ψ(zk)−Ψ(z∗)

)≥ dist−1

(0, ∂Ψ(zk)

)≥ 1

2M + 3ρ2

‖zk − zk−1‖−1 . (22)

Note that ‖zk − zk−1‖ 6= 0. Otherwise Lemma 4.2 would imply that

dist(0, ∂Ψ(zk)

)≤ (2M + 3ρ2) ‖zk − zk−1‖ = 0,

which contradicts to (21). Applying Lemma 3.1(ii) to ϕ with s = Ψ(zk+1) − Ψ(z∗) andt = Ψ(zk)−Ψ(z∗), one obtains for k > l

ϕ(Ψ(zk)−Ψ(z∗)

)− ϕ

(Ψ(zk+1)−Ψ(z∗)

)Ψ(zk)−Ψ(zk+1)

≥ ϕ′−(Ψ(zk)−Ψ(z∗)

)≥ 1

2M + 3ρ2

‖zk − zk−1‖−1 . (23)

For the sake of simplicity, we introduce the following shorthand:

∆p,q = ϕ(Ψ(zp)−Ψ(z∗)

)− ϕ

(Ψ(zq)−Ψ(z∗)

).

Then (23) can be rewritten as

Ψ(zk)−Ψ(zk+1) ≤ ‖zk − zk−1‖ ·∆k,k+1 · (2M + 3ρ2). (24)

Furthermore, Lemma 4.1(i) gives

‖zk+1 − zk‖2 ≤ 2

ρ1

[Ψ(zk)−Ψ(zk+1)] ≤ C∆k,k+1 ‖zk − zk−1‖ , (25)

where C = 2(2M+3ρ2)ρ1

∈ (0,∞). By the geometric mean inequality 2√αβ ≤ α+β for α, β ≥ 0,

one gets for k > l

2 ‖zk+1 − zk‖ ≤ C∆k,k+1 + ‖zk − zk−1‖ . (26)

Let p ≥ l+ 1. For every q ∈ N, summing up the above inequality from p up to p+ q yields

2

p+q∑k=p

‖zk+1 − zk‖ ≤ C

p+q∑k=p

∆k,k+1 +

p+q∑k=p

‖zk − zk−1‖+ ‖zp+q+1 − zp+q‖

= C∆p,p+q+1 +

p+q∑k=p

‖zk+1 − zk‖+ ‖zp − zp−1‖

≤ Cϕ(Ψ(zp)−Ψ(z∗)

)+

p+q∑k=p

‖zk+1 − zk‖+ ‖zp − zp−1‖ ,

30

Page 31: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

where the last inequality holds because ϕ ≥ 0. Hence one has for q ∈ Np+q∑k=p

‖zk+1 − zk‖ ≤ C · ϕ(Ψ(zp)−Ψ(z∗)

)+ ‖zp − zp−1‖ ,

which proves (18). By taking q →∞, one has

∞∑k=1

‖zk+1 − zk‖ ≤p−1∑k=1

‖zk+1 − zk‖+ C · ϕ(Ψ(zp)−Ψ(z∗)

)+ ‖zp − zp−1‖ ,

from which (19) readily follows by setting p = l + 1.

Now let p ≥ l + 1, where l is the index given in assertion (i), and let q ∈ N. Then

‖zp+q − zp‖ ≤p+q−1∑k=p

‖zk+1 − zk‖ ≤p+q∑k=p

‖zk+1 − zk‖ .

Recall that ϕ(t) → 0 as t → 0+, Ψ(zk) − Ψ(z∗) → 0 and ‖zk+1 − zk‖ → 0 as k → ∞.Invoking (18), one obtains that

‖zp+q − zp‖ ≤ C · ϕ(Ψ(zp)−Ψ(z∗)

)+ ‖zp − zp−1‖ → 0, p→∞,

meaning that (zk)k∈N is Cauchy and hence convergent. Because zkq → z∗, we conclude thatzk → z∗.

Remark 4.6 The bound for the total length of iterates (19) is the “sharpest”, in the sensethat it is the smallest one can get by using the usual KL convergence analysis. Assumingthat the objective function Ψ is KL, Bolte, Sabach and Teboulle [8, Theorem 1] showed that

∞∑k=1

‖zk+1 − zk‖ ≤ A+ C · ϕ(Ψ(zl+1)−Ψ(z∗)

), (27)

where ϕ(t) is a desingularizing function for the uniform KL property of Ψ on Ω = w(z0)with respect to Ωε and η > 0, which is given by [8, Lemma 6]. Note that A and C are fixed.Then we learn from (27) that the smaller ϕ(t) is, the sharper the upper bound becomes.According to Lemma 4.4, ϕ is the smallest among all possible ϕ. Hence the upper boundgiven by (19) is the sharpest.

5 Conclusion

In this work, we introduced the generalized KL property and its exact modulus. These newconcepts extend the classic KL property and provide an answer to the question: “What is the

31

Page 32: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

optimal desingularizing function for the KL property?”. In turn, we obtained the sharpestupper bound for the total length of trajectory of iterates generated by the PALM algorithm,which improves a result by Bolte, Sabach and Teboulle [8, Theorem 1]. Let us end this paperwith some directions for the future work:

• Compute or at least estimate the exact modulus of the generalized KL property forconcrete optimization models.

• One way to estimate the exact modulus is applying calculus rules of the generalizedKL property. Li and Pong [10] and Yu et al. [21] developed several calculus rules ofthe KL property, in the case where desingularizing functions take the specific formϕ(t) = c · t1−θ, where c > 0 and θ ∈ [0, 1). However, the exact modulus has variousforms depending on the given function, which requires us to obtain general calculusrules without assuming that desingularizing functions have the specific form.

Acknowledgments

XW and ZW were partially supported by NSERC Discovery Grants. The authors thank Dr.Heinz H. Bauschke for many useful discussions and suggesting this research problem. Theyare also very grateful to anonymous referees for their valuable comments that improved thismanuscript significantly.

References

[1] H. Attouch, J. Bolte, P. Redont, and A. Soubeyran, Proximal alternatingminimization and projection methods for nonconvex problems: an approach based onthe Kurdyka- Lojasiewicz inequality, Mathematics of Operations Research, 35 (2010),pp. 438–457.

[2] S. Banert and R. I. Bot, A general double-proximal gradient algorithm for d.c.programming, Mathematical Programming, 178 (2019), pp. 301–326.

[3] H. H. Bauschke and P. L. Combettes, Convex Analysis and Monotone OperatorTheory in Hilbert Spaces, Springer, Cham, 2017.

[4] J. Bolte, A. Daniilidis, and A. Lewis, The Lojasiewicz inequality for nonsmoothsubanalytic functions with applications to subgradient dynamical systems, SIAM Journalon Optimization, 17 (2007), pp. 1205–1223.

32

Page 33: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

[5] J. Bolte, A. Daniilidis, A. Lewis, and M. Shiota, Clarke subgradients of strati-fiable functions, SIAM Journal on Optimization, 18 (2007), pp. 556–572.

[6] J. Bolte, A. Daniilidis, O. Ley, and L. Mazet, Characterizations of Lojasiewiczinequalities: subgradient flows, talweg, convexity, Transactions of the American Mathe-matical Society, 362 (2010), pp. 3319–3363.

[7] J. Bolte, T. P. Nguyen, J. Peypouquet, and B. Suter, From error boundsto the complexity of first-order descent methods for convex functions, MathematicalProgramming, 165 (2017), pp. 471–507.

[8] J. Bolte, S. Sabach, and M. Teboulle, Proximal alternating linearized minimiza-tion for nonconvex and nonsmooth problems, Mathematical Programming, 146 (2014),pp. 459–494.

[9] K. Kurdyka, On gradients of functions definable in o-minimal structures, Annales del’institut Fourier, 48 (1998), pp. 769–783.

[10] G. Li and T. K. Pong, Calculus of the exponent of Kurdyka- Lojasiewicz inequalityand its applications to linear convergence of first-order methods, Foundations of Com-putational Mathematics, 18 (2018), pp. 1199–1232.

[11] T. X. Liu, T. K. Pong, and A. Takeda, A refined convergence analysis of pDCAe

with applications to simultaneous sparse recovery and outlier detection, ComputationalOptimization and Applications, 73 (2019), pp. 69–100.

[12] S. Lojasiewicz, Une propriete topologique des sous-ensembles analytiques reels, Lesequations aux derivees partielles, 117 (1963), pp. 87–89.

[13] B. S. Mordukhovich, Variational Analysis and Generalized Differentiation I: BasicTheory, Springer-Verlag, Berlin, 2006.

[14] P. Ochs, Y. Chen, T. Brox, and T. Pock, iPiano: inertial proximal algorithm fornonconvex optimization, SIAM Journal on Imaging Sciences, 7 (2014), pp. 1388–1419.

[15] C. C. Pugh, Real Mathematical Analysis, Springer, 2015.

[16] R. T. Rockafellar, Convex Analysis, Princeton University Press, 1970.

[17] R. T. Rockafellar and R. J.-B. Wets, Variational Analysis, Springer-Verlag,Berlin, 1998.

[18] K. R. Stromberg, An Introduction to Classical Real Analysis, American Mathemat-ical Society, 2015.

33

Page 34: The exact modulus of the generalized Kurdyka-Łojasiewicz ...The exact modulus of the generalized Kurdyka-Łojasiewicz property Xianfu Wang and Ziyuan Wangy August 24, 2020 Abstract

[19] B. Wen, X. J. Chen, and T. K. Pong, A proximal difference-of-convex algorithmwith extrapolation, Computational Optimization and Applications, 69 (2018), pp. 297–324.

[20] J.-H. Won, J. Xu, and K. Lange, Projection onto Minkowski sums with applicationto constrained learning, in Proceedings of the 36th International Conference on MachineLearning, vol. 97 of Proceedings of Machine Learning Research, PMLR, 2019, pp. 3642–3651.

[21] P. Yu, G. Li, and T. K. Pong, Deducing Kurdyka- Lojasiewicz exponent via inf-projection, arXiv:1902.03635, (2019).

[22] P. Yu, T. K. Pong, and Z. S. Lu, Convergence rate analysis of a sequential convexprogramming method with line search for a class of constrained difference-of-convexoptimization problems, arXiv:2001.06998, (2020).

34