
Page 1: Greedy algorithm

1

Greedy algorithm

叶德仕 [email protected]

Page 2: Greedy algorithm

2

Greedy algorithm’s paradigm

An algorithm is greedy if:
it builds up a solution in small steps
it chooses a decision at each step myopically, to optimize some underlying criterion

We analyze the optimality of a greedy algorithm by showing that:
in every step it is not worse than any other algorithm, or
every algorithm can be gradually transformed into the greedy one without hurting its quality

Page 3: Greedy algorithm

3

Interval scheduling

Input: a set of intervals on the line, represented by pairs of points (the ends of the intervals). In other words, the i-th interval starts at time si and finishes at fi.

Output: the largest set of intervals such that no two of them overlap, i.e., the maximum number of pairwise non-overlapping intervals.

Greedy algorithm: select intervals one after another using some rule

Page 4: Greedy algorithm

4

Rule 1

Select the interval which starts earliest (but not overlapping the already chosen intervals)

Suboptimal solution!

Algorithm: 1 interval

OPT: 4 intervals

Page 5: Greedy algorithm

5

Rule 2

Select the interval which is shortest (but not overlapping the already chosen intervals)

Suboptimal solution!

Algorithm: 1 interval

OPT: 2 intervals

Page 6: Greedy algorithm

6

Rule 3

Select the interval with the fewest conflicts with other remaining intervals (but still not overlapping the already chosen intervals)

Suboptimal solution!

Algorithm: 3 intervals

OPT: 4 intervals

Page 7: Greedy algorithm

7

Rule 4

Select the interval which ends first (but still not overlapping the already chosen intervals)

Quite a natural idea: we ensure that our resource becomes free as soon as possible while still satisfying one request

Hurray! Exact solution!

Page 8: Greedy algorithm

8

[Figure: the earliest-finish rule in action; the first selected interval is the one with the smallest finish time f1, and on this instance the algorithm selects 3 intervals.]

Page 9: Greedy algorithm

9

Analysis - exact solution

The algorithm gives non-overlapping intervals: obvious, since we always choose an interval that does not overlap the previously chosen intervals.

The solution is exact: let A be the set of intervals obtained by the algorithm, and OPT be the largest set of pairwise non-overlapping intervals. We show that A must be as large as OPT.

Page 10: Greedy algorithm

10

Analysis – exact solution cont.

Let A = {A1, ..., Ak} and OPT = {B1, ..., Bm}, both sorted (say, by finish time). By the definition of OPT we have k ≤ m.

Fact: for every i ≤ k, Ai finishes not later than Bi.

Pf. By induction. For i = 1 this holds by the definition of a step of the algorithm. Suppose that Ai-1 finishes not later than Bi-1.

Page 11: Greedy algorithm

11

Analysis con.

From the definition of a step in the algorithm we get that Ai is the first interval that finishes after Ai-1 and does not overlap it.

If Bi finished before Ai, then it would overlap some of the previous A1, ..., Ai-1 and consequently, by the inductive assumption, it would overlap Bi-1, which would be a contradiction.

[Figure: intervals Bi-1, Ai-1, Bi, Ai on the time line.]

Page 12: Greedy algorithm

12

Analysis con.

Theorem: A is the exact solution.
Proof: we show that k = m. Suppose to the contrary that k < m. We have that Ak finishes not later than Bk.

Hence Bk+1 does not overlap Ak, so the algorithm could have added Bk+1 to A and obtained a bigger solution - a contradiction.

[Figure: intervals Ak-1, Bk-1, Ak, Bk, Bk+1 on the time line.]

Page 13: Greedy algorithm

13

Time complexity

Sort the intervals according to their right-most ends (finish times). Then, for every consecutive interval (see the sketch below):

If its left-most end is after the right-most end of the last selected interval, then we select this interval

Otherwise we skip it and go to the next interval

Time complexity: O(n log n + n) = O(n log n)
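As a concrete illustration, here is a minimal Python sketch of this earliest-finish-time rule (the (start, finish) pair representation follows the problem statement; the function name and the small test instance are illustrative):

```python
def interval_scheduling(intervals):
    """Greedy earliest-finish-time rule.

    intervals: list of (start, finish) pairs.
    Returns a largest subset of pairwise non-overlapping intervals.
    """
    # Sort by right-most end (finish time): O(n log n)
    intervals = sorted(intervals, key=lambda iv: iv[1])
    selected = []
    last_finish = float("-inf")
    for start, finish in intervals:
        # Keep the interval only if it starts after the last selected one ends
        if start > last_finish:
            selected.append((start, finish))
            last_finish = finish
    return selected

# Illustrative instance: the rule picks 3 pairwise disjoint intervals here
print(interval_scheduling([(1, 2), (3, 4), (2, 5), (5, 6)]))
```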

Page 14: Greedy algorithm

14

Planning of schools

A collection of towns. We want to place schools in some of the towns. Each school should be in a town, and no one should have to travel more than 30 miles to reach one of them.

Model as a graph: one vertex per town, with an edge between any two towns that are no farther than 30 miles apart.

Page 15: Greedy algorithm

15

Set cover

Input. A set of elements B and sets S1, ..., Sm ⊆ B.

Output. A selection of the Si whose union is B.

Cost. Number of sets picked.

Page 16: Greedy algorithm

16

Greedy

Greedy rule: repeatedly choose the set that covers the largest number of still-uncovered elements. Example: place a school at town a first, since this covers the largest number of other towns.

On this instance: Greedy uses 4 sets, OPT uses 3 sets.
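A minimal Python sketch of this greedy cover rule (the set names and the small universe below are illustrative, not the towns instance from the figure):

```python
def greedy_set_cover(universe, sets):
    """Repeatedly pick the set covering the most still-uncovered elements.

    universe: the set of elements B.
    sets: dict mapping a set name to a subset of B.
    Returns the list of chosen set names.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Greedy choice: the set with the largest intersection with `uncovered`
        best = max(sets, key=lambda name: len(sets[name] & uncovered))
        if not sets[best] & uncovered:
            raise ValueError("the given sets do not cover the universe")
        chosen.append(best)
        uncovered -= sets[best]
    return chosen

# Illustrative instance
B = {1, 2, 3, 4, 5, 6}
S = {"S1": {1, 2, 3}, "S2": {3, 4}, "S3": {4, 5, 6}, "S4": {1, 4}}
print(greedy_set_cover(B, S))  # ['S1', 'S3']
```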

Page 17: Greedy algorithm

17

Upper bound

Theorem. Suppose B contains n elements and that the optimal cover consists of k sets. Then the greedy algorithm will use at most k ln n sets.

Pf. Let nt be the number of elements still not covered after t iterations of the greedy algorithm (n0 = n). Since these remaining elements are covered by the optimal k sets, there must be some set with at least nt / k of them. Therefore, the greedy algorithm will ensure that

nt+1 ≤ nt - nt / k = nt (1 - 1/k)

Page 18: Greedy algorithm

18

Upper bound con.

Then nt ≤ n0 (1 - 1/k)^t. Since 1 - x ≤ e^(-x) for all x, with equality if and only if x = 0, we get

nt ≤ n0 (1 - 1/k)^t < n0 (e^(-1/k))^t = n e^(-t/k)

At t = k ln n, therefore, nt is strictly less than n e^(-ln n) = 1, which means no element remains to be covered.

Consequently, the approximation ratio is at most ln n.

Page 19: Greedy algorithm

19

Exercise

Knapsack problem

Page 20: Greedy algorithm

20

Making Change

Goal. Given the currency denominations in HK: 1, 2, 5, 10, 20, 50, 100, 500, and 1000, devise a method to pay a given amount to a customer using the fewest number of notes/coins.

Cashier's algorithm. At each iteration, add a note/coin of the largest value that does not take us past the amount to be paid.
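A minimal Python sketch of the cashier's algorithm with the HK denominations above (note that greedy change-making is not optimal for every possible denomination system, but it is the natural rule described here):

```python
def cashier_change(amount, denominations=(1000, 500, 100, 50, 20, 10, 5, 2, 1)):
    """Cashier's algorithm: repeatedly take the largest note/coin
    that does not take us past the amount still to be paid."""
    payment = []
    for d in denominations:  # assumed to be sorted in decreasing order
        count, amount = divmod(amount, d)
        payment.extend([d] * count)
    if amount != 0:
        raise ValueError("amount cannot be paid exactly with these denominations")
    return payment

# Example: paying 2748 uses 10 notes/coins
print(cashier_change(2748))  # [1000, 1000, 500, 100, 100, 20, 20, 5, 2, 1]
```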

Page 21: Greedy algorithm

21

Optimal Offline Caching

Caching.
Cache with capacity to store k items.
Sequence of m item requests d1, d2, ..., dm.
Cache hit: item already in cache when requested.
Cache miss: item not already in cache when requested; must bring the requested item into the cache and evict some existing item, if the cache is full. (The term also refers to the operation of bringing an item into the cache.)

Goal. Eviction schedule that minimizes number of cache misses.

Ex: k = 2, initial cache = ab, requests: a, b, c, b, c, a, a, b. Optimal eviction schedule: 2 cache misses.

requests:  a   b   c   b   c   a   a   b
cache:     ab  ab  cb  cb  cb  ab  ab  ab

(The two misses occur at the request for c, which evicts a, and at the later request for a, which evicts c.)

Page 22: Greedy algorithm

22

Optimal Offline Caching: Farthest-In-Future

Farthest-in-future. Evict the item in the cache that is not requested until farthest in the future.

Theorem. [Belady, 1960s] FF is an optimal eviction schedule.
Pf. Algorithm and theorem are intuitive; proof is subtle.

[Figure: current cache = a b c d e f; future queries: g a b c e d a b b a c d e a f a d e f g h ...; the request for g is a cache miss, and FF ejects the cached item whose next request is farthest in the future (here f).]
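A minimal Python sketch of simulating the farthest-in-future rule on a known (offline) request sequence; the function name and the use of the k = 2 example from the previous slide are illustrative:

```python
def farthest_in_future(requests, k, cache=None):
    """Count the cache misses of the farthest-in-future eviction rule.

    requests: full offline sequence of item requests.
    k: cache capacity.
    cache: optional initial cache contents.
    """
    cache = set(cache or [])
    misses = 0
    for i, item in enumerate(requests):
        if item in cache:
            continue  # cache hit
        misses += 1  # cache miss: bring the item in, evicting if full
        if len(cache) >= k:
            def next_use(c):
                # Index of the next request for c (infinity if never requested again)
                for j in range(i + 1, len(requests)):
                    if requests[j] == c:
                        return j
                return float("inf")
            # Evict the cached item that is not requested until farthest in the future
            cache.remove(max(cache, key=next_use))
        cache.add(item)
    return misses

# The k = 2 example from the previous slide: 2 cache misses
print(farthest_in_future(list("abcbcaab"), k=2, cache="ab"))
```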

Page 23: Greedy algorithm

23

Minimum spanning tree

Input: a weighted graph G = (V, E); every edge in E has a positive weight.
Output: a spanning tree such that the sum of its weights is not bigger than the sum of weights of any other spanning tree.

Spanning tree: a subgraph with no cycle that is connected (every two nodes in V are connected by a path).

[Figure: an example weighted graph and some of its spanning trees.]

Page 24: Greedy algorithm

24

Properties of minimum spanning trees (MST)

Spanning trees have:
n nodes
n - 1 edges
at least 2 leaves (a leaf is a node with only one neighbor)

MST cycle property: after adding an edge we obtain exactly one cycle, and all the MST edges in this cycle have weight no bigger than the weight of the added edge.

[Figure: adding an edge to a spanning tree creates exactly one cycle.]

Page 25: Greedy algorithm

25

Optimal substructures

MST T: (other edges of G are not shown.)

Page 26: Greedy algorithm

26

Optimal substructures

MST T: (other edges of G are not shown.)

Remove any edge (u, v) ∈ T.

Page 27: Greedy algorithm

27

Optimal substructures

MST T: (other edges of G are not shown.)

Remove any edge (u, v) ∈ T. Then, T is partitioned into two subtrees T1 and T2.

Page 28: Greedy algorithm

28

Optimal substructures

MST T: (other edges of G are not shown.)

Remove any edge (u, v) ∈ T. Then, T is partitioned into two subtrees T1 and T2.

Theorem. The subtree T1 is an MST of G1 = (V1, E1), the subgraph of G induced by the vertices of T1:
V1 = vertices of T1,
E1 = { (x, y) ∈ E : x, y ∈ V1 }.
Similarly for T2.

Page 29: Greedy algorithm

29

Proof of optimal substructure

Proof. Cut and paste: w(T) = w(u, v) + w(T1) + w(T2).

If T1′ were a lower-weight spanning tree than T1 for G1, then T′ = {(u, v)} ∪ T1′ ∪ T2 would be a lower-weight spanning tree than T for G.

Page 30: Greedy algorithm

30

Do we also have overlapping subproblems? Yes.

Great, then dynamic programming may work! Yes, but MST exhibits another powerful property which leads to an even more efficient algorithm.

Page 31: Greedy algorithm

31

Crucial observation about MST

Consider sets of nodes A and V - A.
Let F be the set of edges between A and V - A.
Let a be the smallest weight of an edge in F.

Theorem: every MST must contain at least one edge of weight a from the set F.

[Figure: a cut (A, V - A) in an example weighted graph and the lightest edges crossing it.]

Page 32: Greedy algorithm

32

Proof of the observation

Let e be the edge in F with the smallest weight; for simplicity assume that this edge is unique. Suppose to the contrary that e is not in some MST, and choose one such MST. Add e to this MST: we obtain a cycle in which e has the smallest weight. Since the two ends of e are in different sets A and V - A, there is another edge f of the cycle that is also in F. Remove f from the tree (with the added edge e): we obtain a spanning tree with smaller weight (since f has bigger weight than e). This contradicts the minimality of the MST.

[Figure: the cycle created by adding e across the cut (A, V - A), and the other crossing edge f that is removed.]

Page 33: Greedy algorithm

33

Greedy algorithm finding MST

Kruskal's algorithm:
Sort all edges according to their weights in non-decreasing order.
Choose n - 1 edges one after another as follows:
If a newly added edge does not create a cycle with the previously selected ones, then we keep it in the (partial) MST; otherwise we discard it.

Remark: we always maintain a partial forest (a sketch follows below).

[Figure: a run of Kruskal's algorithm on an example weighted graph.]
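A minimal Python sketch of Kruskal's algorithm with a simple union-find structure (the edge-list format and names are illustrative):

```python
def kruskal(n, edges):
    """Kruskal's MST algorithm.

    n: number of nodes, labeled 0 .. n-1.
    edges: list of (weight, u, v) tuples.
    Returns the list of selected edges.
    """
    parent = list(range(n))

    def find(x):  # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):  # non-decreasing order of weight
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge joins two different trees, so it creates no cycle
            parent[ru] = rv
            mst.append((w, u, v))
    return mst

# Small illustrative graph
print(kruskal(4, [(1, 0, 1), (2, 1, 2), (2, 0, 2), (3, 2, 3), (5, 0, 3)]))
```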

Page 34: Greedy algorithm

34

Greedy algorithm finding MST

Prim's algorithm:
Select a node arbitrarily as the root.
Choose n - 1 edges one after another as follows:
Look at all edges incident to the currently built (partial) tree that do not create a cycle in it, and select the one with the smallest weight.

Remark: we always maintain a connected partial tree (a sketch follows below).

[Figure: a run of Prim's algorithm started from the root on an example weighted graph.]
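A matching Python sketch of Prim's algorithm using a priority queue (heapq) on an adjacency-list representation; again the names and the small graph are illustrative:

```python
import heapq

def prim(adj, root=0):
    """Prim's MST algorithm.

    adj: adjacency list, adj[u] = list of (weight, v) pairs.
    Returns the list of selected edges as (weight, u, v) tuples.
    """
    n = len(adj)
    in_tree = [False] * n
    in_tree[root] = True
    # Priority queue of candidate edges leaving the current tree
    heap = [(w, root, v) for w, v in adj[root]]
    heapq.heapify(heap)
    mst = []
    while heap and len(mst) < n - 1:
        w, u, v = heapq.heappop(heap)  # lightest edge incident to the tree
        if in_tree[v]:
            continue  # both ends already in the tree: it would create a cycle
        in_tree[v] = True
        mst.append((w, u, v))
        for w2, x in adj[v]:
            if not in_tree[x]:
                heapq.heappush(heap, (w2, v, x))
    return mst

# The same illustrative graph as in the Kruskal sketch, as an adjacency list
adj = [[(1, 1), (2, 2), (5, 3)],
       [(1, 0), (2, 2)],
       [(2, 0), (2, 1), (3, 3)],
       [(5, 0), (3, 2)]]
print(prim(adj))  # same total weight as Kruskal's answer
```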

Pages 35 - 46: Greedy algorithm

35 - 46

Example of Prim

[Figure sequence across these slides: a step-by-step run of Prim's algorithm on an example weighted graph. The set A grows from the root one edge at a time; at each step the lightest edge crossing the cut (A, V - A) is added, and the labels on the frontier vertices are the weights of their cheapest connections to the current tree.]

Page 47: Greedy algorithm

47

Why do the algorithms work?

This follows from the crucial observation.

Kruskal's algorithm: suppose we add edge {v, w}. This edge has the smallest weight among the edges between the set of nodes already connected with v (by a path in the selected subgraph) and the other nodes.

Prim's algorithm: always chooses an edge with the smallest weight among the edges between the set of already connected nodes and the free nodes.

Page 48: Greedy algorithm

48

Time complexity

There are implementations using
a union-find data structure (Kruskal's algorithm)
a priority queue (Prim's algorithm)

achieving time complexity O(m log n), where n is the number of nodes and m is the number of edges.

Page 49: Greedy algorithm

49

Best of MST

Best to date: Karger, Klein, and Tarjan [1993].
Randomized algorithm.
O(V + E) expected time.

Page 50: Greedy algorithm

50

Conclusions

Greedy algorithms for finding a minimum spanning tree in a graph, both in time O(m log n):
Kruskal's algorithm
Prim's algorithm

It remains to design the efficient data structures!

Page 51: Greedy algorithm

51

Conclusions

Greedy algorithms: algorithms constructing a solution step after step using a local rule.

An exact greedy algorithm for the interval selection problem, in time O(n log n), illustrating the "greedy stays ahead" rule.

A greedy algorithm may not produce an optimal solution, as in the set cover problem.

Matroids help characterize when a greedy algorithm leads to an optimal solution.

The minimum spanning tree problem can be solved by the greedy method in O(m log n).

Page 52: Greedy algorithm

52

Matroids

When does the greedy algorithm yield optimal solutions? Matroids [Hassler Whitney]: a matroid is an ordered pair M = (S, ℓ) satisfying the following conditions.

S is a finite nonempty set.

ℓ is a nonempty family of subsets of S, called the independent subsets of S, such that if B ∈ ℓ and A ⊆ B, then A ∈ ℓ. We say that ℓ is hereditary if it satisfies this property. Note that the empty set is necessarily a member of ℓ.

If A ∈ ℓ, B ∈ ℓ, and |A| < |B|, then there is some element x ∈ B - A such that A ∪ {x} ∈ ℓ. We say that M satisfies the exchange property.
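A concrete example (not from the slides): the graphic matroid of a graph G has S = the edges of G, and a set of edges is independent exactly when it forms a forest. A minimal Python sketch of this independence test, reusing the union-find idea from Kruskal's algorithm:

```python
def is_forest(n, edge_set):
    """Independence oracle for the graphic matroid of a graph on n nodes:
    a set of edges is independent iff it contains no cycle, i.e. is a forest."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edge_set:
        ru, rv = find(u), find(v)
        if ru == rv:  # endpoints already connected: adding (u, v) closes a cycle
            return False
        parent[ru] = rv
    return True

print(is_forest(4, [(0, 1), (1, 2)]))          # True: a path is a forest
print(is_forest(4, [(0, 1), (1, 2), (0, 2)]))  # False: a triangle contains a cycle
```

Both the hereditary and the exchange property hold for this family of edge sets, which is why the greedy MST algorithms fit the matroid framework.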

Page 53: Greedy algorithm

53

Theorem con.

Finally, Lemma 3 implies that the remaining problem is one of finding an optimal subset in the matroid M′ that is the contraction of M by x.

After the procedure GREEDY sets A to {x}, all of its remaining steps can be interpreted as acting in the matroid M′ = (S′, ℓ′), because B is independent in M′ if and only if B ∪ {x} is independent in M, for all sets B ∈ ℓ′.

Thus, the subsequent operation of GREEDY will find a maximum-weight independent subset for M′, and the overall operation of GREEDY will find a maximum-weight independent subset for M.

Page 54: Greedy algorithm

54

Max independent

Theorem. All maximal independent subsets in a matroid have the same size.

Pf. Suppose to the contrary that A is a maximal independent subset of M and there exists another, larger maximal independent subset B of M. Then the exchange property implies that A is extendible to a larger independent set A ∪ {x} for some x ∈ B - A, contradicting the assumption that A is maximal.

Page 55: Greedy algorithm

55

Weighted Matroid

We say that a matroid M = (S, ℓ) is weighted if there is an associated weight function w that assigns a strictly positive weight w(x) to each element x ∈ S. The weight function w extends to subsets of S by summation:

w(A) = Σx∈A w(x)

for any A ⊆ S.

Page 56: Greedy algorithm

56

Greedy algorithms on a weighted matroid

Many problems for which a greedy approach provides optimal solutions can be formulated in terms of finding a maximum-weight independent subset in a weighted matroid.

That is, we are given a weighted matroid M = (S, ℓ), and we wish to find an independent set A ∈ ℓ such that w(A) is maximized.

We call such a subset that is independent and has maximum possible weight an optimal subset of the matroid. Because the weight w(x) of any element x ∈ S is positive, an optimal subset is always a maximal independent subset; it always helps to make A as large as possible.

Page 57: Greedy algorithm

57

Greedy algorithm

GREEDY(M, w)

1. A ← Ø

2. sort S[M] into monotonically decreasing order by weight w

3. for each x ∈ S[M], taken in monotonically decreasing order by weight w(x)

4. do if A ∪ {x} ∈ ℓ[M]

5. then A ← A ∪ {x}

6. return A
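A minimal Python transliteration of GREEDY, with the independence test supplied as an oracle (this oracle-based interface and the small uniform-matroid example are illustrative assumptions, not from the slides):

```python
def greedy_matroid(S, w, independent):
    """Generic greedy algorithm on a weighted matroid.

    S: list of elements.
    w: dict mapping each element to a strictly positive weight.
    independent: oracle; independent(A) is True iff the set A is in ℓ.
    Returns a maximum-weight independent subset (as a list).
    """
    A = []
    # Consider elements in monotonically decreasing order of weight
    for x in sorted(S, key=lambda e: w[e], reverse=True):
        if independent(A + [x]):  # keep x only if A ∪ {x} stays independent
            A.append(x)
    return A

# Illustrative use with the uniform matroid: a set is independent iff it has at most k elements
S = ["a", "b", "c", "d"]
w = {"a": 7, "b": 3, "c": 9, "d": 5}
k = 2
print(greedy_matroid(S, w, lambda A: len(A) <= k))  # ['c', 'a']
```

Plugging in the is_forest oracle from the graphic-matroid example (with weights chosen so that maximizing them corresponds to minimizing edge weights) specializes this procedure to a Kruskal-style MST algorithm.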

Page 58: Greedy algorithm

58

Lemma

Lemma 1. Suppose that M = (S, ℓ) is a weighted matroid with weight function w and that S is sorted into monotonically decreasing order by weight. Let x be the first element of S such that {x} is independent, if any such x exists. If x exists, then there exists an optimal subset A of S that contains x.

Pf. If no such x exists, then the only independent subset is the empty set and we're done. Otherwise, let B be any nonempty optimal subset. Assume that x ∉ B; otherwise, we let A = B and we're done.

No element of B has weight greater than w(x). To see this, observe that y ∈ B implies that {y} is independent, since B ∈ ℓ and ℓ is hereditary. Our choice of x therefore ensures that w(x) ≥ w(y) for any y ∈ B.

Page 59: Greedy algorithm

59

Lemma

Construct the set A as follows. Begin with A = {x}. By the choice of x, A is independent. Using the exchange property, repeatedly find a new element of B that can be added to A, preserving the independence of A, until |A| = |B|. Then A = B - {y} ∪ {x} for some y ∈ B, and so w(A) = w(B) - w(y) + w(x) ≥ w(B).

Because B is optimal, A must also be optimal, and because x ∈ A, the lemma is proven.

Page 60: Greedy algorithm

60

Lemma

Lemma 2. Let M = (S,ℓ) be any matroid. If x is an element of S that is an extension of some independent subset A of S, then x is also an extension of Ø.

Pf. Since x is an extension of A, we have that A ∪{x} is independent. Since ℓ is hereditary, {x} must be independent. Thus, x is an extension of Ø.

This shows that if an element is not an option initially, then it cannot become an option later.

Page 61: Greedy algorithm

61

Corollary

Corollary. Let M = (S, ℓ) be any matroid. If x is an element of S such that x is not an extension of Ø, then x is not an extension of any independent subset A of S.

Any element that cannot be used immediately can never be used. Therefore, GREEDY cannot make an error by passing over any initial elements in S that are not an extension of Ø, since they can never be used.

Page 62: Greedy algorithm

62

Lemma 3. Let x be the first element of S chosen by GREEDY for the weighted matroid M = (S, ℓ). The remaining problem of finding a maximum-weight independent subset containing x reduces to finding a maximum-weight independent subset of the weighted matroid M′ = (S′, ℓ′), where

S′ = {y ∈ S : {x, y} ∈ ℓ},
ℓ′ = {B ⊆ S - {x} : B ∪ {x} ∈ ℓ},

and the weight function for M′ is the weight function for M, restricted to S′. (We call M′ the contraction of M by the element x.)

Page 63: Greedy algorithm

63

Proof. If A is any maximum-weight independent subset of M containing x, then A′ = A - {x} is an independent subset of M′. Conversely, any independent subset A′ of M′ yields an independent subset A = A′ ∪ {x} of M. Since in both cases we have w(A) = w(A′) + w(x), a maximum-weight solution in M containing x yields a maximum-weight solution in M′, and vice versa.

Page 64: Greedy algorithm

64

Theorem

Theorem. If M = (S,ℓ) is a weighted matroid with weight function w, then GREEDY(M, w) returns an optimal subset.

Pf. By the Corollary, any elements that are passed over initially because they are not extensions of Ø can be forgotten, since they can never be useful.

Once the first element x is selected, Lemma 1 implies that GREEDY does not err by adding x to A, since there exists an optimal subset containing x.