20
Data Structures and Algorithms (CS210/ESO207/ESO211)  Lecture 30 Linear time algorithm for ith order statistic of a set S.  1

Lecture-30-CS210-2012.pptx

Embed Size (px)

Citation preview

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 1/20

Data Structures and Algorithms

(CS210/ESO207/ESO211) 

Lecture 30

• Linear time algorithm for ith order statistic of a set S. 

1

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 2/20

Problem definition

Given a set of  elements and a positive integer ≤ , compute th smallest element

from .

Applications: As wide as that of sorting.

Trivial algorithm: Sort  

AIM: To design an algorithm with O() time complexity.

Assumption (For the sake of neat description and analysis of algorithms of this lecture):

• All elements of are assumed to be distinct. 

2

But sorting takes O(n log n) time

and appears to be an overkill

for this simple problem.

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 3/20

A motivational background

Though it was intuitively appealing to believe that there exists an O() time algorithm to

compute th smallest element, it remained a challenge for many years to design such an

algorithm… 

In 1972, five well known researchers: Blum, Floyd, Pratt, Rivest, and Tarjan designed the

O() time algorithm. It was designed during a tea break of a conference when these five

researchers sat together for the first time to solve the problem.

In this way, the problem which remained unsolved for many years got solved in less than

an hour. But one should not ignore the efforts these researchers spent for years beforearriving at the solution … It was their effort whose fruit got ripened in that hour.

3

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 4/20

Notations

4

We shall now introduce some notations which will

help in a neat description of the algorithm.

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 5/20

Notations

•  : the given set of  elements.

•  : th smallest element of .

• < : subset of  consisting of all elements smaller than .

• > : subset of  consisting of all elements greater than .

• rank(,) = 1 + number of elements in that are smaller than .

Example: smallest element of has rank 1 and the largest element of has rank .

Partition(,): algorithm to partition  into < and >;this algorithm returns (<, >, ) where =rank(,).

5

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 6/20

Algorithm 1

6

QuickSelect(,)

(algorithm derived from QuickSort)

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 7/20

Pseudocode for QuickSelect(,) 

QuickSelect(,)

{ Pick an element from ;

(<, >, )  Partition(,);

If( = ) return ;

Else If (< )

QuickSelect(<,)

Else

QuickSelect(>, );

}

Worst case time complexity : O()

Average case time complexity: O() We shall prove it in the next class.

7

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 8/20

Towards worst case O() time algorithm … 

8

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 9/20

Key ideas

• Learning from recurrence of type I: T() = c + T(9/10)

• Concept of approximate median

• Learning from QuickSelect(,)

• Learning from recurrence of type II: T() = c + T(/6) + T(5/7)

9

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 10/20

Learning from recurrence of type I

T() = c + T(9/10)

Question:  what is the solution of recurrence T() = c + T(9/10) ?

Answer: O().

Sketch (by gradual unfolding):

T() = c + c / + c /  + … 

= c[ + / + / + …] 

= O()

Lesson learnt: Solution for T() = c + T() is O() if .

10

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 11/20

Concept of approximate median 

Definition: 

Given a constant 0 ≤ / , an element  ϵ  is said to be -approximate

median of  if rank(, ) is in the range [, (1 )]. 

11

  2     (−) 

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 12/20

Learning from QuickSelect(,) 

QuickSelect(,)

{ Pick an element from ;

(<, >, )  Partition(,);

If( = ) return ;

Else If (< )

QuickSelect(<,)

Else

QuickSelect(>, );

}Question: If  is a -approximate median of , what is the time complexity of the

above algorithm ?

Answer: T() =   + T(( ))

= O() [Learning from Recurrence of type 1] 

12

O()

T( ( ))

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 13/20

Learning from recurrence of type - II

Question:  what is the solution of recurrence T() = c + T(/6) + T(5/7) ?

Answer: O().

Sketch: (by induction)

Assertion: T() ≤ .

Induction step: T() = c + T(/6) + T(5/7)

≤ c + 37

≤   if ≥ 4

 c

Lesson learnt from the above proof ?: Solution for T() = c + T() + T() is O() if + .

13

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 14/20

Algorithm 2

14

Select(,)

(A linear time algorithm)

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 15/20

OverviewSelect(,)

{ Compute a -approximate median, say , of  ;

(<, >, )  Partition(,);

If( = ) return ;

Else If (< )

Select(<,)Else

Select(>, );

Observation: If we can compute -approximate median in O() time, we get O() time algo.

But that appears too much to expect from us. Isn’t it ?

So what to do ?

Hint: Can you refine the above observation using Lessons you learnt from Recurrence of type II ?

Observation: If we can compute -approximate median in O()+ T() time for +( ) < 1, the

time complexity will still be O().

AIM: How to compute a -approximate median of in O()  + T() time with +( ) < 1 ? 15

O()

T( ( ))

O()+ T() 

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 16/20

Computing approximate median of  with

desired parametersThis step forms the core of the algorithm and is indeed a brilliant stroke of inspiration.

The student is strongly recommended to ponder over this idea from various angles.

• Divide  into groups of 5 elements; 

• Compute median of each group by sorting;

Let  be the set of medians;• Compute median of , let it be ;

Question: Is  an approximate median of  ?

Answer: indeed.

The rank of  in is /. Each element in  has two elements smaller than itself in itsrespective group. Hence there are at least 

   elements in which are smaller than .

In a similar way, there are at least

elements in which are greater than . Hence,  

is

 -approximate median of .

(See the animation on the following slide to get a better understanding of this explanation.)16

O()

T(/5) 

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 17/20

1 2 …

  …

 

Median of  is

 -approximate median of  

We assumed that  is multiple of 5; Otherwise it can be shown that  is

approximate median.

This will not affect the asymptotic solution of the recurrence.

17

 

Increasing order of values

  

Surely smaller than  

Surely greater than  

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 18/20

Pseudocode for Select(,) 

Select(,) ∅;

Divide  into groups of 5 elements; 

Sort each group and add its median to ;

Select(,||/2);

(<, >, )  Partition(,);

If( = ) return ;

Else If (< )

Select(<,)

ElseSelect(>, );

18

O(n)

T(7n/10)

O(n)

T(n/5)

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 19/20

Analysis

T(n) = cn + T(n/5) + T(7n/10)

= O(n) [Learning from Recurrence of type II] 

Theorem: Given any of n elements, we can compute ith smallest element

from  in O(n) worst case time.

19

8/11/2019 Lecture-30-CS210-2012.pptx

http://slidepdf.com/reader/full/lecture-30-cs210-2012pptx 20/20

Exercises 

(Attempting these exercises will give you a better insight into the algorithm.)

• What is magical about number 5 in the algorithm ?

• What if we divide the set  into groups of size 3 ?

• What if we divide the set  into groups of size 7 ?

• What if we divide the set  into groups of even size (e.g. 4 or 6) ?

20