Algorithms Chapter 2 & Chapter 7 (sort)


  • 8/3/2019 2 _7 (sort)

    1/55

    Sorting - CH 2 & CH 7 -


    Sorting

    Reasons for choosing this problem:

    Quite a few algorithms have been devised that solve the problem

        Learn how to choose among several algorithms
        Learn how to improve a given algorithm

    It is one of the few problems for which we have developed algorithms whose time complexities are as good as the lower bound

    Covered: CH2 (2.2, 2.4) & CH7


    CH2. Divide-and-Conquer

    Divide-and-Conquer:

        Divide the problem into a number of subproblems
        Conquer the subproblems by solving them recursively
            If the subproblem sizes are small enough, just solve the subproblems in a straightforward manner
        Combine the solutions to the subproblems into the solution for the original problem

    Top-down approach:

        The solution to a top-level instance of a problem is obtained by going down and obtaining solutions to smaller instances
        It can be implemented as recursive calls


    2.2 Mergesort

    2-way merging: combine two sorted arrays into one sorted array

        Divide the array into two subarrays, each with about n/2 items
        Conquer each subarray by sorting it
            Unless the array is sufficiently small, use recursion to do this
        Combine the solutions to the subarrays by merging them into a single sorted array

    Ex. 27, 10, 12, 20, 25, 13, 15, 22


    Mergesort Algorithm

    void mergesort (int n, keytype S[]) {
        const int h = n / 2, m = n - h;
        keytype U[1..h], V[1..m];

        if (n > 1) {
            copy S[1] through S[h] to U[1] through U[h];
            copy S[h+1] through S[n] to V[1] through V[m];
            mergesort(h, U);
            mergesort(m, V);
            merge(h, m, U, V, S);
        }
    }


    Merge

    Problem: Merge two sorted arrays into one sorted array

    Input: positive integers h and m, sorted arrays U[1..h] & V[1..m]

    Output: a sorted array S[1..h+m] containing the keys in U & V in a single sorted array

    Ex. U = 10, 12, 20, 27; V = 13, 15, 22, 25


    Merge Algorithm

    void merge(h, m, U[], V[], S[]) {

    index i, j, k;

    i = 1; j = 1; k = 1;

    while (i
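    The listing above breaks off inside the while loop. As a rough guide to where it is heading, here is a minimal C++ sketch of a two-way merge and of the mergesort that uses it. It is written from the problem statement, not copied from the slide: the names merge_sorted and merge_sort, the use of std::vector<int> for keytype, and the 0-based indices are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Merge two sorted vectors U and V into one sorted vector.
// (Hypothetical helper; 0-based, unlike the 1-based arrays on the slides.)
std::vector<int> merge_sorted(const std::vector<int>& U, const std::vector<int>& V) {
    std::vector<int> S;
    S.reserve(U.size() + V.size());
    std::size_t i = 0, j = 0;
    // Repeatedly take the smaller front key until one input is exhausted.
    while (i < U.size() && j < V.size()) {
        if (U[i] < V[j]) S.push_back(U[i++]);
        else             S.push_back(V[j++]);
    }
    // Copy whatever remains of the input that is not yet exhausted.
    while (i < U.size()) S.push_back(U[i++]);
    while (j < V.size()) S.push_back(V[j++]);
    return S;
}

// Mergesort in the style of the slides: split into halves of sizes h and n - h,
// sort each half recursively, then merge the two sorted halves.
std::vector<int> merge_sort(const std::vector<int>& S) {
    if (S.size() <= 1) return S;
    const std::size_t h = S.size() / 2;
    std::vector<int> U(S.begin(), S.begin() + h);
    std::vector<int> V(S.begin() + h, S.end());
    return merge_sorted(merge_sort(U), merge_sort(V));
}
```

    Returning new vectors keeps the sketch short; it makes the extra space used for U and V just as visible as in the slide version.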


    Space Complexity

    In-place sort: does not use any extra space beyond that needed to store the input

    Algorithm 2.2 is not an in-place algorithm, because it uses arrays U & V besides the input array S

    It is possible to reduce the amount of extra space to only one array containing n items by doing much of the manipulation on the input array S


    Mergesort2 Algorithm

    void mergesort2(index low, index high) {
        index mid;

        if (low < high) {
            mid = (low + high) / 2;
            mergesort2(low, mid);
            mergesort2(mid+1, high);
            merge2(low, mid, high);
        }
    }


    Merge2

    Problem: Merge the two sorted subarrays of S created in Mergesort2

    Input: indices low, mid, high, and the subarray of S indexed from low to high. The keys in array slots from low to mid are already sorted, as are the keys in array slots from mid+1 to high

    Output: the subarray of S indexed from low to high, with its keys in sorted order

    Ex. S = 10, 12, 20, 27, 13, 15, 22, 25


    Merge2 Algorithm

    void merge2(index low, index mid, index high) {

    index i, j, k;
    keytype U[low..high];

    i = low; j = mid + 1; k = low;

    while (i
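    This listing is also cut off at the while loop. A minimal C++ sketch of the same idea (one local buffer, merged keys moved back into S) might look as follows. The slides treat S as a global array; passing it as a std::vector<int>& parameter and using 0-based inclusive indices are assumptions of this sketch.

```cpp
#include <vector>

// Merge the sorted runs S[low..mid] and S[mid+1..high] (0-based, inclusive)
// back into S, using one local buffer U as on the slide.
void merge2(std::vector<int>& S, int low, int mid, int high) {
    std::vector<int> U;
    U.reserve(high - low + 1);
    int i = low, j = mid + 1;
    while (i <= mid && j <= high) {
        if (S[i] < S[j]) U.push_back(S[i++]);
        else             U.push_back(S[j++]);
    }
    while (i <= mid)  U.push_back(S[i++]);   // left run has keys left over
    while (j <= high) U.push_back(S[j++]);   // right run has keys left over
    // Move the merged keys back into the input array.
    for (int k = 0; k < static_cast<int>(U.size()); ++k) S[low + k] = U[k];
}

void mergesort2(std::vector<int>& S, int low, int high) {
    if (low < high) {
        const int mid = low + (high - low) / 2;  // same value as (low + high) / 2
        mergesort2(S, low, mid);
        mergesort2(S, mid + 1, high);
        merge2(S, low, mid, high);
    }
}
```

    A call such as mergesort2(S, 0, S.size() - 1) sorts the whole array; only the buffer inside merge2 is extra space.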


    2.4 Quicksort (Partition Exchange Sort)

    Developed by Hoare, 1962

    The array is partitioned by placing all items smaller than some pivot item before that item and all items larger than or equal to the pivot item after it

    Each partition is sorted recursively

    Ex. 15 (pivot), 22, 13, 27, 12, 10, 20, 25


    Quicksort Algorithm

    void quicksort(index low, index high) {
        index pivotpoint;

        if (high > low) {
            partition(low, high, pivotpoint);
            quicksort(low, pivotpoint-1);
            quicksort(pivotpoint+1, high);
        }
    }


    Partition

    Problem: Partition the array S for Quicksort

    Input: indices low, high, and the subarray of S indexed from low to high

    Output: pivotpoint, the pivot point for the subarray indexed from low to high

    Ex. 15 (pivot), 22, 13, 27, 12, 10, 20, 25


    Partition Algorithm

    void partition(index low, index high, index& pivotpoint) {

    index i, j; keytype pivotitem;

    pivotitem = S[low]; //choose first item for pivotitem

    j = low;

    for (i = low + 1; i
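    The partition listing stops at the for loop. The following C++ sketch shows one common way to finish a first-item-pivot partition that matches the description in the problem statement, together with the quicksort that calls it. It is an illustration under assumptions, not the slide's exact code: the name partition_first_pivot, returning the pivot point instead of using a reference parameter, the std::vector<int>& argument, and 0-based indices are all choices made here.

```cpp
#include <utility>
#include <vector>

// Partition S[low..high] around its first key and return the pivot's final index.
int partition_first_pivot(std::vector<int>& S, int low, int high) {
    const int pivotitem = S[low];   // choose first item for pivotitem
    int j = low;                    // S[low+1..j] holds the keys smaller than the pivot seen so far
    for (int i = low + 1; i <= high; ++i) {
        if (S[i] < pivotitem) {
            ++j;
            std::swap(S[i], S[j]);
        }
    }
    std::swap(S[low], S[j]);        // put the pivot between the two groups
    return j;
}

void quicksort(std::vector<int>& S, int low, int high) {
    if (high > low) {
        const int pivotpoint = partition_first_pivot(S, low, high);
        quicksort(S, low, pivotpoint - 1);
        quicksort(S, pivotpoint + 1, high);
    }
}
```

    On the example 15, 22, 13, 27, 12, 10, 20, 25, every key smaller than 15 ends up to its left and every other key to its right after one partition call.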


    Quicksort Algorithm Analysis

    Worst-Case Time Complexity

    Basic operation: comparison of S[i] with pivotitem in partition

    Input size: n, the number of items in the array S

    Worst-Case Scenario: ???

    $T(n) = T(0) + T(n-1) + n - 1$ // $T(0) = 0$

    $T(n) = T(n-1) + n - 1$, for $n > 0$

    $T(n) = n(n-1)/2$ // Example B.16

    $W(n) \le T(n) = n(n-1)/2$ // by using induction

    $W(n) \in \Theta(n^2)$


    Quicksort Algorithm Analysis

    Average-Case Time Complexity

    Assume that the value of pivotpoint returned by partition is equally likely to be any of the numbers from 1 through n

    A(n) = ???

    $A(n) \in \Theta(n \lg n)$ // Example B.22
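    One way to fill in the recurrence, under the stated assumption that every pivotpoint from 1 through n is equally likely. This is a sketch of the standard derivation, not text recovered from the slide; the shorthand $a_n = A(n)/(n+1)$ is introduced here only for the derivation.

```latex
\begin{align*}
A(n) &= \sum_{p=1}^{n} \frac{1}{n}\bigl[A(p-1) + A(n-p)\bigr] + (n-1)
      = \frac{2}{n}\sum_{p=1}^{n} A(p-1) + (n-1) \\
nA(n) - (n-1)A(n-1) &= 2A(n-1) + 2(n-1) \\
\frac{A(n)}{n+1} &= \frac{A(n-1)}{n} + \frac{2(n-1)}{n(n+1)} \\
a_n &= a_{n-1} + \frac{2(n-1)}{n(n+1)}, \quad a_0 = 0
      \;\Longrightarrow\; a_n \approx 2\ln n \\
A(n) &\approx (n+1)\,2\ln n \approx 1.38\,(n+1)\lg n \in \Theta(n \lg n)
\end{align*}
```

    The second line comes from multiplying the recurrence by n and subtracting the same expression written for n - 1.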


    Reminder: Exchange Sort

    void exchangesort(int n, keytype S[]) {

    index i, j;

    for (i = 1; i
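    The exchange sort listing is truncated at the outer loop. For reference, a minimal C++ sketch of the usual exchange sort (compare each slot with every later slot and swap when out of order) is given below; the std::vector<int>& interface and the 0-based indices are assumptions of the sketch.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Exchange sort: compare S[i] with every later key and swap whenever
// the later key is smaller, so slot i ends up holding the i-th smallest key.
void exchangesort(std::vector<int>& S) {
    for (std::size_t i = 0; i < S.size(); ++i) {
        for (std::size_t j = i + 1; j < S.size(); ++j) {
            if (S[j] < S[i]) std::swap(S[i], S[j]);
        }
    }
}
```

    The two nested loops always perform n(n-1)/2 comparisons, which is where the quadratic behavior of this algorithm comes from.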


    CH7. Computational Complexity - Sorting

    It will take years to sort 1 billion keys using a $\Theta(n^2)$ algorithm

    Suppose someone wanted 1 billion keys to be sorted in real time

    There are two approaches to this problem:

        Try to develop a more efficient algorithm for the problem
        Try to prove that a more efficient algorithm is impossible

    Once we have such a proof, we know that we should quit trying to obtain a faster algorithm

    Actually, it has been proven that, for sorting only by comparisons of keys, an algorithm better than $\Theta(n \lg n)$ is not possible


    7.1 Computational Complexity

    Exchange sort: $\Theta(n^2)$

        This does not mean that the problem of sorting requires $\Theta(n^2)$ time
        The function is a property of that one algorithm, not necessarily a property of the problem

    Mergesort: $\Theta(n \lg n)$


    Computational Complexity

    An important question is whether it is possible to find an even more efficient algorithm

    Computational complexity is the study of all possible algorithms that can solve a given problem

    A computational complexity analysis tries to determine a lower bound on the efficiency of all algorithms for a given problem


    Computational Complexity

    Ex. Suppose a lower bound for a problem is $\Omega(n \lg n)$

        It does not mean that it must be possible to create a $\Theta(n \lg n)$ algorithm for that problem
        It means only that it is impossible to create one that is better than $\Theta(n \lg n)$

    Sorting is one of the few problems for which we have been successful in developing algorithms whose time complexities are as good as the lower bound


    7.2 Insertion & Selection Sort

    Insertion Sort: sort by inserting records into an existing sorted array

    Ex) 8 4 2 7 9 5 13

    void insertionsort(int n, keytype S[]) {
        index i, j;
        keytype x;
        for (i = 2; i <= n; i++) {
            x = S[i];
            j = i - 1;
            while (j > 0 && S[j] > x) {
                S[j+1] = S[j];
                j--;
            }
            S[j+1] = x;
        }
    }


    Insertion Sort Algorithm Analysis

    Worst-Case Time Complexity, Number of Comparisons of Keys:

    Basic Operation: comparison of S[j] with x

    For a given i, the comparison (in the while-loop) is done at most i-1 times

    Total number of comparisons is at most ???

    Ex. input (5, 4, 3, 2, 1):

        i    array after inserting S[i]
        -    (5, 4, 3, 2, 1)
        2    (4, 5, 3, 2, 1)
        3    (3, 4, 5, 2, 1)
        4    (2, 3, 4, 5, 1)
        5    (1, 2, 3, 4, 5)
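    To see the bound concretely: summing the per-pass maximum of i-1 comparisons over i = 2, ..., n gives n(n-1)/2. The C++ sketch below counts the key comparisons an insertion sort actually makes, so the reverse-ordered example can be checked against that total. The function name insertion_sort_count, the std::vector<int>& interface, and the 0-based indices are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Insertion sort that returns the number of key comparisons it performs.
long insertion_sort_count(std::vector<int>& S) {
    long comparisons = 0;
    for (std::size_t i = 1; i < S.size(); ++i) {
        const int x = S[i];
        std::size_t j = i;              // S[0..j-1] is the sorted prefix still to be scanned
        while (j > 0) {
            ++comparisons;              // one comparison of S[j-1] with x
            if (S[j - 1] <= x) break;   // x belongs at slot j
            S[j] = S[j - 1];            // shift the larger key one slot to the right
            --j;
        }
        S[j] = x;
    }
    return comparisons;
}
```

    For the input (5, 4, 3, 2, 1) every pass scans the whole sorted prefix, which is what makes reverse order the worst case.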


    Insertion Sort Algorithm Analysis

    Extra Space Analysis:

    The only space usage that increases with n is the size of the input array

    Therefore, the algorithm is an in-place sort

    The extra space is in $\Theta(1)$


    Selection Sort

    A slight modification of Exchange Sort

    The assignments of records are significantly different

        It simply keeps track of the index of the current smallest key among the keys in the ith through the nth slots
        After determining that record, it exchanges it with the record in the ith slot

    void selectionsort(n, S[]){

    index i,j,smallest;

    for(i=1; i
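    The selection sort listing is cut off at the outer loop. Following the description above (track the index of the smallest key in slots i through n, then exchange once), a minimal C++ sketch could read as follows; the std::vector<int>& interface and the 0-based indices are assumptions of the sketch.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Selection sort: find the index of the smallest key in S[i..n-1],
// then do a single exchange with slot i.
void selectionsort(std::vector<int>& S) {
    for (std::size_t i = 0; i + 1 < S.size(); ++i) {
        std::size_t smallest = i;
        for (std::size_t j = i + 1; j < S.size(); ++j) {
            if (S[j] < S[smallest]) smallest = j;
        }
        std::swap(S[i], S[smallest]);
    }
}
```

    Compared with exchange sort, the comparisons are the same, but there is at most one exchange per pass instead of one per out-of-order pair.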


    7.3 Lower Bound - algorithms that remove at most one inversion per comparison

    Because there are n! permutations of the first n positive integers, there are n! different orderings of those integers

    Denote a permutation by $[k_1, k_2, \ldots, k_n]$, where $k_i$ is the integer at the ith position

    An inversion in a permutation is a pair $(k_i, k_j)$ such that $i < j$ and $k_i > k_j$

    A permutation contains no inversions iff it is the sorted ordering $[1, 2, \ldots, n]$ (e.g., [1, 2, 3, 4, 5, 6] for n = 6)

    The task of sorting n distinct keys is the removal of all inversions in a permutation
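    The definition of an inversion translates directly into a brute-force counter. The C++ sketch below is only an illustration of the definition (the function name count_inversions and the sample permutations are chosen here, not taken from the slides); it checks every pair of positions, so it runs in quadratic time.

```cpp
#include <cstddef>
#include <vector>

// Count pairs of positions (i, j) with i < j and k[i] > k[j].
long count_inversions(const std::vector<int>& k) {
    long inversions = 0;
    for (std::size_t i = 0; i < k.size(); ++i) {
        for (std::size_t j = i + 1; j < k.size(); ++j) {
            if (k[i] > k[j]) ++inversions;
        }
    }
    return inversions;
}
```

    A sorted ordering gives 0, the fully reversed ordering of n keys gives n(n-1)/2, and any permutation plus its reversal together account for exactly n(n-1)/2 inversions.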


    Lower Bound

    Theorem 7.1: Any algorithm that sorts n distinct keys only by comparisons of keys and removes at most one inversion after each comparison must

        in the worst case do at least $\frac{n(n-1)}{2}$ comparisons of keys, and
        on the average do at least $\frac{n(n-1)}{4}$ comparisons of keys


    Lower Bound

    Proof

    Case 1: Worst-Case

    We need only show that there is a permutation with n(n-1)/2 inversions, because when that permutation is the input, any such algorithm will have to remove that many inversions and therefore do at least that many comparisons

    Such a permutation is $[n, n-1, \ldots, 2, 1]$


    Lower Bound

    Case 2: Average-Case

    We pair the permutation $[k_n, \ldots, k_2, k_1]$ with the permutation $[k_1, k_2, \ldots, k_n]$; the first is the transpose of the second

    Let r and s be integers (between 1 and n) such that s > r

    Given a permutation, the pair (s, r) is an inversion in either the permutation or its transpose, but not in both

    There are n(n-1)/2 such pairs of integers between 1 and n, so a permutation and its transpose together have exactly n(n-1)/2 inversions

    So, the average number of inversions in a permutation and its transpose is $\frac{1}{2} \cdot \frac{n(n-1)}{2} = \frac{n(n-1)}{4}$


    Lower Bound

    Therefore, if we consider all permutations equally probable for input, the average number of inversions in the input is also n(n-1)/4

    Because we assumed that the algorithm removes at most one inversion after each comparison, on the average it must do at least this many comparisons to remove all inversions


    7.6 Heap Sort

    Complete Binary Tree (CBT)

        All internal nodes have two children
        All leaves have depth d
        Depth of a node: the number of edges in the unique path from the root to that node

    Essentially Complete Binary Tree (ECBT)

        It is a CBT down to a depth of d - 1
        The nodes with depth d are as far to the left as possible

    [Figure: three example trees, labeled "CBT", "CBT(X), ECBT", and "X"]


    Heap

    A heap is a data structure that is

        an ECBT (Essentially Complete Binary Tree)
        in which the value stored at each node is greater than or equal to the values stored at its children (the heap property)


    Heap

    Heap viewed as (a) a binary tree and (b) an array

    (a) Binary tree, level by level: 18 | 16 12 | 8 7 9 3 | 2 4 1

    (b) Array:
        index   1  2  3  4  5  6  7  8  9 10
        value  18 16 12  8  7  9  3  2  4  1

    parent(i) = ⌊i/2⌋

    lchild(i) = 2*i

    rchild(i) = 2*i+1


    SiftDown(Heap)

    Place a new node at the root, on top of 2 child heaps

    siftdown(H,1): new value 9 at the root (the parent)

    Tree, level by level (positions 1-10): 9 | 12 18 | 8 7 16 3 | 2 4 1

    Assumption: the subtrees satisfy the heap property

    The right child (18) is larger: exchange the root and the right child


    SiftDown(Heap)

    siftdown(H, i)
        parent = root at ith position
        largeChild = max(parent's children)
        while (parent.key < largeChild.key)
            exchange parent.key and largeChild.key
            parent = largeChild
            largeChild = max(parent's children)
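    The pseudocode above leaves implicit what happens when the parent has no children. A minimal C++ sketch that makes this explicit is shown below, assuming the heap is kept in H[1..n] of a std::vector<int> with H[0] unused so that lchild(i) = 2*i and rchild(i) = 2*i+1 hold as on the slides; the explicit heap-size parameter n is an addition of this sketch.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sift the key at 1-based position i down through the heap stored in H[1..n].
// H[0] is an unused placeholder so the index arithmetic matches the slides.
void siftdown(std::vector<int>& H, std::size_t i, std::size_t n) {
    std::size_t parent = i;
    while (2 * parent <= n) {                     // parent has at least a left child
        std::size_t largeChild = 2 * parent;      // lchild(parent)
        if (largeChild + 1 <= n && H[largeChild + 1] > H[largeChild]) {
            largeChild = largeChild + 1;          // rchild(parent) holds the larger key
        }
        if (H[parent] >= H[largeChild]) break;    // heap property holds, stop
        std::swap(H[parent], H[largeChild]);
        parent = largeChild;                      // continue one level down
    }
}
```

    Each iteration moves one level down the tree, so the number of exchanges is bounded by the depth of the heap, and no storage beyond the array itself is needed.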


    SiftDown(Heap)

    Tree, level by level (positions 1-10): 18 | 12 9 | 8 7 16 3 | 2 4 1

    The parent is now 9 at position 3; its left child (16) is larger

    Exchange parent and left child


    SiftDown(Heap)

    Tree, level by level (positions 1-10): 18 | 12 16 | 8 7 9 3 | 2 4 1

    Does it satisfy the heap property?

    What is the run time to do siftdown?

    In-place?


    MakeHeap(Heap)

    makeheap(A)
        for i ← ⌊length(A)/2⌋ downto 1
            do siftdown(A, i)

    Array A:
        index   1  2  3  4  5  6  7  8  9 10
        value  17  3  2  8  7  9 12 16  4 18

    Tree, level by level: 17 | 3 2 | 8 7 9 12 | 16 4 18

    A.length/2 = 5: the algorithm starts here (position 5, key 7) in building heaps; siftdown makes the subtree rooted there a heap


    MakeHeap(Heap)

    Array A:
        index   1  2  3  4  5  6  7  8  9 10
        value  17  3  2  8 18  9 12 16  4  7

    Tree, level by level: 17 | 3 2 | 8 18 9 12 | 16 4 7

    i = 4: siftdown makes the subtree rooted at position 4 into a heap

    The subtree rooted at position 5 is a heap


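    MakeHeap(Heap)

    Array A:
        index   1  2  3  4  5  6  7  8  9 10
        value  17  3  2 16 18  9 12  8  4  7

    Tree, level by level: 17 | 3 2 | 16 18 9 12 | 8 4 7

    The subtrees rooted at positions 4 and 5 are heaps

    i = 3: siftdown makes the subtree rooted at position 3 a heap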


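    MakeHeap(Heap)

    Array A:
        index   1  2  3  4  5  6  7  8  9 10
        value  17  3 12 16 18  9  2  8  4  7

    Tree, level by level: 17 | 3 12 | 16 18 9 2 | 8 4 7

    i = 2: siftdown on the subtree rooted at position 2 (positions 2, 4, 5, 8, 9, 10), level by level:

        3 | 16 18 | 8 4 7
        18 | 16 3 | 8 4 7
        18 | 16 7 | 8 4 3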


    MakeHeap(Heap)

    Array A:
        index   1  2  3  4  5  6  7  8  9 10
        value  17 18 12 16  7  9  2  8  4  3

    i = 1: siftdown at the root, tree level by level:

        before: 17 | 18 12 | 16 7 9 2 | 8 4 3
        after:  18 | 17 12 | 16 7 9 2 | 8 4 3


    Analysis of MakeHeap(Heap)

    Assume $n = 2^d$ (depth: d) and consider the nodes at depths 0 through d-1:

        Depth    #nodes       #sifts
        0        $2^0$        d-1
        1        $2^1$        d-2
        2        $2^2$        d-3
        :        :            :
        j        $2^j$        d-j-1
        :        :            :
        d-1      $2^{d-1}$    0


    Analysis of MakeHeap(Heap) Total #sifts : at most

    Actual upper bound :

    Total # comp. :
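    The formulas for these three quantities did not survive in the text. Assuming the depth table for $n = 2^d$ (at most d-j-1 sifts for each of the $2^j$ nodes at depth j), they can be worked out as follows. This is a sketch of the standard derivation, not a copy of the slide; the adjustment by d accounts for the single node at depth d, which lengthens the path below each of its d ancestors by one.

```latex
\begin{align*}
\text{Total \#sifts} &\le \sum_{j=0}^{d-1} 2^j\,(d - j - 1) = 2^d - d - 1 \\
\text{Actual upper bound} &= (2^d - d - 1) + d = 2^d - 1 = n - 1 \\
\text{Total \#comparisons} &\le 2\,(n - 1)
\end{align*}
```

    The factor 2 in the last line reflects two key comparisons per sift: one to find the larger child and one to compare it with the parent.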


    HeapSort

    [Figure: the heap 11 9 8 4 7 2 5 3 1 5 6 stored in A[1..11], before and after "Delete max": the maximum key 11 is removed from the root and the last key, 6, takes its place at the root]


    HeapSort

    Heapsort(A)
        makeheap(A)
        for i ← length(A) downto 2 do
            exchange A[1] ↔ A[i]
            heapSize[A] ← heapSize[A] - 1
            siftdown(A, 1)

    What is the worst case run time?

    Extra Space?
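    The same loop structure can be tried out with the C++ standard library's heap primitives standing in for the slide's routines: std::make_heap plays the role of makeheap, and each std::pop_heap call does the work of the exchange, the heap-size decrement, and the siftdown. This is a stand-in sketch under that substitution, not the slide's own implementation.

```cpp
#include <algorithm>
#include <vector>

// Heapsort expressed with standard-library heap operations.
void heapsort(std::vector<int>& A) {
    std::make_heap(A.begin(), A.end());   // build a max-heap over all of A
    for (auto end = A.end(); end - A.begin() > 1; --end) {
        // Moves the current maximum to *(end - 1) and restores the heap
        // property on the shrunken range [begin, end - 1).
        std::pop_heap(A.begin(), end);
    }
}
```

    The sorted keys accumulate at the back of the array while the heap shrinks at the front, so the sort works inside the input array itself.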


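    Heap Insert: $\Theta(\lg n)$ time

    Priority Queue: Insert & delete_max

    [Figure: inserting 10 into the heap 11 9 8 4 7 2 5 3 1 5 6: the new key enters as the last leaf (a child of 2) and moves up past 2 and 8, giving 11 9 10 4 7 8 5 3 1 5 6 2]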


    Comparison of $\Theta(n \lg n)$ Sort Algorithms

    Algorithm    Time Complexity                        Space Complexity
    Merge        W(n) = O(n lg n), A(n) = O(n lg n)     O(n)
    Quick        W(n) = O(n^2),    A(n) = O(n lg n)     O(lg n)
    Heap         W(n) = O(n lg n), A(n) = O(n lg n)     O(1)

    So far, the best sorting algorithms are O(n lg n) in the worst case.

    CAN WE DO BETTER??


    7.8 Lower Bounds for Comparison-Only Sortings

    Can we develop sorting algorithms whose time complexities are better than O(n lg n)?

    As long as we limit ourselves to sorting only by comparisons of keys, such algorithms are not possible


    Decision Trees for Sorting Algorithms

    Sorting three keys (a, b, c)

    At each node, a decision must be made as to which node to visit next

    Sorted keys are stored at the leaves

    There is a leaf for every permutation of the three keys, because the algorithm can sort every possible input of size 3

    In general, a decision tree for sorting n keys has n! leaves


    Lower Bounds for Comparison-Only Sortings

    Lemma 7.2: The worst-case number of comparisons done by a decision tree is equal to its depth

    Lemma 7.3: If m is the number of leaves in a binary tree and d is the depth, then $d \ge \lceil \lg m \rceil$


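    Lower Bounds for Comparison-Only Sortings

    Proof: Using induction on d, we show first that $2^d \ge m$

    Induction Base: d = 0: the tree is a single node, so m = 1 and $2^d \ge 1$

    Induction Hypothesis: Assume that $2^d \ge m$ (where m is the number of leaves) for any binary tree with depth d

    Induction Step: We need to show that $2^{d+1} \ge m'$ (where m' is the number of leaves) for any binary tree with depth d + 1. Let m be the number of leaves of the tree obtained by removing the nodes at depth d + 1

        $2^{d+1} = 2 \cdot 2^d \ge 2m$ (Induction Hypothesis)
        $2m \ge m'$ (Each parent can have at most two children)

    Taking lg of both sides of $2^d \ge m$ gives $d \ge \lg m$; because d is an integer, $d \ge \lceil \lg m \rceil$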


    Lower Bounds for Comparison-Only Sortings

    Theorem 7.2: Any algorithm that sorts n distinct keys only by comparisons of keys must in the worst case do at least $\lceil \lg(n!) \rceil$ comparisons of keys

    Theorem 7.3: Any algorithm that sorts n distinct keys only by comparisons of keys must in the worst case do at least $\lceil n \lg n - 1.45n \rceil$ comparisons of keys
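    To get a feel for the size of these bounds, the C++ sketch below evaluates lg(n!) as a sum of logarithms and rounds it up. The function names lg_factorial and min_comparisons are chosen here for illustration, and the computation uses floating point, so it is meant for small n only.

```cpp
#include <cmath>

// lg(n!) computed as lg 2 + lg 3 + ... + lg n, which avoids forming n! itself.
double lg_factorial(int n) {
    double sum = 0.0;
    for (int k = 2; k <= n; ++k) sum += std::log2(static_cast<double>(k));
    return sum;
}

// Worst-case number of comparisons that the decision-tree bound requires
// of any comparison-only sort of n distinct keys: ceil(lg(n!)).
int min_comparisons(int n) {
    return static_cast<int>(std::ceil(lg_factorial(n)));
}
```

    For three keys the bound is 3 comparisons, matching the depth of the decision tree for (a, b, c), and lg(n!) stays above n lg n - 1.45n for every n checked this way.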