Hash Table Ind


    2005 Pearson Education, Inc., Upper Saddle River, NJ. All rights reserved.

    Data Structures for Java
    William H. Ford
    William R. Topp

    Chapter 21
    Hashing as a Map Implementation

    Bret Ford
    2005, Prentice Hall

    Hash Table

    Definition

    A hash table is a data structure that offers very fast insertion and search operations (and deletion as well).

    Definition (continued)

    A hash table is also commonly called a map, lookup table, associative array, or dictionary.

    A hash table is a container that allows direct access by an index of any type.

    A hash table works like an array, except that its index does not have to be an integer. A simple example of a hash table is a dictionary.

    Disadvantages of Hash Tables

    A hash table is array-based, and an array is very difficult to enlarge once it has been created.

    For some kinds of hash tables, performance degrades as the table fills up, so the programmer must estimate from the start how much data will be stored.

    Introduction to Hashing

    A hash function maps a value to an index in the table. The function provides access to an element, just as an index provides access to an element of an array.

    A hash table provides an implementation of the Set and Map interfaces.

    Introduction to Hashing (continued)

    Hashing is a data storage structure that provides O(1) average retrieval time. In this way, access to an item is independent of the number of other items in the collection.

    Introduction to Hashing (continued)

    A hash table is held as a reference to an array.

    Associated with the hash table is a hash function that takes a key as its argument and returns an integer value.

    By taking the remainder after dividing the hash value by the table size, we obtain a mapping from a key to an index in the table.

    Introduction to Hashing (concluded)

    Hash value: hf(key) = hashValue
    Hash table index: hashValue % n

    [Figure: hf(key) = hashValue maps a key to the entry at index i = hashValue % n in a table with indices 0, 1, ..., i, ..., n-1.]

    Using a Hash Function

    Suppose the hash function is hf(x) = x, where x is a nonnegative integer (the identity function).

    Assume the table is the array tableEntry with n = 7 elements.

    hf(22) = 22    22 % 7 = 1    (stored in tableEntry[1])
    hf(4) = 4      4 % 7 = 4     (stored in tableEntry[4])

    [Figure: table with indices 0 through 6; keys 22 and 4 map to tableEntry[1] and tableEntry[4].]

    Using a Hash Function (concluded)

    With a hash function hf() and table size n, the table index for a key is i = hf(key) % n. A collision occurs when two different keys produce the same index, that is, the same remainder when their hash values are divided by n.

    hf(5) = 5      5 % 7 = 5     (tableEntry[5])
    hf(33) = 33    33 % 7 = 5    (tableEntry[5], collides with 5)
    hf(22) = 22    22 % 7 = 1    (tableEntry[1])
    hf(36) = 36    36 % 7 = 1    (tableEntry[1], collides with 22)
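The index computation and the two collisions above can be reproduced with a few lines of Java. This is a minimal sketch: the class name HashIndexDemo and the helper tableIndex() are illustrative and do not come from the slides.

```java
import java.util.ArrayList;
import java.util.List;

final class HashIndexDemo {
    // identity hash function for nonnegative integers, hf(x) = x
    static int hf(int x) {
        return x;
    }

    // table index of key in a table with n entries: hf(key) % n
    static int tableIndex(int key, int n) {
        return hf(key) % n;
    }

    public static void main(String[] args) {
        int n = 7;
        int[] keys = {22, 4, 5, 33, 36};
        boolean[] occupied = new boolean[n];
        List<Integer> colliding = new ArrayList<>();

        for (int key : keys) {
            int i = tableIndex(key, n);
            System.out.println("hf(" + key + ") % " + n + " = " + i);
            // a key collides when its index is already taken by an earlier key
            if (occupied[i]) {
                colliding.add(key);
            }
            occupied[i] = true;
        }
        System.out.println("keys that collided: " + colliding); // [33, 36]
    }
}
```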

    Designing Hash Functions

    Some principles for designing a hash function:

    Evaluating the hash function must be efficient.

    The hash function should produce uniformly distributed hash values. This spreads the hash table indices across the table, which minimizes collisions.

    Designing Hash Functions (continued)

    The Java programming language provides a hashing function through the method hashCode() in the Object superclass.

    public int hashCode()
    { ... }

    Designing Hash Functions (continued)

    The Object version of hashCode() converts the internal address of the object to an integer value. This has limited application, because two different objects will have different hashCode() values even when they store the same data.

    // strings one and two hold the same data; under an address-based
    // hashCode() that need not be so for the integer values
    // one.hashCode() and two.hashCode() of two distinct objects
    String one = "java", two = "java";

    Designing Hash Functions (continued)

    The Integer class provides the identity function for hashCode().

    public int hashCode()
    { return value; }

    Unless the integer data has random characteristics, this is not a good hash function.
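The behaviors described for hashCode() can be observed directly. This is a small sketch with an illustrative class name, HashCodeDemo. It relies on two facts about the standard library: String overrides hashCode() to hash its characters, and System.identityHashCode() returns the value the Object version would produce.

```java
final class HashCodeDemo {
    // true when a and b report the same hashCode()
    static boolean sameHash(Object a, Object b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // two distinct String objects that hold the same characters
        String one = new String("java");
        String two = new String("java");

        // String overrides hashCode(), so equal strings hash equally
        System.out.println(sameHash(one, two)); // true

        // the Object-style values; for distinct objects they usually differ
        System.out.println(System.identityHashCode(one));
        System.out.println(System.identityHashCode(two));

        // Integer.hashCode() is the identity function
        System.out.println(Integer.valueOf(22).hashCode()); // 22
    }
}
```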

    Designing Hash Tables

    When two or more data items hash to the same index, they cannot occupy the same position in the table.

    The options are to relocate one of the items to another position in the table (linear probing) or to redesign the table so that each index stores a collection of the keys that collide there (chaining with separate lists).

    Linear Probing

    The hash table is an array of elements associated with a hash function. To add an item:

    Initially, tag each entry in the table as empty.

    Apply the hash function to the key and take the remainder of the hash value divided by the table size to obtain the index. If that entry is empty, insert the item.

    Otherwise, start at the index after the hash index and scan successive indices. The insertion takes place at the first open location.

    Linear Probing (continued)

    If the search returns to the original hash location without finding an open slot, the table is full and the linear probing algorithm throws an exception.

    tableIndex = x % 11

    Linear Probing (concluded)

    If the table size is large relative to the number of items, linear probing works well, because a good hash function produces indices that are distributed over the whole table and collisions are minimal. As the ratio of the table size to the number of items approaches 1, the algorithm can perform worse than a sequential search.

    Linear Probing (continued)

    // compute hash index of item for a table of size n
    int index = (item.hashCode() & Integer.MAX_VALUE) % n, origIndex;

    // save the original hash index
    origIndex = index;

    // cycle through the table looking for an empty slot, a
    // match or a table full condition (origIndex == index).
    do
    {
        // test whether the table slot is empty or the key matches
        // the data field of the table entry
        if table[index] is empty
            insert item in table at table[index] and return
        else if table[index] matches item
            return

        // begin a probe starting at the next table location
        index = (index+1) % n;
    } while (index != origIndex);

    // we have gone around table without finding match or open slot
    throw new BufferOverflowException();
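The pseudocode above can be turned into a runnable class along the following lines. This is a sketch under stated assumptions: the class name LinearProbeTable, the use of null to mark an empty slot, and the choice to return the slot index are illustrative and not taken from the slides.

```java
import java.nio.BufferOverflowException;

final class LinearProbeTable {
    // null marks an empty slot (an assumption of this sketch)
    private final Integer[] table;

    LinearProbeTable(int n) {
        table = new Integer[n];
    }

    // stores item if it is absent and returns the index of its slot;
    // throws BufferOverflowException when the table is full
    int add(int item) {
        int n = table.length;
        int index = (Integer.hashCode(item) & Integer.MAX_VALUE) % n;
        int origIndex = index;

        do {
            if (table[index] == null) {
                table[index] = item;      // first open slot: insert here
                return index;
            } else if (table[index] == item) {
                return index;             // already present
            }
            index = (index + 1) % n;      // probe the next location
        } while (index != origIndex);

        // went around the whole table without a match or an open slot
        throw new BufferOverflowException();
    }

    public static void main(String[] args) {
        LinearProbeTable t = new LinearProbeTable(11);
        int[] keys = {54, 77, 94, 89, 14, 45, 35, 76};
        for (int key : keys) {
            System.out.println(key + " -> slot " + t.add(key));
        }
    }
}
```

With 11 slots, 45 hashes to slot 1, which 89 already holds, so it lands in slot 2. The last key, 76, hashes to slot 10 and has to probe past six occupied slots before it reaches slot 5.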

    HashCode

    The order of the key/value pairs stored in a HashMap table depends on the capacity of the table and on the hash code values of the objects.

    Chaining with Separate Lists

    Chaining with separate lists defines the hash table as an indexed sequence of linked lists. Each list, called a bucket, holds the set of items that hash to the same table location.

    Chaining with Separate Lists (continued)

    A bucket is a singly linked list. Each entry of the array is the first node in the sequence of items that hash to that table index. A node has two fields, one for the value and one for a reference to the next node.

    Chaining with Separate Lists (continued)

    Consider the following sequence of eight elements {54, 77, 94, 89, 14, 45, 35, 76} with the identity hash function and tableSize = 11. The figure displays the lists. Each entry in a table includes the number of probes to add the element.
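Since the figure itself is not reproduced here, the buckets for this sequence can be recomputed. The sketch below assumes each new element is appended to the rear of its bucket; the class name ChainingDemo and the use of java.util.LinkedList are illustrative choices.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

final class ChainingDemo {
    // distribute keys into tableSize buckets with the identity hash function
    static List<LinkedList<Integer>> buildBuckets(int[] keys, int tableSize) {
        List<LinkedList<Integer>> buckets = new ArrayList<>();
        for (int i = 0; i < tableSize; i++) {
            buckets.add(new LinkedList<>());
        }
        for (int key : keys) {
            // assumption: append at the rear of the bucket
            buckets.get(key % tableSize).addLast(key);
        }
        return buckets;
    }

    public static void main(String[] args) {
        int[] keys = {54, 77, 94, 89, 14, 45, 35, 76};
        List<LinkedList<Integer>> buckets = buildBuckets(keys, 11);
        for (int i = 0; i < buckets.size(); i++) {
            if (!buckets.get(i).isEmpty()) {
                System.out.println("bucket " + i + ": " + buckets.get(i));
            }
        }
    }
}
```

Under that assumption, buckets 1 and 10 each hold two elements ([89, 45] and [54, 76]) and the other four nonempty buckets (0, 2, 3, 6) hold one each.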

    Chaining with Separate Lists (concluded)

    Chaining with separate lists is generally faster than linear probing because chaining searches only the items that hash to the same table location.

    With linear probing, the number of table entries is limited to the table size, whereas the linked lists used in chaining grow as needed.

    To remove an element, simply delete it from the associated list.

    Rehashing

    As the number of entries in the hash table increases, search performance deteriorates. Rehashing increases the hash table size when the number of entries in the table is a specified percentage of its size.

    A Hash Table as a Collection

    The generic class Hash stores elements in a hash table using chaining with separate lists and implements the Collection interface.

    hashCode() must be provided by the generic type.

    The constructor creates a hash table with initial size 17. The table grows as rehashing occurs.

    The method toString() returns a comma-separated list that, by the nature of hashing, is not ordered.

    Hash Class Implementation

    The hash table is an array whose elements are the first node in a singly linked list.

    Define an inner class Entry with an integer field hashValue that stores the hash code value and avoids recomputing the hash function during rehashing.

    hashValue = item.hashCode() & Integer.MAX_VALUE;

    Hash Class Instance Variables

    The Entry array, table, defines the singly-linked lists that store the elements.

    The integer variable hashTableSize specifies the number of entries in the table.

    The variable tableThreshold has the value

    (int)(table.length * MAX_LOAD_FACTOR)

    where the double constant MAX_LOAD_FACTOR specifies the maximum allowed ratio of the elements in the table and the table size.

    Hash Class Instance Variables (concluded)

    MAX_LOAD_FACTOR = 0.75 (number of hash table entries is 75% of the table size) is generally a good value. When the number of elements in the table equals tableThreshold, a rehash occurs.

    The variable modCount is used by iterators to determine whether external updates may have invalidated the scan.

    Hash Class Constructor

    The Hash class constructor creates the 17-element array table with 17 empty lists. A rehash will first occur when the hash collection size equals 12.
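The value 12 follows from truncating 17 * 0.75 = 12.75 to an int. The sketch below recomputes that threshold and, using the growth rule 2*table.length + 1 that add() applies, the thresholds after the next two rehashes. The class and method names are illustrative.

```java
final class RehashThresholdDemo {
    static final double MAX_LOAD_FACTOR = 0.75;

    // number of elements at which a table of the given length is rehashed
    static int threshold(int tableLength) {
        return (int) (tableLength * MAX_LOAD_FACTOR);
    }

    public static void main(String[] args) {
        int length = 17;
        for (int step = 0; step < 3; step++) {
            System.out.println("table length " + length
                    + " -> threshold " + threshold(length));
            length = 2 * length + 1;   // size requested by the next rehash
        }
        // prints thresholds 12, 26 and 53 for lengths 17, 35 and 71
    }
}
```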

    Hash Class Outline

    public class Hash<T> implements Collection<T>
    {
        // the hash table
        private Entry[] table;
        private int hashTableSize;
        private final double MAX_LOAD_FACTOR = .75;
        private int tableThreshold;

        // for iterator consistency checks
        private int modCount = 0;

        // construct an empty hash table with 17 buckets
        public Hash()
        {
            table = new Entry[17];
            hashTableSize = 0;
            tableThreshold =
                (int)(table.length * MAX_LOAD_FACTOR);
        }
        . . .
    }

    Hash Class add()

    The algorithm for add():

    Compute the hash index for the parameter item and scan the list to see if item is currently in the hash table. If so, return false.

    Create a new Entry with value item and insert it at the front of the list.

    hashValue is assigned to the entry so it will not have to be computed when rehashing occurs.

    Hash add() (continued)

    // add item to the hash table if it is not
    // already present and return true; otherwise,
    // return false
    public boolean add(T item)
    {
        // compute the hash table index
        int hashValue = item.hashCode() & Integer.MAX_VALUE,
            index = hashValue % table.length;
        Entry<T> entry;

        // entry references the front of a linked
        // list of colliding values
        entry = table[index];

        // scan the linked list and return false
        // if item is in list
        while (entry != null)
        {
            if (entry.value.equals(item))
                return false;
            entry = entry.next;
        }

        // we will add item, so increment modCount
        modCount++;

        // create the new table entry so its successor
        // is the current head of the list
        entry = new Entry<T>(item, hashValue,
                             (Entry<T>)table[index]);

        // add it at the front of the linked list
        // and increment the size of the hash table
        table[index] = entry;
        hashTableSize++;

        if (hashTableSize >= tableThreshold)
            rehash(2*table.length + 1);

        // a new entry is added
        return true;
    }

    Hash Class rehash() (continued)

            // see if there is a linked list present
            if (entry != null)
            {
                // have at least one element in a linked list
                do
                {
                    // record the next entry in the
                    // original linked list
                    nextEntry = entry.next;

                    // compute the new table index
                    index = entry.hashValue % newTableSize;

                    // insert entry at the front of the
                    // new table's linked list at
                    // location index
                    entry.next = newTable[index];
                    newTable[index] = entry;

                    // assign the next entry in the
                    // original linked list to entry
                    entry = nextEntry;
                } while (entry != null);
            }
        }

        // the table is now newTable
        table = newTable;

        // update the table threshold
        tableThreshold =
            (int)(table.length * MAX_LOAD_FACTOR);

        // let garbage collection get rid of oldTable
        oldTable = null;
    }
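To see add() and rehash() work together, here is a compact, self-contained sketch in the same spirit. It is not the Hash class from the slides: the name MiniHash, the helper allocate(), and the methods contains(), size() and capacity() are illustrative, and the opening of rehash() is this sketch's own reading of the idea, not the original listing.

```java
final class MiniHash<T> {
    // one node of a bucket; hashValue is cached so rehashing never calls hashCode()
    private static final class Entry<E> {
        final E value;
        final int hashValue;
        Entry<E> next;

        Entry(E value, int hashValue, Entry<E> next) {
            this.value = value;
            this.hashValue = hashValue;
            this.next = next;
        }
    }

    private static final double MAX_LOAD_FACTOR = 0.75;
    private Entry<T>[] table = allocate(17);
    private int size = 0;
    private int tableThreshold = (int) (17 * MAX_LOAD_FACTOR);

    @SuppressWarnings("unchecked")
    private static <E> Entry<E>[] allocate(int n) {
        return (Entry<E>[]) new Entry[n];
    }

    boolean add(T item) {
        int hashValue = item.hashCode() & Integer.MAX_VALUE;
        int index = hashValue % table.length;
        for (Entry<T> e = table[index]; e != null; e = e.next) {
            if (e.value.equals(item)) {
                return false;                       // already present
            }
        }
        table[index] = new Entry<>(item, hashValue, table[index]);
        size++;
        if (size >= tableThreshold) {
            rehash(2 * table.length + 1);
        }
        return true;
    }

    boolean contains(Object item) {
        int index = (item.hashCode() & Integer.MAX_VALUE) % table.length;
        for (Entry<T> e = table[index]; e != null; e = e.next) {
            if (e.value.equals(item)) {
                return true;
            }
        }
        return false;
    }

    // move every entry into a larger table, reusing the cached hash values
    private void rehash(int newTableSize) {
        Entry<T>[] newTable = allocate(newTableSize);
        for (Entry<T> entry : table) {
            while (entry != null) {
                Entry<T> nextEntry = entry.next;
                int index = entry.hashValue % newTableSize;
                entry.next = newTable[index];       // insert at the front
                newTable[index] = entry;
                entry = nextEntry;
            }
        }
        table = newTable;
        tableThreshold = (int) (table.length * MAX_LOAD_FACTOR);
    }

    int size() { return size; }

    int capacity() { return table.length; }

    public static void main(String[] args) {
        MiniHash<Integer> set = new MiniHash<>();
        for (int i = 0; i < 100; i++) {
            set.add(i);
        }
        // three rehashes occur: 17 -> 35 -> 71 -> 143 buckets
        System.out.println(set.size() + " elements in " + set.capacity() + " buckets");
    }
}
```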

    Hash remove()

    Compute the hash table index. Using variables prev and curr that move through the linked list in tandem, search for item.

    If not present, return false; otherwise, remove item from the list. If prev == null, this involves updating table[index] to reference the successor to the front of the list. Decrement hashTableSize, increment modCount, and return true.

    Hash remove() (continued)

    public boolean remove(Object item)
    {
        // compute the hash table index
        int index = (item.hashCode() &
                     Integer.MAX_VALUE) % table.length;
        Entry<T> curr, prev;

        // curr references the front of a
        // linked list of colliding values;
        // initialize prev to null
        curr = table[index];
        prev = null;

        // scan the linked list for item
        while (curr != null)
            if (curr.value.equals(item))
            {
                // we have located item and will remove
                // it; increment modCount
                modCount++;

                // if prev is not null, curr is not the front
                // of the list; just skip over curr
                if (prev != null)
                    prev.next = curr.next;
                else
                    // curr is front of the list; the
                    // new front of the list is curr.next
                    table[index] = curr.next;

                // decrement hash table size and return true
                hashTableSize--;
                return true;
            }
            else
            {
                // move prev and curr forward
                prev = curr;
                curr = curr.next;
            }

        return false;
    }

    Hash Class Iterators

    Search the hash table for the first nonempty bucket in the array of linked lists. Once the bucket is located, the iterator traverses all of the elements in the corresponding linked list and then continues the process by looking for the next nonempty bucket. The iterator reaches the end of the table when it reaches the end of the list for the last nonempty bucket.

    Hash Class Iterators (continued)

    Iterator objects are instances of the inner class IteratorImpl whose variables are:

    The integer index that identifies the current bucket (table[index]) scanned by the iterator.

    The Entry reference next pointing to the current node in the current bucket.

    The variable lastReturned that references the last value returned by next().

    The iterator variable expectedModCount used in conjunction with the collection variable modCount.

    Hash Class Iterators (continued)

    // inner class that implements hash table iterators
    private class IteratorImpl implements Iterator<T>
    {
        // next entry to return
        Entry<T> next;
        // to check iterator consistency
        int expectedModCount;
        // index of current bucket
        int index;
        // reference to the last value returned by next()
        T lastReturned;
        . . .
    }

    Hash Class Iterators (continued)

    The elements enter the collection in the order (19, 32, 11, 27) using the identity hash function. The iterator visits the elements in the order (11, 32, 27, 19).
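The visiting order can be checked with a short simulation of bucket-by-bucket traversal. The text does not state the table size behind this example. A table of 10 buckets reproduces the stated order, so that value is used here purely as an assumption. The 17-bucket table created by the Hash constructor would instead give (19, 27, 11, 32). The class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

final class IterationOrderDemo {
    private static final class Node {
        final int value;
        final Node next;

        Node(int value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    // insert keys at the front of their buckets (identity hash), then
    // visit bucket 0, 1, 2, ... and walk each list from front to rear
    static List<Integer> iterationOrder(int[] keys, int tableSize) {
        Node[] table = new Node[tableSize];
        for (int key : keys) {
            int index = key % tableSize;
            table[index] = new Node(key, table[index]);
        }
        List<Integer> order = new ArrayList<>();
        for (Node bucket : table) {
            for (Node n = bucket; n != null; n = n.next) {
                order.add(n.value);
            }
        }
        return order;
    }

    public static void main(String[] args) {
        int[] keys = {19, 32, 11, 27};
        // tableSize = 10 is an assumption, not a value given in the slides
        System.out.println(iterationOrder(keys, 10)); // [11, 32, 27, 19]
        System.out.println(iterationOrder(keys, 17)); // [19, 27, 11, 32]
    }
}
```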

    Hash Iterator Constructor

    A loop iterates up the list of buckets until it locates the first nonempty bucket. The loop variable i becomes the initial value for index and table[i] references the front of the list. This is the initial value for next.

    Hash Iterator Constructor (concluded)

    IteratorImpl()
    {
        int i = 0;
        Entry<T> n = null;

        // the expected modCount starts at modCount
        expectedModCount = modCount;

        // find the first nonempty bucket
        if (hashTableSize != 0)
            while (i < table.length &&
                   ((n = table[i]) == null))
                i++;

        next = n;
        index = i;
        lastReturned = null;
    }

    Hash Iterator next()

    The method next() first determines that the operation is valid by checking that modCount and expectedModCount are equal and that we are not at the end of the hash table.

    If the iterator is in a consistent state, next() saves entry.value in lastReturned and uses a loop index i and entry to perform the iterator scan for the subsequent element in the hash table. The return value is lastReturned.

    Hash Iterator next() (continued)

    public T next()
    {
        // check for iterator consistency
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();

        // we will return the value in Entry object next
        Entry<T> entry = next;

        // if entry is null, we are at the end of the table
        if (entry == null)
            throw new NoSuchElementException();

        // capture the value we will return
        lastReturned = entry.value;
        // move to the next entry in the current
        // linked list
        Entry<T> n = entry.next;
        // record the current bucket index
        int i = index;

        if (n == null)
        {
            // we are at the end of a bucket; search for the
            // next nonempty bucket
            i++;
            while (i < table.length &&
                   (n = table[i]) == null)
                i++;
        }

        index = i;
        next = n;

        return lastReturned;
    }

    Hash Iterator remove()

    public void remove()
    {
        // check for a missing call to next() or previous()
        if (lastReturned == null)
            throw new IllegalStateException(
                "Iterator call to next() " +
                "required before calling remove()");

        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();

        // remove lastReturned by calling remove() in Hash;
        // this call will increment modCount
        Hash.this.remove(lastReturned);
        expectedModCount = modCount;
        lastReturned = null;
    }
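The same modCount discipline is visible in the standard java.util.HashSet, which is used below in place of the Hash class from the slides. This sketch contrasts removal through the iterator, which keeps the iterator consistent, with an external update, which makes the following next() call throw. The class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

final class FailFastDemo {
    // removes every even value through the iterator's own remove()
    static Set<Integer> removeEvens(Set<Integer> source) {
        Set<Integer> set = new HashSet<>(source);
        for (Iterator<Integer> it = set.iterator(); it.hasNext(); ) {
            if (it.next() % 2 == 0) {
                it.remove();   // the iterator resynchronizes its expected count
            }
        }
        return set;
    }

    // updates the set behind the iterator's back and reports whether
    // the following next() call detects it
    static boolean externalUpdateDetected() {
        Set<Integer> set = new HashSet<>(Arrays.asList(1, 2, 3));
        Iterator<Integer> it = set.iterator();
        it.next();
        set.add(99);           // external update: changes the set's modification count
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(removeEvens(new HashSet<>(Arrays.asList(1, 2, 3, 4, 5, 6))));
        System.out.println(externalUpdateDetected()); // true
    }
}
```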

    The HashMap Collection

    The design of the HashMap collection is similar to the implementation of TreeMap. A HashMap is not ordered since the position of elements depends on hashing the keys. This affects the method toString() which returns a listing of the elements based on the iterator order.

    The HashMap Collection (continued)

    The HashMap class stores elements in a hash table containing linked lists of Entry objects. The inner class Entry contains key-value pairs.

    The HashMap Collection (continued)

    The inner class Entry implements the Map.Entry interface which defines the methods getKey(), getValue() and setValue(). A toString() method returns a representation of an entry in the format "key=value". The constructor has arguments for each field in the node.

    Entry Class (partial listing)

    static class Entry<K,V> implements Map.Entry<K,V>
    {
        K key;
        V value;
        Entry<K,V> next;
        int hashValue;

        // make a new entry with given key, value
        Entry(K key, V value, int hashValue,
              Entry<K,V> next)
        {
            this.key = key;
            this.value = value;
            this.hashValue = hashValue;
            this.next = next;
        }
        ...
    }

    Accessing Entries in a HashMap

    The methods get() and containsKey() take a key reference argument and must locate a corresponding entry in the map.

    This task is performed by the private HashMap method getEntry() which takes a key as an argument, applies the hash function to the key and searches the resulting list for a key-value pair with the same key.

    Accessing Entries in a HashMap (continued)

    // return a reference to the entry with the specified key
    // if there is one in the hash map; otherwise, return null
    public Entry<K,V> getEntry(K key)
    {
        int index = (key.hashCode() &
                     Integer.MAX_VALUE) % table.length;
        Entry<K,V> entry;

        entry = table[index];

        while (entry != null)
        {
            if (entry.key.equals(key))
                return entry;
            entry = entry.next;
        }

        return null;
    }

    Updating Entries in a HashMap

    The method put() updates the HashMap.

    Construct a table index by applying the hash function for the key and scan the linked list for a match with the key. If a match occurs, apply setValue() and return its result.

    If key does not occur in the list, insert a new Entry object at the front of the linked list. If the hash map size has reached the table threshold, apply rehashing. Conclude by returning null.

    Updating Entries in a HashMap (continued)

    // assigns value as the value associated with key
    // in this map and returns the previous value
    // associated with the key, or null if there
    // was no mapping for the key
    public V put(K key, V value)
    {
        // compute the hash table index
        int hashValue = key.hashCode() & Integer.MAX_VALUE,
            index = hashValue % table.length;
        Entry<K,V> entry;

        // entry references the front of a linked
        // list of colliding values
        entry = table[index];

        // scan the linked list. if key matches the key in an
        // entry, return entry.setValue(value). this
        // replaces the value in the entry and returns the
        // previous value
        while (entry != null)
        {
            if (entry.key.equals(key))
                return entry.setValue(value);
            entry = entry.next;
        }

        // we will add item, so increment modCount
        modCount++;

        // create the new table entry so its successor
        // is the current head of the list
        entry = new Entry<K,V>(key, value, hashValue,
                               (Entry<K,V>)table[index]);

        // add it at the front of the linked list
        // and increment the size of the hash map
        table[index] = entry;
        hashMapSize++;

        if (hashMapSize >= tableThreshold)
            rehash(2*table.length + 1);

        return null; // a new entry is inserted
    }
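The return-value contract of put() is the same in the standard java.util.HashMap, which is used here as a stand-in for the HashMap class built in the slides. The class name PutDemo and the helper putAndReport() are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

final class PutDemo {
    // calls put() and describes what it returned
    static String putAndReport(Map<String, Integer> map, String key, int value) {
        Integer previous = map.put(key, value);
        return previous == null
                ? "new entry for " + key
                : "replaced " + previous + " for " + key;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        System.out.println(putAndReport(map, "java", 1)); // new entry for java
        System.out.println(putAndReport(map, "java", 2)); // replaced 1 for java
        System.out.println(map.get("java"));              // 2
        System.out.println(map.size());                   // 1
    }
}
```

One caveat with the standard class: it allows null values, so a null result can also mean the key was previously mapped to null.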

    HashSet Class (continued)

    public class HashSet<T> implements Set<T>
    {
        // value for each key in the map
        private static final Object PRESENT = new Object();

        // set implemented using a hash map
        private HashMap<T,Object> map;

        // create an empty set object
        public HashSet()
        { map = new HashMap<T,Object>(); }
        . . .
    }

    HashSet add()

    The set methods are implemented with map methods that use the entry as the argument.

    add() uses the map method put(). If a duplicate exists, then put() simply updates the value field of the entry to PRESENT which is its current value. The map method returns null if a new element is added, so a return value of null indicates that the add() inserted item.

    HashSet add() (concluded)

    public boolean add(T item)
    {
        return map.put(item, PRESENT) == null;
    }

    HashSet iterator()

    The HashSet iterator must traverse the keys in the map. Implement the method iterator() by returning an iterator for the key set collection view of the map.

    // returns an iterator for the elements in the set
    public Iterator<T> iterator()
    {
        return map.keySet().iterator();
    }

    HashSet remove()

    The HashSet remove() method calls the remove() method for the map. To determine whether an element was removed from the set, verify that the return value from the map remove() call is the reference PRESENT.

    public boolean remove(Object obj)
    { return map.remove(obj) == PRESENT; }
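The three set methods shown so far fit together as follows. This sketch backs the set with the standard java.util.HashMap instead of the HashMap class from the slides. The class name MapBackedSet and the extra methods contains() and size() are illustrative additions.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class MapBackedSet<T> implements Iterable<T> {
    // shared dummy value stored with every key
    private static final Object PRESENT = new Object();

    private final Map<T, Object> map = new HashMap<>();

    // put() returns null only when the key was not in the map before
    boolean add(T item) {
        return map.put(item, PRESENT) == null;
    }

    // remove() returns the old value, PRESENT, only when the key existed
    boolean remove(Object obj) {
        return map.remove(obj) == PRESENT;
    }

    boolean contains(Object obj) {
        return map.containsKey(obj);
    }

    int size() {
        return map.size();
    }

    // the set's elements are exactly the map's keys
    @Override
    public Iterator<T> iterator() {
        return map.keySet().iterator();
    }

    public static void main(String[] args) {
        MapBackedSet<String> set = new MapBackedSet<>();
        System.out.println(set.add("java"));    // true
        System.out.println(set.add("java"));    // false, duplicate
        System.out.println(set.remove("java")); // true
        System.out.println(set.remove("java")); // false, already gone
    }
}
```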

    Hash Table Performance

    A good hash function provides a uniform distribution of hash values.

    Hash table performance is measured by using the load factor λ = n/m, where n is the number of elements in the hash table and m is the number of buckets.

    For linear probe, 0 ≤ λ ≤ 1.

    For chaining with separate lists, it is possible that λ > 1.

    Hash Table Performance (continued)

    The worst case for linear probe or chaining with separate lists occurs when all data items hash to the same table location. If the table contains n elements, the search time is O(n), no better than that for the sequential search.

    Hash Table Performance (continued)

    Assume that the hash function uniformly distributes indices around the hash table.

    We can expect λ = n/m elements in each bucket.

    On the average, an unsuccessful search makes λ comparisons before arriving at the end of a list and returning failure.

    Mathematical analysis shows that the average number of probes for a successful search is approximately 1 + λ/2.

    Hash Table Performance (concluded)

    Assume the number of elements n in the hash table is bounded by some amount, say, R*m, where m is the table size.

    In this case, λ = n/m ≤ (R*m)/m = R, and the following relationships hold for the average cases, so the average running time is O(1)!

    S ≈ 1 + λ/2 ≤ 1 + R/2    (Successful Search)
    U = λ ≤ R                (Unsuccessful Search)
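As a worked instance of these bounds, write the load factor n/m as λ and take R = 0.75, the MAX_LOAD_FACTOR used by the Hash class. A 17-bucket table holding 11 elements then has λ ≈ 0.65, and neither average can exceed the bound computed from R. The sketch below only evaluates the formulas; the class and method names are illustrative.

```java
final class LoadFactorDemo {
    // load factor: n elements spread over m buckets
    static double loadFactor(int n, int m) {
        return (double) n / m;
    }

    // approximate average probes for a successful search: 1 + lambda/2
    static double successfulProbes(double lambda) {
        return 1 + lambda / 2;
    }

    // average comparisons for an unsuccessful search: lambda
    static double unsuccessfulProbes(double lambda) {
        return lambda;
    }

    public static void main(String[] args) {
        double r = 0.75;                     // bound R on the load factor
        double lambda = loadFactor(11, 17);  // about 0.647

        System.out.println("lambda = " + lambda);
        System.out.println("S ~ " + successfulProbes(lambda)
                + " <= " + successfulProbes(r));    // bound is 1.375
        System.out.println("U = " + unsuccessfulProbes(lambda)
                + " <= " + unsuccessfulProbes(r));  // bound is 0.75
    }
}
```

Because both bounds depend only on R and not on n, the average search cost stays constant as the table grows, provided rehashing keeps λ below R.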

    Evaluating Ordered and Unordered Sets and Maps

    Use an ordered set or map if an iteration should return elements in order (average search time O(log2 n)). Use an unordered set or map when fast access and updates are needed without any concern for the ordering of elements (average search time O(1)).

    Timing Example

    Program SearchComp.java:

    Reads a file of 25025 randomly ordered words and inserts each word into a TreeSet and into a HashSet.

    Determines the amount of time required to build both of the data structures.

    Shuffles the input from the file and times a search of the TreeSet and HashSet for each word in the shuffled input.

    Displays the time required for each search technique.

    Timing Example (concluded)

    Run:
        Number of words is 25025
        Built TreeSet in 0.078 seconds
        Built HashSet in 0.047 seconds
        TreeSet search time is 0.078 seconds
        HashSet search time is 0.016 seconds

    Note that the HashSet search time is considerably better than that for a TreeSet.
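A comparison of this kind can be approximated with the standard java.util collections. The sketch below is not SearchComp.java, whose source is not shown. It generates random lowercase words instead of reading a file, and the class name, helper names and seeds are illustrative. Timings depend on the machine and the JVM, so no expected output is given.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.TreeSet;

final class SearchTimingSketch {
    // generate count distinct pseudo-random lowercase words (a stand-in for the word file)
    static List<String> makeWords(int count, long seed) {
        Random rnd = new Random(seed);
        Set<String> unique = new LinkedHashSet<>();
        while (unique.size() < count) {
            StringBuilder sb = new StringBuilder();
            int len = 3 + rnd.nextInt(8);
            for (int i = 0; i < len; i++) {
                sb.append((char) ('a' + rnd.nextInt(26)));
            }
            unique.add(sb.toString());
        }
        return new ArrayList<>(unique);
    }

    // search set for every query and return the elapsed time in nanoseconds
    static long timeSearch(Set<String> set, List<String> queries) {
        long start = System.nanoTime();
        int found = 0;
        for (String q : queries) {
            if (set.contains(q)) {
                found++;
            }
        }
        long elapsed = System.nanoTime() - start;
        if (found != queries.size()) {
            throw new IllegalStateException("a word was not found");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        List<String> words = makeWords(25025, 42L);

        long t0 = System.nanoTime();
        Set<String> tree = new TreeSet<>(words);
        long t1 = System.nanoTime();
        Set<String> hash = new HashSet<>(words);
        long t2 = System.nanoTime();

        List<String> shuffled = new ArrayList<>(words);
        Collections.shuffle(shuffled, new Random(7L));

        System.out.printf("Built TreeSet in %.3f seconds%n", (t1 - t0) / 1e9);
        System.out.printf("Built HashSet in %.3f seconds%n", (t2 - t1) / 1e9);
        System.out.printf("TreeSet search time is %.3f seconds%n",
                timeSearch(tree, shuffled) / 1e9);
        System.out.printf("HashSet search time is %.3f seconds%n",
                timeSearch(hash, shuffled) / 1e9);
    }
}
```

A single unwarmed run like this is only a rough indication; JIT compilation and garbage collection can distort measurements this short.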