Min Hashing

jeffp/teaching/cs5140-S19/cs... · 2020. 1. 1.


Jaccard similarity: $JS(A, B) = \frac{|A \cap B|}{|A \cup B|}$


Family of Hash Functions

Family $\mathcal{H}$ of hash functions: draw $h_a \in \mathcal{H}$ at random. Once drawn, $h_a$ is deterministic.

$h_a$ has domain $D = [n]$; sets $S_i \subset [n]$.

Important: the range of $h_a$ has an order.

Min hash: $g_\sigma(S) = \min_{s \in S} h_\sigma(s)$, where $\sigma$ = permutation of $[n]$.

Example with $[n] = [10]$ (the example sets and permutation tables are illegible in the source): $g_{\sigma_1}(S) \to 6$, $g_{\sigma_2}(S) \to 2$.
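A minimal sketch of the min hash $g_\sigma(S) = \min_{s \in S} h_\sigma(s)$ under a random permutation, written in Python. The helper names `random_permutation_hash` and `min_hash`, the seed, and the example set are hypothetical and chosen only for illustration; elements are assumed to be the integers 1..n.

```python
import random

def random_permutation_hash(n, rng):
    """Return a dict h sending each element of {1..n} to its value under a random permutation."""
    values = list(range(1, n + 1))
    rng.shuffle(values)
    return {x: values[x - 1] for x in range(1, n + 1)}

def min_hash(S, h):
    """g(S) = minimum of h(s) over all s in S."""
    return min(h[s] for s in S)

rng = random.Random(0)   # fixed seed only so the run is repeatable
h = random_permutation_hash(10, rng)
S = {2, 3, 5, 7}         # hypothetical example set, not taken from the notes
value = min_hash(S, h)
```

Because `h` is a permutation, no two elements share a hash value, so the minimum is attained by exactly one element of `S`.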


Way to go from a set $S_1 \subset [n]$ to a vector $v(S_1)$:

$g_1(S_1) \to v_1$, $g_2(S_1) \to v_2$, $\dots$, $g_k(S_1) \to v_k$

$v = (v_1, v_2, \dots, v_k) \in [n]^k$, e.g. $v(S_1) = (3, 1, 6, \dots, x)$

$\widehat{JS}(S_1, S_2) = \frac{1}{k} \sum_{i=1}^{k} \mathbb{1}\big(v_i(S_1) = v_i(S_2)\big)$
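The estimator above is just the fraction of coordinates on which two signature vectors agree. A small Python sketch follows; the function name `estimate_js` and the two signatures are hypothetical values made up for illustration.

```python
def estimate_js(v1, v2):
    """Fraction of coordinates where the two signatures agree."""
    if len(v1) != len(v2) or not v1:
        raise ValueError("signatures must be non-empty and of equal length k")
    matches = sum(1 for a, b in zip(v1, v2) if a == b)
    return matches / len(v1)

# Hypothetical signatures with k = 5 (not from the notes)
v_s1 = (3, 1, 6, 2, 4)
v_s2 = (3, 5, 6, 2, 1)
est = estimate_js(v_s1, v_s2)   # coordinates 1, 3 and 4 agree, so 3 of 5
```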


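For any two sets $S_1, S_2$ and $h_{\sigma_1}, h_{\sigma_2}, \dots, h_{\sigma_k}$ (i.i.d.):

$E[\widehat{JS}(S_1, S_2)] = JS(S_1, S_2)$

$E[\widehat{JS}(S_1, S_2)] = E\Big[\frac{1}{k} \sum_{i=1}^{k} \mathbb{1}\big(g_i(S_1) = g_i(S_2)\big)\Big] = \frac{1}{k} \sum_{i=1}^{k} E\big[\mathbb{1}\big(g_i(S_1) = g_i(S_2)\big)\big] = \Pr[g_i(S_1) = g_i(S_2)] = JS(S_1, S_2)$

Decompose $[n] \to A, B, C$:

A: objects hashed to by $s \in S_1$ and $s \in S_2$
B: objects hashed to by $s \in S_1$ or $s \in S_2$, not both
C: objects hashed to by $x \notin S_1 \cup S_2$

$JS = \frac{|A|}{|A| + |B|}$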
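The step $\Pr[g_i(S_1) = g_i(S_2)] = JS(S_1, S_2)$ can be checked exactly on a tiny universe: the smallest hash value among $A \cup B$ is equally likely to fall on any of those elements, and the two min hashes agree exactly when it falls in $A$. The Python sketch below enumerates all permutations of a small $[n]$; the sets, $n = 6$, and the function name are hypothetical choices for illustration.

```python
from fractions import Fraction
from itertools import permutations

def exact_collision_probability(S1, S2, n):
    """Exact Pr[min-hash(S1) == min-hash(S2)] over all n! permutations of {1..n}."""
    hits = 0
    total = 0
    for perm in permutations(range(1, n + 1)):
        h = {x: perm[x - 1] for x in range(1, n + 1)}
        if min(h[s] for s in S1) == min(h[s] for s in S2):
            hits += 1
        total += 1
    return Fraction(hits, total)

S1 = {1, 2, 3}           # hypothetical sets over [6]
S2 = {2, 3, 4}
A = S1 & S2              # in both sets
B = S1 ^ S2              # in exactly one set
jaccard = Fraction(len(A), len(A) + len(B))
prob = exact_collision_probability(S1, S2, 6)
```

Enumeration is only feasible for very small $n$ (here $6! = 720$ permutations); it is meant as a sanity check, not as a way to compute similarities.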


Fast MinHash Algorithm

Choose $k$ (random) hash functions $f_i : [n] \to [m]$, $m > n$.

$v_i = \infty$ for all $i$

for $x \in S$:
    for $i = 1, \dots, k$:
        if $f_i(x) < v_i$ then $v_i = f_i(x)$

Return $v = (v_1, v_2, \dots, v_k)$
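The loop above can be sketched in Python as follows. The notes only ask for random hash functions $f_i : [n] \to [m]$ with $m > n$; the linear form $f_i(x) = (a_i x + b_i) \bmod P$ with a fixed prime $P$ playing the role of $m$ is an assumption made here for illustration, and it only approximates a truly random function. The names `make_hash_functions` and `fast_min_hash`, the seed, and the example sets are likewise hypothetical.

```python
import math
import random

P = 2_147_483_647  # prime 2**31 - 1, used as m (assumes every element x satisfies x < P)

def make_hash_functions(k, rng):
    """Draw k pairs (a, b), each defining f_i(x) = (a*x + b) mod P."""
    return [(rng.randrange(1, P), rng.randrange(0, P)) for _ in range(k)]

def fast_min_hash(S, hash_params):
    """One pass over S, keeping the running minimum of each f_i."""
    v = [math.inf] * len(hash_params)   # v_i = infinity for all i
    for x in S:
        for i, (a, b) in enumerate(hash_params):
            fx = (a * x + b) % P
            if fx < v[i]:
                v[i] = fx
    return v

rng = random.Random(1)                  # fixed seed only for repeatability
params = make_hash_functions(200, rng)
S1 = set(range(0, 80))                  # hypothetical example sets
S2 = set(range(40, 120))
v1 = fast_min_hash(S1, params)
v2 = fast_min_hash(S2, params)
estimate = sum(a == b for a, b in zip(v1, v2)) / len(params)
true_js = len(S1 & S2) / len(S1 | S2)   # 40 / 120
```

The pass touches each element of `S` once per hash function, so the cost is proportional to $|S| \cdot k$, and both sets must be hashed with the same `params` for their signatures to be comparable.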


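How large should $k$ be? (Chernoff-Hoeffding)

$X_1, X_2, \dots, X_k$ independent, $X_i \in [0, 1]$

$A = \frac{1}{k} \sum_{i=1}^{k} X_i$

$E[A] = E[X_i] = \mu$

$\Pr[|A - \mu| > \varepsilon] \le 2 \exp(-2 \varepsilon^2 k)$

$\varepsilon = 0.05$

$\delta = 2 \exp(-2 (0.05)^2 k)$

$\ln(\delta / 2) = -2 \varepsilon^2 k$

$\ln(2 / \delta) = 2 \left(\frac{1}{20}\right)^2 k \Rightarrow k = \frac{(20)^2}{2} \ln(2 / \delta) = 200 \ln(2 / \delta)$

With $\delta = 0.01$: $k = 200 \ln(200) \approx 1060$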
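The arithmetic above can be reproduced with a short Python sketch that solves $2\exp(-2\varepsilon^2 k) \le \delta$ for the smallest integer $k$. The function name `signature_length` is hypothetical.

```python
import math

def signature_length(eps, delta):
    """Smallest integer k with 2*exp(-2*eps**2*k) <= delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))

k = signature_length(0.05, 0.01)   # 200 * ln(200) is about 1059.7, rounded up
```

Since $k$ scales as $1/\varepsilon^2$, halving the error tolerance roughly quadruples the signature length, while tightening $\delta$ costs only a logarithmic factor.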


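Example revisited: $g_\sigma(S) = \min_{s \in S} h_\sigma(s)$ with $\sigma$ = permutation and $[n] = [10]$; $g_{\sigma_1}(S) \to 6$ and $g_{\sigma_2}(S) \to 2$. A third ordering, 1 6 4 2 7 8 10 9 5 3, labels each element A, B, or C according to its membership in $S_1$ and $S_2$ (the membership rows for $S_1$ and $S_2$ are illegible in the source).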