
Zhang first pdf


What he said might be right, but it does not include enough information, so we could regard this review as a useless review.

tokenize → POS-tagging → stopwords → lemmatize

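$$\mathrm{score}(w_i) = \frac{c^{*}_{w_i} - \sum_{k=1,\, k \neq *}^{5} \gamma \cdot c^{k}_{w_i}}{\log\left(\sum_{k=1}^{5} c^{k}_{w_i}\right)}, \quad \text{where } * \text{ is the most frequent rating}$$

$$\mathrm{score}(r_j) = \frac{\sum_{w_i \in r_j} \mathrm{score}(w_i)}{\mathrm{len}(r_j)}$$

$\gamma$: discount rate

$c^{k}_{w_i}$: number of occurrences of word $i$ in reviews with rating $k$

$\mathrm{len}(r_j)$: number of words in review $j$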
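
The two scores can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's implementation: the function names `word_scores` and `review_score`, the default `gamma=0.5`, the natural logarithm, the tie-breaking rule (lowest rating wins), and the choice to score a word as 0 when it occurs only once (where the logarithm would be 0) are all assumptions that the slide does not specify. Reviews are assumed to be already preprocessed into token lists.

```python
import math
from collections import Counter, defaultdict


def word_scores(reviews, gamma=0.5, ratings=(1, 2, 3, 4, 5)):
    """Compute score(w_i) for every word.

    reviews: iterable of (rating, tokens) pairs, tokens already preprocessed.
    gamma:   discount rate (the default value is an assumption).
    """
    # counts[word][k] plays the role of c^k_{w_i}
    counts = defaultdict(Counter)
    for rating, tokens in reviews:
        for word in tokens:
            counts[word][rating] += 1

    scores = {}
    for word, per_rating in counts.items():
        total = sum(per_rating[k] for k in ratings)
        if total <= 1:
            # log(1) == 0 would divide by zero; assumption: score such words as 0
            scores[word] = 0.0
            continue
        # "*" is the rating in which the word is most frequent
        # (assumption: ties go to the lowest rating)
        star = max(ratings, key=lambda k: per_rating[k])
        discounted = sum(gamma * per_rating[k] for k in ratings if k != star)
        # natural log is an assumption; the slide does not state the base
        scores[word] = (per_rating[star] - discounted) / math.log(total)
    return scores


def review_score(tokens, scores):
    """Compute score(r_j): mean word score over the words of one review."""
    if not tokens:
        return 0.0  # assumption: an empty review scores 0
    return sum(scores.get(w, 0.0) for w in tokens) / len(tokens)


if __name__ == "__main__":
    # Toy data, invented purely for illustration
    toy_reviews = [
        (5, ["great", "great", "food"]),
        (1, ["bad", "food"]),
        (5, ["great"]),
    ]
    toy_scores = word_scores(toy_reviews)
    for rating, tokens in toy_reviews:
        print(rating, tokens, review_score(tokens, toy_scores))
```

A word concentrated in a single rating gets a high score, while a word spread evenly across ratings is pulled toward zero by the discounted counts, so a review made up mostly of evenly spread words ends up with a low average.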

1 2 3 4 5