SentiWordNet: A publicly available lexical resource for opinion mining

SentiWordNet

Andrea Esuli* and Fabrizio Sebastiani, SentiWordNet: A Publicly Available Lexical Resource for Opinion Mining

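1. Determining whether a text is subjective or objective

-> SO polarity (Pang and Lee, 2004; Yu and Hatzivassiloglou, 2003)

2. Determining whether a subjective text is positive or negative

-> PN polarity (Pang and Lee, 2004; Turney, 2002)

3. Determining how strongly positive or negative a text is

e.g., weakly positive, mildly positive, strongly positive

-> strength of text PN polarity (Pang and Lee, 2005; Wilson et al., 2004)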

Polarity classification

Bo Pang and Lillian Lee, A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts

http://acl.ldc.upenn.edu/acl2004/main/pdf/319_pdf_2-col.pdf

WordNet

synset

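- A lexical database of English word meanings

- Groups words into synsets (sets of synonyms), combining a dictionary with a thesaurus of synonyms and antonyms.

- Created and maintained by the Cognitive Science Laboratory at Princeton University, under the direction of psychology professor George A. Miller.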

http://wordnetweb.princeton.edu/perl/webwn?s=fool&sub=Search+WordNet&o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&h=


http://ko.wikipedia.org/wiki/%ED%94%84%EB%A6%B0%EC%8A%A4%ED%84%B4_%EB%8C%80%ED%95%99

synset

Score: 0.0 to 1.0; the scores of each synset sum to 1

SentiWordNet

http://sentiwordnet.isti.cnr.it/search.php?q=fool


WordNet synset

Positive score

Neutral (objective) score

Negative score
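A minimal sketch of how one such entry could be represented and checked, assuming the three scores each lie in [0.0, 1.0] and sum to 1 as stated above. The class name `SentiEntry`, its field names, and the example values are hypothetical and are not taken from SentiWordNet itself.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class SentiEntry:
    """One row of a SentiWordNet-style table (hypothetical container)."""
    synset: str      # synset identifier, e.g. "example.n.01" (made up)
    positive: float
    neutral: float   # the "Neutral" score, i.e. the Objective label
    negative: float

    def __post_init__(self):
        scores = (self.positive, self.neutral, self.negative)
        # Each score must stay within the 0.0 to 1.0 range.
        if any(s < 0.0 or s > 1.0 for s in scores):
            raise ValueError("each score must lie in [0.0, 1.0]")
        # The three scores of a synset must sum to 1.
        if not math.isclose(sum(scores), 1.0, abs_tol=1e-9):
            raise ValueError("the three scores must sum to 1")


# Made-up values for illustration only.
entry = SentiEntry("example.n.01", positive=0.25, neutral=0.5, negative=0.25)
```

Validating at construction time means that an inconsistent row is rejected immediately rather than surfacing later in an analysis.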

Ternary classifier

Semi-supervised method

Training set

Lp, Ln -> K iterations -> Lp, Ln

WordNet relations applied

(antonyms, synonyms, derived words)
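The expansion step can be sketched roughly as follows, assuming that synonym links carry a term's orientation over unchanged and antonym links flip it. The function `expand_seeds` and the toy relation tables are hypothetical stand-ins for real WordNet lookups; derived words are not modelled here, though under the same assumption they could be merged into the synonym table.

```python
def expand_seeds(lp, ln, synonyms, antonyms, k):
    """Grow the positive (lp) and negative (ln) seed sets for k iterations."""
    lp, ln = set(lp), set(ln)
    for _ in range(k):
        new_lp, new_ln = set(lp), set(ln)
        for term in lp:
            new_lp.update(synonyms.get(term, ()))  # same orientation
            new_ln.update(antonyms.get(term, ()))  # opposite orientation
        for term in ln:
            new_ln.update(synonyms.get(term, ()))
            new_lp.update(antonyms.get(term, ()))
        lp, ln = new_lp, new_ln
    return lp, ln


# Toy relation tables (hypothetical, not real WordNet data).
SYNONYMS = {"good": ["nice"], "nice": ["pleasant"], "bad": ["poor"]}
ANTONYMS = {"good": ["bad"], "pleasant": ["unpleasant"]}

lp0, ln0 = {"good"}, {"bad"}
lp1, ln1 = expand_seeds(lp0, ln0, SYNONYMS, ANTONYMS, k=1)
lp2, ln2 = expand_seeds(lp0, ln0, SYNONYMS, ANTONYMS, k=2)
```

With K = 0 the seed sets are returned unchanged. Each additional iteration reaches terms one relation further away from the seeds, so the training set grows with K.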

Training set

Subjectivity (Lp, Ln)

Objectivity (Lo)

Lp, Ln, Lo -> vectorial representation -> label Ci

-> two classifiers are built

1. A classifier that separates Positive / not Positive

2. A classifier that separates Negative / not Negative

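Classifier ("supervised learner")

Positive: classified Positive ∩ classified not Negative

Negative: classified Negative ∩ classified not Positive

Objective: (classified Positive ∩ classified Negative) ∪ (classified not Positive ∩ classified not Negative)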
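A minimal sketch of how the outputs of the two binary classifiers might be merged into one ternary decision. It assumes the rule that a synset is Positive only when the first classifier says Positive and the second says not Negative (and symmetrically for Negative), and is Objective when both classifiers fire or neither does. The function name and its boolean inputs are illustrative only.

```python
def ternary_label(is_positive: bool, is_negative: bool) -> str:
    """Combine a Positive/not-Positive and a Negative/not-Negative decision."""
    if is_positive and not is_negative:
        return "Positive"
    if is_negative and not is_positive:
        return "Negative"
    # Both classifiers fired, or neither did: treat the synset as Objective.
    return "Objective"


labels = [ternary_label(p, n) for p in (True, False) for n in (True, False)]
```

Treating the contradictory case (both classifiers fire) as Objective gives every synset exactly one of the three labels.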

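Precision = Tp / (Tp + Fp)

: the fraction of items predicted true that are actually true

Recall = Tp / (Tp + Fn)

: the fraction of actually true items that were predicted true

(Tp: predicted true, actually true,

Fp: predicted true, actually false,

Fn: predicted false, actually true,

Tn: predicted false, actually false)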
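The two formulas can be checked on a small example. The sketch below counts Tp, Fp, and Fn from two lists of booleans. The function name and sample data are made up for illustration, and returning 0.0 when a denominator is zero is an assumed convention, not something the definitions prescribe.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two equal-length lists of booleans."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    # Assumption: report 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up predictions: Tp = 2, Fp = 1, Fn = 1, Tn = 1.
predicted = [True, True, True, False, False]
actual = [True, False, True, True, False]
precision, recall = precision_recall(predicted, actual)
```

Here three items are predicted true and two of them are correct, so precision is 2/3. Three items are actually true and two of them are found, so recall is also 2/3.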

How to choose K?

[Diagram: precision and recall as a function of K; labels: small training set, noise]

(Andrea Esuli and Fabrizio Sebastiani, Determining Term Subjectivity and Term Orientation for Opinion Mining)

http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=4C658BE911293FA5B3D13B9BE101F36D?doi=10.1.1.60.8645&rep=rep1&type=pdf

Rocchio (from Andrew McCallum's Bow package, http://www-2.cs.cmu.edu/~mccallum/bow/)

SVM (version 6.01 of Thorsten Joachims' SVMlight, http://svmlight.joachims.org/)

K = 0, 2, 4, 6 -> 8 ternary classifiers in total (4 values of K x 2 learners)

-> scores normalized to sum to 1
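One way to read the normalization step is as a vote: each of the 8 ternary classifiers assigns a label to a synset, and each score is the fraction of classifiers that chose that label, so the three scores sum to 1. The sketch below assumes that reading. The function name and the vote list are invented for illustration.

```python
from collections import Counter

LABELS = ("Positive", "Negative", "Objective")


def scores_from_votes(votes):
    """Turn a list of per-classifier labels into scores that sum to 1."""
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(votes)
    total = len(votes)
    # Each score is the proportion of classifiers that chose the label.
    return {label: counts[label] / total for label in LABELS}


# Made-up votes from 8 ternary classifiers for a single synset.
votes = ["Positive"] * 5 + ["Objective"] * 2 + ["Negative"]
scores = scores_from_votes(votes)
```

With these votes the synset gets Positive 5/8, Objective 2/8, and Negative 1/8. If all 8 classifiers agree, the winning label scores 1.0 and the other two score 0.0.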


Korean WordNet?

http://ko.asianwordnet.org/
