
(Presentation) Review Spotlight: A User Interface for Summarizing User-generated Reviews Using Adjective-Noun Word Pairs, CHI 2011, Koji Yatani / 오창훈, Summer 2012


오창훈

Review SPOTLIGHT

A User Interface for Summarizing User-generated Reviews

Using Adjective-Noun Word Pairs

I am ohchanghoon

Koji Yatani, Michael Novati, Andrew Trusty, Khai N. Truong

Department of Computer Science, University of Toronto

These days... & Why?

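• Currently taking part as third author on a CHI 2013 paper

• I like to study from past exam questions

- Using the paper as a reference for structure, experimental design, how results are derived, etc.

• Following my previous presentation on CommandMaps (CHI 2012 Best Paper), I prepared another CHI Best Paper

INTRODUCTION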

The problem with all these reviews is they put a lot of really useless information there. For example, this guy included a dialogue he had with a waitress... [That] makes it difficult when you actually try to quickly find something.


Several Ways to Provide a Brief Overview of Reviews

1. Rating the overall impression on a scale of 1 to 5

The reason behind the score is unknown

2. Rating the reviews themselves

Important recently posted reviews may be missed

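• Designed to let users grasp reviews quickly

• Displays the adjective + noun word pairs that appear most frequently in the review text, varying their color and font size

• Clicking a pair reveals additional text

• Pairs are arranged randomly so that users can come across information serendipitously

(Figure labels: reviewed entity, Review SPOTLIGHT)

RELATED WORK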

User Interfaces for

User Review Summarization

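• There is plenty of research on summarization, but little on UIs

• Categorizing by feature (service or food)

• Showing results as a bar graph (Liu et al.) → not evaluated

• Treemap visualization (Carenini et al.) → users were actually confused and turned out to prefer text

• Follow-up studies also produced limited results and lack evaluations of effectiveness

(Figure labels: negative, positive)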

User Interfaces for

User Review Summarization

• Computational linguistic analysis: semantic analysis of review text with machine learning and n-gram methods to determine sentiment (Turney; Pang et al.)

• Research that reflects sentiment analysis in tag clouds: expressing positivity/negativity, etc. (Dave et al.)

Results are still limited, but tag clouds are expected to help users obtain useful information effectively

Effects of

a Tag Cloud on Different Tasks

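Four types of user tasks when using a tag cloud (Rivadeneira et al.)

searching: looking for a specific word

browsing: skimming the information without looking for a specific word

impression formation: forming an impression through the tag cloud

recognition: providing additional information

• Studies found tag clouds more useful for browsing than for searching for a specific word

• Research beyond this point is still open

• Review SPOTLIGHT focuses on supporting impression formation.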

Effects of

Tag Cloud Visual Features

• Tags in larger fonts and tags placed toward the upper left are easier to remember (Rivadeneira)

• Font size and weight have a strong effect, whereas color has none (Bateman et al.)

• For searching, alphabetical order is far more effective than random order (Schrammel et al.)

- No research exists on impression formation

FORMATIVE STUDY

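Overview

• 8 participants / 4 male, 4 female, aged 20 to 50

• They browse the web but, like typical computer users, do not post reviews

• Reviews about places

• Two review pages each were selected from Yelp.com and TripAdvisor.com

• Participants were asked to think aloud over the 4 review pages in total (each with 30 or more reviews)

• They read as they normally would and stopped once they reached a conclusion about the place

• They were asked to describe their impression of the place as if introducing it to a friend

• Everything was audio-recorded and transcribed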

Insight

1. Formulating and adjusting an impression

• Most participants judged the place from its rating + photos

• They skimmed the reviews briefly for recurring comments, sometimes counting how often they appeared

• They tended to notice and read reviews that differed from the common view, and adjusted their impression accordingly

2. Verbalizing impressions with short phrases

• They tended to put their impression into short spoken phrases

descriptive information (e.g., Asian food) + subjective opinion statement (e.g., good steak)

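Design Implication

1. Help users get a quick overview of frequently mentioned comments → "display by frequency"

2. Provide the context of each comment so that users can adjust their impression

3. Showing short phrases can speed up impression formation and support quick decisions

• A tag cloud is used as the UI: it is an already familiar representation, so it reduces problems with user understanding

(This point matters a great deal for design implications. They must be drawn out for ordinary users, not expert users.)

REVIEW SPOTLIGHT PROTOTYPE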

High-level Design

A standard tag cloud is not appropriate

• Details cannot be grasped

• Word pairs should present meaningful chunks of information

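Prototype

• Word pairs are derived with an n-gram method

• Pairs consist of adjective + noun, with frequency reflected in font size

• Sentiment is reflected in color

• Hovering the cursor shows the modifiers most often paired with that noun

• Clicking a modifier shows the number of mentions and the text: a quick way to test an impression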
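The hover and click behavior described on this slide suggests a lookup from each noun to its paired adjectives and the sentences they came from. A minimal sketch in Python, assuming the word pairs have already been extracted as (adjective, noun, sentence) tuples. The function names, the toy data, and the default of four adjectives per noun are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict

def build_index(pairs):
    """Group source sentences by noun, then by adjective."""
    index = defaultdict(lambda: defaultdict(list))
    for adjective, noun, sentence in pairs:
        index[noun][adjective].append(sentence)
    return index

def top_adjectives(index, noun, k=4):
    """Hover: the k adjectives most often paired with the noun."""
    by_adj = index.get(noun, {})
    # Sort by descending mention count, then alphabetically to break ties.
    ranked = sorted(by_adj, key=lambda adj: (-len(by_adj[adj]), adj))
    return ranked[:k]

def details(index, noun, adjective):
    """Click: number of mentions plus the sentences containing the pair."""
    sentences = index.get(noun, {}).get(adjective, [])
    return len(sentences), list(sentences)

# Toy data, invented for illustration only.
pairs = [
    ("great", "food", "The food is great."),
    ("great", "food", "Great food and a nice patio."),
    ("cold", "food", "We got cold food twice."),
    ("slow", "service", "Slow service on weekends."),
]
index = build_index(pairs)
```

Keeping the sentences alongside the counts is what lets a click show both the number of mentions and the surrounding text without re-scanning the reviews.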


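Implementation

1. extracting word pairs

• Uses a POS tagger (Tsuruoka and Tsujii)

• Picks out adjectives adjacent to a noun

• Also extracts pairs from sentences with the verb "be": "The food is great" → "great food"

• Filters out articles and prepositions

2. counting occurrences

• Determines font size

• noun: frequency of occurrence

• adjective: pair frequency / relationship with the noun

3. sentiment analysis

• SentiWordNet: a tool that analyzes the sentiment of a word regardless of context

• positivity: green

• negativity: red

• objectivity: blue

• The shade expresses the degree

4. displaying

• spatial allocation

• Arranged randomly

• No overlapping

• Four adjectives are shown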
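The four steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a tiny hand-written tag table (`TOY_TAGS`) stands in for the POS tagger, a hand-written score table (`TOY_SENTIMENT`) stands in for SentiWordNet, and the font-size range and all function names are hypothetical. It also sizes each pair by pair frequency only, a simplification of the separate noun and adjective sizing listed above.

```python
from collections import Counter

# Hypothetical stand-in for a POS tagger: word -> coarse tag.
TOY_TAGS = {
    "food": "NOUN", "service": "NOUN", "steak": "NOUN",
    "great": "ADJ", "slow": "ADJ", "good": "ADJ",
    "is": "BE", "was": "BE",
}

# Hypothetical stand-in for SentiWordNet: adjective -> (positivity, negativity).
TOY_SENTIMENT = {"great": (0.75, 0.0), "good": (0.625, 0.0), "slow": (0.0, 0.625)}

def extract_pairs(sentence):
    """Return (adjective, noun) pairs from 'ADJ NOUN' and 'NOUN be ADJ' patterns."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    tags = [TOY_TAGS.get(w, "OTHER") for w in words]  # articles etc. become OTHER
    pairs = []
    for i, tag in enumerate(tags):
        if tag != "NOUN":
            continue
        if i > 0 and tags[i - 1] == "ADJ":        # "great food"
            pairs.append((words[i - 1], words[i]))
        if tags[i + 1:i + 3] == ["BE", "ADJ"]:    # "the food is great"
            pairs.append((words[i + 2], words[i]))
    return pairs

def count_pairs(reviews):
    """Count how often each (adjective, noun) pair occurs across all reviews."""
    counts = Counter()
    for review in reviews:
        for sentence in review.split("."):        # crude sentence split
            counts.update(extract_pairs(sentence))
    return counts

def font_size(count, max_count, smallest=12, largest=36):
    """Scale a frequency linearly into a font-size range (the range is an assumption)."""
    if max_count <= 0:
        return smallest
    return round(smallest + (largest - smallest) * count / max_count)

def color(adjective):
    """Green for positive, red for negative, blue for objective; the score sets the shade."""
    pos, neg = TOY_SENTIMENT.get(adjective, (0.0, 0.0))
    obj = 1.0 - pos - neg                         # the three scores sum to 1
    return max([("green", pos), ("red", neg), ("blue", obj)], key=lambda c: c[1])

reviews = ["The food is great. Slow service.", "Great food, good steak."]
counts = count_pairs(reviews)
```

Because the sentiment lookup ignores context, a pair such as "good restaurant" would stay green even inside a negated sentence, so the sketch inherits that limitation of context-free word scoring.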


LABORATORY USER STUDY

Comparing existing review pages with Review SPOTLIGHT

Evaluating the degree of impression formation

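Procedure

• Participants were given an explanation so they could get familiar with the system

• Reviews of two restaurants were shown side by side, using either a) ordinary review pages or b) Review SPOTLIGHT

• Participants were asked to indicate which restaurant they wanted to go to by clicking its link

• All mouse movements and clicks were recorded, along with the time taken to decide

• Preferences and the reasons for them were recorded through an interview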

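Procedure

Eight pairs of restaurants in unfamiliar areas were selected from Yelp.com (each restaurant had 50 or more reviews)

PA: pairs with similar ratings

PB: pairs with high and low ratings

Condition                               PA pairs               PB pairs
review pages & Review SPOTLIGHT         PA1P/PA1S, PA2P/PA2S   PB1P/PB1S, PB2P/PB2S
Review SPOTLIGHT only (distracters)     PA3S                   PB3S
review pages only (distracters)         PA4P                   PB4P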

Procedure

• Each participant tested 6 Review SPOTLIGHT conditions and 6 ordinary review page conditions

PA1P, PA1S, PA2P, PA2S, PB1P, PB1S, PB2P, PB2S, PA3S, PA4P, PB3S, PB4P

• The order of the 12 conditions was random

• Each had 26 word pairs and, on average, 66 adjectives

Apparatus

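• Review Spotlight summarizations and review pages were installed on the experiment computer in advance

• Caching was set up to minimize loading time

• A screen large enough to view two restaurants comfortably at once was provided, along with a mouse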

Participants

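• 10 participants in total

- 5 male + 5 female

- Aged 20 to 50

- Varied backgrounds (student, system administrator, retailer, homemaker, accountant, etc.)

• No overlap with the formative study participants

• They browse the web often but are not active reviewers: nearly the same conditions as the formative study

• The experiment lasted 50 minutes, and each participant was paid $20 in cash

LABORATORY STUDY RESULTS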

Performance Time

Decision times measured for four pairs: PA1P/PA1S, PA2P/PA2S, PB1P/PB1S, PB2P/PB2S

Decisions were clearly faster with Review SPOTLIGHT (confirmed with Welch's t-test)

Interface          M (seconds)   SD
Review SPOTLIGHT   122           49
review pages       157           63
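For readers who want to see how a Welch's t-test works on summary statistics like these, here is a small Python sketch. The means and standard deviations are the ones reported on this slide. The sample size of 40 observations per interface (10 participants times 4 pairs) is an assumption made for illustration, since the slides do not state it, so the resulting statistic should not be read as the paper's reported value.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# review pages: M = 157, SD = 63; Review SPOTLIGHT: M = 122, SD = 49.
# n = 40 per interface is an ASSUMPTION, not a figure from the slides.
t, df = welch_t(157, 63, 40, 122, 49, 40)
print(f"t = {t:.2f}, df = {df:.1f}")
```

Welch's variant is the natural choice here because the two standard deviations differ, and it does not assume equal variances across the two interfaces.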


It’s faster. Instead of like going through reading so much non-sense, [I can] just pick up important things right away.


Forming Detailed Impressions Using Review Spotlight

75% chose the same restaurant in both interfaces

review pages: decided based on ratings and number of reviews

Review SPOTLIGHT: decided based on detailed features

[Accompanying text, unrecoverable]

Forming Detailed Impressions Using Review Spotlight

Even when participants changed their choice:

review pages: decided based on ratings

Review SPOTLIGHT: decided based on details

[Accompanying text, unrecoverable]

Review SPOTLIGHT helped participants discover specific information

Quantitative Analysis of Review Spotlight Usage

• In general, users "scan" a tag cloud rather than "read" it when searching (Halvey and Keane)

• When using Review SPOTLIGHT

- Filtering the mouse movements yielded 4,232 intentional movements in total

- An average of 35.3 movements per participant (SD=2.5)

→ This implies that participants read the word pairs

→ It shows the process was impression formation, not searching

• Mouse click counts also support this

- An average of 10 clicks per user (SD=1.3)

- 54.8% of all clicks occurred on adjectives of the initially displayed pairs

Qualitative Analysis of User Strategies

• The color of each word pair was determined by sentiment analysis, but it was not very useful

[Accompanying text, unrecoverable]

• Some participants picked specific adjectives such as "good", "great", and "poor" and checked how often those word pairs were used

[Accompanying text, unrecoverable]

Qualitative Analysis of User Strategies

• Most participants rated highly the ability to check word pairs in context

• This is because it lightens the burden of reading through reviews to find interesting details

[Accompanying text, unrecoverable]

DISCUSSION

Providing a More Consistent Presentation

• A drawback: specific information is actually harder to find when the user wants it

[Accompanying text, unrecoverable]

Providing a More Consistent Presentation

• A random layout facilitates impression formation (Rivadeneira et al.)

• It mitigates the bias that can arise when pairs are arranged in a particular order

[Accompanying text, unrecoverable]

Providing

a More Consistent Presentation

• Beyond user reviews, basic information (hours, price range, atmosphere, photos, etc.) can also be useful.

→ Consider combining the existing review page with Review SPOTLIGHT

Graceful Recovery from

Linguistic Analysis Problems

• Natural language processing

- Some pairs were given an inappropriate font color

- This can become a problem with complex reviews

• The meaning of a word pair can differ depending on context

"Last time we went, we had and loved the grilled chicken"

"I will avoid their grilled chicken next time"

• The problem of negated sentences

"This is not a good restaurant"

→ Providing additional context information is important

Controlling

Displayed Word Pairs

The issue of letting users control the displayed word pairs

Subjective-Objective Parameter

Time Parameter

[Accompanying text, unrecoverable]

REVIEW SPOTLIGHT EXTENSION

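Revised Review SPOTLIGHT INTERFACE

Timeline histogram

Yelp.com & Amazon.com

Checkboxes for choosing which sentiment types to display

• Released for Google Chrome; usage logs from 11 users were analyzed

- The analysis showed a high usage rate of Review SPOTLIGHT

- Users understood the features and kept using them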

CONCLUSIONS & FUTURE WORKS

• The problem of reliability: reviews posted with malicious motives cannot be filtered out

→ A method for selecting trustworthy reviewers

• The problem of language: whether adjective-noun pairs are appropriate in every language

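changhoon's THOUGHT

1. The paper's structure is well organized and elaborate: formative study → prototype → evaluation → extension

2. The materials used in the experiments are impressive. They use tools developed in other papers... (re-citation)

3. CommandMaps was good in that it evaluated a single feature by measuring a variety of variables, whereas Review SPOTLIGHT is impressive in that it shows the process of building a single tool from a variety of evidence.

4. The interview content is presented in great detail, so it carries even more weight than the log data.

THANK YOU!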