Mining User Experience through Crowdsourcing: A Property Search Behavior Corpus Derived from Microblogging Timelines. Yoji Kiyota (NEXT Co., Ltd, Tokyo, Japan); Yasuyuki Nirei, Kosuke Shinoda, and Satoshi Kurihara (Univ. of Electro-Communications, Tokyo, Japan); Hirohiko Suwa (NAIST, Nara, Japan). DOCMAS/WEIN 2015 (WS1 of WI-IAT 2015), 6th Dec. 2015 at Singapore Management University


Mining User Experience through Crowdsourcing: A Property Search Behavior Corpus Derived from Microblogging Timelines

Yoji Kiyota (NEXT Co., Ltd, Tokyo, Japan)
Yasuyuki Nirei, Kosuke Shinoda, and Satoshi Kurihara (Univ. of Electro-Communications, Tokyo, Japan)
Hirohiko Suwa (NAIST, Nara, Japan)

DOCMAS/WEIN 2015 (WS1 of WI-IAT 2015), 6th Dec. 2015 at Singapore Management University

The goals of this study

• Establish a method to understand various behaviors of users who search for properties (for rent, for sale, etc.)
• Estimate how effective microtask-based crowdsourcing is for annotating microblogging timelines with user experiences

HOME'S: an online property search service in Japan

Characteristics of property search (compared with other products)

• Decisions take a long time
  – potential needs -> information gathering -> contacting agents -> property preview -> decision-making and contracting
• User needs can change
  – trade-offs (price vs. conditions)
  – target areas
  – for rent or for sale?
  – ...

→ understanding user needs is difficult!

Conventional approaches for understanding user needs

| Approach | Pros | Cons |
|---|---|---|
| Analysis of user behavior logs | exhaustive user behavior data on touch points (PCs, smartphones, etc.) is available | behaviors outside the available touch points (e.g. conversations with agents, family, and friends) have major impacts on user experiences |
| Questionnaires | users' thoughts and sentiments can be gathered | unexpected user needs and unconscious thoughts and sentiments cannot be obtained |
| Behavior observation | suitable for identifying needs that users themselves do not recognize | user behaviors on property search services change over search processes that last from weeks to several years |

Why did we focus on Twitter timelines?

• Tweet data is rich in daily user behaviors, including actions, thoughts, and sentiments about property search processes
• User timelines enable us to trace the property search processes of specific users, which last from weeks to several years

A snapshot of a user timeline

2010-06-14 19:16
Hmm. We have just moved into a rented house; however, I am rapidly getting interested in buying a new house! I feel I have become more interested by previewing properties.

(48 tweets are omitted)

2010-07-14 17:31
Now I have come to the decisive moment for selecting a new house to buy. At present, I prefer apartments to single houses, because single houses are expensive. But I'm indecisive, and I won't be able to decide for a while...

(2 tweets are omitted)

2010-07-14 18:17
@foo Quite so! We are currently living in a duplex apartment, and I worry when I see my pregnant wife wheezing as she climbs the stairs... I think it will be too hard once we are over seventy. In the end, we would find climbing stairs tiresome and spend our time on the ground floor.

The issues in using Twitter data

• User timelines also contain many tweets that are NOT related to property search
  – How can we extract only the tweets that are related to property search?
• Tweet analyses based on a conventional framework of the property search process are desirable
  – potential needs -> information gathering -> contacting agents -> property preview -> decision-making and contracting

Microtask-based crowdsourcing

The overview of our approach

[Figure: overview of the approach, with steps labeled (Task 1) and (Task 2)]

Gathering Twitter timelines

• Select timelines of approx. 40,000 followers of @homes_kun (a mascot character of HOME'S)
• Include only timelines in which any of the following keywords occurs
  – key money (礼金), preview (内見), rent (家賃)
• Exclude timelines in which over 25% of tweets contain hyperlinks
  – because such accounts are operated by real estate agents

→ 86 user timelines were extracted
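The two filtering rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the regular expression used to detect a hyperlink, and the input format (a dict from user id to a list of tweet texts) are all hypothetical. Only the three keywords and the 25% threshold come from the slide.

```python
import re

# Keywords and threshold are taken from the slide; everything else is assumed.
KEYWORDS = ("礼金", "内見", "家賃")  # key money, preview, rent
MAX_LINK_RATIO = 0.25  # timelines with over 25% link tweets are excluded
URL_PATTERN = re.compile(r"https?://\S+")  # assumed definition of "hyperlink"


def link_ratio(tweets):
    """Fraction of tweets that contain at least one URL."""
    if not tweets:
        return 0.0
    with_links = sum(1 for text in tweets if URL_PATTERN.search(text))
    return with_links / len(tweets)


def keep_timeline(tweets):
    """Apply both rules to one timeline (a list of tweet texts)."""
    has_keyword = any(kw in text for text in tweets for kw in KEYWORDS)
    return has_keyword and link_ratio(tweets) <= MAX_LINK_RATIO


def select_timelines(timelines):
    """timelines: dict mapping a user id to that user's list of tweet texts."""
    return {user: tweets for user, tweets in timelines.items() if keep_timeline(tweets)}
```

A timeline with exactly 25% link tweets is kept here, because the slide excludes only timelines "over 25%".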

Task 1: distinguish timeline fragments related to property search behaviors

• Each microtask is generated by dividing user timelines into fragments (at most five tweets)
  – 2,400 microtask questions were generated
• Each microtask has three choices
• Each microtask is assigned to three workers (applying the majority rule)
• A task set consists of five microtasks
  – one of the five microtasks is an embedded (dummy) task
  – workers who gave wrong answers to the embedded tasks were eliminated
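The Task 1 mechanics (fragmenting, majority voting, and screening workers with embedded tasks) can be sketched as follows. This is a minimal sketch under assumptions the slides do not confirm: fragments are taken as consecutive, non-overlapping chunks; a three-way split among workers yields no label; and a worker is trusted only if every embedded answer matches the expected one. All function and parameter names are hypothetical.

```python
from collections import Counter

FRAGMENT_SIZE = 5  # "at most five tweets" per fragment


def split_into_fragments(tweets, size=FRAGMENT_SIZE):
    """Split one timeline into consecutive fragments of at most `size` tweets."""
    return [tweets[i:i + size] for i in range(0, len(tweets), size)]


def majority_label(answers):
    """Return the label chosen by more than half of the workers, else None."""
    if not answers:
        return None
    label, count = Counter(answers).most_common(1)[0]
    return label if count > len(answers) / 2 else None


def trusted_workers(embedded_answers, gold):
    """Keep workers whose answers to the embedded tasks all match the gold labels.

    embedded_answers: dict worker id -> dict embedded task id -> answer
    gold: dict embedded task id -> expected answer
    """
    return {
        worker
        for worker, answers in embedded_answers.items()
        if all(gold[task] == answer for task, answer in answers.items())
    }
```

With three workers and three choices, a 1-1-1 split is possible; the sketch returns `None` in that case, which is one possible policy rather than the one the study used.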

A task question of Task 1

Q: Judge whether the tweet user wants to search for properties or not, by viewing the following timeline fragment.

[a timeline fragment (five tweets)]

– He/she is searching for properties.
– He/she is NOT searching for properties.
– I don't know.

Task 1: stats

• Task size
  – 2,400 microtask questions
  – 396 workers participated in the task
  – all the microtasks were completed in 2 hours 25 min.
  – 18,000 JPY (approx. 150 USD)
• 223 of 396 workers correctly answered all the embedded tasks, followed by a second group of 105 workers
  – the answers from these 328 workers were finally accepted

Task 1: results by applying the majority rule

User timelines that contain any of the 286 fragments are extracted as candidates for Task 2

Task 2: tagging of user timelines with four property search stages

• Choose only user timelines in which multiple fragments within six months were categorized by the majority rule
  – 67 user timelines were chosen
• The task definition: annotate each timeline fragment (at most ten tweets) with one of five categories (four property search stages + "no stage")
• Each microtask is judged by the majority rule
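The timeline selection rule for Task 2 could be checked along the following lines. This is a hedged sketch of one reading of the rule: a timeline qualifies when at least two of its fragments accepted in Task 1 fall within six months of each other. The function name, the use of one date per fragment, and the approximation of six months as 183 days are assumptions.

```python
from datetime import date, timedelta

SIX_MONTHS = timedelta(days=183)  # assumed approximation of "six months"


def has_multiple_hits_within_window(hit_dates, window=SIX_MONTHS):
    """True if at least two accepted fragments lie within `window` of each other.

    hit_dates: dates of the fragments that the majority rule accepted in Task 1.
    Checking neighbours in sorted order is enough, because the closest pair
    of dates is always adjacent once the dates are sorted.
    """
    ordered = sorted(hit_dates)
    return any(later - earlier <= window for earlier, later in zip(ordered, ordered[1:]))
```

For example, the two timestamps shown in the timeline snapshot (2010-06-14 and 2010-07-14) would satisfy this check.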

Four property search stages

S1: potential needs for property search
S2: gathering of property information
S3: contacting agents and previewing properties
S4: decision-making and contracting

Issues of annotation

• A task question with five choices (four stages + "no stage") is not suitable for microtask-based crowdsourcing
  – difficult tasks should be divided into combinations of easy tasks
• Naïvely dividing an annotation task into a combination of five Yes/No questions greatly increases costs

A task flow that reduces the number of questions using dependencies between stages

[Figure: flowchart of Yes/No questions applied to each timeline fragment]

Questions in the flow:
– Does the user have potential needs for property search?
– Is the user gathering property information?
– Is the user contacting agents and previewing properties?
– Has the user decided to move?

Outcomes: "no stage", S1 (potential needs), S2 (gathering information), S3 (contacting agents), S4 (decision-making)

Fragment counts shown in the figure: 2400, 32, 51, 47, 14, 196, 132, 68
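One way to read this flow is as a cascade in which each "Yes" leads to the next question and the first "No" fixes the label. The sketch below follows that reading. The wiring between questions and outcomes is an assumption, since the extracted slide text does not preserve the arrows, and the identifiers are hypothetical.

```python
# Question order follows the slide; the cascade wiring is an assumption.
STAGE_QUESTIONS = [
    "potential_needs",    # Does the user have potential needs for property search?
    "gathering_info",     # Is the user gathering property information?
    "contacting_agents",  # Is the user contacting agents and previewing properties?
    "decided",            # Has the user decided to move?
]
LABELS = ["no stage", "S1", "S2", "S3", "S4"]


def classify(ask):
    """Walk the cascade; `ask(question)` returns True for Yes, False for No.

    Returns (label, number_of_questions_asked).
    """
    yes_count = 0
    for question in STAGE_QUESTIONS:
        if not ask(question):
            # The first "No" stops the cascade at the last confirmed stage.
            return LABELS[yes_count], yes_count + 1
        yes_count += 1
    return LABELS[yes_count], yes_count
```

Under this reading, a fragment unrelated to property search costs one question, and no fragment costs more than four, compared with five questions per fragment for the naïve division.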

Task 2: combination of stages

• single stage: 50 timelines
• multiple stages: 17 timelines

Major user behaviors in S1

| Behaviors | # of tagged fragments | # of users |
|---|---|---|
| cohabitation with partners | 3 | 3 |
| college/university graduation | 1 | 1 |
| changing jobs | 2 | 1 |
| lease expiration of rooms | 1 | 1 |

Major user behaviors in S2

| Behaviors | # of tagged fragments | # of users |
|---|---|---|
| work trip lengths | 13 | 12 |
| costs (rents and prices) | 20 | 17 |
| location | 7 | 6 |
| storage | 3 | 3 |
| mentions of property searches | 10 | 7 |

Major user behaviors in S3

| Behaviors | # of tagged fragments | # of users |
|---|---|---|
| work trip lengths | 7 | 7 |
| costs (rents and prices) | 20 | 11 |
| location | 6 | 3 |
| public security | 3 | 3 |
| mentions of property searches | 15 | 12 |
| mentions of previewing properties | 9 | 7 |
| complications for agents | 4 | 3 |

Major user behaviors in S4

| Behaviors | # of tagged fragments | # of users |
|---|---|---|
| mentions of decisions of new houses | 3 | 3 |
| complications for agents | 3 | 4 |

Related work

• Twitter as a social sensor
  – Dow Jones Industrial Average (Bollen 2011)
  – stock market events (Ruiz 2012)
  – earthquake reporting system (Sakaki 2013)
  – this study focuses on gaining deeper insights into user experiences
• Annotating Twitter timelines using microtask-based crowdsourcing
  – named entities (person, organization, etc.) (Finin 2010)
  – this study focuses on user experiences / behaviors

Conclusion

• Mining user experiences and behaviors in property search by applying microtask-based crowdsourcing to Twitter timelines
  – effective for tracing long-running property search processes
• Future work
  – larger experiments
  – applying the approach to other domains (cars, insurance, education, and jobs)