Mining User Experience through Crowdsourcing: A Property Search Behavior Corpus Derived from Microblogging Timelines


Yoji Kiyota (NEXT Co., Ltd, Tokyo, Japan)

Yasuyuki Nirei, Kosuke Shinoda, and Satoshi Kurihara (Univ. of Electro-Communications, Tokyo, Japan)

Hirohiko Suwa (NAIST, Nara, Japan)

DOCMAS/WEIN 2015 (WS1 of WI-IAT 2015), 6th Dec. 2015 at Singapore Management University

The goals of this study

•  Establish a method to understand the various behaviors of users who search for properties (for rent, for sale, etc.)

•  Estimate how effective microtask-based crowdsourcing is for annotating microblogging timelines with user experiences

HOME’S: an online property search service in Japan


Characteristics of property search (compared with other products)

•  taking a long time to decide
   – potential needs -> information gathering -> contacting agents -> property preview -> decision-making and contracting

•  user needs can change
   – trade-offs (price vs. conditions)
   – target areas
   – for rent or for sale?
   – ...

→ understanding user needs is difficult!

Conventional approaches for understanding user needs

•  Analysis of user behavior logs
   – pros: exhaustive user behavior data on touch points (PCs, smartphones, etc.) is available
   – cons: behaviors outside the available touch points (e.g. conversations with agents, family, and friends) have major impacts on user experiences

•  Questionnaires
   – pros: users’ thoughts and sentiments can be gathered
   – cons: unexpected user needs and unconscious thoughts and sentiments cannot be obtained

•  Behavior observation
   – pros: suitable for identifying needs that users themselves do not recognize
   – cons: user behaviors on property search services change through search processes that last from weeks to several years

Why did we focus on Twitter timelines?

•  Tweet data is rich in daily user behaviors, including actions, thoughts, and sentiments during property search processes

•  User timelines enable us to trace the property search processes of specific users, which last from weeks to several years

A snapshot of a user timeline

2010-06-14 19:16

Hmm. We have just moved in a rented house, however, I get rapidly interested in buying a new house! I feel like been more interested by previewing properties.

(48 tweets are omitted)

2010-07-14 17:31

Now I come to the decisive moment for selecting a new house for buying. Presently, I prefer apartment houses to single houses, because single houses are expensive. But I'm indecisive, and I cannot decide for a while...

(2 tweets are omitted)

2010-07-14 18:17

@foo Quite so! Now we are currently living in a duplex apartment, I am worried when I see my pregnant wife is climbing stairs wheezily... I think it will be too hard over seventy years old. Finally, we would feel tiresome climbing stairs, and spend time on the ground floor.

The issues for using Twitter data

•  User timelines also contain many tweets which are NOT related to property search
   – How can we extract only the tweets which are related to property search?

•  Tweet analyses based on a conventional framework of the property search process are desirable
   – potential needs -> information gathering -> contacting agents -> property preview -> decision-making and contracting

Microtask-based crowdsourcing

The overview of our approach

[Figure: overview of the approach, with stages labeled (Task 1) and (Task 2)]

Gathering Twitter timelines

•  Select timelines of approx. 40,000 followers of @homes_kun (a mascot character of HOME’S)

•  Include only timelines in which any of the following keywords occurs
   – key money (礼金), preview (内見), rent (家賃)

•  Exclude timelines in which over 25% of tweets contain hyperlinks
   – because such accounts are operated by real estate agents

→ 86 user timelines were extracted
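The two filtering rules above can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the function names, the URL regular expression, and the representation of a timeline as a list of tweet strings are all assumptions.

```python
import re
from typing import Dict, List

# Keywords listed on the slide: key money, preview, rent.
KEYWORDS = ("礼金", "内見", "家賃")
# Timelines where MORE than this share of tweets contain links are excluded.
MAX_LINK_RATIO = 0.25
# Assumption: a hyperlink is anything that looks like an http(s) URL.
URL_PATTERN = re.compile(r"https?://\S+")


def link_ratio(tweets: List[str]) -> float:
    """Share of tweets in a timeline that contain at least one hyperlink."""
    if not tweets:
        return 0.0
    with_links = sum(1 for tweet in tweets if URL_PATTERN.search(tweet))
    return with_links / len(tweets)


def is_candidate(tweets: List[str]) -> bool:
    """Keep a timeline if a keyword occurs and it is not link-heavy."""
    has_keyword = any(keyword in tweet for tweet in tweets for keyword in KEYWORDS)
    return has_keyword and link_ratio(tweets) <= MAX_LINK_RATIO


def select_timelines(timelines: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Filter a mapping of user name -> list of tweet texts."""
    return {user: tweets for user, tweets in timelines.items() if is_candidate(tweets)}
```

For example, a hypothetical account whose tweets mostly carry links would be dropped even if it mentions 家賃, while a personal account mentioning 内見 once would be kept.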

Task 1: distinguish timeline fragments related to property search behaviors

•  Each microtask is generated by dividing user timelines into fragments (at most five tweets)
   – 2,400 microtask questions were generated

•  Each microtask has three choices

•  Each microtask is assigned to three workers (applying the majority rule)

•  A task set consists of five microtasks
   – one of the five microtasks is an embedded (dummy) task
   – workers who gave wrong answers to the embedded task were eliminated
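A minimal sketch of the Task 1 mechanics (fragmenting, majority rule, and screening workers via embedded tasks) might look like this. All names are hypothetical, and the `max_errors` threshold is an assumption: the slides do not state the exact acceptance criterion for workers.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence, Set, Tuple


def split_into_fragments(tweets: Sequence[str], size: int = 5) -> List[List[str]]:
    """Divide a timeline into consecutive fragments of at most `size` tweets."""
    return [list(tweets[i:i + size]) for i in range(0, len(tweets), size)]


def majority_label(answers: Sequence[str]) -> Optional[str]:
    """Return the answer given by more than half of the workers, else None."""
    if not answers:
        return None
    label, count = Counter(answers).most_common(1)[0]
    return label if count * 2 > len(answers) else None


def reliable_workers(
    embedded: Dict[str, List[Tuple[str, str]]], max_errors: int = 0
) -> Set[str]:
    """Keep workers with at most `max_errors` wrong embedded-task answers.

    `embedded` maps a worker id to (given answer, expected answer) pairs.
    """
    return {
        worker
        for worker, pairs in embedded.items()
        if sum(1 for given, expected in pairs if given != expected) <= max_errors
    }
```

With three workers per microtask, `majority_label` returns a label whenever at least two of them agree, and `None` when all three disagree.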

A task question of Task 1

Q: Judge whether the tweet's author wants to search for properties or not, by viewing the following timeline fragment.

[a timeline fragment (five tweets)]

•  He/she is searching for properties.
•  He/she is NOT searching for properties.
•  I don’t know.

Task 1: stats

•  Task size
   – 2,400 microtask questions
   – 396 workers participated in the task
   – all the microtasks were performed in 2 hours 25 min.
   – 18,000 JPY (approx. 150 USD)

•  223 of 396 workers answered all the embedded tasks correctly, and a further 105 workers were the next most accurate
   – the answers by these 328 workers were finally accepted

Task 1: results by applying the majority rule

User timelines containing any of the 286 fragments are extracted as the candidates for Task 2

Task 2: tagging of user timelines with four property search stages

•  Choose only user timelines in which multiple fragments within six months were categorized by the majority rule
   – 67 user timelines were chosen

•  The task definition: classify each timeline fragment (at most ten tweets) into five categories (four property search stages + “no stage”)

•  Each microtask is judged by the majority rule
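The timeline selection rule for Task 2 could be checked with something like the sketch below. It assumes that "multiple" means at least two fragments and that "six months" can be approximated as 183 days; neither detail is specified on the slides, and the function name is hypothetical.

```python
from datetime import datetime, timedelta
from typing import List

# Assumption: six months is approximated as 183 days.
SIX_MONTHS = timedelta(days=183)


def has_multiple_within_window(
    timestamps: List[datetime],
    window: timedelta = SIX_MONTHS,
    minimum: int = 2,
) -> bool:
    """True if at least `minimum` categorized fragments fall within `window`.

    After sorting, it is enough to check groups of `minimum` consecutive
    timestamps: if any group fits in the window, a consecutive one does.
    """
    ordered = sorted(timestamps)
    for i in range(len(ordered) - minimum + 1):
        if ordered[i + minimum - 1] - ordered[i] <= window:
            return True
    return False
```

Here each timestamp would be the time of a fragment that the majority rule categorized in Task 1; a user timeline is kept only when the function returns `True`.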

Four property search stages

S1 potential needs for property search

S2 gathering of property information

S3 contacting agents and previewing properties

S4 decision-making and contracting

Issues of annotation

•  A task question with five choices (four stages + “no stage”) is not suitable for microtask-based crowdsourcing
   – difficult tasks should be divided into combinations of easy tasks

•  Naively dividing an annotation task into a combination of five Yes/No questions greatly increases costs

A task flow reducing the number of questions using dependencies between stages

[Figure: flowchart of Yes/No questions leading to five outcomes]

Questions:
– Does the user have potential needs for property search?
– Is the user gathering property information?
– Is the user contacting agents and previewing properties?
– Has the user decided to move?

Outcomes: “no stage”, S1 (potential needs), S2 (gathering information), S3 (contacting agents), S4 (decision-making)

Fragment counts shown in the figure: 2400, 32, 51, 47, 14, 196, 132, 68
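One way to read the task flow is as a cascade in which a later question is asked only if the previous one was answered "Yes". The sketch below illustrates that reading; it is an assumption, not a confirmed description of the authors' flow, and the function names are hypothetical. Each boolean stands for the majority-rule answer to one question.

```python
from typing import List, Sequence, Tuple

# Outcome reached after 0, 1, 2, 3, or 4 consecutive "Yes" answers.
OUTCOMES = ["no stage", "S1", "S2", "S3", "S4"]


def classify_fragment(answers: Sequence[bool]) -> Tuple[str, int]:
    """Walk the four Yes/No questions in order and stop at the first "No".

    Returns the assigned outcome and the number of questions actually asked.
    """
    asked = 0
    reached = 0
    for answer in answers[:4]:
        asked += 1
        if not answer:
            break
        reached += 1
    return OUTCOMES[reached], asked


def total_questions(all_answers: List[Sequence[bool]]) -> int:
    """Total number of questions asked across fragments under the cascade."""
    return sum(classify_fragment(answers)[1] for answers in all_answers)
```

On toy data with three fragments answering (No), (Yes, No), and (Yes, Yes, Yes, Yes), the cascade asks 1 + 2 + 4 = 7 questions, whereas asking every question for every fragment would take 12.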

Task 2: combination of stages

•  single stage: 50 timelines

•  multiple stages: 17 timelines

Major user behaviors in S1

behaviors | # of tagged fragments | # of users
cohabitation with partners | 3 | 3
college/university graduation | 1 | 1
changing jobs | 2 | 1
lease expiration of rooms | 1 | 1



Major user behaviors in S2

behaviors | # of tagged fragments | # of users
work trip lengths | 13 | 12
costs (rents and prices) | 20 | 17
location | 7 | 6
storage | 3 | 3
mentions of property searches | 10 | 7


Major user behaviors in S3

behaviors | # of tagged fragments | # of users
work trip lengths | 7 | 7
costs (rents and prices) | 20 | 11
location | 6 | 3
public security | 3 | 3
mentions of property searches | 15 | 12
mentions of previewing properties | 9 | 7
complications for agents | 4 | 3



Major user behaviors in S4

behaviors | # of tagged fragments | # of users
mentions of decisions of new houses | 3 | 3
complication for agents | 3 | 4

Related work

•  Twitter as a social sensor
   – Dow Jones Industrial Average (Bollen 2011)
   – stock market events (Ruiz 2012)
   – earthquake reporting system (Sakaki 2013)
   – this study focuses on gaining deeper insights into user experiences

•  Annotating Twitter timelines using microtask-based crowdsourcing
   – named entities (person, organization, etc.) (Finin 2010)
   – this study focuses on user experiences / behaviors

Conclusion

•  Mining user experiences and behaviors of property search by applying microtask-based crowdsourcing to Twitter timelines
   – effective for tracing long-term property search processes

•  Future work
   – larger experiments
   – applying to other domains (cars, insurance, education, and jobs)
