
# Identify User's Interest from Dialogue by Learning with a Partial Observable Markov Decision Process



Oscar Li Jen Hsu (徐立人)

Von-Wun Soo (蘇豐文)

Hsu Chen Chen (陳旭晨)

Institute of Information Systems and Applications

National Tsing Hua University

NCS 2013

2013/12/14

The Problem

Movie Information System

Which one should be done?

• Ask for more information.

• Make a guess based on probability.

Asking many questions is annoying; a probabilistic guess is not always correct.

The system can ask a few questions, since users have patience for that.

The Method

The CONCEPT of the system

Eighteen movie types

Algorithm 1: E-HowNet Module

Example


Table 1: Actions

| Action class | Show message |
| --- | --- |
| SQL(Movie_Type) | Dump the movie information about (Movie_Type) |
| SELECT(size, types) | "Which of the following movie types are you looking for?" |
| CONFIRM(Movie_Type) | "Are you looking for a (Movie_Type) movie?" |

Observation Matrix

Confirm0

1 0 0 0 0

0 0.25 0.25 0.25 0.25

0 0.25 0.25 0.25 0.25

0 0.25 0.25 0.25 0.25

0 0.25 0.25 0.25 0.25
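To make the role of an observation matrix concrete, here is a minimal sketch of a Bayesian belief update using the Confirm0 values. It assumes that each row corresponds to the user's true movie type, each column to an observation index, and that the user's interest does not change during the dialogue; the slides do not state these conventions, and the function name `update_belief` is hypothetical.

```python
def update_belief(belief, obs_matrix, observation):
    """Bayes update: b'(s) is proportional to b(s) * P(observation | s).

    Assumes obs_matrix[s][o] = P(o | true state s) and a static hidden state.
    """
    unnormalized = [b * obs_matrix[s][observation] for s, b in enumerate(belief)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnormalized]


# Confirm0 values as shown on the slide (row = assumed true movie type).
CONFIRM0 = [
    [1, 0, 0, 0, 0],
    [0, 0.25, 0.25, 0.25, 0.25],
    [0, 0.25, 0.25, 0.25, 0.25],
    [0, 0.25, 0.25, 0.25, 0.25],
    [0, 0.25, 0.25, 0.25, 0.25],
]

uniform = [0.2] * 5
belief_after_obs0 = update_belief(uniform, CONFIRM0, 0)
belief_after_obs1 = update_belief(uniform, CONFIRM0, 1)
```

Under these assumptions, seeing observation 0 puts all probability on type 0, while any other observation rules type 0 out and leaves the remaining four types equally likely.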

Observation Matrix of SELECT

Select2-0,2

1 0 0 0 0

0 0.33 0 0.33 0.33

0 0 1 0 0

0 0.33 0 0.33 0.33

0 0.33 0 0.33 0.33

Select4-0,2,3,4

1 0 0 0 0

0.05 0.8 0.05 0.05 0.05

0 0 1 0 0

0 0 0 1 0

0 0 0 0 1
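The 0.33 entries in the Select2-0,2 matrix are rounded, so those rows sum to 0.99 rather than 1. A short sketch of renormalizing each row before using the matrix as a probability table follows; whether the authors did this is not stated, and `normalize_rows` is a hypothetical helper.

```python
def normalize_rows(matrix):
    """Rescale every row so that it sums to exactly 1 (up to float error)."""
    normalized = []
    for row in matrix:
        total = sum(row)
        if total <= 0:
            raise ValueError("a probability row must have positive mass")
        normalized.append([x / total for x in row])
    return normalized


# Select2-0,2 values as shown on the slide.
SELECT2_0_2 = [
    [1, 0, 0, 0, 0],
    [0, 0.33, 0, 0.33, 0.33],
    [0, 0, 1, 0, 0],
    [0, 0.33, 0, 0.33, 0.33],
    [0, 0.33, 0, 0.33, 0.33],
]

row_sums_before = [sum(row) for row in SELECT2_0_2]
fixed = normalize_rows(SELECT2_0_2)
```

After normalization the rounded rows carry one third on each of the three non-zero columns, and the one-hot rows are unchanged.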

Reward Matrix

SQL0

20 -20 -20 -20 -20

-20 20 -20 -20 -20

-20 -20 20 -20 -20

-20 -20 -20 20 -20

-20 -20 -20 -20 20


Reward Matrix

Select2-0,1

-2 -2 -2 -2 -2

-2 -2 -2 -2 -2

-2 -2 -2 -2 -2

-2 -2 -2 -2 -2

-2 -2 -2 -2 -2
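The two reward matrices express the trade-off between guessing and asking: a correct SQL answer earns 20, a wrong one costs 20, and a SELECT question costs 2. The sketch below compares only the immediate expected rewards under a belief. It assumes that row i of the SQL matrix is the reward for answering with type i and column j is the true type, which the slides do not confirm. It is a one-step illustration, not the policy computed by the POMDP, which also accounts for future rewards.

```python
# Reward for answering with SQL(i) when the true type is j (assumed layout).
SQL_REWARD = [[20 if i == j else -20 for j in range(5)] for i in range(5)]
ASK_REWARD = -2  # every entry of the Select2-0,1 reward matrix


def expected_reward(belief, reward_row):
    """Expected immediate reward of one action under the current belief."""
    return sum(b * r for b, r in zip(belief, reward_row))


def myopic_choice(belief):
    """Pick SQL for the most likely type if it beats the cost of asking."""
    best_type = max(range(len(belief)), key=lambda i: belief[i])
    guess_value = expected_reward(belief, SQL_REWARD[best_type])
    if guess_value > ASK_REWARD:
        return ("SQL", best_type)
    return ("ASK", None)


confident = myopic_choice([0.6, 0.1, 0.1, 0.1, 0.1])
unsure = myopic_choice([0.2] * 5)
```

With these numbers the expected reward of SQL(i) is 40 b(i) - 20, so the one-step comparison favors guessing once the most likely type has belief above 0.45.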


Table 2: Three test cases


Figure 3: The policy graph of 1st test with SO=5 in Table 2

Example

CONCLUSION

Parallel Computing

Over-rational

Limit the Actions
