Dialogue Modelling. Milica Gašić, Dialogue Systems Group.


Slide 1: Dialogue Modelling
Milica Gašić, Dialogue Systems Group

Slide 2: Why are current methods poor?

What you see from this video is that the system constantly rejects the user input, probably due to the same low confidence score.

Slide 3: Dialogue as a Partially Observable Markov Decision Process (POMDP)
[Diagram: states s_t, s_t+1, action a_t, reward r_t, observations o_t, o_t+1]
The state is unobservable and depends on the previous state and action: P(s_t+1 | s_t, a_t) is the transition probability. The state is observed only indirectly through a noisy observation: P(o_t | s_t) is the observation probability. Action selection (the policy) is based on the distribution over all states at every time step t, the belief state b(s_t).

To overcome this problem we model the dialogue as a POMDP. That means the dialogue is a sequence of states, where the next state depends only on the previous one, and each state is unobservable, giving rise to a noisy observation. Three important components need to be defined or estimated: the transition probability, the observation probability, and the reward function.
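
As a rough illustration of what belief tracking means in this model (not the specific system from these slides), here is a minimal sketch of the exact POMDP belief update over an enumerated state set. The state names and the transition and observation tables are hypothetical, chosen only to show the mechanics of b'(s') being proportional to P(o'|s') times the sum over s of P(s'|s,a) b(s).

# Minimal sketch of an exact POMDP belief update (illustrative only).
# The states and the transition/observation tables below are hypothetical.
states = ["want_cheap", "want_expensive"]

def transition(s_next, s, a):
    # P(s' | s, a): here the user goal rarely changes between turns.
    return 0.9 if s_next == s else 0.1

def observation(o, s):
    # P(o | s): the recognised act matches the true state 70% of the time.
    return 0.7 if o == s else 0.3

def belief_update(belief, a, o):
    new_belief = {}
    for s_next in states:
        # b'(s') proportional to P(o | s') * sum_s P(s' | s, a) * b(s)
        new_belief[s_next] = observation(o, s_next) * sum(
            transition(s_next, s, a) * belief[s] for s in states)
    norm = sum(new_belief.values())
    return {s: p / norm for s, p in new_belief.items()}

b = {"want_cheap": 0.5, "want_expensive": 0.5}
b = belief_update(b, a="ask_price", o="want_cheap")
print(b)  # the belief shifts towards want_cheap

The point of the example is only that the update multiplies an observation term by a sum over all previous states; the later slides show why this sum becomes the bottleneck.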

At every point in the dialogue we maintain a distribution over all states, called the belief state, and based on it we take an action.

Slide 4: How to track the belief state?

A very important question is how to calculate the belief state. Note that the belief state depends on everything that happened before in the dialogue, i.e. on all the observations so far. To calculate it, we first review belief propagation for Bayesian networks and look at some common examples; that will then help us calculate the belief state.

Slide 5: Belief propagation
Probabilities conditional on the observations.

We are interested in the marginal probabilities p(x|D), where D = {Da, Db}.

[Diagram: node x with observed evidence Da on one side and Db on the other]

The belief of a node is the probability of the node given the observations: those that influence x (Da) and those that are influenced by x (Db). We are interested in p(x|D), called the marginal probability, which can be calculated as follows.

Slide 6: Belief propagation
[Diagram: node x with evidence Da, and Db split into Dc and Dd]

Split Db further into Dc and Dd. When the observations are split like this, the distribution can be factored further.

Slide 7: Belief propagation
[Diagram: evidence Da and Dc connected to x through intermediate nodes a and c, and Db through node b]

If there are unobservable nodes, we need to sum over all their values.

Slide 8: Belief propagation

[Diagram: evidence Da connected to x through unobserved node a, and Db through unobserved node b]
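
To make these rules concrete, here is a toy sketch of belief propagation on a small chain. The network (Da -> a -> x -> b -> Db), the binary variables and all probability tables are made-up assumptions; the point is only that the marginal of x multiplies a message from the Da side with a message from the Db side, summing over the unobserved nodes a and b, and then normalises.

# Toy belief propagation on the chain Da -> a -> x -> b -> Db (binary variables).
# All probability tables are illustrative assumptions.
vals = [0, 1]

p_a_given_Da = {0: 0.8, 1: 0.2}            # message from the Da side: p(a | Da)
p_x_given_a = {0: {0: 0.9, 1: 0.1},        # p(x | a), indexed as [a][x]
               1: {0: 0.3, 1: 0.7}}
p_Db_given_b = {0: 0.4, 1: 0.6}            # likelihood of the Db evidence: p(Db | b)
p_b_given_x = {0: {0: 0.7, 1: 0.3},        # p(b | x), indexed as [x][b]
               1: {0: 0.2, 1: 0.8}}

marginal = {}
for x in vals:
    # message from Da: sum over the unobserved node a
    msg_da = sum(p_x_given_a[a][x] * p_a_given_Da[a] for a in vals)
    # message from Db: sum over the unobserved node b
    msg_db = sum(p_Db_given_b[b] * p_b_given_x[x][b] for b in vals)
    marginal[x] = msg_da * msg_db          # p(x | D) up to normalisation

norm = sum(marginal.values())
marginal = {x: p / norm for x, p in marginal.items()}
print(marginal)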

Slide 9: How to track the belief state?

[POMDP diagram: states s_t, s_t+1, action a_t, reward r_t, observations o_t, o_t+1]
Belief state tracking requires summation over every dialogue state.

Slide 10: How to track the belief state?
[POMDP diagram: states s_t, s_t+1, action a_t, reward r_t, observations o_t, o_t+1]
This requires summation over all possible states at every dialogue turn, which is intractable.

When we do belief propagation we need to multiply marginal probabilities and sum over the unknown variables. Here the observed variables are o and a, and we need to sum over all possible states. This is a severe limitation, since the number of possible dialogue states is very large.

Slide 11: Challenges in POMDP dialogue modelling
There are three main challenges facing POMDP dialogue modelling:
1. How to define the state space? We need to decide what is essential to include in the dialogue state. We obviously want to support real-world dialogue systems, but we saw that we need to sum over all state values at every turn, which can become very slow.
2. Given a state definition, how do we tractably maintain the belief state? Can we make any approximations?
3. Finally, what are the transition and observation parameters? Can we learn them from data?

Slide 12: How to represent the dialogue state?
The dialogue state needs to have three properties.

It needs to satisfy the Markov property: the next state should depend only on the previous state. To achieve this, the state needs a memory of what happened earlier in the dialogue, which is captured by the dialogue history. We want to model task-oriented dialogues, in which the user has a clear goal they want to achieve, so the state needs to represent this goal in order to satisfy it. Finally, to deal with errors from the recogniser, the state needs to capture the true user act.

Slide 13: Dialogue state factorisation
Decompose the state s_t into conditionally independent elements: the user goal g_t, the user action u_t and the dialogue history d_t.
[Diagram: factored network with g_t, u_t, d_t, action a_t, reward r_t, observations o_t, o_t+1, and the next-turn factors g_t+1, u_t+1, d_t+1]
Now let's see how the Bayesian network changes when we decompose the state into these conditionally independent elements.

Slide 14: Belief update
[Diagram: the factored network from slide 13]
The update requires summation over all possible goals, and over all possible histories and user actions, which is intractable.
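
To make those summations concrete, here is a minimal sketch of a factored belief update over (goal, user act, history) in the spirit of slides 13 and 14. The value sets, the exact conditional dependencies and all probability functions are simplified assumptions rather than the slides' own model; the nested loops over goals, user acts and histories are exactly the sums that blow up when these sets are large.

# Sketch of a factored belief update over (goal g, user act u, history d).
# Value sets and model functions are illustrative assumptions.
goals = ["cheap", "expensive"]
user_acts = ["inform(cheap)", "inform(expensive)", "null"]
histories = ["not_informed", "informed"]

def p_obs(o, u):
    return 0.8 if o == u else 0.1            # P(o' | u'): confidence in the observed act

def p_act(u, g, a):
    return 0.7 if g in u else 0.15           # P(u' | g', a): the user tends to state their goal

def p_goal(g2, g, a):
    return 0.9 if g2 == g else 0.1           # P(g' | g, a): goals rarely change

def p_hist(d2, d, u, a):
    # P(d' | d, u', a): the history becomes "informed" once the user has informed
    informed_now = (u != "null") or (d == "informed")
    return 0.9 if (d2 == "informed") == informed_now else 0.1

def belief_update(b, a, o):
    """b maps (g, u, d) -> probability; returns the new belief over (g', u', d')."""
    new_b = {}
    for g2 in goals:
        for u2 in user_acts:
            for d2 in histories:
                total = 0.0
                for (g, u, d), p in b.items():   # sum over previous goal, act and history
                    total += p_goal(g2, g, a) * p_hist(d2, d, u2, a) * p
                new_b[(g2, u2, d2)] = p_obs(o, u2) * p_act(u2, g2, a) * total
    norm = sum(new_b.values())
    return {k: v / norm for k, v in new_b.items()}

b = {(g, u, d): 1.0 for g in goals for u in user_acts for d in histories}
b = {k: v / len(b) for k, v in b.items()}
b = belief_update(b, a="request(price)", o="inform(cheap)")

With two goals this is trivial, but with thousands of possible goals, histories and user act hypotheses the inner sum and the outer loops become intractable, which is exactly the problem the HIS and BUDS systems address.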

The belief update formula follows from the belief propagation rules, summing over the unknown variables: in this case all goals, dialogue histories and user acts. These sets can again be huge. Still, there exist solutions which allow this model to be used for real-world dialogue systems.

Slide 15: Dialogue models for real-world dialogue systems
We focus here on two models for building real-world spoken dialogue systems. They use rather different approximations to make belief tracking feasible.

Slide 16: Hidden Information State system
Let's remind ourselves of the dialogue act specification.

Slide 17: Hidden Information State system: dialogue acts
Example utterance: "Is there um maybe a cheap place in the centre of town please?"
Dialogue act: inform(pricerange=cheap, area=centre), where inform is the dialogue act type and the slot-value pairs are the semantics.
Other act types include request and confirm; other slot-value pairs include type=restaurant and food=Chinese.
Dialogue acts are a semantic representation of the user or system utterance.

The dialogue act type captures the intention of the user or the system. Examples: inform, confirm, request, ...

Slot-value pairs represent the information that the utterance contains.
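
As a small illustration of this representation, a dialogue act can be stored as an act type plus a set of slot-value pairs. The class and printing format below are my own sketch, not the HIS system's actual code.

# Minimal sketch of a dialogue act as an act type plus slot-value pairs.
from dataclasses import dataclass, field

@dataclass
class DialogueAct:
    act_type: str                        # e.g. "inform", "request", "confirm"
    slots: dict = field(default_factory=dict)

    def __str__(self):
        args = ", ".join(f"{s}={v}" for s, v in self.slots.items())
        return f"{self.act_type}({args})"

# "Is there um maybe a cheap place in the centre of town please?"
act = DialogueAct("inform", {"pricerange": "cheap", "area": "centre"})
print(act)  # inform(pricerange=cheap, area=centre)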

Slide 18: Hidden Information State system: ontology
The ontology defines the relationships between different concepts. Concepts include slots and their values. For example, the types of venue are hotel, restaurant and bar; restaurants have a food type (Chinese, Indian, Italian, ...); hotels have stars (1, 2, 3); and both restaurants and hotels have an area, the part of town they are located in (north, west, ...).
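
A toy rendering of such an ontology might look like the structure below. The exact slots and value lists are illustrative assumptions, not the system's real ontology.

# Toy ontology: which slots apply to which venue type, and their allowed values.
ontology = {
    "restaurant": {
        "food": ["Chinese", "Indian", "Italian"],
        "area": ["north", "south", "east", "west", "centre"],
    },
    "hotel": {
        "stars": ["1", "2", "3"],
        "area": ["north", "south", "east", "west", "centre"],
    },
    "bar": {
        "area": ["north", "south", "east", "west", "centre"],
    },
}

# e.g. the slots that make sense for a restaurant:
print(list(ontology["restaurant"].keys()))  # ['food', 'area']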

Slide 19: Hidden Information State system: belief update
Only the user acts from the N-best list are considered; dialogue histories take a small number of values; goals are grouped into partitions; and all probabilities are handcrafted.

Let's see now what each of these components looks like.

Slide 20: Dialogue history in the HIS system
The dialogue history would ideally represent everything that happened.

For each concept in the dialogue, the history states record whether the system informed, the user informed, the user requested or the system requested. Each is either 1 or 0 and is updated by a finite state automaton.

It's not possible to represent everything that happened in a dialogue, so we only keep the minimal amount of history needed to maintain the Markov property.
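
As a sketch of this idea, one can keep a tiny per-concept automaton whose 0/1 flags are flipped as acts arrive. The flag names and the update rule below are hypothetical, chosen only to show the finite-state flavour of the history.

# Sketch of a per-concept dialogue-history automaton (illustrative rules).
def new_history():
    return {"system_informed": 0, "user_informed": 0,
            "user_requested": 0, "system_requested": 0}

def update_history(hist, act_type, speaker):
    """Flip the relevant 0/1 flag according to who said what about the concept."""
    hist = dict(hist)
    if act_type == "inform":
        hist["user_informed" if speaker == "user" else "system_informed"] = 1
    elif act_type == "request":
        hist["user_requested" if speaker == "user" else "system_requested"] = 1
    return hist

h = new_history()                              # history for the concept "area"
h = update_history(h, "request", "system")     # system: "Which area?"
h = update_history(h, "inform", "user")        # user: "In the centre."
print(h)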

Let's see now how we group goals together to make belief tracking tractable.

Slide 21: HIS partitions
Partitions represent groups of (the most probable) goals.

They are built dynamically during the dialogue.

The goal transition probability P(g_t+1 | g_t, a_t) is set to a high value if g_t+1 is in line with g_t and a_t, and to a small value otherwise.

Note that these probabilities are handcrafted.

Let's see how partitions are built dynamically, with an example.

Slide 22: HIS partitions: example
System: How may I help you?  request(task)
User: I'd like a restaurant in the centre.  inform(entity=venue, type=restaurant, area=centre)
[Partition tree: the user act splits the space of goals into partitions such as entity != venue; entity=venue with type != restaurant; entity=venue, type=restaurant, area != central; and entity=venue, type=restaurant, area=central.]

Slide 23: Pruning
[The same partition tree, with beliefs attached to the items of the user act, e.g. entity=venue 0.9, type=restaurant 0.2, area=central 0.5; low-probability partitions can be pruned.]
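
A rough sketch of how such partitions might be split when a new slot-value pair is mentioned is given below. The data structure and the split prior are illustrative assumptions, not the HIS implementation: each partition that does not yet constrain the mentioned slot is split into the part consistent with the value and its complement, and its belief mass is shared between the two.

# Sketch of HIS-style partition splitting (illustrative only).
def split(partitions, slot, value, prior=0.5):
    """Split every partition that does not yet constrain `slot`."""
    new_partitions = []
    for constraints, belief in partitions:
        if slot in constraints:
            new_partitions.append((constraints, belief))
            continue
        matching = dict(constraints, **{slot: value})        # e.g. type=restaurant
        complement = dict(constraints, **{slot: "!" + value})  # e.g. type=!restaurant
        new_partitions.append((matching, belief * prior))
        new_partitions.append((complement, belief * (1 - prior)))
    return new_partitions

# Start with a single root partition covering all goals.
partitions = [({}, 1.0)]
# User: inform(type=restaurant, area=centre)
partitions = split(partitions, "type", "restaurant")
partitions = split(partitions, "area", "centre")
for constraints, belief in partitions:
    print(constraints, round(belief, 2))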

Slide 24: Hidden Information State system: any limitations?
What are the limitations of the HIS system?

All probability distributions are handcrafted!!

The Bayesian Update of Dialogue State system offers solutions.

Slide 25: Bayesian Update of Dialogue State system
The main idea is to decompose the dialogue state further into conditionally independent elements. There are then many factors, but summation is only needed over small sets, which makes the update more tractable. Finally, we can even learn the distributions.

Slide 26: Bayesian network model for dialogue
[Diagram: the factored network, with the goal, history and user act further split per concept, e.g. g_t^food, d_t^food, u_t^food and g_t^area, d_t^area, u_t^area, together with a_t, r_t, o_t, o_t+1 and the next-turn nodes]
Decompose the goal into one node per concept.

Slide 27: Belief tracking
For each node x: start on one side and keep computing p(x|Da); then start from the other end and keep computing p(Db|x); to get the marginal, simply multiply these.

Every time we need to sum over an unobservable variable, it contains only a small number of values; for example, area has just north, east, west and south.

Slide 28: Bayesian network model for dialogue
[Diagram: the same per-concept factored network as on slide 26]
The distributions can be parameterised; assume the parameters are just another unknown node in the network. Options for learning the parameters: annotate the goal and do Maximum Likelihood; if the state is not factored, Expectation Maximisation; or Expectation Propagation, which approximates the true distribution over the parameters with one that matches its moments.
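
To see why the per-concept factorisation helps, here is a minimal sketch of updating just the area goal node on its own. The value set and the probability functions are illustrative assumptions; the point is that the summation now runs over four values instead of over every joint dialogue state.

# Sketch of a per-concept belief update in a factored (BUDS-style) model.
# Only the "area" goal is tracked here; all numbers are illustrative.
areas = ["north", "east", "west", "south"]

def p_goal_change(g2, g):
    # P(g'_area | g_area): the goal rarely changes between turns.
    return 0.94 if g2 == g else 0.02

def p_obs(o, g2):
    # P(o' | g'_area): noisy evidence about the area from the understanding component.
    return 0.7 if o == g2 else 0.1

def update_area_belief(belief, o):
    new_b = {g2: p_obs(o, g2) * sum(p_goal_change(g2, g) * belief[g] for g in areas)
             for g2 in areas}               # the sum is over only four values
    norm = sum(new_b.values())
    return {g: p / norm for g, p in new_b.items()}

b = {g: 0.25 for g in areas}
b = update_area_belief(b, o="west")
print(b)  # the belief concentrates on "west"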

Slide 29: Training the policy using different parameters
The policy is trained using reinforcement learning (explained in the next lecture) and examined under different error rates in the user input.
[Plot: average reward under different error rates in the user input]

Slide 30: Summary