Transcript
Page 1: KSU Team’s Dialogue System at the NTCIR-13 Short Text ...research.nii.ac.jp/ntcir/workshop/Online... · In the retrieval-based methods, a comment text with high similarity with

YoichiIshibashi,Sho Sugimoto,Hisashi Miyamori (KyotoSangyoUniversity,Japan){g1445539,g1444693,miya}@cc.kyoto-su.ac.jp

NTCIR-13STC-2

Generation-basedmethod

Input

The University Entrance exam will be held on 14th and 15th.

Well, today is All Japan Figure Skating Championships.

14日と15日は、 大学入試センター試験です。

さあ、きょうまずは、フィギュアスケートの全日本選手権。

Outputby Baseline

There will be a large-scale fire that is also in western Japan and eastern

Japan.

Aiming for four consecutive championships in the women's singles, athletes of the Japanese championships

participated in the tournament.

西日本や東日本にもなる、 も大規模な火災ができます。

女子シングルで4連覇を狙うで、 日本選手権の選手が、 大会に出場しました

Outputby ACM

It is highly expected to be snowy and windy.

A player who has won the gold medal in women's singles.

雪の風の予想が 広がっています。

女子シングルで3した金メダル を獲得している選手。

Image associated from

the input

Words generated mainly from the associated image

Snowy Gold medal

雪 金メダル

1. Visualassociation fromtheinputtext2. Fusion betweentextualinformationandassociatedvisualinformation3. Responsetextgenerationbasedonthefusedinformation.

How canthevisualinformationbeusedwithoutimageinput?

Priortrain1:Extractionofcontextvectorsbetweentextualandvisualinformation

Priortrain2:Learningforvisualassociation

Finalstep:Generationofresponsetextviaassociation

ThemodelhastwoLSTMencoder,onefusionlayer,andonedecoderwithattention.!"iscalculatedbyfollowingequationatthefusionlayer. Thetrainedmodelisusedtoextractthecorrespondencebetweenthetextualandvisualinformation.

Theassociativeencoder(LSTM)istrained,whichinputsthetextualcontextvector!"$"extractedinstep1andoutputsthevisualcontextvector!%&'correspondingtotheinpututterancetext.

Figure2showsanetworkusedinstep3.Learningisperformedinthenetworkwherethevisualencoderinstep1isreplacedwiththeassociativeencodertrainedinstep2.

• In Rule 1, Mean 34456@2and Mean 34456@1, which are the accuracy whenonly L2 is correct, were more than 0.5, indicating that Run2 had many L1 andfew L2.

•InRule2,Mean34456@2,andMean34456@1gotworsethanthoseinRule1

• Subtitles in TV drama and the video where the subtitles were displayed.→ The image or video may contain

more detailed informationthan texts.

Why doweusethevisualinformation?

• ComparisonbetweenourmodelandbaselinemodelBaselinemodel:Anencoder-decodermodelwithattentionmechanism

•Ourgeneration-basedmodelistrainedwithTVnewsdatainsteadofTVdramadata.

•SubtitlesinTVnewsandthevideowherethesubtitlesweredisplayed.

Ourgeneration-basedmodel(Run2)istrainedwithTVdramadata.

•Ourmodel(ACM)associatedvisualinformationfromtheinputtext.

•Ourmodelcouldgenerateresponsesincludingusefulinformationcomparedwiththemodelwithoutassociation.

Figure 1: Network in step 1

Figure 2: Network in step 3

NTCIR-13STC-2formal-run

AdditionalExperiments

KSUTeam’sDialogueSystemattheNTCIR-13ShortTextConversationTask2

• our model associates the snow scenewith the input text and generatedthe word “snowy” (Fig.3 left).•Notethatitisafactthatitactuallysnowedonthedayoftheexam.

LearningMethod

Overview

!"%&' = )*+,(!""$")

!" = /"$"!""$" +/%&'!"%&' + 12

insufficient amount of learning data, lack of tuning,and�many unknown words (proper nouns) in the test data.

Abstract Conclusion

Figure 3: Example of comparison results

Intheretrieval-based methods, itwasfoundthattheaccuracyisimprovedbyexcludingplacenamesandpersonnamesfromarticlethemes.Inthegeneration-based method,wecouldnotfindenoughevidencefromtheresultsusingtheevaluationdataofSTCtoshowthattheassociatedvisualinformationwouldacceleratetogeneratemoreappropriateresponses. Newly,weconductedadditionalexperimentbasedontheproblemsfoundintheevaluationresults.Asapreliminaryresult,itwasconfirmedthatvisualinformationseemedtoworkeffectivelyinseveralexamples.

Intheretrieval-basedmethods,acommenttextwithhighsimilaritywiththegivenutterancetextisobtained,andthereplytexttothecommenttextisreturnedastheresponsetotheinput.Inthegeneration-basedmethod,weproposetheAssociativeConversationModelthatgeneratesvisualinformationfromtextualinformationandusesitforgeneratingsentencesinordertoutilizevisualinformationinadialoguesystemwithoutimageinput.

• ResponseSelectionThefinaloutputisthereplytext correspondingtotheobtainedcommenttextinDocumentRetrieval.

Attention

Thefactorscanbeconsideredasproblems;

Attributions Fig3left: “NHK����7”,NHK,11January2017Fig3right: “NHK����7”,NHK,23February2017

• QueryGenerationAqueryusedforsimilaritysearchisgeneratedfromthegiveninputdatabyusingtheanalyzersprovidedinSolr.InRun1,theplacenames andpersonnameswereremovedfromthewordsinthethemeasthesearchquery.

• DocumentRetrievalSimilaritysearchbyOkapiBM25 isconductedforonlycommenttextsintheYahoo!newscommentsdata.Theeachquerywasweighted.(refertothetableontheleft)

Retrieval-basedmethod

Output

QueryGeneration

InputDataComment

TextTitle

inYahoo!Topics Theme

DocumentRetrievalDialogueCorpus

ResponseSelection

Querytype WeightCommenttext 1.0TitleinYahoo!Topics 9.1Theme 3.5

Result

Input

<Commenttext> Istoppedgoingoutbybicyclebecausetheforecastsaiditwouldrainorsnowat12o’clock,butthereisnothingfallingyet.

<TitleinYahoo!Topics> SnowfallsindowntownTokyo.SnowlikelytopileupintheKantoflatland<Theme> TochigiPrefecture,Weatherforecast,Gunmaprefecture,Ibaraki prefecture,Snowdamageanditsmeasures

Run1 Run3

Rank1

Itwouldbefunifthesnowoccasionallypiledupindowntownarea.BecauseeventhoughitistheKantolocalnews,thenewsprogramofthekeystationmakesafusslikeanationwideincident.

It'samazingthatasmanyastwoschoolswillparticipatefromKochiprefecture,includinga"21st-centuryselection"slot.

Rank2 Unusually,therewasnosnowlastmonth,wasn'tit? It'salsoheavysnowinthesouthernpartofKanagawaprefecture.

Rank3

It'dbeeasiertoobjectivelyunderstandthesituationwiththeexpressionslikepossibilityorprobability.Ridiculoustodescribeitwiththeword"worry"...It'squiteanemotionalexpression.

Intheelectricityindustry,NiigataprefectureiswithinthejurisdictionofTohokuElectricPowerCo.

OverviewofArchitecture

(Sourceisdescribedbelow)

ExampleofVisualAssociation

ResultsandAnalysisofVisualAssociation

TVnewsDataset

Results

TVdramaDataset

Source

Recommended