Parsing - GitHub Pages · 2019-12-13 · What is parsing? • Parsing: given Sentence, find a legal...

Preview:

Citation preview

Parsing

KevinDuhIntrotoNLP,Fall2019

Refresher

• RecalltheRandomSentenceGeneratortheCFGlecturesandhomework

Sentence

Ssymbol

CFGruleapplication

ExampleSentenceandParseTree

From:KevinGimpel,https://ttic.uchicago.edu/~kgimpel/teaching/31190/lectures/12.pdf

Whatisparsing?

• Parsing:givenSentence,findalegal(ormostprobable)parsetree(withSasroot)

Sentence

Ssymbol

CFGruleapplication

Top-DownParsing(v1)

• RunRandomSentenceGeneratoruntilwegetthesentencewewant

Sentence

Ssymbol

CFGruleapplication

Top-DownParsing(v1)

• RunRandomSentenceGeneratoruntilwegetthesentencewewant

Sentence

Ssymbol

CFGruleapplication

Allrandomlygeneratedtreesandsentences

Allrandomlygeneratedtreeswithdesiredsentence

Mostprobabletree(ifPCFG)

Bottom-UpParsing(v1)

From:Jurafsky&Martin(2009),SpeechandLanguageProcessing,2nded.,PearsonEducation

Overview

• ConstituencyParsing– ParsingasSearch;DynamicProgramming– CKYAlgorithm

• TreebanksandEvaluation• DependencyParsing– Definitions– EisnerAlgorithm

Top-Downvs.Bottom-Up

• Bottom-Up:– Doesn’twastetimegeneratingtreesthatdon’tgetourdesiredsentence

• Top-Down:– Doesn’twastetimegeneratingtreesthatdon’tgettoSsymbolattheend

ParsingasaSearchProblem

StartState

IntermediateState

IntermediateState

IntermediateState

DeadEnd

IntermediateState

GoalState

Manymethods!TakeanAIclass.We’llfocusonclassofalgorithmsbasedonDynamicProgramming

SuccessfulalgorithmsbasedonDynamicProgramming

ViterbiAlgorithmforDigitalCommunications

Needleman-WunschAlgorithmforBiologicalSequenceAlignment

Bellman-FordAlgorithmforShortestPathonGraphs

Cocke-Younger-KasamiAlgorithmforparsingoflanguage

WhatisDynamicProgramming

• Astrategyforsolvingproblemsby– dividingintooverlappingsub-problems,and– re-usingsub-problemsolutions

WhatisDynamicProgramming

• Astrategyforsolvingproblemsby– dividingintooverlappingsub-problems,and– re-usingsub-problemsolutions

• Canbethoughtasintelligent“brute-force”:– Tryallpossiblesolutionswithoutactuallytryingallpossiblesolutions

WhatisDynamicProgramming

• Astrategyforsolvingproblemsby– dividingintooverlappingsub-problems,and– re-usingsub-problemsolutions

• Canbethoughtasintelligent“brute-force”:– Tryallpossiblesolutionswithoutactuallytryingallpossiblesolutions

Overview

• ConstituencyParsing– ParsingasSearch;DynamicProgramming– CKYAlgorithm

• TreebanksandEvaluation• DependencyParsing– Definitions– EisnerAlgorithm

CKYAlgorithmforParsingTrees

CKYAlgorithmforParsingTrees

• AssumeChomskyNormalForm(CNF)

CKYAlgorithmforParsingTrees

• AssumeChomskyNormalForm(CNF)• Rulesmustbe– RHShas2non-terminals:Aà BC– RHShas1terminal:Aà w

CKYAlgorithmforParsingTrees

• AssumeChomskyNormalForm(CNF)• Rulesmustbe– RHShas2non-terminals:Aà BC– RHShas1terminal:Aà w

• Bottom-upparsing

CKYAlgorithmforParsingTrees

• AssumeChomskyNormalForm(CNF)• Rulesmustbe– RHShas2non-terminals:Aà BC– RHShas1terminal:Aà w

• Bottom-upparsing• Useanefficientdatastructuretostorepartialparsingsub-problemsolutions

ConversionofanygrammartoCNF

From:Jurafsky&Martin(2009),SpeechandLanguageProcessing,2nded.,PearsonEducation

Intuition

• 2-dimmatrixencodingofsolutions

• Eachnon-terminalabovePOShasexactly2children

• Foreach[i,j]span,theremustbeapositioni<k<jthatsplitsitintotwoconstituents

From:Jurafsky&Martin(2009),SpeechandLanguageProcessing,2nded.,PearsonEducation

Example:Parsing“BooktheflightthroughHouston”

• 0Book1the2flight3through4Houston5• [0,1]=parsefor:“Book”• [1,2]=parsefor:“the”• [2,3]=parsefor:“flight”• [0,2]=parsefor:“Bookthe”• [1,3]=parsefor:“theflight”• [0,3]=parsefor:“Booktheflight”• [0,5]=parseforsentence

From:Jurafsky&Martin(2009),SpeechandLanguageProcessing,2nded.,PearsonEducation

Example:Parsing“BooktheflightthroughHouston”

From:Jurafsky&Martin(2009),SpeechandLanguageProcessing,2nded.,PearsonEducation

From:Jurafsky&Martin(2009),SpeechandLanguageProcessing,2nded.,PearsonEducation

Note:thisfindsallvalidtrees.Thinkabouthowtomodifythistofindthemostprobabletree

From:Jurafsky&Martin(2009),SpeechandLanguageProcessing,2nded.,PearsonEducation

DynamicProgrammingSub-Problems

[0,n]

From:Jurafsky&Martin(2009),SpeechandLanguageProcessing,2nded.,PearsonEducation

• CKYExercise-Parsethissentence:fruitflieslikeabanana• Grammar:

– SàNPVP– NPàAdjNoun– NPàDetNoun– VPàVerbNP– VPàVerbPP– PPàPrepNP– Adjàfruit– Nounàfruit– NPàfruit– Nounàflies– Verbàflies– Verbàlike– Prepàlike– Detàa– Nounàbanana

Overview

• ConstituencyParsing– ParsingasSearch;DynamicProgramming– CKYAlgorithm

• TreebanksandEvaluation• DependencyParsing– Definitions– EisnerAlgorithm

Treebank

• Treebank:Acorpusofparsedsentences• PennTreebank– 40ksentences,mostlyfromWallStreetJournal

HowtogetprobabilitiesforPCFG

• Onemethod:Countrulesfromatreebank• P(SàNPVP)=count(SàNPVP)/count(S)• Wecanapplyvarioussmoothingtechniques

• Largeresearchliterature:– HowtogetbestPCFGfromtreebank?–Whatifwedon’thaveatreebank?

EvaluatingParseResults

From:YoavArtzi.http://www.cs.cornell.edu/courses/cs5740/2018sp/lectures/10-parsing-const.pdf

EvaluatingParseResults

From:YoavArtzi.http://www.cs.cornell.edu/courses/cs5740/2018sp/lectures/10-parsing-const.pdf

Recall=#correctcandidateconstituents/#goldconstituents =3/8=37.5%Precision=#correctcandidateconstituents/#constituentsincandidate =3/7=42.9%F1=(2*precision*recall)/(precision+recall)i.e.harmonicmean =40%

Overview

• ConstituencyParsing– ParsingasSearch;DynamicProgramming– CKYAlgorithm

• TreebanksandEvaluation• DependencyParsing– Definitions– EisnerAlgorithm

ModifierA HeadA

d(parent)(child)

Insiders must report purchases and sales immediately

root

DependencyTree

Representsheadandmodifierrelationship

From:AkifumiYoshimoto.http://hylos.jp/about

Insiders must report purchases and sales immediately

root

DependencyParsing

• Input:Sentence(Wordsequence)

• Output: Dependency

• Allelementsareacyclic,connected

S={w1,…,wn}

D={d1,…,dn}

From:AkifumiYoshimoto.http://hylos.jp/about

ProjectiveDependencyTrees:nocrossings

Non-ProjectiveDependencyTrees

From:Kubler,McDonald,Nivre(2009).DependencyParsing.SynthesisLecturesinHLT,Morgan&Claypool

DependencyvsConstituency

From:Kubler,McDonald,Nivre(2009).DependencyParsing.SynthesisLecturesinHLT,Morgan&Claypool

DependencyvsConstituency• Differentwaystothinkaboutsyntax• E.g.freewordorderinCzech

From:Kubler,McDonald,Nivre(2009).DependencyParsing.SynthesisLecturesinHLT,Morgan&Claypool

DependencyvsConstituency• Differentwaystothinkaboutsyntax• E.g.freewordorderinCzech

• Differentapplicationsneeddifferentformalisms

From:Kubler,McDonald,Nivre(2009).DependencyParsing.SynthesisLecturesinHLT,Morgan&Claypool

DependencyvsConstituency• Differentwaystothinkaboutsyntax• E.g.freewordorderinCzech

• Differentapplicationsneeddifferentformalisms• Linguistictraditionbasedonlanguage– E.Practicalconcerns:– Dependencytreebanksmightbeeasiertoannotate– Constituency-Dependencyconversionpossible

From:Kubler,McDonald,Nivre(2009).DependencyParsing.SynthesisLecturesinHLT,Morgan&Claypool

TheEisnerAlgorithm• ABottom-UpDynamicProgram

Rules

= +

= +

= word

Dependency

SubTree

(asatomictree)

The Ways and MeansRoot Committee

= +

= +

= word

Dependency

SubTree

(asatomictree)

TheEisnerAlgorithm

From:AkifumiYoshimoto.http://hylos.jp/about

The Ways and MeansRoot Committee

= +

= +

= word

Dependency

SubTree

(asatomictree)

TheEisnerAlgorithm

From:AkifumiYoshimoto.http://hylos.jp/about

The Ways and MeansRoot Committee

= +

= +

= word

Dependency

SubTree

(asatomictree)

TheEisnerAlgorithm

From:AkifumiYoshimoto.http://hylos.jp/about

The Ways and MeansRoot Committee

= +

= +

= word

Dependency

SubTree

(asatomictree)

TheEisnerAlgorithm

From:AkifumiYoshimoto.http://hylos.jp/about

The Ways and MeansRoot Committee

= +

= +

= word

Dependency

SubTree

(asatomictree)

TheEisnerAlgorithm

From:AkifumiYoshimoto.http://hylos.jp/about

The Ways and MeansRoot Committee

= +

= +

= word

Dependency

SubTree

(asatomictree)

TheEisnerAlgorithm

From:AkifumiYoshimoto.http://hylos.jp/about

The Ways and MeansRoot Committee

= +

= +

= word

Dependency

SubTree

(asatomictree)

TheEisnerAlgorithm

From:AkifumiYoshimoto.http://hylos.jp/about

EisneralgorithmextendsCKY

�34

Root The Ways and Means Committee

―s→ ―t→

← k →

CKYTable:C[start][end][rule]fork:1..nfors:1..nt=s+kift>nthenbreak

endforendfor

= +

= +

Bottom-upDynamicProgramming

From:AkifumiYoshimoto.http://hylos.jp/about

Overview

• ConstituencyParsing– ParsingasSearch;DynamicProgramming– CKYAlgorithm

• TreebanksandEvaluation• DependencyParsing– Definitions– EisnerAlgorithm

RecurringTheme!

Recommended