Parsing Kevin Duh Intro to NLP, Fall 2019

Parsing - GitHub Pages · 2019-12-13 · What is parsing? • Parsing: given Sentence, find a legal (or most probable) parse tree (with S as root) Sentence S symbol CFG rule application


Parsing

Kevin Duh, Intro to NLP, Fall 2019


Refresher

• Recall the Random Sentence Generator from the CFG lectures and homework

[Figure: diagram linking the S symbol, CFG rule application, and the Sentence]


Example Sentence and Parse Tree

From: Kevin Gimpel, https://ttic.uchicago.edu/~kgimpel/teaching/31190/lectures/12.pdf


What is parsing?

• Parsing: given a Sentence, find a legal (or most probable) parse tree (with S as root)

[Figure: diagram linking the S symbol, CFG rule application, and the Sentence]


Top-Down Parsing (v1)

• Run the Random Sentence Generator until we get the sentence we want

[Figure: S symbol, CFG rule application, Sentence. Nested sets: all randomly generated trees and sentences; all randomly generated trees with the desired sentence; the most probable tree (if PCFG)]
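The generate-until-match idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy grammar, the function names (`generate`, `leaves`, `parse_by_generation`) and the `max_tries` cutoff are hypothetical and do not come from the lecture or the homework.

```python
import random

# Hypothetical toy CFG (not from the lecture): nonterminal -> list of right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "Noun"]],
    "VP": [["Verb", "NP"]],
    "Det": [["the"], ["a"]],
    "Noun": [["dog"], ["cat"]],
    "Verb": [["sees"], ["likes"]],
}


def generate(symbol, rng):
    """Randomly expand `symbol`; return a word (str) or a (label, children) tree."""
    if symbol not in GRAMMAR:           # terminal symbol: a word
        return symbol
    rhs = rng.choice(GRAMMAR[symbol])   # pick one rule uniformly at random
    return (symbol, [generate(child, rng) for child in rhs])


def leaves(tree):
    """Read the sentence (list of words) off a tree."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1] for word in leaves(child)]


def parse_by_generation(sentence, max_tries=100_000, seed=0):
    """Generate random trees from S until one yields `sentence`, or give up."""
    rng = random.Random(seed)
    target = sentence.split()
    for _ in range(max_tries):
        tree = generate("S", rng)
        if leaves(tree) == target:
            return tree
    return None


tree = parse_by_generation("the dog sees a cat")
```

Even on this tiny grammar most generated trees are thrown away, which is the inefficiency that motivates the search-based and dynamic-programming methods.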


Bottom-Up Parsing (v1)

From: Jurafsky & Martin (2009), Speech and Language Processing, 2nd ed., Pearson Education


Overview

• Constituency Parsing
  – Parsing as Search; Dynamic Programming
  – CKY Algorithm
• Treebanks and Evaluation
• Dependency Parsing
  – Definitions
  – Eisner Algorithm


Top-Down vs. Bottom-Up

• Bottom-Up:
  – Doesn’t waste time generating trees that don’t get our desired sentence
• Top-Down:
  – Doesn’t waste time generating trees that don’t get to the S symbol at the end


Parsing as a Search Problem

[Figure: search graph with a Start State, several Intermediate States, a Dead End, and a Goal State]

Many methods! Take an AI class. We’ll focus on a class of algorithms based on Dynamic Programming


Successful algorithms based on Dynamic Programming

Viterbi Algorithm for Digital Communications
Needleman-Wunsch Algorithm for Biological Sequence Alignment
Bellman-Ford Algorithm for Shortest Path on Graphs
Cocke-Younger-Kasami Algorithm for parsing of language


What is Dynamic Programming

• A strategy for solving problems by
  – dividing into overlapping sub-problems, and
  – re-using sub-problem solutions
• Can be thought of as intelligent “brute-force”:
  – Try all possible solutions without actually trying all possible solutions
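As a small illustration of overlapping sub-problems, the sketch below counts how many binary bracketings a span of n words admits. The function name is hypothetical and the example is not from the lecture; it only shows that caching each sub-problem once replaces an exponential enumeration with a polynomial amount of work.

```python
from functools import lru_cache


@lru_cache(maxsize=None)   # re-use the answer for every span length already solved
def count_binary_trees(n_words):
    """Number of binary bracketings over a span of n_words words."""
    if n_words == 1:
        return 1           # a single word is an atomic tree
    # Try every split point; both halves are smaller sub-problems that overlap
    # heavily across different splits, so caching pays off.
    return sum(
        count_binary_trees(left) * count_binary_trees(n_words - left)
        for left in range(1, n_words)
    )


five_word_trees = count_binary_trees(5)   # 14 bracketings for a 5-word sentence
```

The counts grow very quickly with sentence length, yet the cached function only ever solves one sub-problem per span length.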


CKY Algorithm for Parsing Trees

• Assume Chomsky Normal Form (CNF)
• Rules must be
  – RHS has 2 non-terminals: A → B C
  – RHS has 1 terminal: A → w
• Bottom-up parsing
• Use an efficient data structure to store partial parsing sub-problem solutions


Conversion of any grammar to CNF

From: Jurafsky & Martin (2009), Speech and Language Processing, 2nd ed., Pearson Education
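One step of such a conversion, splitting rules whose right-hand side is longer than two symbols, might look like the sketch below. It is deliberately partial: unit productions and terminals mixed into long rules are not handled, and the `X1`, `X2`, ... names for new non-terminals are an assumed naming scheme.

```python
def binarize(rules):
    """Split rules with more than two right-hand-side symbols into binary rules.

    `rules` is a list of (lhs, rhs_tuple) pairs. New non-terminals are named
    X1, X2, ... (hypothetical names). This covers only the binarization step
    of a CNF conversion.
    """
    out = []
    fresh = 0
    for lhs, rhs in rules:
        rhs = tuple(rhs)
        while len(rhs) > 2:
            fresh += 1
            new_symbol = f"X{fresh}"
            out.append((new_symbol, rhs[:2]))   # X -> first two symbols
            rhs = (new_symbol,) + rhs[2:]       # original rule gets shorter
        out.append((lhs, rhs))
    return out


rules = [("S", ("Aux", "NP", "VP")), ("NP", ("Det", "Nominal"))]
cnf_rules = binarize(rules)
# S -> Aux NP VP becomes X1 -> Aux NP and S -> X1 VP; NP -> Det Nominal is unchanged.
```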


Intuition

• 2-dim matrix encoding of solutions
• Each non-terminal above POS has exactly 2 children
• For each [i, j] span, there must be a position i < k < j that splits it into two constituents

From: Jurafsky & Martin (2009), Speech and Language Processing, 2nd ed., Pearson Education


Example: Parsing “Book the flight through Houston”

• 0 Book 1 the 2 flight 3 through 4 Houston 5
• [0,1] = parse for: “Book”
• [1,2] = parse for: “the”
• [2,3] = parse for: “flight”
• [0,2] = parse for: “Book the”
• [1,3] = parse for: “the flight”
• [0,3] = parse for: “Book the flight”
• [0,5] = parse for sentence

From: Jurafsky & Martin (2009), Speech and Language Processing, 2nd ed., Pearson Education
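The span indexing used in this example maps directly onto Python slices. The short sketch below (helper names are illustrative only) enumerates every [i, j] cell for the sentence and the split points k with i < k < j that a chart parser would try.

```python
words = "Book the flight through Houston".split()
n = len(words)

# Cell [i, j] covers words[i:j], using the fencepost positions 0..n.
spans = {
    (i, j): " ".join(words[i:j])
    for i in range(n)
    for j in range(i + 1, n + 1)
}


def split_points(i, j):
    """All positions k with i < k < j that split [i, j] into [i, k] and [k, j]."""
    return list(range(i + 1, j))


the_flight = spans[(1, 3)]           # "the flight"
whole_sentence = spans[(0, n)]       # the [0, 5] cell
splits_of_0_3 = split_points(0, 3)   # [1, 2]
```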


From: Jurafsky & Martin (2009), Speech and Language Processing, 2nd ed., Pearson Education

Note: this finds all valid trees. Think about how to modify this to find the most probable tree
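One possible modification, sketched here under assumptions, is to store the best probability per non-terminal in each cell (plus a back-pointer) instead of a set of non-terminals. The toy PCFG and its probabilities are made up for illustration, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy PCFG in CNF; all probabilities are invented for illustration.
BINARY = {                      # (B, C) -> list of (A, P(A -> B C))
    ("NP", "VP"): [("S", 1.0)],
    ("Det", "Noun"): [("NP", 0.6)],
    ("Verb", "NP"): [("VP", 1.0)],
}
LEXICAL = {                     # word -> list of (A, P(A -> word))
    "she": [("NP", 0.4)],
    "books": [("Verb", 0.7), ("Noun", 0.5)],
    "the": [("Det", 1.0)],
    "flight": [("Noun", 0.5), ("Verb", 0.3)],
}


def viterbi_cky(words):
    """Fill best[(i, j)][A] with the highest probability of A spanning words[i:j]."""
    n = len(words)
    best = defaultdict(dict)
    back = {}                   # (i, j, A) -> word, or (k, B, C) for a binary rule
    for i, word in enumerate(words):
        for a, p in LEXICAL.get(word, []):
            best[(i, i + 1)][a] = p
            back[(i, i + 1, a)] = word
    for length in range(2, n + 1):              # shorter spans first
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):           # split point
                for b, pb in best[(i, k)].items():
                    for c, pc in best[(k, j)].items():
                        for a, p_rule in BINARY.get((b, c), []):
                            p = p_rule * pb * pc
                            if p > best[(i, j)].get(a, 0.0):   # keep the max, not all
                                best[(i, j)][a] = p
                                back[(i, j, a)] = (k, b, c)
    return best, back


def build_tree(back, i, j, a):
    """Follow back-pointers to recover the best tree rooted in A over [i, j]."""
    entry = back[(i, j, a)]
    if isinstance(entry, str):
        return (a, entry)
    k, b, c = entry
    return (a, build_tree(back, i, k, b), build_tree(back, k, j, c))


sentence = "she books the flight".split()
best, back = viterbi_cky(sentence)
best_tree = build_tree(back, 0, len(sentence), "S")
```

The only change relative to a recognizer is the `max` comparison in the innermost loop; summing instead of maximizing would give the total probability of the sentence.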


Dynamic Programming Sub-Problems

[0,n]

From: Jurafsky & Martin (2009), Speech and Language Processing, 2nd ed., Pearson Education


• CKY Exercise - Parse this sentence: fruit flies like a banana
• Grammar:
  – S → NP VP
  – NP → Adj Noun
  – NP → Det Noun
  – VP → Verb NP
  – VP → Verb PP
  – PP → Prep NP
  – Adj → fruit
  – Noun → fruit
  – NP → fruit
  – Noun → flies
  – Verb → flies
  – Verb → like
  – Prep → like
  – Det → a
  – Noun → banana
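To check a hand-worked chart for this exercise, a minimal CKY recognizer can be run on the grammar above. This is a sketch rather than the textbook pseudocode: the data layout (lists of tuples, a dictionary of sets keyed by span) and the function name are assumptions.

```python
from collections import defaultdict

# The exercise grammar, written as (A, B, C) for A -> B C and (A, word) for A -> word.
BINARY_RULES = [
    ("S", "NP", "VP"), ("NP", "Adj", "Noun"), ("NP", "Det", "Noun"),
    ("VP", "Verb", "NP"), ("VP", "Verb", "PP"), ("PP", "Prep", "NP"),
]
LEXICAL_RULES = [
    ("Adj", "fruit"), ("Noun", "fruit"), ("NP", "fruit"),
    ("Noun", "flies"), ("Verb", "flies"),
    ("Verb", "like"), ("Prep", "like"),
    ("Det", "a"), ("Noun", "banana"),
]


def cky_recognize(words):
    """Return chart[(i, j)] = set of non-terminals that can span words[i:j]."""
    n = len(words)
    chart = defaultdict(set)
    for i, word in enumerate(words):                 # width-1 cells from the lexicon
        for a, w in LEXICAL_RULES:
            if w == word:
                chart[(i, i + 1)].add(a)
    for length in range(2, n + 1):                   # wider cells, bottom-up
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):                # every split point i < k < j
                for a, b, c in BINARY_RULES:
                    if b in chart[(i, k)] and c in chart[(k, j)]:
                        chart[(i, j)].add(a)
    return chart


words = "fruit flies like a banana".split()
chart = cky_recognize(words)
is_grammatical = "S" in chart[(0, len(words))]
```

Printing `chart[(i, j)]` for each cell gives the table to compare against; because each cell is a set, it records which non-terminals are possible but not how many distinct trees produce them.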


Treebank

• Treebank: A corpus of parsed sentences
• Penn Treebank
  – 40k sentences, mostly from Wall Street Journal


How to get probabilities for PCFG

• One method: Count rules from a treebank
• P(S → NP VP) = count(S → NP VP) / count(S)
• We can apply various smoothing techniques
• Large research literature:
  – How to get best PCFG from treebank?
  – What if we don’t have a treebank?
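The counting method amounts to a relative-frequency estimate per left-hand side. A minimal sketch, assuming the rule occurrences have already been read off the treebank trees (the tiny list below is invented, and no smoothing is applied):

```python
from collections import Counter

# Hypothetical rule occurrences, as if read off a very small treebank.
observed_rules = [
    ("S", ("NP", "VP")), ("S", ("NP", "VP")), ("S", ("VP",)),
    ("NP", ("Det", "Noun")), ("NP", ("Noun",)), ("NP", ("Det", "Noun")),
]


def estimate_pcfg(rule_occurrences):
    """P(A -> rhs) = count(A -> rhs) / count(A), with no smoothing."""
    rule_counts = Counter(rule_occurrences)
    lhs_counts = Counter(lhs for lhs, _ in rule_occurrences)
    return {
        rule: count / lhs_counts[rule[0]]
        for rule, count in rule_counts.items()
    }


probs = estimate_pcfg(observed_rules)
p_s_np_vp = probs[("S", ("NP", "VP"))]   # 2 of the 3 S expansions
```

By construction the probabilities of all rules sharing a left-hand side sum to one; rules never seen in the treebank get probability zero, which is where smoothing comes in.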


Evaluating Parse Results

From: Yoav Artzi. http://www.cs.cornell.edu/courses/cs5740/2018sp/lectures/10-parsing-const.pdf

Recall = # correct candidate constituents / # gold constituents = 3/8 = 37.5%
Precision = # correct candidate constituents / # constituents in candidate = 3/7 = 42.9%
F1 = (2 * precision * recall) / (precision + recall), i.e. harmonic mean = 40%
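The arithmetic on this slide can be reproduced directly from the three counts (3 correct, 7 in the candidate, 8 in the gold tree). The function name below is illustrative.

```python
def precision_recall_f1(num_correct, num_candidate, num_gold):
    """Constituent-level precision, recall and F1 from raw counts."""
    precision = num_correct / num_candidate
    recall = num_correct / num_gold
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)   # harmonic mean
    return precision, recall, f1


precision, recall, f1 = precision_recall_f1(num_correct=3, num_candidate=7, num_gold=8)
# precision = 3/7 (about 42.9%), recall = 3/8 (37.5%), F1 = 40%
```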


Dependency Tree

Represents head and modifier relationship

[Figure: dependency tree with root over “Insiders must report purchases and sales immediately”; each arc d(parent)(child) links a Head to a Modifier]

From: Akifumi Yoshimoto. http://hylos.jp/about


Dependency Parsing

• Input: Sentence (word sequence) S = {w1, …, wn}
• Output: Dependency D = {d1, …, dn}
• All elements are acyclic, connected

[Figure: dependency tree with root over “Insiders must report purchases and sales immediately”]

From: Akifumi Yoshimoto. http://hylos.jp/about
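A common way to store D is one head index per word, with 0 standing for the root. The sketch below checks the "acyclic, connected" condition on that encoding. The head assignments for the example sentence are an illustrative guess, not a transcription of the figure, and the function name is hypothetical.

```python
def is_valid_tree(heads):
    """heads[m] is the head of word m (words are 1-based, 0 is the root).

    heads[0] is unused. The structure is a tree if every word reaches the
    root by following head links without revisiting a word.
    """
    n = len(heads) - 1
    for m in range(1, n + 1):
        seen = set()
        node = m
        while node != 0:
            if node in seen or not (0 <= heads[node] <= n):
                return False        # cycle, or a head index out of range
            seen.add(node)
            node = heads[node]
    return True


words = ["<root>", "Insiders", "must", "report", "purchases", "and", "sales", "immediately"]
# Assumed analysis for illustration: "report" attaches to the root.
heads = [None, 3, 3, 0, 3, 4, 4, 3]
ok = is_valid_tree(heads)

cyclic = [None, 2, 1]               # word 1 and word 2 point at each other
```

Because each word stores exactly one head, the single-parent property holds by construction; only reachability of the root needs checking.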


Projective Dependency Trees: no crossings

Non-Projective Dependency Trees

From: Kubler, McDonald, Nivre (2009). Dependency Parsing. Synthesis Lectures in HLT, Morgan & Claypool
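The "no crossings" condition can be tested pairwise on the arcs. A minimal sketch, assuming the same head-index encoding as is common for dependency trees (heads[m] is the head of word m, 0 is the root placed to the left of the sentence); both example trees are invented.

```python
def is_projective(heads):
    """True if no two arcs cross when drawn above the sentence.

    heads[m] is the head of word m (1-based); heads[0] is unused; 0 is the root.
    """
    arcs = [(min(h, m), max(h, m)) for m, h in enumerate(heads) if m > 0]
    for a, b in arcs:
        for c, d in arcs:
            if a < c < b < d:       # one endpoint inside (a, b), the other outside
                return False
    return True


projective = [None, 3, 3, 0, 3, 4, 4, 3]
non_projective = [None, 0, 4, 1, 1]   # arc 1-3 crosses arc 2-4
```

The quadratic pairwise check is enough for illustration; it makes no claim about how the cited book tests projectivity.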


Dependency vs Constituency

• Different ways to think about syntax
• E.g. free word order in Czech
• Different applications need different formalisms
• Linguistic tradition based on language
  – E.
• Practical concerns:
  – Dependency treebanks might be easier to annotate
  – Constituency-Dependency conversion possible

From: Kubler, McDonald, Nivre (2009). Dependency Parsing. Synthesis Lectures in HLT, Morgan & Claypool


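The Eisner Algorithm

• A Bottom-Up Dynamic Program

[Figure: Rules. Three composition rules drawn as diagrams, two of the form “= +” and one “= word” (as atomic tree), over two item types: Dependency and SubTree]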


The Eisner Algorithm

[Figure: the Dependency and SubTree composition rules applied step by step to “Root The Ways and Means Committee”]

From: Akifumi Yoshimoto. http://hylos.jp/about


Eisner algorithm extends CKY

[Figure: “Root The Ways and Means Committee” with a span running from s to t, of width k]

CKY Table: C[start][end][rule]

for k : 1..n
    for s : 1..n
        t = s + k
        if t > n then break
    end for
end for

Bottom-up Dynamic Programming

From: Akifumi Yoshimoto. http://hylos.jp/about
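The loop skeleton above can be filled in as follows. This is a sketch of the standard first-order (arc-factored) formulation, not a transcription of the slide: it assumes a score matrix `scores[h][m]` for the arc from head h to modifier m, index 0 as the artificial root, and hypothetical table names `comp` (complete sub-trees) and `inc` (incomplete spans that end in a new dependency). It allows several words to attach to the root.

```python
def eisner(scores):
    """Best projective tree under arc scores; returns heads with heads[0] = None."""
    n = len(scores)
    NEG = float("-inf")
    # Last index is the direction: 0 = head at the right end t, 1 = head at the left end s.
    comp = [[[0.0, 0.0] for _ in range(n)] for _ in range(n)]
    inc = [[[NEG, NEG] for _ in range(n)] for _ in range(n)]
    b_comp = [[[None, None] for _ in range(n)] for _ in range(n)]
    b_inc = [[[None, None] for _ in range(n)] for _ in range(n)]

    for k in range(1, n):                      # span width
        for s in range(n - k):                 # span start
            t = s + k                          # span end
            # Dependency = SubTree + SubTree: join two complete halves, add arc s-t.
            joined, r = max((comp[s][r][1] + comp[r + 1][t][0], r) for r in range(s, t))
            inc[s][t][0] = joined + scores[t][s] if s != 0 else NEG   # root takes no head
            inc[s][t][1] = joined + scores[s][t]
            b_inc[s][t][0] = b_inc[s][t][1] = r
            # SubTree = SubTree + Dependency (head t), and Dependency + SubTree (head s).
            comp[s][t][0], b_comp[s][t][0] = max(
                (comp[s][r][0] + inc[r][t][0], r) for r in range(s, t))
            comp[s][t][1], b_comp[s][t][1] = max(
                (inc[s][r][1] + comp[r][t][1], r) for r in range(s + 1, t + 1))

    heads = [None] * n

    def backtrack(s, t, d, complete):
        if s == t:
            return
        if complete:
            r = b_comp[s][t][d]
            if d == 0:
                backtrack(s, r, 0, True)
                backtrack(r, t, 0, False)
            else:
                backtrack(s, r, 1, False)
                backtrack(r, t, 1, True)
        else:
            r = b_inc[s][t][d]
            if d == 0:
                heads[s] = t
            else:
                heads[t] = s
            backtrack(s, r, 1, True)
            backtrack(r + 1, t, 0, True)

    backtrack(0, n - 1, 1, True)
    return heads


# Invented scores for root + 3 words: word 2 should head words 1 and 3.
scores = [[0.0] * 4 for _ in range(4)]
scores[0][2] = scores[2][1] = scores[2][3] = 10.0
best_heads = eisner(scores)
```

The outer two loops match the k, s, t skeleton; the inner maximizations over the split point r bring the total cost to cubic in sentence length, the same order as CKY.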


Overview

• Constituency Parsing
  – Parsing as Search; Dynamic Programming
  – CKY Algorithm
• Treebanks and Evaluation
• Dependency Parsing
  – Definitions
  – Eisner Algorithm

Recurring Theme!