
A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression


Wei Xu & Ralph Grishman
New York University
ACL 2009, Singapore

Page 1: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression
Page 2: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Outline
▪ Motivation & Previous Work
▪ Sentence Compression Approach
  ▪ Linguistically-motivated Heuristics
  ▪ Word Significance
  ▪ Compression Generation and Selection
▪ Experiment Results
▪ Conclusions & Future Work

Page 3: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Motivation
▪ no Chinese parallel corpus
▪ hard to create a sentence/compression parallel corpus

Page 4: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

An example of system output

[Original] 第四种子乔科维奇退赛,让原以三比六,六比一,四比一领先的第二种子纳达尔获胜过关.
Fourth seed Djokovic withdrew from the game, and allowed second seed Nadal, who was leading 3-6, 6-1, 4-1, to claim the victory and progress through.

[Human] 乔科维奇退赛让纳达尔获胜过关.
Djokovic withdrew from the game, and allowed Nadal to claim the victory and progress through.

[Approach 1] 乔科维奇退赛.
Djokovic withdrew from the game.

[Approach 2] 乔科维奇退赛让种子纳达尔获胜过关.
Djokovic withdrew from the game, and allowed seed Nadal to claim the victory and progress through.

Page 5: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Previous work (figure: approaches to sentence compression and related tasks such as headline generation, Japanese speech, and paraphrasing, grouped by supervised vs. unsupervised and corpus-based vs. non-corpus-based learning)
▪ Parse tree trimming: Dorr 2003
▪ Sentence scoring: Hori 2003; Clarke 2006; Clarke 2008
▪ Noisy channel: Knight 2002; Turner 2005; Galley 2007
▪ Decision tree: Knight 2002; Nguyen 2004
▪ Large-margin learning: McDonald 2006; Cohn 2007; Cohn 2008
▪ MaxEnt: Riezler 2003

Page 6: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Parse Tree Trimming (Dorr et al. 2003)
▪ linguistically-motivated heuristics: hand-made rules to remove low-content components
▪ iteratively trim until the desired length is reached
▪ reduce the risk of deleting important information by applying rules in a certain order:
  safe rules (DT, TIME), then more dangerous rules (CONJ), then the most dangerous rules (PP)

Page 7: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Parse Tree Trimming (Dorr et al. 2003)
Pros:
▪ comparatively good performance
▪ retains grammaticality if parsing is correct
Cons:
▪ requires considerable linguistic skill to produce proper rules in a proper order
▪ sensitive to POS and parsing errors
▪ not flexible enough to preserve informative components

Page 8: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Sentence Scoring (Hori & Furui 2004; improved by Clarke & Lapata in 2006)
▪ given an input sentence W = w1, w2, ..., wn, rank possible compressions
▪ language model + word significance:
  Score(compressed sentence C) = p1 * word significance score (all words in C)
                               + p2 * language model score (C)
                               + p3 * subject-object-verb score (all words in C)
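A minimal sketch of this kind of weighted ranking (not Hori & Furui's exact formulation): the three component scores and the weights p1-p3 are placeholders supplied by the caller.

```python
# Sketch: rank candidate compressions by a weighted sum of component scores.
# `significance` is a dict mapping word -> significance score;
# `lm_score` and `sov_score` are caller-supplied callables over word sequences.
def compression_score(candidate_words, significance, lm_score, sov_score,
                      p1=1.0, p2=1.0, p3=1.0):
    sig_total = sum(significance.get(w, 0.0) for w in candidate_words)
    return (p1 * sig_total
            + p2 * lm_score(candidate_words)
            + p3 * sov_score(candidate_words))

def rank_candidates(candidates, significance, lm_score, sov_score):
    # candidates: list of word sequences (possible compressions of one sentence)
    return sorted(candidates, reverse=True,
                  key=lambda c: compression_score(c, significance, lm_score, sov_score))
```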

Page 9: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Sentence Scoring (Hori & Furui 2004): language model + word significance
Pros:
▪ does not rely heavily on a training corpus
Cons:
▪ the weighting parameters are experimentally optimized or estimated from a parallel corpus
▪ only a language model is used to encourage compression and ensure grammaticality

Page 10: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Combine
Linguistically-motivated Heuristics
▪ ensure grammaticality
▪ rules are easier to develop, determining only possible low-content components instead of selecting specific constituents for removal
Information Significance Scoring
▪ preserve the most important information
▪ enhance the tolerance of POS and parsing errors

Page 11: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Combined Approach: Heuristics + Information Significance
▪ use heuristics to determine potentially low-content constituents
▪ perform the actual deletion according to word significance

Page 12: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

1. take a Chinese Treebank-style parse as input

2. use linguistically-motivated heuristics to determine potentially removable constituents

3. generate a series of candidate compressions by deleting removable nodes based on word significance

4. select the best compression according to information density

Page 13: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Combined Approach: Heuristics + Information Significance
Used to determine potentially low-content constituents (a small code sketch follows the lists below).

Basic: (same)
▪ parenthetical elements
▪ adverbs except negatives
▪ adjectives
▪ DNPs (phrase + "的", modifiers of NP)
▪ DVPs (phrase + "地", modifiers of VP)
▪ noun coordination phrases

Complex: (more relaxed, general)
▪ verb coordination phrases
▪ relative clauses
▪ appositive clauses
▪ prepositional phrases
▪ all children of NP nodes except the last noun word
▪ sentential coordination
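A minimal sketch of this marking step, assuming a Chinese-Treebank-style bracketed parse and NLTK's Tree class; REMOVABLE_LABELS encodes only a small illustrative subset of the heuristics (ADJP, DNP, DVP, PP), not the full rule set listed above.

```python
# Sketch: mark potentially removable constituents by their constituent label.
from nltk.tree import Tree

REMOVABLE_LABELS = {"ADJP", "DNP", "DVP", "PP"}   # illustrative subset only

def mark_removable(parse_string):
    tree = Tree.fromstring(parse_string)
    removable = [t for t in tree.subtrees() if t.label() in REMOVABLE_LABELS]
    return tree, removable

tree, candidates = mark_removable(
    "(IP (NP (NP (NR 韩国)) (ADJP (JJ 现代)) (NP (NN 汽车) (NN 公司))) "
    "(VP (VC 是) (NP (DNP (NP (NR 沃尔沃)) (DEG 的)) (ADJP (JJ 潜在)) (NP (NN 买家)))) (PU .))"
)
print(["".join(c.leaves()) for c in candidates])   # ['现代', '沃尔沃的', '潜在']
```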

Page 14: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Heuristics-only Approach
Used to remove specific low-content constituents.

Basic: (same)
▪ parenthetical elements
▪ adverbs except negatives
▪ adjectives
▪ DNPs (phrase + "的", modifiers of NP)
▪ DVPs (phrase + "地", modifiers of VP)
▪ noun coordination phrases

Complex: (more strict, conservative)
▪ all children of NP nodes except temporal nouns, proper nouns, and the last noun word
▪ all simple clauses (IP) except the first one in sentential coordination
▪ prepositional phrases except those that may contain location or date information, according to a hand-made list of prepositions

Page 15: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

An example of applying heuristics
*: nodes labeled as removable by the combined approach
#: nodes trimmed out by the heuristics-only approach

( (IP (NP (*NP (NR 韩国)) (#*ADJP (JJ 现代)) (NP (#*NN 汽车) (NN 公司))) (VP (VC 是) (NP (#*DNP (NP (NR 沃尔沃)) (DEG 的)) (#*ADJP (JJ 潜在)) (NP (NN 买家)))) (PU .)))

( (IP (NP (*NP (NR South Korean)) (#*ADJP (JJ Hyundai)) (NP (#*NN motor) (NN company))) (VP (VC is) (NP (#*DNP (NP (NR Volvo)) (DEG 's)) (#*ADJP (JJ potential)) (NP (NN buyer)))) (PU .)))

(callout) POS error: the proper noun 现代 (Hyundai) is tagged as an adjective (JJ)

Page 16: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

(Same parse trees as on the previous slide; the callout here highlights the nodes marked #, which are trimmed out by the heuristics-only approach.)

Page 17: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Event-based Word Significance Score
▪ verb or common noun: tf-idf
▪ proper noun: tf-idf + w
▪ otherwise: 0

Weighted parse tree: the score depends on the word itself regardless of its POS tag, which overcomes some POS errors (a small sketch of this scoring follows).
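A sketch of how such a score might be computed, assuming tf-idf statistics from a background corpus and Chinese Treebank POS tags (VV verb, NN common noun, NR proper noun); the proper-noun bonus w is a free parameter here, not a value given in the slides.

```python
import math

def word_significance(tf, df, num_docs, pos, w=1.0):
    # tf-idf from a background corpus (idf smoothed to avoid division by zero)
    tfidf = tf * math.log(num_docs / (1.0 + df))
    if pos in {"VV", "NN"}:      # verbs and common nouns
        return tfidf
    if pos == "NR":              # proper nouns get the extra bonus w
        return tfidf + w
    return 0.0                   # everything else
```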

Page 18: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Generate a series of candidate compressions by repeatedly trimming the weighted parse tree (greedy algorithm; a small sketch follows the list):
▪ remove the removable node with the lowest weight and get a candidate compressed sentence
▪ update the weights of all ancestors of the removed node
▪ repeat until no node is removable
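A minimal sketch of this greedy loop, using NLTK's ParentedTree; node_weight is a caller-supplied function (e.g. the sum of the significance scores of a node's words, so ancestor weights are implicitly refreshed when recomputed each iteration), and the removable-label test stands in for the heuristic marking above.

```python
from nltk.tree import ParentedTree

def generate_candidates(parse_string, removable_labels, node_weight):
    # Greedily delete the removable node with the lowest weight, emitting one
    # candidate compression after each deletion, until nothing is removable.
    tree = ParentedTree.fromstring(parse_string)
    candidates = []
    while True:
        removable = [t for t in tree.subtrees()
                     if t.label() in removable_labels and t.parent() is not None]
        if not removable:
            break
        node = min(removable, key=node_weight)       # lowest-weight node first
        del node.parent()[node.parent_index()]       # trim it out of the tree
        candidates.append("".join(tree.leaves()))    # Chinese: join without spaces
    return candidates
```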

Page 19: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Information Density

used to select the best compression

D(s) = (sum of the significance scores of the proper nouns in s) / (length of sentence s)
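A sketch of the density computation and the final selection, assuming the word-significance scores from the previous step; sentence length is measured in characters here, which the slides do not spell out, and the candidate representation (parallel word and POS lists) is illustrative.

```python
def information_density(words, pos_tags, significance):
    # D(s) = sum of proper-noun significance in s / length of s
    proper_noun_total = sum(significance.get(w, 0.0)
                            for w, pos in zip(words, pos_tags) if pos == "NR")
    length = len("".join(words)) or 1      # avoid division by zero
    return proper_noun_total / length

def select_best(candidates, significance):
    # candidates: list of (words, pos_tags) pairs for one source sentence
    return max(candidates,
               key=lambda c: information_density(c[0], c[1], significance))
```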

Page 20: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Information Density

D(s)  | Sentence
0.254 | 韩国现代汽车公司是沃尔沃的潜在买家. (The South Korean Hyundai Motor Company is a potential buyer of Volvo.)
0.288 | 韩国现代汽车公司是沃尔沃的买家. (The South Korean Hyundai Motor Company is a buyer of Volvo.)
0.332 | 韩国现代公司是沃尔沃的买家. (The South Korean Hyundai Company is a buyer of Volvo.)
0.282 | 韩国公司是沃尔沃的买家. (The South Korean company is a buyer of Volvo.)
0.209 | 公司是沃尔沃的买家. (The company is a buyer of Volvo.)
0.0   | 公司是买家. (The company is a buyer.)

Page 21: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Data: 79 documents from Chinese newswires
▪ the first sentence of each news article
▪ challenging task: headline-like compression
▪ average length: 61.5 characters
▪ often connects two or more self-complete sentences together

Page 22: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Human evaluation

System        | Compression Rate | Grammaticality (1-5) | Informativeness (0-100%)
Human         | 38.5%            | 4.962                | 90.7%
Heuristics    | 54.1%            | 4.114                | 64.9%
Heu+Sig       | 52.8%            | 3.854 *              | 68.8% **
Heu+Sig+L *** | 34.3%            | 3.664                | 56.1%

* The combined approach sacrifices some grammaticality to reduce the linguistic complexity of the heuristics.
** Word significance improves the heuristics on informativeness.
*** With varying length constraints, depending on the original sentence length.

Page 23: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Compressions with good grammar
▪ perform well on most of the cases
▪ perform terribly on about 20 cases out of all 76
  ▪ POS or parsing errors
  ▪ grammatically correct but semantically incorrect

System (Grammaticality, 1-5) | Number of Sentences | Informativeness (0-100%)
Heuristics > 4.5             | 45                  | 75.9%
Heuristics >= 4              | 62                  | ---
Heu+Sig > 4.5                | 35                  | 81.8%
Heu+Sig >= 4                 | 57                  | ---

Page 24: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Conclusions
▪ first attempt at sentence compression in Chinese
▪ heuristics ensure grammaticality
▪ word significance controls word deletion, balancing sentence length and information loss

Pros:
▪ does not rely on a parallel corpus
▪ reduces the complexity of composing heuristics
▪ easily extends to other languages or domains
▪ overcomes some POS and parsing errors
▪ competitive with a finely-tuned heuristics-only approach

Page 25: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

Future Work
▪ applications in summarization, headline generation, keyword selection and weighting
▪ language model
▪ parallel corpus in Chinese
▪ statistical / machine learning methods

Page 26: A Parse-and-Trim Approach with Information Significance for Chinese Sentence Compression

A Parse-and-Trim Approach with Information Significancefor Chinese Sentence Compression