14/06/05 Copyright (C) 2014 by Yusuke Oda, AHC-Lab, IS, NAIST 1 Tree-based Translation Models Yusuke Oda @odashi_t 2014/6/5 NAIST MT-Study Group

Tree-based Translation Models (『機械翻訳』§6.2-6.3)


Slides presented at a study session of the machine translation group at NAIST.


Tree-based Translation Models

Yusuke Oda (@odashi_t)

2014/6/5 NAIST MT-Study Group


Agenda

● (6.2) Synchronous Context Free Grammar (SCFG), as used in Hiero
– (6.2.2) Learning SCFG
– (6.2.3) Introducing Syntax Labels
– (6.2.4) Features
– (6.2.5) Decoding
– (6.2.6) Rescoring

● (6.3) Synchronous Tree Substitution Grammar (STSG), as used in Travatar
– (6.3.1) Characteristics of STSG
– (6.3.2) Learning STSG
– (6.3.3) Features
– (6.3.4) Decoding
– (6.3.5) Binarization

● (6.4) Synchronous Parsing
– (6.4.1) Inversion Transduction Grammar (ITG)
– (6.4.2) Span Pruning
– (6.4.3) Beam Search
– (6.4.4) Two Parsing


Synchronous Context Free Grammar (SCFG)


Learning SCFG

● Synchronous rules are extracted from each sentence pair of a parallel corpus together with its word alignment, $\langle f, e, a \rangle$.

● $f$: Source sentence

● $e$: Target sentence

● $a$: Set of word alignment links


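Closed Phrase Pair under Word Alignment

● A phrase pair is closed under its word alignment when no alignment link connects a word inside the phrase pair to a word outside it.

[Figure: word alignment between "彼 は 近い うち に 国会 を 解散 する" and "he will dissolve the diet in the near future"; example of a closed phrase pair: (国会 を → the diet)]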
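The closedness check can be sketched in a few lines of Python. This is only an illustration: the alignment links below are an assumed alignment for the example sentence pair (the links drawn on the slide are not preserved), and the names `is_closed`, `extract_phrase_pairs` and `max_len` are hypothetical.

```python
# Assumed data: the alignment links are a guess for illustration only.
SRC = "彼 は 近い うち に 国会 を 解散 する".split()
TGT = "he will dissolve the diet in the near future".split()
# (source index, target index) pairs, 0-based.
ALIGNMENT = {(0, 0), (2, 7), (3, 8), (4, 5), (5, 4), (7, 2), (8, 2)}


def is_closed(alignment, i1, i2, j1, j2):
    """True if source span [i1, i2] and target span [j1, j2] (inclusive)
    contain at least one link and no link crosses the span boundary."""
    has_link = False
    for i, j in alignment:
        src_inside = i1 <= i <= i2
        tgt_inside = j1 <= j <= j2
        if src_inside != tgt_inside:  # one end inside, the other outside
            return False
        has_link = has_link or src_inside
    return has_link


def extract_phrase_pairs(src, tgt, alignment, max_len=4):
    """Enumerate all closed phrase pairs up to max_len words per side."""
    pairs = []
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            for j1 in range(len(tgt)):
                for j2 in range(j1, min(j1 + max_len, len(tgt))):
                    if is_closed(alignment, i1, i2, j1, j2):
                        pairs.append((" ".join(src[i1:i2 + 1]),
                                      " ".join(tgt[j1:j2 + 1])))
    return pairs


PAIRS = extract_phrase_pairs(SRC, TGT, ALIGNMENT)
```

Under the assumed alignment, `PAIRS` contains the slide's example (国会 を, the diet), because the unaligned words を and "the" may be attached to a phrase without breaking closedness.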


Extracting Abstract Rules

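● When a phrase pair covers other, smaller phrase pairs, we can make a more abstract synchronous rule by replacing the words of each smaller phrase pair with a non-terminal symbol.

[Figure: word alignment between "近い うち に 国会 を 解散 する" and "dissolve the diet in the near future", and the same pair after the smaller phrase pairs have been removed, leaving "に 解散 する" and "dissolve in the"]

Smaller phrase pairs: (国会 を, the diet), (近い うち, near future)

Covering phrase pair: (近い うち ... 解散 する, dissolve the ... near future)

Abstract rule: (X1 に X2 解散 する, dissolve X2 in the X1)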
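A minimal sketch of this abstraction step, assuming the covering phrase pair is the full pair (近い うち に 国会 を 解散 する, dissolve the diet in the near future). The helper names are hypothetical, and for brevity the sub-phrases are located by token matching rather than by alignment spans, which a real extractor would use.

```python
def replace_once(tokens, sub, symbol):
    """Replace the first occurrence of the token list `sub` with `symbol`."""
    n = len(sub)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == sub:
            return tokens[:i] + [symbol] + tokens[i + n:]
    raise ValueError(f"sub-phrase {sub} not found")


def abstract_rule(phrase_pair, sub_pairs):
    """Replace each smaller phrase pair with a co-indexed non-terminal
    X1, X2, ... on both the source and the target side."""
    src, tgt = phrase_pair[0].split(), phrase_pair[1].split()
    for k, (sub_src, sub_tgt) in enumerate(sub_pairs, start=1):
        src = replace_once(src, sub_src.split(), f"X{k}")
        tgt = replace_once(tgt, sub_tgt.split(), f"X{k}")
    return " ".join(src), " ".join(tgt)


RULE = abstract_rule(
    ("近い うち に 国会 を 解散 する", "dissolve the diet in the near future"),
    [("近い うち", "near future"), ("国会 を", "the diet")],
)
# RULE == ("X1 に X2 解散 する", "dissolve X2 in the X1")
```

The shared index on both sides is what makes the rule synchronous: X1 and X2 appear in different orders in the two languages, which encodes the reordering.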


Hiero Grammar

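● Hierarchical phrase grammar (Hiero grammar):

– The set of all synchronous rules that can be extracted from the parallel corpus

● Algorithm:

1. Start from the set of all possible phrase pairs in the parallel corpus; every phrase pair becomes a rule.

2. If a rule contains a smaller phrase pair on both of its sides, then

3. the rule obtained by replacing that phrase pair with a non-terminal symbol is also added to the grammar.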
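The formulas on this slide are not preserved. One common way to write the procedure, following the usual hierarchical phrase extraction formulation, is shown below. The symbols are assumptions introduced here: $P$ for the set of phrase pairs, $G$ for the grammar, and $\gamma$, $\alpha$ for the source and target sides of a rule.

```latex
\begin{align*}
&\text{(i)}\quad \langle \bar{f}, \bar{e} \rangle \in P
  \;\Rightarrow\; X \to \langle \bar{f}, \bar{e} \rangle \in G \\
&\text{(ii)}\quad X \to \langle \gamma, \alpha \rangle \in G,\;
  \langle \bar{f}, \bar{e} \rangle \in P,\;
  \gamma = \gamma_1 \bar{f} \gamma_2,\;
  \alpha = \alpha_1 \bar{e} \alpha_2 \\
&\qquad\;\Rightarrow\;
  X \to \langle \gamma_1 X_k \gamma_2,\; \alpha_1 X_k \alpha_2 \rangle \in G
\end{align*}
```

Here $k$ is an index not yet used in the rule, so that the two new non-terminals are linked to each other.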


Constraints of Hiero Rules

● To limit the size and ambiguity of the Hiero grammar, we can introduce constraints on rule extraction.

● Minimal phrase pair
– (国会 を, the diet) ... BAD
– (国会, the diet) ... GOOD
[Figure: alignment between "国会" and "the diet"]

● Phrase length
– (奈良 先端 科学 技術 大学院 大学 情報 科学 研究 科 自然 言語 処理 学 研究 室, ...) BAD (too many words)

● Number of symbols
– X → 〈あらゆる X1 を 全て X2 の 方 へ ねじ曲げ た の だ, ...〉 BAD (too many symbols)

● Rank of rule
– X → 〈X1 が X2 で X3 に X4 した, ...〉 BAD (too many non-terminals)
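The symbol-count and rank constraints can be checked with a small filter such as the sketch below. The function name and the limits (`max_symbols=5`, `max_rank=2`) are illustrative assumptions, not values given on the slide. The minimal-phrase-pair constraint is not covered because it needs the word alignment.

```python
import re

NONTERMINAL = re.compile(r"X\d+")


def within_limits(side, max_symbols=5, max_rank=2):
    """Check one side of a rule against the symbol-count and rank limits.

    `side` is a space-separated string; non-terminals look like X1, X2, ...
    """
    symbols = side.split()
    rank = sum(1 for s in symbols if NONTERMINAL.fullmatch(s))
    return len(symbols) <= max_symbols and rank <= max_rank


OK_RULE = within_limits("X1 に X2 解散 する")
TOO_MANY_SYMBOLS = within_limits("あらゆる X1 を 全て X2 の 方 へ ねじ曲げ た の だ")
TOO_HIGH_RANK = within_limits("X1 が X2 で X3 に X4 した", max_symbols=10)
```

With these assumed limits, only the first rule passes; the second has twelve symbols and the third has four non-terminals.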


Glue Rules

● To build long sentences out of small rules, we introduce glue rules, which concatenate the translations of adjacent spans in their original order.
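The glue rules themselves are not preserved in the extracted text. In the standard Hiero formulation they take the following form, which is given here as an assumption about what the slide showed:

```latex
\begin{align*}
S &\to \langle S_1\, X_2,\; S_1\, X_2 \rangle \\
S &\to \langle X_1,\; X_1 \rangle
\end{align*}
```

The first rule appends one more $X$ to an already translated prefix without reordering, and the second starts the chain from a single $X$.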


Introducing Syntax Labels

● Up to here, we considered the basic ideas of Hiero rules.

– The only non-terminal symbols are S and X.

● This model is very simple, but very ambiguous.

● Next, we introduce syntax information into Hiero rules.

= Syntax-augmented machine translation (SAMT)

[Figure: Hiero + Syntax → SAMT, illustrated with the syntax tree (S (NP (PRP this)) (VP (VBZ is) (NP (DT a) (NN pen))))]


Combinatory Categorial Grammar (CCG)

● SAMT uses categories (≒ partial structures of syntax labels) based on the idea of combinatory categorial grammar (CCG).

● Categories:

– A/B: Syntax label A lacking its right-side child B

– A\B: Syntax label A lacking its left-side child B

– A+B: Concatenation of two syntax labels A and B
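As a rough sketch, the candidate categories of an arbitrary span can be read off a table of constituent spans. The span table below encodes an assumed parse of "dissolve the diet in the near future", and `samt_labels` is a hypothetical helper. It writes A\B for "A lacking its left child B", which is the notation that labels such as NP\DT on these slides suggest.

```python
# Half-open target spans -> label, for the assumed parse
# (VP (VB dissolve) (NP (DT the) (NNP diet))
#     (PP (IN in) (NP (DT the) (JJ near) (NN future))))
CONSTITUENTS = {
    (0, 7): "VP", (0, 1): "VB", (1, 3): "NP", (1, 2): "DT", (2, 3): "NNP",
    (3, 7): "PP", (3, 4): "IN", (4, 7): "NP", (4, 5): "DT", (5, 6): "JJ",
    (6, 7): "NN",
}


def samt_labels(i, j, constituents=CONSTITUENTS):
    """Return all candidate labels for span [i, j): exact label first,
    then slash categories, then concatenations."""
    labels = []
    if (i, j) in constituents:
        labels.append(constituents[(i, j)])
    for (s, e), parent in constituents.items():
        # A/B: a constituent A starts at i and is missing B = [j, e) on its right.
        if s == i and e > j and (j, e) in constituents:
            labels.append(parent + "/" + constituents[(j, e)])
        # A\B: a constituent A ends at j and is missing B = [s, i) on its left.
        if e == j and s < i and (s, i) in constituents:
            labels.append(parent + "\\" + constituents[(s, i)])
    for k in range(i + 1, j):
        # A+B: two adjacent constituents exactly cover the span.
        if (i, k) in constituents and (k, j) in constituents:
            labels.append(constituents[(i, k)] + "+" + constituents[(k, j)])
    return labels


NEAR_FUTURE = samt_labels(5, 7)        # "near future"
DISSOLVE_THE_DIET = samt_labels(0, 3)  # "dissolve the diet"
IN_THE = samt_labels(3, 5)             # "in the"
```

A span can receive several candidate labels (for example both NP\DT and JJ+NN for "near future"). The order in which they are returned is a design choice of this sketch, not something stated on the slides.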


Extracting SAMT Rules

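[Figure: word alignment between "近い うち に 国会 を 解散 する" and "dissolve the diet in the near future", with the target syntax tree (VP (VB dissolve) (NP (DT the) (NNP diet)) (PP (IN in) (NP (DT the) (JJ near) (NN future)))) and its labels (NP, NP, PP, VP) projected onto the source phrases]

Categories assigned to spans that are not constituents: NP\DT, IN+DT, VP/PP, VP\VB

VP → 〈NP\DT_1 に NP_2 解散 する, dissolve NP_2 in the NP\DT_1〉

VP → 〈近い うち IN+DT_1 国会 を VB_2, VB_2 the diet IN+DT_1 near future〉

etc.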


Probabilistic Formalization of Hiero Model

● As in the phrase-based model, translation with a Hiero grammar is formulated as maximization of the posterior probability:

$\hat{e} = \arg\max_{e} \Pr(e \mid f)$

● We assume that the probability is modeled by a log-linear model:

$\Pr(e, D \mid f) \propto \exp\left(\mathbf{w}^{\top} \mathbf{h}(f, D, e)\right)$

$D$: Derivation (≒ set of used synchronous rules)

$\mathbf{w}$: Weights

$\mathbf{h}$: Feature functions
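In a log-linear model, comparing candidates only requires the weighted sum of feature values, since the normalization term is the same for every candidate. The sketch below shows this with made-up feature names, weights and values; none of them come from the slides.

```python
def score(weights, features):
    """Log-linear model score: dot product of weights and feature values."""
    return sum(weights[name] * value for name, value in features.items())


def best_candidate(weights, candidates):
    """Return the candidate derivation with the highest model score."""
    return max(candidates, key=lambda c: score(weights, c["features"]))


# Hypothetical weights and feature values (log-probabilities and a word count).
WEIGHTS = {"forward": 1.0, "backward": 0.5, "lm": 1.0, "length": -0.1}
CANDIDATES = [
    {"output": "dissolve the diet in the near future",
     "features": {"forward": -1.0, "backward": -2.0, "lm": -3.0, "length": 7}},
    {"output": "the diet dissolve in near future",
     "features": {"forward": -2.0, "backward": -1.0, "lm": -6.0, "length": 6}},
]

BEST = best_candidate(WEIGHTS, CANDIDATES)
```

In practice the weights are tuned on held-out data, and the candidates come from the decoder's search rather than an explicit list.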


Features of Hiero Model (1)

● Generative model: likelihoods of translation probability

– Forward model

– Backward model

[Figure: forward and backward directions]
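The slide's formulas are not preserved. A common way to estimate such rule probabilities is relative frequency over extracted rule counts, sketched below. Two things are assumptions here: that "forward" means target side given source side, and the rule counts themselves, which are invented for illustration.

```python
from collections import Counter


def rule_probabilities(rule_counts):
    """Relative-frequency estimates from {(source_side, target_side): count}.

    forward[rule]  = count(rule) / count(all rules with the same source side)
    backward[rule] = count(rule) / count(all rules with the same target side)
    """
    source_totals, target_totals = Counter(), Counter()
    for (source, target), count in rule_counts.items():
        source_totals[source] += count
        target_totals[target] += count
    forward = {rule: count / source_totals[rule[0]]
               for rule, count in rule_counts.items()}
    backward = {rule: count / target_totals[rule[1]]
                for rule, count in rule_counts.items()}
    return forward, backward


# Hypothetical counts.
COUNTS = {
    ("X1 に X2 解散 する", "dissolve X2 in the X1"): 3,
    ("X1 に X2 解散 する", "dissolve X2 within X1"): 1,
    ("X1 で X2 解散 する", "dissolve X2 in the X1"): 2,
}
FORWARD, BACKWARD = rule_probabilities(COUNTS)
```

The logarithms of these two values would then serve as two separate features of the log-linear model.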


Features of Hiero Model (2)

● Generative model: likelihoods of translation probability

– Syntax model (f)

– Syntax model (e)

[Figure: syntax (f) and syntax (e)]


Features of Hiero Model (3)

● Lexical translation model: goodness of phrase alignment

– Forward model

– Backward model

[Figure: forward and backward directions]
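The exact formula used on the slide is not preserved. As an illustration, the sketch below implements the widely used lexical weighting of phrase-based SMT: each target word is scored by the average translation probability over the source words it is aligned to, or by NULL if it is unaligned. The word translation table and the function name are made up.

```python
def lexical_weight(src, tgt, alignment, word_prob):
    """Product over target words of the average word translation
    probability w(e | f) over the source words aligned to e.

    alignment: set of (source index, target index) pairs.
    word_prob: dict {(target_word, source_word): probability};
               unaligned target words are scored against "NULL".
    """
    total = 1.0
    for j, e in enumerate(tgt):
        linked = [src[i] for i, jj in alignment if jj == j] or ["NULL"]
        total *= sum(word_prob.get((e, f), 0.0) for f in linked) / len(linked)
    return total


# Hypothetical word translation probabilities.
WORD_PROB = {("diet", "国会"): 0.8, ("the", "NULL"): 0.5}
WEIGHT = lexical_weight(["国会", "を"], ["the", "diet"], {(0, 1)}, WORD_PROB)
```

Swapping the roles of source and target (and using the reverse word table) gives the other direction.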


Features of Hiero Model (4)

● Language model: measuring fluency of the hypothesis

● Out-of-vocabulary (OOV) penalty: adjusting the LM

● Length penalty: adjusting the number of words in the hypothesis

● Glue penalty: adjusting the number of glue rules in the derivation


Decoding of Hiero Model

● Given an input sentence $f$ and a set of SCFG rules, we search for the optimal output sequence $\hat{e}$. The search involves:

– the set of possible derivations under the given grammar

– the sequence of terminal symbols yielded by a given derivation


Decoding Process

1. Calculate the intersection between the input sentence and the source side of the grammar.
• = Generating a syntax forest with the CYK algorithm

2. Transform the syntax forest into the corresponding translation forest.

3. Output the sequence of terminal symbols in the translation forest that maximizes the model score.
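The three steps can be collapsed into one memoized recursion for a toy grammar, as in the sketch below. It is a simplification: real decoders build the forests explicitly and integrate a language model, which is omitted here. The rules and their scores are invented for the example sentence, and the grammar must not contain unit rules such as X → 〈X1, X1〉, which would make this naive recursion loop.

```python
from functools import lru_cache

NONTERMINALS = {"X1", "X2"}

# Hypothetical rules: (source side, target side, rule score).
RULES = [
    (("犬",), ("the", "dog"), -0.5),
    (("本",), ("the", "book"), -0.5),
    (("X1", "の", "上", "に"), ("on", "X1"), -1.0),
    (("X1", "が", "X2", "座った"), ("X1", "sat", "X2"), -1.0),
]


def decode(tokens, rules=RULES):
    """Return (score, target words) of the best derivation, or None."""
    tokens = tuple(tokens)

    def matches(pattern, i, j):
        """Yield {non-terminal: (start, end)} for every way the source
        pattern can cover tokens[i:j] exactly."""
        if not pattern:
            if i == j:
                yield {}
            return
        head, rest = pattern[0], pattern[1:]
        if head in NONTERMINALS:
            for k in range(i + 1, j + 1):  # non-empty sub-span [i, k)
                for binding in matches(rest, k, j):
                    yield {head: (i, k), **binding}
        elif i < j and tokens[i] == head:
            yield from matches(rest, i + 1, j)

    @lru_cache(maxsize=None)
    def best(i, j):
        candidates = []
        for src, tgt, rule_score in rules:
            for binding in matches(src, i, j):
                subs = {x: best(*span) for x, span in binding.items()}
                if any(s is None for s in subs.values()):
                    continue
                total = rule_score + sum(s[0] for s in subs.values())
                words = []
                for symbol in tgt:
                    words.extend(subs[symbol][1] if symbol in subs else [symbol])
                candidates.append((total, tuple(words)))
        return max(candidates) if candidates else None

    return best(0, len(tokens))


RESULT = decode("犬 が 本 の 上 に 座った".split())
```

With these toy rules, `RESULT` holds the words of "the dog sat on the book" together with the summed rule scores.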

[Figure: syntax forest of the input "犬 が 本 の 上 に 座った" and the corresponding translation forest, which also contains competing fragments such as "the upper" and "of"; the output read off the forest is "the dog sat on the book"]


Synchronous Tree Substitution Grammar (STSG)


Synchronous Tree Substitution Grammar

● STSG is an extension of Tree Substitution Grammar (TSG) for bilingual analysis.

● STSG is a subset of Synchronous Tree Adjoining Grammar (STAG).

[Figure: SCFG (Hiero) ⊂ STSG ⊂ STAG]

● Definition: an STSG consists of
– Set of non-terminal symbols
– Start symbol
– Set of terminal symbols
– Set of rules
– Weight semiring


Synchronous Rules of STSG

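● Definition: a synchronous rule consists of
– an elementary tree of the source language,
– an elementary tree of the target language, and
– an association between the two elementary trees.

● Every rule is also associated with a weight.

[Figure: example rule with source tree (S x1:NP (VP x2:NP (V 開けた))) and target tree (S x1:NP (VP (VBD opened) x2:NP)); the leaves are the frontier]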


Expressive Power of STSG

● SCFG cannot express differences in syntactic structure between the two languages, but STSG can.

● Example:

[Figure: synchronous rule with source tree (NP (NP (N 犬) (P が)) (PP x1:CD (PC 匹))) and target tree (NP x1:CD (NNS dogs))]

– This synchronous rule cannot be generated by combining smaller SCFG rules, because the internal structures of the two trees do not correspond to each other.

– The STSG framework can handle such correspondences between tree structures directly.


Translation Models under STSG Framework

● In the STSG framework, we can use the sequence of frontier nodes (the leaves of a synchronous rule) instead of the full tree.

● Four translation models are available, depending on whether a tree or a sequence of frontier nodes is used as the data structure for the source and the target language:

                     Target: frontier                         Target: tree
Source: frontier     String-to-string translation (= SCFG)    String-to-tree translation
Source: tree         Tree-to-string translation               Tree-to-tree translation

[Figure: the tree (S x1:NP (VP x2:NP (V 開けた))) and its sequence of frontier nodes "x1:NP x2:NP 開けた"]
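Reading the frontier sequence off an elementary tree is a plain left-to-right leaf traversal. The sketch below uses a nested-tuple encoding of trees, `(label, child, ...)` with leaves as strings, which is an assumption made for illustration.

```python
def frontier(tree):
    """Return the leaves of a tree from left to right.

    A tree is either a leaf string (a terminal or a variable such as
    "x1:NP") or a tuple (label, child1, child2, ...).
    """
    if isinstance(tree, str):
        return [tree]
    _label, *children = tree
    leaves = []
    for child in children:
        leaves.extend(frontier(child))
    return leaves


SOURCE_TREE = ("S", "x1:NP", ("VP", "x2:NP", ("V", "開けた")))
TARGET_TREE = ("S", "x1:NP", ("VP", ("VBD", "opened"), "x2:NP"))

SOURCE_FRONTIER = frontier(SOURCE_TREE)  # ['x1:NP', 'x2:NP', '開けた']
TARGET_FRONTIER = frontier(TARGET_TREE)  # ['x1:NP', 'opened', 'x2:NP']
```

A tree-to-string rule would pair `SOURCE_TREE` with `TARGET_FRONTIER`, a string-to-tree rule would pair `SOURCE_FRONTIER` with `TARGET_TREE`, and so on for the other two combinations.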


Retrieving STSG Synchronous Rules

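● Heuristic method (similar to SCFG rule extraction), using
– the syntax tree generated from the source sentence
– the syntax tree generated from the target sentence

[Figure: word alignment between "近い うち に 国会 を 解散 する" and "dissolve the diet in the near future" with the syntax trees of both sentences; the target tree is (VP (VB dissolve) (NP (DT the) (NNP diet)) (PP (IN in) (NP (DT the) (JJ near) (NN future))))]

Example of an extracted rule: source tree (VP x1:PP x2:NP (VP (V 解散) (P する))), target tree (VP (VB dissolve) x2:NP x1:PP)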


GHKM Algorithm

● Galley-Hopkins-Knight-Marcu (GHKM) algorithm

– Generates STSG synchronous rules (string-to-tree rules) by composing minimal rules, using the inside-outside algorithm.

1. Detect minimal rules from the target syntax trees.

2. Generate larger synchronous rules by composing minimal rules.

[Figure: a syntax tree and a minimal rule extracted from it]


GHKM: Alignment Span (1)

● Alignment span:

– Set of indexes of the source-sentence words aligned to a partial tree

● Complement alignment span:

– Set of indexes of the source-sentence words aligned to the part of the tree outside that partial tree

● Closure:

– Minimum contiguous range that covers the alignment span


GHKM: Alignment Span (2)

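[Figure: alignment spans illustrated on the word-aligned pair "彼 は 近い うち に 国会 を 解散 する" / "he will dissolve the diet in the near future", whose target syntax tree has the nodes S, NP, PRP, MD, VP, VB, NP, DT, NNP, PP, IN, NP, DT, JJ, NN]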


GHKM: Admissible Node

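● Admissible node:

– A node in the target syntax tree for which the closure of its alignment span does not overlap its complement alignment span

[Figure: admissible nodes marked on the target syntax tree of the word-aligned pair "彼 は 近い うち に 国会 を 解散 する" / "he will dissolve the diet in the near future"]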
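The alignment span, complement span, closure and admissibility test can be sketched as set operations. Several things are assumptions in this sketch: the alignment links (the slide's figure is not preserved), the representation of a tree node by the half-open range of target words it covers, and the extra requirement that an admissible node has a non-empty alignment span.

```python
# Target: he(0) will(1) dissolve(2) the(3) diet(4) in(5) the(6) near(7) future(8)
# Source: 彼(0) は(1) 近い(2) うち(3) に(4) 国会(5) を(6) 解散(7) する(8)
# Assumed (source index, target index) links.
ALIGNMENT = {(0, 0), (2, 7), (3, 8), (4, 5), (5, 4), (7, 2), (8, 2)}


def alignment_span(alignment, start, end):
    """Source indexes aligned to the target words in [start, end)."""
    return {i for i, j in alignment if start <= j < end}


def complement_span(alignment, start, end):
    """Source indexes aligned to target words outside [start, end)."""
    return {i for i, j in alignment if not (start <= j < end)}


def closure(span):
    """Smallest contiguous index range covering the span."""
    return set(range(min(span), max(span) + 1)) if span else set()


def is_admissible(alignment, start, end):
    """True if the node covering target words [start, end) has a non-empty
    span whose closure does not overlap the complement span."""
    span = alignment_span(alignment, start, end)
    return bool(span) and not (closure(span) & complement_span(alignment, start, end))


NP_THE_DIET = is_admissible(ALIGNMENT, 3, 5)  # NP "the diet"
PP_IN_THE = is_admissible(ALIGNMENT, 5, 9)    # PP "in the near future"
MD_WILL = is_admissible(ALIGNMENT, 1, 2)      # MD "will" (unaligned here)
```

Adding a crossing link, for example aligning 国会 to "future" as well, makes the NP "the near future" inadmissible, because the closure of its span then contains source words aligned elsewhere.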


GHKM: Minimal Rule

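● Split the syntax tree at the admissible nodes; each resulting fragment gives a minimal rule.

[Figure: the target syntax tree of the word-aligned pair "彼 は 近い うち に 国会 を 解散 する" / "he will dissolve the diet in the near future", split at its admissible nodes]

Examples of minimal rules:
– x1:PP x2:NP x3:VB ↔ (VP x3 x2 x1)
– 近い うち ↔ (NP (DT the) (JJ near) (NN future))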


Extension for Tree-to-tree Model (1)

● We need to extract pairs of nodes, one from each syntax tree, that are admissible with respect to each other.

● First, find the admissible nodes in the given trees.

● A node pair is bidirectionally admissible if it satisfies the condition below:

● Span:

– Minimum range over the sentence that covers all terminal symbols under the node


Extension for Tree-to-tree Model (2)

[Figure: the word-aligned pair "近い うち に 国会 を 解散 する" / "dissolve the diet in the near future" with the syntax trees of both sentences, illustrating bidirectionally admissible node pairs]


Features of STSG Model (1)

● Generative model: likelihoods of translation probability


Features of STSG Model (2)

● Lexical translation model: goodness of phrase alignment


Features of STSG Model (3)

● Height penalty: adjusting depth of derivation

● Internal node penalty: adjusting total size of derivation

● Some of the features introduced for the Hiero model can also be used.


Decoding of STSG Model

● STSG decoding basically uses the same method as Hiero decoding; only one part of the formulation depends on the translation model.


Difference of Formalization of Each Model

● String-to-string model

– Same model as the Hiero (SCFG) model.

● String-to-tree model

– Never uses any syntactic information about the source sentence.

● Tree-to-string model

● Tree-to-tree model

– Explicitly use the syntactic information of the source sentence.

– The translation process can be divided into syntax analysis and decoding.

[Figure: source sentence → syntax analyzer → syntax tree of the source sentence → decoder → translation hypotheses. The string-to-string and string-to-tree models are labeled non-syntax-based translation; the tree-to-string and tree-to-tree models are labeled syntax (tree)-based translation.]


Formalization of Syntax-based Translation

● A syntax-based translation model uses the syntax tree of the source sentence.

● We can ignore the probability of the source syntax tree, because the tree has already been determined during syntax analysis.
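The formula on this slide is not preserved. One way to write the argument is given below, under the assumption that only the single best parse is used; the symbol $T_f$ for the source syntax tree is introduced here for illustration.

```latex
\begin{align*}
\hat{e} &= \arg\max_{e} \Pr(e \mid f) \\
        &\approx \arg\max_{e} \Pr(e \mid T_f, f)\,\Pr(T_f \mid f),
           \qquad T_f = \text{best parse of } f \\
        &= \arg\max_{e} \Pr(e \mid T_f, f)
\end{align*}
```

The last step holds because $\Pr(T_f \mid f)$ does not depend on $e$ once the parser has fixed $T_f$.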


Questions & Discussions