
Insertion Position Selection Model for Flexible Non-Terminals in Dependency Tree-to-Tree Machine Translation

Toshiaki Nakazawa, Japan Science and Technology Agency (JST)
John Richardson, Sadao Kurohashi, Kyoto University

4/11/2016 @ EMNLP 2016

Where to insert?

[Figure: inserting the floating word "yesterday" into "I found Pikachu by chance"; the candidate insertion positions carry probabilities 0.70, 0.25, 0.02, 0.01, 0.01, 0.01.]

Where to insert?

[Figure: inserting the floating phrase "in the park" into "I found Pikachu by chance yesterday"; the candidate insertion positions carry probabilities 0.2, 0.1, 0.6, 0.01, 0.01, 0.01, 0.1. Photo: Pikachu @Texas State Capitol.]

Dependency Tree-to-Tree Translation

[Figure: a worked example. The input dependency tree for 私は昨日公園でピカチュウを見つけた ("I found Pikachu in the park yesterday") is translated by composing tree-to-tree translation rules (私は … を見つけた → I found …, ピカチュウ → Pikachu, 偶然 → by chance, 公園 → the park, 昨日 → yesterday) into an English dependency tree, with the non-terminal [X7] marking a substitution site.]

Dependency Tree-to-Tree Translation

[Figure: the same example with flexible non-terminals [Richardson+, 2016]. Rules carry non-terminals [X] whose insertion position is not fixed in advance, so floating subtrees such as "yesterday" and "in the park" can attach at several different positions in the output tree for "I found Pikachu by chance yesterday in the park".]

Translation Quality and Decoding Speed w/ and w/o Flexible Non-terminals

• Using ASPEC (Asian Scientific Paper Excerpt Corpus) JE and JC
• Time is relative decoding time

          Ja->En        En->Ja        Ja->Zh        Zh->Ja
          BLEU  Time    BLEU  Time    BLEU  Time    BLEU  Time
w/o Flex  20.28 1.00    28.77 1.00    24.85 1.00    30.51 1.00
w/ Flex   21.61 6.28    30.57 3.30    28.79 5.16    34.32 5.28

Appropriate Insertion Position Selection

• Roughly half of all translation rules were augmented with flexible non-terminals [Richardson+, 2016]
• Flexible non-terminals make the search space much bigger -> slower decoding speed, increased search error
• Idea: reduce the number of possible insertion positions in translation rules with a neural network model


INSERTION POSITION SELECTION MODEL


Insertion Position Selection Model

• For each insertion position, predict a score
• Given:
  – input (source side): the floating word (I) and its parent word (Ps), with their distance (Ds)
  – target: the previous (Sp) and next (Sn) sibling words of the insertion position and its parent (Pt), with the distance (Dt)

These features are summarized in the sketch below.
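As a minimal sketch, the per-candidate feature tuple might look like this in Python; the dataclass container is an assumption for illustration, while the field names follow the slides:

    # A minimal sketch of the features for one candidate insertion position.
    # The dataclass itself is an assumption; the field names follow the slides.
    from dataclasses import dataclass

    @dataclass
    class InsertionCandidate:
        I: str    # floating word to be inserted
        Ps: str   # parent of I in the source tree
        Ds: int   # distance between I and Ps on the source side
        Sp: str   # previous sibling at the candidate position (target side)
        Sn: str   # next sibling at the candidate position (target side);
                  # a position with no next sibling uses the marker [POST-BOTTOM]
        Pt: str   # parent of the candidate position (target side)
        Dt: int   # signed distance from Pt on the target side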

Information for Selection Model

[Figure: the features in the running example. On the source side, the floating word (I) and its parent (Ps) are read off the input tree, with distance Ds = 4; on the target side of the rule, the candidate position's previous sibling (Sp), next sibling (Sn), and parent (Pt) are read off, with distance Dt = -2. Non-terminals are reverted to the original word in the parallel corpus, e.g. [X7] -> [yesterday] under [found].]

Information for Selection Model

[Figure: a second candidate position in the same example, with Ds = 4 and Dt = -3; this position has no next sibling, so Sn is the special marker [POST-BOTTOM].]

Neural Network Model

[Figure: for each candidate insertion position, the word inputs I (word to be inserted), Ps (parent of I), Sp1 (previous sibling), Sn1 (next sibling), and Pt (parent of the insertion position) are embedded as 220-dimensional vectors, and the distances Ds (from Ps) and Dt (from Pt) as 100-dimensional vectors. A fully-connected feed-forward network maps each candidate to a score; a softmax over the N candidate positions is compared with the gold one-hot labels, and the loss is softmax cross-entropy.]
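A minimal PyTorch sketch of this scorer, assuming the stated embedding sizes (220 for words, 100 for distances); the shared embedding table, hidden layer size, and activation are assumptions, since the slides do not specify them:

    import torch
    import torch.nn as nn

    class InsertionPositionSelector(nn.Module):
        def __init__(self, vocab_size, n_dist_buckets, hidden_dim=256):
            super().__init__()
            self.word_emb = nn.Embedding(vocab_size, 220)      # I, Ps, Sp1, Sn1, Pt
            self.dist_emb = nn.Embedding(n_dist_buckets, 100)  # Ds, Dt (bucketed)
            self.ffn = nn.Sequential(          # fully-connected feed-forward network
                nn.Linear(5 * 220 + 2 * 100, hidden_dim),
                nn.Tanh(),
                nn.Linear(hidden_dim, 1),      # one score per candidate position
            )

        def forward(self, word_ids, dist_ids):
            # word_ids: (N, 5) ids of I, Ps, Sp1, Sn1, Pt for each of N candidates
            # dist_ids: (N, 2) bucketed ids of Ds and Dt
            w = self.word_emb(word_ids).flatten(1)   # (N, 5*220)
            d = self.dist_emb(dist_ids).flatten(1)   # (N, 2*100)
            return self.ffn(torch.cat([w, d], dim=1)).squeeze(-1)  # (N,) scores

    # Training: softmax over the N candidates, cross-entropy against the gold slot.
    # scores = model(word_ids, dist_ids)
    # loss = nn.functional.cross_entropy(scores.unsqueeze(0), gold)  # gold: (1,) index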

Training Data Creation

• Training data for the NN model can be created automatically from the word-aligned parallel corpus
  – consider each aligned target word as the floating word and remove it from the target tree; its original position is the gold insertion position (a sketch follows the figure)

[Figure: in the example, one aligned word is removed from the target tree for "I found Pikachu by chance"; its original position is labeled 1 and all other candidate positions [X] are labeled 0.]
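A rough sketch of this procedure; the tree helpers (aligned_target_nodes, remove, candidate_positions, restore, features) are all hypothetical names introduced for illustration, not from the paper:

    def make_training_examples(source_tree, target_tree, alignment):
        """Yield (features, label) pairs for the selection model."""
        examples = []
        for node, src_word in aligned_target_nodes(target_tree, alignment):
            # Treat the aligned word as the floating word and remove it.
            gold_slot = target_tree.remove(node)
            for slot in candidate_positions(target_tree):
                # Label 1 only at the word's original position, 0 elsewhere.
                examples.append((features(src_word, node, slot),
                                 int(slot == gold_slot)))
            target_tree.restore(node, gold_slot)  # undo for the next floating word
        return examples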

EXPERIMENTS


Insertion Position Selection Experiment

• Parallel corpus: ASPEC-JE/JC (2M/680K sentences)
• Data size: see the table below
• Comparison: L2-regularized logistic regression (using Multi-core LIBLINEAR); a rough scikit-learn analogue is sketched after the table

                 Ja->En   En->Ja   Ja->Zh   Zh->Ja
Training             15.7M (JE)       5.7M (JC)
Development           160K (JE)        58K (JC)
Test                  160K (JE)        58K (JC)
Ave. # IP         3.39     3.15     3.72     3.41
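For reference only, a minimal scikit-learn analogue of that baseline; the paper used Multi-core LIBLINEAR directly, and the feature encoding and data below are stand-in assumptions:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Stand-in data: rows are one-hot word/distance features for candidate
    # positions; labels are 1 at the gold position and 0 elsewhere.
    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(1000, 50))
    y = rng.integers(0, 2, size=1000)

    clf = LogisticRegression(penalty="l2", solver="liblinear", C=1.0)
    clf.fit(X, y)
    print(clf.score(X, y))  # per-position accuracy, cf. the "Logit" row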

Experimental Results

                     Ja->En   En->Ja   Ja->Zh   Zh->Ja
Training                 15.7M (JE)       5.7M (JC)
Development               160K (JE)        58K (JC)
Test                      160K (JE)        58K (JC)
Ave. # IP             3.39     3.15     3.72     3.41
Mean loss            0.089    0.058    0.105    0.056
Top 1 Accuracy (%)   97.08    97.72    96.51    97.99
Top 2 Accuracy (%)   98.94    99.52    98.97    99.56
Logit Accuracy (%)   55.00    89.03    68.04    83.16

Translation Experiment

• Parallel corpus: ASPEC-JE/JC (2M/680K sentences)
• Decoder: KyotoEBMT [Richardson+, 2014]
• 5 settings:
  – Phrase-based and hierarchical phrase-based SMT
  – w/o Flex: not using flexible non-terminals
  – w/ Flex: baseline with flexible non-terminals
  – Prop: using insertion position selection (top 1 only; see the pruning sketch below)
• Metrics: BLEU and relative decoding time
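In the Prop setting, each flexible non-terminal keeps only its top-scoring insertion position before decoding. A minimal sketch of that pruning step, assuming the model above; rule.candidate_feature_ids and rule.positions are hypothetical names for illustration:

    import torch

    def prune_insertion_positions(rule, model, k=1):
        """Keep only the k best-scoring insertion positions (Prop uses k=1)."""
        word_ids, dist_ids = rule.candidate_feature_ids()  # assumed helper
        with torch.no_grad():
            scores = model(word_ids, dist_ids)
        keep = torch.topk(scores, k).indices.tolist()
        rule.positions = [rule.positions[i] for i in keep]
        return rule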

Translation Experimental Results

          Ja->En        En->Ja        Ja->Zh        Zh->Ja
          BLEU  Time    BLEU  Time    BLEU  Time    BLEU  Time
PBSMT     18.45 -       27.48 -       27.96 -       34.65 -
HPBSMT    18.72 -       30.19 -       27.71 -       35.43 -
w/o Flex  20.28 1.00    28.77 1.00    24.85 1.00    30.51 1.00
w/ Flex   21.61 6.28    30.57 3.30    28.79 5.16    34.32 5.28
Prop      22.07 2.25    30.50 1.27    29.83 2.21    34.71 1.89


Conclusion

• Proposed an insertion position selection model to reduce the number of insertion positions for flexible non-terminals in translation rules
• Both automatic evaluation scores and decoding speed are improved

Future Work• Use grand-children’s info– Recursive NN [Liu et al., 2015] or Convolutional

NN [Mou et al., 2015]

• Shift to NMT!!– Actually, we’ve already shifted and participated

WAT2016 shared tasks• However, NMT is still far from perfect

J->E Adequacy in WAT2016

[Figure: stacked percentage bars of human adequacy scores (1-5) for three J->E systems at WAT2016.

                   Kyoto-U (NMT)   NAIST/CMU (NMT)   NAIST (2015 best, F2T)
Average adequacy   3.83            3.76              3.71
BLEU               26.22           26.39             25.41
]


Thank You!

AD: I'm co-organizing The 3rd Workshop on Asian Translation (WAT2016), held in conjunction with COLING 2016. Invited talk by Google about GNMT! Please come to the workshop!

http://lotus.kuee.kyoto-u.ac.jp/WAT/