CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 10, 11 – MT approaches). Pushpak Bhattacharyya, CSE Dept., IIT Bombay, 25th Jan and 27th Jan, 2011. Acknowledgement: parts are from Hansraj's dual degree seminar presentation.




Czech-English data
[nesu] I carry
[ponese] He will carry
[nese] He carries
[nesou] They carry
[yedu] I drive
[plavou] They swim

To translate: I will carry. They drive. He swims. They will drive.

Hindi-English data
[DhotA huM] I carry
[DhoegA] He will carry
[DhotA hAi] He carries
[Dhote hAi] They carry
[chalAtA huM] I drive
[tErte hEM] They swim

Bangla-English data
[bai] I carry
[baibe] He will carry
[bay] He carries
[bay] They carry
[chAlAi] I drive
[sAMtrAy] They swim

MT Approaches
Levels at which the SOURCE can be mapped to the TARGET: words, phrases, syntax, semantics, interlingua.

Taxonomy of MT approaches
- Knowledge based / rule based MT: transfer based, interlingua based
- Data driven / machine learning based: example based MT (EBMT), statistical MT

Motivation
MT is NLP-complete, NLP is AI-complete, and AI is CS-complete.
How will the world be different when the language barrier disappears?
The volume of text required to be translated currently exceeds translators' capacity (demand outstrips supply). Solution: automation (the only solution).
There are many machine translation techniques; which approach is better for Hindi-English MT?

Interlingual representation: complete disambiguation
Example: Washington voted Washington to power.
Interlingua graph (figure): the node vote @past (action) is linked by agent, object and goal relations to the two Washington nodes (one a place, one a person marked @emphasis) and to power (capability).

Kinds of disambiguation needed for a complete and correct interlingua graph:
N: Name
P: POS
A: Attachment
S: Sense
C: Co-reference
R: Semantic Role

Target sentence generation from interlingua
Stages: Lexical Transfer (word/phrase translation), Syntax Planning (sequencing), Morphological Synthesis (word form generation); note the role of function words.
Example: Washington voted Washington to power.

Washington [agent] ne Washington [object] ko sattaa [goal] ke liye chunaa

Lexical transfer: vote -> chunna, power -> sattaa

Statistical Machine Translation (SMT)
A data driven approach. The goal is to find the English sentence e, given a foreign language sentence f, for which Pr(e|f) is maximum.
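By Bayes' rule this is the noisy-channel formulation:

  e* = argmax_e Pr(e|f) = argmax_e Pr(f|e) * Pr(e)

so the search needs a language model Pr(e) and a translation model Pr(f|e), introduced on the next slides.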

Translations are generated on the basis of a statistical model whose parameters are estimated from bilingual parallel corpora.

SMT: Language Model
Used to detect good English sentences. The probability of an English sentence s1 s2 ... sn can be written as

  Pr(s1 s2 ... sn) = Pr(s1) * Pr(s2|s1) * ... * Pr(sn|s1 s2 ... sn-1)

where Pr(sn|s1 s2 ... sn-1) is the probability that word sn follows the word string s1 s2 ... sn-1. An N-gram model approximates this history by the previous N-1 words; the trigram model, for example, uses

  Pr(sn|s1 s2 ... sn-1) ≈ Pr(sn|sn-2 sn-1)
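A minimal Python sketch of relative-frequency trigram estimation; the toy corpus and the <s>/</s> padding tokens are assumptions added for illustration, not part of the lecture:

from collections import defaultdict

# Count trigrams and their bigram histories in a toy tokenized corpus.
corpus = [["he", "carries", "water"], ["they", "carry", "water"]]
trigram_counts = defaultdict(int)
bigram_counts = defaultdict(int)

for sentence in corpus:
    tokens = ["<s>", "<s>"] + sentence + ["</s>"]
    for i in range(2, len(tokens)):
        trigram_counts[(tokens[i - 2], tokens[i - 1], tokens[i])] += 1
        bigram_counts[(tokens[i - 2], tokens[i - 1])] += 1

def trigram_prob(w1, w2, w3):
    """Pr(w3 | w1 w2) by relative frequency (no smoothing)."""
    history = bigram_counts[(w1, w2)]
    return trigram_counts[(w1, w2, w3)] / history if history else 0.0

def sentence_prob(sentence):
    """Probability of a sentence under the trigram approximation."""
    tokens = ["<s>", "<s>"] + sentence + ["</s>"]
    p = 1.0
    for i in range(2, len(tokens)):
        p *= trigram_prob(tokens[i - 2], tokens[i - 1], tokens[i])
    return p

print(sentence_prob(["he", "carries", "water"]))    # 0.5 on this toy corpus

In practice the raw counts are smoothed, since relative frequencies of rare trigrams cannot be trusted.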

SMT: Translation Model
Pr(f|e): the probability of a foreign sentence f given a hypothesised English translation e. How do we assign values to Pr(f|e)?

The set of sentences is unbounded, so it is not possible to estimate this probability directly for every sentence pair (e, f).

So we introduce a hidden variable a that represents alignments between the individual words in the sentence pair.
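In symbols, the sentence-level probability is obtained by summing over all alignments:

  Pr(f|e) = Σ_a Pr(f, a|e)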

The alignment links the sentence-level probability Pr(f|e) to word-level translation probabilities.

Alignment
If the English string e = e1 e2 ... el has l words and the foreign string f = f1 f2 ... fm has m words, then the alignment a can be represented by a series a1 a2 ... am of m values, each between 0 and l, such that if the word in position j of the f-string is connected to the word in position i of the e-string then aj = i, and if it is not connected to any English word then aj = 0.

Example of alignment
English: Ram went to school
Hindi: Raama paathashaalaa gayaa

(Alignment figure: each Hindi word is linked to the English word it translates.)
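One natural alignment, numbering the English positions 1-4 (Ram, went, to, school): Raama -> Ram, paathashaalaa -> school, gayaa -> went, i.e. a = (a1, a2, a3) = (1, 4, 2), with no Hindi word connected to "to".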

Translation Model: Exact expression
Five models are used for estimating the parameters in the expression [2]: Model-1, Model-2, Model-3, Model-4 and Model-5.

The exact expression decomposes the generation of f from e into three choices:
- choose the alignment given e and m
- choose the identity of each foreign word given e, m and a
- choose the length of the foreign language string given e

Proof of Translation Model: Exact expression
Since m is fixed for a particular f, Pr(f|e) is obtained by marginalizing over the hidden alignments, Pr(f|e) = Σ_a Pr(f, a|e), with Pr(f, a|e) decomposed into the three choices above.

Model-1
The simplest model. Assumptions:
- Pr(m|e) is independent of m and e and is equal to a constant ε
- The alignment of a foreign language word (FLW) depends only on the length of the English sentence and equals (l+1)^-1, where l is the length of the English sentence
The likelihood function is as follows.
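In the standard notation of Brown et al. [2],

  Pr(f|e) = ε / (l+1)^m * Π_{j=1..m} Σ_{i=0..l} t(f_j|e_i)

where t(f|e) is the probability that English word e translates into foreign word f.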

Maximize the likelihood function subject to the constraint that Σ_f t(f|e) = 1 for each English word e.

Model-1: Parameter estimation
Using Lagrange multipliers for the constrained maximization, the solution for the Model-1 parameters is

  t(f|e) = λ_e^-1 * c(f|e; f, e)   (summed over all sentence pairs of the corpus)

with the expected count

  c(f|e; f, e) = t(f|e) / (t(f|e_0) + ... + t(f|e_l)) * Σ_{j=1..m} δ(f, f_j) * Σ_{i=0..l} δ(e, e_i)

where λ_e is a normalization constant, c(f|e; f, e) is the expected count of e connecting to f in the pair (f, e), and δ(f, f_j) is 1 if f and f_j are the same word and zero otherwise. t(f|e) is estimated iteratively using the Expectation-Maximization (EM) procedure, sketched below.
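A minimal Python sketch of this EM loop; the two-pair toy corpus and the fixed number of iterations are assumptions for illustration, and a real implementation would iterate to convergence over a full parallel corpus:

from collections import defaultdict

# Toy parallel corpus: (foreign sentence, English sentence with a NULL word).
corpus = [
    (["raama", "paathashaalaa", "gayaa"], ["NULL", "Ram", "went", "to", "school"]),
    (["raama", "gayaa"], ["NULL", "Ram", "went"]),
]

t = defaultdict(lambda: 1.0)            # t(f|e), uniform start (any positive constant works)

for _ in range(10):                     # EM iterations
    count = defaultdict(float)          # expected counts c(f|e)
    total = defaultdict(float)          # normalization lambda_e per English word
    for f_sent, e_sent in corpus:
        for f in f_sent:
            denom = sum(t[(f, e)] for e in e_sent)      # E-step: alignment posteriors
            for e in e_sent:
                delta = t[(f, e)] / denom
                count[(f, e)] += delta
                total[e] += delta
    for (f, e), c in count.items():     # M-step: re-normalize expected counts
        t[(f, e)] = c / total[e]

print(round(t[("paathashaalaa", "school")], 3))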

Model-2
In Model-2, the alignment of an FLW to an English word also depends on its position.

The likelihood function is
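In the standard notation of [2],

  Pr(f|e) = ε * Π_{j=1..m} Σ_{i=0..l} t(f_j|e_i) * a(i|j, m, l)

where a(i|j, m, l) is the probability that position j of the foreign string aligns to position i of the English string, given the lengths m and l.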

Model-1 & Model-2
Model-1 is the special case of Model-2 in which a(i|j, m, l) = (l+1)^-1. To instantiate the Model-2 parameters, use the parameters estimated by Model-1.

Model-3
Fertility: the number of FLWs to which an English word is connected in a randomly selected alignment.
Tablet: the list of FLWs connected to an English word.
Tableau: the collection of tablets.

The alignment (generation) process:
for each English word:
  begin
    decide the fertility of the word
    get a list of foreign words to connect to the word
  end
permute the words in the tableau to generate f
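A minimal Python sketch of this generative story; the fertility and translation tables use placeholder foreign tokens, and a random permutation stands in for the learned distortion model, all purely illustrative assumptions:

import random

# Toy parameter tables (placeholders, not trained values).
fertility = {"Annual": 1, "inflation": 3, "rises": 3, "to": 1, "11.42%": 1}
translation = {
    "Annual": ["f_annual"],
    "inflation": ["f_infl_1", "f_infl_2", "f_infl_3"],
    "rises": ["f_rise_1", "f_rise_2", "f_rise_3"],
    "to": ["f_to"],
    "11.42%": ["11.42%"],
}

def generate(e_sentence):
    tableau = []                                   # one tablet per English word
    for word in e_sentence:
        n = fertility[word]                        # step 1: decide the fertility
        tablet = [random.choice(translation[word]) for _ in range(n)]   # step 2: translate
        tableau.append(tablet)
    flat = [f for tablet in tableau for f in tablet]
    random.shuffle(flat)                           # step 3: reorder (stand-in for distortion)
    return flat

print(generate(["Annual", "inflation", "rises", "to", "11.42%"]))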

Model-3: Example
English sentence (e) = Annual inflation rises to 11.42%

Step 1, deciding fertilities (F):
F = Annual inflation inflation inflation rises rises rises to 11.42%

Step 2, translation to FLWs (T): each word of F is replaced by a foreign language word; 11.42% is carried over unchanged.

Step 3, reordering the FLWs (R): the translated words are permuted into target language order.

The probabilities of the fertility, translation and reordering choices (F, T, R) are calculated using the formulas obtained for Model-3 [2].

Model-4 & Model-5
In Model-3 every word is moved independently. Model-4 considers phrases (cepts) in a sentence: the distortion probability is replaced by one parameter for the head of each cept and another parameter for the remaining words of the cept. Models 3 and 4 are deficient in their distortion probabilities (probability mass is assigned to impossible word placements); Model-5 removes this deficiency by avoiding unavailable positions, introducing a new variable for the positions still open.

Example Based Machine Translation (EBMT)
Basic idea: translate a sentence by using the closest match in parallel data. Inspired by human analogical thinking.

Issues related to examples in corpora
- Granularity of examples: parallel text should be aligned at the sub-sentence level
- Number of examples
- Suitability of examples:
  (i) Columbus discovered America   (ii) America was discovered by Columbus
  (a) Time flies like an arrow   (b) Time flies like an arrow

How should examples be stored?
- Annotated tree structures
- Generalized examples, e.g. Rajesh will reach Mumbai by 10:00 pm -> P will reach D by T
Annotated tree structure example: a fully annotated tree with explicit links.

EBMT: Matching and Retrieval (1/2)
The system must be able to recognize the similarities and differences between the input and the stored examples.

String-based matching: longest common subsequence; word similarity can be taken into account for sense disambiguation.
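A minimal Python sketch of LCS-based retrieval over a toy example store; the store and its placeholder translation strings are assumptions for illustration, not the system discussed here:

def lcs_length(a, b):
    """Length of the longest common subsequence of token lists a and b."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]

def closest_example(input_tokens, examples):
    """Return the stored (source, translation) pair whose source side matches best."""
    return max(examples, key=lambda ex: lcs_length(input_tokens, ex[0]))

examples = [
    (["select", "paste", "in", "the", "edit", "menu"], "stored translation A"),
    (["select", "symbol", "in", "the", "insert", "menu"], "stored translation B"),
]
print(closest_example(["select", "symbol", "in", "the", "edit", "menu"], examples))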

EBMT: Matching and Retrieval (2/2)
Angle of similarity: a trigonometric similarity measure based on relative length and relative contents, e.g.
(x) Select Symbol in the Insert menu.
(y) Select Symbol in the Insert menu to enter a character from the symbol set.
(z) Select Paste in the Edit menu.
(w) Select Paste in the Edit menu to enter some text from the clipboard.

The measure combines the qualitative difference between sentences x and y with the difference between the sizes of x and y.

EBMT: Adaptation & Recombination
Adaptation: extracting appropriate fragments from the matched translation, e.g.
  The boy entered the house ->
  I saw a tiger ->
  The boy eats his breakfast ->
  I saw the boy ->
Boundary friction: the retrieved translations do not fit the syntactic context, e.g. I saw the boy -> *

Recombination: the fragments are recombined into the target text; an SMT language model can be used.

Interlingua Based MT
Interlingua: "between languages". The SL text is converted into a language-independent or 'universal' abstract representation, which is then transformed into several TLs.

Universal Networking Language (UNL)
UNL is an example of an interlingua. It represents information sentence by sentence and is composed of universal words and relations.
Example: I gave him a book
{unl}
agt ( give.@entry.@past, i )
obj ( give.@entry.@past, book.@indef )
gol ( give.@entry.@past, he )
{/unl}

Issues related to interlingua
An interlingua must capture the knowledge in the text precisely and accurately, and handle cross-language divergence.

Divergence between Hindi and English
- Constituent order divergence
- Null subject divergence: *am going (I am going)
- Conflational divergence: Jim stabbed John
- Promotional divergence: The play is on

Benefits & Shortcomings (1/3): Statistical Machine Translation
"Every time I fire a linguist, my system's performance improves" (Brown et al. 1988)
Pros:
- No linguistic knowledge is required
- A great deal of natural language is available as machine-readable text
- Loose dependencies between languages can be modelled better
Cons:
- Probabilities of rare words cannot be trusted
- Not good for idioms, jokes, compound words, or text with hidden meaning
- Selecting the correct morphological word form is difficult

Benefits & Shortcomings (2/3): Example Based MT
Pros:
- Perfect translation of a sentence if a very similar one is found among the example sentences
- No need to re-translate previously translated sentences

Cons:
- Fails if no match is found in the corpora
- Problems at the points where examples are concatenated in the recombination step

Benefits & Shortcomings (3/3): Interlingua based MT
Pros:
- Add a new language and get all-ways translation to all previously added languages
- Monolingual development teams
- Economical in situations where translation among multiple languages is needed
Cons:
- Meaning is arbitrarily deep: at what level of detail do we stop?
- Human development time

Translation is Ubiquitous
- Between languages: Delhi is the capital of India
- Between dialects: example on the next slides
- Between registers: "My mom not well." vs "My mother is unwell." (in a leave application)

Between dialects (1/3)
Lage Raho Munnabhai is an excellent example. Scene: Munnabhai (Sanjay Dutt), posing as Prof. Murli Prasad Sharma, is being interviewed, with some citizens asking questions in the presence of Jahnavi (Vidya Balan). A citizen asks a question.

Between dialects (2/3)
Bapu, standing behind Munnabhai and invisible to the others, supplies the answer, which Munnabhai then repeats in his own dialect for the whole country.

Between dialects (3/3)
The exchange continues in the same way, Bapu's wording recast each time into Munnabhai's dialect.

Comparison between EBMT, SMT and Interlingua based MT

Property                              | Example Based MT | Statistical MT | Interlingua based MT
Parallel corpora                      | Yes              | Yes            | No
Dictionary                            | Yes              | No             | Yes
Transfer rules                        | No               | No             | Yes
Parser                                | Yes              | No             | Yes
Semantic analysis                     | No               | No             | Yes
Data driven incremental improvement   | Yes              | Yes            | No
Translation speed                     | Slow             | Slow           | Fast
Language dependency                   | No               | No             | Yes
Intermediate meaning representation   | No               | No             | Yes (universal representation)

References (1/2)
P. Brown, S. Della Pietra, V. Della Pietra, and R. Mercer. The mathematics of statistical machine translation: parameter estimation. Computational Linguistics, 19(2), 263-311, 1993.
Makoto Nagao. A framework of a mechanical translation between Japanese and English by analogy principle. In A. Elithorn and R. Banerji (eds.), Artificial and Human Intelligence. Elsevier Science Publishers, 1984.
H. Somers. Review article: Example-based machine translation. Machine Translation, 14(2), 113-157, June 1999.
D. Turcato and F. Popowich. What is example-based machine translation? In M. Carl and A. Way (eds.), Recent Advances in Example-Based Machine Translation. Kluwer Academic Publishers, Dordrecht, 2003 (revised version of a workshop paper).

References (2/2)
S. Dave, J. Parikh, and P. Bhattacharyya. Interlingua based English-Hindi machine translation and language divergence. Journal of Machine Translation, Volume 17, 2002.
Adam L. Berger, Stephen A. Della Pietra, and Vincent J. Della Pietra. A maximum entropy approach to natural language processing. Computational Linguistics, 22(1), March 1996.
Jason Baldridge, Tom Morton, and Gann Bierner. The opennlp.maxent package: POS tagger, end of sentence detector, tokenizer, name finder. http://maxent.sourceforge.net/, version 2.4.0, October 2005.
Universal Networking Language (UNL) Specifications. UNL Center of UNDL Foundation. http://www.undl.org/unlsys/unl/unl2005/, 7 June 2005.