Print Supp 111


  • 8/22/2019 Print Supp 111

    1/17

    Streamlining Internationalization and Localization

    Evaluating Emerging Language Technologies

    Corpus Linguistics and the Translation Process

    Creating Your Own Multilingual Technology

    Getting Started Guide: Language Technology, April/May 2010

    http://www.multilingual.com/

    The Localization Industry's 1st Collaborative & Multilingual Terminology Development & Management System

    For a free trial, visit www.csoftintl.com.

    TermWiki™: Collaborate. Develop. Control.

    Copyright 2010 CSOFT International Ltd. All rights reserved. http://www.csoftintl.com http://www.termwiki.com http://www.l10nworks.com


    Our industry would be at a major loss without language technologies and the updates, improvements, innovations and twists they undergo. Ian Henderson begins the guide by proposing how to work the kinks out of a process made tricky by technology and how it fits into worldwide languages. Not surprisingly, this may involve more technology. Vadim Berman then offers some tips on what to look for when buying such technology. Thiana Donato explores a method of improving language technology, and finally, Dennis Wakabayashi and Chris Golaszewski explain how they went about creating a new technology. We hope these different angles are useful to you!

    The Editors

    Editor-in-Chief, Publisher: Donna Parrish
    Managing Editor: Katie Botkin
    Proofreader: Jim Healey
    News: Kendra Gray
    Illustrator: Doug Jones
    Production: Doug Jones, Darlene Dibble
    Editorial Board: Jeff Allen, Ultan Ó Broin, Arturo Quintero, Jessica Roland, Lori Thicke, Jost Zetzsche
    Advertising Director: Jennifer Del Carlo
    Advertising: Kevin Watson, Bonnie Hagan
    Webmaster: Aric Spence
    Technical Analyst: Curtis Booker
    Data Administrator: Cecilia Spence
    Assistant: Shannon Abromeit
    Subscriptions: Terri Jadick

    Special Projects Bernie [email protected]

    www.multilingual.com/advertising208-263-8178

    Subscriptions, customer service, back issues

    [email protected]/subscribeSubmissions [email protected]

    Editorial guidelines are available at www.multilingual.com/editorialWriter

    Reprints [email protected]

    This guide is published as a supplement to MultiLingual, the magazine about language technology, localization, web globalization and international software development. It may be downloaded at www.multilingual.com/gsg

    Streamlining Internationalization and Localization
    Ian Henderson

    Evaluating Emerging Language Technologies
    Vadim Berman

    Corpus Linguistics and the Translation Process
    Thiana Donato

    Creating Your Own Multilingual Technology
    Dennis Wakabayashi and Chris Golaszewski


    The No. 1 independent technology for the linguistic supply chain.

    Across Systems GmbH

    Phone +49 7248 925 425

    [email protected]

    Across Systems, Inc.

    Phone +1 877 922 7677

    [email protected]

    Ian Henderson is CEO of Rubric, a provider of localization services

    to the high technology industry for the past 15 years.

    Vadim Berman is a cofounder and a CEO of Digital Sonata,

    a provider of language engineering products and services.

    Thiana Donato is executive director and founder of All Tasks,

    a Brazilian company in the South American multilingual services market.

    Dennis Wakabayashi, founder and chief online officer of Mojofiti,

    has handled international business operations for a number of years.

    Chris Golaszewski is the manager of online development at Mojofiti.


    Streamlining Internationalization and Localization

    Ian Henderson

    Internationalization is defined as an enabling process: making original content, such as code, ready for markets around the world. Localization is then an adaptation process that prepares content for a specific target market.

    Ideally, the two processes work seamlessly. Internationalization should ease the process of localization. Many years ago (perhaps after the fiasco of Y2K, when countless professionals scrambled to undo two-character year fields) IT and product teams, especially in the software industry, learned that it is better to design code or content with the intent of presenting it globally than to have to retrofit it after the fact.

    Yet, the gaps between localization and internationalization often remain wide. In 2007, Lingoport conducted a survey of customers and vendors that identified a major gap between internationalization and localization teams, which can adversely impact time-to-market deadlines.

    Three years later, the significant crevice between internationalization teams and localization providers persists. Time-to-market deadlines have shrunk even further, and expectations for lower internationalization and localization costs continue to increase. Furthermore, recent research, such as that conducted by Aberdeen Group (see sidebar), points to the value of integrated translation environments. Content providers and code developers who integrate teams from end to end will reap sizable benefits as they roll their products and services out worldwide.

    The problem

    Many times when we begin work as a language service provider (LSP), the internationalization process is fully complete. In the client's mind, it is now just a question of localizing the files and going to market. If localization were as simple as that, we would be out of work pretty quickly.

    The issue we face as an LSP coming up to speed to get a product to local markets around the world is that the internationalization effort has frequently been completed without involving an experienced internationalization team, or any other LSP for that matter. Take Synaptics, for example, a leading worldwide developer of human interface solutions for mobile computing, communications and entertainment devices. Previously, this company had updated its multilingual resource code (RC) files by adding new English strings at the end of each language section. This was a manual, error-prone and time-consuming process for the client. An added complication was that not all languages were in sync, so the added English strings varied from language to language. At our end, we had to extract the English strings from each section, translate them and patch them back into the multilingual files. The engineering process was laborious.

    Ideally, the internationalization and localization teams, whether external or internal, work closely together. We have found that the ability to streamline is directly correlated to the volume and frequency of work. In fact, when we work with a client to streamline and reduce the effort and cost of the localization process, it is imperative that there is high volume and frequency of work, as we often undertake these cost-reduction exercises without passing the cost on to the client. Once that relationship is established, we work with the client to standardize file formats, file names, codes and so on.

    One of our first recommendations to clients is to employ an integrated translation management solution that includes standardized terminology in one place, for both the internationalization and localization teams, which is a big step in reducing redundancy and encouraging collaboration. It is not surprising that the recent Aberdeen research found that companies using an integrated solution (see Figure 1) were far more successful with production and version control.

    Within that integrated process, many factors influence overall success. For example, it is surprising how many clients believe that if all localizable text is put into Excel or XML files during the internationalization process, then the problem is solved. Unfortunately, this is not the case. All serious localization companies will use translation tools when localizing files, but these tools will only support standard file formats, such as RC and XLIFF. If you come up with your own XML schema or Excel spreadsheet, you can be pretty sure some engineering effort will be required to separate translatable from non-translatable text. Fixing file format structure during the internationalization process speeds localization.

    Figure 1: Solutions integrated with translation management. (Bar chart comparing best-in-class companies with all others for process/project management and terminology management.)

    Source: Aberdeen Group, Translating Product Documentation

    Next, we work with clients to ensure the file naming scheme follows a recognizable and consistent pattern. Typically, translation tools retain the name of the source file for the target file, but put the target file in a different folder. For example, a source file called /en/resources.rc may end up as /fr/resources.rc. Alternatively, translation tools may rename the source file by adding or replacing language identifiers at the end of the file name. /res/props.properties may become /res/props_de-DE.properties. Handling a mishmash of file naming conventions is not a problem in itself, but it adds time and increases cost, as somebody needs to make sure the translated file names conform to the required pattern.
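The two naming conventions described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `target_by_folder` and `target_by_suffix` are hypothetical, and a real translation tool may apply its own rules.

```python
from pathlib import PurePosixPath


def target_by_folder(source: str, src_lang: str, tgt_lang: str) -> str:
    """Keep the file name and swap the language folder (hypothetical helper)."""
    path = PurePosixPath(source)
    parts = [tgt_lang if part == src_lang else part for part in path.parts]
    return str(PurePosixPath(*parts))


def target_by_suffix(source: str, locale: str) -> str:
    """Keep the folder and append a locale identifier to the file stem."""
    path = PurePosixPath(source)
    return str(path.with_name(f"{path.stem}_{locale}{path.suffix}"))


if __name__ == "__main__":
    print(target_by_folder("/en/resources.rc", "en", "fr"))
    print(target_by_suffix("/res/props.properties", "de-DE"))
```

Agreeing on one of these patterns up front means the renaming step can be scripted once instead of being checked by hand for every delivery.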

    Using standard language and country codes reduces the risk of errors. We have one client using gr to denote German, while another client uses the same gr for Greek. Because of this, in one rushed instance we actually delivered the wrong language. Using standard ISO codes, such as de for German (Deutsch) and el for Greek (Ellinika), alleviates that problem.

    Another client uses bs-ID for one of its languages. This is not Bosnian as spoken in Indonesia, but, in fact, refers to Bahasa Indonesia (id-ID in ISO terms). Similarly, bs-BS is neither Bosnian nor Bahasa as spoken in the Bahamas, but Bahasa Melayu (ms-MY in ISO speak). Straightening out these differences, by using the ISO codes from the very beginning, streamlines the entire process.
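When clients cannot switch to standard codes right away, one workaround is a per-client lookup that converts their codes before files enter the translation workflow. The sketch below is a minimal illustration; the client labels (`client_a`, `client_b`, `client_c`) and the function name are hypothetical, and only the code pairs themselves come from the examples above.

```python
# Hypothetical per-client tables mapping nonstandard codes to ISO-style codes.
CLIENT_CODE_FIXES = {
    "client_a": {"gr": "de"},                        # "gr" used for German
    "client_b": {"gr": "el"},                        # "gr" used for Greek
    "client_c": {"bs-ID": "id-ID", "bs-BS": "ms-MY"},  # Bahasa Indonesia, Bahasa Melayu
}


def normalize_code(client: str, code: str) -> str:
    """Return the standard code for a client's code, or the code unchanged."""
    fixes = CLIENT_CODE_FIXES.get(client, {})
    return fixes.get(code, code)


if __name__ == "__main__":
    for client, code in [("client_a", "gr"), ("client_b", "gr"), ("client_c", "bs-ID")]:
        print(client, code, "->", normalize_code(client, code))
```

Keeping the mapping per client matters here, because the same nonstandard code (gr) resolves to different languages depending on who sent the file.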

    Multilingual files add to the workload, as they need to be split into monolingual files and reassembled after translation. When we can work with the internationalization teams, we can limit the impact of multilingual source files on the localization process and cost.

    Character, content and context

    When working with software files, we often encounter the issue of having to escape characters. In many cases there will be no escaped character in the source phrase, so deciding how to escape an apostrophe character (') in French, for example, can be a challenge. Should it be: j'ai, j''ai, j\'ai, j\\'ai or j\\\'ai? We often see multiple different examples, even in the same file.
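A quick way to spot mixed escaping before translation starts is to count the apostrophe styles that appear in a file. The following is a rough sketch, not a full parser: the function names are hypothetical, and it treats every apostrophe in the text as part of a translatable string, which will not hold for formats that also use apostrophes as delimiters.

```python
import re
from collections import Counter

# Zero or more backslashes followed by one or two apostrophes.
ESCAPE_PATTERN = re.compile(r"(\\*)('{1,2})")


def apostrophe_styles(text: str) -> Counter:
    """Count each distinct way an apostrophe is written in the text."""
    styles = Counter()
    for match in ESCAPE_PATTERN.finditer(text):
        styles[match.group(0)] += 1
    return styles


def is_consistent(text: str) -> bool:
    """True when at most one apostrophe style is used."""
    return len(apostrophe_styles(text)) <= 1


if __name__ == "__main__":
    sample = "msg1=j'ai\nmsg2=j\\'ai\nmsg3=l''eau\n"
    print(apostrophe_styles(sample))
    print(is_consistent(sample))
```

A report like this does not say which style is correct for the target platform; it only shows that a decision is needed before the files are translated.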

    Within RC files there is usually some language-specific content. For example, the language-specific values in the blocks below (the AFX_TARG identifier, the LANGUAGE statement and the code page) are not usually presented to the translator as translatable text because translation tools will make these changes automatically.

        /////////////////////////////////////
        // English (U.S.) resources
        #if !defined(AFX_RESOURCE_DLL) || defined(AFX_TARG_ENU)
        #ifdef _WIN32
        LANGUAGE LANG_ENGLISH, SUBLANG_ENGLISH_US
        #pragma code_page(1252)
        #endif //_WIN32

        /////////////////////////////////////
        // Japanese resources
        #if !defined(AFX_RESOURCE_DLL) || defined(AFX_TARG_JPN)
        #ifdef _WIN32
        LANGUAGE LANG_JAPANESE, SUBLANG_DEFAULT
        #pragma code_page(932)
        #endif //_WIN32

    RC localization tools are aware of this and will change the content accordingly; however, we have also seen instances where content needs to be introduced into the translated file. This can be addressed, but it requires additional effort, time and cost.

        #The file Contains the property Values. Please read \ as escape characters on the left hand side. For example Test\ Literal= should be read as Test Literal
        #Fri Aug 28 18:32:04 IST 2009
        Start\ Receipt=Start Receipt
        On\ Case\ Pre\ Receipt=On Case Pre Receipt
        Change\ Shipment\ Status=Change Shipment Status
        Blind\ Return\ Receipt=Blind Return Receipt

        #The file Contains the property Values. Please read \ as escape characters on the left hand side. For example Test\ Literal= should be read as Test Literal
        #Fri Aug 28 18:32:04 IST 2009
        French=Fran\u00E7ais
        Start\ Receipt=Proc\u00e9der \u00e0 la r\u00e9ception
        On\ Case\ Pre\ Receipt=Sur pr\u00e9r\u00e9ception de caisse
        Change\ Shipment\ Status=Modification d'\u00e9tat d'exp\u00e9dition
        Blind\ Return\ Receipt=Re\u00e7u retour sans autorisation
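Differences like the extra French= entry in the second listing can be detected automatically by comparing the keys of the source and translated files. The sketch below is a simplified reader written for this illustration (the function names are hypothetical); it handles only the features visible in the listings above, namely comment lines, backslash-escaped spaces in keys and \uXXXX escapes in values, and is not a complete implementation of the properties format.

```python
import re

UNICODE_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def parse_properties(text: str) -> dict:
    """Parse key=value lines, skipping comments and blank lines."""
    entries = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        # Split at the first "=" that is not preceded by a backslash.
        parts = re.split(r"(?<!\\)=", line, maxsplit=1)
        if len(parts) != 2:
            continue
        key, value = parts
        key = key.replace("\\ ", " ")
        value = UNICODE_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), value)
        entries[key] = value
    return entries


def introduced_keys(source_text: str, target_text: str) -> list:
    """Keys present in the translated file but absent from the source."""
    source = parse_properties(source_text)
    target = parse_properties(target_text)
    return sorted(set(target) - set(source))


if __name__ == "__main__":
    english = "Start\\ Receipt=Start Receipt\n"
    french = "French=Fran\\u00E7ais\nStart\\ Receipt=Proc\\u00e9der \\u00e0 la r\\u00e9ception\n"
    print(introduced_keys(english, french))
```

Running a check of this kind on each delivery makes it clear which entries the translation tool cannot produce on its own and therefore need a manual or scripted step.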

    Translating out of context invariably leads to a lower quality product and is the biggest challenge facing linguists as clients try to reduce the localization cost. When all contextual information is stripped out and translated pieces are reduced to an Excel spreadsheet, the challenge for the linguist is considerable. What clients do not realize is that stripping out context may actually be more expensive, as more time has to be spent testing and fixing the translated strings in context after translation.

    In closing, I would like to return to Synaptics, the company I mentioned before. We decided it would be best to create a master list of all English strings that had been translated in one or more languages. The cost of translating the complete master file for every language, even if the strings were not required in a particular language, turned out to be much cheaper than maintaining a separate list of English strings for each language. Once the master files have been translated, Synaptics merges all the languages into multilingual RC files by using automated scripts. Synaptics has been open to implementing suggested changes and eliminating wasted effort in order to streamline the process. As a result of reducing the overall effort, the translations are much quicker than before and cost less.
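The master-list idea can be illustrated with a short Python sketch. It is not the script used for Synaptics, whose details the article does not give; the function names and the sample strings are hypothetical, and the input is assumed to be a mapping from each language to the English strings found in that language's section.

```python
def build_master_list(strings_by_language: dict) -> list:
    """Union of all English strings, in first-seen order, without duplicates."""
    master = []
    seen = set()
    for language in sorted(strings_by_language):
        for english in strings_by_language[language]:
            if english not in seen:
                seen.add(english)
                master.append(english)
    return master


def missing_for(language: str, strings_by_language: dict) -> list:
    """Master strings that a given language section did not contain before."""
    present = set(strings_by_language.get(language, []))
    return [s for s in build_master_list(strings_by_language) if s not in present]


if __name__ == "__main__":
    sections = {
        "fr": ["Tap", "Scroll"],
        "ja": ["Tap", "Zoom"],
    }
    print(build_master_list(sections))
    print(missing_for("fr", sections))
```

With one master list, every language is translated from the same source, so the out-of-sync sections described earlier no longer need to be reconciled language by language.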

    This centralized approach with a repository of common source language content worked for Synaptics. According to the recent Aberdeen study, that kind of centralization, paired with standardized terminology and a closed-loop review process, is crucial to achieving higher translation performance.

    Advanced Leveraging Translation Memory

    30% more matches than conventional TMs (subsegments and paragraphs)
    See the context of identified matches
    See how subsegment matches were previously translated
    Create, extract, manage, and share multilingual terminology in real time

    Organizations Select MultiTrans To:
    Complete their TMS solution
    Complement their existing TMs
    Complement their MT solution
    Transform their CMS into a GMS

    The Language Technology Experts
    Governments | Enterprises | Language Service Providers
    USA/CANADA: 877.725.7070
    EUROPE: +32 (0) 2.213.00.20
    www.multicorpora.com


    Beth Walsh

    Aberdeen Group's recently released "Translating Product Documentation" study identifies how top companies effectively manage translation and localization efforts while reducing costs and increasing efficiency. Based on the experiences of nearly 200 companies, the study was done as a follow-up to "Documentation Goes Global," a report completed in the spring of 2008 that determined most companies were facing translation cost increases from 18% to 32% due to increased volume and language requirements.

    In the fall of 2009 Aberdeen also conducted research on "Technical Communications as a Profit Center," which determined that technical communications departments provide significant customer-facing value by publishing product documentation online. Aberdeen analysts believed it was important, given the two previous studies' revelations, that they look at how top-performing "best-in-class" companies were managing to find the right balance between cost and quality in the localization chain.

    The results of the newest study reveal that companies that are most successful in managing their translation and localization efforts maintain consistently lower costs, are more efficient with personnel resources, and produce higher quality work than their competitors. Best-in-class companies save 240% over their competitors in translation expenses and 630% more in localization costs. They reduce the time required to complete translation projects by 30% and translate content into 48% more languages than their competitors. In addition, they complete 88% of their translation projects by targeted deadlines, and 91% come in under budget. Best-in-class companies translate into about 11 languages on average.

    "Our research clearly demonstrates that top companies, focused on ROI, effectively manage time and expenses involved with translation and localization projects," said David Houlihan, senior research associate with Aberdeen's Product Innovation and Engineering practice. "We found that leading companies utilize integrated translation environments and realize performance improvements more than three times those achieved by their competitors. This high level of productivity comes with no sacrifice to the quality of work and may, in fact, improve quality."

    Quality localization can have significant benefits for the enterprise, as the prior Aberdeen research showed that high-quality documentation contributes as much as a 41% increase in customer satisfaction scores and a 41% reduction in inbound calls to customer service organizations.

    How do they achieve these great results? The capabilities reviewed in Aberdeen's research are divided into five core areas: process, organization, knowledge management, technology and performance management. Significant productivity drivers are increased control and transparency over the entire process, closed-loop processes that promote internal and external accountability, and automated reuse of content.

    Leading performers are much more likely than their competitors to assign a dedicated project manager to manage the total translation process, institute a formal review process for translated documents, and control content with terminology management and the use of integrated translation management solutions. The highly specialized and irregular nature of translation work prevents many companies from maintaining a standing translation staff, and much of the work is outsourced. However, transparency facilitated by comprehensive management solutions is an aid to greater internal ownership over even outsourced translation and localization resources. Without this transparency it is difficult for companies to understand how to improve their translation processes, either in terms of operational execution or quality of output.

    Increased reuse of translated content offers a compelling value proposition. As such, it is the initiative pursued most often by study participants, at about 45% of all respondents. However, what stands out in best-in-class companies is the process of incremental translation, or creating topic-based authoring in source language content modules; this opens up the opportunity of reuse significantly, potentially leading to tremendous savings. Combined with standardization of terminology available to all translation workers and a formal closed-loop review process, it ensures consistency in both quality of translation and operational performance of partners. When the review is done by a native speaker, in particular, it enables companies to preserve the intended meaning to better serve their customers.

    Technology is being used to support internal ownership and accountability as well as to gain cost and time savings, leading to higher efficiencies. Integrated translation management solutions are proving to be an emerging trend among best-in-class companies. The use of these solutions gives best-in-class performers a considerable advantage by providing them with a centralized repository for translated content and centralized control as well as easy accessibility of approved terminology by internal and external workers. This single source for all multilingual content further enables reuse, maintains version control and eliminates redundant rework across the localization chain. Aberdeen advises all levels to actively assess translation quality through formal ranking and asserts that centralized processes will continue to improve results.

    Across Systems, which was a major sponsor of the research, found the results confirmed the approach they advise customers and prospects to take. "Aberdeen's research identified that the integration of project and terminology management into translation management solutions is an emerging practice of best-in-class companies," said Daniel Nackovski, president of Across Systems, Inc. "We were gratified to find the study supports our strategy to include project and workflow management, a translation memory, a terminology system and more in a unified work environment."

    As reported in the Aberdeen study, about 48% of the best-in-class companies use translation management software solutions, and 28% have an integration with project management, both of which are emerging practices. However, the difference in adoption is high, with these top companies using it more than two times the norm. This means that even though it is an area where still less than half of companies are taking action, the great majority of those that do are reaching the top tier of performance, proving it is a highly useful practice.

    Aberdeen Research Study Reveals Practices of Top-performing Companies

    Figure: Technology use by best in class. (Bar chart comparing best-in-class, industry average and laggard companies across process/project management, translation memory, terminology management system, translation management system and machine translation.)

    Beth Walsh is the vice president of Clearpoint Agency.


    Evaluating Emerging Language Technologies

    Vadim Berman

    Fashion is not only for clothing and shoes. Trend following also can be applied to the world of technology. In the silicon gold rush, some technologies are more favored by wanna-be inventors than others. YouNoodle.com, a barometer of innovation, lists 140 startups tagged with the expression VoIP (voice over internet protocol), 669 startups containing the word communication, 41 startups working on surveillance, and a whopping 849 startups tagged with natural language processing (NLP). The once-obscure field of language technology seems to be getting hot.

    It makes sense that this is happening now. An individual can travel around the world within a couple of days. A global communication network has been established to capture bits and pieces of reality in clear video and audio signals that can be stored forever. Business processes are mostly digitized. Now users want machines to understand human language. Buzzword addicts call all this Web 3.0.

    But human languages have their own logic, which is nothing like the strict true-or-false machine logic, and machines have their own different ways of making sense of human language. While regular business logic can manifest itself via labels, textboxes and the like, linguistic logic is largely invisible. You put text in, you get text out. It either matches your expectations or it does not. But 99.999% of the inner workings of this programming iceberg is under water. Linguistic software, while not appearing very high-tech on the surface, is a mind-boggling array of wires, cogs, counterweights, pulleys and buttons, designed to run by itself. Like all complex mechanisms, it is prone to breaking. If you are shopping in this area, you have to either dive into this insanely complex world or know the tricks of the trade. Come to think of it, the tricks of the trade are mandatory in any case; no one has the time to check everything.

    Usability: capability and desire

    Looking at the exhibits in historical museums, one cannot but admire the craftsmanship of the old masters. Kitchen utensils, furniture and wheel-lock guns are decorated with complex ornaments and precious stones. However, more practical and down-to-earth minds may say this is a waste. The gargoyles and the Greek deities don't add one bit of usability to the tool. The best example of an incredible effort with little practical use is the wooden pocket watches of the Russian Bronnikov brothers. While magnificent and unique in the way they are made, these chronographs did not accomplish much on the practical side of things: a pocket watch is still a pocket watch, and wood does not last as long as steel. The first and simple test, if you are looking to invest in anything, is to ask if it is usable and practical.

    The recently surfaced semantic search engines seem to be questionable in that respect. Many critics point out that in practice they do not yield a much improved experience over the tried-and-true keyword search. Try to assess the market realistically, and see if the complexity and the costs are worth the niche they are going to fill. Common sense applies, as usual. Avoid wishful thinking.

    If you are looking for a tool to accomplish a certain task, are you sure that this beautiful and intelligent masterpiece can handle it well? Consider the following example. You are building a software package to search content in a foreign language. Some people take a straightforward approach: apply machine translation (MT) to the content, index it and connect to a plain search engine. Can it work? Maybe, but MT is yet to become accurate enough to be reliable for some types of language pairs, such as Chinese-English or Japanese-French. With an accuracy of 70% to 80%, nearly every third to fifth word is incorrect, which may result in arcane, unexplainable search results.

    The same principle applies to a crude approach in building speech-to-speech MT systems. Take two reasonably good systems, text MT and speech recognition. Let's assume both have an accuracy of 0.9. When linked together, the complete solution has an accuracy of 0.9 times 0.9 = 0.81. If the MT system is rule-based, it will not take kindly to the lack of punctuation in the text input, and the accuracy is likely to degrade further. This means that we'll have a frustrating tendency to get every fifth word wrong.
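The arithmetic above generalizes to any chain of components. The sketch below assumes, as the example does, that each stage's errors are independent, so that accuracies multiply; real systems may behave better or worse than this. The function names are hypothetical.

```python
import math


def pipeline_accuracy(stage_accuracies: list) -> float:
    """Overall accuracy of chained stages, assuming independent errors."""
    return math.prod(stage_accuracies)


def words_per_error(accuracy: float) -> float:
    """Average number of words processed for each incorrect word."""
    return 1.0 / (1.0 - accuracy)


if __name__ == "__main__":
    combined = pipeline_accuracy([0.9, 0.9])  # speech recognition, then text MT
    print(f"combined accuracy: {combined:.2f}")
    print(f"one error about every {words_per_error(combined):.1f} words")
    for accuracy in (0.7, 0.8):
        print(f"{accuracy:.0%} accurate: one error about every "
              f"{words_per_error(accuracy):.1f} words")
```

Adding a third stage at the same 0.9 accuracy would bring the product down to roughly 0.73, which is why each extra link in such a chain deserves scrutiny.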

    On the other hand, with stronger emphasis on the underlying algorithms, some aspects do not have to be scrutinized as much as they are in other software. User interface is not that difficult to change, so let it be even if you don't like it. Stability is paramount in software, but in the early stages it does not have to influence your decision too much.

    Scalability: from toy data to real world

    But how should the results be checked when a product is still in development? Language technologies have a distinctive trait that makes them so insanely hard. While a normal database application may deal with a small or moderate amount of data, linguistic applications by definition must deal with a potentially infinite set of words comprising a language and endless combinations within this infinite set.

    A newly born application doesn't know much of this infinite set. It starts off with a small portion of data, and this is usually why the examples are limited. They all may work great, but there are just ten or twenty of them. This is normal, but if a technology is limited to this toy world, it is not of much use. Most developers understand this. The question is, however, how they plan to enlarge the scope of the input. Learning from corpora? Importing machine-readable dictionaries? User input? Crowdsourcing?

    There are no good and bad methods, just suitable and unsuitable ones, or well-planned and poorly planned strategies.

    Evaluating Emerging Language Technologies
    Vadim Berman

    Try to check whether this data acquisition method has been tried already. Apply common sense. Does it work? What is required to make it work on a production level? Finances, personnel, linguistic resources? Does it require 50 expensive, highly skilled computational linguists to build a dictionary manually, or petabytes of high-quality corpora for a rare language? Try to assess the feasibility and suitability of the resources necessary for growth. Even if a spaceship can carry you to the stars, it is difficult to use its potential if the fuel must be pure gold.

    This, however, does not mean that if the developer is unable to meet expectations, the requirements are unrealistic. The budget might be tiny, as creating new technologies is not that glamorous or profitable at the beginning of a venture. Be understanding, but analytical.

    Real-world examples

    Did you ever wonder why so many natural language search engines and speech recognition packages demonstrate their capabilities by looking for either pizza or sushi? While this might just reflect similarities in the lifestyle and culinary preferences of the linguistic crowd, the main reason may be quite prosaic. These examples are perfect for demonstrating systems that claim to be production-ready.

    While humans must strain their brains to understand or spell long, rare, exotic words, machines have a different problem. The main problem is ambiguity. Epidermis or uranium do not have too many interpretations, and so they are easy for machines. On the other hand, words with numerous meanings such as put or set are a nightmare for every NLP package. Human readability is not the same as, and is often the opposite of, machine readability. However, epidermis is a specialized term and might not be present in a small dictionary. Pizza and sushi, on the other hand, are quite common, yet still unambiguous. These words are targets that are quite easy to hit.

    Try tougher tasks. Is this about food? Try "steak" for a speech input, or "lamb with sage" for MT (yes, the latter often yields "lamb with a wise man" in statistical MT systems). See how well the system makes complex decisions. At a more advanced stage, don't forget to introduce noise, either literally for speech or figuratively for text.
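    For text input, one simple way to introduce noise is to corrupt test sentences with typos before feeding them to the system under evaluation. The sketch below is a hypothetical helper, not part of any product mentioned here: it swaps adjacent letters at a configurable rate, and it uses a fixed random seed so that a test run can be repeated.

```python
import random


def add_typos(text, rate=0.1, seed=0):
    """Swap adjacent letters at roughly `rate` of the eligible positions."""
    rng = random.Random(seed)  # seeded so the same noise can be reproduced
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        both_letters = chars[i].isalpha() and chars[i + 1].isalpha()
        if both_letters and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the swapped pair so each letter moves at most once
        else:
            i += 1
    return "".join(chars)


sentence = "lamb with sage"
for rate in (0.0, 0.2, 1.0):
    print(rate, "->", add_typos(sentence, rate=rate))
```

    Raising the rate step by step and watching where the output stops being usable gives a rough picture of how gracefully a system degrades.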

    Don't overdo your attempts to make the system fail, though. Accents and regionalisms are only relevant if the system is to be deployed in the markets where the accents and the regionalisms come from. Furthermore, if a system is good in principle and offers some customization capabilities, the support for regional dialects can be added externally. Don't bother to check domains you know you'll never use or ones that the system is not built for. I remember a customer testing an MT system by trying to translate a fragment of an Agatha Christie thriller. This is not guaranteed to work well, for good reason. Language engineering is meant to handle mundane tasks, not to produce literary masterpieces. According to a classically apt comparison, it is similar to assessing the performance of an industrial robot by making it dance Swan Lake.

    Extensibility

    What if the system seems to be a good basis for what you are looking for but does not have the exact functionality that you need? Due to the complexity of language engineering, the choice of linguistic tools is small. There is rarely a wide array of choices, so an almost-suitable product may be the only option.

    Then, in addition to other criteria, you need to see how fast the product can be adapted to your needs. The answer is often "in no time. In fact, the extension is almost there, 95% complete." Don't fall for that one. Even though the choice of products and suppliers is scarce, the language engineering job market is even scarcer. These guys may not be so sure themselves. The feelings of a developer for his brainchild are similar to those of a mother for her child. Neither is usually the best source of an objective opinion.

    Of course, if the extension is trivial and does not touch on the linguistic parts, there is no reason to worry. However, if it touches the core engine or requires implementing or modifying some linguistic logic, the feasibility should be carefully analyzed. Common sense applies, like everywhere else. Try asking what the plan is and whether a similar modification has been done before. Another useful question is "What can go wrong?" "Nothing" is not a good answer, especially if it comes immediately.

    Doctors and shamans

    Technology might be the heart of the offering, but this is not all. If a person has a strong, well-functioning heart but severe issues with other vital organs, one can't call it perfect health. Similarly, the management and other relevant parts of the team also must be capable of delivering solid results. Experience, a successful track record, social standing, reputation, hard work: there is no escaping the basics.

    It might be more difficult with a startup. Usually, the odds are against the gold diggers, so startup people either have a gambling trait or are not experienced enough to understand how long and difficult the path before them is. Young entrepreneurs usually have more drive than their more experienced counterparts, but they have other traits as well, and only the future (or maybe also YouNoodle.com) will tell whether it is a winning combination.

    More often than one would have thought, new technologies are presented by people of questionable honesty and professionalism. With the abundance of strange characters and plenty of legitimate hard-working garage inventors, some shamans are successfully posing as real doctors. There is no recipe for spotting a scam, and even when the technology itself is legitimate, peek under the hood.

    There is no place for impractical visionaries at the steering wheel. They may contribute to the main idea and maybe even the initial architecture, but they are likely to doom the enterprise no matter how good the technology or the prospects are. Driving a car or flying a plane does not allow for chasing birds or stars.

    Should techies or salespeople run a company? Normally, salespeople, but I believe language engineering is a bit different. It is a small, tightly knit community where many people arrive from other industries. Its idiosyncrasies are so distinctive that an external observer might doubt these people actually live on the same planet as the rest of mankind. Mainstream salespeople might not be able to figure out this strange world, let alone explain the small technicalities to a potential customer. Imagine that you are buying an electric appliance, and the salesperson tells you, "Well, the interface is very intuitive. I know that there are three green buttons, one red bulb, and a lever. I'm not sure what they do. I think you need to pull the lever, but to make sure, I'll just catch our main techie and he'll tell you how to make it work." Apparently, with salespeople like these, there are no sales. I've seen it happen, too.

    Finally, as the saying goes, if something seems too good to be true, it probably is. Don't struggle to find overlooked diamonds; look for more realistic copper, nickel or silver, and you won't spend your efforts and resources on fool's gold.


    The multilingual services market has received a series of innovations through computational linguistics or natural language processing (NLP), a multidisciplinary area that encompasses artificial intelligence, information technology and linguistics, using computer processes to handle human language. Artificial intelligence is the field of research within computer science that studies how machines can think, simulating the human capacity for intelligence and solving problems. As a result of the integration of these sciences, research has been providing important applications for translators' work, such as search tools, spell-checkers and voice recognition, as well as tools in computer-aided translation (CAT), including translation memory (TM), terminology management and machine translation (MT). These projects aim to develop a search mechanism for the most common terms, by segmentation, thus eliminating repetition and resulting in a more natural translation. The goal of these artificial intelligence researchers is to develop CAT tools and MT that can simulate the human ability to think and solve problems.

    Corpus linguistics studies language in use, investigating language through observation of large quantities of authentic data contained in the corpus, which is a representative set of texts on a particular area, electronically organized to enable searches using specialized search tools. Corpus linguistics considers language as a probabilistic system. That is, there are many possibilities for an expression in language, but not all are equally frequent.

    Research in this area advanced in the 1980s with the widespread use of personal computers, which led to the increased availability and accessibility of corpora and processing tools, helping to strengthen research in the field and reinforcing the fact that this area of research is and always has been closely related to technology. Since then, research on the subject has contributed to translation in several ways. Following the most commonly used conventions of a language results in a translation that flows more naturally and is more faithful to the native language. Also, the majority of MT systems are based on a corpus composed of bilingual texts (original and translated).

    The computational tools used by corpus linguistics provide a mechanism that collects, stores and analyzes linguistic data, the so-called corpus. This data is used as research material that can help elaborate theories about language functionality.

    Some programs list words according to the frequency with which they occur in the corpus. Others are called concordancers and allow specific word searches in a corpus, pulling up a comprehensive list of phrases that shows the contexts in which the word has been used. Tagging is also commonly used to analyze corpora automatically and produce codes or tags that identify the data belonging to a particular morphosyntactic or syntactic category. This area of research has contributed to improving hybrid MT software, through its theories on linguistic variables, directly influencing the translation so that the final text is as close as possible to the original one. The MT systems are based on a corpus composed of bilingual texts (original and translated) and a database with systems of rules and statistics. Technological innovations can therefore speed up the translation process, resulting in better-quality MT, with the human translator acting as a sort of validator of the MT data.
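    The two basic tools mentioned above, a word frequency list and a concordancer, can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the tiny two-sentence corpus, the simple lowercase tokenizer and the function names are assumptions made for the example, and a real corpus tool would handle far larger data and richer tokenization.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def frequency_list(texts, top=5):
    """Return the `top` most frequent words with their counts."""
    counts = Counter(tok for text in texts for tok in tokenize(text))
    return counts.most_common(top)


def concordance(texts, keyword, width=3):
    """Keyword-in-context: (left context, keyword, right context) per hit."""
    hits = []
    for text in texts:
        tokens = tokenize(text)
        for i, tok in enumerate(tokens):
            if tok == keyword:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                hits.append((left, tok, right))
    return hits


# Illustrative toy corpus, not real research data.
corpus = [
    "Set the oven to a low heat and set the table.",
    "The chef served lamb with sage on a set menu.",
]

print(frequency_list(corpus, top=3))
for left, word, right in concordance(corpus, "set"):
    print(f"{left:>15} [{word}] {right}")
```

    The concordance lines make the different uses of an ambiguous word such as "set" visible side by side, which is the kind of evidence a translator or terminologist looks for.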

    This is a valuable contribution when we consider that the first technological advance used to support translation work was the development of MT, created by the Americans in the 1950s to spy on the Russians during the Cold War. These software components were capable of analyzing sentences based on grammar, giving rise to very unnatural, sometimes meaningless translations that had to be corrected and validated by a human translator. Today, the most famous MT system worldwide is Google's, which shows that, at least currently, the results of MT cannot be satisfactory without human intervention.

    Another technological contribution was the development of CAT tools, which gave rise to software products such as Trados, Déjà Vu and Wordfast. These tools, besides considering grammar, use a TM that enables terms used in a text to be standardized and added to a glossary, making quality control in translation easier. These tools are designed to support the translator's work by, for instance, storing previously translated segments in a TM so that when the same segment of text appears again, the software brings up the previous translation used for that phrase.

    Each technological advance brings rumors that the days of the professional translator are numbered. However, the work of human translators continues to be essential. Technology is no substitute for human work, but is rather a tool to help speed up certain types of translation work.
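    The translation memory behavior described above, bringing up the previous translation when the same segment reappears, can be sketched as a simple lookup table. The class below is a hypothetical illustration and does not reflect how Trados, Déjà Vu or Wordfast are implemented. The fuzzy matching of near-identical segments is an addition not described in the text; it uses the similarity ratio from Python's standard difflib module, and the 0.75 threshold is an arbitrary assumption.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy translation memory: stores source segments with their translations."""

    def __init__(self):
        self._segments = {}

    def add(self, source, target):
        self._segments[source] = target

    def lookup(self, source, threshold=0.75):
        """Return (stored source, target, score) for the best match, or None."""
        if source in self._segments:  # exact repetition of a known segment
            return source, self._segments[source], 1.0
        best = None
        for stored, target in self._segments.items():
            score = SequenceMatcher(None, source, stored).ratio()
            if score >= threshold and (best is None or score > best[2]):
                best = (stored, target, score)
        return best


tm = TranslationMemory()
tm.add("Press the green button.", "Appuyez sur le bouton vert.")

print(tm.lookup("Press the green button."))  # exact match
print(tm.lookup("Press the red button."))    # close match offered for editing
print(tm.lookup("Pull the lever."))          # nothing similar enough: None
```

    A close match is only a suggestion: the stored translation still mentions the green button, so the translator has to review and adapt it.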

    Terminology is one of the areas that may be significantly influenced by corpus linguistics, which has been developing vocabularies using its own methodology. Glossaries are prepared from a corpus, creating a kind of filter so that the vocabulary shows only terms contained in the corpus, compiled according to specific criteria. As a result, the glossary contains the most commonly used terms for a particular area of specialization. Another characteristic of glossaries created by corpus linguistics is that they are rich in authentic examples extracted from the corpus and other information that can facilitate the translator's task. Therefore, the type of translation that can benefit most from corpus linguistics is technical translation, which focuses on various areas of specialization from a technical or scientific standpoint. This is a type of translation that involves a high degree of terminology research and the development of glossaries to ensure the use of standardized terminology in the document in question, and also in any future projects carried out on the same subject.
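    A corpus-derived glossary of the kind described above can be approximated by keeping only the words that recur in the corpus and attaching the sentences in which they occur. The sketch below is a simplified illustration: the stopword list, the minimum-count criterion and the three sample sentences are assumptions chosen for the example, and it handles single words only, whereas real terminology work also deals with multiword terms.

```python
import re
from collections import Counter, defaultdict

# Small illustrative stopword list; a real project would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "at",
             "with", "for", "until"}


def build_glossary(sentences, min_count=2, max_examples=2):
    """Map each recurring term to its frequency and authentic example sentences."""
    counts = Counter()
    examples = defaultdict(list)
    for sentence in sentences:
        seen = set()  # avoid storing the same sentence twice for one term
        for word in re.findall(r"[a-z]+", sentence.lower()):
            if word in STOPWORDS:
                continue
            counts[word] += 1
            if word not in seen and len(examples[word]) < max_examples:
                examples[word].append(sentence)
                seen.add(word)
    return {
        word: {"count": count, "examples": examples[word]}
        for word, count in counts.most_common()
        if count >= min_count
    }


# Toy cooking corpus, invented for illustration.
sentences = [
    "Simmer the beans with garlic until tender.",
    "Add the garlic and simmer for ten minutes.",
    "Serve the beans with rice.",
]

for term, entry in build_glossary(sentences).items():
    print(term, entry["count"], entry["examples"])
```

    Because every entry comes from the corpus itself, the glossary acts as the filter the text describes: a term that never occurs in the source texts cannot appear in it.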

    Corpus Linguistics and the Translation Process
    Thiana Donato

    Both the reference material and the research material that have led to the development of computer tools can speed up the technical translation process and provide gains in terms of quality, by giving the translator not only a better knowledge of the specialized terminology of the industry at which the translation is aimed, but also the support of multifunctional software, like the programs that have been launched in the multilingual services market.

    In Brazil, for example, research in corpus linguistics is still in its infancy, but it has been gathering strength. Brazilian research in this field is carried out by interest groups such as the COMET project (Corpus Multilíngue para Ensino e Tradução), developed together with the modern literature department of the Faculty of Philosophy, Literature and Human Sciences at the University of São Paulo (USP). Members are mostly graduate students and volunteers.

    An example of the contribution of corpus linguistics is CorTrad, a project developed by USP, Linguateca and NILC, which applies a methodology proposed by corpus linguistics that offers new functionalities for translation, such as new search types. The project also enables different versions of the same translation to be compared and specific structural components to be consulted. CorTrad is available on COMET's website. One of its main advantages is its efficient search mechanism, which refines the search into three different subcorpora, including genre, text type and other specific characteristics. So far, this project has produced two important reference materials in the areas of Brazilian cuisine and receiving guests. What makes this project different is its presentation of a parallel corpus that makes it possible to compare the original with the translation.

    Another contribution is CorTec, a technical corpus for Portuguese-English that enables terminology comparisons. It is divided into 14 subcorpora segmented into specialized areas. These studies are recent and are still in the initial stages; however, their relevance needs to be acknowledged. The development of language technology is extremely dependent on these studies, which means that the growth of the translation market depends on investments in this area of research.

    Some TM systems have already received new functionalities derived from corpus linguistics methodology. Although it would be incorrect to say that statistical MT uses some type of corpus linguistics, it is true that these methods and techniques can help computational linguistics develop new mechanisms for TM systems.

    Currently, corpus linguistics is being developed in various linguistic research centers around the world. One of the major centers is in Great Britain, with projects being carried out at various universities in Birmingham, Brighton, Lancaster, Liverpool, London and other cities. Research in British institutions has contributed to the theorization of corpora and other support materials in various areas. In the Scandinavian countries there are also active centers dedicated to this research. Corpus linguistics appears to be more widespread in Europe than in other parts of the world. In the United States, corpus linguistics exists but is more modest. North American researchers are more engaged in projects involving NLP, which, although closely related to computer science and sharing various characteristics with corpus linguistics, is treated separately.

    A new trend in the worldwide corpus linguistics scenario is investment by private companies, through partnerships between companies and universities. The business world has a great interest in studies in this area of knowledge for commercial purposes such as the automated processing of texts, computerization of databases, and the creation of intelligent voice and data management systems.


    Our global landscape presents opportunity everywhere, in business, educational, social, local, travel and humanitarian efforts, just to name a few. If you look around, you'll find endless communication methods using electronic language enablement. What's great about all this opportunity is the vast number of different ways you can apply technology to educate, foster peace or help those in need.

    Multilanguage technology falls into one of three camps: human translation, machine translation (MT) or hybrid solutions. Human translation technology is handled today by systems that use contact, billing and workflow management to shuffle translation jobs to a large group of online translators. These systems are often accessible through an application programming interface (API).

    With an API, you can access the core translation methods and bubble them up to your own user interface. MT systems are also accessible by API. Hybrid translation technologies are less common because many of these are proprietary and/or protected by patents. Hybrid solutions allow users to take advantage of both human translation and MT from within one application.
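    One way to picture a hybrid solution is as a thin layer that calls an MT service first and hands a segment to human translators only when the machine result looks unreliable. The sketch below is purely hypothetical: the two callables stand in for whatever MT and human translation APIs an application actually uses, and the confidence score and the 0.8 threshold are assumptions, since not every MT service reports such a score.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Translation:
    text: str
    source: str  # "mt" or "human"


def hybrid_translate(
    segment: str,
    machine_translate: Callable[[str], Tuple[str, float]],
    request_human: Callable[[str], str],
    min_confidence: float = 0.8,
) -> Translation:
    """Use the MT result when it is confident enough, else route to a human."""
    text, confidence = machine_translate(segment)
    if confidence >= min_confidence:
        return Translation(text, "mt")
    return Translation(request_human(segment), "human")


# Stand-in services for demonstration only; real ones would call remote APIs.
def fake_mt(segment: str) -> Tuple[str, float]:
    known = {"hello": ("bonjour", 0.95)}
    return known.get(segment, (segment, 0.2))


def fake_human(segment: str) -> str:
    return f"[queued for human translation: {segment}]"


print(hybrid_translate("hello", fake_mt, fake_human))
print(hybrid_translate("lamb with sage", fake_mt, fake_human))
```

    Passing the services in as parameters keeps the routing logic independent of any particular vendor, so either side can be swapped without touching the user interface.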

    From these three core technologies any number of great ideas can emerge. For example, www.dotsub.com allows users to upload video and then work as a giant crowdsourcing community to translate subtitles into languages around the world.

    Ingredients for a new technology

    In our case, an idea emerged from a void we observed: a lack of social networks actively connecting members regardless of their native language. We set out to create www.mojofiti.com, the basis for our "how to" explanation. Our goal is to create a place where internet users from around the world can gather to publish blogs, send messages and interact socially, with all that interaction invisibly translated behind the scenes so that readers can traverse the landscape of the content.

    Open-source technology is a kind of goodness that allows ideas to develop rapidly, reusing modular programming that's already been done by someone else. It's like going to an assembly line and picking all the parts needed for your creation for free.

    As great as that sounds, those parts still require considerable programming and thinking to coalesce into software that works the way you want. In the case of www.mojofiti.com, getting things to scale consistently became a focus of our thinking and developmental investment. In some cases it's like coaxing a square peg into a round hole or creating an efficient custom adapter kit.

    For the blog publishing system, we chose WordPress MU for the following reasons:

    Search engine optimization (SEO) advantage: Over the years WordPress has done a number of savvy things to play nicely with Google Search. Permalinks and sitemapping, for example, gave us confidence that our users would have a first-class chance of experiencing competent SEO throughout the world.

    Worldwide: WordPress is localized in over 50 languages. This meant that users from around the world would have access to robust publishing tools from the onset of our development.

    Open source: The WordPress philosophy is something we support and were happy to take advantage of. Here's a link to one of the videos that influenced our decision: http://wordpress.tv/2009/10/13/matt-mullenweg-wordpress-gpl

    At the time we were concocting our technology recipe, WordPress was developing a new way to link users together called BuddyPress. BuddyPress was in beta and felt more like duct tape and spider webs than anything else, but we believed in the potential and the track record of WordPress and so decided to develop with it. During our initial development, we saw a great leap forward with the release of BuddyPress 1.1.

    What BuddyPress did for us was allow our multilingual users to link to each other and share things such as e-mail and short-format communications called "wires."

    We tried a number of ways to language-enable our WordPress MU + BuddyPress environment. After several attempts with various plug-in technologies, we were able to get a modified version of Transposh (http://transposh.org) and Google's API to work.

    We chose www.softlayer.com for hosting because it can scale in small increments. This meant that we could grow our hosting in small steps as we grew, so we didn't have to pay for much unused hosting space during the development and growth stages.

    Our first and favored project management (PM) solution, while not open source, is Basecamp (http://basecamphq.com) from 37signals. We began by thinking this would prove to be our end-all solution but found the product best suited to file management, high-level PM duties and overall business goals management. For a nominal fee, we handle graphic source files, requirements documentation, high-level business goals and projects with Basecamp.

    Creating Your Own Multilingual Technology
    Dennis Wakabayashi and Chris Golaszewski

    With high-level information being tracked in Basecamp, we wanted a separate solution to track the details of our development efforts: bug fixes and iterative feature requests. We decided to use Mantis (www.mantisbt.org) for this purpose. The Development Manager translates high-level business goals into digestible tasks for insertion into Mantis. We then try to group related requests for assignment to specific developers. Each developer sets the tickets to Resolved upon completion, and a Change Manager closes them once they are pushed to our live environment.

    Human resources

    We utilized a senior engineer who was able to strategically lead the hosting, system administration and programming development as a whole. Mojofiti at any given time has one manager coordinating the human, software and financial resources related to a system of production. The production system manages bug tracking, new development and priorities.

    Our programming team is composed of several PHP, MySQL and WordPress-specific developers. Typical duties include everything from printing a user's selected primary language to a template to creating a BuddyPress-compatible plugin that allows users to request and save crowdsourced translations. A few overall challenges faced at the start of this project included a URL rewriting override for the default method provided by BuddyPress in order for Transposh to work as expected; a handful of smaller compatibility issues between Transposh and this WordPress MU/BuddyPress environment; a modification to the default blog creation process in order to pick up a default set of Transposh settings per blog; and a change to the links in the default Transposh output to work within the multiuser setup.

    We have designers who contribute creatively to the user experience. Most of these people have five or more years of experience, which helps a lot when you want to cycle through things frequently and continuously.

    Technological development

    So after we determined our ingredients, we set out to manifest the idea. Our process went something like this.

    At the Business Requirement Documentation stage we gather user experience, customer benefits and resource availability information. We evaluate these items together and distill a potential combination that exhibits a strong opportunity for the users of the software to benefit. In the case of www.mojofiti.com, it amounted to users publishing blogs, with those users and blogs united into a social network system with system-wide communications without language barriers. Costs were to be less than $500,000.

    At the Production Scheduling stage, the production manager maps out the development. Then comes staff review and approval. The staff gets together and reviews the work's business requirements and the proposed schedule. If approved, the resources are allocated and the work begins. Next comes user interface design development, where the teams design the screens associated with the software. Then again comes staff review and approval, and the team collectively determines whether we are on target to meet the business goals.

    Software/programming/development is the next logical step. PHP/MySQL programmers get to work. Business goals are redefined as digestible and logical development tasks. The development team determines the appropriate technical solution and executes. Development cycles iterate until the business goals are met. Then come quality assurance and testing of the software from a technical perspective, which sometimes includes an external focus group team or service to double-check the functionality and user experience. There's another staff review and approval and then the closed beta launch. The closed beta step allows a larger group of our teams, both internal and external, to test the production. Sometimes we include our public relations teams, advertising agency and investors. A punch list of items to be completed before launch is reviewed, prioritized and worked on. Final staff review and approval take place before launching the beta to the public.

    At the open beta step, the public gets to test the software and weigh in on any updates, bugs, modifications or changes that are to be considered. Feedback from the public beta is then reviewed and prioritized by staff. If all items are done and approved, we move to launch. The internal launch includes contingency planning, server configurations and release schedule. At this time all things move live to the public servers. Finally, there's the public launch, and the files are transferred to live servers. Refinements become a versioning system where we launch new updates weekly or monthly to sites.
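    The staged process described above can be pictured as an ordered list of stages with a review gate between them. The sketch below is only an illustration in Python: the stage names are paraphrased from the text, the function name next_stage is hypothetical, and treating every transition as gated by staff approval is a simplifying assumption, since the text names reviews only at some of the steps.

```python
# Hypothetical sketch of the staged process; stage names are paraphrased from
# the text, and the gating logic is an assumption made for illustration.

STAGES = [
    "business requirements",
    "production scheduling",
    "user interface design",
    "software development",
    "quality assurance",
    "closed beta",
    "open beta",
    "internal launch",
    "public launch",
]


def next_stage(current, approved):
    """Advance to the next stage only after staff review and approval."""
    index = STAGES.index(current)  # raises ValueError for an unknown stage
    if not approved or index == len(STAGES) - 1:
        return current  # stay put: rework is needed, or we have launched
    return STAGES[index + 1]
```

    For example, next_stage("closed beta", True) moves on to "open beta", while a failed review leaves the project at its current stage until the outstanding items are resolved.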

    Once your software is in a place so that users can start working with it, get it online. We recommend a beta label to inform users that the site is a work in progress. During this period, have users start to tell you what's working and what's not, then identify and fix bugs and improve the site. This process is probably the most efficient way to develop as it gives you real-world insight into how to manage your ongoing investments to get the best results for your users.

    Iterations of the site from the design team.


    www.star-group.net

    More than 25 years of knowledge and experience at your fingertips

    Write, reuse, translate, publish, distribute, collaborate, share

    WebTerm (Web-based Terminology): Global and collaborative terminology management

    FormatChecker (Data Quality): Reduced revision costs through high formatting quality

    SPIDER (Automated Publishing): Fully automated cross-media publishing in all languages

    i-KNOW (Interactive Communication): Successfully communicating information worldwide

    Transit (Translation and Localization): Computer-assisted translation with Translation Memory

    GRIPS (Information Management): Conquer the challenges of global markets with information management

    MindReader (Authoring Assistance): The delta principle in technical authoring

    STAR CLM (Corporate Language Management): Successful product communications

    TermStar (Terminology): Targeted communication using corporate language

    Modular concepts in the palm of your hand

    CLM: Corporate Language Management

    TransitNXT: Translation Memory

    TermStarNXT / WebTerm: Terminology Management

    GRIPS: Corporate Product Information Management

    MindReader: Context Sensitive Authoring Assistant

    STAR James: Process Control and Automation

    http://www.star-group.net/

    http://www.madcapsoftware.com/


    This guide is a component of the magazine MultiLingual. The ever-growing easy international access to information, services and goods underscores the importance of language and culture awareness. What issues are involved in reaching an international audience? Are there technologies to help? Who provides services in this area? Where do I start?

    Savvy people in today's world use MultiLingual to answer these questions and to help them discover what other questions they should be asking.

    MultiLingual's eight issues a year are filled with news, technical developments and language information for people who are interested in the role of language, technology and translation in our twenty-first-century world. A ninth issue, the Resource Directory and Index, provides listings of companies in the language industry and an index to the previous year's content.

    Two issues each year include Getting Started Guides such as this one, which are primers for moving into new territories both geographically and professionally.

    The magazine itself covers a multitude of topics.

    Translation

    How are translation tools changing the art and science of communicating ideas and information between speakers of different languages? Translators are vital to the development of international and localized software. Those who specialize in technical documents, such as manuals for computer hardware and software, industrial equipment and medical products, use sophisticated tools along with professional expertise to translate complex text clearly and precisely. Translators and people who use translation services track new developments through articles and news items in MultiLingual.

    Language technology

    From multiple keyboard layouts and input methods to Unicode-enabled operating systems, language-specific encodings, systems that recognize your handwriting or your speech in any language, language technology is changing day by day. And this technology is also changing the way in which people communicate on a personal level, changing the requirements for international software and changing how business is done all over the world.

    MultiLingual is your source for the best information and insight into these developments and how they will affect you and your business.

    Global web

    Every website is a global website, and even a site designed for one country may require several languages to be effective. Experienced web professionals explain how to create a site that works for users everywhere, how to attract those users to your site and how to keep the site current. Whether you use the internet and worldwide web for e-mail, for purchasing services, for promoting your business or for conducting fully international e-commerce, you'll benefit from the information and ideas in each issue of MultiLingual.

    Managing content

    How do you track all the words and the changes that occur in a multilingual website? How do you know who's doing what and where? How do you respond to customers and vendors in a prompt manner and in their own languages? The growing and changing field of content management and global management systems (CMS and GMS), customer relations management (CRM) and other management disciplines is increasingly important as systems become more complex. Leaders in the development of these systems explain how they work and how they work together.

    Internationalization

    Making software ready for the international market requires more than just a good idea. How does an international developer prepare a product for multiple locales? Will the pictures and colors you select for a user interface in France be suitable for users in Brazil? Elements such as date and currency formats sound like simple components, but developers who ignore the many international variants find that their products may be unusable. You'll find sound ideas and practical help in every issue.

    Localization

    How can you make your product look and feel as if it were built in another country for users of that language and culture? How do you choose a localization service vendor? Developers and localizers offer their ideas and relate their experiences with practical advice that will save you time and money in your localization projects.

    And there's much more

    Authors with in-depth knowledge summarize changes in the language industry and explain its financial side, describe the challenges of computing in various languages, explain and update encoding schemes, and evaluate software and systems. Other articles focus on particular countries or regions; specific languages; translation and localization training programs; the uses of language technology in specific industries, a wide array of current topics from the world of multilingual computing.

    If you are interested in reaching an international audience in the best way possible, you need to read MultiLingual.

    An invitation to subscribe to MultiLingual

    Subscribe to MultiLingual at www.multilingual.com/subscribe

    http://www.multilingual.com/subscribe