Successful I18n Project Planning using Static Analysis Lingoport, Inc. 3985 Wonderland Hill Ave. Boulder, Colorado USA 80304 +1 303 444 8020 www.lingoport.com Adam Asnes Grand Poisson Copyright: March 2011 Please do not reproduce without authorized permission Olivier Libouban G11n Lead

Wordware 2011: Lingoport i18n Planning & Static Analysis

DESCRIPTION

The business case for internationalization, character encoding, a Java internationalization example and an overview of Globalyzer’s static analysis.

Successful I18n Project

Planning using Static Analysis

Lingoport, Inc.

3985 Wonderland Hill Ave.

Boulder, Colorado

USA 80304

+1 303 444 8020

www.lingoport.com

Adam Asnes

Grand Poisson

Copyright: March 2011

Please do not reproduce without authorized permission

Olivier Libouban

G11n Lead

Lingoport

• Internationalization Services

– Assessment

– Project planning

– I18n development

– I18n testing

– Localization integration

• Globalyzer

– Internationalization software

• Find and fix i18n issues in code

Agenda

• Business Case

• I18n issues

• Static Analysis Background

• Requirements Gathering

• Static Analysis Detail

• Project Plan Example

• Agile planning

• Continuous Integration for i18n

Engineering for Locale Support

• Globalization (g11n) has two components:

– Internationalization (i18n): software engineering to enable localization

– Localization (L10n): culture-specific resources (translation, etc.)

Business Case:

Nothing gets internationalized or localized just because it would be cool

I18n Needs – Biz vs. Tech

Engineering thinks about…

1. Multi-tiered web application?

2. Complex Interface?

3. Database components?

4. Embedded Strings?

5. Locale aware application?

6. Can it manage multiple data formats?

7. I18n testing plan?

8. Tactics to get it done

Our Software must be in Japanese, French, German, Chinese, and Spanish by November

I18n is Business Driven

• Global initiatives

– Expanding opportunities, New customers

• Competitive pressure

• Lost time to market

• Iterative code fixing, problems keep slipping

through

• Development costs in the hundreds of

thousands to millions of dollars

You Need a Plan – Scope 1st, design later

• Project becomes real with $$$

• CFO thinking in terms of ROI

– Deal Based

• Revenue – Costs = Profit

– Strategic

• Revenue over X years – Costs + effect on equity – risk

• Leverage global investment of organization

– Cost of Time to Market

• If you're late or lousy, that has significant opportunity cost

Engineering:

Localization is a Downstream Concern

• “Somebody else's problem” in the world of many developers

• Creates an opportunity to educate and shepherd

teams through globalization

Is It Internationalized?

• Typically underestimate i18n requirements

• Most don't know the answer

• Agile or other feature and release requirements

often overrun less formally measured i18n

requirements

• There is a Management Value in being able to

confirm global readiness

Example: Hard-Coded English Text

1 million lines of source code

Found: 20,000 embedded strings which cannot be efficiently translated

String orderStatus = "Your order has been processed. A confirmation e-mail will be sent to you shortly.";
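A minimal sketch of how the embedded string above could be externalized, assuming a message key named order.status and a Spanish translation written only for illustration. In a real project the two property texts would normally live in files such as Messages.properties and Messages_es.properties (hypothetical names) loaded with ResourceBundle.getBundle; they are inlined here so the example runs on its own.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class OrderMessages {
    // Stand-ins for the contents of two properties files (hypothetical).
    private static final String EN =
        "order.status=Your order has been processed. "
        + "A confirmation e-mail will be sent to you shortly.\n";
    private static final String ES =
        "order.status=Su pedido ha sido procesado. "
        + "En breve le enviaremos un correo para confirmarlo.\n";

    // Picks the resource text for the locale's language; English is the fallback.
    static ResourceBundle bundleFor(Locale locale) {
        String source = "es".equals(locale.getLanguage()) ? ES : EN;
        try {
            return new PropertyResourceBundle(new StringReader(source));
        } catch (IOException e) {
            throw new IllegalStateException("Could not read messages", e);
        }
    }

    // The code now refers to a key; the text itself lives outside the logic.
    static String orderStatus(Locale locale) {
        return bundleFor(locale).getString("order.status");
    }

    public static void main(String[] args) {
        System.out.println(orderStatus(Locale.US));
        System.out.println(orderStatus(Locale.forLanguageTag("es-ES")));
    }
}
```

With the text keyed this way, translators work on the resource content and the Java code does not change per language.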

Character Sets/Encodings

• Character set (e.g. Unicode)

– A set of characters used to support a given language or series of languages; a coded character set also assigns each character a numeric code point

• Character encoding (e.g. UTF-16, UTF-8)

– A scheme that maps each code point of a coded character set to a sequence of bytes
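The difference between a character set and an encoding can be seen in a few lines of Java. This is only an illustrative sketch: the class and method names are made up for the example. It shows the same Unicode text taking a different number of bytes in UTF-8 and UTF-16, and what happens when UTF-8 bytes are decoded with the wrong encoding.

```java
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Number of bytes the text needs when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the text needs when encoded as UTF-16 (big endian, no BOM).
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    // Encodes as UTF-8 but decodes as ISO-8859-1: a typical source of garbled text.
    static String misdecodeAsLatin1(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String japanese = "\u65e5\u672c\u8a9e"; // three characters: "Japanese language"
        System.out.println("characters:   " + japanese.length());
        System.out.println("UTF-8 bytes:  " + utf8Length(japanese));
        System.out.println("UTF-16 bytes: " + utf16Length(japanese));
        // U+00E9 (e with acute accent) turns into two unrelated characters.
        System.out.println("garbled:      " + misdecodeAsLatin1("\u00e9"));
    }
}
```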

Character Sets and Encoding

• This is broken:

Sample Code (Java) – i18n Examples

I18n Engineering Considerations

• Locale Handling

• Character encoding

• Strings

– External, Grammar, Segments, Plurals, Wrapping

– String Handling (char *, etc.)

– Tabs, spaces, delimiters, etc.

• Resource management – centralized, normalized, re-usable

• Dates - Calendar

• Times

• Sorting & searching

• Currency

• Transaction process

• Character set conversions

• On line help

• Sounds

• Honorific titles

• Telephone formats

• Postal formats

• Region-specific functions

• Shipping conditions

• Numerical formats

• Page layout, LTR, RTL

• Fonts and attributes

• Icons, colors

• Reporting, workflow

• Database support

• Multi-byte enabling

• Business logic

• Measurements, units

• Input Methods

• Data exchange
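Several items in the list above (numerical formats, currency, dates) come down to letting the locale drive formatting instead of hard-coding one convention. A small sketch using standard JDK classes; the class name and sample values are illustrative, and the exact currency and date output depends on the locale data shipped with the JDK in use.

```java
import java.text.NumberFormat;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

class LocaleFormats {
    // Formats a number with the grouping and decimal separators of the locale.
    static String formatNumber(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    // Formats an amount using the locale's default currency and conventions.
    static String formatCurrency(double amount, Locale locale) {
        return NumberFormat.getCurrencyInstance(locale).format(amount);
    }

    // Formats a date in the locale's long style.
    static String formatDate(LocalDate date, Locale locale) {
        return DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG)
                .withLocale(locale)
                .format(date);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2011, 4, 19);
        for (Locale locale : new Locale[] {Locale.US, Locale.GERMANY, Locale.JAPAN}) {
            System.out.println(locale + ": "
                    + formatNumber(1234567.891, locale) + " | "
                    + formatCurrency(1999.5, locale) + " | "
                    + formatDate(date, locale));
        }
    }
}
```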

Internationalization Challenge

• Software Data Path - it's not just the display

[Diagram: data path cycle through Input, Transform, Store, Retrieve, Transform, Display]

New Internationalization Project!

• What to do?

– Large amount of code

– Change in requirements

– Change in architecture

– Change in development practices

– Change in testing requirements

Practical Challenges

• Sift through hundreds of thousands or millions of

lines of code

• Managing the fixing of complex problems

• Creating a product that looks, feels and behaves

natively to its worldwide users

• Source code must be reworked so that it adapts seamlessly to any language, streamlining support and updates

Code Review

• What to Identify

– Embedded strings

– Locale-Sensitive methods/functions/classes

– Image references

– Unsafe programming constructs (e.g. regular expressions that assume US alphabetical order, pointer arithmetic, and more)
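As one concrete instance of an unsafe construct, a regular expression that hard-codes the US alphabet rejects perfectly valid names in other languages. The sketch below contrasts it with a Unicode-aware character class; the class and method names are illustrative only.

```java
class LetterCheck {
    // Assumes the US alphabet: rejects accented and non-Latin letters.
    static boolean isAsciiWord(String s) {
        return s.matches("[A-Za-z]+");
    }

    // \p{L} matches any Unicode letter, whatever the script.
    static boolean isUnicodeWord(String s) {
        return s.matches("\\p{L}+");
    }

    public static void main(String[] args) {
        String[] names = {"Miller", "M\u00fcller", "\u6771\u4eac"};
        for (String name : names) {
            System.out.println(name + " ascii=" + isAsciiWord(name)
                    + " unicode=" + isUnicodeWord(name));
        }
    }
}
```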

Code Analysis

• How to Identify Issues

– “Brute force”

• Engineers search for and resolve known issues

• Count display pages

• Pseudo-localization

• Scripts and page by page analysis

– Globalyzer-assisted review, static analysis

• An I18n code analysis tool is employed to examine source code for a large range of potential and known issues

• Issues can be identified and resolved in a more systematic fashion
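To give a feel for what scanning source code for embedded strings involves, here is a deliberately naive sketch. It is not how Globalyzer works; it only pulls string literals out of a piece of Java source with a regular expression. The class name and sample input are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmbeddedStringScan {
    // A double-quoted literal, allowing backslash escapes inside it.
    private static final Pattern STRING_LITERAL =
        Pattern.compile("\"((?:\\\\.|[^\"\\\\])*)\"");

    // Returns the contents of every non-empty string literal in the source text.
    static List<String> findLiterals(String source) {
        List<String> found = new ArrayList<>();
        Matcher m = STRING_LITERAL.matcher(source);
        while (m.find()) {
            if (!m.group(1).isEmpty()) {
                found.add(m.group(1));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String source =
            "String status = \"Your order has been processed.\";\n"
            + "int n = 3;\n"
            + "log(\"id=\" + n);\n";
        for (String literal : findLiterals(source)) {
            System.out.println("candidate: " + literal);
        }
    }
}
```

The second hit ("id=") is a log fragment rather than user-facing text, which illustrates why a plain search overwhelms developers with false positives and why detection rules need tuning per project.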

Traditional Approach - repeat, and repeat, and repeat, and repeat

Localize and see what you're missing

GREP, overwhelm developers

View pages. Pore through code for strings, methods, etc.

Externalize and refactor one by one

Test, Pseudo-Localize
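Pseudo-localization can be sketched in a few lines: replace some letters with accented look-alikes, pad the text to simulate expansion, and wrap it in brackets so truncation and untranslated strings stand out on screen. The mapping, the roughly 30% padding, and the bracket markers below are arbitrary choices for illustration, not a standard.

```java
class PseudoLocalizer {
    // Returns a pseudo-localized version of the text, e.g. for test resource files.
    static String pseudoLocalize(String text) {
        StringBuilder sb = new StringBuilder("[");
        for (char c : text.toCharArray()) {
            switch (c) {
                case 'a': sb.append('\u00e5'); break; // a with ring
                case 'e': sb.append('\u00e9'); break; // e with acute
                case 'i': sb.append('\u00ee'); break; // i with circumflex
                case 'o': sb.append('\u00f6'); break; // o with diaeresis
                case 'u': sb.append('\u00fc'); break; // u with diaeresis
                default:  sb.append(c);
            }
        }
        // Pad by about 30% (rounded up) to mimic longer translations.
        int padding = (text.length() * 3 + 9) / 10;
        for (int i = 0; i < padding; i++) {
            sb.append('~');
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        System.out.println(pseudoLocalize("Save"));
        System.out.println(pseudoLocalize("Your order has been processed."));
    }
}
```

Any text that still appears on screen without brackets after such a pass is a candidate embedded string.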

Globalyzer Server and Clients

Static Analysis on the Source Code

Server

Client Command Line

Globalyzer is methodology agnostic. Project Managers may use it in a 'traditional' approach or Agile approach.

Globalyzer Principles - Customization

• Globalyzer Server manages Rule Sets Configuration

– Globalyzer Rule Sets are used to identify i18n issues in the code base

– Rules embody the i18n issue detection logic

– One rule set targets one programming language (& variant)

– Default rule sets are based on research and years of experience

– Rules must be tailored to a specific project

– Rules can be shared amongst team members

Globalyzer Principles – Desktop Analysis

• Globalyzer desktop client:

– Scan source code using Globalyzer Rule Sets

– Detect and report potential i18n issues

– Manage i18n issues

– Assist in fixing the code so that it becomes i18n compliant

Globalyzer Principles - Automation

• Globalyzer Command Line

– For integration in the overall software process to run at given

frequencies

– Generate reports once a setup has been established

– Different strategies

• Segment the code base into small scan projects that

reflect the i18n effort

• Focus on i18n scope

I18n Processes

• Planning

• Market Requirements Analysis

• Architectural Requirements Analysis

• Code Review

• I18n Design

• I18n Implementation

• Testing

• And beyond…

• Localization

• Support

Merging Requirements and

• Architectural Changes: what's not in the code

– Locale support

– Changes to how data is passed around

– Discuss and analyze technical requirements

• Code Analysis: what's in the code

– Strings

– Refactoring locale-limiting methods/functions

– Find and count issues

I18n Architectural Challenge – what’s not in the code

Database: character encoding support

Application Code, e.g. Java, C++, VB

3rd Party Products

U/I, e.g. JSP, ASP, ASPX

Business Logic

Platforms, Browser Support Requirements

Marketing Requirements: locale behavior

COMPLICATIONS

Operational Challenges

• Ongoing development

– Agile?

– Code Branching?

– Multiple teams?

Release Path

• Internationalization, 1st Time

– Most of U/I

– Breaks the DB

– Data I/O

– Test entire product

• Feature Release

– 3 week sprint?

– Focus on code subset

– Concentrated testing

• Static analysis with

Globalyzer

Code branch, merge, testing strategy

Factors to Plan On

• Programming languages

• How many tiers, what do they do

• Database support

• Locale Requirements

• 3rd Party Products – support for Unicode?

• Size of Application – Lines of Code

• Amount of Embedded Strings to be Externalized

• Estimate of concatenation

• DB refactoring

• Methods/Functions/Classes replacement
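Concatenation is singled out in the list above because sentences built with "+" fix the English word order in code. A common Java alternative is a message pattern with numbered placeholders, sketched below. The patterns are inlined for the example (normally they would come from resource files), and the class name is made up.

```java
import java.text.MessageFormat;
import java.util.Locale;

class OrderSummary {
    // Concatenation: the word order is frozen in the code.
    static String concatenated(String customer, int count) {
        return customer + " ordered " + count + " items";
    }

    // Pattern-based: each translation can place {0} and {1} where its grammar needs them.
    static String formatted(String pattern, Locale locale, Object... args) {
        return new MessageFormat(pattern, locale).format(args);
    }

    public static void main(String[] args) {
        System.out.println(concatenated("Ana", 3));
        System.out.println(formatted("{0} ordered {1} items", Locale.US, "Ana", 3));
        // German pattern with the arguments in a different order.
        System.out.println(formatted("{1} Artikel wurden von {0} bestellt",
                Locale.GERMANY, "Ana", 3));
    }
}
```

Counting concatenations during the scan matters for estimation because each one needs this kind of rewrite rather than a simple key substitution.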

Tiers and Technologies

1

• Java

• C#

2

• JavaScript

• VB

3

• C++

• Older languages: e.g. RPG

Time and effort increase

Other Issues

• Stability of the build

• Quality of the code

– History

• Focus of the developers

• Source code management approach

• New concurrent development introducing new

i18n problems

Questions & Answers

Adam Asnes

[email protected]

Olivier Libouban

[email protected]

Resources

http://www.lingoport.com

Globalyzer

http://www.globalyzer.com

Blog

http://i18nblog.com

Lingoport:

Requirements and Planning

Adam Asnes

President & CEO

Lingoport

Olivier Libouban

Globalization Lead

Lingoport

Why go through requirements?

• I18n work is software engineering

• To determine the scope of the i18n work, the i18n team cannot simply look at the code and come up with an i18n project

• Scope also leads to planning, cost, resources

• How to describe i18n requirements?

Focus on one requirement: Locale

• One product instance per locale?

• Multi-locale support

• Locale detection?

• User account support?

Ex: WebSphere Portal Locale

Determination

– User logged in: display the user's preferred language

– No preferred user language: look for the user's browser language

• If the portal supports that language, it displays content in that language.

• If the browser has more than one language defined, the portal uses the first language in the list to display the content.

– If no browser language can be found, for example if the

browser used does not send a language, the portal

resorts to its own default language.

– If the user has a portlet that does not support the

language that was determined by the previous steps,

that portlet is shown in its own default language.
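The fallback chain described above can be written down as a small function. This sketch follows the slide's wording literally and is not taken from WebSphere Portal itself; all class, method, and parameter names are hypothetical, and a real implementation might, for example, walk the whole browser language list instead of checking only the first entry.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class LocaleResolver {
    // Steps 1 to 3: user preference, then first browser language, then portal default.
    static Locale resolve(Locale userPreferred, List<Locale> browserLanguages,
                          Set<String> supportedLanguages, Locale portalDefault) {
        if (userPreferred != null) {
            return userPreferred;
        }
        if (browserLanguages != null && !browserLanguages.isEmpty()) {
            Locale first = browserLanguages.get(0);
            if (supportedLanguages.contains(first.getLanguage())) {
                return first;
            }
        }
        return portalDefault;
    }

    // Step 4: a portlet that lacks the resolved language falls back to its own default.
    static Locale portletLocale(Locale resolved, Set<String> portletLanguages,
                                Locale portletDefault) {
        return portletLanguages.contains(resolved.getLanguage()) ? resolved : portletDefault;
    }

    public static void main(String[] args) {
        Set<String> supported = new HashSet<>(Arrays.asList("en", "fr", "ja"));
        Locale page = resolve(null, Arrays.asList(Locale.FRENCH, Locale.GERMAN),
                supported, Locale.ENGLISH);
        System.out.println("page locale:    " + page);
        Set<String> portletLangs = new HashSet<>(Arrays.asList("en", "de"));
        System.out.println("portlet locale: "
                + portletLocale(page, portletLangs, Locale.ENGLISH));
    }
}
```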

One-Time Locale Selection

System based Locale Detection

More of the typical i18n requirements

• Target date(s)

• System requirements

• Existing & potential use cases for UI text entry,

• Text display

• Text processing

• Collation

• Handling of locale-sensitive data (dates,

numbers, currencies, etc.).

• Client Installer considerations

Architectural Discussion

• Thorough Product Demo

• Walk through major architecture components

Conceptual illustrative architecture

Specific development and integration

[Diagram: CODE layers (UI, Business, Persistence, Workflow Engine) integrating with Web Services, Rules Engine, JMS, RDBMS, LDAP, CMS, and 3rd Parties]

April 19, 2011 – p 45

Specific i18n software engineering focus

• UI: HTML, server side, JavaScript, input forms, CSS, content presentation, etc.

• Business logic, searches, comparisons, data exchange with external systems

• Persistence: exchanges with RDBMS, Content Management, LDAP, file-based persistence (XML, etc.)

Specific development i18n issues

• String externalization (outside of code) and i18n resource bundles

• Locale-sensitive methods: searching, retrieving, sorting, date and time, string operations, character operations, etc.

• Code resources (images, etc.)

• Overall programming language specifics
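Sorting is a good example of a locale-sensitive method: the default String ordering compares UTF-16 code units, which is rarely what users expect. A sketch comparing it with java.text.Collator; the class name and sample words are illustrative.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class NameSorter {
    // Natural String order: uppercase before lowercase, accented letters last.
    static List<String> sortNaive(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy);
        return copy;
    }

    // Collator applies the ordering rules of the given locale.
    static List<String> sortForLocale(List<String> input, Locale locale) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(Collator.getInstance(locale));
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Zebra", "\u00e9tude", "apple");
        System.out.println("naive:  " + sortNaive(words));
        System.out.println("french: " + sortForLocale(words, Locale.FRENCH));
    }
}
```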

Data stores i18n issues

• PL/SQL

• Encoding

• Locale files (xml, xls, csv, etc)

• Database specific issues, date/time, conversion, sorting, soundex, etc.

• Storing and retrieving local data in local language (vs. a “generic” schema)

• User entered data

• Columns requiring translation

• Attributes, user names, postal addresses, etc

• Database design
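One recurring data store issue is a column whose size limit is counted in bytes while the application counts characters. The sketch below assumes a UTF-8 database and a byte-limited column, both of which are assumptions for illustration; the helper names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class ColumnFit {
    // True if the value fits a column limited to maxBytes of UTF-8 data.
    static boolean fitsInBytes(String value, int maxBytes) {
        return value.getBytes(StandardCharsets.UTF_8).length <= maxBytes;
    }

    // Truncates on code point boundaries so no character is cut in half.
    static String truncateToBytes(String value, int maxBytes) {
        StringBuilder sb = new StringBuilder();
        int used = 0;
        for (int i = 0; i < value.length(); ) {
            int codePoint = value.codePointAt(i);
            int size = new String(Character.toChars(codePoint))
                    .getBytes(StandardCharsets.UTF_8).length;
            if (used + size > maxBytes) {
                break;
            }
            sb.appendCodePoint(codePoint);
            used += size;
            i += Character.charCount(codePoint);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String city = "\u6771\u4eac\u90fd"; // three characters, nine UTF-8 bytes
        System.out.println("Tokyo fits in 5 bytes: " + fitsInBytes("Tokyo", 5));
        System.out.println("city fits in 5 bytes:  " + fitsInBytes(city, 5));
        System.out.println("truncated length:      " + truncateToBytes(city, 5).length());
    }
}
```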

Content Management i18n issues

• Accessing the proper locale

• Encoding of content

External system i18n issues

• Modality of data exchange /

data loss

• Accessing the proper locale

• Encoding/persistence of

content on external system

Process requirements:

how to fit into an existing environment

• Lifecycle

• Documentation

• Integration

• QA

• Type of meetings

• Build

• Source control

• Branching

• Reporting structure

• Review boards

• JUnit

• Globalyzer

• Bug Reporting

Static Analysis Detail

Globalyzer example – Running and Reporting

Example Project Plan

Looking at a plan from a service project

Example Project Plan

Combine:

• 1 Part Architecture

• 1 Part Code Metrics

• 1 Part Experience

Lingoport:

Agile & Internationalization

Agile in one slide (smallest nutshell)

• Roles (Product Owner, Scrum Master, Team)

• Product Backlog

• Sprints (user stories are designed, implemented, tested in a 'short' timeframe, e.g. 3 weeks)

• Sprint Backlog

• Daily Scrums

• Demonstrable

• 'Shippable'

i18n and Agile Challenges

• Traditionally, legacy i18n has followed a waterfall model:

– i18n cuts across the code, for instance:

• Encoding problems … in all the code

• Formatting issues … in all the code

• Externalize strings …

– i18n needs a systemic approach

– i18n tends to have long project life cycles

– (L10n: must get an entire locale done)

• From a methodology perspective, Agile:

– is feature driven

– runs in “short” Sprints

• Sometimes a Hybrid approach works best

Agile & i18n Process Challenges

Lingoport Project Assessment - Legacy

• Uncover potential i18n issues from 2 perspectives:

– Code perspective: Globalyzer reporting/metrics

– Architectural: Locale/technical i18n requirements

• Allows the team to create the initial 'i18n product backlog'

• Can, but does not need to, be part of a Sprint

• Provides an overall scope and effort estimate

• Can feed into a number of processes

– TDD, ADD, Waterfall, … Agile

• Involve the Product Owner: communication resource

Lingoport Project Organization

Backlog identification and Scoping

• The i18n product backlog is a prioritized list of

requirements, stories, features, etc.

• What the customer wants, described using the (Product Owner's) customer's terminology

ID | Name | Imp | Est | How to demo | Notes

1 | Locale Setting and Tracking | 30 | 5 | Log in; if no login before, default locale; splash screen for locale | If first time, otherwise remembers

… | …

2 | Locale for languages | 10 | 8 | Log in for an 'en US' user: locale is default; go to page 'www.'; change locale; check pseudo-localization | …

Lingoport Project Organization

Sprint Management

• i18n code branching

• Agile typically uses development build, CI

environments

• Must pass 'regular' dev criteria

• Must be able to merge the i18n code branch into the main line, and vice versa, easily

• I18n tests must be available to other teams in CI

• Some items are more sensitive than others

– Database schema changes and implications on all source
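One kind of i18n test that is easy to share with other teams in CI is a resource completeness check: every key in the base bundle should exist in each localized bundle. A minimal sketch, assuming the bundles have already been loaded into maps; the class name and sample keys are hypothetical, and in practice the check would be wrapped in a JUnit test that fails the build when the result is not empty.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class BundleKeyCheck {
    // Returns the keys present in the base bundle but absent from the localized one.
    static Set<String> missingKeys(Map<String, String> base, Map<String, String> localized) {
        Set<String> missing = new TreeSet<>(base.keySet());
        missing.removeAll(localized.keySet());
        return missing;
    }

    public static void main(String[] args) {
        Map<String, String> base = new HashMap<>();
        base.put("order.status", "Your order has been processed.");
        base.put("order.cancel", "Cancel order");

        Map<String, String> spanish = new HashMap<>();
        spanish.put("order.status", "Su pedido ha sido procesado.");

        Set<String> missing = missingKeys(base, spanish);
        System.out.println("missing in es: " + missing);
    }
}
```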

Continuous Integration - Basics

Team 1

Team 2

Team 3

Team 4

Team 5

CI & Scan Results Summary

CI & Scan Details Results
