© 2009 IBM Corporation “Addressing the issues of KM for citizens (disabled, rural, women and others) on the other side of the digital divide of the country” SANJEEV GUPTA

SANJEEV GUPTA


DESCRIPTION

SANJEEV GUPTA. “Addressing the issues of KM for citizens (disabled, rural, women and others) on the other side of the digital divide of the country”. IBM Research - India. IBM Human Ability & Accessibility Center. - PowerPoint PPT Presentation


Page 1: SANJEEV GUPTA

© 2009 IBM Corporation

“Addressing the issues of KM for citizens (disabled, rural, women and others) on the other side of the digital divide of the country”

SANJEEV GUPTA

Page 2: SANJEEV GUPTA


IBM Human Ability & Accessibility Center

IBM Research - India

Page 3: SANJEEV GUPTA

What is Accessibility?

Access to information technology regardless of ability or disability

“Accessibility – which started out as a philanthropic effort – has now evolved to a business transformation effort for IBM and our clients.”

Sam Palmisano, IBM CEO, April 27, 2004

Page 4: SANJEEV GUPTA


From compliance to societal transformation...

IBM Vision on Accessibility

IBM’s vision is to enhance human capabilities through technological innovation so that societal participation and personal fulfillment can be maximized, regardless of age or ability….

Accessibility is not about “them”, it’s about ALL of us…

Page 5: SANJEEV GUPTA

Accessibility in India – Broader Landscape

Factors affecting accessibility

– Physical disability
  • About 60 million people with disability
  • 42.5% of the disabled population comprises women
  • 75% of persons with disabilities live in rural areas

– Educational/Economic disability
  • About 70% of India’s population lives in villages
  • A large percentage of them are poor
  • Literacy is mainly limited to writing their names, making signatures, and reading large hoardings

– Low internet penetration
  • Computer penetration is still very low
  • People in remote areas are not very comfortable using computers

Page 6: SANJEEV GUPTA

Addressing India Accessibility Issues

IBM Research Innovation: Bridging the digital divide

– Making Web Accessible
  • EWB
  • WebAdapt2Me
  • aDesigner

– Novel ways for Information sharing
  • Hindi Speech Recognition
  • Indian English TTS
  • Telecom Web / WAV

Page 7: SANJEEV GUPTA


Accessibility Solutions @ IBM Research

Page 8: SANJEEV GUPTA


The Spoken Web

IBM Research - India

Page 9: SANJEEV GUPTA

Introducing VoiceSites

A VoiceSite is:

• A voice-driven application hosted in the network and created by subscribers themselves
• Consists of a set of interconnected VoicePages (e.g. VXML files)
• Accessed by calling the associated phone number and interacting with its underlying application flow through a telephony interface
• Analogous to websites in the World Wide Web
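The idea of a VoiceSite as a set of interconnected VoicePages can be sketched as a small data structure. This is only an illustration: the class names, fields, page contents and phone number below are assumptions made for the example, not the actual Spoken Web implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VoicePage:
    """One page of a VoiceSite (for example, one VXML file)."""
    name: str
    prompt: str
    links: Dict[str, str] = field(default_factory=dict)  # spoken choice -> page name


@dataclass
class VoiceSite:
    """A set of interconnected VoicePages reached through a phone number."""
    phone_number: str
    home: str
    pages: Dict[str, VoicePage] = field(default_factory=dict)

    def add(self, page: VoicePage) -> None:
        self.pages[page.name] = page

    def walk(self, choices: List[str]) -> List[str]:
        """Follow the caller's spoken choices and return the prompts heard."""
        current = self.pages[self.home]
        heard = [current.prompt]
        for choice in choices:
            target = current.links.get(choice)
            if target is None:
                continue  # unrecognised choice: stay on the current page
            current = self.pages[target]
            heard.append(current.prompt)
        return heard


# Hypothetical two-page site for a plumber.
site = VoiceSite(phone_number="555-0100", home="welcome")
site.add(VoicePage("welcome", "Hi, my name is Sam, and I am a plumber.", {"hours": "hours"}))
site.add(VoicePage("hours", "I work from 9 am to 7 pm."))
```

Calling `site.walk(["hours"])` plays the welcome prompt and then the working-hours prompt, much as following a hyperlink moves between pages of a website.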

Page 10: SANJEEV GUPTA

VoiServ

Call VoiServ to create VoiceSite

VoiServ: Please say your name
Caller: Sam
VoiServ: Please specify your profession
Caller: Plumber
VoiServ: Please record your welcome message
Caller: Hi my name is Sam, and I am a plumber. Please find information regarding my services on my VoiceSite.
VoiServ: Please enter your working hours
Caller: 9 am to 7 pm
VoiServ: Please specify your service charges
Caller: I charge 5 dollars an hour
VoiServ: Would you like to offer appointment scheduling services?
Caller: Yes
VoiServ: Would you like to publish your information in yellow pages?
Caller: Yes
VoiServ: Do you accept jobs while you are away from your home location?
Caller: Yes
VoiServ: Please say your home location
Caller: South Delhi
VoiServ: Would you like to provide some references for your work?
Caller: Yes
VoiServ: Please say the name and phone number of your references
Caller: You can talk to Jack about my work. His number is 41292100
VoiServ: You have now created your voice site successfully. Users can now access your voice site through your phone number. Thank you for using this system.

[Diagram components: Calendaring Service, Database, VoiceSite, Location Tracker, Presence Server, IMS, Yellow Page Service, Yellow Pages Server, WWW Yellow Pages Website, VoiGen]
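The question-and-answer flow above can be sketched as a list of prompts whose answers are collected into a profile. The field names, the shortened question list and the `create_voice_site` helper are assumptions for illustration; a real system would take the answers from speech recognition rather than from a Python list.

```python
# Hypothetical question list modelled on part of the dialogue above.
QUESTIONS = [
    ("name", "Please say your name"),
    ("profession", "Please specify your profession"),
    ("working_hours", "Please enter your working hours"),
    ("service_charges", "Please specify your service charges"),
    ("appointments", "Would you like to offer appointment scheduling services?"),
    ("yellow_pages", "Would you like to publish your information in yellow pages?"),
]
YES_NO_FIELDS = {"appointments", "yellow_pages"}


def create_voice_site(answers):
    """Pair each prompt with the caller's answer and build a profile dict."""
    profile = {}
    for (key, _prompt), answer in zip(QUESTIONS, answers):
        if key in YES_NO_FIELDS:
            profile[key] = answer.strip().lower() == "yes"  # store yes/no as bool
        else:
            profile[key] = answer.strip()
    return profile


profile = create_voice_site(
    ["Sam", "Plumber", "9 am to 7 pm", "I charge 5 dollars an hour", "Yes", "Yes"]
)
```

The resulting profile could then drive which services (calendaring, yellow pages) are attached to the generated VoiceSite.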

Page 11: SANJEEV GUPTA

Small user study – Plumber VoiceSite

Methodology for survey
• Electricians/Plumbers/Carpenters make a call to VoiGen and create their voice sites
• We ask the subjects about the usability of the VoiGen system

• 12 subjects surveyed for technology validation
• 10 were able to create the voice site successfully (within 4 minutes)
• There were usability issues with respect to conversation flow and speech recognition accuracy
• Everyone realised that this technology can have tremendous impact
• Since this technology does not require the end-user to bear any costs in terms of devices, it has a low acceptance barrier

[Diagram flow:
1. Carpenters/Electricians make a call to VoiGen to generate their voice sites
2. The voice sites are automatically deployed in the system
3. People call these voice sites to schedule time with the specialists]

Page 12: SANJEEV GUPTA


What is the Spoken Web?

The Spoken Web is a world wide web in the telecom network, where people can host and browse VoiceSites, traverse VoiLinks, even conduct business transactions, all just by talking over the existing telephone network.

The Spoken Web will interoperate with the existing WWW.

The Spoken Web will interoperate with Next Generation Networks too.

Page 13: SANJEEV GUPTA

Spoken Web enables multiple business opportunities

• New source of revenue opportunity for telecom operators
  – Creation and hosting voicesites
  – Payments and financial transactions
• SMBs and microbusinesses can leverage the T-Web
• Examples
  – Microbusiness Voicesite
  – VoiceSite Personalisation
  – Rural Voikiosk

“Anyone with a mobile handset can become a T-Web enabled microbusiness voicesite owner and accessor, and also conduct transactions on the T-Web”

Page 14: SANJEEV GUPTA

SpokenWeb Andhra Pilot Statistics

Pilot Launch: May 23, 2008
Report Summary (ended on Jan 28, 2009)

• Total number of calls received = 114782
• Number of unique callers = 6509
• Total time spent = 2135 hours
• Average call time spent = 0 hours, 1 min, and 14 seconds
• Maximum call duration = 0 hours, 49 min, and 40 seconds
• Minimum call duration = 0 hours, 0 min, and 0 seconds
• Number of calls to Ashwini Center = 8399
• Number of calls to Health Center = 14216
• Number of calls to V-Agri = 13881
• Number of calls to Professional Services = 37112

Page 15: SANJEEV GUPTA

© 2006 IBM Corporation

WAV: Web Access through Voice

Page 16: SANJEEV GUPTA

Motivation

• Web is a rich source of useful local information
  – Weather, travel, entertainment, insurance, finance
• However, a significant population (especially in emerging countries) is not using this information
  – Due to computer skills, exposure to browsing, language skills, physical limitations, aging
• A large number of such people have access to a phone (landline/mobile)
  – Growing at a fast rate
• Even computer users can’t browse in several conditions
  – On the move, no connectivity areas, low speed, etc.

Page 17: SANJEEV GUPTA

Proposition

• Decouple web information from web browsing
  – Let the people access web information without having to browse/know how to browse
• However, still leverage the web interface
  – No change required on the website/content provider side
• Let the system browse instead of the user
  – System figures out how to extract the information from web for a user’s query
• The interaction can be enabled in user’s language for simple queries (structured input/output) through speech recognition and language translation

Page 18: SANJEEV GUPTA

Scenarios

• A person wants to go from station A to B. He wants to know which trains are available, their schedule, availability, etc. He has only a phone and is not familiar with the web.
  – Access a relevant website (e.g., indianrail.gov.in)
    • Get the required inputs from the user: source, destination, class, dates, etc.
    • Fetch the information from the web and give it back to the user
• A person wants to know the interest rates offered by various banks for a home loan
  – System can go to a popular website (e.g., apnaloan.com)
    • Get the required inputs from the user: term, floating, fixed
    • Fetch the information from the web and give it back to the user: lowest interest rate offered
• A person is planning a trip to Chennai and wants to know the current weather there
  – Go to cnn.com
    • Fill up the form
    • Speak the weather over phone in local language
  – Go to google
    • Get the weather information and reply back to the user

Page 19: SANJEEV GUPTA

What is currently available?

• Web browsers on mobile phones
  – Person can browse the web on handheld device
    • Costly, complicated, tiny interface, inaccessible, not suited for common man
• Browsing of voice sites
  – Created from scratch using VXML
• Speech interface to specific services (TellMe, Nuance)
  – Nearest restaurant, police station, hospitals, etc.
    • Based on knowledgebase created offline OR
    • Proprietary tie-up with content providers to have access to databases
    • Predefined, keyword based

A third-party data provider gathers the business information that Tellme provides in the Tellme download and on 1-800-555-TELL, so we are unable to directly add or correct specific business information. If you would like to add or correct information that is listed for your business, please use the easy form on the InfoUSA website. (Taken from TellMe Website: http://www.tellme.com/you/faqs)

Page 20: SANJEEV GUPTA

Proposed approach

[Diagram components: Voice/DTMF, ASR/TTS, Dialogue Component, Request Generator, Response Generator, World Wide Web]

Page 21: SANJEEV GUPTA

How does it work?

[Diagram components: Web Site; Service1, Service2, Service3, Service4; Request Process Generator; Response Process Generator; browser scripting tools such as Co-scripter; Information Extraction tools]

Page 22: SANJEEV GUPTA

Request Generation & Execution

• Leveraging the browser interface through scripts
  – Generate a script with inputs taken from the user
  – Execute the script with a browser

[Diagram components: Input Collection (VXML), Data for Script, Scripting Tool, Script, Data, Web Browser, Web-Page Inputs]
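A minimal sketch of the script-generation step, assuming the user's inputs have already been collected over the phone. The step wording in the template is purely illustrative and is not guaranteed to match the syntax of CoScripter or any other scripting tool; the field labels and sample inputs are also assumptions.

```python
from string import Template

# Hypothetical script template: one browser action per line.
SCRIPT_TEMPLATE = Template(
    'go to "$url"\n'
    'enter "$source" into the "From" textbox\n'
    'enter "$destination" into the "To" textbox\n'
    'enter "$date" into the "Date" textbox\n'
    'click the "Search" button\n'
)


def generate_script(url, inputs):
    """Fill the template with the inputs collected from the caller."""
    return SCRIPT_TEMPLATE.substitute(url=url, **inputs)


script = generate_script(
    "http://indianrail.gov.in",
    {"source": "Delhi", "destination": "Chennai", "date": "2009-06-01"},
)
print(script)
```

The generated text would then be handed to a browser scripting tool for execution; that step is outside this sketch.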

Page 23: SANJEEV GUPTA

Information Extraction & Response

• Use HTML Syntax and Semantics to extract information
  – Look in HTML sections using syntax knowledge
  – Use semantics based on context and keywords

Keywords: Airline / Lowest / Cheapest / Prices

HTML Syntax: TABLE, ROW-COLUMN (<TR><TD>)

Top three cheapest flights are:
Go Air 4435 Rs at 5:05 AM
Deccan 4449 Rs at 4:15 AM
Spice Jet 4859 Rs at 8:00 AM

[Diagram components: Web-Page’s HTML Source, Relevant Keywords/Semantics, Syntax, Information Extraction Module, Response, User, Request Generation & Execution, User Interaction (Iterative)]
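The syntax side of the extraction (TABLE, TR, TD) can be sketched with Python's standard `html.parser`. This is a simplified illustration, not the module described on the slide: it assumes every row has exactly three cells (airline, price, time), it ignores the keyword/semantics step, and the sample page is invented (three rows reuse the figures above, the "Example Air" row is made up).

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()


def cheapest_flights(html, count=3):
    """Return the `count` cheapest (airline, price, time) rows in the page."""
    parser = TableRows()
    parser.feed(html)
    flights = [(airline, int(price), time) for airline, price, time in parser.rows]
    flights.sort(key=lambda flight: flight[1])
    return flights[:count]


PAGE = """
<table>
<tr><td>Spice Jet</td><td>4859</td><td>8:00 AM</td></tr>
<tr><td>Example Air</td><td>5200</td><td>9:30 AM</td></tr>
<tr><td>Go Air</td><td>4435</td><td>5:05 AM</td></tr>
<tr><td>Deccan</td><td>4449</td><td>4:15 AM</td></tr>
</table>
"""

top = cheapest_flights(PAGE)
for airline, price, time in top:
    print(f"{airline} {price} Rs at {time}")
```

The printed lines are what a response generator could pass to text-to-speech.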

Page 24: SANJEEV GUPTA

©2008 IBM Corporation

IBM Easy Web Browser

Page 25: SANJEEV GUPTA

Overview

• Having difficulty viewing Web pages?
  – Easy Web Browsing is a solution that helps bridge the digital divide for novice computer users, people who are experiencing vision loss, second-language learners, seniors, and persons with reading challenges

• Highlights
  – Installs by automatically downloading from Web site.
  – Reads text aloud with adjustable speed and volume control.
  – Allows users to customize size and color of Web content.
  – Ruler function that helps users find and follow their reading position.
  – Highlight function focusing on the reading text with four patterns of marking.
  – Customizable line and word spacing features that enhance readability.

Page 26: SANJEEV GUPTA


Easy Web Browsing

IBM Easy Web Browsing display on a client's personal computer

Page 27: SANJEEV GUPTA

Summary

• Accessible web sites are required for PWDs, but they also assist seniors, novices and non-native speakers.

• The financial, retail, travel and government industries are interested in Web & kiosk accessibility.

• IBM’s EWB is a quick and reasonable solution to make web sites and kiosks more accessible for consumers, citizens and travelers.

• It also brings new opportunities through a combined offering.

• Both the customers and the end users are satisfied with this solution.

Page 28: SANJEEV GUPTA

©2010 IBM Corporation

Reading Companion

IBM’s multi-million-dollar investment in literacy, using voice recognition technology over the web to help children and adults learn to read.

– Anytime, anywhere web access, providing feedback and as-needed assistance

– More than 1,380 schools and nonprofit organizations -- about half of which are schools -- in 25 countries and approximately 56,200 users are participating in this grant program.

– 850 schools & nonprofit organizations in 26 countries, benefitting more than 40,000 children and adults

– Evaluation showed:
  • Child: higher test scores on word recognition and reading comprehension
  • Adult: increased English communication skills and literacy; positive job outcomes for some learners

“Reading Companion has opened new cultural horizons for our children. With such a wide choice of books to increase their vocabulary and improve their comprehension skills, they’re developing a true love for reading.”

Patricia Diaz Covarrubias, Executive Director, Christel House de Mexico, A.C.

readingcompanion.org

Page 29: SANJEEV GUPTA


IBM: Employing Diversity & Excellence

Meet IBMer Dimitri Kanevsky:

• Deaf

• Master Inventor in IBM Research

• 2002 Science Accomplishment for Maximization Algorithms

• Generated 80 IBM patents

Page 30: SANJEEV GUPTA

IBM: Employing Diversity & Excellence

Meet IBMer Mike Squillace:

• Blind

• Joined IBM in 2002

• Sun Certified Java Programmer

• PhD in Philosophy and B.S. in Computer Science

• Developed patents for multiple GUI architectures and defining GUIs via markup languages & reflection

Page 31: SANJEEV GUPTA

IBM: Employing Diversity & Excellence

Meet IBMer Chieko Asakawa:

• Blind

• Joined IBM Research in 1985

• An IBM Fellow

• Member of Women in Technology Hall of Fame

• Developed Digital Braille System & 3 key applications

Page 32: SANJEEV GUPTA

Recognition by the Hon. President of India in 2007 & 2009 for providing technology for people on the other side of the digital world, to make a complete knowledge society.

Page 33: SANJEEV GUPTA


Thank You

Page 34: SANJEEV GUPTA

aDesigner

• Characteristics
  - Visualization of blind usability
  - Simulation of low-vision users’ view
    - Weak eyesight, color vision deficiency, cataracts.
  - Checking compliance items
    - WCAG, Section 508, IBM CI162, JIS, etc.

• Award
  - Wall Street Journal Technology Innovation Award 2004 (Runner-up)

• Status
  - Open-sourced as a basis of Eclipse.org ACTF (Accessibility Tools Framework)

Page 35: SANJEEV GUPTA

Blind Usability Visualization Example

[Figure panels: Original (inaccessible); With skip-link (easy to find main contents); With heading tags (headers can be used as a TOC; easy to navigate through the page)]

Page 36: SANJEEV GUPTA


Low Vision Simulation

Low vision simulation. In this example, Color Vision Deficiency (Deutan) and cataract are simulated.

Problem map that indicates the positions of problems.

The original Web page which people without low vision view.

Simulating the experience of users who have low vision

Summary Report

Setting panel (Eyesight, color vision deficiencies, crystalline lens transparency)

Page 37: SANJEEV GUPTA

IBM Research Overview
Famous for its science and vital to IBM

• 1945: 1st IBM Research Lab in NY (Columbia U)
• 1952: San Jose, California
• 1955: Zürich
• 1961: Watson
• 1972: Haifa
• 1982: Tokyo
• 1986: Almaden
• 1995: Austin
• 1995: Beijing
• 1998: Delhi

• 1970's: Centrally funded
  ♦ Corporate funded research agenda
  ♦ Technology transfer
• 1980's: Joint programs
  ♦ Collaborative team
  ♦ Shared agenda
  ♦ Effectiveness
• 1990's: Research in the marketplace
  ♦ Work on customer problems
  ♦ FOAK (First of a Kind)
• 2000's: eBusiness research
  ♦ Create business advantage for customers
  ♦ EBO (Emerging Business Opportunities)
  ♦ ODIS (On Demand Innovation Services)

Innovation that Matters

[Diagram labels: Technology Transfer, Business, Technology, Society, New Insights]

Page 38: SANJEEV GUPTA


History of Innovations: 60 years of stellar research

Page 39: SANJEEV GUPTA


Innovations: 10 Years of India Research Lab

Page 40: SANJEEV GUPTA

Technical Competencies

• Computer Science
  – Distributed Systems – system mgmt., middleware
  – Information Management – data mining, machine learning
  – Interaction Technologies – speech
  – Programming Technologies – parallel and hi-perf. prog.
  – Software Engineering – model-driven, distributed dev.

• Math Science
  – Operations Research
  – Algorithms
  – Optimization
  – Game Theory

• Service Science
  – Service Engineering
  – Service Productivity
  – Service Management
  – Service Quality
  – Service Supply Chains

Business Areas

• Service Delivery
• Software
• Systems

Focus Areas

• Infrastructure Services
• Application Services
• Contact Center Services
• Emerging Solutions
• Telecom
• Others (Banking, etc.)

Page 41: SANJEEV GUPTA

IBM Research Website: http://www.research.ibm.com

IBM Research - India: http://www.ibm.com/in/research

Page 42: SANJEEV GUPTA

Easy Web Browsing – UI technologies (1)

Easy operations and easy-to-use operation panel
– No URL input field, could only surf within specified domains.
– Operation Panel
  • Navigation (Home/Back/Stop)
  • Voice speed/volume
  • Zoom
  • Line Spacing
  • Ruler
  • Color setting
  • Print
  • Detail Setting
  • Help

Page 43: SANJEEV GUPTA

Easy Web Browsing – UI technologies (2)

• Read aloud with speed control
• Character enlarging (w/ screen magnifier)

Accessibility at IBM means enabling IT hardawa,

Page 44: SANJEEV GUPTA

Easy Web Browsing – UI technologies (3)

Background color change
– Color vision deficiency
– Cataract
– Weak sighted

1. Black text on a white background with blue for links for normal display

2. Yellow text on a blue background with white for links

3. Black text on a light yellow background with blue for links

4. Yellow text on a black background with white for links
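The four display patterns listed above can be written down as a small configuration table. This is only a sketch: the dictionary layout, the `scheme_to_css` helper and the choice of CSS colour names (for instance `lightyellow` for "light yellow") are assumptions, not how Easy Web Browsing stores its settings.

```python
# Scheme number -> colours for text, background and links.
COLOR_SCHEMES = {
    1: {"text": "black", "background": "white", "link": "blue"},
    2: {"text": "yellow", "background": "blue", "link": "white"},
    3: {"text": "black", "background": "lightyellow", "link": "blue"},
    4: {"text": "yellow", "background": "black", "link": "white"},
}


def scheme_to_css(number):
    """Render one colour scheme as a two-rule CSS snippet."""
    scheme = COLOR_SCHEMES[number]
    return (
        f"body {{ color: {scheme['text']}; "
        f"background-color: {scheme['background']}; }}\n"
        f"a {{ color: {scheme['link']}; }}"
    )


print(scheme_to_css(4))
```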

Page 45: SANJEEV GUPTA

Easy Web Browsing – UI technologies (4)

Automatic language switch (panel, TTS, etc.) according to the lang attribute of the Web page.
– Support for thirteen languages: Chinese (Simplified), Chinese Traditional (Taiwan, Hong Kong), English (US and UK), French, German, Italian, Japanese, Korean, Spanish, and Portuguese (Brazil, Portugal).

[Screenshots: English, Japanese]
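Switching language by the page's `lang` attribute can be sketched as follows. The voice names and the `pick_voice` helper are hypothetical; only the use of the `lang` attribute on the `<html>` element comes from the slide.

```python
from html.parser import HTMLParser

# Hypothetical mapping from primary language code to a TTS voice name.
VOICES = {"en": "english-voice", "ja": "japanese-voice", "fr": "french-voice"}


class LangFinder(HTMLParser):
    """Remember the lang attribute of the first <html> tag seen."""

    def __init__(self):
        super().__init__()
        self.lang = None

    def handle_starttag(self, tag, attrs):
        if tag == "html" and self.lang is None:
            self.lang = dict(attrs).get("lang")


def pick_voice(page_html, default="en"):
    """Choose a voice from the page language, falling back to the default."""
    finder = LangFinder()
    finder.feed(page_html)
    # "en-GB" and "en-US" both map to the primary code "en".
    code = (finder.lang or default).split("-")[0].lower()
    return VOICES.get(code, VOICES[default])


print(pick_voice('<html lang="ja"><body>...</body></html>'))
```

Pages without a `lang` attribute, or with a language missing from the table, fall back to the default voice.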

Page 46: SANJEEV GUPTA


Sensei : A web application for spoken language assessment

Page 47: SANJEEV GUPTA

Sensei

• An automated tool for assessing spoken English skills
  – Evaluates pronunciation, grammar, comprehension
  – Uses advanced speech processing techniques
  – Provides scores for each of the categories in real time

• The tool is Web enabled
  – Can be used for remote hiring/assessment
  – Can be used for training
  – Centralized database/content update

• Can help children learn English language

Page 48: SANJEEV GUPTA

Evaluation of Syllable Stress

• Lexical stress evaluation
  – Important for spoken English comprehension
    • Meaning changes with stress pattern (PROject, proJECT, conTENT, CONtent)
  – Different stress point for different words (aVAilable, INdustry)
• Primary features – pitch, duration & energy
• Challenges
  – Every word has a different stress pattern
  – Stress can also change depending upon context
  – Relative importance of the features varies for different words and speakers
  – Primary syllable can be inherently low in energy or short in duration

Acoustic Feature Used (Syllable level)           General behavior (if stressed)
1. Average Fundamental Frequency (F0)            Higher
2. Average energy                                Higher
3. Duration                                      Longer
4. Average filtered energy (above 4 kHz)         Higher
5. Average energy * duration                     Higher
6. F0 * duration                                 Higher
7. F0 ratio (of next syllable to current)        Lower
8. Energy ratio (of next syllable to current)    Lower

• Word Dependent Classifiers
  – A separate classifier is trained for each of the words
  – Performs better than the word independent models
• Single class classifiers
  – Estimating the multi-dimensional shape corresponding to the correct class (spanned only by the correct utterances)
• Word Independent Classifiers
  – Classify each individual syllable into stressed/unstressed
  – Combine soft decisions to determine correctness at word level

Accuracy with Human

• Human Assessors Repeatability is 86%
• Human Assessors Reproducibility is 64%
• Human Assessors Accuracy is 85%
• Sensei Accuracy is 81%
• Sensei Reproducibility is 100%
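The eight syllable-level features listed for stress evaluation can be computed as below once per-syllable averages are available. The `Syllable` container, the feature names and the sample numbers are assumptions for illustration; extracting F0 and energy from audio, and the classifiers themselves, are not covered.

```python
from dataclasses import dataclass


@dataclass
class Syllable:
    f0: float           # average fundamental frequency, Hz
    energy: float       # average energy
    high_energy: float  # average filtered energy above 4 kHz
    duration: float     # seconds


def stress_features(syllables, i):
    """Return the eight stress features for syllable i of a word."""
    cur = syllables[i]
    nxt = syllables[i + 1] if i + 1 < len(syllables) else None
    return {
        "f0": cur.f0,
        "energy": cur.energy,
        "duration": cur.duration,
        "high_energy": cur.high_energy,
        "energy_x_duration": cur.energy * cur.duration,
        "f0_x_duration": cur.f0 * cur.duration,
        # Ratios of the next syllable to the current one; undefined for the last syllable.
        "f0_ratio": nxt.f0 / cur.f0 if nxt is not None else None,
        "energy_ratio": nxt.energy / cur.energy if nxt is not None else None,
    }


# Invented values for a two-syllable word stressed on the first syllable ("PROject").
word = [
    Syllable(f0=220.0, energy=0.8, high_energy=0.3, duration=0.25),
    Syllable(f0=180.0, energy=0.5, high_energy=0.1, duration=0.15),
]
features = stress_features(word, 0)
```

For the stressed first syllable both ratios come out below 1, matching the "Lower" behaviour expected for features 7 and 8.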

Page 49: SANJEEV GUPTA

Evaluation of Spoken Grammar

• Evaluate spoken grammar skills of the candidate
  – Not possible to evaluate free speech – low recognition accuracy, LM bias
  – Prompts and answer (make it interactive)
  – Prompts designed to test various parameters
    • Tenses, articles, prepositions, subject-verb agreement
• Challenges
  – Correct and incorrect answers acoustically close to each other
  – Multiple correct answers are possible
  – Incomplete recordings (last word chopped), response outside speech grammar
  – Content challenges
    • Effectiveness of questions

Possible responses
• Both the dogs is barking (x)
• Both the dogs are barking (✓)
• Those dogs are barking (✓)
• Both the dogs were barking (✓)

Prompt: Both the dogs is barking

Correct answers:
• Both the dogs are barking
• Both the dogs were barking

[Diagram flow: Grammatically correct or incorrect sentence; Candidate records correct sentence; Speech Recognition; 1: assigned for correct sentence, 0: assigned for incorrect sentence]

Accuracy with Human

• Human Assessors Repeatability is 94%
• Human Assessors Reproducibility is 82%
• Human Assessors Accuracy is 95%
• Sensei Accuracy is 85%
• Sensei Reproducibility is 100%
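The 1/0 scoring of a recorded answer can be sketched as a comparison between the recognised text and the list of accepted answers. This assumes the speech recogniser has already produced a text transcript; the `normalise` and `grammar_score` helpers are hypothetical names, and the accepted answers are the two listed for the prompt above.

```python
import string


def normalise(sentence):
    """Lower-case, drop punctuation and collapse whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(sentence.lower().translate(table).split())


def grammar_score(recognised, correct_answers):
    """Return 1 if the recognised sentence matches an accepted answer, else 0."""
    accepted = {normalise(answer) for answer in correct_answers}
    return 1 if normalise(recognised) in accepted else 0


CORRECT = ["Both the dogs are barking", "Both the dogs were barking"]

print(grammar_score("both the dogs are barking.", CORRECT))
print(grammar_score("Both the dogs is barking", CORRECT))
```

An exact text match is the simplest possible rule; it does not address the challenge that correct and incorrect answers are acoustically close, which has to be handled in the recognition step.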

Page 50: SANJEEV GUPTA

Evaluation of Articulation

• Impact Sounds
  – S,sh, z,sh, v,w, t,d, AO (ball)
  – Correct pronunciation of words
• Different from speech recognition
  – Recognition should discard pronunciation variation
• Customization of acoustic models
  – Models trained from model speakers
  – US/UK models adapted to model speakers
  – Indian English models adapted to model speakers
• Features
  – Phone confidence scores
  – Word endings
  – Duration of phones
• Challenges
  – Subjectivity in human ratings
  – Lack of model speakers data (only few model speakers)
  – Other considerations in human rating – stress, fluency, etc.

[Plot: Selected Vs. Rejected Speakers; x-axis: confidence score per time; y-axis: fraction of total words decoded; series: selected, rejected]

[Figure labels: Decision Level Accuracy with Human; Accuracy with Human]

Page 51: SANJEEV GUPTA

Combined Scores by Sensei and Assessors

Correlation between individual assessors

             Assessor 1   Assessor 2   Assessor 3
Sensei       0.78         0.75         0.79
Assessor 1   -            0.89         0.93
Assessor 2                -            0.90
Assessor 3                             -

Correlation between Sensei and Average Assessor Score = 0.80

[Scatter plot: x-axis: Avg Assessor Score; y-axis: Sensei Score]

S = 0.3*Articulation + 0.2*SylStress + 0.5*Grammar
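The weighted combination above is straightforward to compute. The sketch below assumes each component score is already on a 0 to 1 scale, as the plot axes suggest; the function name and the sample component scores are made up for illustration.

```python
# Weights taken from the formula S = 0.3*Articulation + 0.2*SylStress + 0.5*Grammar.
WEIGHTS = {"articulation": 0.3, "syl_stress": 0.2, "grammar": 0.5}


def combined_score(articulation, syl_stress, grammar):
    """Weighted sum of the three component scores."""
    return (
        WEIGHTS["articulation"] * articulation
        + WEIGHTS["syl_stress"] * syl_stress
        + WEIGHTS["grammar"] * grammar
    )


# Hypothetical candidate: 0.3*0.8 + 0.2*0.5 + 0.5*0.9 = 0.79
score = combined_score(articulation=0.8, syl_stress=0.5, grammar=0.9)
```

Because the weights sum to 1, a candidate scoring 1 on every component gets a combined score of 1.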

Page 52: SANJEEV GUPTA


Sensei Assessment Tool

Page 53: SANJEEV GUPTA


Sensei Assessment Tool

Page 54: SANJEEV GUPTA

Social Accessibility

Social computing + accessibility

New methodology to make webpages on the Internet more accessible by gathering users’ voices and by using the power of the open community.

• Any visually impaired user can join the improvement process through various collaboration mechanisms.
• Any Web user can improve accessibility of any webpage on the Internet without changing the original content.

1) Encounter a problem in Web content.
2) Respond by using authoring tools.
3) Access the improved page!

Example metadata: alttext = “Yes we can”

[Diagram components: Visually impaired users, Social Accessibility Service, Sighted volunteers, Accessibility metadata, Report, Notification, Generate, Submit, Load]
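The central idea, improving a page through externally stored metadata without changing the original content, can be sketched as a lookup table keyed by page and image. The `AltTextFix` and `MetadataStore` classes, the URL and the file names are hypothetical; only the alt text “Yes we can” comes from the slide.

```python
from dataclasses import dataclass


@dataclass
class AltTextFix:
    """Metadata a sighted volunteer submits for one image on one page."""
    page_url: str
    image_src: str
    alt_text: str


class MetadataStore:
    """Keeps accessibility metadata separately from the pages themselves."""

    def __init__(self):
        self._fixes = {}

    def submit(self, fix):
        self._fixes[(fix.page_url, fix.image_src)] = fix.alt_text

    def load(self, page_url):
        return {src: alt for (url, src), alt in self._fixes.items() if url == page_url}


def describe_images(page_url, image_srcs, store):
    """What a screen reader could announce for each image on the page."""
    fixes = store.load(page_url)
    return [fixes.get(src, "(no description)") for src in image_srcs]


store = MetadataStore()
store.submit(AltTextFix("http://example.com/", "banner.png", "Yes we can"))
spoken = describe_images("http://example.com/", ["banner.png", "logo.png"], store)
```

The original page is never edited: the fix is loaded and applied on the user's side when the page is accessed.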