© 2009 IBM Corporation
“Addressing the issues of KM for citizens (disabled, rural, women and others) on the other side of the digital divide of the country”
SANJEEV GUPTA
IBM Human Ability & Accessibility Center
IBM Research - India
What is Accessibility?
Access to information technology regardless of ability or disability
“Accessibility – which started out as a philanthropic effort – has now evolved to a business transformation effort for IBM and our clients.”
Sam Palmisano, IBM CEO, April 27, 2004
From compliance to societal transformation...
IBM Vision on Accessibility
IBM’s vision is to enhance human capabilities through technological innovation so that societal participation and personal fulfillment can be maximized, regardless of age or ability….
Accessibility is not about “them”, it’s about ALL of us…
Accessibility in India – Broader Landscape
Factors affecting accessibility
– Physical disability
  • About 60 million people with disability
  • 42.5% of the disabled population comprises women
  • 75% of persons with disabilities live in rural areas
– Educational/Economic disability
  • About 70% of India’s population lives in villages
  • A large percentage of them are poor
  • Literacy is mainly limited to writing their names, making signatures, and reading large hoardings
– Low internet penetration
  • Computer penetration is still very low
  • People in remote areas are not very comfortable using computers
Addressing India Accessibility Issues
IBM Research Innovation: bridging the digital divide
– Making Web Accessible: EWB, WebAdapt2Me, aDesigner
– Novel ways for Information sharing: Hindi Speech Recognition, Indian English TTS, Telecom Web / WAV
Accessibility Solutions @ IBM Research
The Spoken Web
Introducing VoiceSites
A VoiceSite is:
• A voice-driven application hosted in the network and created by subscribers themselves
• Consists of a set of interconnected VoicePages (e.g., VXML files)
• Accessed by calling the associated phone number and interacting with its underlying application flow through a telephony interface
• Analogous to websites in the World Wide Web
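As a rough illustration of "a set of interconnected VoicePages", the sketch below models a VoiceSite as named pages whose links map a caller's choice to the next page. The class names, the `walk` helper, the page names and the placeholder phone number are all assumptions made for this example; the deck does not describe the real data model, and actual VoicePages would be VXML files.

```python
from dataclasses import dataclass, field


@dataclass
class VoicePage:
    """One page of a VoiceSite (in practice this could be a VXML file)."""
    name: str
    prompt: str
    # Maps a caller's spoken or keyed choice to the name of the next page.
    links: dict = field(default_factory=dict)


@dataclass
class VoiceSite:
    """A set of interconnected VoicePages reached through one phone number."""
    phone_number: str
    start_page: str
    pages: dict = field(default_factory=dict)

    def add_page(self, page: VoicePage) -> None:
        self.pages[page.name] = page

    def walk(self, choices):
        """Follow the caller's choices from the start page; return prompts heard."""
        current = self.pages[self.start_page]
        heard = [current.prompt]
        for choice in choices:
            next_name = current.links.get(choice)
            if next_name is None:  # unrecognised choice: stay on this page
                continue
            current = self.pages[next_name]
            heard.append(current.prompt)
        return heard


def build_plumber_site() -> VoiceSite:
    """Hypothetical site loosely based on the plumber example in this deck."""
    site = VoiceSite(phone_number="0000000000", start_page="welcome")  # placeholder number
    site.add_page(VoicePage("welcome", "Hi, my name is Sam, and I am a plumber.",
                            {"hours": "hours", "charges": "charges"}))
    site.add_page(VoicePage("hours", "9 am to 7 pm", {"back": "welcome"}))
    site.add_page(VoicePage("charges", "I charge 5 dollars an hour", {"back": "welcome"}))
    return site
```

Calling `build_plumber_site().walk(["hours"])` returns the welcome prompt followed by the working hours, much as a caller would hear them in sequence.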
Call VoiServ to create VoiceSite
VoiServ: Please say your name
Caller: Sam
VoiServ: Please specify your profession
Caller: Plumber
VoiServ: Please record your welcome message
Caller: Hi my name is Sam, and I am a plumber. Please find information regarding my services on my VoiceSite.
VoiServ: Please enter your working hours
Caller: 9 am to 7 pm
VoiServ: Please specify your service charges
Caller: I charge 5 dollars an hour
VoiServ: Would you like to offer appointment scheduling services?
Caller: Yes
VoiServ: Would you like to publish your information in yellow pages?
Caller: Yes
VoiServ: Do you accept jobs while you are away from your home location?
Caller: Yes
VoiServ: Please say your home location
Caller: South Delhi
VoiServ: Would you like to provide some references for your work?
Caller: Yes
VoiServ: Please say the name and phone number of your references
Caller: You can talk to Jack about my work. His number is 41292100
VoiServ: You have now created your voice site successfully. Users can now access your voice site through your phone number. Thank you for using this system.
[Figure: VoiGen architecture. Components: VoiGen, VoiceSites, Database, Calendaring Service, Location Tracker, Presence Server, IMS, Yellow Page Service, Yellow Pages Server, WWW Yellow Pages Website]
Small user study – Plumber VoiceSite
Methodology for survey
• Electricians/Plumbers/Carpenters make a call to VoiGen and create their voice sites
• We ask the subjects about the usability of the VoiGen system
• 12 subjects surveyed for technology validation
• 10 were able to create the voice site successfully (within 4 minutes)
• There were usability issues with respect to conversation flow and speech recognition accuracy
• Everyone realised that this technology can have tremendous impact
• Since this technology does not require the end-user to bear any costs in terms of devices, it has a low acceptance barrier
[Figure: Carpenters/Electricians make a call to VoiGen to generate their voice sites; the voice sites are automatically deployed in the system; people call these voice sites to schedule time with the specialists]
What is the Spoken Web?
The Spoken Web is a world wide web in the telecom network, where people can host and browse VoiceSites, traverse VoiLinks, even conduct business transactions, all just by talking over the existing telephone network.
The Spoken Web will interoperate with the existing WWW.
The Spoken Web will interoperate with Next Generation Networks too.
Spoken Web enables multiple business opportunities
• New source of revenue opportunity for telecom operators
  – Creation and hosting of voicesites
  – Payments and financial transactions
• SMBs and microbusinesses can leverage the T-Web
• Examples
  – Microbusiness VoiceSite
  – VoiceSite Personalisation
  – Rural VoiKiosk
“Anyone with a mobile handset can become a T-Web enabled microbusiness voicesite owner and accessor, and also conduct transactions on the T-Web”
SpokenWeb Andhra Pilot Statistics
Pilot Launch: May 23, 2008
Report Summary (ended on Jan 28, 2009)
• Total number of calls received = 114782
• Number of unique callers = 6509
• Total time spent = 2135 hours
• Average call time spent = 0 hours, 1 min, and 14 seconds
• Maximum call duration = 0 hours, 49 min, and 40 seconds
• Minimum call duration = 0 hours, 0 min, and 0 seconds
• Number of calls to Ashwini Center = 8399
• Number of calls to Health Center = 14216
• Number of calls to V-Agri = 13881
• Number of calls to Professional Services = 37112
© 2006 IBM Corporation
WAV: Web Access through Voice
Motivation
• Web is a rich source of useful local information
  – Weather, travel, entertainment, insurance, finance
• However, a significant population (especially in emerging countries) is not using this information
  – Due to computer skills, exposure to browsing, language skills, physical limitations, aging
• A large number of such people have access to a phone (landline/mobile)
  – Growing at a fast rate
• Even computer users can’t browse in several conditions
  – On the move, no-connectivity areas, low speed, etc.
Proposition
• Decouple web information from web browsing
  – Let people access web information without having to browse or know how to browse
  – However, still leverage the web interface
    • No change required on the website/content provider side
• Let the system browse instead of the user
  – System figures out how to extract the information from the web for a user’s query
• The interaction can be enabled in the user’s language for simple queries (structured input/output) through speech recognition and language translation
Scenarios
• A person wants to go from station A to B. He wants to know which trains are available, their schedule, availability, etc. He has only a phone and is not familiar with the web.
  – Access a relevant website (e.g., indianrail.gov.in)
    • Get the required inputs from the user: source, destination, class, dates, etc.
    • Fetch the information from the web and give it back to the user
• A person wants to know the interest rates offered by various banks for home loans
  – System can go to a popular website (e.g., apnaloan.com)
    • Get the required inputs from the user: term, floating, fixed
    • Fetch the information from the web and give it back to the user: lowest interest rate offered
• A person is planning a trip to Chennai and wants to know the current weather there
  – Go to cnn.com
    • Fill up the form
    • Speak the weather over phone in local language
  – Go to Google
    • Get the weather information and reply back to the user
What is currently available?
• Web browsers on mobile phones
  – Person can browse the web on a handheld device
    • Costly, complicated, tiny interface, inaccessible, not suited for the common man
• Browsing of voice sites
  – Created from scratch using VXML
• Speech interface to specific services (TellMe, Nuance)
  – Nearest restaurant, police station, hospitals, etc.
    • Based on a knowledge base created offline, OR
    • Proprietary tie-up with content providers to have access to databases
    • Predefined, keyword based
A third-party data provider gathers the business information that Tellme provides in the Tellme download and on 1-800-555-TELL, so we are unable to directly add or correct specific business information. If you would like to add or correct information that is listed for your business, please use the easy form on the InfoUSA website. (Taken from TellMe Website: http://www.tellme.com/you/faqs)
Proposed approach
[Figure: proposed architecture. Components: Voice/DTMF input, ASR/TTS, Dialogue Component, Request Generator, Response Generator, World Wide Web]
How does it work?
[Figure: a Web Site offering Service1 to Service4; a Request Process Generator backed by browser scripting tools such as Co-scripter; a Response Process Generator backed by Information Extraction tools]
Request Generation & Execution
• Leveraging the browser interface through scripts
  – Generate a script with inputs taken from the user
  – Execute the script with a browser
[Figure: Input Collection (VXML), Data for Script, Scripting Tool, Script, Web Browser, Web-Page Inputs]
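The "generate a script with inputs taken from the user" step can be pictured as filling a template. The sketch below is only an illustration under stated assumptions: the step wording is invented and is not the syntax of Co-scripter or any other real scripting tool, the field names are hypothetical, and executing the script in a browser is left out.

```python
from string import Template

# Hypothetical script template; the step wording is illustrative only and is
# not the syntax of Co-scripter or any other real scripting tool.
TRAIN_QUERY_TEMPLATE = Template(
    "go to $site\n"
    "enter $source into the source field\n"
    "enter $destination into the destination field\n"
    "enter $date into the date field\n"
    "click the search button\n"
)

# Inputs the dialogue component is assumed to collect from the caller.
REQUIRED_INPUTS = ("source", "destination", "date")


def generate_script(site, user_inputs):
    """Fill the template with the inputs collected from the caller."""
    missing = [key for key in REQUIRED_INPUTS if not user_inputs.get(key)]
    if missing:
        # A real dialogue component would go back and ask the caller again.
        raise ValueError("missing inputs: " + ", ".join(missing))
    fields = {key: user_inputs[key] for key in REQUIRED_INPUTS}
    return TRAIN_QUERY_TEMPLATE.substitute(site=site, **fields)
```

For the train scenario, `generate_script("indianrail.gov.in", {"source": "Delhi", "destination": "Chennai", "date": "28 Jan 2009"})` yields a five-step script that a browser automation tool could then replay.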
Information Extraction & Response
[Figure: the Web-Page’s HTML Source, together with Relevant Keywords/Semantics and Syntax, feeds the Information Extraction Module, which returns a Response to the User]
• Use HTML syntax and semantics to extract information
  – Look in HTML sections using syntax knowledge
  – Use semantics based on context and keywords
Keywords: Airline / Lowest / Cheapest / Prices
HTML Syntax: TABLE, ROW-COLUMN (<TR><TD>)
Top three cheapest flights are:
  Go Air 4435 Rs at 5:05 AM
  Deccan 4449 Rs at 4:15 AM
  Spice Jet 4859 Rs at 8:00 AM
[Figure labels: Request Generation & Execution; User Interaction (Iterative)]
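A minimal sketch of the syntax side of this idea, using only Python's standard `html.parser`: walk the TABLE / TR / TD structure, keep rows that look like flight entries, and report the cheapest ones. The assumed column layout (airline, price in Rs, departure time) and the function names are illustrative; the deck does not specify how the extraction module is actually implemented, and the keyword and semantics side is not modelled here.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def cheapest_flights(html, count=3):
    """Return up to `count` (airline, price, time) tuples, cheapest first."""
    parser = TableRows()
    parser.feed(html)
    flights = []
    for row in parser.rows:
        # Assumed column layout: airline, price in Rs, departure time.
        if len(row) == 3 and row[1].isdigit():
            flights.append((row[0], int(row[1]), row[2]))
    flights.sort(key=lambda flight: flight[1])
    return flights[:count]


def spoken_response(flights):
    """Build the text that a TTS engine would read back to the caller."""
    lines = ["Top %d cheapest flights are:" % len(flights)]
    for airline, price, time in flights:
        lines.append("%s %d Rs at %s" % (airline, price, time))
    return "\n".join(lines)
```

Header rows built from `<th>` cells are skipped automatically because only `<td>` cells are collected.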
©2008 IBM Corporation
IBM Easy Web Browser
Overview
• Having difficulty viewing Web pages?
  – Easy Web Browsing is a solution that helps bridge the digital divide for novice computer users, people who are experiencing vision loss, second-language learners, seniors, and persons with reading challenges
• Highlights
  – Installs by automatically downloading from Web site.
  – Reads text aloud with adjustable speed and volume control.
  – Allows users to customize size and color of Web content.
  – Ruler function that helps users find and follow their reading position.
  – Highlight function focusing on the reading text with four patterns of marking.
  – Customizable line and word spacing features that enhance readability.
Easy Web Browsing
IBM Easy Web Browsing display on a client's personal computer
Summary
• Accessible web sites are required for PWDs, but they also assist seniors, novices and non-native speakers.
• The financial, retail, travel and government industries are interested in Web and kiosk accessibility.
• IBM’s EWB is a quick and reasonable solution to make web sites and kiosks more accessible for consumers, citizens and travelers.
• It also brings new opportunities through combined offerings.
• Both the customers and the end users are satisfied with this solution.
©2010 IBM Corporation
Reading Companion
IBM’s multi-million-dollar investment in literacy, using voice recognition technology over the web to help children and adults learn to read.
– Anytime, anywhere web access, providing feedback and as-needed assistance
– More than 1,380 schools and nonprofit organizations (about half of which are schools) in 25 countries and approximately 56,200 users are participating in this grant program.
– 850 schools & nonprofit organizations in 26 countries, benefitting more than 40,000 children and adults
– Evaluation showed:
  • Child: higher test scores on word recognition and reading comprehension
  • Adult: increased English communication skills and literacy; positive job outcomes for some learners
“Reading Companion has opened new cultural horizons for our children. With such a wide choice of books to increase their vocabulary and improve their comprehension skills, they’re developing a true love for reading.”
Patricia Diaz Covarrubias, Executive Director, Christel House de Mexico, A.C.
readingcompanion.org
IBM: Employing Diversity & Excellence
Meet IBMer Dimitri Kanevsky:
• Deaf
• Master Inventor in IBM Research
• 2002 Science Accomplishment for Maximization Algorithms
• Generated 80 IBM patents
Meet IBMer Mike Squillace:
• Blind
• Joined IBM in 2002
• Sun Certified Java Programmer
• PhD in Philosophy and B.S in Computer Science
• Developed Patents for multiple GUI architectures and defining GUIs via mark up languages & reflection
IBM: Employing Diversity & Excellence
Meet IBMer Chieko Asakawa:
• Blind
• Joined IBM Research in 1985
• An IBM Fellow
• Member of Women in Technology Hall of Fame
• Developed Digital Braille System & 3 key applications
IBM: Employing Diversity & Excellence
Recognition by the Hon. President of India in 2007 & 2009 for providing technology for people on the other side of the digital world to make a complete knowledge society.
Thank You
aDesigner
• Characteristics
  - Visualization of blind usability
  - Simulation of low-vision users’ view: weak eyesight, color vision deficiency, cataracts
  - Checking compliance items: WCAG, Section 508, IBM CI162, JIS, etc.
• Award
  - Wall Street Journal Technology Innovation Award 2004 (Runner-up)
• Status
  - Open-sourced as a basis of Eclipse.org ACTF (Accessibility Tools Framework)
Blind Usability Visualization Example
[Figure: visualizations of the same page: Original; Inaccessible; With skip-link; With heading tags]
• With skip-link: easy to find main contents
• With heading tags: headers can be used as a TOC; easy to navigate through the page
Low Vision Simulation
Low vision simulation. In this example, Color Vision Deficiency (Deutan) and cataract are simulated.
Problem map that indicates the positions of problems.
The original Web page which people without low vision view.
Simulating the experience of users who have low vision
Summary Report
Setting panel (eyesight, color vision deficiencies, crystalline lens transparency)
IBM Research Overview: Famous for its science and vital to IBM
Research labs: 1st IBM Research Lab in NY (Columbia U) 1945; San Jose, California 1952; Zürich 1955; Watson 1961; Haifa 1972; Tokyo 1982; Almaden 1986; Austin 1995; Beijing 1995; Delhi 1998
[Figure: evolution of IBM Research across the 1970's, 1980's, 1990's and 2000's, under the tagline "Innovation that Matters"]
• Centrally funded
  ♦ Corporate funded research agenda
  ♦ Technology transfer
• Joint programs
  ♦ Collaborative team
  ♦ Shared agenda
  ♦ Effectiveness
• Research in the marketplace
  ♦ Work on customer problems
• eBusiness research
  ♦ Create business advantage for customers
Other labels in the figure: FOAK (First of a Kind); EBO (Emerging Business Opportunities); ODIS (On Demand Innovation Services); Technology Transfer; Business, Technology, Society; New Insights
History of Innovations: 60 years of stellar research
Innovations: 10 Years of India Research Lab
Technical Competencies
• Computer Science
  – Distributed Systems: system mgmt., middleware
  – Information Management: data mining, machine learning
  – Interaction Technologies: speech
  – Programming Technologies: parallel and hi-perf. prog.
  – Software Engineering: model-driven, distributed dev.
• Math Science
  – Operations Research
  – Algorithms
  – Optimization
  – Game Theory
• Service Science
  – Service Engineering
  – Service Productivity
  – Service Management
  – Service Quality
  – Service Supply Chains
Business Areas and Focus Areas
• Service Delivery: Infrastructure Services, Application Services, Contact Center Services
• Emerging Solutions: Telecom, Others (Banking, etc.)
• Software
• Systems
IBM Research Website: http://www.research.ibm.com
IBM Research - India: http://www.ibm.com/in/research
Easy Web Browsing – UI technologies (1)
Easy operations and easy-to-use operation panel
– No URL input field; users can only surf within specified domains.
– Operation Panel
  • Navigation (Home/Back/Stop)
  • Voice speed/volume
  • Zoom
  • Line Spacing
  • Ruler
  • Color setting
  • Print
  • Detail Setting
  • Help
Easy Web Browsing – UI technologies (2)
• Read aloud with speed control
• Character enlarging (with screen magnifier)
Accessibility at IBM means enabling IT hardawa,
Easy Web Browsing – UI technologies (3)
Background color change
– Color vision deficiency
– Cataract
– Weak sighted
1. Black text on a white background with blue for links for normal display
2. Yellow text on a blue background with white for links
3. Black text on a light yellow background with blue for links
4. Yellow text on a black background with white for links
Easy Web Browsing – UI technologies (4)
Automatic language switch (panel, TTS, etc.) according to the lang attribute of the Web page.
– Support for thirteen languages: Chinese (Simplified), Chinese Traditional (Taiwan, Hong Kong), English (US and UK), French, German, Italian, Japanese, Korean, Spanish, and Portuguese (Brazil, Portugal).
[Figure: the operation panel shown in English and in Japanese]
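A small sketch of how a tool could pick its panel language from a page's `lang` attribute, using Python's standard `html.parser`. The two-letter codes for the supported languages, the English fallback, and the function name are assumptions for illustration; the deck does not say how Easy Web Browsing performs this step.

```python
from html.parser import HTMLParser

# Primary language subtags for the languages listed on the slide (assumed codes).
SUPPORTED = {"zh", "en", "fr", "de", "it", "ja", "ko", "es", "pt"}
DEFAULT_LANGUAGE = "en"  # assumption: fall back to English when unsure


class _LangFinder(HTMLParser):
    """Remember the lang attribute of the first <html> tag encountered."""

    def __init__(self):
        super().__init__()
        self.lang = None

    def handle_starttag(self, tag, attrs):
        if tag == "html" and self.lang is None:
            self.lang = dict(attrs).get("lang")


def panel_language(page_html):
    """Return the language code the panel and TTS should switch to."""
    finder = _LangFinder()
    finder.feed(page_html)
    if not finder.lang:
        return DEFAULT_LANGUAGE
    # "en-GB" and "pt-BR" reduce to their primary subtags "en" and "pt".
    primary = finder.lang.strip().lower().split("-")[0]
    return primary if primary in SUPPORTED else DEFAULT_LANGUAGE
```

A fuller version would keep the region subtag so that, for example, US and UK English or Brazilian and European Portuguese voices could be told apart.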
Sensei : A web application for spoken language assessment
Sensei
An automated tool for assessing spoken English skills
– Evaluates pronunciation, grammar, comprehension
– Uses advanced speech processing techniques
– Provides scores for each of the categories in real time
The tool is Web enabled
– Can be used for remote hiring/assessment
– Can be used for training
– Centralized database/content update
Can help children learn English language
Evaluation of Syllable Stress
Lexical stress evaluation
– Important for spoken English comprehension
  • Meaning changes with stress pattern (PROject, proJECT, conTENT, CONtent)
– Different stress point for different words (aVAilable, Industry)
Primary features: pitch, duration & energy
Challenges
– Every word has a different stress pattern
– Stress can also change depending upon context
– Relative importance of the features varies for different words and speakers
– Primary syllable can be inherently low in energy or short in duration
Acoustic feature used (syllable level): general behavior if stressed
1. Average fundamental frequency (F0): higher
2. Average energy: higher
3. Duration: longer
4. Average filtered energy (above 4 kHz): higher
5. Average energy * duration: higher
6. F0 * duration: higher
7. F0 ratio (of next syllable to current): lower
8. Energy ratio (of next syllable to current): lower
Word Dependent Classifiers
– A separate classifier is trained for each of the words
– Performs better than the word independent models
Single class classifiers
– Estimating the multi-dimensional shape corresponding to the correct class (spanned only by the correct utterances)
Word Independent Classifiers
– Classify each individual syllable into stressed/unstressed
– Combine soft decisions to determine correctness at word level
Accuracy with Human
• Human assessors’ repeatability is 86%
• Human assessors’ reproducibility is 64%
• Human assessors’ accuracy is 85%
• Sensei accuracy is 81%
• Sensei reproducibility is 100%
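One way to picture the word-independent approach (classify each syllable, then combine soft decisions at word level) is sketched below. It assumes some syllable classifier has already produced a probability of "stressed" for each syllable; the averaging rule and the 0.5 threshold are illustrative choices, not the combination method Sensei actually uses, which the deck does not spell out.

```python
def word_stress_score(stress_probs, expected_pattern):
    """Average agreement between per-syllable soft decisions and the expected pattern.

    stress_probs: probability that each syllable is stressed (0.0 to 1.0),
                  assumed to come from a syllable-level classifier.
    expected_pattern: True where the reference pronunciation is stressed.
    """
    if not stress_probs or len(stress_probs) != len(expected_pattern):
        raise ValueError("need one probability per expected syllable")
    agreement = [
        prob if expected else 1.0 - prob
        for prob, expected in zip(stress_probs, expected_pattern)
    ]
    return sum(agreement) / len(agreement)


def is_stress_correct(stress_probs, expected_pattern, threshold=0.5):
    """Word-level decision; the threshold is an assumed, tunable value."""
    return word_stress_score(stress_probs, expected_pattern) >= threshold
```

For the noun "PROject" the expected pattern is `[True, False]`: soft decisions of `[0.9, 0.2]` agree with it, while `[0.3, 0.8]` look like the verb "proJECT" and are rejected.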
Evaluation of Spoken Grammar
Evaluate spoken grammar skills of the candidate
– Not possible to evaluate free speech: low recognition accuracy, LM bias
– Prompts and answer (make it interactive)
– Prompts designed to test various parameters
  • Tenses, articles, prepositions, subject-verb agreement
Challenges
– Correct and incorrect answers acoustically close to each other
– Multiple correct answers are possible
– Incomplete recordings (last word chopped), response outside speech grammar
– Content challenges
  • Effectiveness of questions
Possible responses
• Both the dogs is barking (x)
• Both the dogs are barking ()
• Those dogs are barking ()
• Both the dogs were barking ()
[Figure: the candidate hears a grammatically correct or incorrect sentence and records the correct sentence; speech recognition assigns 1 for a correct sentence and 0 for an incorrect sentence]
Prompt: Both the dogs is barking
Correct answers: Both the dogs are barking; Both the dogs were barking
Accuracy with Human
• Human assessors’ repeatability is 94%
• Human assessors’ reproducibility is 82%
• Human assessors’ accuracy is 95%
• Sensei accuracy is 85%
• Sensei reproducibility is 100%
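The 1/0 scoring rule on this slide can be sketched as a lookup of the recognised sentence in the set of accepted answers. The sketch assumes the speech recogniser has already produced a text transcript; the normalisation step and the function names are illustrative and not taken from the deck.

```python
import re


def normalize(sentence):
    """Lower-case and strip punctuation so equivalent transcripts compare equal."""
    return " ".join(re.findall(r"[a-z']+", sentence.lower()))


def grammar_score(recognized, correct_answers):
    """Return 1 if the recognised response matches any accepted answer, else 0."""
    accepted = {normalize(answer) for answer in correct_answers}
    return 1 if normalize(recognized) in accepted else 0
```

With the slide's prompt, `grammar_score("Both the dogs are barking.", ["Both the dogs are barking", "Both the dogs were barking"])` gives 1, while repeating the prompt "Both the dogs is barking" gives 0.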
Evaluation of Articulation
[Figure: Selected vs. Rejected Speakers. Horizontal axis: confidence score per time; vertical axis: fraction of total words decoded; series: selected, rejected]
Impact Sounds
– S,sh, z,sh, v,w, t,d, AO (ball)
– Correct pronunciation of words
Different from speech recognition
– Recognition should discard pronunciation variation
Customization of acoustic models
– Models trained from model speakers
– US/UK models adapted to model speakers
– Indian English models adapted to model speakers
Features
– Phone confidence scores
– Word endings
– Duration of phones
Challenges
– Subjectivity in human ratings
– Lack of model speakers data (only few model speakers)
– Other considerations in human rating: stress, fluency, etc.
[Figures: Decision Level Accuracy with Human; Accuracy with Human]
Combined Scores by Sensei and Assessors
Correlation between individual assessors
             Assessor 1   Assessor 2   Assessor 3
Sensei       0.78         0.75         0.79
Assessor 1   -            0.89         0.93
Assessor 2                -            0.90
Assessor 3                             -
Correlation between Sensei and Average Assessor Score = 0.80
[Figure: scatter plot of Sensei Score against Avg Assessor Score, both axes from 0 to 1]
S = 0.3*Articulation + 0.2*SylStress + 0.5*Grammar
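The weighted sum above is simple to compute. The sketch below uses the weights from the slide and adds a plain Pearson correlation for comparing tool scores with assessor scores. Two assumptions are made: the three component scores lie on a common 0 to 1 scale, and the correlations reported on the slide are Pearson correlations; the deck states neither.

```python
import math

# Weights taken from the slide's formula.
WEIGHTS = {"articulation": 0.3, "syl_stress": 0.2, "grammar": 0.5}


def combined_score(articulation, syl_stress, grammar):
    """S = 0.3*Articulation + 0.2*SylStress + 0.5*Grammar."""
    return (WEIGHTS["articulation"] * articulation
            + WEIGHTS["syl_stress"] * syl_stress
            + WEIGHTS["grammar"] * grammar)


def pearson(xs, ys):
    """Pearson correlation between two equally long score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)
```

Because grammar carries half of the weight, a candidate with perfect grammar but weak articulation can still outscore one with the opposite profile.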
Sensei Assessment Tool
Sensei Assessment Tool
Social Accessibility
Social computing + accessibility
New methodology to make webpages on the Internet more accessible by gathering users’ voices and by using the power of the open community.
• Any visually impaired user can join the improvement process through various collaboration mechanisms.
• Any Web user can improve accessibility of any webpage on the Internet without changing the original content.
[Figure: workflow between visually impaired users, the Social Accessibility Service, and sighted volunteers, with the labels Report, Notification, Generate, Submit and Load around the accessibility metadata]
1) Encounter a problem in Web content.
2) Respond by using authoring tools (e.g., alttext = “Yes we can”).
3) Access the improved page!
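The idea of improving a page "without changing the original content" can be illustrated by keeping volunteer-supplied metadata separate and merging it in when the page is loaded. In this sketch a page is reduced to a list of element dictionaries and the metadata to a mapping from element id to alternative text; both structures and the function name are assumptions made for the example, not the service's actual format.

```python
import copy


def apply_metadata(elements, metadata):
    """Return a copy of the page elements with volunteer alt text merged in.

    elements: list of dicts describing page elements (assumed structure).
    metadata: mapping from element id to alternative text (assumed structure).
    The original list is left untouched, mirroring the "original content
    is not changed" property described on the slide.
    """
    improved = copy.deepcopy(elements)
    for element in improved:
        alt_text = metadata.get(element.get("id"))
        if alt_text and not element.get("alt"):
            element["alt"] = alt_text
    return improved
```

A screen reader would then announce the merged text, for instance "Yes we can", instead of skipping an unlabeled image.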