My talk for Tsinghua University Alumni in Seattle area for centennial celebration tech talk
Semantic Wikis
Social Semantic Web In Action
2011-03-25
Specially prepared for Tsinghua University alumni in the greater Seattle area for the centennial celebration
About Me: Jesse Wang 王(嘉)欣
Timeline graphic (years shown: 1996, 1997, 2005, 1988, 1998)
Who is Vulcan
What does Vulcan do
Vulcan Inc. was established in 1986 by investor and philanthropist Paul G. Allen, co-founder of Microsoft, to manage his business and philanthropic efforts. Allen is chairman of Vulcan and his sister, Jody Allen, is president and CEO.
It all began with a vision…
Now the Vision Continues as Project Halo
Project Halo is a staged, long-range research effort by Vulcan Inc. towards the development of a "Digital Aristotle": a reasoning system capable of answering novel questions and solving advanced problems in a broad range of scientific disciplines and related human affairs. The project focuses on creating two primary functions: a tutor capable of instructing and assessing students in those subjects, and a research assistant with broad, interdisciplinary skills to help scientists and others in their work.
Automatic Question Answering System
Project Halo’s Focus Areas
AURA
• Automated User-Centered Reasoning and Acquisition System
• Textbook you can talk to
SILK
• Semantic Inference with Large Knowledge-base
• Non-monotonic rule system / RIF
SMW+
• Semantic MediaWiki +
• Knowledge authoring with SMEs
Plus other related semantic technologies and commercial efforts
Question Interpretation
Advanced Reasoning
Knowledge Acquisition
Project Halo’s Goals
Address the core problems in Knowledge Bases
– scale
– brittleness
Have high impact
Chart: KB effort (cost, people, …) vs. KB size (number of assertions, complexity…); labels: Vulcan, Now, Future
Crowdsourcing for Better Knowledge Acquisition
Wiki as a Crowdsourcing Tool
Consensus
This distinguishes wikis from other publication tools
Consensus in Wikis Comes from
Collaboration
– ~17 edits/page on average in Wikipedia (with high variance)
– Wikipedia’s Neutral Point of View
Convention
– Users follow customs and conventions to engage with articles effectively
Software Support Makes Wikis Successful
Trivial to edit by anyone
Tracking of all changes, one-step rollback
Every article has a “Talk” page for discussion
Notification facility allows anyone to “watch” an article
Sufficient security on pages, logins can be required
A hierarchy of administrators, gardeners, and editors
Software bots recognize certain kinds of vandalism and auto-revert, or recognize articles that need work, and flag them for editors
Success of Wikis
One of humanity’s greatest inventions
Wikis are Great, But…
Wiki Clock?
How About Hidden Goodies in the Wiki?
Wikipedia has articles about…
• … all cities
• … their populations
• … their mayors
• … the skyscrapers
So can I ask for a list of the world’s 5 largest cities with a female mayor?
Or skyscrapers in Shanghai with 50+ floors and built after 2000?
Enter Semantics…
To answer questions like:
• The female mayors of the top 10 cities, sorted by population, starting year, age…
• All skyscrapers in China (Japan, Thailand,…) of 50 (40/60/70) floors or more, and built in year 2000 (2001/2002) and after, sorted by built year, floors…, grouped by cities, regions…
• Median (average) base annual salary of CEOs of Fortune 100 companies in America (Europe, Asia,…)
• All Porsche vehicles made in Germany that accelerate from 0-100 km/h in less than 4 seconds
• Sci-Fi movies made after year 2000 that cost less than $10M and gross more than $30M
• A map showing where all Mercedes-Benz vehicles are manufactured
• And many more
What is a Semantic Wiki
A wiki that has an underlying model of the knowledge described in its pages.
To allow users to make their knowledge explicit and formal
Semantic Web compatible
Semantic Wiki
Two Perspectives
Wikis for Metadata
Metadata for Wikis
Characteristics of Semantic Wikis
Semantic Wikis
List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn - Ruby on Rails-based
KiWi – Knowledge in a Wiki
Knoodl – Semantic Collaboration tool and application platform
Metaweb - the software that powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki - an extension to MediaWiki that turns it into a semantic wiki
Swirrl - a spreadsheet-based semantic wiki application
TaOPis - has a semantic wiki subsystem based on Frame logic
TikiWiki CMS/Groupware integrates Semantic links as a core feature
zAgile Wikidsmart - semantically enables Confluence
Basics of Semantic Wikis
Still a wiki, with regular wiki features
– Category/Tags, Namespaces, Title, Versioning, ...
Typed Content (built-ins + user created, e.g. categories)
– Page/Card, Date, Number, URL/Email, String, …
Typed Links (e.g. properties)
– “capital_of”, “contains”, “born_in”…
Querying Interface Support
– E.g. “[[Category:Member]] [[Age::<30]]” (in SMW)
SMW Markup Syntax
[[Property::Value | Display]]
Tsinghua is a university located in [[Has location::Beijing]], with
[[Has population::27,000]] students.
In page "Property:Has location":
[[Has type::Page]]
In page "Property:Has population":
[[Has type::number]]
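The markup above can be read as subject/property/value triples attached to the page. As a rough illustration only (this is not how SMW itself parses wikitext), here is a minimal Python sketch that pulls `[[Property::Value | Display]]` annotations out of a page. The names `ANNOTATION`, `extract_annotations` and `render` are invented for this example.

```python
import re

# Matches [[Property::Value]] and [[Property::Value | Display]].
# Plain links such as [[Beijing]] or [[Category:Cities]] contain no "::" and are skipped.
# This regex is an illustrative assumption, not SMW's real parser.
ANNOTATION = re.compile(
    r"\[\[([^\[\]|:]+)::([^\[\]|]+?)\s*(?:\|\s*([^\[\]]*?)\s*)?\]\]"
)


def extract_annotations(page_title, wikitext):
    """Return (subject, property, value) triples found in the wikitext."""
    triples = []
    for match in ANNOTATION.finditer(wikitext):
        prop = match.group(1).strip()
        value = match.group(2).strip()
        triples.append((page_title, prop, value))
    return triples


def render(wikitext):
    """Return what a reader sees: the display text if given, else the value."""
    def replace(match):
        display = match.group(3)
        return display if display is not None else match.group(2).strip()
    return ANNOTATION.sub(replace, wikitext)


text = ("Tsinghua is a university located in [[Has location::Beijing]], with "
        "[[Has population::27,000]] students.")
triples = extract_annotations("Tsinghua", text)
rendered = render(text)
```

Here `triples` holds the two facts stated on the Tsinghua page, and `rendered` is the sentence as a reader would see it, with the markup replaced by the values.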
Define Classes
On page Beijing
One possible solution:
– Beijing is a [[Is a::city]]
Beijing is a city in [[Has country::China]], with population [[Has population::2,200,000]].
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance.
The Jin Mao Tower (金茂大厦 ) is an 88-story landmark supertall skyscraper in …
[[Categories: 1998 architecture | Skyscrapers in Shanghai | Hotels in Shanghai | Skyscrapers over 350 meters | Visitor attractions in Shanghai | Landmarks in Shanghai | Skidmore, Owings and Merrill buildings]]
Category:Skyscrapers in China Category: Skyscrapers by country
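To see why categories suit class inheritance, here is a minimal Python sketch that treats subcategory links as subclass links and follows them transitively. The "Skyscrapers in China" to "Skyscrapers by country" link comes from the slide. The "Skyscrapers in Shanghai" to "Skyscrapers in China" link and the names `ancestors` and `in_class` are assumptions made for illustration.

```python
def ancestors(category, parents):
    """Return every category reachable by following parent links upward."""
    seen = set()
    stack = [category]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def in_class(page_categories, target, parents):
    """True if the page is in `target` directly or through a subcategory."""
    return any(
        category == target or target in ancestors(category, parents)
        for category in page_categories
    )


# Subcategory -> parent categories. The first link is an assumption.
parents = {
    "Skyscrapers in Shanghai": ["Skyscrapers in China"],
    "Skyscrapers in China": ["Skyscrapers by country"],
}
jin_mao = {"1998 architecture", "Skyscrapers in Shanghai", "Hotels in Shanghai"}

is_chinese_skyscraper = in_class(jin_mao, "Skyscrapers in China", parents)
is_by_country = in_class(jin_mao, "Skyscrapers by country", parents)
```

The Jin Mao Tower page is tagged only with "Skyscrapers in Shanghai", yet it still counts as a member of both parent categories.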
Database-style Query over Wiki Data
{{#ask:[[Category:Skyscrapers]][[Located in::China]][[Floor count::>50]][[Year built::<2000]] …
}}
Example: Skyscrapers in China higher than 50 stories, built before 2000
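To make the query semantics concrete, here is a minimal Python sketch that evaluates the same three conditions over a handful of in-memory pages. It is not SMW's query engine. The `Page` class, the `ask` function and the three "Hypothetical Tower" pages are made up for illustration, and the Jin Mao Tower values (88 floors, 1998) are taken from the earlier slide. The comparisons are strict, following the wording "higher than" and "before"; a real SMW installation may treat `>` and `<` as inclusive depending on its configuration.

```python
import operator
from dataclasses import dataclass, field

# Comparison operators supported by this sketch (strict by assumption).
OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt}


@dataclass
class Page:
    title: str
    categories: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


def ask(pages, category, conditions):
    """Return sorted titles of pages in `category` that meet every condition."""
    results = []
    for page in pages:
        if category not in page.categories:
            continue
        satisfied = True
        for prop, op, wanted in conditions:
            value = page.properties.get(prop)
            if value is None or not OPS[op](value, wanted):
                satisfied = False
                break
        if satisfied:
            results.append(page.title)
    return sorted(results)


pages = [
    Page("Jin Mao Tower", {"Skyscrapers"},
         {"Located in": "China", "Floor count": 88, "Year built": 1998}),
    # The three pages below are invented test data.
    Page("Hypothetical Tower A", {"Skyscrapers"},
         {"Located in": "China", "Floor count": 40, "Year built": 1995}),
    Page("Hypothetical Tower B", {"Skyscrapers"},
         {"Located in": "China", "Floor count": 70, "Year built": 2008}),
    Page("Hypothetical Tower C", {"Skyscrapers"},
         {"Located in": "Japan", "Floor count": 60, "Year built": 1993}),
]

result = ask(pages, "Skyscrapers", [
    ("Located in", "=", "China"),
    ("Floor count", ">", 50),
    ("Year built", "<", 2000),
])
```

Only the Jin Mao Tower page passes all three conditions: the others fail on floor count, year, or country respectively.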
ASK/SPARQL query target
Data via DBpedia
What is the Promise of Semantic Wikis?
Semantic Wikis promise Consensus over Data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed, user-maintained, user-defined
Easy to use as an extension of text authoring
The ultimate data aggregator
One Key Helpful Feature of Semantic Wikis
Semantic Wikis are “Schema-Last”
Databases require DBAs and schema design; Semantic Wikis develop and maintain the schema in the wiki
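One way to picture "schema-last" is a store that accepts facts before any type is declared and interprets them once a type appears. The Python sketch below is a toy model, not SMW's storage layer. `SchemaLastStore` and its methods are hypothetical names, and falling back to type "Page" for undeclared properties is an assumption of this sketch.

```python
class SchemaLastStore:
    """Facts are stored as raw strings; property types can be declared later."""

    def __init__(self):
        self.facts = []   # (subject, property, raw value) triples
        self.types = {}   # property name -> declared type name

    def annotate(self, subject, prop, raw_value):
        # No schema check: any property can be used right away.
        self.facts.append((subject, prop, raw_value))

    def declare_type(self, prop, type_name):
        # Plays the role of editing the "Property:..." page in the wiki.
        self.types[prop] = type_name

    def values(self, subject, prop):
        # Interpret raw values using whatever type is declared right now.
        declared = self.types.get(prop, "Page").lower()
        found = [raw for s, p, raw in self.facts if s == subject and p == prop]
        if declared == "number":
            return [float(raw.replace(",", "")) for raw in found]
        return found


store = SchemaLastStore()
store.annotate("Tsinghua", "Has population", "27,000")
before = store.values("Tsinghua", "Has population")

# The schema arrives after the data, as an ordinary edit.
store.declare_type("Has population", "number")
after = store.values("Tsinghua", "Has population")
```

The same stored fact reads as plain text in `before` and as a number in `after`, with no migration step in between.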
Semantic MediaWiki in 2010
Open source (GPL)
Well documented
Active mailing list
Commercial support available
World-wide community
Regular conferences
– Next SMWCon 4/28-30, 2011 Arlington, VA
http://semantic-mediawiki.org/
Very stable SMW core
Mature while still growing, slowly but steadily
SMW Extensions
Data I/O
• Halo Extensions, Semantic Forms, Semantic Notification, …
Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Enhanced Retrieval, Search…
Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts…
Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules…
• Semantic WikiTags and Subversion Integration extensions
• Upcoming Linked Data Extension, with R2R and SILK from F.U.Berlin
Wikis Can Help Information Management
Business Intelligence Finding Expertise Internal Encyclopedia Documentation Enterprise Search
Crowd Sourcing is a Great Solution!
Research = Locate and Find Data ?
Example I: KnowIT in Johnson & Johnson
Most Frequently Asked Questions: (J&J example)
– What are the directions between two J&J sites?
– What is the meaning of KOL? HLM? DRU?
– What data sources can we use to compare biological pathways?
– Can you give us a list of R&D applications, related servers and stakeholders and send us an update every six months?
Capture Facts About Things
– Definitions, concepts, questions
– Locations
– Data sources
– Organizations and people
– Technologies and systems
System Architecture
Example II: Knowledge Encapsulation Framework
Allow modelers to exploit the ‘information resources’ they have and discover new, potentially relevant material across new media types
KEF aims to provide:
– an effective method for storing, retrieving, reviewing and annotating your documents
– an environment where you can share these materials with team members and discuss
– a mechanism to discover new, related information for social and traditional media
– a means to link this material to model representations to aid analysis and game-play
Achieved by a semantic wiki enabled with an NLP pipeline
Example 3: Ultrapedia – An Analytical Semantic Wikipedia
Ultrapedia: An SMW demo built to explore general knowledge acquisition in a wiki
Wikipedia merged with the power of a database
– Data extracted from Wikipedia Infobox and Table data; stored in RDF
– For Authors: tools to create more compelling articles
• Great visualizations: charts, tables, timelines, photos, analytics
• Always up-to-date across the Encyclopedia
• Encourage data consistency and find data errors
• Link in other web data sources
– For Readers:
• Enhanced articles and data interaction
• Faceted navigation
• Sophisticated queries (both standing and ad-hoc)
Maintenance via the Wikipedia update process
– Data is from the article text, with simple ways for article authors to maintain and extend it.
– Authors and readers always in the loop for merging, updating, validating, mapping
Graph Views of the Acceleration Data
Dynamic Mapping and Charting
Information Discovery via Visualization
Video: Semantic Wikis for A New Problem
Social tag-based characterization
Keyword search over tag data
Inconsistent semantics
Easy to engineer
Increasing technical complexity → ← Increasing User Participation
Algorithm-based object characterization
Database-style search
Consistent semantics
Extremely difficult to engineer
Social database-style characterization
Database search + wiki text search
Semantic consistency via wiki mechanisms
Easy to engineer
Semantic Entertainment Wiki
Semantic Seahawks Football Wiki
Based on Simple Templates and Forms
Semantic Entertainment: Query Result Highlight Reel
Commercial Look/Feel
Play-by-play video search
Highlight reel generation
Search on crowd-defined patterns (“touchdowns with big hits”)
Tree-based navigation widget
Very favorable economics
Demo
The Inspiration
We started with a wiki
We built a web site
We now have an application
We CAN Build Applications (Fairly) Easily
With all the extensions of Semantic MediaWiki.
Data I/O
• Halo Extensions, Semantic Forms, Semantic Notification, …
Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Enhanced Retrieval, Search…
Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts…
Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules…
• Semantic WikiTags and SVN Integration extensions
• Upcoming Linked Data Extension, with R2R and SILK from FUB
Social Semantic Web Applications
Collaborative Proposal Management at BT with SMW+
Active Bid Viewer Service Desk Selector
Social Semantic Web Applications
Omitting x examples, y pictures and z lines of text…
Case Study 2 and Demo: Project Management with SMW+
Automatically populate tables
Just the data you want, at the level you want
Calendars and timelines
Workflows
Personal menus
Form-oriented inputs
Notifications via email/RSS
MS Office integration
SVN integration
Vulcan Project Management Wiki (Story)
Template and style sheet customizations
Related content automatically included
Vulcan Project Management Wiki (Task)
Color codes to indicate types and status
SVN integration automatically marks tasks “Completed” and relates them to the repository
Vulcan Project Management Wiki (Visualizations)
Demo
Screenshot of a Sprint page
http://wiking.vulcan.com/dev/index.php/Sprint_101020
Data automatically generated via template queries on page
Requirements for Wiki “Developers”
One need not
– Write code like a hardcore programmer
– Design or set up an RDBMS, or make frequent schema changes
– Possess the knowledge of a senior system admin
Instead one needs to
– Configure the wiki with desired extensions
– Design and evolve the data model (schema)
– Design content
• Customize templates, forms, styles, skin, etc.
The bar to build applications is dramatically lowered
– “Source code” is part of the open content of the wiki too!
Effectiveness of SMW as a Platform Choice
Packaged Software
☺ Very quick to obtain
☹ Hard to customize
☹ Expensive
Microsoft Project, Version One, Microsoft SharePoint

Custom Development
☹ Slow to develop
☺ Extremely flexible
☹ High cost to develop and maintain
.NET Framework, J2EE, …, Ruby on Rails

SMW + Extensions
☺ Still quick to program
☺ Easy to customize
☺ Low-moderate cost
Vulcan Project Wiki, B.L.S., RPI map
Conclusions
Semantic MediaWiki+ (http://smwforum.ontoprise.com)
– Open-source, growing semantic wiki software system
– Wiki-style text + semantic markups
– Collaborative, user-governed subject models and data curation
– Simple and extensible data models with easy import/export
SMW+ has many government and industry users
– People built applications with it
Knowledge Management via crowds can work
– A way to leverage and exploit web-collected data
– A lightweight collaborative knowledge management tool
A new platform for lightweight web application development
Chart: KB effort (cost, people, …) vs. KB size (number of assertions, complexity…); labels: Vulcan, Now, Future
Acknowledgement
Paul Allen
Mark Greaves
Andrew J Cowell
Laurent Alquier
Li Ding and Bao Jie
University of Karlsruhe
Tommy Lu
Ontoprise GmbH
William Smith
Ed Swing
TeamMersion LLC
Jesse Wang
Thank you!
Backup slides start here
(End of Slides)
Case Study: Battle-space Luminary System
Discover when New Information represents a change in understanding of entities
– Discovery of explicit entity links, implicit relationships
Large Volumes of Data in various formats
– Unstructured news articles
– Tactical Reports, Field Intelligence
– Structured Database Information
Use Wiki Pages to represent current knowledge about an entity – “what we know”
Domain Ontology to represent domain of information – “what we want to know”
Issue Alerts when Significant Events occur
– New information according to category
– Changing information on topics of interest
– Need to send information to various devices – cell phones, email, etc.
System Design
Wiki Configuration
– Semantic MediaWiki: Large developer community, active development, open source. Wikipedia uses MediaWiki, so scalability and performance are important.
– Semantic Results Format: Provides various rich media displays of semantic information, including graphs, timelines, maps
– Semantic Forms: Provides convenient user interface for entering semantic data into wiki, avoiding cumbersome wikitext
– Semantic Notifications: Enables sending of notifications when results of semantic query change.
Domain Ontology
– Created OWL Ontology for Terrorism
Semantic Parsing, Extraction, Reasoning
– Java Process using various Open-Source Toolkits
– Rapid plugin of new technologies
– Multiple Data Sources supported
Sample Content Page
Wiki Content Design
Use Templates to Ensure Consistent Look-and-Feel
– Templates Correspond to Ontology Classes
– Fields within Templates correspond to Properties within Ontology
– Rich Content Visualizations derived in consistent way
Hierarchical Categories match Class Hierarchy within Ontology
– Ensures Validity for Properties
– Category included on each Template page to ensure consistency
Forms Provide ability for users to enter data directly into wiki without knowing Wiki Text
– Each form corresponds to a Template
– Fields within forms correspond to the fields/properties within the Template
– GUI can include auto-completion
– Created Page immediately linked semantically to rest of Wiki
Sample Visualizations
Visualizations automatically created w/o user edit (tables, timelines, maps, social networks…)
UI enables notifications based on results of query – message sent when visualization changes
Wikipedia for Porsches (Acceleration Data Example)
Information Need: All Porsche models that accelerate 0-100kph in under 5, 6, and 7 seconds
More Porsche Acceleration Data in Wikipedia
Main Page
Ultrapedia Main Page
Tree View Control
Abstract/Summary quick preview
Semantics for Improved Wiki Navigation
The Porsche 996 Acceleration Table In Ultrapedia
Same Table as a Query
Which Porsches accelerate fast?
Dynamically-Generated Tables for Queries
Information Need: All Porsche models that accelerate 0-100kph in under 5, 6, and 7 seconds
Graph Views of the Acceleration Data
External Data via a Live Ebay Query
Linking to External Ebay Data
Mercedes-Benz E-class W212 Gallery Section
Photos in Wiki Articles as Data
Volkswagen Production Timeline View
Timelines from Data
Dynamic Mapping and Charting
Editing Wiki Data In Place