Upload
eliana-maldonado
View
29
Download
0
Embed Size (px)
DESCRIPTION
人工知能学会研究会資料 SIG-SWO-A201-02. OWL はオントロジーの課題を 本当に解決するのか?. 山口 高平 (静岡大学). Semantic Web の9階層. 知識工学のトレンド 90-: 概念化の明示的仕様 ( Tom Gruber オントロジーの定義) オントロジー記述言語 (Ontolingua) 知識交換言語 (KIF) Generic Ontology CYC, WordNet, EDR … PSM Task Ontology オントロジー構築方法論 .... オントロジーに関する 知識工学と次世代 Web の流れ. - PowerPoint PPT Presentation
Citation preview
OWLはオントロジーの課題を
本当に解決するのか?山口 高平
(静岡大学)
人工知能学会研究会資料SIG-SWO-A201-02
Semantic Web の9階層
オントロジーに関する知識工学と次世代 Web の流れ
知識工学のトレンド
• 90-: 概念化の明示的仕様( Tom Gruber オントロジーの定義)オントロジー記述言語 (Ontolingua)知識交換言語 (KIF)Generic OntologyCYC, WordNet, EDR…PSMTask Ontologyオントロジー構築方法論...
次世代Webのトレンド
• 95-97: XML as arbitrary structures
• 97-98: RDF
• 98-99: RDFS (schema) as a frame-like system
• 00-01: DAML+OIL
• 02-07: OWL
OWL への道のり
• On-To-Knowledge が OIL を定義• 重要 EU プロジェクトのいくつかが OIL を採
用• DAML プロジェクトと OIL がリンク• 定義: DAML+OIL• 改定: DAML+OIL• W 3 C へ DAML+OIL を提出• WebOnt Working Group の立ち上げ
⇒ OWL のドラフト
Jan’00Med’00Sep’00Dec’00Mar’01Aug’01Oct’01
Jan’00Med’00Sep’00Dec’00Mar’01Aug’01Oct’01
RDF ( S) と OWL の関連
• class-def• subclass-of• slot-def• subslot-of• domain• range
• class-def• subclass-of• slot-def• subslot-of• domain• range
• class-expressions• AND, OR, NOT
• slot-constraints• has-value, value-type• cardinality
• slot-properties• trans, symm
• class-expressions• AND, OR, NOT
• slot-constraints• has-value, value-type• cardinality
• slot-properties• trans, symm
RDF(S)DAML+OILDAML+OIL
≒≒OWLOWL
Ontologies by Keyword
Keyword URI
academic department
http://www.cs.umd.edu/projects/plus/DAML/onts/cs1.0.daml
academic department
http://www.cs.umd.edu/projects/plus/DAML/onts/cs1.1.daml
Academic Positions http://www.daml.ri.cmu.edu/ont/homework/cmu-ri-employmenttypes-ont.daml
access control primitives
http://www.w3.org/2000/10/swap/pim/doc.rdf
acronym http://orlando.drc.com/daml/Ontology/Thesaurus/CALL/current/
activity http://www.kestrel.edu/DAML/2000/12/OPERATION.daml
Actors http://opencyc.sourceforge.net/daml/cyc.daml
Actors http://www.cyc.com/2002/04/08/cyc.daml
Actors http://www.cyc.com/cyc-2-1/cyc-vocab.daml
address book http://www.w3.org/2000/10/swap/pim/contact.rdf
agenda http://www.daml.org/2001/10/agenda/agenda-ont
DAML Ontology Library (Ontology's by Keyword) http://www.daml.org/ontologies/keyword.html
<rdf:RDF xmlns="http://www.daml.org/2001/08/baseball/baseball-ont#" xmlns:daml="http://www.daml.org/2001/03/daml+oil#" xmlns:oiled="http://img.cs.man.ac.uk/oil/oiled#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"> <daml:Ontology rdf:about=""> <rdfs:comment>baseball</rdfs:comment> <rdfs:comment>baseball ontology</rdfs:comment>
<daml:versionInfo>"1.0"</daml:versionInfo> </daml:Ontology> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#PostSeasonGame"><rdfs:label>PostSeasonGame</rdfs:label> <rdfs:comment></rdfs:comment> <oiled:creationDate>12:55:22 26.08.2001</oiled:creationDate> <rdfs:subClassOf> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#Game"></rdfs:Class> </rdfs:subClassOf> </rdfs:Class> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#Inning"> <rdfs:label>Inning</rdfs:label> <rdfs:comment></rdfs:comment> <oiled:creationDate>12:22:49 26.08.2001</oiled:creationDate> <rdfs:subClassOf> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#AggregateEvent"></rdfs:Class> </rdfs:subClassOf> <rdfs:subClassOf> <daml:Restriction> <daml:onProperty rdf:resource="http://www.daml.org/2001/08/baseball/baseball-ont#number"></daml:onProperty> <daml:toClass> <rdfs:Class rdf:about="#http://www.w3.org/TR/xmlschema-2/#string"></rdfs:Class> </daml:toClass> </daml:Restriction> </rdfs:subClassOf> </rdfs:Class> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#League">
<rdfs:label>League</rdfs:label> <rdfs:comment></rdfs:comment> <oiled:creationDate>12:35:38 26.08.2001</oiled:creationDate> <rdfs:subClassOf> <daml:Restriction> <daml:onProperty rdf:resource="http://www.daml.org/2001/08/baseball/baseball-ont#division"></daml:onProperty> <daml:toClass> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#Division"></rdfs:Class> </daml:toClass> </daml:Restriction> </rdfs:subClassOf> </rdfs:Class> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#Person">
<rdfs:label>Person</rdfs:label> <rdfs:comment></rdfs:comment> <oiled:creationDate>13:00:13 26.08.2001</oiled:creationDate> </rdfs:Class> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#Player"> <rdfs:label>Player</rdfs:label> <rdfs:comment></rdfs:comment> <oiled:creationDate>12:25:42 26.08.2001</oiled:creationDate> <rdfs:subClassOf> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#Employee"></rdfs:Class> </rdfs:subClassOf> </rdfs:Class> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#Relief"> <rdfs:label>Relief</rdfs:label> <rdfs:comment></rdfs:comment> <oiled:creationDate>18:31:03 28.08.2001</oiled:creationDate> <rdfs:subClassOf> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#LineupEvent"></rdfs:Class> </rdfs:subClassOf> </rdfs:Class> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#Season">
<rdfs:label>Season</rdfs:label> <rdfs:comment></rdfs:comment> <oiled:creationDate>12:13:04 26.08.2001</oiled:creationDate> <rdfs:subClassOf> <daml:Restriction> <daml:onProperty rdf:resource="http://www.daml.org/2001/08/baseball/baseball-ont#year"></daml:onProperty> <daml:toClass> <rdfs:Class rdf:about="#http://www.w3.org/TR/xmlschema-2/#string"></rdfs:Class> </daml:toClass> </daml:Restriction> </rdfs:subClassOf> </rdfs:Class> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#Series">
<rdfs:label>Series</rdfs:label> <rdfs:comment></rdfs:comment> <oiled:creationDate>12:56:25 26.08.2001</oiled:creationDate> <rdfs:subClassOf> <daml:Restriction> <daml:onProperty rdf:resource="http://www.daml.org/2001/08/baseball/baseball-ont#home"></daml:onProperty> <daml:toClass rdf:resource="http://www.daml.org/2001/08/baseball/baseball-ont#Team"/> </daml:Restriction> </rdfs:subClassOf> <rdfs:subClassOf> <daml:Restriction> <daml:onProperty rdf:resource="http://www.daml.org/2001/08/baseball/baseball-ont#away"></daml:onProperty> <daml:toClass
Baseball Ontology
ソフトウェア開発プロセスのためのオントロジー開発 (人手)経験
IssueIssue ApproachApproach
A Process-Centered Software Engineering A Process-Centered Software Engineering Environment Using OntologiesEnvironment Using Ontologies
GoalGoal
Real software processeschange dynamically...
need the facility to correspondto changing process dynamically
very difficult to maintain
develop software process ontologiesthrough the domain analysis usingtwo case studies
First,
Afterwards,design the system which generatesoftware processes interactivelywith a user
Development of Software Process Ontologies
Step1. Extract real software processes and objects fromthe two cases
InventInvent
JoinJoin
AppendAppend
ExtractExtract
C
A+BA
B
inputinput
referencereference
Process
B’orA’
Aαα+
outputoutput
Step2. Organize extracted processes and objects into distinctgroups such that there are more similarity in meaning,while defining the software process scheme
Step3. Give detailed definitions to each processes using software process scheme
activityactivity
InventInvent
JoinJoin
AppendAppend
ExtractExtract
Invent with referencesInvent with references
Invent without referencesInvent without references
Join without referencesJoin without references
Extract without referencesExtract without references
Join with referencesJoin with references
Extract with referencesExtract with references
EvaluateEvaluateReviewReviewExamineExamine
Focus on input,output,and reference roles
Focus on input,output,and reference roles
Focus on reference,tool and agent rolesFocus on reference,tool and agent roles
Process OntologyProcess Ontology
Software Process Ontologies(conceptual hierarchy)
Object Ontology includingabout 400 concepts
Process Ontology including about 250 concepts
Process Ontology Object Ontology
GoalObjectGoal
Object??
??
??
??
??Start
ObjectStart
Object
Retrieve processessatisfied with constraints
Software Process Automation using ontologies
A query about SE P
What SEP between the following input and output ?What SEP between the following input and output ?
Input : Requirement Specification from S/W Subsystems Output : Detailed Design of Each Component of S/W Subsystems
Case Study
Initial Software Process Plan CandidatesInitial Software Process Plan Candidates
Requirement Specification
fromS/W Subsystems
Detailed Design ofEach Components of
S/W Subsystems
Design BasicComponents of
S/W Subsystems
Design BasicComponents ofI/F Subsystems
Design LogicalAspects of DB
Review AllBasic
DesignsTogether
Evaluate AllBasic Designs
Design Physical
Aspects of DB
Specify TestingS/W Integration
Design DetailedComponents of
S/W Subsystems
Design DetailedComponents ofI/F Subsystems
Specify TestingUnits of S/W
Evaluate AllDetailed Designs
Review AllDetailed Designs
Together
OutputOutput
InputInput
Interim Software Process Plan CandidatesInterim Software Process Plan Candidates
Requirement Specification
fromS/W Subsystems
Detailed Design ofEach Components of
S/W Subsystems
Design BasicComponents of
S/W Subsystems
Design BasicComponents ofI/F Subsystems
Design LogicalAspects of DB
Evaluate AllBasic Designs
Design Physical
Aspects of DB
Specify TestingS/W Integration
Design DetailedComponents of
S/W Subsystems
Design DetailedComponents ofI/F Subsystems
Specify TestingUnits of S/W
Evaluate AllDetailed Designs
Review AllDetailed Designs
Together
OutputOutput
InputInput
Final Software Process Plan CandidatesFinal Software Process Plan Candidates
Requirement Specification
fromS/W Subsystems
Detailed Design ofEach Components of
S/W Subsystems
Design BasicComponents of
S/W Subsystems
Design BasicComponents ofI/F Subsystems
Design LogicalAspects of DB
Evaluate AllBasic Designs
Design Physical
Aspects of DB
Specify TestingS/W Integration
Design DetailedComponents of
S/W Subsystems
Design DetailedComponents ofI/F Subsystems
Specify TestingUnits of S/W
Evaluate AllDetailed Designs
Review AllDetailed Designs
Together
OutputOutput
InputInput
Evaluation
Issues and Future work
•The environment can’t allocate human and time resource•The environment has not the facility that supports the user to add user’s specific processes•The methodology for building ontologies has no theoretical aspects,such as soundness
Applicability
It turned out that the environment generated SPP candidates good for the user’s query with user interaction
計算機可読型辞書とテキストコーパスからの
オントロジー開発 (半自動)経験
How to Build up Domain Ontologies with Less Cost
• Taking Existing Information Resources
DODDLE Project (a Domain Ontology rapiD DeveLopment Environment)
exploiting a MRDConstructing up just a conceptual hierarchy
Why not taking MRD to build up domain ontologies ?• Good News:
MRD have many concepts.
• Bad News:The concepts have been defined from the point of natural language processing (common sense) and so there are some concept drift between MRD and specific domain ontologies.
DomainDomainBBDomain Domain AA
Concept Drift
DomainMRD
Reusable Part
No reusable partbecause ofconcept drift
Selection of Best-match
Spell-Matched Results
Spell-Match
Concept HierarchyConcept Hierarchy
Domain Expert
Extension of Modification
•Matched-Results Analysis •Trimmed- Results Analysis
Trimmed Model
Trimming
Initial Model
A Set of A Set of Input Input TermsTerms
WordNet
DODDLEDODDLE
Best-Match
Internal Node
Root
MoveMove
Move
StayStay
Matched-Results Analysis
A
B C
D
Trimmed part
Initial Model
00
3
B C D
Trimmed Model
A
B C
AUser re-constructsthe sub-tree.
D
Trimmed-Results Analysis
Acquiring not only Conceptual Hierarchy but Concept Definitions
DODDLE II Project (Extending DODDLE)
Info. Resource:Domain-Specific TextsTechnique: Co-occurence information WordSpace
non-TaxonomicRelationships
TaxonomicRelationships
TaxonomicRelationshipAcquisition
Module
Concept SpecificationTemplate
Concept SpecificationTemplate
extension & modification
WordNet
MRD
DODDLE-DODDLE-IIIIOverviewOverview
Domain SpecificTexts
Domain SpecificTexts
DomainOntology
non-TaxonomicRelationship
LearningModule
a Set of Domain Termsa Set of Domain Terms
• Words and phrases in texts can be expressed by vector representation containing co-occurrence statistics
Extracting Concept Relationshipsfrom Domain-Specific Texts
WordSpace (Marti A. Hearst, Hinrich Schutze)
… wi … wj …C1 … wk …… wi … wj …C2 … wk …
Context Similarity between concepts C1 and C2
• Inner products among the vectors work as the similarity between the words and phrases.
high value with similar words coming up around the concepts
A sum of w in Texts
A sum of W in same synset
Constructing WordSpace1. Extract Word 4-grams with high-
frequency2. Collocation Matrix3. Context Vectors4. Word Vectors ( WordSpace is a set of word
vectors )5. Vector representations of all concepts…
.. f8 f4 f3 f7 f8
f4 f1 f3 f8 f9 f2
f5 f1 f7 f1 f5 ..…
….. f8 f4 f3 f7 f8
f4 f1 f3 f4 f9 f2
f5 f1 f7 f1 f5 ..…
Collocation Matrix
4-gram vector4-gram vector
f1
f1
fn
fnf2
f4 2
…
…
..
..
w
w
Texts (4-gram array) Texts (4-gram array) Texts (4-gram vector array)
Vector representation for each 4-gram
….. f8 f4 f3 f7 f8
f4 f1 f3 f4 w f2
f5 f1 f7 f1 f5 ..…
A sum of 4-gram vectors around w
WordNet synset
w1
w3
w2
w4
C
Word Vector W
Context vector w
Vector representation of a concept C
collocation scope context scope
non-TAXONOMY?
: Ca
TAXONOMY
: Cb
non-TAXONOMY?
: Cc
non-TAXONOMY?
: Cd
Concept Specification Template
Ci
ConstructingConcept Specification Template
Ci Ca
Ci Cb
Ci Cc
Ci Cd
a Set of Concept Pairsa Set of Concept Pairsfrom non-TR Learning Modulefrom non-TR Learning Module
Ci
Cb
Ckancestor, descendant and sibling
Taxonomic RelationshipsTaxonomic Relationshipsfrom TR Acquisition Modulefrom TR Acquisition Module
DODDLE-IIDODDLE-IIalphaalpha on Perl/Tkon Perl/Tk ((UNIX UNIX platformplatform))
The Target Domain
Contracts for the International Sale of Goods(CISG)
The Target Domain
Contracts for the International Sale of Goods(CISG)
A Case Study
Input Terms to TR module 46 Legal Terms from CISG Part-II
Domain-Specific Texts to NonTR modulefull text of CISG (about 10,000 words)
Results
46Terms
The paths from Spell Matched Nodes
to the Root of WordNet
Trimmed Model Legal Ontology
Trimming
Initial Model
56Terms 61Terms
377Terms113Terms
Select Best Matches
Modification
Taxonomic Relationships for input termsconstructed from TR Acquisition Module
Support ratio: Support ratio: How match is included in final domain ontology the intermediate products at each DODDLE activity.
0.0
10.0
20.0
30.0
40.0
50.0
60.0
70.0
80.0
90.0
100.0
Com
pone
nt r
ate
(%)
Trimmed Model Extended TrimmedModel
Internal DomainOntology1
Internal DomainOntology2
Domain Ontology
DODDLE's process
WN's structure(52.2%) Small DT(8.0%) Strategy1 (5.3%)Strategy2 (6.9%) Correction by user (21.5%) Addition by user (6. %)1
Support Ratio
ExtractingConcept Relationships
Extraction frequency of 4-gram (times) 8
Collocation scope (4-grams) 10
Context scope (4-grams) 60
setting value for WordSpace
the concept pairs extracted according to context similarity (threshold 0.9993)
Domain Experts Modifying Concept Specification Templates
assentnon-TAXONOMY? : offeror
TAXONOMY : act
non-TAXONOMY? : effect
non-TAXONOMY? : offer
non-TAXONOMY? : person
non-TAXONOMY? : offeree
non-TAXONOMY? : withdrawal
non-TAXONOMY? : time
TAXONOMY : proposal
Concept Specification Template
assent
non-Taxonomic Relationshipsfrom the template
ex) non-Taxonomic Relationships for “assent”
: offerLEGAL-SEQUENCE
: personAGENT
: withdrawalANTONYM
DE Modification
non-taxonomic relationship:
person, offer, withdrawal
taxonomic relationship:
act, proposal
inheritance: offeror, offeree
unnecessary: effect, time
Recall & Precision of Concept Specification Templates
recall
precision
The number of extracted concept pairs
ConclusionsDODDLE II: A Domain Ontology Construction SupportEnvironment using MRD and Domain-Specific Texts
Legal experts appreciate DODDLE II to some extent.
•How to Decide Threshold for CS• How to Identify NTR•Another DM method instead of WordSpace• Another Case Studies with Large Scale
Future WorkFuture WorkFuture WorkFuture Work
OWLとオントロジー開発
• OWL とオントロジー開発⇔ UML とソフトウェア開発
• OWL + RDF(S) + RDF⇒ 中味のあるオントロジー
• オントロジー構築支援ツールが必要• オントロジーの重量⇔スケーラビリテ
ィ