42
OWL ははははははははははは ははははははははは山山 山山 山山山山山 () 山山山山山山山山山山山 SIG-SWO-A201-02

OWL はオントロジーの課題を 本当に解決するのか?

Embed Size (px)

DESCRIPTION

人工知能学会研究会資料 SIG-SWO-A201-02. OWL はオントロジーの課題を 本当に解決するのか?. 山口  高平 (静岡大学). Semantic Web の9階層. 知識工学のトレンド 90-: 概念化の明示的仕様 ( Tom Gruber オントロジーの定義) オントロジー記述言語 (Ontolingua) 知識交換言語 (KIF) Generic Ontology CYC, WordNet, EDR … PSM Task Ontology オントロジー構築方法論 .... オントロジーに関する 知識工学と次世代 Web の流れ. - PowerPoint PPT Presentation

Citation preview

Page 1: OWL はオントロジーの課題を 本当に解決するのか?

OWLはオントロジーの課題を

本当に解決するのか?山口  高平

(静岡大学)

人工知能学会研究会資料SIG-SWO-A201-02

Page 2: OWL はオントロジーの課題を 本当に解決するのか?

Semantic Web の9階層

Page 3: OWL はオントロジーの課題を 本当に解決するのか?

オントロジーに関する知識工学と次世代 Web の流れ

知識工学のトレンド

• 90-: 概念化の明示的仕様( Tom Gruber オントロジーの定義)オントロジー記述言語 (Ontolingua)知識交換言語 (KIF)Generic OntologyCYC, WordNet, EDR…PSMTask Ontologyオントロジー構築方法論...

次世代Webのトレンド

• 95-97: XML as arbitrary structures

• 97-98: RDF

• 98-99: RDFS (schema) as a frame-like system

• 00-01: DAML+OIL

• 02-07: OWL

Page 4: OWL はオントロジーの課題を 本当に解決するのか?

OWL への道のり

• On-To-Knowledge が OIL を定義• 重要 EU プロジェクトのいくつかが OIL を採

用• DAML プロジェクトと OIL がリンク• 定義: DAML+OIL• 改定: DAML+OIL• W 3 C へ DAML+OIL を提出• WebOnt Working Group  の立ち上げ

⇒ OWL のドラフト

Jan’00Med’00Sep’00Dec’00Mar’01Aug’01Oct’01

Jan’00Med’00Sep’00Dec’00Mar’01Aug’01Oct’01

Page 5: OWL はオントロジーの課題を 本当に解決するのか?

RDF ( S) と OWL の関連

• class-def• subclass-of• slot-def• subslot-of• domain• range

• class-def• subclass-of• slot-def• subslot-of• domain• range

• class-expressions• AND, OR, NOT

• slot-constraints• has-value, value-type• cardinality

• slot-properties• trans, symm

• class-expressions• AND, OR, NOT

• slot-constraints• has-value, value-type• cardinality

• slot-properties• trans, symm

RDF(S)DAML+OILDAML+OIL

≒≒OWLOWL

Page 6: OWL はオントロジーの課題を 本当に解決するのか?

Ontologies by Keyword

Keyword URI

academic department

http://www.cs.umd.edu/projects/plus/DAML/onts/cs1.0.daml

academic department

http://www.cs.umd.edu/projects/plus/DAML/onts/cs1.1.daml

Academic Positions http://www.daml.ri.cmu.edu/ont/homework/cmu-ri-employmenttypes-ont.daml

access control primitives

http://www.w3.org/2000/10/swap/pim/doc.rdf

acronym http://orlando.drc.com/daml/Ontology/Thesaurus/CALL/current/

activity http://www.kestrel.edu/DAML/2000/12/OPERATION.daml

Actors http://opencyc.sourceforge.net/daml/cyc.daml

Actors http://www.cyc.com/2002/04/08/cyc.daml

Actors http://www.cyc.com/cyc-2-1/cyc-vocab.daml

address book http://www.w3.org/2000/10/swap/pim/contact.rdf

agenda http://www.daml.org/2001/10/agenda/agenda-ont

DAML Ontology Library (Ontology's by Keyword) http://www.daml.org/ontologies/keyword.html

Page 7: OWL はオントロジーの課題を 本当に解決するのか?

<rdf:RDF xmlns="http://www.daml.org/2001/08/baseball/baseball-ont#" xmlns:daml="http://www.daml.org/2001/03/daml+oil#" xmlns:oiled="http://img.cs.man.ac.uk/oil/oiled#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"> <daml:Ontology rdf:about=""> <rdfs:comment>baseball</rdfs:comment> <rdfs:comment>baseball ontology</rdfs:comment>

<daml:versionInfo>"1.0"</daml:versionInfo> </daml:Ontology> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#PostSeasonGame"><rdfs:label>PostSeasonGame</rdfs:label> <rdfs:comment></rdfs:comment> <oiled:creationDate>12:55:22 26.08.2001</oiled:creationDate> <rdfs:subClassOf> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#Game"></rdfs:Class> </rdfs:subClassOf> </rdfs:Class> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#Inning"> <rdfs:label>Inning</rdfs:label> <rdfs:comment></rdfs:comment> <oiled:creationDate>12:22:49 26.08.2001</oiled:creationDate> <rdfs:subClassOf> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#AggregateEvent"></rdfs:Class> </rdfs:subClassOf> <rdfs:subClassOf> <daml:Restriction> <daml:onProperty rdf:resource="http://www.daml.org/2001/08/baseball/baseball-ont#number"></daml:onProperty> <daml:toClass> <rdfs:Class rdf:about="#http://www.w3.org/TR/xmlschema-2/#string"></rdfs:Class> </daml:toClass> </daml:Restriction> </rdfs:subClassOf> </rdfs:Class> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#League">

<rdfs:label>League</rdfs:label> <rdfs:comment></rdfs:comment> <oiled:creationDate>12:35:38 26.08.2001</oiled:creationDate> <rdfs:subClassOf> <daml:Restriction> <daml:onProperty rdf:resource="http://www.daml.org/2001/08/baseball/baseball-ont#division"></daml:onProperty> <daml:toClass> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#Division"></rdfs:Class> </daml:toClass> </daml:Restriction> </rdfs:subClassOf> </rdfs:Class> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#Person">

<rdfs:label>Person</rdfs:label> <rdfs:comment></rdfs:comment> <oiled:creationDate>13:00:13 26.08.2001</oiled:creationDate> </rdfs:Class> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#Player"> <rdfs:label>Player</rdfs:label> <rdfs:comment></rdfs:comment> <oiled:creationDate>12:25:42 26.08.2001</oiled:creationDate> <rdfs:subClassOf> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#Employee"></rdfs:Class> </rdfs:subClassOf> </rdfs:Class> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#Relief"> <rdfs:label>Relief</rdfs:label> <rdfs:comment></rdfs:comment> <oiled:creationDate>18:31:03 28.08.2001</oiled:creationDate> <rdfs:subClassOf> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#LineupEvent"></rdfs:Class> </rdfs:subClassOf> </rdfs:Class> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#Season">

<rdfs:label>Season</rdfs:label> <rdfs:comment></rdfs:comment> <oiled:creationDate>12:13:04 26.08.2001</oiled:creationDate> <rdfs:subClassOf> <daml:Restriction> <daml:onProperty rdf:resource="http://www.daml.org/2001/08/baseball/baseball-ont#year"></daml:onProperty> <daml:toClass> <rdfs:Class rdf:about="#http://www.w3.org/TR/xmlschema-2/#string"></rdfs:Class> </daml:toClass> </daml:Restriction> </rdfs:subClassOf> </rdfs:Class> <rdfs:Class rdf:about="http://www.daml.org/2001/08/baseball/baseball-ont#Series">

<rdfs:label>Series</rdfs:label> <rdfs:comment></rdfs:comment> <oiled:creationDate>12:56:25 26.08.2001</oiled:creationDate> <rdfs:subClassOf> <daml:Restriction> <daml:onProperty rdf:resource="http://www.daml.org/2001/08/baseball/baseball-ont#home"></daml:onProperty> <daml:toClass rdf:resource="http://www.daml.org/2001/08/baseball/baseball-ont#Team"/> </daml:Restriction> </rdfs:subClassOf> <rdfs:subClassOf> <daml:Restriction> <daml:onProperty rdf:resource="http://www.daml.org/2001/08/baseball/baseball-ont#away"></daml:onProperty> <daml:toClass

Baseball Ontology

Page 8: OWL はオントロジーの課題を 本当に解決するのか?

ソフトウェア開発プロセスのためのオントロジー開発 (人手)経験

Page 9: OWL はオントロジーの課題を 本当に解決するのか?

IssueIssue ApproachApproach

A Process-Centered Software Engineering A Process-Centered Software Engineering Environment Using OntologiesEnvironment Using Ontologies

GoalGoal

Real software processeschange dynamically...

need the facility to correspondto changing process dynamically

very difficult to maintain

develop software process ontologiesthrough the domain analysis usingtwo case studies

First,

Afterwards,design the system which generatesoftware processes interactivelywith a user

Page 10: OWL はオントロジーの課題を 本当に解決するのか?

Development of Software Process Ontologies

Step1. Extract real software processes and objects fromthe two cases

InventInvent

JoinJoin

AppendAppend

ExtractExtract

C

A+BA

B

inputinput

referencereference

Process

B’orA’

Aαα+

outputoutput

Step2. Organize extracted processes and objects into distinctgroups such that there are more similarity in meaning,while defining the software process scheme

Step3. Give detailed definitions to each processes using software process scheme

Page 11: OWL はオントロジーの課題を 本当に解決するのか?

activityactivity

InventInvent

JoinJoin

AppendAppend

ExtractExtract

Invent with referencesInvent with references

Invent without referencesInvent without references

Join without referencesJoin without references

Extract without referencesExtract without references

Join with referencesJoin with references

Extract with referencesExtract with references

EvaluateEvaluateReviewReviewExamineExamine

Focus on input,output,and reference roles

Focus on input,output,and reference roles

Focus on reference,tool and agent rolesFocus on reference,tool and agent roles

Process OntologyProcess Ontology

Page 12: OWL はオントロジーの課題を 本当に解決するのか?

Software Process Ontologies(conceptual hierarchy)

Object Ontology includingabout 400 concepts

Process Ontology including about 250 concepts

Page 13: OWL はオントロジーの課題を 本当に解決するのか?

Process Ontology Object Ontology

GoalObjectGoal

Object??

??

??

??

??Start

ObjectStart

Object

Retrieve processessatisfied with constraints

Software Process Automation using ontologies

Page 14: OWL はオントロジーの課題を 本当に解決するのか?
Page 15: OWL はオントロジーの課題を 本当に解決するのか?

A query about SE P

What SEP between the following input and output ?What SEP between the following input and output ?

Input : Requirement Specification from S/W Subsystems Output : Detailed Design of Each Component of S/W Subsystems

Case Study

Page 16: OWL はオントロジーの課題を 本当に解決するのか?

Initial Software Process Plan CandidatesInitial Software Process Plan Candidates

Requirement Specification

fromS/W Subsystems

Detailed Design ofEach Components of

S/W Subsystems

Design BasicComponents of

S/W Subsystems

Design BasicComponents ofI/F Subsystems

Design LogicalAspects of DB

Review AllBasic

DesignsTogether

Evaluate AllBasic Designs

Design Physical

Aspects of DB

Specify TestingS/W Integration

Design DetailedComponents of

S/W Subsystems

Design DetailedComponents ofI/F Subsystems

Specify TestingUnits of S/W

Evaluate AllDetailed Designs

Review AllDetailed Designs

Together

OutputOutput

InputInput

Page 17: OWL はオントロジーの課題を 本当に解決するのか?

Interim Software Process Plan CandidatesInterim Software Process Plan Candidates

Requirement Specification

fromS/W Subsystems

Detailed Design ofEach Components of

S/W Subsystems

Design BasicComponents of

S/W Subsystems

Design BasicComponents ofI/F Subsystems

Design LogicalAspects of DB

Evaluate AllBasic Designs

Design Physical

Aspects of DB

Specify TestingS/W Integration

Design DetailedComponents of

S/W Subsystems

Design DetailedComponents ofI/F Subsystems

Specify TestingUnits of S/W

Evaluate AllDetailed Designs

Review AllDetailed Designs

Together

OutputOutput

InputInput

Page 18: OWL はオントロジーの課題を 本当に解決するのか?

Final Software Process Plan CandidatesFinal Software Process Plan Candidates

Requirement Specification

fromS/W Subsystems

Detailed Design ofEach Components of

S/W Subsystems

Design BasicComponents of

S/W Subsystems

Design BasicComponents ofI/F Subsystems

Design LogicalAspects of DB

Evaluate AllBasic Designs

Design Physical

Aspects of DB

Specify TestingS/W Integration

Design DetailedComponents of

S/W Subsystems

Design DetailedComponents ofI/F Subsystems

Specify TestingUnits of S/W

Evaluate AllDetailed Designs

Review AllDetailed Designs

Together

OutputOutput

InputInput

Page 19: OWL はオントロジーの課題を 本当に解決するのか?

Evaluation

Issues and Future work

•The environment can’t allocate human and time resource•The environment has not the facility that supports the user to add user’s specific processes•The methodology for building ontologies has no theoretical aspects,such as soundness

Applicability

It turned out that the environment generated SPP candidates good for the user’s query with user interaction

Page 20: OWL はオントロジーの課題を 本当に解決するのか?

計算機可読型辞書とテキストコーパスからの

オントロジー開発 (半自動)経験

Page 21: OWL はオントロジーの課題を 本当に解決するのか?

How to Build up Domain Ontologies with Less Cost

• Taking Existing Information Resources

DODDLE Project (a Domain Ontology rapiD DeveLopment Environment)

exploiting a MRDConstructing up just a conceptual hierarchy

Page 22: OWL はオントロジーの課題を 本当に解決するのか?

Why not taking MRD to build up domain ontologies ?• Good News:

MRD have many concepts.

• Bad News:The concepts have been defined from the point of natural language processing (common sense) and so there are some concept drift between MRD and specific domain ontologies.

Page 23: OWL はオントロジーの課題を 本当に解決するのか?

DomainDomainBBDomain Domain AA

Concept Drift

DomainMRD

Reusable Part

No reusable partbecause ofconcept drift

Page 24: OWL はオントロジーの課題を 本当に解決するのか?

Selection of Best-match

Spell-Matched Results

Spell-Match

Concept HierarchyConcept Hierarchy

Domain Expert

Extension of Modification

•Matched-Results Analysis •Trimmed- Results Analysis

Trimmed Model

Trimming

Initial Model

A Set of A Set of Input Input TermsTerms

WordNet

DODDLEDODDLE

Page 25: OWL はオントロジーの課題を 本当に解決するのか?

Best-Match

Internal Node

Root

MoveMove

Move

StayStay

Matched-Results Analysis

Page 26: OWL はオントロジーの課題を 本当に解決するのか?

A

B C

D

Trimmed part

Initial Model

00

3

B C D

Trimmed Model

A

B C

AUser re-constructsthe sub-tree.

D

Trimmed-Results Analysis

Page 27: OWL はオントロジーの課題を 本当に解決するのか?
Page 28: OWL はオントロジーの課題を 本当に解決するのか?

Acquiring not only Conceptual Hierarchy but Concept Definitions

DODDLE II Project (Extending DODDLE)

Info. Resource:Domain-Specific TextsTechnique: Co-occurence information WordSpace

Page 29: OWL はオントロジーの課題を 本当に解決するのか?

non-TaxonomicRelationships

TaxonomicRelationships

TaxonomicRelationshipAcquisition

Module

Concept SpecificationTemplate

Concept SpecificationTemplate

extension & modification

WordNet

MRD

DODDLE-DODDLE-IIIIOverviewOverview

Domain SpecificTexts

Domain SpecificTexts

DomainOntology

non-TaxonomicRelationship

LearningModule

a Set of Domain Termsa Set of Domain Terms

Page 30: OWL はオントロジーの課題を 本当に解決するのか?

• Words and phrases in texts can be expressed by vector representation containing co-occurrence statistics

Extracting Concept Relationshipsfrom Domain-Specific Texts

WordSpace (Marti A. Hearst, Hinrich Schutze)

… wi … wj …C1 … wk …… wi … wj …C2 … wk …

Context Similarity between concepts C1 and C2

• Inner products among the vectors work as the similarity between the words and phrases.

high value with similar words coming up around the concepts

Page 31: OWL はオントロジーの課題を 本当に解決するのか?

A sum of w in Texts

A sum of W in same synset

Constructing WordSpace1. Extract Word 4-grams with high-

frequency2. Collocation Matrix3. Context Vectors4. Word Vectors ( WordSpace is a set of word

vectors )5. Vector representations of all concepts…

.. f8 f4 f3 f7 f8

f4 f1 f3 f8 f9 f2

f5 f1 f7 f1 f5 ..…

….. f8 f4 f3 f7 f8

f4 f1 f3 f4 f9 f2

f5 f1 f7 f1 f5 ..…

Collocation Matrix

4-gram vector4-gram vector

f1

f1

fn

fnf2

f4 2

..

..

w

w

Texts (4-gram array) Texts (4-gram array) Texts (4-gram vector array)

Vector representation for each 4-gram

….. f8 f4 f3 f7 f8

f4 f1 f3 f4 w f2

f5 f1 f7 f1 f5 ..…

A sum of 4-gram vectors around w

WordNet synset

w1

w3

w2

w4

C

Word Vector W

Context vector w

Vector representation of a concept C

collocation scope context scope

Page 32: OWL はオントロジーの課題を 本当に解決するのか?

non-TAXONOMY?

: Ca

TAXONOMY

: Cb

non-TAXONOMY?

: Cc

non-TAXONOMY?

: Cd

Concept Specification Template

Ci

ConstructingConcept Specification Template

Ci Ca

Ci Cb

Ci Cc

Ci Cd

a Set of Concept Pairsa Set of Concept Pairsfrom non-TR Learning Modulefrom non-TR Learning Module

Ci

Cb

Ckancestor, descendant and sibling

Taxonomic RelationshipsTaxonomic Relationshipsfrom TR Acquisition Modulefrom TR Acquisition Module

Page 33: OWL はオントロジーの課題を 本当に解決するのか?

DODDLE-IIDODDLE-IIalphaalpha on Perl/Tkon Perl/Tk ((UNIX UNIX platformplatform))

Page 34: OWL はオントロジーの課題を 本当に解決するのか?

The Target Domain

Contracts for the International Sale of Goods(CISG)

The Target Domain

Contracts for the International Sale of Goods(CISG)

A Case Study

Input Terms to TR module 46 Legal Terms from CISG Part-II

Domain-Specific Texts to NonTR modulefull text of CISG (about 10,000 words)

Page 35: OWL はオントロジーの課題を 本当に解決するのか?

Results

46Terms

The paths from Spell Matched Nodes

to the Root of WordNet

Trimmed Model Legal Ontology

Trimming

Initial Model

56Terms 61Terms

377Terms113Terms

Select Best Matches

Modification

Page 36: OWL はオントロジーの課題を 本当に解決するのか?

Taxonomic Relationships for input termsconstructed from TR Acquisition Module

Page 37: OWL はオントロジーの課題を 本当に解決するのか?

Support ratio: Support ratio: How match is included in final domain ontology the intermediate products at each DODDLE activity.

0.0

10.0

20.0

30.0

40.0

50.0

60.0

70.0

80.0

90.0

100.0

Com

pone

nt r

ate

(%)

Trimmed Model Extended TrimmedModel

Internal DomainOntology1

Internal DomainOntology2

Domain Ontology

DODDLE's process

WN's structure(52.2%) Small DT(8.0%) Strategy1 (5.3%)Strategy2 (6.9%) Correction by user (21.5%) Addition by user (6. %)1

Support Ratio

Page 38: OWL はオントロジーの課題を 本当に解決するのか?

ExtractingConcept Relationships

Extraction frequency of 4-gram (times) 8

Collocation scope (4-grams) 10

Context scope (4-grams) 60

setting value for WordSpace

the concept pairs extracted according to context similarity (threshold 0.9993)

Page 39: OWL はオントロジーの課題を 本当に解決するのか?

Domain Experts Modifying Concept Specification Templates

assentnon-TAXONOMY? : offeror

TAXONOMY : act

non-TAXONOMY? : effect

non-TAXONOMY? : offer

non-TAXONOMY? : person

non-TAXONOMY? : offeree

non-TAXONOMY? : withdrawal

non-TAXONOMY? : time

TAXONOMY : proposal

Concept Specification Template

assent

non-Taxonomic Relationshipsfrom the template

ex) non-Taxonomic Relationships for “assent”

: offerLEGAL-SEQUENCE

: personAGENT

: withdrawalANTONYM

DE Modification

non-taxonomic relationship:

person, offer, withdrawal

taxonomic relationship:

act, proposal

inheritance: offeror, offeree

unnecessary: effect, time

Page 40: OWL はオントロジーの課題を 本当に解決するのか?

Recall & Precision of Concept Specification Templates

recall

precision

The number of extracted concept pairs

Page 41: OWL はオントロジーの課題を 本当に解決するのか?

ConclusionsDODDLE II: A Domain Ontology Construction SupportEnvironment using MRD and Domain-Specific Texts

Legal experts appreciate DODDLE II to some extent.

•How to Decide Threshold for CS• How to Identify NTR•Another DM method instead of WordSpace• Another Case Studies with Large Scale

Future WorkFuture WorkFuture WorkFuture Work

Page 42: OWL はオントロジーの課題を 本当に解決するのか?

OWLとオントロジー開発

• OWL とオントロジー開発⇔ UML とソフトウェア開発

• OWL + RDF(S) + RDF⇒ 中味のあるオントロジー

• オントロジー構築支援ツールが必要• オントロジーの重量⇔スケーラビリテ