Exception-enriched Rule Learning from Knowledge
Graphs
Mohamed Gad-Elrab1, Daria Stepanova1, Jacopo Urbani 2, Gerhard Weikum1
1Max-Planck-Institut fΓΌr Informatik, Saarland Informatics Campus, Germany
2 Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
21st October 2016
Knowledge Graphs (KGs)
2
β’ Huge collection of < π π’πππππ‘, ππππππππ‘π, ππππππ‘ > triples
β’ Positive facts under Open World Assumption (OWA)
β’ Possibly incomplete and/or inaccurate
Mining Rules from KGs
3
Amsterdam
isMarriedToJohn Kate
Chicago
isMarriedToBrad Anna
Berlin
hasBrother
isMarriedToDave ClaraisMarriedToBob Alice
Researcher
Berlin Football
Mining Rules from KGs
4
π: πππ£ππ πΌπ π, π β ππ πππππππππ π, π , πππ£ππ πΌπ(π, π)
Amsterdam
isMarriedToJohn Kate
Chicago
isMarriedToBrad Anna
Berlin
hasBrother
isMarriedToDave ClaraisMarriedToBob Alice
Researcher
Berlin Football
[GalΓ‘rraga et al., 2015]
Mining Rules from KGs
5
π: πππ£ππ πΌπ π, π β ππ πππππππππ π, π , πππ£ππ πΌπ(π, π)
Amsterdam
isMarriedToJohn Kate
Chicago
isMarriedToBrad Anna
Berlin
hasBrother
isMarriedToDave ClaraisMarriedToBob Alice
Researcher
Berlin Football
1 2
Our Goal
6
π: πππ£ππ πΌπ π, π β ππ πππππππππ π, π , πππ£ππ πΌπ π, π , πππ‘ ππ π΄(π, πππ )
Amsterdam
isMarriedToJohn Kate
Chicago
isMarriedToBrad Anna
Berlin
hasBrother
isMarriedToDave ClaraisMarriedToBob Alice
Researcher
Berlin Football
1 2
Problem Statement
β’ Quality-based theory revision problem
β’ Givenβ’ Knowledge graph πΎπΊ
β’ Set of Horn rules π π»
β’ Find the nonmonotonic revision π ππ of π π»
β’ Maximize top-k avg. confidence
β’ Minimize conflicting prediction
7
Problem Statement
β’ Quality-based theory revision problem
β’ Givenβ’ Knowledge graph πΎπΊ
β’ Set of Horn rules π π»
β’ Find the nonmonotonic revision π ππ of π π»
β’ Maximize top-k avg. confidence
β’ Minimize conflicting prediction
8
Unknown
Problem Statement: Conflicting Predictions
β’ Defining conflicts
β’ Measuring conflicts (auxiliary rules)
9
π2: ππππππ πππΌππ½π π β βππ π½ππΆπππππ ππ π , πππ‘ ππ’ππππ βπππππ΄(π)
π2ππ’π₯: πππ‘_ππππππ πππΌππ½π π β βππ π½ππΆπππππ ππ π , ππ’ππππ βπππππ΄(π)
π = π1: ππππππ πππΌππ½π π β ππ πΊπππ π , ππ π΅ππ πππππ½ππ΄ππππ (π)π2: ππππππ πππΌππ½π π β βππ π½ππΆπππππ ππ π , πππ‘ ππ’ππππ βπππππ΄(π)
{ππ πΊπππ π , ππ π΅ππ πππππ½ππ΄ππππ π , βππ π½ππΆπππππ ππ π , ππ’ππππ βπππππ΄ π }
Output: {(ππππππ πππΌππ½π π , π2: πππ‘ ππππππ πππΌππ½π π )}
π =
Problem Statement: Conflicting Predictions
β’ Defining conflicts
β’ Measuring conflicts (auxiliary rules)
10
π2: ππππππ πππΌππ½π π β βππ π½ππΆπππππ ππ π , πππ‘ ππ’ππππ βπππππ΄(π)
π2ππ’π₯: πππ‘_ππππππ πππΌππ½π π β βππ π½ππΆπππππ ππ π , ππ’ππππ βπππππ΄(π)
π = π1: ππππππ πππΌππ½π π β ππ πΊπππ π , ππ π΅ππ πππππ½ππ΄ππππ (π)π2: ππππππ πππΌππ½π π β βππ π½ππΆπππππ ππ π , πππ‘ ππ’ππππ βπππππ΄(π)
{ππ πΊπππ π , ππ π΅ππ πππππ½ππ΄ππππ π , βππ π½ππΆπππππ ππ π , ππ’ππππ βπππππ΄ π }
Output: {(ππππππ πππΌππ½π π , π2: πππ‘ ππππππ πππΌππ½π π )}
π =
Problem Statement: Conflicting Predictions
β’ Defining conflicts
β’ Measuring conflicts (auxiliary rules)
11
π2: ππππππ πππΌππ½π π β βππ π½ππΆπππππ ππ π , πππ‘ ππ’ππππ βπππππ΄(π)
π2ππ’π₯: πππ‘_ππππππ πππΌππ½π π β βππ π½ππΆπππππ ππ π , ππ’ππππ βπππππ΄(π)
π = π1: ππππππ πππΌππ½π π β ππ πΊπππ π , ππ π΅ππ πππππ½ππ΄ππππ (π)π2: ππππππ πππΌππ½π π β βππ π½ππΆπππππ ππ π , πππ‘ ππ’ππππ βπππππ΄(π)
{ππ πΊπππ π , ππ π΅ππ πππππ½ππ΄ππππ π , βππ π½ππΆπππππ ππ π , ππ’ππππ βπππππ΄ π }
(ππππππ πππΌππ½π π , πππ‘_ππππππ πππΌππ½π π )
π =
Approach Overview
Step 1β’ Mining Horn Rules
Step 2β’ Extracting Exception Witness Set (EWS)
Step 3β’ Constructing Candidate Revisions
Step 4β’ Selecting the Best Revision
12
bornInUSA livesInUSA stateless emigrant singer poet
p1 β β
p2 β β
p3 β β
p4 β β β
p5 β β β
p6 β β
p7 β β
p8 β β β β
p9 β β β
p10 β β β β
p11 β β β
Step 1: Mining Horn Rules
13
ππ: ππππππ°ππΌπΊπ¨ πΏ β πππππ°ππΌπΊπ¨(πΏ)
No
rmal
Ab
-no
rmal
bornInUSA livesInUSA stateless emigrant singer poet
p1 β β
p2 β β
p3 β β
p4 β β β
p5 β β β
p6 β β
p7 β β
p8 β β β β
p9 β β β
p10 β β β β
p11 β β β
ππ: ππππππ°ππΌπΊπ¨ πΏ β πππππ°ππΌπΊπ¨(πΏ)
No
rmal
Ab
-no
rmal
Step 2: Extracting Exception Witness Set (EWS)
14πΈπππ = {ππππππππ‘ π , ππππ‘ π }
Step 3: Constructing Candidate Revisions
β’ Horn rules
β’ Rule revisions
15
π1: πππ£ππ πΌππππ΄ π β πππππΌππππ΄ π , πππ‘ ππππ‘ ππ1: πππ£ππ πΌππππ΄ π β πππππΌππππ΄ π , πππ‘ ππππππππ‘ πβ¦
π = π1: πππ£ππ πΌππππ΄ π β πππππΌππππ΄(π)π2: ππππππππ‘ π β π π‘ππ‘ππππ π (π)
πΈππ = {ππππ‘ π , ππππππππ‘ π ,β¦ }πΈππ = {π1 π , π2(π), β¦ }
Step 4: Selecting the Best Revision
Finding globally best revision is expensive!
β’ NaΓ―ve rankerβ’ For each rule, pick the revision that maximizes confidence
β’ Works in isolation from other rules
β’ Partial materialization rankerβ’ πΎπΊs are incomplete!
β’ Augment the original πΎπΊ with predictions of other rules
β’ Rank revisions on avg. confidence of the rule and its auxiliary.
16
β’ Partial materialization
Ranking Ruleβs Revisions
17
bornInUSA livesInUSA stateless emigrant singer poet
p1 β β
p2 β β
p3 β β
p4 β β β
p5 β β β
p6 β β
p7 β β
p8 β β β β
p9 β β β
p10 β β β β
p11 β β β
ππ: ππππππ°ππΌπΊπ¨ πΏ β πππππ°ππΌπΊπ¨(πΏ)
No
rmal
Ab
-no
rmal
β’ Partial materialization
bornInUSA livesInUSA stateless emigrant singer poet
p1 β β
p2 β β
p3 β β
p4 β β β β
p5 β β β β
p6 β β β
p7 β β β
p8 β β β β
p9 β β β β β
p10 β β β β
p11 β β β β
ππ: ππππππ°ππΌπΊπ¨ πΏ β πππππ°ππΌπΊπ¨(πΏ)
No
rmal
Ab
-no
rmal
Ranking Ruleβs Revisions
18
Ranking Ruleβs Revisions
β’ Ordered partial materialization rankerβ’ Only rules with higher quality
β’ Ordered weighted partial materialization rankerβ’ KG fact weight = 1
β’ Predicted facts inherit their weights from the rules
19
Experiments
β’ Ruleset quality
20*Higher is better
0.60
0.65
0.70
0.75
0.80
20 40 60 80 100
Avg
. Co
nfi
den
ce
Top-K (%) Rules
YAGO3
Horn Naive Ordered & Weighted PM
0.8
0.82
0.84
0.86
0.88
0.9
0.92
0.94
20 40 60 80 100A
vg. C
on
fid
ence
Top-K (%) Rules
IMDB
Horn Naive Ordered & Weighted PM
Facts 10M 2MRules 10K 25K
Experiments
β’ Predictions consistency
21*Lower is better
0.00
0.05
0.10
0.15
0.20
0.25
0.30
0.35
0.40
200 400 600 800 1000
Co
nfl
ict
Rat
io
Top-K Rules
YAGO3
Naive Ordered & Weighted PM
0
0.1
0.2
0.3
0.4
0.5
200 400 600 800 1000C
on
flic
t R
atio
Top-K Rules
IMDB
Naive Ordered & Weighted PM
Experiments
β’ Examples
22
ππ πππ’ππ‘πππ(π) β ππ πΌππ΄π’π π‘πππ(π), ππ πΌππΌπ‘πππ¦ π , πππ‘ ππ π ππ£ππ (π)
ππ πππππ‘πππππ΄ π β πππππΌππππ΄ π , ππ πΊππ£ π , πππ‘ ππ πππππ‘ππ’πππ‘ππ πππ(π)
πππππΌππππ΄ π β πππ‘πππΌππππ£ππ π , πππππ‘πππππ£ππ π , πππ‘ π€πππΉπππππππ(π)
Summary
β’ Conclusionβ’ Quality-based theory revision under OWA
β’ Partial materialization for ranking revisions
β’ Comparison of ranking methods on real life KGs
β’ Outlookβ’ Extending to higher arity predicates
β’ Binary predicates [Tran et al., to appear ILP2016]
β’ Evidence from text corpora
β’ Exploiting partial completeness
23
References
β’ [Angiulli and Fassetti, 2014] Fabrizio Angiulli and Fabio Fassetti. Exploiting domain knowledge to detect outliers. Data Min. Knowl. Discov., 28(2):519β568, 2014.
β’ [Dimopoulos and Kakas, 1995] Yannis Dimopoulos and Antonis C. Kakas. Learning non-monotonic logic programs: Learning exceptions. In Machine Learning: ECML-95, 8th European Conference on Machine Learning, Heraclion, Crete, Greece, April 25-27, 1995, Proceedings, pages 122β137, 1995.
β’ [Galarraga et al., 2015] Luis Galarraga, Christina Teflioudi, Katja Hose, and Fabian M. Suchanek. Fast Rule Mining in Ontological Knowledge Bases with AMIE+. In VLDB Journal, 2015.
β’ [Law et al., 2015] Mark Law, Alessandra Russo, and Krysia Broda. The ILASP system for learning answer set programs, 2015.
β’ [Leone et al., 2006] Nicola Leone, Gerald Pfeifer, Wolfgang Faber, Thomas Eiter, Georg Gottlob, Simona Perri, and Francesco Scarcello. 2006. The DLV system for knowledge representation and reasoning. ACM Trans. Comput. Logic 7, 3 (July 2006), 499-562.
β’ [Suzuki, 2006] Einoshin Suzuki. Data mining methods for discovering interesting exceptions from an unsupervised table. J. UCS, 12(6):627β653, 2006.
β’ [Tran et al., 2016] Hai Dang Tran, Daria Stepanova, Mohamed H. Gad-Elrab, Francesca A. Lisi, Gerhard Weikum. Towards Nonmonotonic Relational Learning from Knowledge Graphs. ILP2016, London, UK, to appear.
β’ [Katzouris et al., 2015] Nikos Katzouris, Alexander Artikis, and Georgios Paliouras. Incremental learning of event definitions with inductive logic programming. Machine Learning, 100(2-3):555β585, 2015.
24
Related Work
β’ Learning nonmonotonic programsβ’ E.g., [Dimopoulos and Kakas, 1995], ILASP [Law et al., 2015],
ILED [Katzouris et al., 2015], etc.
β’ Outlier detection in logic programsβ’ E.g., [Angiulli and Fassetti, 2014], etc.
β’ Mining exception rulesβ’ E.g., [Suzuki, 2006], etc.
25
Problem Statement: Ruleset Quality
β’ Independent Rule Measure (ππ)β’ Support: π π’ππ π» β π΅ = π π’ππ(π» βͺ π΅)
β’ Coverage: cππ£ π» β π΅ = π π’ππ(π΅)
β’ Confidence: cπππ π» β π΅ =π π’ππ(π»βͺπ΅)
π π’ππ(π΅)
β’ Lift: ππππ‘ π» β π΅ =ππππ(π»βπ΅)
π π’ππ(π»)
β’ β¦
β’ Average Ruleset Quality
26
πππ π ππ, πΊ =
π βπ ππ
ππ(π, πΊ)
π ππ
Problem Statement: Conflicting Predictions
β’ Measuring conflicts (auxiliary rules)
27
π2: ππππππ πππΌππ½π π β βππ π½ππΆπππππ ππ π , πππ‘ ππ’ππππ βπππππ΄(π)
π2ππ’π₯: πππ‘_ππππππ πππΌππ½π π β βππ π½ππΆπππππ ππ π , ππ’ππππ βπππππ΄(π)
πππππππππ‘ π ππ, πΊ =|{(π π , πππ‘_π π ), β¦ }|
{πππ‘_π π , β¦ }
Propositionalization
β’ Unary predicates
28
Binary Predicates
β’ hasType(enstein, scientist)
β’ isMarriedTo(elsa, einstein)
β’ bornIn(einstein, um)
Unary
β’ isAScientist(einstein)
β’ isMarriedToEinstein(elsa)
β’ bornInUlm(einstein)
Abstraction
β’ isAScientist(einstein)
β’ isMarriedToScientist(elsa)
β’ bornInGermany(einstein)
Experiments
β’ Input
β’ Experiment statistics
29
YAGO3 IMDB
Input Facts 10M 2M
Horn Rules 10K 25K
Revised Rules 6K 22K
General-purpose KG Domain-specific KG (Movies)
Experiments
β’ Predictions assessmentβ’ Run DLV on YAGO and π π» then π ππ seperately
β’ Sample facts such that fact π β ππ΄πΊππ»\ππ΄πΊπππ
β’ 73% of the sampled facts were found to be erroneous
30
checked predictions
Ranking Ruleβs Revisions
β’ Partial materialization rankerβ’ Augment the original πΎπΊ with predictions of other rules
β’ Rank revisions on Avg. confidence of the π and πππ’π₯
31
π ππππ ππ , πΎπΊβ =
ππππ ππ , πΎπΊβ + ππππ(ππ
ππ’π₯ , πΎπΊβ)
2
where ππ is the rule π with exception π & πΎπΊβis the augmented πΎπΊ.