
Page 1:

Theoretical Analysis of Multi-Instance Learning

Zhang Min-Ling, Zhou Zhi-Hua ([email protected])

National Key Laboratory for Novel Software Technology, Nanjing University. 2002.10.11

Page 2:

Outline
- Introduction
- Theoretical analysis
  - PAC learning model
  - PAC learnability of APR
  - Real-valued multi-instance learning
- Future work

Page 3:

Introduction: Origin

Multi-instance learning originated from the problem of “drug activity prediction”, and was first formalized by T. G. Dietterich et al. in their seminal paper “Solving the multiple-instance problem with axis-parallel rectangles” (1997).

Later, in 2001, J.-D. Zucker and Y. Chevaleyre extended the concept of “multi-instance learning” to “multi-part learning”, and pointed out that many previously studied problems are “multi-part” problems rather than “multi-instance” ones.

Page 4:

Introduction (cont'd)

Comparisons

Fig.1. The shape of a molecule changes as it rotates its bonds

Fig.2. Classical and multi-instance learning frameworks

Drug activity prediction problem
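To make the multi-instance framework of Fig.2 concrete, below is a minimal sketch (in Python, with hypothetical names such as `bag_label`) of the standard multi-instance assumption underlying drug activity prediction: a bag is the set of conformations of one molecule, and the bag is positive iff at least one instance in it is positive.

```python
from typing import Callable, List, Sequence

Instance = Sequence[float]  # one feature vector, e.g. a molecule conformation
Bag = List[Instance]        # a bag = all conformations of one molecule

def bag_label(bag: Bag, instance_classifier: Callable[[Instance], bool]) -> bool:
    """Standard multi-instance assumption: a bag is positive
    iff at least one of its instances is classified positive."""
    return any(instance_classifier(x) for x in bag)
```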

Page 5:

Introduction (cont'd): Experimental data

Dataset   #dim   #bags   #pos bags   #neg bags   #instances   #instances/bag (max / min / ave)
musk1     166    92      47          45          476          40 / 2 / 5.17
musk2     166    102     39          63          6598         1044 / 1 / 64.69

APR (Axis-Parallel Rectangles) algorithms

Fig.3. APR algorithms
- GFS elim-count APR (standard)
- GFS elim-kde APR (outside-in)
- Iterated discrim APR (inside-out)

Best result (iterated discrim APR): musk1 92.4%, musk2 89.2%
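As a sketch of what an APR hypothesis does (a minimal illustration with hypothetical names `inside` and `predict_bag`, not Dietterich et al.'s actual training algorithms): an axis-parallel rectangle is one interval per feature dimension, and a bag is predicted positive iff some instance falls inside the rectangle.

```python
from typing import List, Sequence, Tuple

# An axis-parallel rectangle: one (lower, upper) interval per feature dimension.
APR = List[Tuple[float, float]]

def inside(x: Sequence[float], apr: APR) -> bool:
    """True iff instance x lies within the rectangle in every dimension."""
    return all(lo <= xi <= hi for xi, (lo, hi) in zip(x, apr))

def predict_bag(bag: List[Sequence[float]], apr: APR) -> bool:
    """A bag (molecule) is positive iff some instance (conformation) is inside the APR."""
    return any(inside(x, apr) for x in bag)
```

The three APR algorithms above differ in how they grow or shrink the rectangle from the training bags, not in this prediction rule.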

Page 6:

Introduction (cont'd): Various algorithms
- APR (T. G. Dietterich et al. 1997)
- MULTINST (P. Auer 1997)
- Diverse Density (O. Maron 1998)
- Bayesian-kNN, Citation-kNN (J. Wang et al. 2000)
- Relic (G. Ruffo 2000)
- EM-DD (Q. Zhang & S. A. Goldman 2001)
- ……

Page 7:

Introduction (cont'd): Comparison on benchmark data sets

Algorithm               Musk1 (%correct)   Musk2 (%correct)
iterated-discrim APR    92.4               89.2
Citation-kNN            92.4               86.3
Diverse Density         88.9               82.5
RELIC                   83.7               87.3
MULTINST                76.7               84.0
BP                      75.0               67.7
C4.5                    68.5               58.8

Fig.4. A comparison of several multi-instance learning algorithms

Page 8:

Introduction (cont'd): Application areas
- Drug activity prediction (T. G. Dietterich et al. 1997)
- Stock prediction (O. Maron 1998)
- Learning a simple description of a person from a series of images (O. Maron 1998)
- Natural scene classification (O. Maron & A. L. Ratan 1998)
- Event prediction (G. M. Weiss & H. Hirsh 1998)
- Data mining and computer security (G. Ruffo 2000)
- ……

Multi-instance learning has been regarded as the fourth machine learning framework, parallel to supervised learning, unsupervised learning, and reinforcement learning.

Page 9:

Theoretical analysis
- PAC learning model
  - Definition and its properties
  - VC dimension
- PAC learnability of APR
- Real-valued multi-instance learning

Page 10:

Theoretical Analysis: PAC model

Computational learning theory
- L. G. Valiant (1984), “A theory of the learnable”
- Deductive learning
- Used for constructing a mathematical model of a cognitive process.

Fig.5. Diagram of a framework for learning

Page 11:

PAC model (cont'd): Definition of PAC learning

We say that a learning algorithm L is a PAC (probably approximately correct) learning algorithm for the hypothesis space H if, given
- a confidence parameter δ (0 < δ < 1), and
- an accuracy parameter ε (0 < ε < 1),
there is a positive integer m_L = m_L(δ, ε) such that
- for any target concept t ∈ H, and
- for any probability distribution µ on X,
whenever m ≥ m_L, µ^m{ s ∈ S(m,t) | er_µ(L(s), t) < ε } > 1 − δ.
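For reference, the error term in this definition is the probability (under µ) that the output hypothesis disagrees with the target; this standard definition is not spelled out on the slide:

```latex
% Standard definition of the error of a hypothesis h w.r.t. target t under mu:
\mathrm{er}_{\mu}(h, t) \;=\; \mu\{\, x \in X \;:\; h(x) \neq t(x) \,\},
% so the PAC requirement reads
\mu^{m}\{\, s \in S(m,t) \;:\; \mathrm{er}_{\mu}(L(s), t) < \varepsilon \,\} \;>\; 1 - \delta
\quad \text{for all } m \ge m_L(\delta, \varepsilon).
```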

Page 12:

PAC model (cont'd): Properties of a PAC learning algorithm
- It is probable that a useful training sample is presented.
- One can only expect that the output hypothesis is approximately correct.
- m_L depends upon δ and ε, but not on t and µ.

If there is a PAC learning algorithm for a hypothesis space H, then we say that H is PAC-learnable.

Efficient PAC learning algorithms
- If the running time of a PAC learning algorithm L is polynomial in 1/δ and 1/ε, then L is said to be efficient.
- It is usually necessary to require a PAC learning algorithm to be efficient.

Page 13:

PAC model (cont'd): VC dimension

The VC (Vapnik-Chervonenkis) dimension of a hypothesis space H is a notion originally defined by Vapnik and Chervonenkis (1971), and was introduced into computational learning theory by Blumer et al. (1986).

The VC dimension of H, denoted VCdim(H), describes the ‘expressive power’ of H in a sense. Generally, the greater VCdim(H) is, the greater the ‘expressive power’ of H, and the more difficult H is to learn.
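A concrete example, relevant to the APR class analyzed later (a well-known fact, added here for illustration; it does not appear on the slide): the class of axis-parallel rectangles in d dimensions has VC dimension 2d.

```latex
% VC dimension of axis-parallel rectangles (boxes) in d dimensions:
\mathrm{VCdim}\bigl(\{\,[a_1,b_1]\times\cdots\times[a_d,b_d]\,\}\bigr) \;=\; 2d.
% E.g. in the plane (d = 2), a rectangle can shatter 4 suitably placed points,
% but no set of 5 points can be shattered.
```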

Page 14:

PAC model (cont'd): Consistency

If for any target concept t ∈ H and any training sample s = ((x1,b1), (x2,b2), ..., (xm,bm)) for t, the corresponding hypothesis L(s) ∈ H agrees with s, i.e. L(s)(xi) = t(xi) = bi for all i, then we say that L is a consistent algorithm.

VC dimension and PAC learnability

If L is a consistent learning algorithm for H, and H has finite VC dimension, then H is PAC-learnable.
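This implication can be made quantitative. The following is the classical sufficient sample size for a consistent learner, due to Blumer et al. (a standard result, added here for context; constant factors vary across statements of the theorem):

```latex
% Sufficient sample size for a consistent learner over H with finite VC dimension:
m_L(\delta, \varepsilon) \;=\; O\!\left( \frac{1}{\varepsilon}
   \left( \mathrm{VCdim}(H)\,\log\frac{1}{\varepsilon} + \log\frac{1}{\delta} \right) \right).
```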

Page 15:

Theoretical Analysis: PAC learning of APR

Early work

Soon after T. G. Dietterich et al. proposed their three APR algorithms for multi-instance learning, P. M. Long & L. Tan (1997) gave a theoretical analysis of the PAC learnability of APR and showed that if
- each instance in a bag is drawn from a product distribution, and
- all instances in a bag are drawn independently,
then APR is PAC-learnable under the multi-instance learning framework, with sample complexity O((d²n⁶/ε¹⁰) log(nd/δ)) and time complexity O((d⁵n¹²/ε²⁰) log(nd/δ)).

Page 16:

PAC learning of APR (cont'd): A hardness result

Via the analysis of VC dimension, P. Auer et al. (1998) gave a much more efficient PAC learning algorithm than Long & Tan's, with sample complexity O((d²n²/ε²) log d) and time complexity O(d³n²/ε²).

More importantly, they proved that if the instances in a bag are not independent, then learning APR under the multi-instance learning framework is as hard as learning DNF formulas, an NP-complete problem.

Page 17:

PAC learning of APR (cont'd): A further reduction

A. Blum & A. Kalai (1998) further studied the problem of PAC learning APR from multi-instance examples, and proved that:
- If H is PAC-learnable from 1-sided (or 2-sided) random classification noise, then H is PAC-learnable from multi-instance examples. (Intuitively, labeling a random instance from each bag with the bag's label yields correctly labeled negatives but possibly mislabeled positives, i.e. one-sided noise.)
- Via a reduction to the “Statistical Query” model (M. Kearns 1993), APR is PAC-learnable from multi-instance examples with sample complexity O(d²n/ε²) and time complexity O(d³n²/ε²).

Page 18:

PAC learning of APR (cont'd): Summary

                    Sample complexity           Time complexity             Constraints                                   Theoretical tools
P. M. Long et al.   O((d²n⁶/ε¹⁰) log(nd/δ))    O((d⁵n¹²/ε²⁰) log(nd/δ))   product distribution, independent instances   p-concept, VC dimension
P. Auer et al.      O((d²n²/ε²) log d)          O(d³n²/ε²)                  independent instances                         VC dimension
A. Blum et al.      O(d²n/ε²)                   O(d³n²/ε²)                  independent instances                         statistical query model, VC dimension

Fig.6. A comparison of the three theoretical algorithms
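To make the asymptotic comparison tangible, here is a small sketch (Python) that evaluates the sample-complexity bounds from Fig.6 for illustrative parameter values; the formulas are the reconstructed ones above, with constants and some log factors dropped, so only the relative orders of magnitude are meaningful.

```python
import math

# Reconstructed sample-complexity bounds from Fig.6 (constants dropped;
# only relative magnitudes are meaningful).
def long_tan(d, n, eps, delta):
    return d**2 * n**6 / eps**10 * math.log(n * d / delta)

def auer(d, n, eps, delta):
    return d**2 * n**2 / eps**2 * math.log(d)

def blum_kalai(d, n, eps, delta):
    return d**2 * n / eps**2

# A musk1-like setting: d = 166 dimensions, about 5 instances per bag.
d, n, eps, delta = 166, 5, 0.1, 0.05
for name, bound in [("Long & Tan", long_tan), ("Auer et al.", auer),
                    ("Blum & Kalai", blum_kalai)]:
    print(f"{name:>13}: {bound(d, n, eps, delta):.2e} examples")
```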

Page 19:

Theoretical Analysis: Real-valued multi-instance learning

It is worth noting that in several applications of the multiple-instance problem, the actual predictions desired are real-valued. For example, the binding affinity between a molecule and a receptor is quantitative, so a real-valued label of binding strength is preferable.

S. Ray & D. Page (2001) showed that the problem of multi-instance regression is NP-complete. Furthermore, D. R. Dooly et al. (2001) showed that learning from real-valued multi-instance examples is as hard as learning DNF.

At nearly the same time, R. A. Amar et al. (2001) extended the kNN, Citation-kNN, and Diverse Density algorithms to real-valued multi-instance learning. They also provided a flexible procedure for generating chemically realistic artificial data sets and studied the performance of the modified algorithms on them.

Page 20:

Future work
- Further theoretical analysis of multi-instance learning.
- Design multi-instance modifications of neural networks, decision trees, and other popular machine learning algorithms.
- Explore more issues which can be translated into multi-instance learning problems.
- Design appropriate bag-generating methods.
- ……

Page 21:

Thanks