
Source: Datamining Lab. at Dongguk University, datamining.dongguk.ac.kr/lectures/2010-2/dm/dm_nn.pdf (2011-01-06)


Neural Network Models

김진석

Department of Statistics and Information Science

Dongguk University

E-mail:[email protected]

September 2008


Contents

Section 1: Structure of the Neural Network Model

Section 2: Fitting the Neural Network Model
    2.1 Learning Algorithms
    2.2 Some issues in learning networks

Section 3: R Practice
    3.1 Edgar Anderson’s Iris Data

Section 4: Comparison of Tree Models and Neural Networks

A neural network is a multi-stage regression or classification model.

• Regression problem

• Classification problem

Figure 1: Feed-forward neural network [diagram with an input layer (outputs y1i, y2i, y3i), a hidden layer (inputs x1h, x2h; outputs y1h, y2h), and an output layer (inputs x1o, x2o; outputs y1o, y2o); edge weights labeled w11, w21 (input to hidden) and v11, v21 (hidden to output)]

Section 1: Structure of the Neural Network Model

A model with p input variables, M hidden units, and K output units:

$$f_k(x) = g_k\left[c_2\big(\sigma(c_1(x))\big)\right] = g_k\left[v_{0k} + \sum_{j=1}^{M} v_{jk}\,\sigma\Big(w_{0j} + \sum_{i=1}^{p} w_{ij}x_i\Big)\right]$$

Here c1 and c2 are called combination functions, and σ and gk are called activation functions.

In the example above, the combination function from the input nodes to the j-th hidden node has the following form:

$$c_1(y) = w_{0j} + \sum_{i=1}^{p} w_{ij}y_i.$$

Likewise, the combination function from the hidden nodes to output node k is

$$c_2(y^h_1, \ldots, y^h_M) = v_{0k} + \sum_{j=1}^{M} v_{jk}y^h_j.$$

The activation function at hidden node j is the logistic function,

$$\sigma(x^h_j) = \frac{\exp(x^h_j)}{1 + \exp(x^h_j)},$$

and the activation function at the output nodes takes one of the following forms:

$$g_k(x) = \begin{cases} \dfrac{\exp(x_k)}{\sum_{l=1}^{K}\exp(x_l)} & \text{softmax},\\[2ex] x_k & \text{identity},\\[1ex] I(x_k > 0) & \text{threshold}. \end{cases}$$

Table 1: Combination and activation functions for network units

                      input units   hidden units   output units
combination (c(x))    identity      linear         linear
activation (f(x))     identity      logistic       logistic, linear, or softmax
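The model of Section 1 can be sketched as a forward pass in base R. This is only an illustration: the function names (sigma, softmax, forward), the layout of the weight matrices W and V (biases in the first row), and the random weight values are assumptions made for the example, not part of the lecture.

```r
# Logistic activation for the hidden units: exp(x) / (1 + exp(x)).
sigma <- function(x) 1 / (1 + exp(-x))

# Softmax activation for the output units (shifted by max(x) for stability).
softmax <- function(x) {
  z <- exp(x - max(x))
  z / sum(z)
}

# Forward pass for one input vector x of length p.
# W: (p + 1) x M matrix, row 1 holds the biases w_0j (assumed layout).
# V: (M + 1) x K matrix, row 1 holds the biases v_0k (assumed layout).
forward <- function(x, W, V) {
  x_h <- as.vector(t(W) %*% c(1, x))    # combination function c1
  y_h <- sigma(x_h)                     # hidden activation
  x_o <- as.vector(t(V) %*% c(1, y_h))  # combination function c2
  softmax(x_o)                          # output activation g_k
}

# Hypothetical weights for a 4-2-3 network (illustration only).
set.seed(1)
W <- matrix(rnorm(5 * 2), nrow = 5)
V <- matrix(rnorm(3 * 3), nrow = 3)
f <- forward(c(5.1, 3.5, 1.4, 0.2), W, V)
```

With the softmax output, f holds one value per class and the values sum to 1, so they can be read as class probabilities.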

Section 2: Fitting the Neural Network Model

With one hidden layer, the parameters (weights) to be estimated are:

$$\{w_{0j}, \mathbf{w}_j : j = 1, \ldots, M\}, \quad M \times (p + 1) \text{ parameters}$$

$$\{v_{0k}, \mathbf{v}_k : k = 1, \ldots, K\}, \quad K \times (M + 1) \text{ parameters}$$

These parameters are estimated by finding the values that minimize the following objective function (also called the error function):

$$E = \sum_{i=1}^{N}\sum_{k=1}^{K} \big(y_{ik} - f_k(x_i)\big)^2 \quad \text{for regression},$$

$$E = -\sum_{i=1}^{N}\sum_{k=1}^{K} y_{ik}\log f_k(x_i) \quad \text{for classification}.$$
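The two error functions can be written directly in base R. The sketch below assumes y and f are N x K matrices (for classification, y is a 0/1 indicator matrix and f holds fitted class probabilities); the function names and the small example matrices are made up for illustration.

```r
# Sum-of-squares error for regression: y and f are N x K matrices.
sse <- function(y, f) sum((y - f)^2)

# Cross-entropy error for classification: y is an N x K 0/1 indicator
# matrix and f holds the fitted class probabilities (all > 0).
cross_entropy <- function(y, f) -sum(y * log(f))

# Hypothetical example with N = 2 observations and K = 3 classes.
y <- rbind(c(1, 0, 0), c(0, 1, 0))
f <- rbind(c(0.7, 0.2, 0.1), c(0.1, 0.8, 0.1))
sse(y, f)
cross_entropy(y, f)
```

Because y is an indicator matrix, only the fitted probability of the true class of each observation contributes to the cross-entropy.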

2.1 Learning Algorithms

• Backpropagation Algorithm

• Steepest descent

• DFP (Davidon-Fletcher-Powell)

• BFGS (Broyden-Fletcher-Goldfarb-Shanno) algorithm
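As a rough illustration of two of the algorithms listed above, the sketch below minimizes a toy objective with a hand-written steepest descent loop and with the BFGS method of base R's optim(). The quadratic objective E, its gradient grad_E, the step size 0.1, and the iteration count are assumptions chosen for the example; in a neural network the gradient would come from backpropagation.

```r
# Hypothetical objective: a simple quadratic with its minimum at (1, -2).
E <- function(w) (w[1] - 1)^2 + 2 * (w[2] + 2)^2
grad_E <- function(w) c(2 * (w[1] - 1), 4 * (w[2] + 2))

# Steepest descent with a fixed step size (learning rate) of 0.1.
w <- c(0, 0)
for (iter in 1:200) {
  w <- w - 0.1 * grad_E(w)
}

# BFGS via base R's optim(), using the same objective and gradient.
fit <- optim(c(0, 0), E, grad_E, method = "BFGS")
```

Both w and fit$par should end up close to (1, -2); BFGS typically needs far fewer steps because it builds up curvature information from successive gradients.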

2.2 Some issues in learning networks

• Starting values

• Overfitting - regularization (stopping rule, weight decay)

• Scaling of input variables


• Number of hidden units and layers

• Multiple minima
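Several of these issues can be handled together in a few lines of R. The sketch below is one possible approach, not a prescribed recipe: it standardizes the inputs with scale(), uses a small weight decay, and refits from five random starting values, keeping the fit with the smallest final objective value. The seed, the number of restarts, and maxit = 200 are arbitrary choices made for the example.

```r
library(nnet)

# Put the four inputs on a common scale (mean 0, standard deviation 1).
x <- scale(iris[, 1:4])
dat <- data.frame(x, Species = iris$Species)

# Fit from several random starting values and keep the fit with the
# smallest final value of the (penalized) objective function.
set.seed(2008)  # arbitrary seed, only for reproducibility
fits <- lapply(1:5, function(s) {
  nnet(Species ~ ., data = dat, size = 2, decay = 5e-4,
       maxit = 200, trace = FALSE)
})
values <- sapply(fits, function(m) m$value)
best <- fits[[which.min(values)]]
```

If the five entries of values differ noticeably, the optimizer has ended in different local minima from different starting values.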

Section 3: R Practice

library(nnet)

nnet(
  formula,  # a formula of the form class ~ x1 + x2 + ...
  weights,  # case weights for each example; default is 1
  size,     # number of units in the hidden layer
  data,     # data frame
  linout,   # switch for linear output units; default is logistic
  entropy,  # switch for entropy fitting; default is least squares
  softmax,  # switch for softmax
  decay,    # parameter for weight decay; default 0
  maxit,    # maximum number of iterations; default 100
  trace,    # switch for tracing optimization; default TRUE
)

3.1 Edgar Anderson’s Iris Data

This famous (Fisher’s or Anderson’s) iris data set gives the measurements in cm of the variables sepal length and width and petal length and width, respectively, for 50 flowers from each of 3 species of iris.

• Target: Species (setosa, versicolor, and virginica).

• Inputs: Sepal.Length, Sepal.Width, Petal.Length, Petal.Width.


> iris[sample(1:150, 6),]
    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
110          7.2         3.6          6.1         2.5  virginica
68           5.8         2.7          4.1         1.0 versicolor
143          5.8         2.7          5.1         1.9  virginica
6            5.4         3.9          1.7         0.4     setosa
103          7.1         3.0          5.9         2.1  virginica
102          5.8         2.7          5.1         1.9  virginica

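pdf("iris.pdf")
# scatterplot matrix of the first three variables; the color and the
# plotting symbol (first letter of the species name) mark the species
plot(iris[, 1:3], col = as.integer(iris$Species),
     pch = substring(as.character(iris$Species), 1, 1))
dev.off()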

Partition data into training data and test data

Figure 2: Iris data [scatterplot matrix of Sepal.Length, Sepal.Width, and Petal.Length; each point is plotted as the first letter of its species name, s or v]

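data(iris)
# use half the iris data for training: 25 flowers sampled from each species
samp <- c(sample(1:50, 25), sample(51:100, 25), sample(101:150, 25))
iris.tr <- iris[samp, ]   # training data (75 rows)
iris.te <- iris[-samp, ]  # test data (the remaining 75 rows)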
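The split above relies on the rows of iris being ordered by species. A variant that does not depend on row order, and that can be reproduced by fixing the random seed, might look like the sketch below. The seed value 123 is arbitrary and the results reported in these slides were not produced with it.

```r
set.seed(123)  # arbitrary seed so that the split can be reproduced

# Sample 25 row indices within each species (stratified sampling).
by_species <- split(seq_len(nrow(iris)), iris$Species)
samp <- unlist(lapply(by_species, function(idx) sample(idx, 25)),
               use.names = FALSE)

iris.tr <- iris[samp, ]
iris.te <- iris[-samp, ]
table(iris.tr$Species)  # class counts in the training data
```

Each species contributes 25 flowers to the training data and the remaining 25 to the test data.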

Neural Network modeling

ir1 <- nnet(Species ~ ., data = iris.tr, size = 2, decay = 5e-4)
# weights: 19
initial value 83.513437
iter 10 value 1.427673
iter 20 value 0.620816
iter 30 value 0.526732
iter 40 value 0.484637
iter 50 value 0.445681
...
iter 90 value 0.370717
iter 100 value 0.369270
final value 0.369270
stopped after 100 iterations

View Neural network model

> names(ir1)
"value"         : value of the error function
"wts"           : parameter estimates (weights)
"fitted.values" : fitted output values
"residuals"     : residuals

> summary(ir1)
a 4-2-3 network with 19 weights
options were - softmax modelling  decay=5e-04
 b->h1 i1->h1 i2->h1 i3->h1 i4->h1    # hidden node 1
 -6.48  -5.24  -3.83   8.15   6.37
 b->h2 i1->h2 i2->h2 i3->h2 i4->h2    # hidden node 2
  0.39   0.61   1.79  -3.02  -1.30
 b->o1 h1->o1 h2->o1                  # output node 1
 -2.48  -1.83   9.14
 b->o2 h1->o2 h2->o2                  # output node 2
  5.96  -9.13  -7.81
 b->o3 h1->o3 h2->o3                  # output node 3
 -3.54  10.84  -1.26
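The weight count reported by nnet agrees with the parameter counts M × (p + 1) and K × (M + 1) from Section 2. A small check, using a helper function n_weights that is introduced here only for illustration:

```r
# Number of weights in a p-M-K network with one hidden layer:
# M * (p + 1) input-to-hidden weights plus K * (M + 1) hidden-to-output
# weights, where the "+ 1" accounts for the bias terms.
n_weights <- function(p, M, K) M * (p + 1) + K * (M + 1)

n_weights(4, 2, 3)  # 2 * 5 + 3 * 3 = 19 for the 4-2-3 iris network
```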

Prediction function: predict new examples with a trained neural net.

predict(
  object,   # a fitted object of class nnet (the model)
  newdata,  # test data set (matrix or data frame)
  type,     # type = "raw": the matrix of values
            #   returned by the trained network;
            # type = "class": the corresponding class
  ...
)
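The relation between the two output types can be illustrated without fitting a model. In the sketch below, the matrix raw is a made-up stand-in for what type = "raw" returns for a softmax network (one row per observation, one column per class), and the class label is taken as the column with the largest value in each row; this reflects the usual reading of type = "class" and is stated here as an assumption.

```r
# Hypothetical type = "raw" output for three flowers (illustration only).
raw <- rbind(c(0.98, 0.02, 0.00),
             c(0.05, 0.90, 0.05),
             c(0.01, 0.30, 0.69))
colnames(raw) <- c("setosa", "versicolor", "virginica")

# Class label: the column with the largest fitted value in each row.
cls <- colnames(raw)[max.col(raw, ties.method = "first")]
```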

Model evaluation: test error

y <- iris.te$Species
p <- predict(ir1, iris.te, type = "class")
tt <- table(y, p)
tt
            p
y            setosa versicolor virginica
  setosa         25          0         0
  versicolor      0         21         4
  virginica       0          0        25
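The test error follows from the confusion matrix as the share of off-diagonal counts. The sketch below re-enters the counts from the table above by hand so that it runs on its own; the object names lev and test_error are introduced only for this example.

```r
# Confusion matrix copied from the table above
# (rows: true species y, columns: predicted species p).
lev <- c("setosa", "versicolor", "virginica")
tt <- matrix(c(25,  0,  0,
                0, 21,  4,
                0,  0, 25),
             nrow = 3, byrow = TRUE, dimnames = list(y = lev, p = lev))

# Misclassification rate: 1 minus the share of counts on the diagonal.
test_error <- 1 - sum(diag(tt)) / sum(tt)  # 4 errors out of 75 flowers
```

Here 4 versicolor flowers are predicted as virginica, so the test error is 4/75, about 0.053.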

Test error by number of hidden units

test.err <- function(h.size)
{
  # fit a network with h.size hidden units on the training data
  ir <- nnet(Species ~ ., data = iris.tr, size = h.size,
             decay = 5e-4, trace = FALSE)
  y <- iris.te$Species
  p <- predict(ir, iris.te, type = "class")
  err <- mean(y != p)   # misclassification rate on the test data
  c(h.size, err)
}

out <- t(sapply(2:10, FUN = test.err))

pdf("nntest.pdf")
plot(out, type = "b", xlab = "The number of Hidden units",
     ylab = "Test Error")
dev.off()

Section 4: Comparison of Tree Models and Neural Networks

library(tree)

ir.t <- tree(Species~., data=iris.tr, minsize=2, mincut=1)

Figure 3: Test error by number of hidden units [line plot; x-axis: the number of hidden units, 2 to 10; y-axis: test error, about 0.014 to 0.026]

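# sum the cross-validated misclassification counts over 5 runs
cvt <- cv.tree(ir.t, FUN = prune.misclass)
for (i in 2:5)
{
  cvt$dev <- cvt$dev + cv.tree(ir.t, FUN = prune.misclass)$dev
}
cvt$dev <- cvt$dev / 5   # average once, after all 5 runs are summed

K <- cvt$size[which.min(cvt$dev)]   # tree size with the smallest CV error
ir.tp <- prune.misclass(ir.t, best = K)

pdf("tree_iris.pdf")
par(mfrow = c(1, 2))
plot(cvt); plot(ir.tp); text(ir.tp)
dev.off()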

Comparison of the tree model with the NN

> p <- predict(ir.tp, iris.te, type = "class")
> err <- mean(iris.te$Species != p)
> err
[1] 0.06666667
> test.err(2) ## NN model with 2 hidden units
[1] 2.00000000 0.05333333
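To put the two error rates in perspective, they can be converted back to counts of misclassified test flowers. The sketch below simply multiplies the two rates reported above by the test set size of 75; the vector names are made up for the example.

```r
n_test <- 75  # size of the test data (25 flowers per species)

# Error rates reported above for the pruned tree and the 2-unit network.
rates <- c(tree = 0.06666667, nnet = 0.05333333)

# Number of misclassified test flowers behind each rate.
errors <- round(rates * n_test)
```

The tree misclassifies 5 of the 75 test flowers and the network 4, so on this single split the two models differ by only one flower.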

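Figure 4: Tree model with Iris data [left panel: cross-validated misclassification count (misclass) against tree size, 1 to 5; right panel: the pruned tree with splits Petal.Length < 2.3, Petal.Length < 4.95, Petal.Width < 1.65, and Sepal.Length < 6, and leaves labeled setosa, versicolor, and virginica]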