A Study of Automatic Medical Image Segmentation using … · 2019-06-28

A Study of Automatic Medical Image Segmentation using Independent Component Analysis

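The Graduate School, Yonsei University

Interdisciplinary Program in Biomedical Engineering

Electrical and Electronic Engineering Major

Soohyun Bae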

A Study of Automatic Medical Image Segmentation using Independent Component Analysis

Advisor: Professor Sun Kook Yoo (유선국)

Submitted as a doctoral dissertation

June 2001

Interdisciplinary Program in Biomedical Engineering

Soohyun Bae

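This certifies that the doctoral dissertation of Soohyun Bae is approved.

Committee member (seal)

Committee member (seal)

Committee member (seal)

Committee member (seal)

Committee member (seal)

June 2001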

Dedicated to My Parents

I dedicate this dissertation to my parents, who always prayed for me.

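Baengnokdam (白鹿潭): A Sketch of Hallasan, by Jeong Ji-yong (정지용)

1

The nearer the summit, the more the height of the cuckoo-flowers wears away. Climb one ridge and their waists vanish; one ridge higher and their necks are gone; in the end only their faces peep out. They are stamped in place like a floral pattern. Where the wind is as cold as at the far end of Hamgyeong Province, the cuckoo-flowers lose their height altogether, yet through the season of August they bloom in profusion like scattered stars. When the mountain shadows darken, stars light up in the cuckoo-flower field even so. The stars shift where they stand. Here I was exhausted.

2

I wet my throat with crowberries, fruit as pretty as pills, and rose up alive.

3

Beside white birches, a white birch lives until it becomes a skeleton. That I should die and be as white as a birch is not unsightly.

4

At a bend so lonely that even ghosts will not live there, the goblin-flower, frightened to be alone even by day, turns pale blue.

5

Now, six thousand cheok above the sea, horses and cattle live thinking little of people. Horses keep with horses, cattle with cattle; a foal follows a mother cow, a calf follows a mother mare, and soon they part.

6

A cow suffered greatly giving birth to her first calf. In her daze she ran a hundred ri of mountain road around to Seogwipo. The calf, bereft of its mother before it was even dry, cried "ummae, ummae." It clung to horses and to hikers alike. I wept that our young, too, would be left to mothers of a different coat.

7

The scent given off by wind orchids, the sound of orioles calling to one another, the whistling of Jeju bush warblers, the sound of water rolling separately over stones, the swish, swish of pines when the far-off sea crumples; among ash, camellia and oak I lost my way, then came out again onto a winding path of white stones where arrowroot vines crept. A dappled horse I suddenly met does not shy away.

8

Royal fern, bracken, deodeok shoots, balloon flowers, chwi greens, satgat herb, bamboo grass, rock tripe, alpine plants hung with star-like bells: I take them in, grow drunk on them, and sleep. Longing for the clear water of Baengnokdam, the procession formed on the mountain range is grander than the clouds. Beaten by sudden showers, dried by rainbows, with flower stains pressed onto my rump, my flesh swells.

9

In the blue water of Baengnokdam, where not even a crayfish crawls, the sky turns. A cow walked around my legs, weary almost to the point of being crippled. Even a wisp of cloud driven here clouds Baengnokdam. Baengnokdam, laid over my face for half a day, is lonely. Waking and dozing, I forgot even to pray.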

CONTENTS

LIST OF FIGURES

LIST OF TABLES

ABSTRACT

1. INTRODUCTION

2. BASIC CONCEPTS OF INDEPENDENT COMPONENT ANALYSIS

2.1 Principal Component Analysis

2.2 Independent Component Analysis

2.3 Overcomplete Representations of Independent Component Analysis

2.4 Medical Image Segmentation Techniques

3. MEDICAL IMAGE SEGMENTATION USING INDEPENDENT COMPONENT ANALYSIS

3.1 Acquiring CT Image

3.2 Acquiring MR Image

3.3 Methodology

4. RESULTS & ANALYSIS

4.1 Simple Test Example

4.2 Evaluations of Automatic Segmentation with Test Data

4.3 Automatic Segmentation with Medical Images

4.4 Evaluations of Medical Image Segmentation

5. CONCLUSION

REFERENCES

ABSTRACT (In Korean)

LIST OF FIGURES

2.1 Example 2-D data distribution and the corresponding principal component and independent component axes [28].

2.2 A simple example of principal component analysis

2.3 The instantaneous mixing and unmixing model

2.4 Optimal information flow in sigmoidal neurons

2.5 Illustration of basis vectors in a two-dimensional data space with two sparse sources (top) or three sparse sources (bottom)

2.6 A hypothetical frequency distribution f(I) of intensity values I(x, y) for fat, muscle and bone in a CT image. Low intensity values correspond to fat tissues, whereas high intensity values correspond to bone

3.1 A simplified x-ray beam I_0 attenuated through a pixel and the resulting attenuated beam I. A two-dimensional matrix of linear attenuation coefficients of the image.

3.2 Selected image data used in this experiment.

3.3 Flow chart of medical image segmentation using independent component analysis

3.4 Examples of three levels of kurtosis. Each of the distributions has the same variance

4.1 Test example of independent component analysis

4.2 The simple test image consisting of two cylinders and one ellipse

4.3 The result of automatic segmentation

4.4 The selected original data used in evaluation

4.5 Small squares extracted from mixed test data

4.6 Large squares extracted from mixed test data

4.7 Plot of PE and UMA values for small squares

4.8 Plot of PE and UMA values for large squares

4.9 The kurtosis graph of the input data set

4.10 The input image and the segmentation result

4.11 The segmentation result using threshold

4.12 The segmentation result of CT image

4.13 Result of 16 selected axial CT image segmentation

4.14 Volume rendered image using the result of Figure 4.6

4.15 The result of manual segmentation

4.16 The sensitivity (True Positive Rate) comparison between the independent component analysis method and the thresholding method

4.17 The result of "False Positive Rate" comparison

4.18 The result of mislabelling rate comparison

LIST OF TABLES

4.1 Probability of Error and Ultimate Measurement Accuracy values of independent components.

4.2 The kurtosis of input data set

4.3 The result of the paired t-test on sensitivity at the 0.05 significance level

4.4 The result of the paired t-test on "False Positive Rate" at the 0.05 significance level

4.5 The result of the paired t-test on mislabelling rate at the 0.05 significance level

ABSTRACT

A Study of Automatic Medical Image Segmentation using Independent Component Analysis

Soohyun Bae

Dept. of Biomedical Engineering

The Graduate School

Yonsei University

Medical image segmentation is a process that partitions original medical images into homogeneous regions that are meaningful for computer-aided diagnosis and visualization. This dissertation proposes an automatic medical image segmentation technique based on independent component analysis. Independent component analysis is a method for solving the blind source separation problem. It finds a linear coordinate system such that the resulting signals are as statistically independent as possible.

Using independent component analysis as an automatic medical image segmentation technique allows more accurate segmentation of medical images, under the assumption that medical images consist of statistically independent parts.

The proposed method is applied to CT images and numerically synthesized test data to demonstrate the performance of automatic segmentation. The performance evaluation measures chosen in this dissertation were Probability of Error (PE) and Ultimate Measurement Accuracy (UMA). The result of automatic segmentation was also compared with a general threshold-based segmentation method in terms of sensitivity (True Positive Rate), specificity (1 - False Positive Rate) and mislabelling rate. A paired t-test was performed on the evaluation results. For the test data, most of the PE and UMA values are close to zero, the True Positive Rates (TPR) are over 95 percent, the False Positive Rates (FPR) are 1 percent, and the mislabelling rate is near 1 percent. This indicates that the automatic method demonstrated in this dissertation performs well. Paired t-tests on TPR, FPR and mislabelling rate at the 0.05 significance level give p-values much lower than the significance level, which indicates that the automatic method proposed in this dissertation performs better than the general method.

Key words: Segmentation, Independent Component Analysis, Medical Image, Statistically independent component

CHAPTER 1

INTRODUCTION

Compared with other diagnostic imaging modalities, medical images such as Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) offer high spatial resolution and excellent discrimination of soft tissues, bones and other internal organs. These kinds of medical images provide rich information about anatomical structure, enabling quantitative pathological or clinical studies, the derivation of computerized anatomical atlases, and pre- and intra-operative guidance for therapeutic intervention.

Advanced applications that use the morphologic contents of medical images frequently require segmentation of the imaged volume into organ types. This problem has received considerable attention: the comprehensive survey article by Bezdek et al. [1] lists 90 citations. Segmentation of medical images is a prerequisite for a variety of image analysis and visualization tasks. It is often performed by commercial software using existing algorithms such as region growing, thresholding, boundary detection, and morphological filtering [2]-[7]. However, incomplete segmentation frequently occurs because of several difficulties. First, partial volume artifacts due to small gaps between voxels compromise the resolution of medical images, which leads to ambiguous boundaries and mixed voxels. Second, structures adjacent to the target in the medical images have similar intensity distributions. As a result, the target cannot easily be separated from other interconnected structures. Hence, localization of the features from a large number of sectional images represents a major challenge.

Manual modification is the most common way of completing a segmentation when automatic segmentation is incomplete [3]-[6][8]. It is a time-consuming and tedious task because extensive user interaction is needed to correct the incorrectly segmented border on each sectional image. It also requires experienced users to define the features on medical images carefully, because of the complexity of those features. Nevertheless, the resulting rendered surfaces still appear uneven, and even the results produced by experienced users are not consistent.

Many segmentation approaches exist to eliminate the labor-intensive work of the manual method. One of them is the deformable active contour model, which is also referred to as the snake model [8]-[13]. It has been applied successfully to a variety of image segmentation tasks, but it still requires user interaction to locate the initial contour and adjust the internal parameters of the model. Another approach is knowledge-based automatic segmentation tuned specifically to a particular human organ [14]. However, it is generally thought to be extremely difficult to formalize complex knowledge of the human anatomy. A promising alternative is interactive segmentation, which lets users guide the segmentation process based on 3D visual feedback [15][16]. Instead of relying on poorly represented knowledge, it utilizes an expert's judgement. Although the feasibility and utility of the interactive segmentation approach have been demonstrated in CT and MR image segmentation, the obstacles to wide application are the much-increased computational power needed for real-time 3D visualization and the need for skilled operators to resolve mixed structures.

In this dissertation, I propose a new automatic approach to segmenting the features from medical images using Independent Component Analysis (ICA), also known as Blind Source Separation (BSS). Independent component analysis decomposes each signal of an ensemble into components (also called 'basis vectors') that are as independent as possible by a linear transformation of the signals [17]-[19]. The amplitude of a particular component is extracted by a corresponding weight vector (also called a 'filter' [17]). Using these characteristics of independent component analysis, I propose a new automatic algorithm to segment medical images.

This thesis is organized as follows. Chapter 2 presents the basic concepts of independent component analysis, principal component analysis and some evaluation methods. Chapter 3 describes how the medical images are acquired, the assumptions chosen in this thesis and the methodology. Chapter 4 shows the results of my experiments and their evaluation. Chapter 5 gives the conclusion and describes future work.

CHAPTER 2

BASIC CONCEPTS OF INDEPENDENT COMPONENT ANALYSIS

The goal of blind signal separation is to recover independent sources given only sensor observations that are linear mixtures of independent source signals [26]. The term blind indicates that both the source signals and the way the signals were mixed are unknown. Independent component analysis is a method that can be used to solve the blind signal separation problem. It finds weight vectors that make the resulting signals as statistically independent from each other as possible. Other algorithms, such as Principal Component Analysis (PCA), have also been used for source separation. Principal component analysis is a correlation-based transformation and is perhaps the oldest and best-known technique in multivariate analysis [32][51]. It is closely related to the Singular Value Decomposition (SVD) and separates only the second-order dependencies. In contrast to correlation-based transformations, independent component analysis not only decorrelates the signals (second-order statistics) but also reduces higher-order statistical dependencies [26]. Independent component analysis is a generalization of principal component analysis that separates the high-order dependencies in the input, in addition to the second-order dependencies [27]. Principal component analysis is a way of encoding second-order dependencies in the data by rotating the axes to correspond to directions of maximum covariance. Consider a set of data points derived from two underlying distributions as shown in Figure 2.1.

Figure 2.1 Example 2-D data distribution and the corresponding principal component and independent component axes [28].

Principal component analysis models the data as a multivariate Gaussian and would place an orthogonal set of axes such that the projections of the two distributions would be completely overlapping. Independent component analysis does not constrain the axes to be orthogonal, and attempts to place them in the directions of statistical dependencies in the data. Each weight vector in independent component analysis attempts to separate a portion of the dependencies in the input, so that the dependencies are removed from between the elements of the output. The projection of the two distributions onto the independent component analysis axes would have less overlap, and the output distributions of the two weight vectors would be highly kurtotic [29]. This means that a large concentration of the output values is near zero, with rare occurrences of large positive or negative values in the tails. This is equivalent to the minimum entropy codes discussed by Barlow [24].
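The notion of a highly kurtotic output can be made concrete with a small numerical sketch. The Python snippet below assumes NumPy; the helper name `excess_kurtosis`, the sample size and the seed are illustrative choices and are not taken from the thesis. It compares the excess kurtosis of Gaussian samples with that of Laplacian samples of the same variance, the Laplacian being one example of a distribution with many values near zero and occasional large values in the tails.

```python
import numpy as np

def excess_kurtosis(v):
    """Fourth standardized moment minus 3, so a Gaussian scores about 0."""
    v = np.asarray(v, dtype=float)
    centered = v - v.mean()
    variance = np.mean(centered ** 2)
    return np.mean(centered ** 4) / variance ** 2 - 3.0

rng = np.random.default_rng(0)   # fixed seed, chosen only for repeatability
n = 200_000

gaussian = rng.normal(loc=0.0, scale=1.0, size=n)
# A Laplace density with scale b has variance 2*b**2; b = 1/sqrt(2) gives unit variance.
laplacian = rng.laplace(loc=0.0, scale=1.0 / np.sqrt(2.0), size=n)

k_gauss = excess_kurtosis(gaussian)
k_laplace = excess_kurtosis(laplacian)
print(f"excess kurtosis, Gaussian : {k_gauss:.3f}")
print(f"excess kurtosis, Laplacian: {k_laplace:.3f}")
```

A clearly positive value for the Laplacian samples and a value near zero for the Gaussian samples is the behaviour the text describes as super-Gaussian or kurtotic.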

Bell and Sejnowski [19] developed an algorithm for separating the statistically independent sources of a mixed signal through unsupervised learning. The algorithm is based on the principle of maximum information transfer through sigmoidal functions. It generalizes Linsker's information maximization principle [30] to the multi-unit case and maximizes the joint entropy of the output units. Another way of describing the difference between principal component analysis and independent component analysis is therefore that principal component analysis maximizes the joint variance (covariance) of the outputs, whereas independent component analysis maximizes the joint entropy of the outputs.

2.1 Principal Component Analysis

Principal component analysis [31][32] is a well-established technique for dimensionality reduction, that is, for representing a set of $m$-dimensional vectors with $n < m$ components for each vector so that as little information as possible is lost. It was first introduced by Pearson (1901), who used it in a biological context to recast linear regression analysis into a new form. It was then developed by Hotelling (1933) in work on psychometry. It appeared once again, and quite independently, in the setting of probability theory, as considered by Karhunen (1947), and was subsequently generalized by Loève (1963) [31]. Examples of its many applications include data compression, image processing, visualization, exploratory data analysis, pattern recognition and time series prediction. In terms of linear algebra, the problem of dimensionality reduction consists of finding a new basis for the data so that if we drop or zero some of the components in the new basis, the reconstruction error is as small as possible. Consider the simple example of Figure 2.2. In Figure 2.2(a) we have a two-dimensional data set in which there is a significant correlation (linear dependence) between the components. If we project the data onto the subspace indicated by the solid line in the figure, that is, represent the data with just one number, the value of the projection onto the subspace, we get the data set of Figure 2.2(b).

Figure 2.2 A simple example of principal component analysis

(a) A data set and its principal axes

(b) Reduction of the data to one dimension

The most common derivation of principal component analysis is in terms of a standardized linear projection which maximizes the variance in the projected space [33]. For a set of observed $d$-dimensional data vectors $\{t_n\}$, $n \in \{1, \dots, N\}$, the $q$ principal axes $w_j$, $j \in \{1, \dots, q\}$, are those orthonormal axes onto which the retained variance under projection is maximal. It can be shown that the vectors $w_j$ are given by the $q$ dominant eigenvectors (i.e. those with the largest associated eigenvalues $\lambda_j$) of the sample covariance matrix

$$S = \frac{1}{N} \sum_n (t_n - \bar{t})(t_n - \bar{t})^T,$$

where $\bar{t}$ is the data sample mean, such that $S w_j = \lambda_j w_j$. The $q$ principal components of the observed vector $t_n$ are given by the vector $x_n = W^T (t_n - \bar{t})$, where $W = (w_1, w_2, \dots, w_q)$. The variables $x_j$ are then uncorrelated, such that the covariance matrix $\frac{1}{N} \sum_n x_n x_n^T$ is diagonal with elements $\lambda_j$.

A complementary property of principal component analysis, and the one most closely related to the original discussions of Pearson, is that, of all orthogonal linear projections $x_n = W^T (t_n - \bar{t})$, the principal component projection minimizes the squared reconstruction error $\sum_n \| t_n - \hat{t}_n \|^2$, where the optimal linear reconstruction of $t_n$ is given by $\hat{t}_n = W x_n + \bar{t}$.
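The eigenvector construction and the reconstruction property above can be checked numerically. The following is a minimal sketch in Python with NumPy, under stated assumptions: the synthetic data, the sizes `N`, `d`, `q` and all variable names are illustrative and do not come from the thesis. It builds the sample covariance matrix, takes the dominant eigenvectors as principal axes, and compares the summed squared reconstruction error with `N` times the sum of the discarded eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, q = 500, 2, 1

# Synthetic, correlated 2-D data (illustrative only): rows of T are the vectors t_n.
latent = rng.normal(size=(N, d))
T = latent @ np.array([[2.0, 0.0], [1.2, 0.5]]) + np.array([3.0, -1.0])

t_bar = T.mean(axis=0)                      # sample mean
Tc = T - t_bar                              # centred data
S = Tc.T @ Tc / N                           # sample covariance matrix S

# eigh returns eigenvalues in ascending order; reorder so the largest comes first.
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

W = eigvecs[:, :q]                          # the q principal axes w_j as columns
X = Tc @ W                                  # principal components x_n = W^T (t_n - t_bar)
T_hat = X @ W.T + t_bar                     # reconstruction W x_n + t_bar

# Covariance of the full set of projections is diagonal with the eigenvalues.
X_all = Tc @ eigvecs
cov_proj = X_all.T @ X_all / N

recon_error = np.sum((T - T_hat) ** 2)      # summed squared reconstruction error
discarded = N * eigvals[q:].sum()           # N times the dropped eigenvalues
print(np.round(cov_proj, 6))
print(recon_error, discarded)
```

The two printed scalars should agree up to rounding, which is the sense in which dropping the axes with the smallest eigenvalues loses the least variance.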

However, a notable feature of these definitions of principal component analysis (and one remarked upon in many texts) is the absence of an associated probabilistic model for the observed data [33].

2.2 Independent Component Analysis

Two events $A$ and $B$ are called independent if

$$P(A \cap B) = P(A)\,P(B) \tag{2.1}$$

Noting that the conditional probability $P(B \mid A)$ is given by

$$P(B \mid A) = \frac{P(A \cap B)}{P(A)}, \tag{2.2}$$

we can see that independence implies $P(B \mid A) = P(B)$, if $P(A) \neq 0$.

Assume that we have an input signal which is composed of some mixed source signals, and that the mixing process and any prior information about the input signal are unknown. If the probabilities of the output signals can be made to satisfy Eq. (2.1), the output signals are independent components of the input signal. Making the output signals as independent as possible is the goal of independent component analysis, and it can be accomplished by maximizing the mutual information between the input and the output signals. The mutual information between $X$ and $Y$ is the sum of the marginal entropies minus the joint entropy. It is defined as

$$I(Y, X) = H(X) + H(Y) - H(X, Y) = H(X) - H(X \mid Y) = H(Y) - H(Y \mid X) \tag{2.3}$$

where $X$ is the input, $Y$ is the output and $H(Y)$ is the entropy of the output, while $H(Y \mid X)$ is whatever entropy the output has which did not come from the input [19]. Entropy is defined as

$$H(X) = \sum_{x \in X} P(x) \log \frac{1}{P(x)}, \tag{2.4}$$

where the ensemble $X$ is a random variable $x$ with a set of possible outcomes. For $P(x) = 0$, the corresponding term of the entropy is zero by definition. $H(X)$ is always greater than or equal to zero. $H(X, Y)$ is the joint entropy of the two variables. The mutual information $I(Y, X)$ can be interpreted as the redundancy between $X$ and $Y$ or, alternatively, as the reduction in uncertainty of one variable (e.g. $X$) due to the observation of the other variable $Y$. Independent component analysis attempts to maximize the joint entropy of suitably transformed component maps using a neural network, and in so doing reduces the redundancy between the distributions of map values for different components. This, in effect, results in blind separation of the mixed signal into spatially independent components.
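The entropy and mutual information identities can be illustrated on a small discrete example. The sketch below is in Python with NumPy; the two joint probability tables and the function names `entropy` and `mutual_information` are made up for illustration and are not from the thesis. It evaluates the mutual information as the sum of marginal entropies minus the joint entropy, compares it with the conditional-entropy form, and shows that it vanishes for a joint table that factorizes.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits; terms with P(x) = 0 contribute zero."""
    p = np.asarray(p, dtype=float).ravel()
    nonzero = p[p > 0]
    return float(np.sum(nonzero * np.log2(1.0 / nonzero)))

def mutual_information(joint):
    """I(Y, X) = H(X) + H(Y) - H(X, Y) for a joint probability table."""
    joint = np.asarray(joint, dtype=float)
    p_x = joint.sum(axis=1)   # marginal of the row variable
    p_y = joint.sum(axis=0)   # marginal of the column variable
    return entropy(p_x) + entropy(p_y) - entropy(joint)

# A dependent pair of binary variables and an independent (factorized) pair.
dependent = np.array([[0.4, 0.1],
                      [0.1, 0.4]])
independent = np.outer([0.5, 0.5], [0.3, 0.7])

mi_dependent = mutual_information(dependent)
mi_independent = mutual_information(independent)

# Conditional form: H(X | Y) = H(X, Y) - H(Y), so I = H(X) - H(X | Y).
h_x = entropy(dependent.sum(axis=1))
h_x_given_y = entropy(dependent) - entropy(dependent.sum(axis=0))
print(mi_dependent, h_x - h_x_given_y, mi_independent)
```

The first two printed values coincide, and the third is zero up to rounding, which is the discrete counterpart of the independence condition on probabilities.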

The basic problem of independent component analysis is how to maximize the mutual information that the output $Y$ of a neural network processor contains about its input $X$. To solve the problem, we consider here only the gradient of information-theoretic quantities with respect to some parameter, $w$, in our network, and assume that $Y$ is a function of $w$:

$$\frac{\partial}{\partial w} I(Y, X) = \frac{\partial}{\partial w} H(Y) \tag{2.5}$$

because $H(Y \mid X)$ does not depend on $w$. This can be seen by considering a system which avoids infinities:

$$Y = G(X) + N, \tag{2.6}$$

where $G$ is a nonlinear squashing function and $N$ is additive noise on the outputs. In this case $H(Y \mid X) = H(N)$ [34]. Whatever the level of this additive noise, maximization of the mutual information, $I(Y, X)$, is equivalent to maximization of the output entropy, $H(Y)$, because $\partial H(N) / \partial w = 0$. Bell and Sejnowski's independent component analysis algorithm [19] is an unsupervised learning rule that was derived from the principle of optimal information transfer through sigmoidal neurons, based on Eq. (2.5) and Eq. (2.6). The setting is shown in Figure 2.3. Independent sources $s$ are mixed by $A$, and the observed signals are $x$. The goal is to learn $W$ that inverts the mixing $A$; $u$ are the estimates of the recovered sources. The infomax approach is one way to find the unmixing system $W$. It requires a nonlinear transfer function $g(u)$.

Consider the case of a single input, $x$, and output, $y$, passed through a nonlinear squashing function, $g$, with weight $w$ and bias $w_0$:

$$u = w x + w_0 \tag{2.7}$$

$$y = g(u) = \frac{1}{1 + e^{-u}} \tag{2.8}$$

Figure 2.3 The instantaneous mixing and unmixing model. Independent sources $s$ are mixed by $A$. The observed signals are $x$. The goal is to learn $W$ that inverts the mixing $A$; $u$ are the estimates of the recovered sources. The infomax approach is one way to find the unmixing system $W$. It requires a nonlinear transfer function $g(u)$.

As illustrated in Figure 2.4, the optimal weight $w$ on $x$ for maximizing information transfer is the one that best matches the probability density of $x$ to the slope of the nonlinearity. The optimal $w$ produces the flattest possible output density, which, in other words, maximizes the entropy of the output.

Figure 2.4 Optimal information flow in sigmoidal neurons. (a) Input $x$ having density function $f_x(x)$, in this case a Gaussian, is passed through a nonlinear function $g(x)$. The information in the resulting density, $f_y(y)$, depends on matching the mean and variance of $x$ to the threshold, $w_0$, and slope, $w$, of $g(x)$ (see Schraudolph et al. 1991). (b) $f_y(y)$ is plotted for different values of the weight $w$. The optimal weight, $w_{opt}$, transmits the most information [19].

The derivation of independent component analysis is as follows. When $g(x)$ is monotonically increasing or decreasing (i.e., has a unique inverse), the probability density function of the output, $f_y(y)$, can be written as a function of the probability density function of the input, $f_x(x)$ [50]:

$$f_y(y) = \frac{f_x(x)}{\left| \partial y / \partial x \right|} \tag{2.9}$$

$$H(y) = -E[\ln f_y(y)] = -\int_{-\infty}^{\infty} f_y(y) \ln f_y(y) \, dy \tag{2.10}$$

Substituting Eq. (2.9) into Eq. (2.10) gives Eq. (2.11).

$$H(y) = E\left[ \ln \left| \frac{\partial y}{\partial x} \right| \right] - E[\ln f_x(x)] \tag{2.11}$$

The second term on the right (the entropy of $x$) may be considered to be unaffected by alterations in a parameter $w$ determining $g(x)$. Therefore, in order to maximize the entropy of $y$ by changing $w$, we need only concentrate on maximizing the first term, which is the average log of how the input affects the output, as shown in Eq. (2.12):

$$\Delta w \propto \frac{\partial H}{\partial w} = \frac{\partial}{\partial w} \left( \ln \left| \frac{\partial y}{\partial x} \right| \right) = \left( \frac{\partial y}{\partial x} \right)^{-1} \frac{\partial}{\partial w} \left( \frac{\partial y}{\partial x} \right) \tag{2.12}$$

In the case of the logistic transfer function of Eq. (2.7) and (2.8), the terms in Eq. (2.12) are given by Eq. (2.13) and (2.14).

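$$\frac{\partial y}{\partial x} = w \, y (1 - y) \tag{2.13}$$

$$\frac{\partial}{\partial w} \left( \frac{\partial y}{\partial x} \right) = y (1 - y) \left( 1 + w x (1 - 2y) \right) \tag{2.14}$$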
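The step from Eq. (2.13) to Eq. (2.14) is an application of the product rule to the logistic unit. A short worked sketch follows; it writes $w$ and $w_0$ for the weight and bias of the unit $u = w x + w_0$, $y = 1/(1 + e^{-u})$, which is a notational assumption made here to match the cited source [19].

```latex
\begin{align*}
\frac{\partial y}{\partial u} &= y(1-y), \qquad
\frac{\partial y}{\partial x} = w\,y(1-y), \qquad
\frac{\partial y}{\partial w} = x\,y(1-y), \\[4pt]
\frac{\partial}{\partial w}\!\left(\frac{\partial y}{\partial x}\right)
  &= \frac{\partial}{\partial w}\bigl[\,w\,y(1-y)\,\bigr]
   = y(1-y) + w\,(1-2y)\,\frac{\partial y}{\partial w} \\
  &= y(1-y) + w\,x\,(1-2y)\,y(1-y)
   = y(1-y)\bigl(1 + w\,x\,(1-2y)\bigr).
\end{align*}
```

The factor $(1 - 2y)$ arises from differentiating $y(1-y)$ with respect to $y$.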

Dividing Eq. (2.14) by Eq. (2.13) gives the learning rule for the logistic function, as calculated from the general rule of Eq. (2.12):

$$\Delta w \propto \frac{1}{w} + x (1 - 2y) \tag{2.15}$$

$$\Delta w_0 \propto 1 - 2y \tag{2.16}$$

The optimal weight is found by gradient ascent on the entropy of the output, $y$, with respect to $w$.
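The single-unit learning rule can be tried on synthetic data. The following Python sketch assumes NumPy and makes several choices that are not in the thesis: a Gaussian input with mean 1 and unit variance, batch averaging of the updates, a fixed learning rate `lr`, and the names `w` and `w0` for the weight and bias. It applies the updates of Eq. (2.15) and Eq. (2.16) and reports where the steepest point of the learned sigmoid lies relative to the mean of the input.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=1.0, size=10_000)   # illustrative Gaussian input

w, w0 = 1.0, 0.0        # initial weight and bias (arbitrary starting point)
lr = 0.1                # learning rate, an assumption of this sketch

for _ in range(3000):
    u = w * x + w0
    y = 1.0 / (1.0 + np.exp(-u))                  # logistic function
    # Batch averages of the rules: dw ~ 1/w + x(1 - 2y), dw0 ~ 1 - 2y
    dw = 1.0 / w + np.mean(x * (1.0 - 2.0 * y))
    dw0 = np.mean(1.0 - 2.0 * y)
    w += lr * dw                                  # gradient ascent on H(y)
    w0 += lr * dw0

threshold = -w0 / w     # input value at which the sigmoid is steepest
print(f"w = {w:.3f}, w0 = {w0:.3f}, threshold = {threshold:.3f}, mean(x) = {x.mean():.3f}")
```

In this setting the learned threshold is expected to settle near the mean of the input, in line with the matching of the sigmoid to the input density described for Figure 2.4.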

When there are multiple inputs and outputs, maximizing the joint entropy of the output encourages the individual outputs to move towards statistical independence. The derivation is very similar to that for a single input and a single output. When the form of the nonlinear transfer function $g$ is the same as the cumulative density functions of the underlying independent components (up to scaling and translation), it can be shown that maximizing the mutual information between the input $X$ and the output $Y$ also minimizes the mutual information between the $u_i$ [18][34]. Many natural signals, such as sound sources, have been shown to have a super-Gaussian distribution, meaning that the kurtosis of the probability distribution exceeds that of a Gaussian [19]. For mixtures of super-Gaussian signals, the logistic transfer function has been found to be sufficient to separate the signals [19].

The update rule for the weight matrix, $W$, for multiple inputs and outputs is given by

$$\Delta W = \epsilon \, \frac{\partial H(Y)}{\partial W} W^T W = \epsilon \left( I + \hat{y} u^T \right) W \tag{2.17}$$

where $\hat{y}_i = \frac{\partial}{\partial y_i} \frac{\partial y_i}{\partial u_i} = \frac{\partial}{\partial u_i} \ln \frac{\partial y_i}{\partial u_i}$ and $\epsilon$ is a learning rate. The learning rate is decided empirically and is typically near 0.001. The $W^T W$ term in Eq. (2.17), first proposed by Amari et al. [20], avoids matrix inversions and speeds convergence. During training, the learning rate is reduced gradually until the weight matrix $W$ stops changing appreciably (e.g. root mean square change for all elements $< 10^{-6}$). We employed the logistic transfer function, Eq. (2.8), giving $\hat{y}_i = 1 - 2 y_i$. The algorithm includes a "sphering" step prior to learning [18]. The row means are subtracted from the dataset, $X$, and then $X$ is passed through the zero-phase whitening filter, $W_z$, which is twice the inverse square root of the covariance matrix:

$$W_z = 2 \langle X X^T \rangle^{-1/2} \tag{2.18}$$

This removes both the first- and the second-order statistics of the data: the mean and the covariances are set to zero and the variances are equalized. The full transform from the zero-mean input was calculated as the product of the sphering matrix and the learned matrix, $W_I = W W_z$. The pre-whitening filter in the independent component analysis algorithm has the Mexican-hat shape of retinal ganglion cell receptive fields, which remove much of the variability due to lighting [18].
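The sphering step and the multi-unit update can be combined in a compact numerical sketch. The Python code below assumes NumPy and is not the implementation used in the thesis: the two Laplacian sources, the mixing matrix `A`, the fixed learning rate `lr`, the batch averaging of the outer product and the iteration count are all illustrative assumptions, and the update is written as gradient ascent with the logistic nonlinearity, for which the score term is $1 - 2y$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Two independent, super-Gaussian (Laplacian) sources; one source per row.
S = rng.laplace(size=(2, n))

# Illustrative mixing matrix: unequal scaling followed by a 30-degree rotation.
theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
A = R @ np.diag([1.0, 2.0])
X = A @ S

# Sphering: subtract row means, then apply W_z = 2 * <X X^T>^(-1/2).
X = X - X.mean(axis=1, keepdims=True)
cov = X @ X.T / n
evals, evecs = np.linalg.eigh(cov)
W_z = 2.0 * evecs @ np.diag(evals ** -0.5) @ evecs.T
Xs = W_z @ X
cov_sphered = Xs @ Xs.T / n          # expected to be 4 * identity

# Infomax learning with the logistic nonlinearity, y_hat = 1 - 2y.
W = np.eye(2)
lr = 0.05                            # fixed rate for brevity; the text reduces it gradually
for _ in range(2000):
    U = W @ Xs
    Y = 1.0 / (1.0 + np.exp(-U))
    W = W + lr * (np.eye(2) + (1.0 - 2.0 * Y) @ U.T / n) @ W

W_I = W @ W_z                        # full transform from the zero-mean input
U = W_I @ X
# Absolute correlation between each recovered component and each true source.
corr = np.abs(np.corrcoef(np.vstack([U, S]))[:2, 2:])
print(np.round(cov_sphered, 3))
print(np.round(corr, 3))
```

If the separation succeeds, each row of `corr` has one entry close to 1, reflecting recovery of the sources up to permutation and scale.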

The difference between independent component analysis and principal component analysis is illustrated as follows. Consider a set of data points derived from two underlying distributions as shown in Figure 2.1. Principal component analysis encodes second-order dependencies in the data by rotating the axes to correspond to directions of maximum covariance. Principal component analysis models the data as a multivariate Gaussian and would place an orthogonal set of axes such that the projections of the two distributions would be completely overlapping. Independent component analysis does not constrain the axes to be orthogonal, and attempts to place them in the directions of statistical dependencies in the input, so that the projections of the distributions onto the independent component analysis axes would have less overlap, and the output distributions of the two weight vectors would be kurtotic [29].

2.3 Overcomplete Representations of Independent Component Analysis

The goal of independent component analysis is to find a linear transformation $W$ of the dependent sensor signals $X$ that makes the outputs as independent as possible:

$$U(t) = W X(t) = W A S(t) \tag{2.19}$$

where $U$ is an estimate of the sources. The sources are perfectly recovered when $W$ is the inverse of $A$ up to permutation and scale change:

$$P = R S = W A \tag{2.20}$$

where $R$ is a permutation matrix and $S$ is the scaling matrix. The two matrices define the performance matrix $P$, so that if $P$ is normalized and reordered, a perfect separation leads to the identity matrix. For the linear mixing and unmixing model, the following assumptions are made [27][35][36]:

1. The number of sensors $N$ is greater than or equal to the number of sources $M$: $N \geq M$.

2. The sources $S(t)$ are at each time instant mutually independent.

3. At most one source is normally distributed.

4. No sensor noise or only low additive noise signals are permitted.

Assumption 1 is needed to make $A$ a full rank matrix. Assumption 2 is the basis of independent component analysis and can be expressed as follows:

$$p(S(t)) = \prod_{i=1}^{M} p(S_i(t)) \tag{2.21}$$

Assumptions 3 and 4 are necessary to recover the sources using the infomax condition, in which the mutual information between outputs is only minimized for the low noise case.

Assumption 1 means that the standard formulation of independent component analysis requires at least as many sensors as sources. This is one of the major drawbacks of independent component analysis, but also one of its distinct features. Lewicki and Sejnowski have proposed a generalized independent component analysis method for learning overcomplete representations of the data, which allows for more basis vectors than dimensions in the input [28]. This technique overcomes assumptions 1 and 4 by assuming a linear mixing model with additive noise. This approach provides a natural solution to decomposition by finding the maximum a posteriori representation of the data. The prior distribution on the basis function coefficients removes the redundancy in the representation and leads to representations that are sparse and are a nonlinear function of the data.

The goal of these overcomplete representations is illustrated in Figure 2.5. In a two-dimensional data space, the observations $X$ in Figure 2.5(a, b) were generated by a linear mixture of 2 independent random sparse sources. In this space, Figure 2.5(a) shows orthogonal basis vectors (principal component analysis) and Figure 2.5(b) shows independent basis vectors.

If the 2-dimensional observed data are generated by 3 sparse sources, as shown in Figure 2.5(c, d), the complete independent component analysis representation of Figure 2.5(c) cannot model the data adequately, but the overcomplete independent component analysis representation of Figure 2.5(d) finds 3 basis vectors that fit the underlying distribution of the data.

The observed $M$-dimensional data $x = [x_1, \dots, x_M]^T$ may be modeled with a linear overcomplete mixing matrix, $A$ (of size $M \times N$), and additive noise:

$$x = A s + n \tag{2.22}$$

where $s = [s_1, \dots, s_N]^T$ are the sources and $n$ is assumed to be white Gaussian noise with variance $\sigma^2$, so that

$$\log P(x \mid A, s) \propto -\frac{1}{2\sigma^2} (x - A s)^2 \tag{2.23}$$

Figure 2.5 Illustration of basis vectors in a two-dimensional data space with two sparse sources (top) or three sparse sources (bottom). (a) Principal component analysis finds orthogonal basis vectors and (b) the independent component analysis representation finds independent basis vectors. (c) Independent component analysis cannot model the data distribution adequately with three sources, but (d) the overcomplete independent component analysis representation finds 3 basis vectors that match the underlying data distribution [28].


It is also assumed that the sources $s_i$ are mutually independent, so that the joint probability distribution has the form of Eq. (2.21), and that each source $s_i$ has a sparse distribution, such as the Laplacian density:

(2.24)  $P(s_i) \propto \exp(-\theta |s_i|)$

Given the above model and assumptions, the goal is to infer both the basis vectors $A$ and the sources $s$ given the mixtures $x$.

Due to the additive noise and the rectangular mixing matrix $A$, the solution for $s$ cannot be found by the pseudoinverse $s = A^{+}x$. A probabilistic approach to estimating the sources is based on finding the maximum a posteriori value of $s$:

(2.25)  $\hat{s} = \arg\max_{s} P(s \mid x, A) = \arg\max_{s} P(x \mid A, s)\,P(s)$

Given the basis vectors $A$ and an observation $x$, Eq. (2.25) can be optimized by gradient ascent on the log posterior distribution.

In the case of zero noise and Gaussian $P(s)$, maximizing Eq. (2.25) is equivalent to $\min_s \|s\|_2$ subject to $x = As$. The solution can be obtained with the pseudoinverse, $s = A^{+}x$, and is a linear function of the data $x$. In the case of zero noise and Laplacian $P(s)$ as in Eq. (2.24), maximizing Eq. (2.25) is equivalent to $\min_s \|s\|_1$ subject to $x = As$. Unlike the Gaussian prior, this solution cannot be obtained by a simple linear operation.
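The difference between the two priors can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the 2×3 basis `A` and the sparse source vector `s_true` are hypothetical values chosen only for illustration. It shows that the pseudoinverse solution reproduces the data and has the smaller L2 norm, while the sparse solution has the smaller L1 norm.

```python
import numpy as np

# Hypothetical overcomplete basis: 2-dimensional data, 3 basis vectors.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
s_true = np.array([0.0, 0.0, 2.0])   # a sparse source vector (assumed)
x = A @ s_true                        # noiseless observation x = A s

# Minimum L2-norm solution of x = A s (Gaussian prior, zero noise).
s_l2 = np.linalg.pinv(A) @ x

def l1(v):
    """L1 norm of a vector."""
    return float(np.sum(np.abs(v)))

print("pseudoinverse solution:", s_l2)
print("L2 norms (pinv, sparse):", np.linalg.norm(s_l2), np.linalg.norm(s_true))
print("L1 norms (pinv, sparse):", l1(s_l2), l1(s_true))
```

Both vectors satisfy $x = As$; the pseudoinverse spreads the energy over all three coefficients, which is why a sparse prior needs a nonlinear inference step.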

The objective for learning the basis vectors $A$ is to maximize the probability of the data, which requires marginalizing over all possible sources:

(2.26)  $P(x \mid A) = \int P(x \mid A, s)\,P(s)\,ds$

For general overcomplete bases, this integral is intractable. For the special case of zero noise and $A$ invertible (a complete basis), the integral in Eq. (2.26) is solvable and leads to the standard independent component analysis learning algorithm [19][26][37]. Lewicki and Sejnowski approximated Eq. (2.26) by fitting a multivariate Gaussian around $\hat{s}$. The basis vectors were learned by performing gradient ascent on the approximation to $\log P(x \mid A)$. A brief derivation of the learning rule for the basis vectors follows [28]. For a set of independent data vectors $x_{1:K} = x_1, \ldots, x_K$, expand the posterior probability density by a saddle point approximation:

(2.27)  $\log P(x_{1:K} \mid A) = \log \prod_{k} P(x_k \mid A) \approx \dfrac{KL}{2}\log\dfrac{\lambda}{2\pi} + \dfrac{KM}{2}\log(2\pi) + \sum_{k=1}^{K}\left[\log P(\hat{s}_k) - \dfrac{\lambda}{2}(x_k - A\hat{s}_k)^2 - \dfrac{1}{2}\log\det H(\hat{s}_k)\right]$

where $\lambda = 1/\sigma^2$ is the noise precision and $H(\hat{s}_k)$ is the Hessian of the negative log posterior evaluated at $\hat{s}_k$. Following the notation of [28], $L$ in this expression is the dimension of the data and $M$ is the number of sources.


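The derivative of Eq. (2.27) with respect to $A$ is, for each data vector, given by Eq. (2.28).

(2.28)  $\dfrac{\partial}{\partial A}\log P(x \mid A) = \dfrac{\partial}{\partial A}\log P(\hat{s}) - \dfrac{\lambda}{2}\dfrac{\partial}{\partial A}(x - A\hat{s})^2 - \dfrac{1}{2}\dfrac{\partial}{\partial A}\log\det H$

The first and third terms in Eq. (2.28) are given by Eq. (2.29) and Eq. (2.30). The second term in Eq. (2.28) has no gradient component, so it vanishes.

(2.29)  $\dfrac{\partial \log P(\hat{s})}{\partial A} = -W^T z\,\hat{s}^T$

(2.30)  $\dfrac{\partial \log\det H}{\partial A} = 2\lambda A H^{-1} - 2 W^T y\,\hat{s}^T$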

Gathering the above equations leads to

(2.31)  $\Delta A = A A^T \dfrac{\partial}{\partial A}\log P(x \mid A) \approx -A\left(\varphi(\hat{s})\,\hat{s}^T + I\right)$

where $\varphi(s_i) = \partial \log P(s_i)/\partial s_i$ is called the score function and $I$ is the identity matrix. The prefactor $A A^T$ produces the natural gradient extension [20], which speeds convergence. Note that $A$ in Eq. (2.31) is not restricted to be a square matrix.
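The update in Eq. (2.31) can be sketched in a few lines. This is only an illustration under stated assumptions, not the implementation used in this thesis: NumPy is assumed, the function names `score_laplacian` and `basis_update`, the learning rate `lr` and the parameters `theta` and `beta` are hypothetical, the score is taken to be the smoothed Laplacian score of Eq. (2.33), and the MAP estimate `s_hat` is assumed to have been computed beforehand.

```python
import numpy as np

def score_laplacian(s, theta=1.0, beta=10.0):
    """Smoothed Laplacian score, phi(s) = -theta * tanh(beta * s) (Eq. 2.33)."""
    return -theta * np.tanh(beta * s)

def basis_update(A, s_hat, lr=0.01, theta=1.0, beta=10.0):
    """One step A <- A + lr * dA with dA = -A (phi(s_hat) s_hat^T + I), Eq. (2.31).

    A is M x N (N >= M allowed); s_hat is the length-N MAP estimate for one sample.
    """
    n_sources = A.shape[1]
    phi = score_laplacian(s_hat, theta, beta)
    dA = -A @ (np.outer(phi, s_hat) + np.eye(n_sources))
    return A + lr * dA

# Hypothetical 2 x 3 overcomplete basis and a MAP estimate for one data vector.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
A_new = basis_update(A, s_hat=np.array([0.0, 0.0, 2.0]))
print(A_new.shape)
```

Because the outer product is N×N, the update keeps the M×N shape of `A`, which is why the rule applies to rectangular (overcomplete) bases.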


In the case of a Laplacian prior on $s_m$,

(2.32)  $\log P(s) \equiv \sum_{m} \log P(s_m) = M \log\theta - \theta \sum_{m=1}^{M} |s_m|$

Because $\log P(s)$ is piecewise linear in $s$, the curvature is zero, and there is a discontinuity in the derivative at zero. To approximate the volume contribution from $P(s)$, we use

(2.33)  $\dfrac{\partial}{\partial s_m}\log P(s) \approx -\theta \tanh(\beta s_m)$

which is equivalent to using the approximation $P(s_m) \propto \cosh^{-\theta/\beta}(\beta s_m)$. For large $\beta$ this approximates the true Laplacian prior while staying smooth around zero. This leads to the following diagonal expression for the second-derivative matrix:

(2.34)  $\dfrac{\partial^2}{\partial s_m^2}\log P(s_m) = -\theta\beta\,\mathrm{sech}^2(\beta s_m)$
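The smoothed score and its derivative can be checked with a few lines of code. This is a small sketch using only the Python standard library; the values of `theta`, `beta` and the evaluation points are arbitrary choices for illustration.

```python
import math

def score(s, theta=1.0, beta=50.0):
    """Smoothed score of Eq. (2.33): -theta * tanh(beta * s)."""
    return -theta * math.tanh(beta * s)

def curvature(s, theta=1.0, beta=50.0):
    """Second derivative of Eq. (2.34): -theta * beta * sech^2(beta * s)."""
    return -theta * beta / math.cosh(beta * s) ** 2

# Away from zero the smoothed score matches the Laplacian score -theta * sign(s).
print(score(0.5), score(-0.5))

# A central finite difference of the score reproduces Eq. (2.34).
h = 1e-6
s0 = 0.01
fd = (score(s0 + h) - score(s0 - h)) / (2 * h)
print(fd, curvature(s0))
```

The curvature is concentrated in a narrow band around zero of width on the order of 1/β, which is what makes the Hessian in the approximation well defined.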

2.4 Medical Image Segmentation Techniques

Since the advent of computed tomography, methods for automatically detecting anatomical objects in CT and MR images have been active areas of research in the medical imaging community. However, the anatomical objects in medical images are complex, and in some cases the anatomical scale of the objects is small compared with the field of view of the images. As a result, the spatial contrast of the features of interest is very poor and the features become very noisy and ambiguous.

This section reviews existing image segmentation methods and provides the motivation for medical image segmentation using independent component analysis.

If an image has been preprocessed appropriately to remove noise and artifacts, segmentation is often the key step in interpreting the image. Image segmentation is a process in which regions or features sharing similar characteristics are identified and grouped together. Image segmentation may use statistical classification, thresholding, edge detection, region detection, or any combination of these techniques. The output of the segmentation step is usually a set of classified elements, such as tissue regions or tissue boundaries.

Most segmentation techniques are either region-based or edge-based. Region-based techniques rely on common patterns in intensity values within a cluster of neighboring pixels. The cluster is referred to as the region, and the goal of the segmentation algorithm is to group regions according to their anatomical or functional roles. Edge-based techniques rely on discontinuities in image values between distinct regions, and the goal of the segmentation algorithm is to accurately demarcate the boundary separating these regions.

Thresholding is one of the most important and simplest approaches to image segmentation. It is used extensively in many image processing applications [39]-[43] and is a region-based segmentation technique.

Thresholding is based on the notion that regions corresponding to different tissue types can be classified by using a range function applied to the intensity values of image pixels. The assumption is that different tissue types have distinct frequency distributions and can be discriminated on the basis of the mean and standard deviation of each distribution, as shown in Figure 2.6. For example, given a two-dimensional image $I(x, y)$, we can define a simple threshold rule to classify bone or a compound threshold rule to classify soft tissue:

Simple: (if $I(x, y) > I_0$ ⇒ Bone)

Compound: (if $I_0 < I(x, y) < I_1$ ⇒ Soft tissue)
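The two rules can be written down directly. The sketch below uses plain Python lists; the function name `threshold_labels`, the threshold values and the tiny test image are hypothetical, and separate thresholds are used for the bone rule and the soft-tissue rule so that both can be applied to the same image.

```python
def threshold_labels(image, bone_min, soft_min, soft_max):
    """Label each pixel of a 2-D list of intensities.

    Simple rule:   value > bone_min             -> "bone"
    Compound rule: soft_min < value < soft_max  -> "soft tissue"
    Anything else is labelled "other".
    """
    labels = []
    for row in image:
        out = []
        for value in row:
            if value > bone_min:
                out.append("bone")
            elif soft_min < value < soft_max:
                out.append("soft tissue")
            else:
                out.append("other")
        labels.append(out)
    return labels

# Hypothetical 2 x 3 image with 8-bit gray values.
image = [[10, 120, 250],
         [90, 200, 40]]
labels = threshold_labels(image, bone_min=180, soft_min=60, soft_max=180)
for row in labels:
    print(row)
```

Every pixel is classified independently of its neighbors, so the result depends entirely on how well the intensity distributions of the tissues are separated.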

In practice, the type of thresholding just described can be expected to be successful in highly controlled environments. One of the areas in which this is often possible is industrial inspection, where illumination control is usually feasible. Thresholding has been used by Chow and Kaneko to segment ventricles from cineangiograms of the human heart [44]. They chose this technique because of the strong bimodal distribution of image values corresponding to regions that are interior and exterior to the ventricles.


Figure 2.6 A hypothetical frequency distribution $f(I)$ of intensity values $I(x, y)$ for fat, muscle and bone in a CT image. Low intensity values correspond to fat tissue, whereas high intensity values correspond to bone. Intermediate intensity values correspond to muscle tissue. F+ and F- refer to the false positives and false negatives; T+ and T- refer to the true positives and true negatives.


The major drawback of threshold-based approaches is that they often lack the sensitivity and specificity needed for accurate segmentation. Sensitivity is defined as the true positive rate of a function or test that must detect the presence or absence of some intrinsic property [45][46][47]. The purpose of the test is to determine, as accurately as possible, the presence or absence of this intrinsic property. Formally, the sensitivity of a segmentation test is defined as follows:

(2.35)  $\mathrm{SENSITIVITY} = \dfrac{\mathrm{TRUE}^{+}}{\mathrm{INTRINSIC}^{+}}$

where $\mathrm{TRUE}^{+}$ is the number of samples that have the intrinsic property and were categorized by the test as positive, and $\mathrm{INTRINSIC}^{+}$ is the total number of elements that have the intrinsic property (regardless of the outcome of the test).

In Figure 2.6, the area under the Bone curve and to the right of the Bone Threshold line is classified as T+ because this test will correctly categorize such elements as being bone. The area under the Bone curve and to the left of the Bone Threshold line is classified as F- because this test will incorrectly classify such elements as not being bone. In this example, the sensitivity is a measure of how well this test can categorize as bone those tissues that are truly bone.

On the other hand, specificity is defined as the complement of the false positive rate:

(2.36)  $\mathrm{SPECIFICITY} = 1 - \dfrac{\mathrm{FALSE}^{-}}{\mathrm{INTRINSIC}^{-}}$

where $\mathrm{FALSE}^{-}$ is the total number of samples that do not have the intrinsic property but were incorrectly categorized as having it, and $\mathrm{INTRINSIC}^{-}$ is the total number of elements that do not have the intrinsic property [45][46].

In Figure 2.6, the area under the Muscle curve that lies to the right of the Bone Threshold line is classified as F+ because this test will incorrectly categorize such elements as being bone. Conversely, the area under the Muscle curve that lies to the left of the Bone Threshold line is classified as T- because this test will correctly categorize such elements as not being bone. The specificity of this test is a measure of how well it rejects from the bone category those elements that truly are not bone. Because their denominators are different, the sensitivity and specificity measures are not complements of one another. In fact, they measure two distinct aspects of any segmentation test: the ability to correctly reject false properties and the ability to correctly accept true properties.
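Both measures are easy to compute from a pixel-wise comparison with a known ground truth. The following sketch uses only the Python standard library; the function name and the ten-pixel example are hypothetical, with `True` meaning that a pixel has the intrinsic property (for example, being bone).

```python
def sensitivity_specificity(predicted, truth):
    """Return (sensitivity, specificity) for flat sequences of booleans."""
    true_pos = sum(p and t for p, t in zip(predicted, truth))
    false_pos = sum(p and not t for p, t in zip(predicted, truth))
    intrinsic_pos = sum(truth)                    # elements that have the property
    intrinsic_neg = len(truth) - intrinsic_pos    # elements that do not
    sensitivity = true_pos / intrinsic_pos        # Eq. (2.35)
    specificity = 1 - false_pos / intrinsic_neg   # Eq. (2.36)
    return sensitivity, specificity

truth     = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]
se, sp = sensitivity_specificity(predicted, truth)
print(se, sp)
```

In this example one bone pixel is missed and one non-bone pixel is accepted, and the two measures differ because they are normalized by different counts.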


CHAPTER 3

MEDICAL IMAGE SEGMENTATION USING INDEPENDENT COMPONENT ANALYSIS

CT and MR are widely used medical imaging modalities because of their high spatial resolution and excellent discrimination of soft tissues, bones and other internal organs compared with other medical images. Medical images provide rich information about anatomical structure, enabling quantitative pathological or clinical studies, the derivation of computerized anatomical atlases, and pre- and intra-operative guidance for therapeutic intervention.

To understand the design specifications and the trade-offs made in developing medical image segmentation using independent component analysis, it is helpful first to review the techniques for acquiring medical images such as CT and MR.

3.1 Acquiring CT Image

The purpose of the CT scanner is to acquire a large number (100-1200) of CT projections around the patient [48]. Unlike film, which acquires a two-dimensional image, the detector or detector array on a CT scanner acquires data only along a thin line. So from one complete set of CT projections going all the way around the patient, only a single CT image can be computed. Before the acquisition of the next slice, the table that the patient is lying on is moved slightly in the cephalic or cranial direction, which positions a different axial slice of tissue in the path of the x-ray beam for the next series of acquired projections.

The basic principle of x-ray CT involves x-ray generation, detection, digitization, processing, and computer image reconstruction. X-rays passing through a body are attenuated at different rates by different tissues. The detected signals are converted into digital numbers or data by the analog-to-digital converters (ADC). The digital data are fed into a computing device for image reconstruction.

The photon density that emerges when a narrow beam of monoenergetic photons with energy $E$ and intensity $I_0$ passes through a homogeneous absorber of thickness $x$ can be expressed as:

(3.1)  $I = I_0 \exp[-\mu(\rho, Z, E)\,x]$

where $\mu$, $\rho$, and $Z$ are the linear attenuation coefficient, the density of the absorber, and the atomic number, respectively. In the energy region used by most commercial x-ray CT systems for medical tomography, two types of interactions are dominant, namely photoelectric absorption and Compton scattering [38].

In photoelectric interaction the x-ray photon is completely absorbed by transferring all of its energy to an electron. In Compton scattering, on the other hand, scattered x-rays undergo both a directional and an energy change. If the absorber is not homogeneous, $\mu(\rho, Z)$ is simply a space-variant function dependent on the distribution of the material. By directing a monochromatic x-ray beam in the $y$ direction, for example, the output x-ray intensity $I(x)$ can be written as Eq. (3.2).

(3.2)  $I(x) = I_0(x) \exp\left[-\int \mu(x, y)\,dy\right]$

where $I_0$ and $\mu(x, y)$ are the incident x-ray intensity and the x-ray attenuation coefficient, respectively. By taking the logarithm and rearranging Eq. (3.2), one can obtain the projection data $p(x)$ as in Eq. (3.3).

(3.3)  $p(x) = -\ln\left[\dfrac{I(x)}{I_0(x)}\right] = \int \mu(x, y)\,dy$

where $p(x)$ is equivalent to a simple integration or summation of the total attenuation coefficients along the x-ray path. Eq. (3.4) shows the digital form of Eq. (3.3).

(3.4)  $p(x) = \sum_{i=1}^{N} \mu_i(x, y)$

Eq. (3.4) represents the summation of the attenuation coefficients of $N$ pixels along a given x-ray path.
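The relation between the measured intensity and the projection value can be illustrated numerically. This sketch uses only the Python standard library; the attenuation coefficients in `mu_row`, the unit pixel size and the incident intensity are hypothetical values.

```python
import math

def transmitted_intensity(i0, mu_values, pixel_size=1.0):
    """Beer-Lambert attenuation through a row of pixels (discrete form of Eq. 3.2)."""
    return i0 * math.exp(-sum(mu_values) * pixel_size)

def projection(i0, i):
    """Projection value p = -ln(I / I0), Eq. (3.3)."""
    return -math.log(i / i0)

mu_row = [0.02, 0.05, 0.40, 0.05, 0.02]   # hypothetical coefficients along one ray
i0 = 1000.0
i = transmitted_intensity(i0, mu_row)
p = projection(i0, i)
print(i, p, sum(mu_row))
```

The projection value equals the sum of the attenuation coefficients along the ray, as in Eq. (3.4), so a single measurement cannot tell which pixel contributed the attenuation. This is why projections from many views are needed.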

In x-ray CT the contrast is associated with the different attenuation coefficients of the materials involved. Since each set of projection data represents the integral value of the attenuation coefficients along the path, the projection data taken at different views are the basis for tomographic image reconstruction, as illustrated in Figure 3.1.

Figure 3.1 (a) A simplified x-ray beam $I_0$ attenuated through a pixel, resulting in the attenuated beam $I$. (b) A two-dimensional matrix of linear attenuation coefficients of the image. Attenuated beam intensities for the corresponding rows (i.e., $I_1$, $I_2$, ...) are shown at the right.


3.2 Acquiring MR Image

Spin-echo sequences of various types are the most widely used imaging techniques, and their basic form consists of 90° and 180° rf pulses. For 3-D imaging, this pulse sequence is repeated in both the $z$ and $y$ directions, provided the $x$-gradient is the frequency-encoding or readout gradient. In conventional spin-echo imaging the repetition time is usually longer than the echo time to allow recovery of the longitudinal magnetization, or $T_1$ recovery. The first 90° rf pulse rotates or flips the magnetization of the spins into the transverse plane. Immediately after the 90° flip, spins in the transverse plane start the $T_1$ and $T_2$ relaxations. Although the $T_1$ relaxation process continues, the addition of a 180° pulse at time $T_E/2$ flips the spins to the opposite side and eventually rephases the spins that were dephased by the field inhomogeneity during the time from $t = 0$ to $t = T_E/2$. In addition to true $T_2$ decay, the transverse component of the magnetization also decays due to the magnetic field inhomogeneity. The equivalent decay time due to both the $T_2$ relaxation and the field inhomogeneity is known as $T_2^*$, which is given by Eq. (3.5).

(3.5)  $\dfrac{1}{T_2^*} = \dfrac{1}{T_2} + \dfrac{\gamma \Delta B_0}{2}$

where $\gamma$ is the gyromagnetic ratio and $\Delta B_0$ is the inhomogeneity of the magnetic field. It should be noted, however, that the decay of the signal caused by the field inhomogeneity can be recovered by the application of a 180° rf pulse, or rf spin echo, which rephases the spins that have been dephased during the period between the 90° and 180° rf pulses. Spin dephasing due to $T_2$ relaxation, however, cannot be recovered.

With the application of three orthogonal gradients, the acquired echo signal can be expressed as Eq. (3.6).

(3.6)  $s(t, g_y, g_z) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \rho(x, y, z)\, e^{\,i\gamma(G_x x t + g_y y T_y + g_z z T_z)}\,dx\,dy\,dz$

where $\rho(x, y, z)$ represents the spin density function including the $T_1$ and $T_2$ decays, $G_x$ is the readout gradient (constant during data acquisition), $g_y$ and $g_z$ are the phase-encoding gradients in the $y$ and $z$ directions, whose amplitudes vary in steps, and $T_y$ and $T_z$ are the durations of the $g_y$ and $g_z$ gradients, respectively. Eq. (3.6) shows that the echo signal is the 3-D Fourier transform of the spin density function. Hence the spin density function modulated by the $T_1$ and $T_2$ relaxations can be obtained by the 3-D Fourier transform of the spin-echo signal and is given by Eq. (3.7).

(3.7)  $\rho(x, y, z) = \rho_0(x, y, z)\left\{\exp\left[-\dfrac{T_E}{T_2(x, y, z)}\right]\right\}\left\{1 - \exp\left[-\dfrac{T_R}{T_1(x, y, z)}\right]\right\}$

where $\rho_0(x, y, z)$ denotes the initial value of the spin density function at location $(x, y, z)$ and the two terms in braces denote the $T_2$ and $T_1$ decays, respectively.

With a short echo time, the $T_2$-dependent term in Eq. (3.7) approaches unity and Eq. (3.7) becomes

(3.8)  $\rho(x, y) \approx \rho_0(x, y)\left\{1 - \exp\left[-\dfrac{T_R}{T_1(x, y)}\right]\right\}$

With a short $T_R$, Eq. (3.8) can be further approximated as

(3.9)  $\rho(x, y) \approx \rho_0(x, y)\left[\dfrac{T_R}{T_1(x, y)}\right]$

The short-$T_R$, short-$T_E$ sequence can therefore be considered an imaging mode in which the image intensity is inversely proportional to the longitudinal relaxation time $T_1$. This mode is often called $T_1$-weighted imaging.

With a long repetition time ($T_R \gg T_1$), the $T_R$-dependent exponential in Eq. (3.7) becomes zero, and Eq. (3.7) approaches Eq. (3.10).

(3.10)  $\rho(x, y) \approx \rho_0(x, y)\left\{\exp\left[-\dfrac{T_E}{T_2(x, y)}\right]\right\}$

In this mode the available signal is usually large because of the long $T_R$. On the other hand, due to the long echo time, the image is heavily weighted by $T_2$ and the signal is reduced significantly. This mode is called $T_2$-weighted imaging, and the image is somewhat noisy.
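The two weighting modes follow directly from Eq. (3.7), as a short numerical sketch shows. Only the Python standard library is used; the tissue parameters and the sequence timings (all in milliseconds) are hypothetical values chosen for illustration, not measurements.

```python
import math

def spin_echo_signal(rho0, t1, t2, tr, te):
    """Eq. (3.7): rho0 * exp(-TE/T2) * (1 - exp(-TR/T1)), all times in ms."""
    return rho0 * math.exp(-te / t2) * (1.0 - math.exp(-tr / t1))

# Hypothetical tissues with equal spin density but different relaxation times.
tissue_a = dict(rho0=1.0, t1=500.0, t2=50.0)
tissue_b = dict(rho0=1.0, t1=1000.0, t2=100.0)

# Short TR and short TE: contrast dominated by T1 (Eqs. 3.8 and 3.9).
t1w_a = spin_echo_signal(tr=100.0, te=1.0, **tissue_a)
t1w_b = spin_echo_signal(tr=100.0, te=1.0, **tissue_b)

# Long TR and long TE: contrast dominated by T2 (Eq. 3.10).
t2w_a = spin_echo_signal(tr=10000.0, te=80.0, **tissue_a)
t2w_b = spin_echo_signal(tr=10000.0, te=80.0, **tissue_b)

print(round(t1w_a, 3), round(t1w_b, 3))
print(round(t2w_a, 3), round(t2w_b, 3))
```

The tissue with the shorter $T_1$ is brighter in the first mode, and the tissue with the longer $T_2$ is brighter in the second, which is the sense in which the images are $T_1$- or $T_2$-weighted.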

3.3 Methodology

In this section I describe the method of automatic medical image segmentation using the independent component analysis algorithm. To verify the performance of the method, computer simulations were carried out with test data and a medical image data set. The data set in this experiment consists of 27 axial CT images of a patient's head, starting from images below the chin and ending at images in the upper portion of the nose. The original image data were obtained using a General Electric High-speed Advantage Computerized Tomography system at 120 kVp and 200 mA. No special post-processing was performed on the image data, other than reducing the bit resolution to 8 bits/pixel for efficient memory usage. Figure 3.2 shows some selected original images from the data set. The data set can be assumed to consist of three parts, bone, soft tissue and background, and the final goal of my experiment is to extract bone from the data set. Figure 3.3 shows the flowchart of the bone extraction process.


Figure 3.2 Selected image data used in this experiment.

As described in Sections 3.1 and 3.2, it can be assumed that different parts of medical images correspond to independent components. In CT images, bone and soft tissue have different attenuation coefficients, which results in different CT numbers or gray values. In MR images, different relaxation times result in differently weighted images. It is therefore assumed that the internal parts of a medical image are statistically independent, and this is the starting point of my experiment.

Figure 3.3 Flowchart of medical image segmentation using independent component analysis


To extract the bone regions from each of the 27 axial image slices, the prior of the data must be decided, and it can be decided from the probability density function of the data. The prior can be chosen in terms of the kurtosis of the distribution, where the kurtosis is defined from the fourth moment according to Eq. (3.11).

(3.11)  $\mathrm{KURTOSIS} = \dfrac{\sum_i (b_i - \bar{b})^4}{\left(\sum_i (b_i - \bar{b})^2\right)^2} - 3$

where $\bar{b}$ is the mean value.

If the kurtosis of the data is zero (Gaussian) or smaller than zero (sub-Gaussian or platykurtic), the Gaussian prior can be chosen, and if the kurtosis of the data is larger than zero (super-Gaussian or leptokurtic), the Laplacian prior as in Eq. (3.12) should be chosen.

(3.12)  $P(s_m) \propto \exp(-\theta |s_m|)$

The super-Gaussian distribution has longer tails and a sharper peak than a Gaussian distribution, as shown in Figure 3.4. Compared to a Gaussian, the Laplacian distribution puts greater weight on values close to zero, and as a result the representations are more sparse [29].
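The prior selection rule can be sketched as follows, using only the Python standard library. One assumption is made explicit: the sums in Eq. (3.11) are read as sample averages (with plain sums a factor equal to the number of samples would be needed), so that Gaussian data give a value near zero. The function names and the synthetic test data are hypothetical.

```python
import random

def kurtosis(values):
    """Excess kurtosis in the spirit of Eq. (3.11), with sums read as sample means."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0

def choose_prior(values):
    """Laplacian prior for super-Gaussian data, Gaussian prior otherwise."""
    return "laplacian" if kurtosis(values) > 0 else "gaussian"

rng = random.Random(0)
# Uniform data are sub-Gaussian; two-sided exponential (Laplacian) data are super-Gaussian.
uniform = [rng.uniform(-1.0, 1.0) for _ in range(20000)]
laplace = [rng.expovariate(1.0) * rng.choice((-1, 1)) for _ in range(20000)]

print(choose_prior(uniform), choose_prior(laplace))
```

In practice the values would be the gray levels, or the filter outputs, of the image data rather than synthetic samples.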


Figure 3.4 Examples of three levels of kurtosis. Each of the distributions has the same variance. A Gaussian distribution has minimal redundancy (highest entropy) for a fixed variance. The higher the kurtosis, the higher the redundancy. With high kurtosis, there is a higher probability of a low response or a high response, with a reduced probability of a mid-level response.

After choosing the prior, the independent component analysis algorithm is iterated over the data set. In this thesis, the algorithm described in Section 2.3 is chosen because the standard independent component analysis algorithm described in Section 2.2 requires the same number of inputs and outputs, so that the linear transform matrix $W$ is a square matrix. The goal of this dissertation, however, is to segment bones and other parts of medical images from one slice, which requires an overcomplete matrix. Before the algorithm is applied, the data set should be sphered using Eq. (2.10). This removes both the first- and second-order statistics of the data: the mean and the covariances are set to zero and the variances are equalized. To ensure that the input ensemble was stationary in sequence, the sequence index of the signals was permuted. This means that at each iteration of the training, the independent component analysis training system would receive input from a random sequence index point.
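The preprocessing step can be sketched as below. This is an illustration under assumptions rather than the exact procedure of the thesis: NumPy is assumed, Eq. (2.10) is not reproduced in this section, so the common symmetric sphering matrix $C^{-1/2}$ (the inverse square root of the covariance) is used as a stand-in, and the function name `sphere` and the synthetic data are hypothetical.

```python
import numpy as np

def sphere(X):
    """Return a zero-mean, identity-covariance version of X and the sphering matrix.

    X has one row per channel and one column per sample.
    """
    Xc = X - X.mean(axis=1, keepdims=True)                 # remove first-order statistics
    cov = Xc @ Xc.T / Xc.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    V = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T     # symmetric C^(-1/2), assumed form
    return V @ Xc, V

rng = np.random.default_rng(0)
# Hypothetical two-channel data: a random mixture of sparse sources plus an offset.
X = rng.uniform(0.0, 1.0, size=(2, 2)) @ rng.laplace(size=(2, 5000)) + 3.0
Z, V = sphere(X)

# Permute the sample index so that training sees the samples in random order.
Z_shuffled = Z[:, rng.permutation(Z.shape[1])]
print(np.round(Z @ Z.T / Z.shape[1], 3))
```

After sphering, the remaining task of the learning rule is to find the rotation (or, in the overcomplete case, the set of directions) that makes the outputs as independent as possible.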

In order to evaluate the segmentation results for the medical image set, SENSITIVITY and SPECIFICITY described in Eq. (2.35) and (2.36) are used together with other evaluation functions known as Empirical Discrepancy Methods (EDM). In practical segmentation applications, some errors in the segmented image can be tolerated. On the other hand, if the image to be segmented is complex and the algorithm used is fully automatic, errors are inevitable [52][56]. The disparity between an actually segmented image and a correctly or ideally segmented image (the reference image), which is the best expected result, can be used to assess the performance of an algorithm. Both the actually segmented and the reference images are obtained from the same input image. The reference image is sometimes called the gold standard.

Weszka and Rosenfeld used an approach that measures the difference between an ideal (correct) image and a segmented image [54]. Under the assumption that the image consists of objects and background, each having a specified distribution of gray levels, they compute, for any given standard segmented value, the probability of misclassifying an object pixel as background, or vice versa. This probability in turn provides an index of segmentation results, which can be used for evaluating segmentation algorithms. In their work, this probability is minimized in the process of selecting an appropriate segmentation. More recently, a discrepancy measure based on the same principle has been defined, termed the probability of error (PE). For a two-class problem, PE can be calculated by Eq. (3.13) [57].

(3.13)  $PE = P(O) \times P(B \mid O) + P(B) \times P(O \mid B)$

where $P(B \mid O)$ is the probability of error in classifying objects as background, $P(O \mid B)$ is the probability of error in classifying background as objects, and $P(O)$ and $P(B)$ are the a priori probabilities of objects and background in images. The closer the PE value is to 0, the better the segmentation result.

Image analysis is concerned with the extraction of information from an image: an image goes in and data come out. Here the data are the measurement values of object features obtained from segmented images. One fundamental question in image analysis is whether a measurement made on the objects from segmented images is as accurate as one made on the objects from the original images. According to this measure, a segmented image has the highest quality if the object features extracted from it precisely match the features in the original. The ultimate goal of image segmentation in the context of image analysis is to obtain measurements of object features. The accuracy of these measurements obtained from the segmented image with respect to the reference image provides useful discrepancy measures. This accuracy can be termed ultimate measurement accuracy (UMA) to reflect the ultimate goal of segmentation. Let $R_f$ denote the feature value obtained from the reference image and $S_f$ the feature value measured from the segmented image; the UMA is then defined as Eq. (3.14).

(3.14)  $UMA = \dfrac{|R_f - S_f|}{R_f}$

As with PE, a UMA value close to 0 means that the result of segmentation is good.
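The two discrepancy measures, PE and UMA, can be computed from a segmented mask and a reference mask as sketched below. Only the Python standard library is used. Two choices are assumptions made for this illustration: the probabilities in Eq. (3.13) are estimated from pixel counts, and the feature used for UMA is the object area. The function names and the ten-pixel masks are hypothetical.

```python
def probability_of_error(segmented, reference):
    """Eq. (3.13) for flat sequences of labels: True = object, False = background."""
    n = len(reference)
    n_obj = sum(reference)
    n_bkg = n - n_obj
    obj_as_bkg = sum(r and not s for s, r in zip(segmented, reference))
    bkg_as_obj = sum(s and not r for s, r in zip(segmented, reference))
    p_o, p_b = n_obj / n, n_bkg / n                     # a priori probabilities
    return p_o * (obj_as_bkg / n_obj) + p_b * (bkg_as_obj / n_bkg)

def ultimate_measurement_accuracy(r_f, s_f):
    """Eq. (3.14): relative difference between reference and measured feature values."""
    return abs(r_f - s_f) / r_f

reference = [True, True, True, True, False, False, False, False, False, False]
segmented = [True, True, True, False, True, True, False, False, False, False]

pe = probability_of_error(segmented, reference)
uma = ultimate_measurement_accuracy(sum(reference), sum(segmented))  # feature: object area
print(pe, uma)
```

With probabilities estimated from counts, PE reduces to the fraction of misclassified pixels (3 of 10 here), while UMA depends on the chosen feature and can be small even when individual pixels are wrong.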

Another evaluation function, called the mislabelling rate and described in Eq. (3.15), is chosen to evaluate the results of segmentation using independent component analysis.

(3.15)  $F(I) = \sqrt{R} \times \sum_{i=1}^{R} \dfrac{e_i^2}{\sqrt{A_i}}$

where $I$ is the image to be segmented, $R$ the number of regions in the segmented image, $A_i$ the area, or the number of pixels, of the $i$th region, and $e_i$ the error of region $i$ [49]. The term $\sqrt{R}$ is a global measure, and the summed term penalizes small regions or regions with a large error. $e_i$ indicates whether or not a region is assigned an appropriate feature. A large value of $e_i$ means that the feature of the region is not well captured during the segmentation process. As Eq. (3.15) shows, a larger value of the evaluation function means a worse segmentation result.
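A minimal sketch of this evaluation function follows, using only the Python standard library. It assumes the square-root form $F(I) = \sqrt{R}\,\sum_i e_i^2/\sqrt{A_i}$; the radicals are an assumption about the intended formula. How the region error $e_i$ is measured is not specified here, so the (error, area) pairs below are hypothetical numbers.

```python
import math

def evaluation_function(regions):
    """F(I) = sqrt(R) * sum(e_i^2 / sqrt(A_i)) over (e_i, A_i) pairs, one per region."""
    r = len(regions)
    return math.sqrt(r) * sum(e ** 2 / math.sqrt(a) for e, a in regions)

# Hypothetical segmentations described by (region error, region area in pixels).
good = [(1.0, 100), (2.0, 400)]
bad = [(5.0, 100), (2.0, 400), (3.0, 4)]

f_good = evaluation_function(good)
f_bad = evaluation_function(bad)
print(round(f_good, 4), round(f_bad, 4))
```

The second segmentation scores worse because it has more regions, a region with a large error, and a very small region.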


CHAPTER 4

RESULT & ANALYSIS

In this chapter we describe the applications of automatic medical image segmentation using the independent component analysis algorithm. To verify the performance of the method, computer simulations were carried out with a test data set and 27 axial CT images. The performance evaluations were done using SENSITIVITY, SPECIFICITY, the mislabelling rate and EDM, which were described in Chapter 2 and Chapter 3.

4.1 Simple Test Example

In the first simulation, time-sequence test data were used to verify independent component analysis. The test data consisted of three speech signals obtained from three different persons. The data were mixed with a 3×3 random matrix, each element ranging from 0 to 1. Using the independent component analysis algorithm, the original speech signals were separated, as shown in Figure 4.1. Figure 4.1(a) shows the original signals. These signals were mixed with the random matrix, giving the signals shown in Figure 4.1(b). After 1,000 iterations of independent component analysis, the original signals were separated from the mixed signals. This shows that statistically independent components in mixed signals are separated nearly perfectly by independent component analysis.
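An experiment of this kind can be reproduced in outline with synthetic data. The sketch below is not the code used for Figure 4.1: NumPy is assumed, Laplacian noise stands in for the three speech signals, and the update is a common batch natural-gradient infomax rule with a tanh score, chosen here because it suits super-Gaussian sources. The step size 0.05 and the sample count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Stand-ins for the three speech signals: super-Gaussian (Laplacian) sources.
S = rng.laplace(size=(3, n))
A = rng.uniform(0.0, 1.0, size=(3, 3))   # random mixing matrix, entries in [0, 1)
X = A @ S                                 # mixed signals

# Sphere the mixtures (zero mean, identity covariance).
Xc = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(Xc @ Xc.T / n)
Z = E @ np.diag(d ** -0.5) @ E.T @ Xc

# Batch natural-gradient infomax update with a tanh score.
W = np.eye(3)
for _ in range(1000):
    U = W @ Z
    W += 0.05 * (np.eye(3) - np.tanh(U) @ U.T / n) @ W

U = W @ Z
# Each recovered signal should correlate strongly with exactly one source,
# up to permutation and sign.
corr = np.abs(np.corrcoef(np.vstack([U, S]))[:3, 3:])
print(np.round(corr, 2))
```

The recovered signals come back in arbitrary order and with arbitrary sign and scale, which is why the comparison uses absolute correlations rather than a direct difference.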

Figure 4.2 shows a simple synthesized test image consisting of a combination of two cylinders and one ellipse. To keep the task simple, we considered only this three-object synthesized 160×160 image in this simulation. As in the experiment above, Figure 4.2(a) and Figure 4.2(b) are summed with two random coefficients between 0 and 1.

The gray values of the two cylinders and the ellipse are very similar, but they lie in different parts of the images, and we can assume that they are statistically independent. The architecture in Figure 2.3 and the algorithm proposed in Section 2.3 were sufficient to perform the segmentation. After mixing the two image data with random coefficients, the unmixing matrix $W$ was trained with independent component analysis. After about 100,000 iterations, the unmixing matrix converged. The learning rate was 0.0001 in the first 10,000 iterations, 0.005 in the next 20,000 iterations and 0.001 in the remaining iterations. Figure 4.3 shows the result of the segmentation. Although the input is a mixture of the two objects, the output is unmixed. This indicates that the two objects are statistically independent.

Figure 4.1 Test example of independent component analysis. (a) The original speech signals obtained from three different persons. (b) Signals mixed with a 3×3 random matrix. (c) Signals separated using independent component analysis.

Figure 4.2 The simple test image consisting of two cylinders and one ellipse. In this experiment, (a) and (b) are mixed with random coefficients.

Figure 4.3 The result of automatic segmentation. The learning process has no prior information and segments the image into two parts, two cylinders and one ellipse.

There are some differences between the gray values of the original image and those of the unmixed image. Although one might expect the unmixed data to have the same gray values as the original image, the learning process has no prior information and only segments the image into two parts, the two cylinders and the ellipse. Figure 4.3(b) shows that some parts of the ellipse and the two cylinders remain mixed. This is because the gray values of the ellipse and the two cylinders in the original image were not significantly different.

4.2 Evaluations of Automatic Segmentation with Test Data

This section describes an evaluation of segmentation using independent component analysis. The original data shown in Figure 4.4 are used. The original data are mixed using a matrix that has 4 random coefficients ranging between 0 and 1. The original data are gray-valued and composed of two squares. At first, the standard deviations of the two squares are set to 35, the mean value of the large square (Figure 4.4(a)) is set to 70, and the mean value of the small square (Figure 4.4(b)) is set to 50. In this experiment the standard deviations of the two squares and the mean value of the large square are not changed, but the mean value of the small square is increased in steps of 10 until it reaches 190. Figure 4.4 shows the selected images whose mean values are 70 (Figure 4.4(a)) and 130 (Figure 4.4(b)). Figure 4.4(c) shows the probability density functions when the mean value of the large square is 70 and that of the small one is 130.

Figure 4.4 The selected original data used in the evaluation. (a) Standard deviation 35, mean 70. (b) Standard deviation 35, mean 130. (c) The graph of the probability density functions.


Figures 4.5 and 4.6 show the results of segmentation using independent component analysis. The results are ordered by the mean value of the small square (Figure 4.4(b)). Each result shows that the independent components in the mixed data can be extracted nearly perfectly. In the mixing process, the large squares cover all parts of the small squares. As a result, the independent components that represent the small squares have some false positive values (Figure 4.5). The independent components that represent the large squares, however, have almost no false positive or false negative values.

To evaluate the performance of automatic segmentation using independent component analysis, the probability of error (PE) and ultimate measurement accuracy (UMA) values described in Eq. (3.13) and (3.14) are calculated. These values are close to 0 if the results of segmentation are close to the original data. Table 4.1 and Figures 4.7 and 4.8 show the results of the evaluation using the PE and UMA values. All values are below 0.3 for the small squares and below 0.1 for the large squares. This means that segmentation using independent component analysis can extract the independent components from the mixed data and that it can be applied to medical image segmentation.
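As an illustration of how such measures behave, the following sketch computes a probability of error and a relative ultimate measurement accuracy for flat binary masks. Eq. (3.13) and (3.14) are not reproduced in this section, so the formulas here are assumptions: PE is taken as P(O)P(B|O) + P(B)P(O|B), where O is object and B is background, and UMA is taken as the relative difference in object area. The function names and the toy masks are hypothetical.

```python
def probability_of_error(reference, segmented):
    """Assumed form: PE = P(O) * P(B|O) + P(B) * P(O|B) for flat binary masks."""
    n = len(reference)
    n_obj = sum(1 for r in reference if r)
    n_bg = n - n_obj
    # Object pixels labeled as background, and background pixels labeled as object.
    obj_as_bg = sum(1 for r, s in zip(reference, segmented) if r and not s)
    bg_as_obj = sum(1 for r, s in zip(reference, segmented) if not r and s)
    p_b_given_o = obj_as_bg / n_obj if n_obj else 0.0
    p_o_given_b = bg_as_obj / n_bg if n_bg else 0.0
    return (n_obj / n) * p_b_given_o + (n_bg / n) * p_o_given_b

def relative_uma_area(reference, segmented):
    """Assumed form: |reference area - segmented area| / reference area."""
    ref_area = sum(1 for r in reference if r)
    seg_area = sum(1 for s in segmented if s)
    return abs(ref_area - seg_area) / ref_area

reference = [0, 0, 1, 1, 1, 1, 0, 0]
segmented = [0, 1, 1, 1, 1, 1, 0, 0]   # one background pixel labeled as object
pe = probability_of_error(reference, segmented)
uma = relative_uma_area(reference, segmented)
```

Both values are 0 for a segmentation that matches the reference exactly and grow as the masks diverge, which is the behavior the text relies on.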

Figure 4.5 Small squares extracted from mixed test data.

Figure 4.6 Large squares extracted from mixed test data.

Table 4.1 Probability of Error (PE) and Ultimate Measurement Accuracy (UMA) values of the independent components.

PE Value of      UMA Value of     PE Value of      UMA Value of
Small Squares    Small Squares    Large Squares    Large Squares
0.256            0.269            0.092            0.093
0.268            0.254            0.091            0.089
0.297            0.287            0.077            0.075
0.269            0.271            0.064            0.068
0.257            0.263            0.071            0.072
0.257            0.258            0.068            0.067
0.248            0.25             0.068            0.069
0.245            0.248            0.065            0.066
0.237            0.238            0.067            0.065
0.256            0.254            0.094            0.082
0.233            0.241            0.098            0.097
0.278            0.267            0.093            0.095
0.258            0.257            0.094            0.097
0.269            0.264            0.094            0.094
0.231            0.237            0.093            0.091


Figure 4.7 Plot of PE and UMA values for the small squares

Figure 4.8 Plot of PE and UMA values for the large squares


4.3 Automatic Segmentation with Medical Images

When segmenting medical images using the independent component analysis algorithm described in Chapter 3, the prior of the data set must be chosen. The prior was chosen using kurtosis. Figure 4.9 and Table 4.2 show the kurtosis graph and the kurtosis values of the data set, respectively. Because the medical image data set used in this dissertation had high kurtosis, the Laplacian prior shown in equation (3.12) was chosen.
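The kurtosis-based choice of prior can be sketched as below. The sketch assumes that the reported kurtosis is the excess kurtosis (fourth central moment divided by the squared variance, minus 3), which is what a "higher than 0 means super-Gaussian" criterion suggests; the function names, the returned labels, and the toy samples are hypothetical.

```python
def excess_kurtosis(values):
    """Fourth central moment over squared variance, minus 3 (0 for a Gaussian)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2) - 3.0

def choose_prior(values):
    """Assumed rule: positive kurtosis selects a super-Gaussian (Laplacian) prior."""
    return "laplacian" if excess_kurtosis(values) > 0 else "sub-gaussian"

peaked = [0.0] * 8 + [-1.0, 1.0]   # mostly zeros with two outliers: heavy tails
flat = [-1.0, 1.0, -1.0, 1.0]      # two-point distribution: no tails at all
prior_for_peaked = choose_prior(peaked)
prior_for_flat = choose_prior(flat)
```

For an image, `values` would be the flattened list of gray values of one slice.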

Figure 4.10 shows the input image, an axial slice from the data set, together with the result of segmentation. The slice shows the chin and cervical spine of a female patient. The gray value of soft tissue and that of the inside of the cervical spine are usually nearly the same, so it is very difficult to segment the cervical spine exactly using thresholding or a similar method, as in Figure 4.11. With the automatic method developed in this thesis, the chin and cervical spine are segmented exactly. Note the significant improvement in the inside area of the cervical spine. In Figure 4.11(a) and (b), the inside area of the cervical spine is completely absent. In Figure 4.11(c), the inside area of the cervical spine is partly absent. In Figure 4.11(d), the inside area of the cervical spine is nearly correct, but the chin has some false positive values and that part cannot be discriminated exactly.


Table 4.2 The kurtosis of the input data set. All of the kurtosis values are higher than 0, which means that the input data set has a super-Gaussian probability density function.

Input data (256×256 image)    Kurtosis
CTHEAD00                      3.0487
CTHEAD01                      3.4076
CTHEAD02                      3.4840
CTHEAD03                      3.1982
CTHEAD04                      2.8927
CTHEAD05                      2.6120
CTHEAD06                      2.6840
CTHEAD07                      2.7024
CTHEAD08                      2.7019
CTHEAD09                      2.7735
CTHEAD10                      3.0565
CTHEAD11                      5.4718
CTHEAD12                      6.4063
CTHEAD13                      6.8815
CTHEAD14                      6.5358
CTHEAD15                      5.0069
CTHEAD16                      3.8800
CTHEAD17                      2.4190
CTHEAD18                      2.1906
CTHEAD19                      2.1755
CTHEAD20                      2.1586
CTHEAD21                      2.1212
CTHEAD22                      2.0798
CTHEAD23                      2.0982
CTHEAD24                      2.1921
CTHEAD25                      2.1131
CTHEAD26                      1.9202


Figure 4.9 The kurtosis graph of the input data set. All of the kurtosis values are higher than 0, which means that the input data set has a super-Gaussian probability density function.


Figure 4.10 The input image and the segmentation result.

(a) An axial slice from a General Electric High-Speed Advantage computerized tomography system, showing the tooth and cervical spine of a female patient.

(b) The segmentation result. The tooth and cervical spine were extracted from (a), excluding the soft tissue.

Figure 4.12 shows the segmentation results for the data set. The left images are the original CT images and the right ones are the images segmented by the automatic method using independent component analysis. Since bones in CT images have higher gray values than other parts and are the dominant part of those images, bones are easily segmented using independent component analysis. Figure 4.13 shows 16 selected axial images segmented using independent component analysis, and Figure 4.14 shows the volume-rendered image obtained from the result shown in Figure 4.13.

Figure 4.11 The segmentation result using thresholding. The original image in Figure 4.10(a) is a gray-value image. The threshold value was gradually increased to segment the chin and cervical spine. However, the threshold method did not segment them exactly because some parts of the soft tissue and some parts of the cervical spine and chin had the same gray values.
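The thresholding baseline described in this caption can be sketched as a simple sweep over threshold values. This is a minimal illustration, not the procedure actually used for the figure: the function name, the 3 x 3 toy image, and the threshold values are hypothetical.

```python
def threshold_segment(image, threshold):
    """Label a pixel 1 when its gray value is at or above the threshold."""
    return [[1 if v >= threshold else 0 for v in row] for row in image]

# Toy image: values near 90 to 100 stand in for tissue whose gray values overlap,
# values above 200 stand in for clearly bright structures.
image = [[10,  90, 200],
         [95, 210, 100],
         [20,  85, 220]]

# Gradually increase the threshold and keep one mask per value.
masks = {t: threshold_segment(image, t) for t in (50, 100, 150)}
```

In the toy image, no single threshold keeps the pixel with value 100 while rejecting the pixels with values 85 to 95 without also depending on a difference of only a few gray levels, which mirrors the difficulty the caption describes.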


Figure 4.12 The segmentation result of CT images. The left images are the original CT images and the right ones are the images segmented by the automatic method using independent component analysis.


Figure 4.13 Result of 16 selected axial CT image segmentation


Figure 4.14 Volume-rendered image using the result of Figure 4.13

4.4 Evaluations of Medical Image Segmentation

This section compares segmentation using independent component analysis with segmentation using the thresholding method. The manual segmentation results produced by radiologists were chosen as the reference for the comparison. Figure 4.15 shows the binary result of manual segmentation.

First, the sensitivity of the independent component analysis method was compared with that of the thresholding method. Sensitivity, which is defined in equation (2.35), gives the "True Positive Rate" of the segmented data with respect to the reference, and a higher sensitivity means a better segmentation result. Figure 4.16 shows the result of the sensitivity comparison. In Figure 4.16, 3 of the 27 images have lower sensitivity with independent component analysis than with the thresholding method. This is because these 3 images contain tooth and cervical spine, and the tooth part has some metal artifacts. The metal artifacts spread into parts of the soft tissue, which were then not segmented clearly by independent component analysis. A paired t-test on sensitivity was performed at the 0.05 significance level, and the p value of the paired t-test was much lower than the significance level. This means that, despite the poor results in three cases, segmentation using independent component analysis gave good results. Table 4.3 shows the result of the paired t-test on sensitivity.


Figure 4.15 The result of manual segmentation. These images were used as the reference for comparing segmentation using independent component analysis with segmentation using the thresholding method.


Figure 4.16 The sensitivity (True Positive Rate) comparison between the independent component analysis method and the thresholding method. 3 of the 27 images have lower sensitivity with independent component analysis than with the thresholding method. This is because these 3 images contain tooth and cervical spine, and the tooth part has some metal artifacts.


Table 4.3 The result of the paired t-test on sensitivity at the 0.05 significance level. The p value is much lower than the significance level. This means that, despite the poor results in three cases, segmentation using independent component analysis gives good results.

                                      True Positive Rate of     True Positive Rate of
                                      Independent Component     Threshold
                                      Analysis
Mean                                  0.982963                  0.898519
Variance                              0.003691                  0.000328
Number of Samples                     27                        27
Statistical Significance              0.05
Degrees of Freedom                    26
p(T <= t) Value of Paired t-Test      4.8E-07
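The paired t-test summarized in Table 4.3 can be sketched as follows. The per-image rates behind the table are not listed in this section, so the sample values below are made up purely for illustration, and the function name is hypothetical. The sketch returns only the t statistic and the degrees of freedom; obtaining a p value additionally requires the cumulative distribution of Student's t, which is not included here.

```python
import math

def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for two paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Unbiased sample variance of the pairwise differences.
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    t = mean_d / math.sqrt(var_d / n)
    return t, n - 1

# Made-up per-image sensitivities, NOT the data behind Table 4.3.
ica = [0.99, 0.97, 0.98, 1.00, 0.96]
thr = [0.90, 0.91, 0.89, 0.92, 0.88]
t, dof = paired_t_statistic(ica, thr)
```

With 27 paired images the test has 26 degrees of freedom, as in the table; at the 0.05 level (two-tailed) the difference is significant when the absolute value of t exceeds about 2.056.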

Second, the specificity comparison was done. Specificity is defined in equation (2.36); 1 - specificity gives the "False Positive Rate" of the segmented data with respect to the reference, and a lower "False Positive Rate" means a better segmentation result. Figure 4.17 shows the result of the "False Positive Rate" comparison. In Figure 4.17, 6 of the 27 images have a lower "False Positive Rate" with independent component analysis than with the thresholding method, while the other images have nearly the same "False Positive Rate". This means that, in terms of specificity, the segmentation method using independent component analysis does not give better results than the thresholding method. A paired t-test on the "False Positive Rate" shows that the p value is higher than the significance level, which means that there is no significant difference between independent component analysis and the thresholding method. Table 4.4 shows the result of the paired t-test on the "False Positive Rate".

However, in Figures 4.16 and 4.17, the "True Positive Rate" of independent component analysis is much higher than that of the thresholding method and the "False Positive Rate" is lower, so we can infer that the Receiver Operating Characteristic curve of the independent component analysis method has a much better shape than that of the thresholding method. This means that the independent component analysis method usually gives better results than the thresholding method.
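The two rates discussed above can be computed from a binary segmentation and a binary reference as in the following sketch. It uses the standard confusion-matrix definitions, which are assumed to agree with equations (2.35) and (2.36); the function names and the toy masks are hypothetical.

```python
def confusion_counts(reference, segmented):
    """Count true/false positives and negatives for flat binary masks."""
    tp = fp = tn = fn = 0
    for r, s in zip(reference, segmented):
        if r and s:
            tp += 1
        elif s:          # segmented as object, reference says background
            fp += 1
        elif r:          # reference says object, segmentation missed it
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def true_positive_rate(reference, segmented):
    """Sensitivity: TP / (TP + FN)."""
    tp, fp, tn, fn = confusion_counts(reference, segmented)
    return tp / (tp + fn)

def false_positive_rate(reference, segmented):
    """1 - specificity: FP / (FP + TN)."""
    tp, fp, tn, fn = confusion_counts(reference, segmented)
    return fp / (fp + tn)

reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # stand-in for a manual segmentation
segmented = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]   # one miss and one false alarm
tpr = true_positive_rate(reference, segmented)
fpr = false_positive_rate(reference, segmented)
```

Each (FPR, TPR) pair is one point in Receiver Operating Characteristic space, so a method with a higher TPR at the same or a lower FPR lies closer to the ideal upper-left corner.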


Figure 4.17 The result of the "False Positive Rate" comparison. 6 of the 27 images have a lower "False Positive Rate" with independent component analysis than with the thresholding method, while the other images have nearly the same "False Positive Rate". This means that, in terms of specificity, the segmentation method using independent component analysis does not give better results than the thresholding method.


Table 4.4 The result of the paired t-test on the "False Positive Rate" at the 0.05 significance level. The p value is higher than the significance level, which means that there is no significant difference between independent component analysis and the thresholding method.

                                      False Positive Rate of    False Positive Rate of
                                      Independent Component     Threshold
                                      Analysis
Mean                                  0.984074074               0.999259259
Variance                              0.00245584                7.12251E-06
Number of Samples                     27                        27
Statistical Significance              0.05
p(T <= t) Value of Paired t-Test      0.12460627

Finally, the mislabelling rate of the independent component analysis method was compared with that of the thresholding method. The mislabelling rate, which is defined in equation (3.13), is a combination of the "True Positive Rate" and the "False Positive Rate". There is always a trade-off between preserving details and suppressing noise, and this trade-off is reflected in the mislabelling rate. If there are too many details in the segmented image, the error of each region may be smaller. But since many small regions are formed and the number of regions is large, the mislabelling rate is large, which indicates that the segmentation result is not good. Figure 4.18 shows the result of the mislabelling rate comparison. In Figure 4.18, only 2 of the 27 cases have a higher mislabelling rate with independent component analysis than with the thresholding method. The remaining cases have a much smaller rate than thresholding. This means that, although the result of the "False Positive Rate" comparison is not good, the independent component analysis method gives much better results than the thresholding method. A paired t-test on the mislabelling rate was also performed at the 0.05 significance level, and the p value of the paired t-test was much lower than the significance level. Table 4.5 shows the result of the paired t-test on the mislabelling rate.


Figure 4.18 The result of the mislabelling rate comparison. Only 2 of the 27 cases have a higher mislabelling rate with independent component analysis than with the thresholding method. The remaining cases have a much smaller mislabelling rate than thresholding. This means that, although the result of the "False Positive Rate" comparison is not good, the independent component analysis method gives much better results than the thresholding method.


Table 4.5 The result of the paired t-test on the mislabelling rate at the 0.05 significance level. The p value is much lower than the significance level. This means that, although the result of the "False Positive Rate" comparison is not good, the independent component analysis method gives much better results than the thresholding method.

                                      Mislabelling Rate of      Mislabelling Rate of
                                      Independent Component     Threshold
                                      Analysis
Mean                                  1.069558847               9.254604451
Variance                              13.4430021                11.47477914
Number of Samples                     27                        27
Statistical Significance              0.05
p(T <= t) Value of Paired t-Test      4.86657E-08


CHAPTER 5

CONCLUSION

In this dissertation, an automatic medical image segmentation method using independent component analysis was demonstrated. The performance of this method was evaluated using PE and UMA values. It was also compared with the performance of the general thresholding method using the TPR (True Positive Rate), the FPR (False Positive Rate), and the mislabelling rate. For the test data, all of the results are close to the original data. The segmentation method using independent component analysis has a TPR of over 95 percent, an FPR of 1 percent, and a mislabelling rate near 1 percent. This means that the automatic method demonstrated in this dissertation gives good results. The segmentation method using independent component analysis offers several distinct advantages over other segmentation methods. First, no a priori information about the region to be segmented is needed before segmentation. Second, the independent component analysis method gives good spatial resolution compared to the general threshold method: with this method, finer details in medical images can be discriminated.

Medical image segmentation is the starting point of 3-dimensional medical image reconstruction, which is used for diagnosis, treatment, preoperative planning, and outcome simulation for various interventional options. However, poor segmentation results have been a major obstacle to 3-dimensional medical image reconstruction. The segmentation method described in this dissertation efficiently segmented the detailed parts of medical images. It can therefore improve 3-dimensional medical image reconstruction.

Independent component analysis has many potential application areas, including blind separation of electroencephalographic and magnetoencephalographic data, as well as feature extraction and analysis of natural images. Blind source separation can be applied to noise reduction in biomedical signals, for example, the extraction of ocular noise and 60 Hz artifacts from EEG, fetal monitoring signal analysis, and EKG signal analysis. Independent component analysis relies on several model assumptions, which can be applied to each case described above. However, these assumptions may be inaccurate or incorrect. Finding and formulating suitable model assumptions will be an interesting research area.


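KOREAN ABSTRACT

A Study of Automatic Medical Image Segmentation using Independent Component Analysis

배수현

Graduate Program in Biomedical Engineering

Medical image segmentation is a method of dividing a medical image into several tissues with homogeneous properties, as required for computer-aided diagnosis and visualization. This dissertation studies an automatic medical image segmentation method using Independent Component Analysis. Independent Component Analysis solves the Blind Source Separation problem by building a linear system that makes the output data statistically independent, starting from input data for which no prior information is available.

In this dissertation, the automatic medical image segmentation method using Independent Component Analysis was applied to test data synthesized for performance evaluation and to CT images. To evaluate the performance of the proposed method, the Probability of Error (PE) and Ultimate Measurement Accuracy (UMA) values were measured. In addition, to compare the automatic segmentation method with a general segmentation method, the sensitivity (True Positive Rate), specificity (1 - False Positive Rate), and mislabelling rate were measured, and a paired t-test was applied to the evaluation results to verify the statistical significance of the experimental results. Under the assumption that a medical image is composed of several statistically independent organs, the automatic segmentation method using Independent Component Analysis was able to segment each part of the medical image accurately.

The results obtained in this study are as follows.

(1) For the test data used in this dissertation, most PE and UMA values were close to 0. This means that the segmentation results of the proposed method nearly coincide with the original data from which the test data were composed.

(2) When the proposed method was applied to CT images, the TPR (True Positive Rate) was 95% or higher and the FPR (False Positive Rate) was 1% or lower. The mislabelling rate was about 1%. This means that, when applied to medical image segmentation, the proposed method segments the target region almost exactly.

(3) Analysis with a paired t-test at the 5% significance level showed that the proposed method gives better results than the general image segmentation method.

Key words: segmentation, Independent Component Analysis, medical image, independence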
