Multivariate Statistical Analysis
93751009 呂冠宏, 93751503 林其緯
Transformations To Near Normality
Why do we need to transform the data?
How do we transform the data? (The univariate case)
Example
How do we transform the data? (The multivariate case)
Example
Why do we need to transform the data?
Objective
A convenient statistical model
Constant variance
Suitable for graphical display
For regression or analysis of variance
How (univariate)
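Power transformations (by Tukey (1957), Box and Cox (1964))
\[
x^{(\lambda)} =
\begin{cases}
\dfrac{x^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\[1ex]
\ln x, & \lambda = 0
\end{cases}
\]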
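A minimal sketch of the univariate power transformation, assuming positive observations. The function name `box_cox` is introduced here for illustration only and does not come from the slides.

```python
import math

def box_cox(x, lam):
    """Power transformation x^(lambda) of one positive observation."""
    if x <= 0:
        raise ValueError("the power transformation requires x > 0")
    if lam == 0:
        return math.log(x)           # limiting case lambda = 0
    return (x ** lam - 1.0) / lam    # general case lambda != 0

# lambda = 1 only shifts the data by 1; lambda = 0.5 is a shifted,
# rescaled square root.
print([box_cox(v, 1.0) for v in (1.0, 2.0, 4.0)])   # [0.0, 1.0, 3.0]
print(box_cox(4.0, 0.5))                            # 2.0
```

The $\lambda = 0$ branch is the limit of the general branch as $\lambda \to 0$, so the family is continuous in $\lambda$.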
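How (univariate)
Given the observations $x_1, x_2, \ldots, x_n$
Assumption:
There exists a $\lambda$ for which $x_i^{(\lambda)}$ is $N(\mu, \sigma^2)$ for some $\mu$ and $\sigma^2$
Then the log-likelihood function of $x_1, x_2, \ldots, x_n$ is:
\[
\log L(\theta \mid x) = -\frac{n}{2}\log 2\pi - \frac{n}{2}\log \sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(x_i^{(\lambda)} - \mu\right)^2 + \log J
\]
where $\theta = (\lambda, \mu, \sigma^2)$ and $J = \prod_{i=1}^{n} x_i^{\lambda - 1}$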
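How (univariate)
Then we have:
\[
\hat{\mu}(\lambda) = \frac{1}{n}\sum_{i=1}^{n} x_i^{(\lambda)}
\quad\text{and}\quad
\hat{\sigma}^2(\lambda) = \frac{1}{n}\sum_{i=1}^{n}\left(x_i^{(\lambda)} - \hat{\mu}(\lambda)\right)^2
\]
Thus for fixed $\lambda$, the maximized log-likelihood is (except for a constant)
\[
l(\lambda) = -\frac{n}{2}\log \hat{\sigma}^2(\lambda) + \log J = -\frac{n}{2}\log \hat{\sigma}_y^2(\lambda)
\]
where $\hat{\sigma}_y^2(\lambda)$ is the same variance estimate computed from $y_i = x_i^{(\lambda)} / J^{1/n}$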
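The maximized log-likelihood for a fixed $\lambda$ can be sketched as follows. The function names are hypothetical, the data are assumed positive, and the constant term is dropped.

```python
import math

def box_cox(x, lam):
    """Power transformation x^(lambda) of one positive observation."""
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam

def profile_loglik(xs, lam):
    """l(lambda) = -(n/2) log sigma_hat^2(lambda) + log J, constant dropped."""
    n = len(xs)
    z = [box_cox(x, lam) for x in xs]
    mu_hat = sum(z) / n                                   # MLE of mu
    sigma2_hat = sum((v - mu_hat) ** 2 for v in z) / n    # MLE of sigma^2 (divisor n)
    log_j = (lam - 1.0) * sum(math.log(x) for x in xs)    # log of prod x_i^(lambda - 1)
    return -0.5 * n * math.log(sigma2_hat) + log_j

# Hypothetical toy sample, for illustration only.
xs = [0.5, 1.0, 2.0, 4.0]
print(profile_loglik(xs, 1.0), profile_loglik(xs, 0.0))
```

Working with $\log J$ as a sum of logs avoids forming the product $J$ directly, which can overflow or underflow for large $n$.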
Example
In Example 4.10 (closed door)
We perform a power transformation of the data
Then we must find the value of $\lambda$ that maximizes the function $l(\lambda)$
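One simple way to find the maximizing $\lambda$ is a grid search. The sketch below uses a synthetic right-skewed sample as a stand-in, not the Example 4.10 data, and the grid range and step are arbitrary choices.

```python
import math
import random

def box_cox(x, lam):
    """Power transformation x^(lambda) of one positive observation."""
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam

def profile_loglik(xs, lam):
    """l(lambda) = -(n/2) log sigma_hat^2(lambda) + log J, constant dropped."""
    n = len(xs)
    z = [box_cox(x, lam) for x in xs]
    mean = sum(z) / n
    var = sum((v - mean) ** 2 for v in z) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(x) for x in xs)

# Synthetic stand-in data (assumption): a log-normal sample, for which
# the log transformation (lambda = 0) is the appropriate one.
rng = random.Random(0)
sample = [rng.lognormvariate(0.0, 1.0) for _ in range(200)]

# Evaluate l(lambda) on a grid from -1.00 to 1.50 in steps of 0.01.
grid = [round(-1.0 + 0.01 * k, 2) for k in range(251)]
lam_hat = max(grid, key=lambda lam: profile_loglik(sample, lam))
print("lambda maximizing l(lambda) on the grid:", lam_hat)
```

A grid is crude but makes it easy to plot $l(\lambda)$ against $\lambda$ and read off a convenient nearby value such as 0, 1/4 or 1/2.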
Example
[Figures: original Q-Q plot; transformed Q-Q plot]
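The coordinates of a normal Q-Q plot, and the correlation of its points as a rough measure of straightness, can be sketched as below. The probability levels $(j - 1/2)/n$ are an assumed convention, and the skewed sample is constructed for illustration rather than taken from the example.

```python
import math
from statistics import NormalDist

def qq_points(xs):
    """Return (standard normal quantiles, ordered observations)."""
    n = len(xs)
    std_normal = NormalDist()
    quantiles = [std_normal.inv_cdf((j - 0.5) / n) for j in range(1, n + 1)]
    return quantiles, sorted(xs)

def qq_correlation(xs):
    """Correlation between the two Q-Q plot coordinates."""
    q, x = qq_points(xs)
    q_bar, x_bar = sum(q) / len(q), sum(x) / len(x)
    sxy = sum((a - q_bar) * (b - x_bar) for a, b in zip(q, x))
    sqq = sum((a - q_bar) ** 2 for a in q)
    sxx = sum((b - x_bar) ** 2 for b in x)
    return sxy / math.sqrt(sqq * sxx)

# Constructed skewed sample (assumption): exponentiated normal quantiles.
skewed = [math.exp(NormalDist().inv_cdf((j - 0.5) / 30)) for j in range(1, 31)]
corr_raw = qq_correlation(skewed)
corr_log = qq_correlation([math.log(v) for v in skewed])
print(corr_raw, corr_log)
```

A correlation close to 1 corresponds to a nearly straight Q-Q plot; here the log-transformed values line up with the normal quantiles while the raw values do not.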
Example
In Example 4.10 (open door)
We perform a power transformation of the data
Then we must find the value of $\lambda$ that maximizes the function $l(\lambda)$
Example
[Figures: original Q-Q plot; transformed Q-Q plot]
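How (multivariate)
Power transformations
\[
x_i^{(\lambda)} =
\begin{pmatrix}
x_{i1}^{(\lambda_1)} \\ x_{i2}^{(\lambda_2)} \\ \vdots \\ x_{ip}^{(\lambda_p)}
\end{pmatrix}
=
\begin{pmatrix}
\dfrac{x_{i1}^{\lambda_1} - 1}{\lambda_1} \\[1ex]
\dfrac{x_{i2}^{\lambda_2} - 1}{\lambda_2} \\
\vdots \\
\dfrac{x_{ip}^{\lambda_p} - 1}{\lambda_p}
\end{pmatrix}
\]
where $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})'$ and $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_p)'$, with $\ln x_{ij}$ used for any component with $\lambda_j = 0$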
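A sketch of the multivariate transformation applied to an $n \times p$ data matrix, one $\lambda_j$ per column. It assumes NumPy is available; the name `box_cox_columns` is hypothetical.

```python
import numpy as np

def box_cox_columns(X, lambdas):
    """Apply the power transformation with lambda_j to column j of X."""
    X = np.asarray(X, dtype=float)
    lambdas = np.asarray(lambdas, dtype=float)
    if X.ndim != 2 or X.shape[1] != lambdas.size:
        raise ValueError("need one lambda per column of X")
    if np.any(X <= 0):
        raise ValueError("the power transformation requires positive data")
    out = np.empty_like(X)
    for j, lam in enumerate(lambdas):
        col = X[:, j]
        out[:, j] = np.log(col) if lam == 0 else (col ** lam - 1.0) / lam
    return out

# Hypothetical 2 x 2 data matrix: log for column 1, lambda = 0.5 for column 2.
X = np.array([[1.0, 4.0],
              [np.e, 9.0]])
print(box_cox_columns(X, [0.0, 0.5]))
```

Each column is transformed independently; the dependence between variables enters only through the likelihood used to choose $\lambda$.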
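How (multivariate)
Given the observations $x_1, x_2, \ldots, x_n$
Assumption 1:
There exists a $\lambda$ for which $x_i^{(\lambda)}$ is $N_p(\mu, \sigma^2 I)$ for some $\mu$ and $\sigma^2$
Then the log-likelihood function of $x_1, x_2, \ldots, x_n$ is:
\[
\log L(\theta \mid x) = -\frac{np}{2}\log 2\pi - \frac{np}{2}\log \sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(x_i^{(\lambda)} - \mu\right)'\left(x_i^{(\lambda)} - \mu\right) + \log J
\]
where $\theta = (\lambda, \mu, \sigma^2)$ and
\[
J = \prod_{i=1}^{n}\left|\frac{\partial x_i^{(\lambda)}}{\partial x_i}\right| = \prod_{i=1}^{n}\prod_{j=1}^{p} x_{ij}^{\lambda_j - 1}
\]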
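How (multivariate)
Then we have:
\[
\hat{\mu}(\lambda) = \bar{x}^{(\lambda)}
\quad\text{and}\quad
\hat{\sigma}^2(\lambda) = \frac{1}{np}\sum_{i=1}^{n}\left(x_i^{(\lambda)} - \bar{x}^{(\lambda)}\right)'\left(x_i^{(\lambda)} - \bar{x}^{(\lambda)}\right)
\]
where $\bar{x}^{(\lambda)} = \left(\bar{x}_1^{(\lambda_1)}, \bar{x}_2^{(\lambda_2)}, \ldots, \bar{x}_p^{(\lambda_p)}\right)'$ and $\bar{x}_j^{(\lambda_j)} = \frac{1}{n}\sum_{i=1}^{n} x_{ij}^{(\lambda_j)}$
Thus for fixed $\lambda$, the maximized log-likelihood is (except for a constant)
\[
l(\lambda) = -\frac{np}{2}\log \hat{\sigma}^2(\lambda) + \log J
\]
where
\[
J = \prod_{i=1}^{n}\left|\frac{\partial x_i^{(\lambda)}}{\partial x_i}\right| = \prod_{i=1}^{n}\prod_{j=1}^{p} x_{ij}^{\lambda_j - 1}
\]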
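A sketch of the maximized log-likelihood under Assumption 1. It assumes the covariance is $\sigma^2 I$, so the variance estimate pools the squared deviations over all $np$ entries; function names are hypothetical and NumPy is assumed.

```python
import numpy as np

def box_cox_columns(X, lambdas):
    """Apply the power transformation with lambda_j to column j of X."""
    out = np.empty_like(X, dtype=float)
    for j, lam in enumerate(lambdas):
        col = X[:, j]
        out[:, j] = np.log(col) if lam == 0 else (col ** lam - 1.0) / lam
    return out

def profile_loglik_spherical(X, lambdas):
    """l(lambda) = -(np/2) log sigma_hat^2(lambda) + log J, constant dropped."""
    X = np.asarray(X, dtype=float)
    lambdas = np.asarray(lambdas, dtype=float)
    n, p = X.shape
    Z = box_cox_columns(X, lambdas)
    resid = Z - Z.mean(axis=0)                   # subtract the column means
    sigma2_hat = np.sum(resid ** 2) / (n * p)    # pooled over all n*p entries
    log_j = np.sum((lambdas - 1.0) * np.log(X).sum(axis=0))
    return -0.5 * n * p * np.log(sigma2_hat) + log_j

# Hypothetical toy data, for illustration only.
X = np.array([[1.0, 2.0], [3.0, 6.0], [5.0, 4.0]])
print(profile_loglik_spherical(X, [1.0, 1.0]))
```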
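How (multivariate)
Assumption 2:
There exists a $\lambda$ for which $x_i^{(\lambda)}$ is $N_p(\mu, \Sigma)$ for some $\mu$ and $\Sigma$
Then the log-likelihood function of $x_1, x_2, \ldots, x_n$ is:
\[
\log L(\theta \mid x) = -\frac{np}{2}\log 2\pi - \frac{n}{2}\log |\Sigma| - \frac{1}{2}\sum_{i=1}^{n}\left(x_i^{(\lambda)} - \mu\right)'\Sigma^{-1}\left(x_i^{(\lambda)} - \mu\right) + \log J
\]
where $\theta = (\lambda, \mu, \Sigma)$ and
\[
J = \prod_{i=1}^{n}\left|\frac{\partial x_i^{(\lambda)}}{\partial x_i}\right| = \prod_{i=1}^{n}\prod_{j=1}^{p} x_{ij}^{\lambda_j - 1}
\]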
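How (multivariate)
Then we have:
\[
\hat{\mu}(\lambda) = \bar{x}^{(\lambda)}
\quad\text{and}\quad
\hat{\Sigma}(\lambda) = \frac{1}{n}\sum_{i=1}^{n}\left(x_i^{(\lambda)} - \bar{x}^{(\lambda)}\right)\left(x_i^{(\lambda)} - \bar{x}^{(\lambda)}\right)'
\]
where $\bar{x}^{(\lambda)} = \left(\bar{x}_1^{(\lambda_1)}, \bar{x}_2^{(\lambda_2)}, \ldots, \bar{x}_p^{(\lambda_p)}\right)'$ and $\bar{x}_j^{(\lambda_j)} = \frac{1}{n}\sum_{i=1}^{n} x_{ij}^{(\lambda_j)}$
Thus for fixed $\lambda$, the maximized log-likelihood is (except for a constant)
\[
l(\lambda) = -\frac{n}{2}\log \left|\hat{\Sigma}(\lambda)\right| + \log J
\]
where
\[
J = \prod_{i=1}^{n}\left|\frac{\partial x_i^{(\lambda)}}{\partial x_i}\right| = \prod_{i=1}^{n}\prod_{j=1}^{p} x_{ij}^{\lambda_j - 1}
\]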
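A sketch of the maximized log-likelihood under Assumption 2, with a general covariance matrix estimated with divisor $n$. Function names are hypothetical and NumPy is assumed; `np.linalg.slogdet` is used to get the log-determinant stably.

```python
import numpy as np

def box_cox_columns(X, lambdas):
    """Apply the power transformation with lambda_j to column j of X."""
    out = np.empty_like(X, dtype=float)
    for j, lam in enumerate(lambdas):
        col = X[:, j]
        out[:, j] = np.log(col) if lam == 0 else (col ** lam - 1.0) / lam
    return out

def profile_loglik_general(X, lambdas):
    """l(lambda) = -(n/2) log|Sigma_hat(lambda)| + log J, constant dropped."""
    X = np.asarray(X, dtype=float)
    lambdas = np.asarray(lambdas, dtype=float)
    n = X.shape[0]
    Z = box_cox_columns(X, lambdas)
    resid = Z - Z.mean(axis=0)
    sigma_hat = resid.T @ resid / n                  # MLE of Sigma (divisor n)
    _, logdet = np.linalg.slogdet(sigma_hat)         # log|Sigma_hat|
    log_j = np.sum((lambdas - 1.0) * np.log(X).sum(axis=0))
    return -0.5 * n * logdet + log_j

# Hypothetical toy data, for illustration only.
X = np.array([[1.0, 2.0], [3.0, 6.0], [5.0, 4.0]])
print(profile_loglik_general(X, [1.0, 1.0]))
```

Compared with Assumption 1, the only change is that $np \log \hat{\sigma}^2$ is replaced by $n \log |\hat{\Sigma}|$, which lets the transformed variables be correlated and have unequal variances.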
Example
In Example 4.10 (closed door and open door)
We perform a power transformation of the data (by assumption 2)
Then we must find the value of $\lambda = (\lambda_1, \lambda_2)$ that maximizes the function $l(\lambda) = l(\lambda_1, \lambda_2)$
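A two-dimensional grid search over $(\lambda_1, \lambda_2)$ under Assumption 2 can be sketched as follows. The data are a synthetic correlated log-normal sample, not the Example 4.10 measurements, and the grid range and step are arbitrary choices.

```python
import numpy as np

def box_cox_columns(X, lambdas):
    """Apply the power transformation with lambda_j to column j of X."""
    out = np.empty_like(X, dtype=float)
    for j, lam in enumerate(lambdas):
        col = X[:, j]
        out[:, j] = np.log(col) if lam == 0 else (col ** lam - 1.0) / lam
    return out

def profile_loglik(X, lambdas):
    """l(lambda) = -(n/2) log|Sigma_hat(lambda)| + log J, constant dropped."""
    lambdas = np.asarray(lambdas, dtype=float)
    n = X.shape[0]
    Z = box_cox_columns(X, lambdas)
    resid = Z - Z.mean(axis=0)
    _, logdet = np.linalg.slogdet(resid.T @ resid / n)
    return -0.5 * n * logdet + np.sum((lambdas - 1.0) * np.log(X).sum(axis=0))

# Synthetic stand-in data (assumption): exponentiated bivariate normal.
rng = np.random.default_rng(0)
X = np.exp(rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=300))

# Evaluate l on a 51 x 51 grid covering [-1, 1.5] in steps of 0.05.
grid = np.round(np.linspace(-1.0, 1.5, 51), 2)
best = max(((l1, l2) for l1 in grid for l2 in grid),
           key=lambda lam: profile_loglik(X, lam))
print("(lambda_1, lambda_2) maximizing l on the grid:", best)
```

Maximizing over both components jointly can give different values from maximizing each univariate $l(\lambda_j)$ separately, because $|\hat{\Sigma}(\lambda)|$ depends on the correlation of the transformed variables.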
Example
[Figures: original chi-square plot; transformed chi-square plot]
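The coordinates of a chi-square plot for bivariate data can be sketched as below. This assumes $p = 2$, where the chi-square quantile has the closed form $-2\ln(1 - u)$, uses the sample covariance with divisor $n - 1$ and probability levels $(j - 1/2)/n$ as assumed conventions, and runs on hypothetical toy data.

```python
import numpy as np

def chi_square_plot_points(X):
    """Return (chi-square(2) quantiles, ordered squared distances) for n x 2 data."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    if p != 2:
        raise ValueError("this sketch uses the closed-form quantile for p = 2 only")
    centered = X - X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(X, rowvar=False))     # sample covariance, divisor n - 1
    # Squared generalized distances d_j^2 = (x_j - xbar)' S^{-1} (x_j - xbar).
    d2 = np.einsum("ij,jk,ik->i", centered, S_inv, centered)
    probs = (np.arange(1, n + 1) - 0.5) / n
    quantiles = -2.0 * np.log(1.0 - probs)             # chi-square quantiles, 2 d.f.
    return quantiles, np.sort(d2)

# Hypothetical toy data, for illustration only.
X = np.array([[1.0, 2.0], [2.0, 1.5], [3.0, 3.5], [4.0, 3.0], [5.0, 6.0]])
q, d2 = chi_square_plot_points(X)
print(np.column_stack([q, d2]))
```

Plotting the ordered distances against the quantiles should give roughly a straight line through the origin with slope 1 when the data are close to bivariate normal.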
Example
[Figures: chi-square plot (assumption 1); chi-square plot (assumption 2)]
Example
[Figures: chi-square plot from the packaged routine; chi-square plot from the textbook]
References
Box, G. E. P., and Cox, D. R. (1964) “An analysis of transformations.” Journal of the Royal Statistical Society, Series B, 26, 211-252.
Hernandez, F., and Johnson, R. A. (1980) “The large-sample behavior of transformations to normality.” Journal of the American Statistical Association, 75, 855-861.
Sanford, W. (2001) “Yeo-Johnson Power Transformations.” Supported by National Science Foundation Grant DUE 97-52887.