Sample Correlation
Mathematics 47: Lecture 5
Dan Sloughter
Furman University
March 10, 2006
Dan Sloughter (Furman University) Sample Correlation March 10, 2006 1 / 8
Definition
If $X$ and $Y$ are random variables with means $\mu_X$ and $\mu_Y$ and variances $\sigma_X^2$ and $\sigma_Y^2$, respectively, then we call
\[ \operatorname{cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] \]
the covariance of $X$ and $Y$.
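Expanding the product inside the expectation gives an equivalent form that is often easier to compute with. This is a standard identity rather than something stated on the slide, and it uses only linearity of expectation:

```latex
\begin{align*}
\operatorname{cov}(X, Y)
  &= E[XY - \mu_X Y - \mu_Y X + \mu_X \mu_Y] \\
  &= E[XY] - \mu_X E[Y] - \mu_Y E[X] + \mu_X \mu_Y \\
  &= E[XY] - \mu_X \mu_Y .
\end{align*}
```

In particular, taking $Y = X$ gives $\operatorname{cov}(X, X) = E[X^2] - \mu_X^2 = \sigma_X^2$.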
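Theorem (Cauchy-Schwarz Inequality)
If $X$ and $Y$ are random variables for which $E[X^2]$ and $E[Y^2]$ both exist, then
\[ (E[XY])^2 \le E[X^2]\,E[Y^2]. \]
Proof.
- Let $f(t) = E[(X + tY)^2] = E[X^2] + 2tE[XY] + t^2 E[Y^2]$.
- Then $f$ is a quadratic polynomial in $t$ with $f(t) \ge 0$ for all $t$. (If $E[Y^2] = 0$, then $f$ is linear, and $f(t) \ge 0$ for all $t$ forces $E[XY] = 0$, so the inequality holds trivially.)
- Since $f$ is never negative, it has at most one real root. Hence, by the quadratic formula, its discriminant satisfies $4(E[XY])^2 - 4E[X^2]E[Y^2] \le 0$.
- Hence $(E[XY])^2 \le E[X^2]E[Y^2]$.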
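As a numerical sanity check of the theorem, the following Python sketch treats a short list of equally likely $(x, y)$ pairs as the joint distribution of $X$ and $Y$, so that every expectation is a plain average. The data points and the helper names `expect` and `f` are made up for illustration and do not come from the lecture.

```python
# Illustrative check of the Cauchy-Schwarz inequality on a finite,
# equally weighted joint distribution (the points are hypothetical).
points = [(1.0, 2.0), (-2.0, 0.5), (3.0, 1.0), (0.5, 4.0)]


def expect(g):
    """E[g(X, Y)] when each listed point has probability 1/len(points)."""
    return sum(g(x, y) for x, y in points) / len(points)


e_xx = expect(lambda x, y: x * x)  # E[X^2]
e_yy = expect(lambda x, y: y * y)  # E[Y^2]
e_xy = expect(lambda x, y: x * y)  # E[XY]


def f(t):
    """f(t) = E[(X + tY)^2], the quadratic used in the proof."""
    return expect(lambda x, y: (x + t * y) ** 2)


# Discriminant of f(t) = E[X^2] + 2t E[XY] + t^2 E[Y^2].
discriminant = 4 * e_xy ** 2 - 4 * e_xx * e_yy

print("E[XY]^2 <= E[X^2] E[Y^2]:", e_xy ** 2 <= e_xx * e_yy)
print("discriminant <= 0:", discriminant <= 0)
```

Evaluating `f` at a few values of `t` should never give a negative number, which is the property the proof relies on.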
Correlation coefficient
- Applying the Cauchy-Schwarz inequality to the definition of covariance, we have
\[ |\operatorname{cov}(X, Y)| \le \sqrt{E[(X - \mu_X)^2]}\,\sqrt{E[(Y - \mu_Y)^2]} = \sigma_X \sigma_Y. \]
- If we let
\[ \rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y}, \]
then $-1 \le \rho_{X,Y} \le 1$.
- Moreover, $|\rho_{X,Y}| = 1$ if and only if $Y = aX + b$ (with probability 1) for some real numbers $a \ne 0$ and $b$.
Definition
We call $\rho_{X,Y}$ the correlation coefficient of $X$ and $Y$.
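To make the definition concrete, here is a minimal Python sketch that computes $\operatorname{cov}(X, Y)$ and $\rho_{X,Y}$ for a finite joint distribution given as a dictionary of probabilities. The function names `moments` and `rho` and both example distributions are assumptions chosen for illustration, not part of the lecture.

```python
import math


def moments(pmf):
    """Return (mu_x, mu_y, sigma_x, sigma_y, cov) for a finite joint pmf.

    `pmf` maps (x, y) pairs to probabilities that sum to 1.
    """
    mu_x = sum(p * x for (x, y), p in pmf.items())
    mu_y = sum(p * y for (x, y), p in pmf.items())
    var_x = sum(p * (x - mu_x) ** 2 for (x, y), p in pmf.items())
    var_y = sum(p * (y - mu_y) ** 2 for (x, y), p in pmf.items())
    cov = sum(p * (x - mu_x) * (y - mu_y) for (x, y), p in pmf.items())
    return mu_x, mu_y, math.sqrt(var_x), math.sqrt(var_y), cov


def rho(pmf):
    """Correlation coefficient cov(X, Y) / (sigma_X * sigma_Y)."""
    _, _, sigma_x, sigma_y, cov = moments(pmf)
    return cov / (sigma_x * sigma_y)


# Hypothetical joint distribution with some scatter.
scattered = {(0, 1): 0.25, (1, 0): 0.25, (1, 2): 0.25, (2, 3): 0.25}
# Here Y = -2X + 5 exactly, so |rho| should equal 1 (with rho negative).
linear = {(x, -2 * x + 5): 1 / 3 for x in (0, 1, 2)}

print(rho(scattered))
print(rho(linear))
```

The second distribution illustrates the equality case: an exact linear relation with negative slope gives a value of $-1$, up to floating-point rounding.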
Correlation and independence
- Note: if $X$ and $Y$ are independent, then $\operatorname{cov}(X, Y) = 0$ (and hence $\rho_{X,Y} = 0$).
- If $\operatorname{cov}(X, Y) = 0$, we say $X$ and $Y$ are uncorrelated.
- However, being uncorrelated does not necessarily imply independence, although it does if $(X, Y)$ has a bivariate normal distribution.
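A small counterexample shows why uncorrelated variables need not be independent. The sketch below uses a standard textbook construction that is not taken from the lecture: $X$ is uniform on $\{-1, 0, 1\}$ and $Y = X^2$. Exact rational arithmetic from Python's `fractions` module avoids any rounding questions.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1} and Y = X^2 (an assumed illustrative example).
third = Fraction(1, 3)
pmf = {(-1, 1): third, (0, 0): third, (1, 1): third}


def expect(g):
    """E[g(X, Y)] under the joint pmf above."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())


mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
cov = expect(lambda x, y: (x - mu_x) * (y - mu_y))

# Independence would require P(X=0, Y=0) = P(X=0) * P(Y=0).
p_joint = pmf[(0, 0)]
p_x0 = sum(p for (x, y), p in pmf.items() if x == 0)
p_y0 = sum(p for (x, y), p in pmf.items() if y == 0)

print("cov(X, Y) =", cov)
print("P(X=0, Y=0) =", p_joint, "but P(X=0) P(Y=0) =", p_x0 * p_y0)
```

The covariance is exactly zero, yet the joint probability differs from the product of the marginals, so $X$ and $Y$ are uncorrelated but dependent.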
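Sample correlation
- Now suppose $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ are independent, identically distributed pairs of random variables (that is, a random sample from a bivariate distribution).
- Let
\[
\begin{aligned}
R &= \frac{\frac{1}{n}\sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\frac{1}{n}\sum_{i=1}^n (X_i - \bar{X})^2}\,\sqrt{\frac{1}{n}\sum_{i=1}^n (Y_i - \bar{Y})^2}} \\
  &= \frac{\sum_{i=1}^n X_i Y_i - n\bar{X}\,\bar{Y}}{\sqrt{\sum_{i=1}^n X_i^2 - n\bar{X}^2}\,\sqrt{\sum_{i=1}^n Y_i^2 - n\bar{Y}^2}} \\
  &= \frac{\sum_{i=1}^n X_i Y_i - \frac{\left(\sum_{i=1}^n X_i\right)\left(\sum_{i=1}^n Y_i\right)}{n}}{\sqrt{\sum_{i=1}^n X_i^2 - \frac{\left(\sum_{i=1}^n X_i\right)^2}{n}}\,\sqrt{\sum_{i=1}^n Y_i^2 - \frac{\left(\sum_{i=1}^n Y_i\right)^2}{n}}}.
\end{aligned}
\]
Definition
We call $R$ the sample correlation coefficient.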
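The first and last expressions for $R$ translate directly into code. The following Python sketch implements both and compares them on a small sample. The function names `r_centered` and `r_sums` and the data are hypothetical, chosen only to illustrate that the two forms agree.

```python
import math


def r_centered(xs, ys):
    """First form: averaged products of deviations from the sample means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    sxx = sum((x - x_bar) ** 2 for x in xs) / n
    syy = sum((y - y_bar) ** 2 for y in ys) / n
    return sxy / (math.sqrt(sxx) * math.sqrt(syy))


def r_sums(xs, ys):
    """Last form: needs only the five sums of x, y, xy, x^2, and y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    num = sum_xy - sum_x * sum_y / n
    den = math.sqrt(sum_xx - sum_x ** 2 / n) * math.sqrt(sum_yy - sum_y ** 2 / n)
    return num / den


# Hypothetical sample, for illustration only.
xs = [1.0, 2.0, 4.0, 7.0, 8.0]
ys = [2.1, 2.9, 5.2, 7.8, 9.5]
print(r_centered(xs, ys), r_sums(xs, ys))
```

The sums form is convenient for hand calculation, but it subtracts quantities of similar size and so can lose precision when the data are large relative to their spread. The centered form is usually the safer choice in floating-point arithmetic.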
Example
- An experiment to measure the yield of wheat for seven different levels of nitrogen gave the following observations:

Nitrogen/acre (x)       40    60    80    100   120   140   160
Yield (cwt/acre) (y)    15.9  18.8  21.6  25.2  28.7  30.4  30.7

- If we let $x_i$ and $y_i$, $i = 1, 2, \ldots, 7$, represent the nitrogen levels and wheat yields, respectively, then
\[
\sum_{i=1}^7 x_i = 700, \quad
\sum_{i=1}^7 y_i = 171.3, \quad
\sum_{i=1}^7 x_i y_i = 18{,}624, \quad
\sum_{i=1}^7 x_i^2 = 81{,}200, \quad \text{and} \quad
\sum_{i=1}^7 y_i^2 = 4398.19.
\]
- So
\[
r = \frac{18{,}624 - \frac{(700)(171.3)}{7}}{\sqrt{81{,}200 - \frac{(700)^2}{7}}\,\sqrt{4398.19 - \frac{(171.3)^2}{7}}} \approx 0.9830.
\]
Example (cont’d)
- If the nitrogen levels are in a vector x and the wheat yields are in a vector y, then the R command
> cor(x, y)
returns $r$, in this case 0.9830173.
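The arithmetic in this example can also be reproduced without R. The Python sketch below recomputes the five sums from the wheat data and then applies the sums form of the sample correlation coefficient. The variable names are illustrative, and only the standard library is used.

```python
import math

x = [40, 60, 80, 100, 120, 140, 160]            # nitrogen per acre
y = [15.9, 18.8, 21.6, 25.2, 28.7, 30.4, 30.7]  # yield, cwt per acre
n = len(x)

# The five sums quoted in the example.
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_xx = sum(a * a for a in x)
sum_yy = sum(b * b for b in y)

# Sample correlation coefficient, sums form.
r = (sum_xy - sum_x * sum_y / n) / (
    math.sqrt(sum_xx - sum_x ** 2 / n) * math.sqrt(sum_yy - sum_y ** 2 / n)
)

print(f"sums: {sum_x}, {sum_y:.1f}, {sum_xy:.0f}, {sum_xx}, {sum_yy:.2f}")
print(f"r = {r:.7f}")
```

Up to floating-point rounding, the printed value of `r` should agree with the 0.9830173 reported by R's `cor`.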