1 The Vision Thing Power Thirteen Bivariate Normal Distribution

1

The Vision Thing

Power Thirteen

Bivariate Normal Distribution

2

Outline

• Circles around the origin

• Circles translated from the origin

• Horizontal ellipses around the (translated) origin

• Vertical ellipses around the (translated) origin

• Sloping ellipses

3

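[Figure: isodensity circles centered on the origin; axes x and y. $\mu_x = 0$, $\sigma_x^2 = 1$; $\mu_y = 0$, $\sigma_y^2 = 1$; $\rho_{x,y} = 0$.]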

4

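[Figure: isodensity circles centered on the point (a, b); axes x and y, with a marked on the x axis and b on the y axis. $\mu_x = a$, $\sigma_x^2 = 1$; $\mu_y = b$, $\sigma_y^2 = 1$; $\rho_{x,y} = 0$.]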

5

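[Figure: horizontal isodensity ellipses centered on the origin; axes x and y. $\mu_x = 0$, $\sigma_x^2 > \sigma_y^2$; $\mu_y = 0$; $\rho_{x,y} = 0$.]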

6

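[Figure: vertical isodensity ellipses centered on the origin; axes x and y. $\mu_x = 0$, $\sigma_x^2 < \sigma_y^2$; $\mu_y = 0$; $\rho_{x,y} = 0$.]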

7

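[Figure: upward-sloping isodensity ellipses centered on the point (a, b); axes x and y, with a marked on the x axis and b on the y axis. $\mu_x = a$, $\sigma_x^2 > \sigma_y^2$; $\mu_y = b$; $\rho_{x,y} > 0$.]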

8

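[Figure: downward-sloping isodensity ellipses centered on the point (a, b); axes x and y, with a marked on the x axis and b on the y axis. $\mu_x = a$, $\sigma_x^2 > \sigma_y^2$; $\mu_y = b$; $\rho_{x,y} < 0$.]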

9

Why? The Bivariate Normal Density and Circles

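• $f(x, y) = \dfrac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \exp\left\{ -\dfrac{1}{2(1-\rho^2)} \left[ \left(\dfrac{x-\mu_x}{\sigma_x}\right)^2 - 2\rho \left(\dfrac{x-\mu_x}{\sigma_x}\right)\left(\dfrac{y-\mu_y}{\sigma_y}\right) + \left(\dfrac{y-\mu_y}{\sigma_y}\right)^2 \right] \right\}$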

• If means are zero and the variances are one and no correlation, then

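• $f(x, y) = \dfrac{1}{2\pi} \exp\left\{-\tfrac{1}{2}(x^2 + y^2)\right\}$, where $f(x, y) = k$, a constant, along an isodensity line

• $\ln(2\pi k) = -\tfrac{1}{2}(x^2 + y^2)$, so $x^2 + y^2 = -2\ln(2\pi k) = r^2$, a circle of radius $r$ centered on the origin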
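As a numerical check on the circle result, the sketch below evaluates the bivariate normal density and confirms that points at radius r, with r squared equal to -2 ln(2 pi k), all have density k in the standard uncorrelated case. The function names and the level k = 0.05 are illustrative assumptions, not part of the lecture.

```python
import math

def bivariate_normal_pdf(x, y, mu_x=0.0, mu_y=0.0,
                         sigma_x=1.0, sigma_y=1.0, rho=0.0):
    """Bivariate normal density f(x, y); requires -1 < rho < 1."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    one_minus_rho2 = 1.0 - rho ** 2
    norm = 2.0 * math.pi * sigma_x * sigma_y * math.sqrt(one_minus_rho2)
    quad = (zx ** 2 - 2.0 * rho * zx * zy + zy ** 2) / one_minus_rho2
    return math.exp(-0.5 * quad) / norm

def isodensity_radius(k):
    """Radius r of the circle f(x, y) = k when means are 0, variances 1, rho 0."""
    return math.sqrt(-2.0 * math.log(2.0 * math.pi * k))

k = 0.05  # hypothetical density level; must lie below the peak 1/(2*pi)
r = isodensity_radius(k)

# Points at several angles on the circle of radius r.
angles = (0.0, 1.0, 2.5, 4.0)
densities = [bivariate_normal_pdf(r * math.cos(t), r * math.sin(t))
             for t in angles]
```

Each entry of `densities` should equal k up to floating-point error, whatever angles are chosen.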

10

Ellipses

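• If $\sigma_x^2 > \sigma_y^2$ and $\rho = 0$, then $f(x, y) = \dfrac{1}{2\pi\sigma_x\sigma_y} \exp\left\{-\tfrac{1}{2}\left[\left(\dfrac{x-\mu_x}{\sigma_x}\right)^2 + \left(\dfrac{y-\mu_y}{\sigma_y}\right)^2\right]\right\}$; define $x^* = x - \mu_x$ and $y^* = y - \mu_y$

• $f(x, y) = \dfrac{1}{2\pi\sigma_x\sigma_y} \exp\left\{-\tfrac{1}{2}\left[\left(\dfrac{x^*}{\sigma_x}\right)^2 + \left(\dfrac{y^*}{\sigma_y}\right)^2\right]\right\}$, where $f(x, y) = k$, a constant, so $\ln(2\pi\sigma_x\sigma_y k) = -\tfrac{1}{2}\left[\left(\dfrac{x^*}{\sigma_x}\right)^2 + \left(\dfrac{y^*}{\sigma_y}\right)^2\right]$. Dividing through by the left-hand side gives $\dfrac{x^{*2}}{c^2} + \dfrac{y^{*2}}{d^2} = 1$, with $c^2 = -2\sigma_x^2\ln(2\pi\sigma_x\sigma_y k)$ and $d^2 = -2\sigma_y^2\ln(2\pi\sigma_x\sigma_y k)$, which is an ellipse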
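The ellipse can be checked the same way. The sketch below computes the semi-axes c and d implied by a density level k in the uncorrelated case and confirms that points on that ellipse share the density k. The parameter values and function names are hypothetical, chosen only so that the variance of x exceeds the variance of y.

```python
import math

def uncorrelated_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y):
    """Bivariate normal density with rho = 0."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    return math.exp(-0.5 * (zx ** 2 + zy ** 2)) / (2.0 * math.pi * sigma_x * sigma_y)

def ellipse_semi_axes(k, sigma_x, sigma_y):
    """Semi-axes (c, d) of the isodensity ellipse f(x, y) = k."""
    log_term = math.log(2.0 * math.pi * sigma_x * sigma_y * k)  # negative below the peak
    scale = math.sqrt(-2.0 * log_term)
    return sigma_x * scale, sigma_y * scale

# Hypothetical parameters with sigma_x > sigma_y (a horizontal ellipse).
mu_x, mu_y, sigma_x, sigma_y = 1.0, 2.0, 2.0, 1.0
k = 0.02  # must lie below the peak 1/(2*pi*sigma_x*sigma_y)

c, d = ellipse_semi_axes(k, sigma_x, sigma_y)

# Points on the ellipse x*^2/c^2 + y*^2/d^2 = 1, shifted back to (mu_x, mu_y).
angles = (0.0, 0.7, 2.0, 3.5, 5.0)
densities = [uncorrelated_pdf(mu_x + c * math.cos(t), mu_y + d * math.sin(t),
                              mu_x, mu_y, sigma_x, sigma_y)
             for t in angles]
```

The ratio c/d equals the ratio of the standard deviations, so the ellipse is stretched along the axis of the variable with the larger variance.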

11

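Correlation and Rotation of the Axes

[Figure: sloping isodensity ellipse centered on the origin, with rotated axes X' and Y' drawn through it; original axes x and y. $\mu_x = 0$, $\sigma_x^2 < \sigma_y^2$; $\mu_y = 0$; $\rho_{x,y} < 0$.]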
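One way to make the rotation concrete is to find the angle at which the rotated coordinates X' and Y' are uncorrelated. The sketch below uses the standard result that this angle is half of atan2(2 cov(x, y), var(x) - var(y)); the lecture does not state this formula, so treat it, the function names, and the parameter values as assumptions added for illustration.

```python
import math

def rotation_angle(sigma_x, sigma_y, rho):
    """Angle theta (radians) of the X' axis that removes the correlation."""
    cov_xy = rho * sigma_x * sigma_y
    return 0.5 * math.atan2(2.0 * cov_xy, sigma_x ** 2 - sigma_y ** 2)

def rotated_moments(sigma_x, sigma_y, rho, theta):
    """Var(X'), Var(Y'), Cov(X', Y') for
    X' = x cos(theta) + y sin(theta), Y' = -x sin(theta) + y cos(theta)."""
    cov_xy = rho * sigma_x * sigma_y
    s, c = math.sin(theta), math.cos(theta)
    var_xp = sigma_x ** 2 * c * c + 2.0 * cov_xy * s * c + sigma_y ** 2 * s * s
    var_yp = sigma_x ** 2 * s * s - 2.0 * cov_xy * s * c + sigma_y ** 2 * c * c
    cov_p = (sigma_y ** 2 - sigma_x ** 2) * s * c + cov_xy * (c * c - s * s)
    return var_xp, var_yp, cov_p

# Hypothetical values matching the figure's case: var(x) < var(y), rho < 0.
sigma_x, sigma_y, rho = 1.0, 2.0, -0.6

theta = rotation_angle(sigma_x, sigma_y, rho)
var_xp, var_yp, cov_p = rotated_moments(sigma_x, sigma_y, rho, theta)
```

In the rotated coordinates the covariance vanishes, so the sloping ellipse becomes an unsloped ellipse in X' and Y', and the total variance is unchanged by the rotation.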

12

Bivariate Normal: marginal & conditional

• If x and y are independent, then f(x,y) = f(x) f(y), i.e. the product of the marginal distributions, f(x) and f(y)

• The conditional density function, the density of y conditional on x, f(y/x) is the joint density function divided by the marginal density function of x: f(y/x) = f(x, y)/f(x)

Conditional Distribution

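• $f(y/x) = \dfrac{1}{\sigma_y\sqrt{2\pi(1-\rho^2)}} \exp\left\{-\dfrac{1}{2\sigma_y^2(1-\rho^2)} \left[y - \mu_y - \rho(x-\mu_x)\dfrac{\sigma_y}{\sigma_x}\right]^2\right\}$

• The mean of the conditional distribution is $\mu_y + \rho(x - \mu_x)(\sigma_y/\sigma_x)$, i.e. the expected value of y for a given value of x, $x = x^*$:

• $E(y/x = x^*) = \mu_y + \rho(x^* - \mu_x)(\sigma_y/\sigma_x)$

• The variance of the conditional distribution is $VAR(y/x = x^*) = \sigma_y^2(1-\rho^2)$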
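The conditional density can be checked against its definition, f(y/x) = f(x, y)/f(x). The sketch below compares that ratio with a univariate normal density whose mean and variance are the conditional mean and variance. All parameter values and function names are hypothetical.

```python
import math

def normal_pdf(z, mean, var):
    """Univariate normal density with the given mean and variance."""
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def joint_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Bivariate normal density f(x, y)."""
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    one_minus_rho2 = 1.0 - rho ** 2
    norm = 2.0 * math.pi * sigma_x * sigma_y * math.sqrt(one_minus_rho2)
    quad = (zx ** 2 - 2.0 * rho * zx * zy + zy ** 2) / one_minus_rho2
    return math.exp(-0.5 * quad) / norm

def conditional_moments(x_star, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Mean and variance of y given x = x_star."""
    mean = mu_y + rho * (x_star - mu_x) * (sigma_y / sigma_x)
    var = sigma_y ** 2 * (1.0 - rho ** 2)
    return mean, var

# Hypothetical parameter values for illustration only.
params = dict(mu_x=1.0, mu_y=2.0, sigma_x=2.0, sigma_y=1.5, rho=0.6)
x_star, y = 2.5, 1.0

cond_mean, cond_var = conditional_moments(x_star, **params)

# Definition: joint density divided by the marginal density of x.
ratio = joint_pdf(x_star, y, **params) / normal_pdf(
    x_star, params["mu_x"], params["sigma_x"] ** 2)

# Closed form: normal density with the conditional mean and variance.
direct = normal_pdf(y, cond_mean, cond_var)
```

The two values `ratio` and `direct` should agree up to floating-point error. Note that the conditional variance does not depend on x_star.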

14

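[Figure: upward-sloping isodensity ellipse centered on the point (a, b), with the regression line drawn through it; axes x and y. $\mu_x = a$, $\sigma_x^2 > \sigma_y^2$; $\mu_y = b$; $\rho_{x,y} > 0$.]

Regression line

– intercept: $\mu_y - \rho\mu_x(\sigma_y/\sigma_x)$

– slope: $\rho(\sigma_y/\sigma_x)$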

15

Bivariate Regression: Another Perspective

• Regression line is the E(y/x) line if y and x are bivariate normal

– intercept: $\mu_y - \rho\mu_x(\sigma_y/\sigma_x)$

– slope: $\rho(\sigma_y/\sigma_x)$

16

Example: Lab Six

[Figure: histogram of the rate of return to GE stock; horizontal axis from -0.05 to 0.10, vertical axis (frequency) from 0 to 6]

Series: GE
Sample: 1993:01 1996:12
Observations: 48

Mean         0.022218
Median       0.019524
Maximum      0.117833
Minimum     -0.058824
Std. Dev.    0.043669
Skewness     0.064629
Kurtosis     2.231861

Jarque-Bera  1.213490
Probability  0.545122

Rate of Return to GE stock

17

Example: Lab Six

[Figure: histogram of the rate of return to the S&P 500 index; horizontal axis from -0.04 to 0.08, vertical axis (frequency) from 0 to 12]

Series: INDEX
Sample: 1993:01 1996:12
Observations: 48

Mean         0.014361
Median       0.017553
Maximum      0.076412
Minimum     -0.044581
Std. Dev.    0.025430
Skewness    -0.453474
Kurtosis     3.222043

Jarque-Bera  1.743715
Probability  0.418174

Rate of Return to S&P500 Index
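The Jarque-Bera statistics reported for GE and INDEX can be reproduced from the skewness, kurtosis, and number of observations. The sketch below assumes the usual form of the statistic, JB = (n/6)[S^2 + (K - 3)^2/4], and a chi-square reference distribution with 2 degrees of freedom, whose upper-tail probability is exp(-JB/2). The function name is illustrative.

```python
import math

def jarque_bera(n, skewness, kurtosis):
    """Jarque-Bera statistic and its chi-square(2) upper-tail probability."""
    jb = (n / 6.0) * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # survival function of chi-square with 2 d.f.
    return jb, p_value

# Skewness and kurtosis as reported for each series, 48 observations each.
jb_ge, p_ge = jarque_bera(48, 0.064629, 2.231861)
jb_index, p_index = jarque_bera(48, -0.453474, 3.222043)
```

Both probabilities are well above conventional significance levels, so normality is not rejected for either return series, which is what makes the bivariate normal a reasonable working model here.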

18

Correlation Matrix

            GE          INDEX
GE       1.000000    0.636290
INDEX    0.636290    1.000000

19

Bivariate Regression: Another Perspective

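• Regression line is the E(y/x) line if y and x are bivariate normal

– intercept: $\mu_y - \rho\mu_x(\sigma_y/\sigma_x)$

– slope: $\rho(\sigma_y/\sigma_x)$

$\mu_y = 0.022218$ (GE)

$\mu_x = 0.014361$ (INDEX)

$\rho(\sigma_y/\sigma_x) = 0.636290 \times (0.043669/0.025430) \approx 1.09$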

– intercept = 0.0064

– slope = 1.094
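The intercept and slope can be recomputed from the Lab Six summary statistics, taking y as the GE return and x as the INDEX return. The sketch below takes the slope of y on x to be rho times the ratio of the standard deviation of y to that of x. The function name is an assumption, and small differences from the values on the slide may arise from rounding.

```python
def regression_from_moments(mu_x, mu_y, sigma_x, sigma_y, rho):
    """Intercept and slope of the E(y/x) line for a bivariate normal."""
    slope = rho * sigma_y / sigma_x
    intercept = mu_y - slope * mu_x
    return intercept, slope

# Means, standard deviations, and correlation reported for INDEX (x) and GE (y).
intercept, slope = regression_from_moments(
    mu_x=0.014361, mu_y=0.022218,
    sigma_x=0.025430, sigma_y=0.043669,
    rho=0.636290,
)
```

The resulting values land close to the 0.0064 and 1.094 shown on the slide. A slope above one says that GE returns moved more than one-for-one with the index over this sample.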

20

[Figure: scatter plot of GE returns (axis from -0.10 to 0.15) against INDEX returns (axis from -0.05 to 0.10)]

Returns Generating Process For GE Stock and S&P 500 Index

21

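[Figure not recovered: estimates annotated for comparison with the values computed above, intercept vs. 0.0064 and slope vs. 1.094]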

22

Bivariate Normal Distribution and the Linear Probability Model

23

[Figure: isodensity ellipses for two populations, Players and Non-Players; axes income and education, with mean income and mean education marked for each group. $\mu_x = a$, $\sigma_x^2 > \sigma_y^2$; $\mu_y = b$; $\rho_{x,y} > 0$.]

24

[Figure: isodensity ellipses for two populations, Players and Non-Players; axes income and education, with mean income and mean education marked for each group. $\mu_x = a$, $\sigma_x^2 > \sigma_y^2$; $\mu_y = b$; $\rho_{x,y} > 0$.]

25

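[Figure: isodensity ellipses for two populations, Players and Non-Players, separated by a discriminating line; axes income and education, with mean income and mean education marked. $\mu_x = a$, $\sigma_x^2 > \sigma_y^2$; $\mu_y = b$; $\rho_{x,y} > 0$.]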

26

Discriminant Function, Linear Probability Function, and Decision Theory, Lab 6

• Expected Costs of Misclassification

– E(C) = C(P/N)P(P/N)P(N) + C(N/P)P(N/P)P(P)

• Assume C(P/N) = C(N/P)

• Relative Frequencies: P(N) = 23/100 ≈ 1/4, P(P) = 77/100 ≈ 3/4

• Equalize the two costs of misclassification by setting the fitted value of P(P/N), i.e. Bern, to 3/4

– E(C) = C(P/N)(3/4)(1/4) + C(N/P)(1/4)(3/4)
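The expected cost formula is simple enough to evaluate directly. The sketch below computes the two terms of E(C) with equal misclassification costs and the rounded frequencies 1/4 and 3/4. The unit cost of 1.0 and the function name are assumptions made only for illustration.

```python
def expected_cost_terms(cost_p_given_n, cost_n_given_p,
                        prob_p_given_n, prob_n_given_p,
                        prior_n, prior_p):
    """Return the two terms of E(C) = C(P/N)P(P/N)P(N) + C(N/P)P(N/P)P(P)."""
    term_non_players = cost_p_given_n * prob_p_given_n * prior_n
    term_players = cost_n_given_p * prob_n_given_p * prior_p
    return term_non_players, term_players

unit_cost = 1.0  # hypothetical common cost, C(P/N) = C(N/P)

term_n, term_p = expected_cost_terms(
    cost_p_given_n=unit_cost, cost_n_given_p=unit_cost,
    prob_p_given_n=3 / 4, prob_n_given_p=1 / 4,
    prior_n=1 / 4, prior_p=3 / 4,
)
total = term_n + term_p
```

With these inputs the two terms are equal, which is the sense in which the cutoff of 3/4 equalizes the two expected costs.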

27

[Figure: isodensity ellipses for two populations, Players and Non-Players, separated by a discriminating line; axes income and education, with mean income and mean education marked. $\mu_x = a$, $\sigma_x^2 > \sigma_y^2$; $\mu_y = b$; $\rho_{x,y} > 0$.]

Note: P(P/N) is the area of the non-players distribution below (southwest of) the line

28

Set Bern = 3/4 = 1.39 - 0.0216*education - 0.0105*income, solve for education as a function of income, i.e. education = (1.39 - 0.75 - 0.0105*income)/0.0216, and plot
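Solving the fitted linear probability function for education gives the discriminating line explicitly. The sketch below uses the coefficients from the fitted equation; the income values are hypothetical placeholders in whatever units the lab data use, and the function names are illustrative.

```python
def bern(education, income):
    """Fitted linear probability function from the lab."""
    return 1.39 - 0.0216 * education - 0.0105 * income

def education_on_line(income, cutoff=0.75):
    """Education level at which the fitted value equals the cutoff."""
    return (1.39 - cutoff - 0.0105 * income) / 0.0216

# Hypothetical income values, used only to trace out the line.
incomes = [0.0, 20.0, 40.0]
line = [(inc, education_on_line(inc)) for inc in incomes]

# The same line with a cutoff of 0.5 instead of 0.75.
line_half = [(inc, education_on_line(inc, cutoff=0.5)) for inc in incomes]
```

With the 0.75 cutoff the line is approximately education = 29.63 - 0.486*income. Lowering the cutoff to 0.5 keeps the slope and raises the intercept, which shifts the line and changes who falls on each side of it.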

29

7 non-players misclassified, as well as 14 players misclassified

30

31

Decision Theory

• Moving the discriminant line, i.e. changing the cutoff value from 0.75 to 0.5, changes the number misclassified in each group, favoring one population at the expense of the other

• You need an implicit or explicit notion of the costs of misclassification, such as C(P/N) and C(N/P), to judge where to draw the line
