Last Time

• Interpretation of Confidence Intervals

• Handling unknown μ and σ

• T Distribution

• Compute with TDIST & TINV

(Recall different organization)

(relative to NORMDIST & NORMINV)

Reading In Textbook

Approximate Reading for Today’s Material:

Pages 420-427, 86-94

Approximate Reading for Next Class:

Pages 101-105, 447-465, 511-516

Deeper look at Inference

Recall: “inference” = CIs and Hypo Tests

Main Issue: In sampling distribution

$\bar{X} - \mu \sim N(0, \sigma/\sqrt{n})$

Usually σ is unknown, so replace with an estimate, s.

For n large, should be “OK”, but what about:

• n small?

• How large is n “large”?

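Unknown SD

Then $\bar{X} \sim N(\mu, \sigma/\sqrt{n})$

So can write: $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$

Replace $\sigma$ by $s$, then $\frac{\bar{X} - \mu}{s/\sqrt{n}}$ has a distribution named:

“t-distribution with n-1 degrees of freedom”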

t - Distribution

Notes:

4. Calculate t probs (e.g. areas & cutoffs),

using TDIST & TINV

Caution: these are set up differently from NORMDIST & NORMINV

EXCEL Functions

Summary:

Normal:

NORMDIST: plug in cutoff, get out area

NORMINV: plug in area, get out cutoff

(but TDIST is set up really differently)

EXCEL Functions

t distribution (Area: 2 tail):

TDIST: plug in cutoff, get out area

TINV: plug in area, get out cutoff

(EXCEL note: this one has the inverse)
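For readers working outside Excel, the cutoff-to-area and area-to-cutoff conventions above can be mimicked in Python. The following is a minimal sketch using only the standard library. The helper names `t_pdf`, `t_upper_tail`, `tdist` and `tinv` are hypothetical, and the numerical integration (Simpson's rule) plus bisection is an assumption chosen for simplicity, not a description of how Excel computes these values. It is adequate for small examples like the ones in these notes, not for extreme tail areas.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(x: float, df: float, steps: int = 2000) -> float:
    """P(T > x) for x >= 0: one half minus the area on [0, x] (Simpson's rule)."""
    h = x / steps  # steps is even, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def tdist(x: float, df: float, tails: int) -> float:
    """Rough analog of TDIST: plug in cutoff, get out area (x must be >= 0)."""
    if x < 0:
        raise ValueError("x must be non-negative, as in TDIST")
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return tails * t_upper_tail(x, df)


def tinv(area: float, df: float) -> float:
    """Rough analog of TINV: plug in 2-tailed area, get out positive cutoff."""
    lo, hi = 0.0, 1.0
    while tdist(hi, df, 2) > area:  # widen until the cutoff is bracketed
        hi *= 2
    for _ in range(60):  # bisection on the 2-tailed area
        mid = (lo + hi) / 2
        if tdist(mid, df, 2) > area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Here `tinv(0.05, 3)` plays the role of TINV(0.05, 3), and `tdist(x, 3, 2)` the role of TDIST(x, 3, 2). Where SciPy is available, `scipy.stats.t.sf` and `scipy.stats.t.ppf` cover the same ground.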

t - Distribution

Application 1: Confidence Intervals

Recall: $\bar{X} \pm m$

m = margin of error, from NORMINV or CONFIDENCE

Using TINV? Careful: need to standardize

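t - Distribution

Using TINV? Careful: need to standardize

$0.95 = P\{\mu \text{ covered by } [\bar{X} - m, \bar{X} + m]\}$

$= P\{\bar{X} - m \le \mu \le \bar{X} + m\}$

$= P\{|\bar{X} - \mu| \le m\}$ (# spaces on number line)

$= P\left\{\frac{|\bar{X} - \mu|}{s/\sqrt{n}} \le \frac{m}{s/\sqrt{n}}\right\}$

Need to work into a form that can use TINV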

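t - Distribution

$0.95 = P\left\{\frac{|\bar{X} - \mu|}{s/\sqrt{n}} \le \frac{m}{s/\sqrt{n}}\right\}$

where $\frac{\bar{X} - \mu}{s/\sqrt{n}}$ has the t distribution (n - 1 degrees of freedom)

So want: $\frac{m}{s/\sqrt{n}} = TINV(0.05, n-1)$

i.e. want: $m = TINV(0.05, n-1) \cdot \frac{s}{\sqrt{n}}$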

t - Distribution

Class Example 15, Part I

http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg15.xls

Old textbook problem 7.24:

In a study of DDT poisoning, researchers fed several rats a measured amount. They measured the “absolutely refractory period” required for a nerve to recover after a stimulus. Measurements on 4 rats gave:

1.6 1.7 1.8 1.9

a) Find the mean refractory period, and the standard error of the mean

b) Give a 95% CI for the mean “absolutely refractory period” for all rats of this strain

t - Distribution

Class Example 15, Part I

Data in cells B9:E9

Note: small sample size (n = 4),

population sd, σ, unknown,

so use sample sd, s,

and t distribution

t - Distribution

Class Example 15, Part I

Data in cells B9:E9

Center CI at Sample Mean

Measure Sample Spread by S. D.

Divide by $\sqrt{n}$ to get Standard Error, which answers (a)

t - Distribution

Class Example 15, Part I (b) 95% CI for μ

Data in cells B9:E9

CI Radius = Margin of Error

Compute using TINV

Recall: d.f. = n – 1 = 4 – 1

Compare to old Normal CIs

Compute using CONFIDENCE

T CIs are wider (as expected)

Left End

Right End
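The spreadsheet steps for parts (a) and (b) can be retraced in Python as a rough cross-check. This is a sketch under stated assumptions, not the contents of the class spreadsheet: the variable names are hypothetical, and the t cutoff is hard-coded as the standard table value corresponding to TINV(0.05, 3) rather than computed. The Normal margin mirrors what CONFIDENCE would give if s were treated as σ.

```python
import math
import statistics

data = [1.6, 1.7, 1.8, 1.9]  # refractory periods for the 4 rats
n = len(data)

x_bar = statistics.mean(data)   # center of the CI
s = statistics.stdev(data)      # sample sd (divides by n - 1)
std_err = s / math.sqrt(n)      # standard error, answers (a)

# Assumption: table value standing in for TINV(0.05, n - 1) with n - 1 = 3
T_CUTOFF = 3.182446
m_t = T_CUTOFF * std_err        # CI radius = margin of error
ci_t = (x_bar - m_t, x_bar + m_t)

# Old Normal CI for comparison: cutoff from the standard Normal
z_cutoff = statistics.NormalDist().inv_cdf(0.975)
m_normal = z_cutoff * std_err

print(f"mean = {x_bar:.3f}, standard error = {std_err:.4f}")
print(f"t margin = {m_t:.4f}, Normal margin = {m_normal:.4f}")
print(f"95% t CI = [{ci_t[0]:.3f}, {ci_t[1]:.3f}]")
```

The t margin comes out larger than the Normal margin, which is the “T CIs are wider” comparison made above.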

t - Distribution

Confidence Interval HW:

7.24 (a. Q-Q roughly linear, so OK, b. 43.17, 4.41, 0.987 c. [41.1, 45.2])

7.25

And now for something completely different

An extreme “sport” video:

t - Distribution

Application 2: Hypothesis Tests

Idea: Calculate P-values using TDIST

t – Distribution Hypo Testing

E.g. Old Textbook Example 7.26

For the above DDT poisoning example

Recall Data in cells B9:E9

As above: t – distribution appropriate

(small sample, and using s ≈ σ)

t – Distribution Hypo Testing

E.g. Old Textbook Example 7.26

For the above DDT poisoning example, suppose that the mean “absolutely refractory period” is known to be 1.3 (recall observed $\bar{X} = 1.75$ in data). DDT poisoning should slow nerve recovery, and so increase this period. Do the data give good evidence for this supposition?

t – Distribution Hypo Testing

E.g. Old Textbook Example 7.26

Let μ = population mean absolutely refractory period for poisoned rats.

$H_0: \mu = 1.3$

$H_A: \mu > 1.3$ (checking strong evidence for this)

$\bar{X} = 1.75$ (from before)

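t – Distribution Hypo Testing

E.g. Old Textbook Example 7.26

P-value = P{what saw or more conclusive | H0 – HA Bdry}

$= P\{\bar{X} \ge 1.75 \mid \mu = 1.3\}$

$= P\left\{\frac{\bar{X} - 1.3}{s/\sqrt{n}} \ge \frac{1.75 - 1.3}{s/\sqrt{n}} \,\Big|\, \mu = 1.3\right\}$

$= P\left\{t_3 \ge \frac{1.75 - 1.3}{s/\sqrt{n}}\right\}$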

t – Distribution Hypo Testing

E.g. Old Textbook Example 7.26

P-value = P{what saw or more conclusive | H0 – HA Bdry} $= P\left\{t_3 \ge \frac{1.75 - 1.3}{s/\sqrt{n}}\right\}$

Now use TDIST

Degrees of Freedom = n – 1 = 4 – 1

Tails = 1 (one-sided alternative)

t – Distribution Hypo Testing

E.g. Old Textbook Example 7.26

From Class Example 15, part 2:

http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg15.xls

P - value = 0.003

Interpretation: very strong evidence, for either yes-no or gray-level
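The P-value calculation for this example can be sketched in Python as a cross-check on the spreadsheet. The helper `t_upper_tail` is hypothetical and uses Simpson's rule on the t density (an assumption made to stay within the standard library); it stands in for the 1-tailed use of TDIST and is not how Excel evaluates it.

```python
import math
import statistics


def t_upper_tail(x: float, df: float, steps: int = 2000) -> float:
    """P(T > x) for x >= 0 under Student's t, by Simpson's rule on [0, x]."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)

    def pdf(u: float) -> float:
        return coef * (1 + u * u / df) ** (-(df + 1) / 2)

    h = x / steps  # steps is even, as Simpson's rule requires
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3


data = [1.6, 1.7, 1.8, 1.9]  # the 4 rat measurements
mu_0 = 1.3                   # value on the H0 - HA boundary
n = len(data)

std_err = statistics.stdev(data) / math.sqrt(n)
t_stat = (statistics.mean(data) - mu_0) / std_err  # standardized "what saw"
p_value = t_upper_tail(t_stat, n - 1)              # upper tail, d.f. = n - 1

print(f"t statistic = {t_stat:.2f}, P-value = {p_value:.4f}")
```

The result should agree, after rounding, with the P-value of 0.003 reported above.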

t – Distribution Hypo Testing

Variations:

• For “opposite direction” hypotheses: $H_A: \mu < \ldots$

P-value = $P\{t \le \ldots\}$ [wrong way for TDIST(…,1)]

Then use symmetry, i.e. put the negative of the cutoff into TDIST.

t – Distribution Hypo Testing

Variations:

• For 2-sided hypotheses:

H0: μ =

H1: μ ≠

Use 2-tailed version of TDIST, i.e. TDIST(…,2)
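The three variations (upper, “opposite direction”, and 2-sided) differ only in which tail area is reported. A minimal Python sketch of that bookkeeping follows. The function names and the `alternative` labels are hypothetical, and the tail area is again obtained by Simpson's rule as a standard-library stand-in for TDIST.

```python
import math


def t_upper_tail(x: float, df: float, steps: int = 2000) -> float:
    """P(T > x) for x >= 0 under Student's t, by Simpson's rule on [0, x]."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)

    def pdf(u: float) -> float:
        return coef * (1 + u * u / df) ** (-(df + 1) / 2)

    h = x / steps  # steps is even, as Simpson's rule requires
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3


def t_p_value(t_stat: float, df: float, alternative: str) -> float:
    """P-value for a t statistic; alternative is 'greater', 'less' or 'two-sided'."""
    if alternative not in ("greater", "less", "two-sided"):
        raise ValueError("unknown alternative")
    # Symmetry: only the area beyond |t| is ever computed, as with TDIST
    tail = t_upper_tail(abs(t_stat), df)
    if alternative == "two-sided":
        return 2 * tail  # like TDIST(|t|, df, 2)
    # One-sided: the tail is the P-value only when t points toward HA
    toward_ha = t_stat > 0 if alternative == "greater" else t_stat < 0
    return tail if toward_ha else 1 - tail
```

For example, `t_p_value(-2.5, 3, "less")` and `t_p_value(2.5, 3, "greater")` return the same area, which is the symmetry argument in code form.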

t – Distribution Hypo Testing

HW: Interpret P-values:

(i) yes-no

(ii) gray-level

7.21e ((i) significant, (ii) significant, but not very strongly so)

7.22e (0.0619, (i) not significant, (ii) not sig., but nearly significant)

Research Corner

Another SiZer analysis:

Mollusk Extinction Data

From Matthew Campbell, UNC Master’s Student in Geological Sciences

Data points:

• Fossilized shells (fossil beds up and down Eastern Seaboard)

• Dated (by fossil bed)

• Biologically categorized (family – genus – species, etc.)

Goal: study extinctions over long periods (via the last time each was seen)

Research Corner

Another SiZer analysis:

Mollusk Extinction Data

Oversmoothed: nothing interesting

Undersmoothed: many bumps appear, but not statistically significant

Intermediate Smoothing: two bumps appear

SiZer result: not statistically significant

Research Corner

Another SiZer analysis:

Mollusk Extinction Data

Matthew’s Comment: Whoa, those are times of mass extinctions

Any way to show these are “really there”?

Standard Answer: Get more data

Challenge: Took 100s of years to get these!

Alternate Approach: Refined from Genus level to Species level

Research Corner

Another SiZer analysis:

Mollusk Extinction Data

Species level result: Now both bumps are significant

Consistent with Global Climatic Events

Variable Relationships

Chapter 2 in Text

Idea: Look beyond single quantities, to how quantities relate to each other.

E.g. How do HW scores “relate” to Exam scores?

Section 2.1: Useful graphical device: Scatterplot

Plotting Bivariate Data

Toy Example: Ordered pairs

(1,2)

(3,1)

(-1,0)

(2,-1)

Captures relationship between X & Y, as (X,Y)

e.g. (height, weight)

e.g. (MT Score, Final Exam Score)

Think in terms of: X coordinates, Y coordinates

And plot in x,y plane, to see relationship
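To make the “X coordinates, Y coordinates” idea concrete without any plotting software, the toy pairs can be placed on a small text grid. This is only an illustrative sketch: the function `ascii_scatter` is hypothetical, it handles integer coordinates only, and the grid ranges are assumptions chosen to fit the four toy points.

```python
def ascii_scatter(points, x_range=(-2, 4), y_range=(-1, 2)):
    """Return text rows (largest y first); '*' marks a data point."""
    x_min, x_max = x_range
    y_min, y_max = y_range
    rows = []
    for y in range(y_max, y_min - 1, -1):
        # One cell per integer x value on this horizontal line
        cells = ["*" if (x, y) in points else "." for x in range(x_min, x_max + 1)]
        rows.append(f"{y:>3} " + " ".join(cells))
    return rows


# The toy ordered pairs (X, Y)
points = {(1, 2), (3, 1), (-1, 0), (2, -1)}

for row in ascii_scatter(points):
    print(row)
```

Each row is one y value and each column one x value, so a star in row y = 2, column x = 1 is the pair (1,2). A charting tool does the same thing with finer resolution.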

Plotting Bivariate Data

Toy Example:

(1,2)

(3,1)

(-1,0)

(2,-1)

[Figure: Toy Scatterplot, Separate Points (x axis from -2 to 4, y axis from -1.5 to 2.5)]

Plotting Bivariate Data

Sometimes: Can see more insightful patterns by connecting points

[Figure: Toy Scatterplot, Connected points (x axis from -2 to 4, y axis from -1.5 to 2.5)]

Plotting Bivariate Data

Sometimes: Useful to switch off points, and only look at lines/curves

[Figure: Toy Scatterplot, Lines Only (x axis from -2 to 4, y axis from -1.5 to 2.5)]

Plotting Bivariate Data

Common Name: “Scatterplot”

A look under the hood in Excel:

Insert Tab

Charts

Scatter Button

Choose Dots (but note other options)

Manipulate plot as done before for bar plots (e.g. titles, labels, colors, styles, …)

Scatterplot E.g.

Class Example 16:

http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg16.xls

Data from related Intro. Statistics Class (actual scores)

A. How does HW score predict Final Exam?

$x_i$ = HW, $y_i$ = Final Exam

(Study Relationship using scatterplot)

Scatterplot E.g.

Class Example 16:

How does HW score predict Final Exam?

$x_i$ = HW, $y_i$ = Final Exam (Scatterplot View)

i. In top half of HW scores: Better HW ⇒ Better Final

ii. For lower HW: Final is more “random”

Scatterplots

Common Terminology:

When thinking about “X causes Y”,

Call X the “Explanatory Var.” or “Indep. Var.”

Call Y the “Response Var.” or “Dep. Var.”

(think of “Y as function of X”)

(although not always sensible)

Scatterplots

Note: Sometimes think about causation,

Other times: “Explore Relationship”

HW: 2.9

Class Scores Scatterplots

http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg16.xls

B. How does HW predict Midterm 1?

$x_i$ = HW, $y_i$ = MT1 (Replace Final above with 1st Midterm)

i. Better HW ⇒ better Exam (general upwards tendency still the same)

ii. Wider range of MT1 scores (for each range of HW scores, relative to final scores)

iii. HW doesn’t predict MT1 (as well as HW predicted the final)

iv. “Outliers” in scatterplot may not be outliers in either individual variable

e.g. HW = 72, MT1 = 94 (bad HW, but good MT1?, fluke???)

Class Scores Scatterplots

http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg16.xls

C. How does MT1 predict MT2?

$x_i$ = MT1, $y_i$ = MT2 (Different choice of x and y, since studying different relationship)

(Study Relationship using tool of scatterplot)

i. Idea: less “causation”, more “exploration” (don’t expect better MT1 to lead to better MT2)

ii. High MT1 ⇒ High MT2 (again clear overall upwards trend)

iii. Wider range of MT2 (for each range of MT1), i.e. MT1 is “not good predictor” of MT2

iv. Interesting Outliers:

MT1 = 100, MT2 = 56 (oops!)

MT1 = 23, MT2 = 74 (woke up!)

MT1 70s, MT2 90s (moved up!)

And now for something completely different

A thought provoking movie clip:

http://www.aclu.org/pizza/

Important Aspects of Relations

I. Form of Relationship (Linear or not?)

II. Direction of Relationship (trending up or down?)

III. Strength of Relationship (how much of data “explained”?)

I. Form of Relationship

• Linear: Data approximately follow a line

Previous Class Scores Example

http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg16.xls

Final vs. High values of HW is “best”

But saw others with “rough linear trend”

Interesting question: Measure strength of linear trend

• Nonlinear: Data follow a different pattern (non-linear)

Nice Example: Bralower’s Fossil Data

http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg17.xls

Bralower’s Fossil Data: http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg17.xls

From T. Bralower, formerly of Geological Sci.

Studies Global Climate, millions of years ago:

• Small shells from ocean floor cores

• Ratios of Isotopes of Strontium

• Reflects Ice Ages, via Sea Level

(50 meter difference!)

• As function of time

• Clearly nonlinear relationship
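One rough way to check the form of a relationship, sketched in Python under stated assumptions: fit a straight line by least squares and look at what is left over (the residuals). For data that follow a line, the residuals are all near zero; for a curved pattern, they show a systematic shape. The data below are invented for illustration (they are not Bralower's measurements), and the helper names fit_line and residuals are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares line y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def residuals(xs, ys):
    """Observed y minus the fitted line, one value per data point."""
    a, b = fit_line(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Hypothetical data: one exactly linear pattern, one clearly curved pattern.
xs = list(range(11))
linear_ys = [2.0 + 0.5 * x for x in xs]
curved_ys = [(x - 5) ** 2 for x in xs]

linear_res = residuals(xs, linear_ys)
curved_res = residuals(xs, curved_ys)

print("largest linear residual:", max(abs(r) for r in linear_res))
print("curved residuals:", curved_res)
```

For the curved data the residuals are positive at both ends and negative in the middle, which is the kind of pattern that signals a nonlinear form.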

II. Direction of Relationship

• Positive Association

X bigger ⇒ Y bigger

E.g. Class Scores Data above

II. Direction of Relationship

• Positive Association

X bigger ⇒ Y bigger

• Negative Association

X bigger ⇒ Y smaller

E.g. X = alcohol consumption, Y = Driving Ability

Clear negative association

II. Direction of Relationship

• Positive Association

X bigger ⇒ Y bigger

• Negative Association

X bigger ⇒ Y smaller

Note: Concept doesn’t always apply:

Bralower’s Fossil Data
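A minimal Python sketch of the direction idea, assuming we summarize direction by the sign of the sum of cross-products of deviations from the means (the numerator of the sample covariance). This is one possible choice, not necessarily the method used in class. All data values, the function name association_direction, and the tolerance tol are hypothetical.

```python
def association_direction(xs, ys, tol=1e-12):
    """Classify direction by the sign of sum((x - mean_x) * (y - mean_y))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if cross > tol:
        return "positive"   # X bigger tends to go with Y bigger
    if cross < -tol:
        return "negative"   # X bigger tends to go with Y smaller
    return "unclear"        # the concept does not apply cleanly


# Hypothetical class scores: higher HW goes with higher Final.
hw = [60, 70, 80, 90, 100]
final = [55, 68, 74, 88, 95]

# Hypothetical alcohol consumption vs. driving ability score.
drinks = [0, 1, 2, 3, 4]
driving = [95, 85, 70, 50, 30]

# Hypothetical curved pattern that goes down and then up.
curve_x = [-2, -1, 0, 1, 2]
curve_y = [4, 1, 0, 1, 4]

print(association_direction(hw, final))
print(association_direction(drinks, driving))
print(association_direction(curve_x, curve_y))
```

The third data set illustrates the note above: when the pattern falls and then rises, neither "positive" nor "negative" describes it.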

III. Strength of Relationship

Idea: How close are points to lying on a line?

Revisit Class Scores Example: http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg16.xls

• Final Exam is “closely related to HW”

• Midterm 1 less closely related to HW

• Midterm 2 even less related to Midterm 1

Interesting Issue:

Measure this strength
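One common way to put a number on this strength is the sample correlation coefficient r, which lies between -1 and 1 and is near ±1 when points lie close to a line. The slide above only poses the question, so the Python sketch below is an assumption about one possible answer, not a statement of the method used in class. The data are invented (they are not taken from the class scores spreadsheet), and the function name correlation is hypothetical.

```python
import math


def correlation(xs, ys):
    """Sample correlation coefficient r of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: same x values, one tight and one loose linear pattern.
xs = [1, 2, 3, 4, 5, 6]
tight_ys = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]   # points close to a line
loose_ys = [3, 1, 5, 2, 6, 4]               # points scattered about a line

r_tight = correlation(xs, tight_ys)
r_loose = correlation(xs, loose_ys)

print("tight pattern r:", round(r_tight, 3))
print("loose pattern r:", round(r_loose, 3))
```

The tighter pattern gives r much closer to 1 than the looser one, matching the idea of "how close are points to lying on a line".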

Linear Relationship HW

HW:

2.11, 2.13, 2.15, 2.17