ma_chap5


  • 8/14/2019 ma_chap5

    Regression Diagnostics

    Regression diagnostics ask three questions:

    Are the assumptions of multiple regression satisfied?

    Is the model adequate?

    Are any data points unusual?

    Checking for Non-violation of Assumptions

    Linearity of the relationship between each X and Y can be checked with a scatter plot of Y against each X.

    Normality of the random variation in Y can be checked by plotting a histogram of the residuals.

    Independence of the explanatory variables from each other can be checked with a scatter matrix and the Variance Inflation Factor. The Durbin-Watson statistic checks a different kind of independence: whether successive residuals are correlated with each other.

    Diagnosis of Multi-collinearity

    Check by means of the correlation matrix.

    A significant F but non-significant t-ratios.

    Variance inflation: large changes in regression coefficients when variables are added or deleted.

    A Variance Inflation Factor (VIF) > 4 or 5 suggests multi-collinearity; VIF > 10 is strong evidence that collinearity is affecting the regression coefficients.

    The Durbin-Watson statistic ranges from 0 to 4, with values near 2 indicating no autocorrelation. It checks for serial correlation in the residuals rather than for collinearity among the predictors.
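    The VIF rule of thumb above can be illustrated with a minimal sketch. It assumes the special case of exactly two predictors, where the VIF of each predictor reduces to 1 / (1 - r^2), with r the correlation between the two. With more predictors, each one would be regressed on all the others instead. The function names and the data are made up for illustration.

```python
import math


def pearson_r(a, b):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    return sab / math.sqrt(saa * sbb)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two.

    Assumption: only two explanatory variables, so R^2 from regressing
    one on the other equals the squared correlation between them.
    """
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Made-up data: x2 tracks x1 closely, so the VIF is far above 10.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 2.3, 2.9, 4.2, 4.8, 6.1]
vif_collinear = vif_two_predictors(x1, x2)

# Made-up data: these two predictors are uncorrelated, so the VIF is 1.
vif_independent = vif_two_predictors([1.0, 2.0, 3.0, 4.0], [1.0, -1.0, -1.0, 1.0])

print(f"collinear pair VIF   = {vif_collinear:.1f}")
print(f"independent pair VIF = {vif_independent:.1f}")
```

    A VIF of 1 means a predictor shares no linear information with the other; the larger the value, the more the variance of its coefficient is inflated.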

    Diagnosis of Violation of Assumptions

    Residual plots are used to check for:

    Variance not being constant across the explanatory variables.

    Fitted relationship not being linear.

    Random variation not having a Normal distribution.

    Fitted Values and Residuals

    Fitted values (Fits) are the estimates of Y as determined by the regression equation.

    Residuals (Resids) are the differences between each observed value and the corresponding fitted value.
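    As a small illustration of these two definitions, the sketch below fits a least-squares line with a single predictor and computes the fits and residuals. The single-predictor setting, the function names and the data are assumptions made for the example; a multiple regression would follow the same idea with more terms in the equation.

```python
def fit_simple_regression(x, y):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def fits_and_residuals(x, y):
    """Return (fits, resids): fitted values and observed-minus-fitted."""
    b0, b1 = fit_simple_regression(x, y)
    fits = [b0 + b1 * xi for xi in x]
    resids = [yi - fi for yi, fi in zip(y, fits)]
    return fits, resids


# Made-up illustrative data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fits, resids = fits_and_residuals(x, y)

for xi, yi, fi, ei in zip(x, y, fits, resids):
    print(f"x={xi:4.1f}  observed={yi:5.2f}  fit={fi:5.2f}  resid={ei:+.2f}")
```

    When the model includes a constant, the least-squares residuals sum to zero, which is a quick sanity check on any such calculation.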

    Residual Plots

    [Figure: four residual plots, response is Crimrate. Residuals Versus the Fitted Values (Residual against Fitted Value); Residuals Versus the Order of the Data (Residual against Observation Order); Histogram of the Residuals (Frequency against Residual); Normal Probability Plot of the Residuals (Normal Score against Residual).]

    Abnormal Patterns in Residual Plots

    Figures a) and b) suggest a non-linear relationship between X and Y.

    Fig. c) suggests autocorrelation.

    Fig. d) suggests that the variance is not constant, since the spread of Y values is far greater for larger values of X.
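    The autocorrelation pattern can also be checked numerically. The sketch below computes the Durbin-Watson statistic from a list of residuals in observation order: the sum of squared differences between successive residuals divided by the sum of squared residuals. Values near 2 suggest little first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. The residual series are made up for illustration.

```python
def durbin_watson(resids):
    """Durbin-Watson statistic for residuals given in observation order."""
    num = sum((resids[t] - resids[t - 1]) ** 2 for t in range(1, len(resids)))
    den = sum(e * e for e in resids)
    return num / den


# Made-up residuals that stay on one side of zero for long runs
# (positive autocorrelation): the statistic falls below 2.
dw_runs = durbin_watson([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

# Made-up residuals that flip sign at every step
# (negative autocorrelation): the statistic rises above 2.
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])

print(f"long runs:   DW = {dw_runs:.3f}")
print(f"alternating: DW = {dw_alternating:.3f}")
```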

    Checking Unusual Data Points

    Check for outliers that lie a long distance away from the rest of the data. They exercise leverage, which is measured by h_i. Leverage is considered large if h_i > 3p/n (p = number of predictors including the constant). Such points are flagged by X in the printout.

    Cook's distance measures the influence of a data point on the regression equation. Cook's D > 1 requires careful checking; > 4 suggests potentially serious outliers.
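    The following sketch shows how leverage and Cook's distance might be computed by hand. It assumes a regression with one predictor plus the constant (so p = 2), for which the leverage has the closed form h_i = 1/n + (x_i - x̄)² / Sxx, and it uses the usual formula D_i = e_i² / (p s²) × h_i / (1 - h_i)², where s² is the residual mean square. The function name and the data are made up for illustration.

```python
def leverage_and_cooks(x, y):
    """Leverages h_i and Cook's distances D_i for y = b0 + b1 * x.

    Assumption: one predictor plus the constant, so p = 2.
    """
    n = len(x)
    p = 2
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    resids = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resids) / (n - p)  # residual mean square
    h = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    cooks = [(e * e / (p * s2)) * hi / (1.0 - hi) ** 2
             for e, hi in zip(resids, h)]
    return h, cooks


# Made-up data: the last point is far from the others in X and
# does not follow the pattern y = 2x of the first nine points.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 30.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
h, cooks = leverage_and_cooks(x, y)

cutoff = 3 * 2 / len(x)  # the 3p/n rule from the text, with p = 2
for i, (hi, di) in enumerate(zip(h, cooks), start=1):
    flag = "X" if hi > cutoff else " "
    print(f"obs {i:2d}  h={hi:.3f} {flag}  Cook's D={di:.3f}")
```

    The leverages always sum to p, so an average point has leverage p/n; the 3p/n cutoff therefore flags points with about three times the average leverage.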

    Patterns of Outliers

    a) Outlier is extreme in both X and Y but not in pattern. Removal is unlikely to alter the regression line.

    b) Outlier is extreme in both X and Y as well as in the overall pattern. Inclusion will strongly influence the regression line.

    c) Outlier is extreme for X, nearly average for Y.

    d) Outlier is extreme in Y, not in X.

    e) Outlier is extreme in pattern, but not in X or Y.