REGRESSION ANALYSIS, Lecture 8 (F8)

Linda Wänström

Department of Statistics, Stockholm University


Ch. 14: Residual analysis

Regression diagnostics

- Statistical techniques for detecting conditions that can lead to incorrect conclusions
  - detecting outliers
  - checking the regression assumptions
  - detecting multicollinearity


Regression diagnostics

- Examine the data
  - Minimum and maximum values, measures of central tendency, measures of dispersion
  - Frequency tables, charts
  - Scatter plots
- Examine the residuals


Residuals

$$\hat{E}_i = Y_i - \hat{Y}_i, \quad i = 1, 2, \ldots, n$$

is an estimate of the unobserved error term $E_i$.

Assumption: the $E_i$ are independent and $N(0, \sigma^2)$ distributed.


Residuals

Standardized residual:

$$z_i = \frac{\hat{E}_i}{s}$$

Studentized residual:

$$r_i = \frac{\hat{E}_i}{s\sqrt{1 - h_i}}$$

Jackknife residual:

$$r_{(-i)} = \frac{\hat{E}_i}{s_{(-i)}\sqrt{1 - h_i}}$$
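The three residual types differ only in the scale used in the denominator. As a rough illustration, the sketch below computes all three for a simple linear regression with one predictor, using only the Python standard library. The data are made up, the function name `simple_ols_diagnostics` is hypothetical, and two standard textbook results that are not stated on the slides are assumed: the leverage formula $h_i = 1/n + (x_i - \bar{x})^2 / S_{xx}$ for one predictor, and the leave-one-out identity $SSE_{(-i)} = SSE - \hat{E}_i^2 / (1 - h_i)$.

```python
import math


def simple_ols_diagnostics(x, y):
    """Leverage and the three residual types for the fit y = b0 + b1 * x."""
    n = len(x)
    p = 2  # number of estimated parameters: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in residuals)
    s = math.sqrt(sse / (n - p))  # root mean square error

    rows = []
    for xi, e in zip(x, residuals):
        h = 1.0 / n + (xi - x_bar) ** 2 / sxx      # leverage (assumed formula)
        z = e / s                                   # standardized residual
        r = e / (s * math.sqrt(1.0 - h))            # studentized residual
        # s computed without observation i, via the leave-one-out identity
        s_del = math.sqrt((sse - e * e / (1.0 - h)) / (n - p - 1))
        r_jack = e / (s_del * math.sqrt(1.0 - h))   # jackknife residual
        rows.append({"h": h, "z": z, "r": r, "r_jack": r_jack})
    return rows


# Hypothetical data; the last point is deliberately far from the line
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_demo = [1.1, 1.9, 3.2, 3.8, 5.1, 9.0]
diagnostics = simple_ols_diagnostics(x_demo, y_demo)

for i, row in enumerate(diagnostics, start=1):
    print(f"{i}: h={row['h']:.3f} z={row['z']:.3f} "
          f"r={row['r']:.3f} r_jack={row['r_jack']:.3f}")
```

Because $s_{(-i)}$ is not inflated by the observation being judged, the jackknife residual tends to stand out more clearly than the other two when a point is a genuine outlier.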


Detecting outliers

- Leverage, $h_i$
- Jackknife residual, $r_{(-i)}$
- Cook's distance, $d_i$
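For reference, leverage and Cook's distance are commonly defined as below. These are the standard textbook forms, not formulas taken from the slides, and the symbols $X$ (the design matrix including the intercept column), $k$ (the number of predictors) and $r_i$ (the studentized residual) are assumptions of this note.

```latex
% h_i is the i-th diagonal element of the hat matrix
h_i = \left[ X (X^{\top} X)^{-1} X^{\top} \right]_{ii},
\qquad 0 \le h_i \le 1,
\qquad \sum_{i=1}^{n} h_i = k + 1

% Cook's distance combines the studentized residual r_i with the leverage h_i
d_i = \frac{r_i^{2}}{k + 1} \cdot \frac{h_i}{1 - h_i}
```

A large $d_i$ therefore requires a large residual, a high leverage, or both.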


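SAS code

    proc reg;
      model Y = X1 X2 X3;
      /* Save predicted values, residuals, jackknife (RSTUDENT) residuals and Cook's distance */
      output out=res predicted=pred residual=resid rstudent=sres cookd=cookd;
    run;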


Checking the assumptions

Residual plots

- Linearity
- Homoscedasticity
- Independence


SAS code

    proc reg;
      model Y = X1 X2 X3;
      /* Jackknife (RSTUDENT) residuals against predicted values */
      plot rstudent.*predicted.;
    run;


Checking the assumptions

Residual plots

- Normal probability plot
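A normal probability plot pairs the ordered residuals with the quantiles a standard normal sample of the same size would be expected to have; points close to a straight line support the normality assumption. The sketch below computes those pairs with the Python standard library. The function name and the residual values are hypothetical, and the plotting position $(i - 0.375)/(n + 0.25)$ is one common convention chosen for this sketch, not something specified on the slides.

```python
from statistics import NormalDist


def normal_qq_points(residuals):
    """Return (theoretical quantile, ordered residual) pairs for a normal probability plot."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, value in enumerate(ordered, start=1):
        # Plotting position: an assumed convention, other choices exist
        p = (i - 0.375) / (n + 0.25)
        points.append((std_normal.inv_cdf(p), value))
    return points


# Hypothetical residuals, for illustration only
example_residuals = [-1.2, 0.3, 0.8, -0.4, 1.9, -0.1, 0.5]
qq_points = normal_qq_points(example_residuals)

for quantile, residual in qq_points:
    print(f"{quantile:7.3f}  {residual:6.2f}")
```

Plotting the second column against the first gives the same kind of display as a Q-Q plot of the residuals.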


SAS code

    proc univariate data=res;
      qqplot sres;      /* normal Q-Q plot of the saved residuals */
      histogram sres;
    run;


Checking the assumptions

- Possible remedies


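Multicollinearity

One or more X variables are a linear combination of one or more of the other X variables.

Measure of multicollinearity: the Variance Inflation Factor

$$VIF_j = \frac{1}{1 - R_j^2}, \quad j = 1, \ldots, k$$

where $R_j^2$ is the coefficient of determination from regressing $X_j$ on the other X variables.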
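To get a feel for the scale of the variance inflation factor, the sketch below evaluates the formula for a few values of $R_j^2$ and inverts it. The function names are hypothetical, and $R_j^2$ is taken to be the coefficient of determination from regressing $X_j$ on the other X variables.

```python
def vif_from_r2(r2_j):
    """Variance inflation factor for X_j, given R_j^2."""
    if not 0.0 <= r2_j < 1.0:
        raise ValueError("R_j^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r2_j)


def r2_from_vif(vif_j):
    """Invert the formula: the R_j^2 implied by a reported VIF."""
    if vif_j < 1.0:
        raise ValueError("a VIF cannot be smaller than 1")
    return 1.0 - 1.0 / vif_j


for r2 in (0.0, 0.5, 0.8, 0.9, 0.99):
    print(f"R_j^2 = {r2:4.2f}  ->  VIF_j = {vif_from_r2(r2):7.2f}")
```

A VIF of 1 means that $X_j$ is uncorrelated with the other predictors, and the factor grows without bound as $R_j^2$ approaches 1.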


Example

Variables: rum (rooms), yta (area), avgift (fee), pris (price), ort (location, coded 0/1)

Obs   rum     yta   avgift   pris  ort
  1   2.0    70.0    4.080   2990    1
  2   3.0    71.0    4.371   3250    1
  3   4.0    90.0    5.135   4600    1
  4   4.0   100.5    5.281   4175    1
  5   2.0    62.0    3.635   2695    1
  6   5.0   130.0    6.156   5695    1
  7   3.0    85.0    4.958   3195    1
  8   2.0    66.0    4.091   2250    1
  9   3.0    77.0    4.492   3250    1
 10   4.0   108.0    6.611   4800    1
 11   5.0   108.0    6.611   4800    1
 12   4.0   104.0    5.808   3950    1
 13   1.0    37.0    2.108   1290    1
 14   5.0   120.0    6.738   4975    1
 15   4.0   116.0    5.618   4195    1
 16   3.0    81.0    5.374   3800    1
 17   3.0    75.0    4.375   2795    1
 18   2.0    70.5    4.173   2395    1
 19   2.0    74.5    3.911   1950    1
 20   4.0   111.5    5.854   4195    1
 21   2.0    57.0    2.993   1750    1
 22   2.0    73.5    4.377   2775    1
 23   4.0   101.0    5.892   4000    1
 24   3.0    88.0    4.547   3650    1
 25   4.0    90.0    5.125   4550    1
 26   4.0    99.0    6.376   4595    1
 27   4.0   112.5    6.567   5200    1
 28   1.5    46.0    2.862   1995    1
 29   2.0    71.0    4.371   3250    1
 30   3.0    67.5    4.682   2695    1
 31   3.0    75.5    4.140    895    0
 32   2.0    59.0    3.256    750    0
 33   3.0    87.0    4.952   1085    0
 34   5.0    90.5    5.093    795    0
 35   3.0    87.5    4.777   1200    0
 36   3.0    84.0    4.049    850    0
 37   3.0    90.5    5.008    775    0
 38   4.0    79.5    4.414   1395    0
 39   2.0    54.5    4.287    595    0
 40   3.0    75.0    3.041   1200    0
 41   3.0    68.0    4.023   1250    0
 42   3.0    68.4    2.341   1300    0
 43   2.0    74.0    3.657    595    0
 44   4.0   107.0    6.280   2130    0
 45   3.0    90.5    5.762    695    0
 46   4.0    84.5    4.973   1695    0
 47   2.0    55.0    3.381   1153    0
 48   5.0   121.0    6.998   2413    0
 49   4.0    90.5    5.093    795    0


Example

The CORR Procedure

5 Variables: rum yta avgift pris ort

Simple Statistics

Variable     N         Mean      Std Dev          Sum      Minimum      Maximum
rum         49      3.17347      1.02851    155.50000      1.00000      5.00000
yta         49     83.76327     20.48608         4104     37.00000    130.00000
avgift      49      4.74892      1.17465    232.69700      2.10800      6.99800
pris        49         2597         1499       127271    595.00000         5695
ort         49      0.61224      0.49229     30.00000            0      1.00000

Pearson Correlation Coefficients, N = 49
Prob > |r| under H0: Rho=0

               rum         yta      avgift        pris         ort
rum        1.00000     0.88100     0.81276     0.49775    -0.02897
                        <.0001      <.0001      0.0003      0.8434

yta        0.88100     1.00000     0.90131     0.60322     0.10247
            <.0001                  <.0001      <.0001      0.4836

avgift     0.81276     0.90131     1.00000     0.61454     0.16949
            <.0001      <.0001                  <.0001      0.2443

pris       0.49775     0.60322     0.61454     1.00000     0.78444
            0.0003      <.0001      <.0001                  <.0001

ort       -0.02897     0.10247     0.16949     0.78444     1.00000
            0.8434      0.4836      0.2443      <.0001
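The p-values printed under each correlation can be checked by hand. A sketch, assuming the usual t test of $H_0\colon \rho = 0$ with $n - 2$ degrees of freedom (the hypothesis is named in the output header, but the formula itself is not on the slide), applied to the correlation between rum and pris:

```latex
t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^{2}}}
  = \frac{0.49775\,\sqrt{47}}{\sqrt{1 - 0.49775^{2}}}
  \approx \frac{3.412}{0.867}
  \approx 3.93
```

A value this far out in a t distribution with 47 degrees of freedom is in line with the reported p-value of 0.0003. For rum and ort the same formula gives $t \approx -0.20$, in line with the reported 0.8434.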


Example, ort = 0

[Figure: three scatter plots of pris against rum, yta and avgift for the observations with ort = 0]


Example, ort = 1

[Figure: three scatter plots of pris against rum, yta and avgift for the observations with ort = 1]


Example

The REG Procedure
Model: MODEL1
Dependent Variable: pris

Number of Observations Read    49
Number of Observations Used    49

Analysis of Variance

                             Sum of         Mean
Source             DF       Squares       Square   F Value   Pr > F
Model               4      97655116     24413779    105.43   <.0001
Error              44      10188597       231559
Corrected Total    48     107843713

Root MSE           481.20581    R-Square    0.9055
Dependent Mean    2597.36735    Adj R-Sq    0.8969
Coeff Var           18.52668

Parameter Estimates

                    Parameter      Standard                         Variance
Variable    DF       Estimate         Error   t Value   Pr > |t|   Inflation
Intercept    1    -1864.68903     309.85235     -6.02     <.0001           0
rum          1      363.37954     149.09717      2.44     0.0189     4.87451
yta          1       19.64981       9.70128      2.03     0.0489     8.18757
avgift       1       52.97751     140.19416      0.38     0.7073     5.62154
ort          1     2305.22889     149.95779     15.37     <.0001     1.12968
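Most of the summary figures in this output follow from the three sums of squares. As a check, assuming the usual definitions with $n = 49$ observations and $k = 4$ predictors:

```latex
\begin{aligned}
\text{MSE} &= \frac{SSE}{n - k - 1} = \frac{10188597}{44} \approx 231559 \\
\text{Root MSE} &= \sqrt{231559} \approx 481.2 \\
F &= \frac{SS_{\text{model}} / k}{\text{MSE}} = \frac{24413779}{231559} \approx 105.43 \\
R^{2} &= \frac{SS_{\text{model}}}{SS_{\text{total}}} = \frac{97655116}{107843713} \approx 0.9055 \\
R^{2}_{\text{adj}} &= 1 - (1 - R^{2})\,\frac{n - 1}{n - k - 1} = 1 - 0.0945 \cdot \frac{48}{44} \approx 0.8969
\end{aligned}
```

Each t value is the estimate divided by its standard error, for example $363.37954 / 149.09717 \approx 2.44$ for rum.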


Example

The REG Procedure
Model: MODEL1
Dependent Variable: pris

Number of Observations Read    49
Number of Observations Used    49

Analysis of Variance

                             Sum of         Mean
Source             DF       Squares       Square   F Value   Pr > F
Model               3      97622050     32540683    143.26   <.0001
Error              45      10221664       227148
Corrected Total    48     107843713

Root MSE           476.60055    R-Square    0.9052
Dependent Mean    2597.36735    Adj R-Sq    0.8989
Coeff Var           18.34937

Parameter Estimates

                    Parameter      Standard                         Variance
Variable    DF       Estimate         Error   t Value   Pr > |t|   Inflation
Intercept    1    -1842.96593     301.55921     -6.11     <.0001           0
rum          1      371.46745     146.14085      2.54     0.0145     4.77407
yta          1       22.00067       7.37274      2.98     0.0046     4.82067
ort          1     2317.11928     145.21626     15.96     <.0001     1.07994
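Comparing this fit with the previous one shows what dropping avgift costs. A sketch of the partial F test for that single variable, assuming the previous output (with avgift) is the full model and this one the reduced model:

```latex
F = \frac{(SSE_{\text{reduced}} - SSE_{\text{full}}) / 1}{\text{MSE}_{\text{full}}}
  = \frac{10221664 - 10188597}{231559}
  = \frac{33067}{231559}
  \approx 0.143
```

This equals the square of the t value for avgift in the full model ($52.97751 / 140.19416 \approx 0.378$, and $0.378^2 \approx 0.143$), so the two tests agree (p = 0.7073). The variance inflation factor for yta drops from 8.19 to 4.82 once avgift is removed.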


Example

[Figure: histogram of the residuals (_RESID); horizontal axis from -1400 to 1400, vertical axis Frequency from 0 to 15]


Example

The REG Procedure
Model: MODEL1
Dependent Variable: pris

Output Statistics

     Dependent Predicted    Std Error
Obs   Variable     Value Mean Predict     95% CL Mean         95% CL Predict      Residual
  1      2990      2757     121.5350      2512      3002      1766      3748   232.8646
  2      3250      3151     123.3164      2902      3399      2159      4142    99.3965
  3      4600      3940     129.1531      3680      4200      2946      4935   659.9162
  4      4175      4171     103.9571      3962      4380      3189      5154     3.9092
  5      2695      2581     118.4597      2343      2820      1592      3570   113.8700
  6      5695      5192     175.2145      4839      5544      4169      6214   503.4219
  7      3195      3459      89.1170      3279      3638      2482      4435  -263.6129
  8      2250      2669     116.3272      2435      2903      1681      3657  -419.1327
  9      3250      3283      97.3867      3086      3479      2303      4262   -32.6075
 10      4800      4336     118.1101      4098      4574      3347      5325   463.9041
 11      4800      4708     168.3742      4368      5047      3689      5726    92.4367
 12      3950      4248     107.2976      4032      4464      3264      5232  -298.0932
 13      1290      1660     185.1879      1287      2033  629.8052      2689  -369.6457
 14      4975      4972     152.0163      4665      5278      3964      5979     3.4286
 15      4195      4512     155.0572      4200      4824      3503      5522  -317.1013
 16      3800      3371      88.5623      3192      3549      2394      4347   429.3898
 17      2795      3239     104.6752      3028      3449      2256      4221  -443.6062
 18      2395      2768     122.6699      2521      3015      1777      3759  -373.1357
 19      1950      2856     135.0686      2584      3128      1858      3854  -906.1384
 20      4195      4413     132.3233      4147      4680      3417      5409  -218.0982
 21      1750      2471     130.7847      2208      2735      1476      3467  -721.1266
 22      2775      2834     131.4593      2569      3099      1838      3830   -59.1377
 23      4000      4182     104.0497      3973      4392      3200      5165  -182.0912
 24      3650      3525      95.6936      3332      3717      2546      4504   125.3850
 25      4550      3940     129.1531      3680      4200      2946      4935   609.9162
 26      4595      4138     104.4620      3928      4348      3155      5121   456.9102
 27      5200      4435     137.0074      4159      4711      3436      5434   764.9011
 28      1995      2043     159.9163      1721      2365      1031      3056   -48.3855
 29      3250      2779     123.9042      2530      3029      1787      3771   470.8639
 30      2695      3074     142.6625      2786      3361      2072      4076  -378.6012
 31  895.0000  932.4873     111.1700  708.5795      1156  -53.2036      1918   -37.4873
 32  750.0000  198.0087     136.2085  -76.3292  472.3466 -800.3465      1196   551.9913
 33      1085      1185     130.8383  921.9731      1449  190.0576      2181  -100.4950
 34  795.0000      2005     230.3257      1542      2469  939.2921      3072      -1210
 35      1200      1196     132.8599  928.9017      1464  199.9723      2193     3.5047
 36  850.0000      1119     120.3779  877.0394      1362  129.4246      2110  -269.4930
 37  775.0000      1262     146.3595  967.7143      1557  258.3316      2267  -487.4974
 38      1395      1392     167.0760      1055      1728  374.7605      2409     3.0426
 39  595.0000   99.0057     141.6157 -186.2230  384.2343 -902.3969      1100   495.9943
 40      1200  921.4869     111.7048  696.5020      1146  -64.4492      1907   278.5131
 41      1250  767.4822     130.3855  504.8722      1030 -227.7141      1763   482.5178
 42      1300  776.2825     128.8350  516.7954      1036 -218.0943      1771   523.7175
 43  595.0000  528.0188     171.5947  182.4094  873.6282 -492.2251      1548    66.9812
 44      2130      1997     149.9633      1695      2299  990.6555      3003   133.0241
 45  695.0000      1262     146.3595  967.7143      1557  258.3316      2267  -567.4974
 46      1695      1502     144.2662      1211      1793  499.0246      2505   193.0392
 47      1153  110.0060     140.6392 -173.2558  393.2678 -890.8381      1111       1043
 48      2413      2676     173.3471      2327      3026      1655      3698  -263.4528
 49  795.0000      1634     126.0094      1380      1888  641.0579      2627  -838.9648
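The interval columns can be reproduced from the predicted value and its standard error. A sketch for observation 1, assuming this output comes from the model with rum, yta and ort (Root MSE 476.60055, 45 error degrees of freedom) and using $t_{0.975;\,45} \approx 2.014$, a table value that is not given on the slide:

```latex
\begin{aligned}
\text{95\% CL Mean:} \quad
  & 2757 \pm 2.014 \cdot 121.5350 \approx 2757 \pm 244.8
  \;\Rightarrow\; (2512,\ 3002) \\
\text{95\% CL Predict:} \quad
  & 2757 \pm 2.014 \cdot \sqrt{476.60055^{2} + 121.5350^{2}} \approx 2757 \pm 990.6
  \;\Rightarrow\; (1766,\ 3748)
\end{aligned}
```

The prediction interval is wider because it adds the error variance of a new observation to the uncertainty of the estimated mean.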


Example

The REG Procedure
Model: MODEL1
Dependent Variable: pris

Output Statistics

     Std Error   Student                     Cook's
Obs   Residual  Residual    -2-1 0 1 2            D
  1     460.8     0.505  |      |*     |   0.004
  2     460.4     0.216  |      |      |   0.001
  3     458.8     1.438  |      |**    |   0.041
  4     465.1   0.00840  |      |      |   0.000
  5     461.6     0.247  |      |      |   0.001
  6     443.2     1.136  |      |**    |   0.050
  7     468.2    -0.563  |     *|      |   0.003
  8     462.2    -0.907  |     *|      |   0.013
  9     466.5   -0.0699  |      |      |   0.000
 10     461.7     1.005  |      |**    |   0.017
 11     445.9     0.207  |      |      |   0.002
 12     464.4    -0.642  |     *|      |   0.006
 13     439.2    -0.842  |     *|      |   0.031
 14     451.7   0.00759  |      |      |   0.000
 15     450.7    -0.704  |     *|      |   0.015
 16     468.3     0.917  |      |*     |   0.008
 17     465.0    -0.954  |     *|      |   0.012
 18     460.5    -0.810  |     *|      |   0.012
 19     457.1    -1.983  |   ***|      |   0.086
 20     457.9    -0.476  |      |      |   0.005
 21     458.3    -1.573  |   ***|      |   0.050
 22     458.1    -0.129  |      |      |   0.000
 23     465.1    -0.392  |      |      |   0.002
 24     466.9     0.269  |      |      |   0.001
 25     458.8     1.329  |      |**    |   0.035
 26     465.0     0.983  |      |*     |   0.012
 27     456.5     1.676  |      |***   |   0.063
 28     449.0    -0.108  |      |      |   0.000
 29     460.2     1.023  |      |**    |   0.019
 30     454.7    -0.833  |     *|      |   0.017
 31     463.5   -0.0809  |      |      |   0.000
 32     456.7     1.209  |      |**    |   0.032
 33     458.3    -0.219  |      |      |   0.001
 34     417.3    -2.901  | *****|      |   0.641
 35     457.7   0.00766  |      |      |   0.000
 36     461.1    -0.584  |     *|      |   0.006
 37     453.6    -1.075  |    **|      |   0.030
 38     446.4   0.00682  |      |      |   0.000
 39     455.1     1.090  |      |**    |   0.029
 40     463.3     0.601  |      |*     |   0.005
 41     458.4     1.053  |      |**    |   0.022
 42     458.9     1.141  |      |**    |   0.026
 43     444.6     0.151  |      |      |   0.001
 44     452.4     0.294  |      |      |   0.002
 45     453.6    -1.251  |    **|      |   0.041
 46     454.2     0.425  |      |      |   0.005
 47     455.4     2.290  |      |****  |   0.125
 48     444.0    -0.593  |     *|      |   0.013
 49     459.6    -1.825  |   ***|      |   0.063
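The Student residual and Cook's D columns can be recomputed from the printed output alone. The sketch below does so for three observations, assuming the output belongs to the model with rum, yta and ort (Root MSE 476.60055, four estimated parameters) and using the standard relations $r_i = \hat{E}_i / (s\sqrt{1 - h_i})$ and $d_i = \frac{r_i^2}{k + 1} \cdot \frac{h_i}{1 - h_i}$, the second of which is not stated on the slides. The function name is hypothetical, and the residuals are the rounded values printed in the output, so small differences in the last digit are expected.

```python
ROOT_MSE = 476.60055  # assumed: Root MSE of the model with rum, yta and ort
N_PARAMS = 4          # assumed: intercept plus three slopes (k + 1)

# (obs, residual, std error of residual, reported Student residual, reported Cook's D)
ROWS = [
    (19, -906.1384, 457.1, -1.983, 0.086),
    (34, -1210.0, 417.3, -2.901, 0.641),
    (47, 1043.0, 455.4, 2.290, 0.125),
]


def student_and_cook(residual, se_residual, root_mse=ROOT_MSE, n_params=N_PARAMS):
    """Return (studentized residual, leverage, Cook's distance) for one observation."""
    r = residual / se_residual
    # se_residual = s * sqrt(1 - h), so the leverage can be backed out
    one_minus_h = (se_residual / root_mse) ** 2
    h = 1.0 - one_minus_h
    d = (r ** 2 / n_params) * (h / one_minus_h)
    return r, h, d


results = {}
for obs, residual, se_residual, reported_r, reported_d in ROWS:
    r, h, d = student_and_cook(residual, se_residual)
    results[obs] = (r, h, d)
    print(f"obs {obs}: r = {r:.3f} (reported {reported_r}), "
          f"h = {h:.3f}, D = {d:.3f} (reported {reported_d})")
```

Observation 34 combines the largest studentized residual with the highest leverage of the three, which is why its Cook's D is by far the largest in the table.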
