December 1, 2015 Relationship Between Students’ Mathematics Performance and Their Access to Computers & Internet

Education Project


Page 1: Education Project

December 1, 2015

Relationship Between Students’ Mathematics Performance

and Their Access to Computers & Internet

Page 2: Education Project

Agenda: Analysis Steps

I. Background: Data Collection; Data Summarization

II. Method (Analysis Method): Simple vs Mixed; Empty & Full Model

III. Selection (Model Selection): Method & Criteria; Desired Model

IV. Assumption (Assumption Diagnostics): Student-level Residual; School-level Residual

V. Conclusion: Conclusion & Potential Future Analysis

Page 3: Education Project


Background – Data Collection

Where were the data collected?

Programme for International Student Assessment (PISA)

Survey objects: 15-year-old students

Objective: social, cultural, economic and educational factors

Assessed subjects: reading, mathematics, science

Subject analyzed in this project: mathematics

Regions: Hong Kong, Korea, Shanghai, Singapore and Taiwan

Page 4: Education Project


Background – Data Summarization

Data Summarization – Response Variables PV1MATH … PV5MATH

“Plausible values are a representation of the range of abilities that a student might reasonably have. Instead of directly estimating a student’s ability θ, a probability distribution for a student’s θ is estimated. That is, instead of obtaining a point estimate for θ, a range of possible values for a student’s θ, with an associated probability for each of these values is estimated. Plausible values are random draws from this (estimated) distribution for a student’s θ” (Wu and Adams, 2002)

Random draws from the same distribution!

Variable    Min       Max       Mean
PV1MATH     183.99    924.84    573.89
PV2MATH     176.36    932.47    574.16
PV3MATH     195.75    888.85    573.97
PV4MATH     191.00    924.92    574.29
PV5MATH     181.11    868.21    574.45
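As a quick check on the claim that the five plausible values are draws from the same distribution, the reported means can be compared directly. The short Python sketch below only uses the means from the table above; the variable names are illustrative and not part of the original analysis.

```python
# Means of the five plausible values, copied from the summary table above.
pv_means = {
    "PV1MATH": 573.89,
    "PV2MATH": 574.16,
    "PV3MATH": 573.97,
    "PV4MATH": 574.29,
    "PV5MATH": 574.45,
}

# Average of the five means and the largest gap between any two of them.
overall_mean = sum(pv_means.values()) / len(pv_means)
spread = max(pv_means.values()) - min(pv_means.values())

print(f"average of the five means: {overall_mean:.3f}")
print(f"largest gap between means: {spread:.2f}")
```

The gap between the largest and smallest mean is well under one score point, which is what one would expect if the five variables share a distribution.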

Page 5: Education Project


Background – Data Summarization

Data Summarization – Predictors

Region              SCHOOLID
                    001   002   003   004   005   …
Hong Kong, China    9     11    7     8     6     …
Korea               12    12    10    9     10    …
Shanghai, China     10    13    10    11    12    …
Singapore           11    11    11    10    10    …
Chinese Taipei      12    11    12    13    14    …
Total               54    58    50    51    52    …

Variable Name   Description   Value

IC01Q01     At Home - Desktop Computer
IC01Q02     At Home - Portable laptop
IC01Q04     At Home - Internet Connection
IC01Q05     At Home - Video games console
Values for IC01Q01, IC01Q02, IC01Q04, IC01Q05: 1 = Yes, and I use it; 2 = Yes, but I don't use it; 3 = No

IC03Q01     First use of computers
IC04Q01     First access to Internet
Values for IC03Q01, IC04Q01: 1 = 6 years old or younger; 2 = 7-9 years old; 3 = 10-12 years old; 4 = 13 years old or older; 5 = Never

Variable Name   Description   Range
HOMSCH      ICT Use at Home for School-related Tasks   (-2.44, 3.73)
HOMEPOS     Home Possessions   (-4.43, 4.07)
ICTATTNEG   Attitudes towards Computers: Limitations of the Computer as a Tool for School Learning   (-2.16, 2.41)
ICTATTPOS   Attitudes towards Computers: Computer as a Tool for School Learning   (-2.9, 1.3)
ICTHOME     ICT Availability at Home   (-4.02, 2.78)
ICTSCH      ICT Availability at School   (-2.80, 2.83)

Variable Name   Description   Value

IC08Q01     Out-of-school 8 - One player games
IC08Q02     Out-of-school 8 - Collaborative games
IC09Q06     Out-of-school 9 - Homework
IC09Q07     Out-of-school 9 - Share school material
IC10Q07     At School - Practice and drilling
IC10Q08     At School - Homework
IC10Q09     At School - Groupwork
Values for all seven items: 1 = Never or hardly ever; 2 = Once or twice a month; 3 = Once or twice a week; 4 = Almost every day; 5 = Every day

Variable Name   Description   Value

ST37Q01     Maths Self-Efficacy - Using a <Train Timetable>
ST37Q02     Maths Self-Efficacy - Calculating TV discount
ST37Q03     Maths Self-Efficacy - Calculating Square Metres of Tiles
ST37Q04     Maths Self-Efficacy - Understanding Graphs in Newspapers
ST37Q05     Maths Self-Efficacy - Solving Equation 1
ST37Q06     Maths Self-Efficacy - Distance to Scale
ST37Q07     Maths Self-Efficacy - Solving Equation 2
ST37Q08     Maths Self-Efficacy - Calculating Petrol Consumption Rate
Values for all eight items: 1 = Very confident; 2 = Confident; 3 = Not very confident; 4 = Not at all confident; other = Missing

Variable Name   Description   Value

ST42Q02     Maths Self-Concept - Not Good at Maths (negative opinion)
ST42Q04     Maths Self-Concept - Get Good <Grades>
ST42Q06     Maths Self-Concept - Learn Quickly
ST42Q07     Maths Self-Concept - One of Best Subjects
ST42Q09     Maths Self-Concept - Understand Difficult Work
Values for all five items: 1 = Strongly agree; 2 = Agree; 3 = Disagree; 4 = Strongly disagree; other = Missing

Variable Name   Description   Range

MATEFF      Math Self-Efficacy; sum of the reversed values of ST37Q01-ST37Q08   (8, 32), integer
MATCON      Math Self-Concept; sum of ST42Q02 and the reversed values of ST42Q06, ST42Q07, ST42Q09   (5, 20), integer
MTSUP       Mathematics Teacher's Support   (-2.86, 1.84)
INTMAT      Mathematics Interest   (-1.78, 2.29)
MATBEH      Mathematics Behavior   (-2.14, 4.42)
PERSEV      Perseverance   (-4.05, 3.53)
ESCS        Socio-economic status   (-3.88, -2.40)
WEALTH      Wealth   (-5.04, 3.13)
PARED       Highest parental education in years   (3, 16), integer
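The MATEFF score is described as a sum of reversed ST37 item values, so that a higher score means more confidence. A minimal Python sketch of that scoring is shown below. The function names are hypothetical, and returning `None` when any item is missing is an assumption; the slides do not say how missing responses were handled.

```python
# Item names ST37Q01 ... ST37Q08, as listed in the variable table.
ITEMS = [f"ST37Q{i:02d}" for i in range(1, 9)]


def reverse_code(value, n_levels=4):
    """Map 1..n_levels onto n_levels..1 (1 = Very confident becomes 4)."""
    return n_levels + 1 - value


def mateff(responses):
    """Sum of the reversed ST37 items.

    Assumption: any response outside 1-4 ("other") counts as missing,
    and a student with a missing item gets no score (None).
    """
    total = 0
    for item in ITEMS:
        value = responses.get(item)
        if value not in (1, 2, 3, 4):
            return None
        total += reverse_code(value)
    return total


very_confident = {item: 1 for item in ITEMS}
not_confident = {item: 4 for item in ITEMS}
print(mateff(very_confident))  # 32, the upper end of the (8, 32) range
print(mateff(not_confident))   # 8, the lower end of the range
```

With eight items each scored 1 to 4 after reversal, the possible totals run from 8 to 32, matching the range given for MATEFF.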

Variables’ Nature        Variables’ Name

IT related               IC01Q01, IC01Q02, IC01Q04, IC01Q05, IC03Q01, IC04Q01, HOMSCH, HOMEPOS, ICTATTNEG, ICTATTPOS, ICTHOME, ICTSCH
Math related             MATEFF, MATCON, MTSUP, INTMAT, MATBEH
Characteristics & SES    PERSEV, ESCS, WEALTH, PARED
Other                    IC08Q01, IC08Q02, IC09Q06, IC09Q07, IC10Q07, IC10Q08, IC10Q09

Page 6: Education Project


Method – Simple vs Mixed

Simple Linear Regression vs. Mixed Linear Regression

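Sampling hierarchy: Population, then Schools, then Students (students are nested within schools).

Simple Linear Regression (SLR) treats all students as independent observations; Mixed Linear Regression accounts for the grouping of students within schools.

Because students from the same school tend to be similar, SLR would systematically underestimate the standard errors and can therefore lead to reporting nonsignificant results as significant.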
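The underestimation of standard errors can be illustrated with a small simulation. The Python sketch below is not the original analysis: the numbers of schools and students and the two standard deviations are hypothetical values chosen only for illustration. It generates clustered scores, computes the standard error of the mean as if all students were independent, and compares it with the actual variability of the mean across repeated samples.

```python
import random
import statistics


def simulate_school_sample(n_schools, n_students, tau, sigma, rng):
    """Scores = school effect + student residual (grand mean set to 0)."""
    scores = []
    for _ in range(n_schools):
        u = rng.gauss(0.0, tau)  # effect shared by all students in one school
        scores.extend(u + rng.gauss(0.0, sigma) for _ in range(n_students))
    return scores


rng = random.Random(2015)
n_schools, n_students = 50, 10   # hypothetical sample sizes
tau, sigma = 70.0, 80.0          # hypothetical between/within-school SDs

naive_ses, sample_means = [], []
for _ in range(300):
    y = simulate_school_sample(n_schools, n_students, tau, sigma, rng)
    sample_means.append(statistics.fmean(y))
    # Naive standard error: treats all 500 students as independent.
    naive_ses.append(statistics.stdev(y) / len(y) ** 0.5)

actual_se = statistics.stdev(sample_means)   # variability of the mean itself
average_naive_se = statistics.fmean(naive_ses)

print(f"naive SE  : {average_naive_se:.2f}")
print(f"actual SE : {actual_se:.2f}")
```

Under these assumed settings the naive standard error comes out well below the actual one, which is the mechanism behind spuriously significant results when clustering is ignored.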

Page 7: Education Project


Method – Empty vs Full

Decomposition of the variance in the empty model

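Level-1: $Y_{ij} = \beta_{0j} + \varepsilon_{ij}$, where $Y_{ij}$ represents the math performance of student i in school j, $\beta_{0j}$ is the school’s mean and $\varepsilon_{ij}$ is the student residual.

Level-2: $\beta_{0j} = \gamma_{00} + u_{0j}$, where $\gamma_{00}$ is the grand mean and $u_{0j}$ is a random effect, representing school j’s departure from the overall intercept.

Reduced: $Y_{ij} = \gamma_{00} + u_{0j} + \varepsilon_{ij}$.

Assumption: $u_{0j} \sim N(0, \tau^2)$ and $\varepsilon_{ij} \sim N(0, \sigma^2)$, independent of each other.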

If ICC=0, a mixed linear regression would be mathematically equal to simple linear regression.

With an ICC of about 0.44 for all five plausible values, considering school as a random effect is reasonable.

Response    Between-School Variance (τ2)    Within-School Variance (σ2)    ICC (ρ)
PV1MATH     4924.36                         6332.00                        0.43747
PV2MATH     4969.53                         6363.53                        0.43850
PV3MATH     4970.36                         6355.25                        0.43886
PV4MATH     4954.66                         6369.43                        0.43753
PV5MATH     4939.93                         6379.95                        0.43639
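The ICC column follows from the two variance components through the usual definition $\rho = \tau^2 / (\tau^2 + \sigma^2)$, the share of total variance that lies between schools. The Python sketch below recomputes it from the table values; the function and variable names are illustrative only.

```python
# (between-school variance tau^2, within-school variance sigma^2)
# for each plausible value, copied from the table above.
variance_components = {
    "PV1MATH": (4924.36, 6332.00),
    "PV2MATH": (4969.53, 6363.53),
    "PV3MATH": (4970.36, 6355.25),
    "PV4MATH": (4954.66, 6369.43),
    "PV5MATH": (4939.93, 6379.95),
}


def icc(tau2, sigma2):
    """Intraclass correlation: between-school share of the total variance."""
    return tau2 / (tau2 + sigma2)


iccs = {name: icc(t2, s2) for name, (t2, s2) in variance_components.items()}
for name, rho in iccs.items():
    print(f"{name}: ICC = {rho:.5f}")
```

Roughly 44% of the variance in each plausible value lies between schools, far from the ICC = 0 case in which the mixed model would reduce to simple linear regression.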

Page 8: Education Project


Method – Empty vs Full

Mixed linear regression model with independent variables

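Level-1: $Y_{ij} = \beta_{0j} + \sum_{k} \beta_{kj} x_{ik} + \varepsilon_{ij}$, where $Y_{ij}$ represents the math performance of student i, $\beta_{0j}$ is the school’s mean, $\beta_{kj}$ is the parameter for the kth predictor, $x_{ik}$ is the corresponding value of the kth predictor for student i, and $\varepsilon_{ij}$ is the student residual.

Level-2: $\beta_{0j} = \gamma_{00} + u_{0j}$, $\beta_{kj} = \gamma_{k0}$ for each predictor k, where $\gamma_{00}$ is the grand mean, $u_{0j}$ is a random effect representing school j’s departure from the overall intercept, and $\gamma_{k0}$ is the fixed parameter for the kth predictor.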

Page 9: Education Project


Method – Empty vs Full

Mixed linear regression model with independent variables

Covariance Par    CovP1    CovP2
Intercept (τ2)    12098    -
Residual (σ2)     -        4629.72

MTSUP

INTMAT

MATBEH

PERSEV

MATEFF

MATCON

ESCS

WEALTH

PARED

Effect       Estimate    StdErr     t-value    p-value
Intercept    332.73      11.7278    28.37      <.0001
IC01Q01_1    6.7786      2.4343     2.78       0.0054
IC01Q01_2    1.9977      3.0927     0.65       0.5183
IC01Q01_3    0           .          .          .
IC01Q02_1    4.7618      2.1474     2.22       0.0266
IC01Q02_2    5.395       2.4153     2.23       0.0255
IC01Q02_3    0           .          .          .
IC01Q04_1    24.679      4.0019     6.17       <.0001
IC01Q04_2    -0.7676     5.616      -0.14      0.8913
IC01Q04_3    0           .          .          .
IC01Q05_1    -6.9334     2.044      -3.39      0.0007
IC01Q05_2    -2.2165     2.4426     -0.91      0.3642
IC01Q05_3    0           .          .          .
IC03Q01      -7.3692     1.4954     -4.93      <.0001
IC04Q01      -1.0315     1.4877     -0.69      0.4881
HOMSCH       4.1807      1.3411     3.12       0.0018
HOMEPOS      15.6536     2.3361     6.7        <.0001
ICTATTNEG    -5.9194     0.893      -6.63      <.0001
ICTATTPOS    -3.9975     0.865      -4.62      <.0001
ICTHOME      -9.8205     1.489      -6.6       <.0001
ICTSCH       -3.2019     0.8978     -3.57      0.0004
IC08Q01      0.8161      0.7356     1.11       0.2673
IC08Q02      -0.976      0.6672     -1.46      0.1435
IC09Q06      2.0749      1.1003     1.89       0.0594
IC09Q07      -5.1988     0.9643     -5.39      <.0001
IC10Q07      -6.3855     1.2294     -5.19      <.0001
IC10Q08      -4.3954     1.2332     -3.56      0.0004
IC10Q09      -4.0328     1.3344     -3.02      0.0025
MTSUP        -3.3892     0.8347     -4.06      <.0001
INTMAT       -4.6086     1.1261     -4.09      <.0001
MATBEH       3.4924      1.0623     3.29       0.001
PERSEV       -4.6687     0.9823     -4.75      <.0001
MATEFF       8.0484      0.2012     40.01      <.0001
MATCON       5.1632      0.3125     16.52      <.0001
ESCS         13.4303     2.393      5.61       <.0001
WEALTH       -20.099     2.075      -9.69      <.0001
PARED        -1.0796     0.5708     -1.89      0.0586

Currently we have 37 fixed parameters (73 if we treat ordinal variables as categorical). That is too many parameters to describe in detail, which motivates the next step: Model Selection.

Page 10: Education Project


Selection – Method & Criteria

Model selection criteria and method

Method: Forward, Backward, Stepwise, LASSO

Criteria: AIC, BIC

Attention: We use the SAS procedure PROC GLMSELECT. It is designed for simple linear regression, so it ignores the school-level random effect and may produce biased results.

Goal: a subset of the most influential predictors that leads to the smallest AIC and BIC.
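For reference, both criteria trade goodness of fit against the number of parameters: AIC = -2 log L + 2k and BIC = -2 log L + k ln(n), with smaller values preferred. The Python sketch below applies these textbook formulas to two made-up candidate models. The log-likelihoods, parameter counts and sample size are hypothetical, and SAS procedures may count parameters and observations differently from this sketch.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 log L + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2 log L + k ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Hypothetical candidates: (log-likelihood, number of parameters).
candidates = {"small": (-4700.0, 5), "large": (-4680.0, 20)}
n_obs = 8000  # hypothetical number of observations

for name, (ll, k) in candidates.items():
    print(name, round(aic(ll, k), 1), round(bic(ll, k, n_obs), 1))

best_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_bic = min(candidates, key=lambda m: bic(*candidates[m], n_obs))
print("preferred by AIC:", best_aic)
print("preferred by BIC:", best_bic)
```

In this made-up case AIC prefers the larger model and BIC the smaller one, because the BIC penalty per parameter, ln(n), exceeds 2 once n is larger than about 7.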

Page 11: Education Project


Selection – Method & Criteria

Model selection criteria and method

Columns: (1) Stepwise Categorical AIC, (2) Stepwise Categorical BIC, (3) Stepwise Continuous AIC, (4) Stepwise Continuous BIC, (5) LASSO Categorical AIC, (6) LASSO Categorical BIC, (7) LASSO Continuous AIC, (8) LASSO Continuous BIC. 1 = selected, 0 = not selected.

Variable     (1)  (2)  (3)  (4)  (5)  (6)  (7)  (8)  Sum
IC01Q01      1    1    1    1    0    0    0    0    4
IC01Q02      1    1    1    1    0    0    0    0    4
IC01Q04      1    1    1    1    1    1    1    1    8
IC01Q05      1    1    1    1    1    1    1    1    8
IC03Q01      1    1    1    1    1    1    1    1    8
IC04Q01      1    1    0    0    0    0    0    0    2
IC08Q01      1    1    1    1    0    0    0    0    4
IC08Q02      1    1    1    1    1    1    1    1    8
IC09Q06      1    1    1    1    0    0    0    1    5
IC09Q07      1    1    1    1    0    0    0    0    4
IC10Q07      1    1    1    1    1    1    1    1    8
IC10Q08      1    1    1    1    1    1    1    1    8
IC10Q09      1    1    0    0    1    1    0    0    4
HOMSCH       1    1    1    1    0    0    0    1    5
HOMEPOS      1    1    1    1    0    0    0    1    5
ICTATTNEG    1    1    1    1    1    1    1    1    8
ICTATTPOS    1    1    1    1    0    0    0    1    5
ICTHOME      1    1    1    1    0    0    0    1    5
ICTSCH       1    1    1    1    1    1    1    1    8
MTSUP        1    1    1    1    1    1    1    1    8
INTMAT       1    1    1    1    0    0    0    0    4
MATBEH       1    1    1    1    0    0    0    1    5
PERSEV       1    1    1    1    0    0    0    1    5
MATEFF       1    1    1    1    1    1    1    1    8
MATCON       1    1    1    1    1    1    1    1    8
ESCS         1    1    1    1    1    1    1    1    8
WEALTH       1    1    1    1    0    0    0    1    5
PARED        0    0    0    0    0    0    0    0    0
Sum          27   27   25   25   13   13   12   20   -

Stepwise tends to select nearly all variables, while LASSO selects only about half of them.

AIC and BIC tend to give the same selection result, especially for the stepwise method.

Model                AIC        BIC        Intercept Variance    Residual Variance
Full Categorical     93587.7    93597.0    12391                 4629.04
Full Continuous      93672.7    93682.0    12080                 4627.23
LASSO Categorical    93825.2    93834.6    13343                 4831.31
LASSO Continuous     93961.2    93970.6    13392                 4856.26

Full Categorical: significant predictors when running the full mixed-effects model; all categorical variables are treated as categorical.

Full Continuous: significant predictors when running the full mixed-effects model; ordinal variables are treated as continuous.

LASSO Categorical: a subset of predictors selected based on LASSO; all categorical variables are treated as categorical.

LASSO Continuous: a subset of predictors selected based on LASSO; ordinal variables are treated as continuous.

Page 12: Education Project


Selection – Desired Model

Full categorical model description

Effect       Estimate    StdErr     t-value    p-value
Intercept    125.25      16.7619    7.47       <.0001
IC01Q01_1    6.4997      2.4335     2.67       0.0076
IC01Q01_2    1.741       3.0968     0.56       0.574
IC01Q01_3    0           .          .          .
IC01Q02_1    5.8199      2.1472     2.71       0.0067
IC01Q02_2    6.2927      2.4142     2.61       0.0092
IC01Q02_3    0           .          .          .
IC01Q04_1    20.7217     4.0905     5.07       <.0001
IC01Q04_2    -4.1907     5.675      -0.74      0.4603
IC01Q04_3    0           .          .          .
IC01Q05_1    -7.4712     2.0368     -3.67      0.0002
IC01Q05_2    -1.9111     2.4465     -0.78      0.4347
IC01Q05_3    0           .          .          .
IC04Q01_1    67.9177     12.4331    5.46       <.0001
IC04Q01_2    66.5147     12.2855    5.41       <.0001
IC04Q01_3    59.1226     12.2498    4.83       <.0001
IC04Q01_4    48.7161     12.3197    3.95       <.0001
IC04Q01_5    0           .          .          .
HOMSCH       4.937       1.1556     4.27       <.0001
HOMEPOS      17.3709     2.2114     7.86       <.0001
ICTATTNEG    -6.0695     0.892      -6.8       <.0001
ICTATTPOS    -3.8257     0.8625     -4.44      <.0001
ICTHOME      -9.6127     1.491      -6.45      <.0001
ICTSCH       -3.4807     0.9017     -3.86      0.0001
IC08Q01_1    7.0929      4.1351     1.72       0.0863
IC08Q01_2    10.7873     4.2503     2.54       0.0112
IC08Q01_3    15.0561     4.2248     3.56       0.0004
IC08Q01_4    6.5526      4.6899     1.4        0.1624
IC08Q01_5    0           .          .          .
IC09Q07_1    18.9441     5.8068     3.26       0.0011
IC09Q07_2    11.8654     5.6051     2.12       0.0343
IC09Q07_3    6.7707      5.5886     1.21       0.2257
IC09Q07_4    5.6896      6.0801     0.94       0.3494
IC09Q07_5    0           .          .          .
IC10Q08_1    27.9892     10.4353    2.68       0.0073
IC10Q08_2    25.5901     10.5098    2.43       0.0149
IC10Q08_3    17.3405     10.4641    1.66       0.0975
IC10Q08_4    16.3155     11.2176    1.45       0.1459
IC10Q08_5    0           .          .          .
IC10Q09_1    42.84       11.3737    3.77       0.0002
IC10Q09_2    38.1787     11.4457    3.34       0.0009
IC10Q09_3    34.2868     11.5187    2.98       0.0029
IC10Q09_4    24.776      12.5759    1.97       0.0489
IC10Q09_5    0           .          .          .
MTSUP        -3.5957     0.8351     -4.31      <.0001
INTMAT       -4.8686     1.125      -4.33      <.0001
MATBEH       3.0672      1.0622     2.89       0.0039
PERSEV       -4.6498     0.9831     -4.73      <.0001
MATEFF       8.0155      0.2013     39.82      <.0001
MATCON       5.2564      0.3112     16.89      <.0001
ESCS         9.9571      1.3094     7.6        <.0001
WEALTH       -20.324     2.0693     -9.82      <.0001
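The t-value column in the table above is the estimate divided by its standard error. The Python sketch below recomputes it for three rows and adds a two-sided p-value from the standard normal distribution. The normal approximation is an assumption made here to keep the example dependency-free; the reported p-values presumably come from a t distribution, which is close to the normal when the degrees of freedom are large.

```python
import math


def t_value(estimate, std_err):
    """Test statistic for H0: coefficient = 0."""
    return estimate / std_err


def two_sided_p_normal(t):
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2.0))


# (estimate, standard error) copied from three rows of the table above.
rows = {
    "IC01Q01_1": (6.4997, 2.4335),
    "IC01Q04_2": (-4.1907, 5.675),
    "MATEFF": (8.0155, 0.2013),
}

results = {}
for effect, (estimate, std_err) in rows.items():
    t = t_value(estimate, std_err)
    results[effect] = (t, two_sided_p_normal(t))
    print(f"{effect}: t = {t:.2f}, approx. p = {results[effect][1]:.4f}")
```

The recomputed t-values agree with the table to two decimals, and the approximate p-values for IC01Q01_1 and IC01Q04_2 land close to the reported 0.0076 and 0.4603.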

Page 13: Education Project


Assumption – Student-level Residual

Assumption diagnostic for student-level residuals (εij)

Normality Assumption Diagnostic (graphic and summary statistics)

Scatterplot: the 8,000 points are spread evenly around the mean.

Q-Q plot: the residuals align closely with the normal quantiles.

The normality assumption for student-level residuals does not appear to be violated.

Page 14: Education Project


Assumption – School-level Residual

Assumption diagnostic for school-level residuals (u0j)

Normality Assumption Diagnostic

The normality assumption for school-level residuals does not appear to be violated.

Page 15: Education Project


Conclusion

Conclusions

The overall analysis of PV1MATH … PV5MATH indicates a significant relationship between students’ mathematics performance and their access to IT products.

Predictor    Description                                                                               Effect
IC01Q04      At Home - Internet Connection                                                             +
IC04Q01      First access to Internet                                                                  +
IC01Q05      At Home - Video games console                                                             -
IC08Q01      Out-of-school 8 - One-player games                                                        -
IC10Q08      At School - Homework                                                                      -
HOMEPOS      Home Possessions                                                                          +
ICTATTNEG    Attitudes towards Computers: Limitations of the Computer as a Tool for School Learning    -
ICTATTPOS    Attitudes towards Computers: Computer as a Tool for School Learning                       -
ICTHOME      ICT Availability at Home                                                                  -
ICTSCH       ICT Availability at School                                                                -
MTSUP        Mathematics Teacher's Support                                                             -
MATBEH       Mathematics Behavior                                                                      +
MATEFF       Math Self-Efficacy                                                                        +
MATCON       Math Self-concept                                                                         +
PERSEV       Perseverance                                                                              -
ESCS         Socio-economic status                                                                     +
WEALTH       Wealth                                                                                    -

IC01Q04 & IC04Q01: Students’ access to the Internet can benefit their mathematics performance.

IC01Q05: Playing video games has a negative effect on students’ math study.

ICTATTNEG & ICTATTPOS: Both positive and negative attitudes towards computers tend to go with worse math performance.

MATEFF & MATCON: Strong self-efficacy and self-concept lead to better performance.

PERSEV: Perseverance has a negative relationship with math performance, which is unexpected.

ESCS & WEALTH: Contradiction: ESCS has a positive effect while WEALTH has a negative one.

Page 16: Education Project


Conclusion

Potential future analysis

Consider Region as a Factor: Either run regression models for a specific region or include region as a third-level random effect.

Cross Validation: Compare the accuracy rate of each candidate model.

Random Slopes: Treat the effects of school-related predictors as random.

Page 17: Education Project

Q&A

Thank You