December 1, 2015
Relationship Between Students’ Mathematics Performance
and Their Access to Computers & Internet
I. Background
II. Method
III. Selection
IV. Assumption
V. Conclusion
Agenda
I. Background: Data Collection, Data Summarization
II. Method (Analysis Method): Simple vs Mixed, Empty & Full Model
III. Selection (Model Selection): Method & Criteria, Desired Model
IV. Assumption (Assumption Diagnostics): Student-level Residual, School-level Residual
V. Conclusion: Conclusion & Potential Future Analysis
Background – Data Collection
Where are the data collected?
Programme for International Student Assessment (PISA)
Survey objects: 15-year-old students
Survey objective: Social, Cultural, Economic and Educational background; Reading, Mathematics and Science performance
Focus of this analysis: Mathematics
Regions: Hong Kong, Korea, Shanghai, Singapore and Taiwan
Background – Data Summarization
Data Summarization – Response Variables PV1MATH … PV5MATH
“Plausible values are a representation of the range of abilities that a student might reasonably have. Instead of directly estimating a student’s ability θ, a probability distribution for a student’s θ is estimated. That is, instead of obtaining a point estimate for θ, a range of possible values for a student’s θ, with an associated probability for each of these values is estimated. Plausible values are random draws from this (estimated) distribution for a student’s θ” (Wu and Adams, 2002)
Random draws from the same distribution!
Variable Min Max Mean
PV1MATH 183.99 924.84 573.89
PV2MATH 176.36 932.47 574.16
PV3MATH 195.75 888.85 573.97
PV4MATH 191.00 924.92 574.29
PV5MATH 181.11 868.21 574.45
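Because the five plausible values are draws from the same distribution, a statistic is commonly computed once per plausible value and the five results are then combined. The sketch below is an illustration only, not necessarily the procedure used in this analysis: it averages the five means from the table and measures how much they vary across plausible values. The function name `pool_estimates` is hypothetical.

```python
from statistics import mean, variance

# Means of PV1MATH..PV5MATH, copied from the summary table above.
pv_means = {
    "PV1MATH": 573.89,
    "PV2MATH": 574.16,
    "PV3MATH": 573.97,
    "PV4MATH": 574.29,
    "PV5MATH": 574.45,
}


def pool_estimates(estimates):
    """Return (pooled estimate, between-PV variance) for per-PV estimates.

    The pooled estimate is the simple average; the between-PV variance is
    the sample variance (divisor M - 1) of the per-PV estimates.
    """
    pooled = mean(estimates)
    between = variance(estimates)
    return pooled, between


pooled, between = pool_estimates(list(pv_means.values()))
print(f"pooled mean = {pooled:.3f}, between-PV variance = {between:.5f}")
```

A full standard error would also need the sampling variance of each per-PV estimate, which the table does not report, so the sketch stops at the between-PV component.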
Background – Data Summarization
Data Summarization – Predictors
Region \ SCHOOLID 001 002 003 004 005 …
Hong Kong, China 9 11 7 8 6 …
Korea 12 12 10 9 10 …
Shanghai, China 10 13 10 11 12 …
Singapore 11 11 11 10 10 …
Chinese Taipei 12 11 12 13 14 …
Total 54 58 50 51 52 …
Variable Name Description Value
IC01Q01 At Home - Desktop Computer 1 Yes, and I use it
2 Yes, but I don't use it
3 No
IC01Q02 At Home - Portable laptop 1 Yes, and I use it
2 Yes, but I don't use it
3 No
IC01Q04 At Home - Internet Connection 1 Yes, and I use it
2 Yes, but I don't use it
3 No
IC01Q05 At Home - Video games console 1 Yes, and I use it
2 Yes, but I don't use it
3 No
IC03Q01 First use of computers 1 6 years old or younger
2 7-9 years old
3 10-12 years old
4 13 years old or older
5 Never
IC04Q01 First access to Internet 1 6 years old or younger
2 7-9 years old
3 10-12 years old
4 13 years old or older
5 Never
HOMSCH ICT Use at Home for School-related Tasks (-2.44, 3.73)
HOMEPOS Home Possessions (-4.43, 4.07)
ICTATTNEG Attitudes towards Computers: Limitations of the Computer as a Tool for School Learning (-2.16, 2.41)
ICTATTPOS Attitudes towards Computers: Computer as a Tool for School Learning (-2.9, 1.3)
ICTHOME ICT Availability at Home (-4.02, 2.78)
ICTSCH ICT Availability at School (-2.80, 2.83)
Variable Name Description Value
IC08Q01 Out-of-school 8 - One player games. 1 Never or hardly ever
2 Once or twice a month
3 Once or twice a week
4 Almost every day
5 Every day
IC08Q02 Out-of-school 8 - Collaborative games. 1 Never or hardly ever
2 Once or twice a month
3 Once or twice a week
4 Almost every day
5 Every day
IC09Q06 Out-of-school 9 - Homework 1 Never or hardly ever
2 Once or twice a month
3 Once or twice a week
4 Almost every day
5 Every day
IC09Q07 Out-of-school 9 - Share school material 1 Never or hardly ever
2 Once or twice a month
3 Once or twice a week
4 Almost every day
5 Every day
IC10Q07 At School - Practice and drilling 1 Never or hardly ever
2 Once or twice a month
3 Once or twice a week
4 Almost every day
5 Every day
IC10Q08 At School - Homework 1 Never or hardly ever
2 Once or twice a month
3 Once or twice a week
4 Almost every day
5 Every day
IC10Q09 At School - Groupwork 1 Never or hardly ever
2 Once or twice a month
3 Once or twice a week
4 Almost every day
5 Every day
Variable Name Description Value
ST37Q01 Maths Self-Efficacy - Using a <Train Timetable> 1 Very confident
2 Confident
3 Not very confident
4 Not at all confident
other Missing
ST37Q02 Maths Self-Efficacy - Calculating TV discount 1 Very confident
2 Confident
3 Not very confident
4 Not at all confident
other Missing
ST37Q03 Maths Self-Efficacy - Calculating Square Metres of Tiles 1 Very confident
2 Confident
3 Not very confident
4 Not at all confident
other Missing
ST37Q04 Maths Self-Efficacy - Understanding Graphs in Newspapers 1 Very confident
2 Confident
3 Not very confident
4 Not at all confident
other Missing
ST37Q05 Maths Self-Efficacy - Solving Equation 1 1 Very confident
2 Confident
3 Not very confident
4 Not at all confident
other Missing
ST37Q06 Maths Self-Efficacy - Distance to Scale 1 Very confident
2 Confident
3 Not very confident
4 Not at all confident
other Missing
ST37Q07 Maths Self-Efficacy - Solving Equation 2 1 Very confident
2 Confident
3 Not very confident
4 Not at all confident
other Missing
ST37Q08 Maths Self-Efficacy - Calculating Petrol Consumption Rate 1 Very confident
2 Confident
3 Not very confident
4 Not at all confident
other Missing
Variable Name Description Value
ST42Q02 Maths Self-Concept - Not Good at Maths 1 Strongly agree
negative opinion 2 Agree
3 Disagree
4 Strongly disagree
other Missing
ST42Q04 Maths Self-Concept - Get Good <Grades> 1 Strongly agree
2 Agree
3 Disagree
4 Strongly disagree
other Missing
ST42Q06 Maths Self-Concept - Learn Quickly 1 Strongly agree
2 Agree
3 Disagree
4 Strongly disagree
other Missing
ST42Q07 Maths Self-Concept - One of Best Subjects 1 Strongly agree
2 Agree
3 Disagree
4 Strongly disagree
other Missing
ST42Q09 Maths Self-Concept - Understand Difficult Work 1 Strongly agree
2 Agree
3 Disagree
4 Strongly disagree
other Missing
Variable Name Description Value
MATEFF Math Self-Efficacy; Sum of reversed values of ST37Q01-ST37Q08 (8, 32) Integer
MATCON Math Self-concept; Sum of ST42Q02 and reversed values of ST42Q06, ST42Q07, ST42Q09 (5, 20) Integer
MTSUP Mathematics Teacher's Support (-2.86, 1.84)
INTMAT Mathematics Interest (-1.78, 2.29)
MATBEH Mathematics Behavior (-2.14, 4.42)
PERSEV Perseverance (-4.05, 3.53)
ESCS Social economic status (-3.88, -2.40)
WEALTH Wealth (-5.04, 3.13)
PARED Highest parental education in years (3, 16) Integer
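The MATEFF score can be illustrated with a short sketch. It assumes that "reversed" means mapping a 1-4 response to 5 minus the response, so that higher scores mean more confidence, and that a student with any missing item gets no score; both are assumptions, and the names `reverse` and `mateff` are hypothetical.

```python
# Items ST37Q01 .. ST37Q08, as listed in the variable table above.
MATEFF_ITEMS = [f"ST37Q{i:02d}" for i in range(1, 9)]
VALID_RESPONSES = {1, 2, 3, 4}  # any other code is treated as missing


def reverse(value, scale_max=4):
    """Reverse a 1..scale_max response so that 1 <-> 4 and 2 <-> 3."""
    return scale_max + 1 - value


def mateff(responses):
    """Sum of reversed ST37Q01..ST37Q08 responses, or None if any is missing."""
    values = [responses.get(item) for item in MATEFF_ITEMS]
    if any(v not in VALID_RESPONSES for v in values):
        return None  # assumption: no score when an item is missing
    return sum(reverse(v) for v in values)


very_confident = {item: 1 for item in MATEFF_ITEMS}
not_confident = {item: 4 for item in MATEFF_ITEMS}
print(mateff(very_confident), mateff(not_confident))  # prints "32 8"
```

The two extreme response patterns reproduce the (8, 32) range given in the table.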
Variables’ Nature Variables’ Name
IT related IC01Q01, IC01Q02, IC01Q04, IC01Q05, IC03Q01, IC04Q01, HOMSCH, HOMEPOS, ICTATTNEG, ICTATTPOS, ICTHOME, ICTSCH
Math related MATEFF, MATCON, MTSUP, INTMAT, MATBEH
Characteristics & SES PERSEV, ESCS, WEALTH, PARED
Other IC08Q01, IC08Q02, IC09Q06, IC09Q07, IC10Q07, IC10Q08, IC10Q09
Method – Simple vs Mixed
Simple Linear Regression vs. Mixed Linear Regression
Simple Linear Regression: Population → Students
Mixed Linear Regression: Population → Schools → Students
Students are nested within schools, so their scores are not independent. SLR ignores this clustering, systematically underestimates the standard errors, and can therefore report nonsignificant results as significant.
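A rough sense of the size of this problem comes from the design effect for a sample mean with equal cluster sizes. In the worked example below, the cluster size m = 10 is an assumed round number, and ρ ≈ 0.44 is the ICC reported for the empty model later in this presentation.

```latex
\[
\mathrm{Var}_{\text{clustered}}(\bar{Y}) \approx
\mathrm{Var}_{\text{independent}}(\bar{Y}) \times \bigl[\, 1 + (m - 1)\rho \,\bigr]
\]
\[
1 + (10 - 1)(0.44) = 4.96, \qquad \sqrt{4.96} \approx 2.23
\]
```

Under these assumptions, a standard error that ignores clustering would be too small by a factor of roughly 2.2. The exact factor for a regression coefficient depends on how the predictor itself varies between schools.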
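Method – Empty vs Full
Decomposition of the variance in the empty model
Level-1: $Y_{ij} = \beta_{0j} + \varepsilon_{ij}$, where $Y_{ij}$ represents the math performance of student i in school j, $\beta_{0j}$ is the school’s mean and $\varepsilon_{ij}$ is the student residual.
Level-2: $\beta_{0j} = \gamma_{00} + u_{0j}$, where $\gamma_{00}$ is the grand mean and $u_{0j}$ is a random effect, representing school j’s departure from the overall intercept.
Reduced: $Y_{ij} = \gamma_{00} + u_{0j} + \varepsilon_{ij}$.
Assumption: $u_{0j} \sim N(0, \tau^2)$ and $\varepsilon_{ij} \sim N(0, \sigma^2)$, independent of each other.
ICC: $\rho = \tau^2 / (\tau^2 + \sigma^2)$. If ICC=0, a mixed linear regression would be mathematically equal to simple linear regression.
With an ICC of about 0.44 for every plausible value, considering school as a random effect is reasonable.
Response Between-School Variance (τ²) Within-School Variance (σ²) ICC (ρ)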
PV1MATH 4924.36 6332.00 0.43747
PV2MATH 4969.53 6363.53 0.43850
PV3MATH 4970.36 6355.25 0.43886
PV4MATH 4954.66 6369.43 0.43753
PV5MATH 4939.93 6379.95 0.43639
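The ICC column can be reproduced from the two variance components with the usual definition, ICC = τ² / (τ² + σ²). The sketch below recomputes it from the table values; the function name `icc` is hypothetical.

```python
# (between-school variance tau^2, within-school variance sigma^2),
# copied from the empty-model table above.
variance_components = {
    "PV1MATH": (4924.36, 6332.00),
    "PV2MATH": (4969.53, 6363.53),
    "PV3MATH": (4970.36, 6355.25),
    "PV4MATH": (4954.66, 6369.43),
    "PV5MATH": (4939.93, 6379.95),
}


def icc(tau2, sigma2):
    """Intraclass correlation: share of total variance lying between schools."""
    return tau2 / (tau2 + sigma2)


iccs = {name: icc(tau2, sigma2)
        for name, (tau2, sigma2) in variance_components.items()}

for name, rho in iccs.items():
    print(f"{name}: ICC = {rho:.5f}")
```

All five values come out near 0.44, so roughly 44% of the variance in math scores lies between schools.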
Method – Empty vs Full
Mixed linear regression model with independent variables
Level-1: $Y_{ij} = \beta_{0j} + \sum_{k} \beta_{kj} x_{ik} + \varepsilon_{ij}$, where $Y_{ij}$ represents the math performance of student i, $\beta_{0j}$ is the school’s mean, $\beta_{kj}$ is the parameter for the kth predictor, $x_{ik}$ is the corresponding value of the kth predictor for student i, and $\varepsilon_{ij}$ is the student residual.
Level-2: $\beta_{0j} = \gamma_{00} + u_{0j}$, $\beta_{1j} = \gamma_{10}$, …, $\beta_{kj} = \gamma_{k0}$, where $\gamma_{00}$ is the grand mean, $u_{0j}$ is a random effect representing school j’s departure from the overall intercept, and $\gamma_{k0}$ is the fixed parameter for the kth predictor.
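Substituting Level-2 into Level-1 gives the combined form of the model with predictors. The sketch below also writes out the variance structure it implies, assuming the random effect and the residual are normal and independent of each other, as in the empty model; the symbols τ² and σ² follow the covariance parameters reported for these models.

```latex
\[
Y_{ij} = \gamma_{00} + \sum_{k} \gamma_{k0}\, x_{ik} + u_{0j} + \varepsilon_{ij},
\qquad u_{0j} \sim N(0, \tau^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)
\]
\[
\mathrm{Var}(Y_{ij} \mid x) = \tau^2 + \sigma^2, \qquad
\mathrm{Cov}(Y_{ij}, Y_{i'j} \mid x) = \tau^2 \quad (i \neq i')
\]
```

Two students in the same school share $u_{0j}$, which is the source of the within-school correlation that simple linear regression ignores.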
Method – Empty vs Full
Mixed linear regression model with independent variables
Covariance Parameter Estimate
Intercept (τ²) 12098
Residual (σ²) 4629.72
Effect Estimate StdErr t-value p-value
Intercept 332.73 11.7278 28.37 <.0001
IC01Q01_1 6.7786 2.4343 2.78 0.0054
IC01Q01_2 1.9977 3.0927 0.65 0.5183
IC01Q01_3 0 . . .
IC01Q02_1 4.7618 2.1474 2.22 0.0266
IC01Q02_2 5.395 2.4153 2.23 0.0255
IC01Q02_3 0 . . .
IC01Q04_1 24.679 4.0019 6.17 <.0001
IC01Q04_2 -0.7676 5.616 -0.14 0.8913
IC01Q04_3 0 . . .
IC01Q05_1 -6.9334 2.044 -3.39 0.0007
IC01Q05_2 -2.2165 2.4426 -0.91 0.3642
IC01Q05_3 0 . . .
IC03Q01 -7.3692 1.4954 -4.93 <.0001
IC04Q01 -1.0315 1.4877 -0.69 0.4881
HOMSCH 4.1807 1.3411 3.12 0.0018
HOMEPOS 15.6536 2.3361 6.7 <.0001
ICTATTNEG -5.9194 0.893 -6.63 <.0001
ICTATTPOS -3.9975 0.865 -4.62 <.0001
ICTHOME -9.8205 1.489 -6.6 <.0001
ICTSCH -3.2019 0.8978 -3.57 0.0004
IC08Q01 0.8161 0.7356 1.11 0.2673
IC08Q02 -0.976 0.6672 -1.46 0.1435
IC09Q06 2.0749 1.1003 1.89 0.0594
IC09Q07 -5.1988 0.9643 -5.39 <.0001
IC10Q07 -6.3855 1.2294 -5.19 <.0001
IC10Q08 -4.3954 1.2332 -3.56 0.0004
IC10Q09 -4.0328 1.3344 -3.02 0.0025
MTSUP -3.3892 0.8347 -4.06 <.0001
INTMAT -4.6086 1.1261 -4.09 <.0001
MATBEH 3.4924 1.0623 3.29 0.001
PERSEV -4.6687 0.9823 -4.75 <.0001
MATEFF 8.0484 0.2012 40.01 <.0001
MATCON 5.1632 0.3125 16.52 <.0001
ESCS 13.4303 2.393 5.61 <.0001
WEALTH -20.099 2.075 -9.69 <.0001
PARED -1.0796 0.5708 -1.89 0.0586
The full model currently has 37 fixed parameters (73 if ordinal variables are treated as categorical), which is too many to describe in detail.
Next step: Model Selection
Selection – Method & Criteria
Model selection criteria and method
Method: Forward, Backward, Stepwise, LASSO
Criteria: AIC, BIC
Attention: We use the SAS procedure PROC GLMSELECT. It is designed for simple linear regression, so it ignores the school-level random effect and may give biased results.
Goal: A subset of the most influential predictors that leads to the smallest AIC and BIC.
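Both criteria trade goodness of fit against the number of parameters. The sketch below uses the standard textbook definitions, AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L; statistical software may use variants. The two candidate models and their log-likelihoods are made-up numbers for illustration only.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L (smaller is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical candidates: (maximized log-likelihood, number of parameters).
candidates = {"small": (-520.0, 4), "large": (-515.0, 12)}
n_obs = 200  # hypothetical sample size

for name, (loglik, k) in candidates.items():
    print(f"{name}: AIC = {aic(loglik, k):.1f}, BIC = {bic(loglik, k, n_obs):.1f}")

best_by_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_by_bic = min(candidates, key=lambda m: bic(*candidates[m], n_obs))
```

In this toy case the larger model improves the log-likelihood by only 5 while adding 8 parameters, so both criteria prefer the smaller model. BIC penalizes each extra parameter more heavily than AIC once n exceeds about 7.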
Selection – Method & Criteria
Model selection criteria and method
Selection results (1 = selected, 0 = not selected). Columns, in order: (1) Stepwise Categorical AIC, (2) Stepwise Categorical BIC, (3) Stepwise Continuous AIC, (4) Stepwise Continuous BIC, (5) LASSO Categorical AIC, (6) LASSO Categorical BIC, (7) LASSO Continuous AIC, (8) LASSO Continuous BIC, then the row sum.
Variable 1 2 3 4 5 6 7 8 Sum
IC01Q01 1 1 1 1 0 0 0 0 4
IC01Q02 1 1 1 1 0 0 0 0 4
IC01Q04 1 1 1 1 1 1 1 1 8
IC01Q05 1 1 1 1 1 1 1 1 8
IC03Q01 1 1 1 1 1 1 1 1 8
IC04Q01 1 1 0 0 0 0 0 0 2
IC08Q01 1 1 1 1 0 0 0 0 4
IC08Q02 1 1 1 1 1 1 1 1 8
IC09Q06 1 1 1 1 0 0 0 1 5
IC09Q07 1 1 1 1 0 0 0 0 4
IC10Q07 1 1 1 1 1 1 1 1 8
IC10Q08 1 1 1 1 1 1 1 1 8
IC10Q09 1 1 0 0 1 1 0 0 4
HOMSCH 1 1 1 1 0 0 0 1 5
HOMEPOS 1 1 1 1 0 0 0 1 5
ICTATTNEG 1 1 1 1 1 1 1 1 8
ICTATTPOS 1 1 1 1 0 0 0 1 5
ICTHOME 1 1 1 1 0 0 0 1 5
ICTSCH 1 1 1 1 1 1 1 1 8
MTSUP 1 1 1 1 1 1 1 1 8
INTMAT 1 1 1 1 0 0 0 0 4
MATBEH 1 1 1 1 0 0 0 1 5
PERSEV 1 1 1 1 0 0 0 1 5
MATEFF 1 1 1 1 1 1 1 1 8
MATCON 1 1 1 1 1 1 1 1 8
ESCS 1 1 1 1 1 1 1 1 8
WEALTH 1 1 1 1 0 0 0 1 5
PARED 0 0 0 0 0 0 0 0 0
Sum 27 27 25 25 13 13 12 20 -
Stepwise tends to select nearly all variables, while LASSO selects only about half of them.
AIC and BIC tend to give the same selection result, especially for the stepwise method.
Model | Description | AIC | BIC | Intercept Variance | Residual Variance
Full Categorical | Significant predictors when running the full mixed-effect model; all categorical variables are treated as categorical | 93587.7 | 93597.0 | 12391 | 4629.04
Full Continuous | Significant predictors when running the full mixed-effect model; ordinal variables are treated as continuous | 93672.7 | 93682.0 | 12080 | 4627.23
LASSO Categorical | A subset of predictors selected based on LASSO; all categorical variables are treated as categorical | 93825.2 | 93834.6 | 13343 | 4831.31
LASSO Continuous | A subset of predictors selected based on LASSO; ordinal variables are treated as continuous | 93961.2 | 93970.6 | 13392 | 4856.26
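The comparison can be read off mechanically by ranking the four candidates on each criterion. The sketch below copies the AIC and BIC values from the table and reports each model's gap to the best AIC; the variable names are hypothetical.

```python
# (AIC, BIC) for each candidate model, copied from the comparison table above.
models = {
    "Full Categorical": (93587.7, 93597.0),
    "Full Continuous": (93672.7, 93682.0),
    "LASSO Categorical": (93825.2, 93834.6),
    "LASSO Continuous": (93961.2, 93970.6),
}

# Smaller is better for both criteria.
best_by_aic = min(models, key=lambda name: models[name][0])
best_by_bic = min(models, key=lambda name: models[name][1])

# Difference from the best AIC, rounded to the table's precision.
best_aic_value = models[best_by_aic][0]
delta_aic = {name: round(aic_value - best_aic_value, 1)
             for name, (aic_value, _) in models.items()}

print("Best by AIC:", best_by_aic)
print("Best by BIC:", best_by_bic)
for name, delta in delta_aic.items():
    print(f"{name}: delta AIC = {delta}")
```

Both criteria point to the Full Categorical model, which is the one described as the desired model.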
Selection – Desired Model
Full categorical model description
Effect Estimate StdErr t-value p-value
Intercept 125.25 16.7619 7.47 <.0001
IC01Q01_1 6.4997 2.4335 2.67 0.0076
IC01Q01_2 1.741 3.0968 0.56 0.574
IC01Q01_3 0 . . .
IC01Q02_1 5.8199 2.1472 2.71 0.0067
IC01Q02_2 6.2927 2.4142 2.61 0.0092
IC01Q02_3 0 . . .
IC01Q04_1 20.7217 4.0905 5.07 <.0001
IC01Q04_2 -4.1907 5.675 -0.74 0.4603
IC01Q04_3 0 . . .
IC01Q05_1 -7.4712 2.0368 -3.67 0.0002
IC01Q05_2 -1.9111 2.4465 -0.78 0.4347
IC01Q05_3 0 . . .
IC04Q01_1 67.9177 12.4331 5.46 <.0001
IC04Q01_2 66.5147 12.2855 5.41 <.0001
IC04Q01_3 59.1226 12.2498 4.83 <.0001
IC04Q01_4 48.7161 12.3197 3.95 <.0001
IC04Q01_5 0 . . .
HOMSCH 4.937 1.1556 4.27 <.0001
HOMEPOS 17.3709 2.2114 7.86 <.0001
ICTATTNEG -6.0695 0.892 -6.8 <.0001
ICTATTPOS -3.8257 0.8625 -4.44 <.0001
ICTHOME -9.6127 1.491 -6.45 <.0001
ICTSCH -3.4807 0.9017 -3.86 0.0001
IC08Q01_1 7.0929 4.1351 1.72 0.0863
IC08Q01_2 10.7873 4.2503 2.54 0.0112
IC08Q01_3 15.0561 4.2248 3.56 0.0004
IC08Q01_4 6.5526 4.6899 1.4 0.1624
IC08Q01_5 0 . . .
IC09Q07_1 18.9441 5.8068 3.26 0.0011
IC09Q07_2 11.8654 5.6051 2.12 0.0343
IC09Q07_3 6.7707 5.5886 1.21 0.2257
IC09Q07_4 5.6896 6.0801 0.94 0.3494
IC09Q07_5 0 . . .
IC10Q08_1 27.9892 10.4353 2.68 0.0073
IC10Q08_2 25.5901 10.5098 2.43 0.0149
IC10Q08_3 17.3405 10.4641 1.66 0.0975
IC10Q08_4 16.3155 11.2176 1.45 0.1459
IC10Q08_5 0 . . .
IC10Q09_1 42.84 11.3737 3.77 0.0002
IC10Q09_2 38.1787 11.4457 3.34 0.0009
IC10Q09_3 34.2868 11.5187 2.98 0.0029
IC10Q09_4 24.776 12.5759 1.97 0.0489
IC10Q09_5 0 . . .
MTSUP -3.5957 0.8351 -4.31 <.0001
INTMAT -4.8686 1.125 -4.33 <.0001
MATBEH 3.0672 1.0622 2.89 0.0039
PERSEV -4.6498 0.9831 -4.73 <.0001
MATEFF 8.0155 0.2013 39.82 <.0001
MATCON 5.2564 0.3112 16.89 <.0001
ESCS 9.9571 1.3094 7.6 <.0001
WEALTH -20.324 2.0693 -9.82 <.0001
Assumption – Student-level Residual
Assumption diagnostic for Student-level residuals (εij)
Normality Assumption Diagnostic
[Figures: Normality Assumption Diagnostic - Graphic; Normality Assumption Diagnostic - Summary Statistics]
Scatterplot: The 8,000 points are spread evenly around the mean.
Q-Q plot: The residuals align closely with the normal quantiles.
The diagnostics show no evidence that the normality assumption for student-level residuals is violated.
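A Q-Q check of this kind can be sketched without plotting: sort the residuals, pair them with standard-normal quantiles, and see how close the pairs are to a straight line. The residuals below are simulated, not the study's, with a standard deviation of 68 chosen only because it is close to the square root of the reported residual variance. The function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean


def qq_points(residuals):
    """Pair each sorted residual with the matching standard-normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n stay strictly inside (0, 1).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


def qq_correlation(residuals):
    """Pearson correlation of the Q-Q points (close to 1 for normal data)."""
    points = qq_points(residuals)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
simulated = [rng.gauss(0, 68) for _ in range(8000)]  # stand-in residuals
print(f"Q-Q correlation: {qq_correlation(simulated):.4f}")
```

A correlation very close to 1 corresponds to the tight alignment described for the Q-Q plot; clearly skewed residuals would give a visibly lower value.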
Assumption – School-level Residual
Assumption diagnostic for School-level residuals (u0j)
Normality Assumption Diagnostic
The diagnostics show no evidence that the normality assumption for school-level residuals is violated.
Conclusion
Conclusions
The overall analysis of PV1MATH … PV5MATH indicates a significant relationship between students’ mathematics performance and their access to IT products.
Predictor Description Effect
IC01Q04 At Home - Internet Connection +
IC04Q01 First access to Internet +
IC01Q05 At Home - Video games console -
IC08Q01 Out-of-school 8 - One-player games. -
IC10Q08 At School - Homework -
HOMEPOS Home Possessions +
ICTATTNEG Attitudes towards Computers: Limitations of the Computer as a Tool for School Learning -
ICTATTPOS Attitudes towards Computers: Computer as a Tool for School Learning -
ICTHOME ICT Availability at Home -
ICTSCH ICT Availability at School -
MTSUP Mathematics Teacher's Support -
MATBEH Mathematics Behavior +
MATEFF Math Self-Efficacy +
MATCON Math Self-concept +
PERSEV Perseverance -
ESCS Social economic status +
WEALTH Wealth -
IC01Q04 & IC04Q01: Students’ access to the Internet is associated with better mathematics performance.
IC01Q05: Having and using a video games console is associated with lower math performance.
ICTATTNEG & ICTATTPOS: Stronger attitudes towards computers, whether positive or negative, are associated with lower math performance.
MATEFF & MATCON: Stronger math self-efficacy and self-concept are associated with better performance.
PERSEV: Perseverance has a negative relationship with math performance, which is unexpected.
ESCS & WEALTH: The two effects contradict each other: ESCS is positive while WEALTH is negative.
Conclusion
Potential future analysis
Consider Region as a Factor: Either run separate regression models for each region or include region as a third-level random effect.
Cross Validation: Compare the prediction accuracy of each candidate model.
Random Slopes: Treat the effects of school-related predictors as random, so that they can vary across schools.
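As a sketch of what the random-slopes extension could look like, the model below lets the coefficient of a single predictor x vary across schools. The symbols $u_{1j}$, $\tau_1^2$ and $\tau_{01}$ are introduced here for illustration and do not appear in the fitted models above.

```latex
\[
Y_{ij} = \beta_{0j} + \beta_{1j}\, x_{ij} + \varepsilon_{ij}, \qquad
\beta_{0j} = \gamma_{00} + u_{0j}, \qquad
\beta_{1j} = \gamma_{10} + u_{1j}
\]
\[
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
\sim N\!\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \tau_{0}^{2} & \tau_{01} \\ \tau_{01} & \tau_{1}^{2} \end{pmatrix}
\right)
\]
```

Compared with the random-intercept model, this adds two covariance parameters per random slope (a slope variance and an intercept-slope covariance), so it is usually applied to a small number of predictors.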
Q&A
Thank You