Analysis of Case-Cohort Designs
The methods of analysis, including the different weighting schemes used in estimation.
J Clin Epidemiol (1999) Vol. 52, No. 12
Date: 2017/8/8
Presenter: THJ
Nested case-control (NCC) study
• A variation of a case-control study in which only a subset of the non-cases (controls) from the cohort is compared with the incident cases.
▫ Matching: for a 1:m nested design, m controls are selected at random from those available at the time point at which each case occurs.
▫ Often used when the exposure of interest is difficult or expensive to obtain and when the outcome is rare.
▫ Reduces selection and information bias.
▫ Analyzed by conditional logistic regression (with stratification by time).
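The 1:m matching step can be sketched as risk-set sampling. This is a minimal illustration, not the presenter's code: the function name `sample_ncc_controls`, the record layout (`id`, `time`, `event`), and the toy cohort are all hypothetical, and everyone is assumed to enter follow-up at time 0.

```python
import random

def sample_ncc_controls(subjects, m, seed=0):
    """Draw up to m controls per case from that case's risk set.

    subjects: list of dicts with 'id', 'time' (failure or censoring time)
    and 'event' (1 = case, 0 = censored). Hypothetical layout.
    """
    rng = random.Random(seed)
    matched = {}
    for case in (s for s in subjects if s["event"] == 1):
        # Risk set: everyone still under follow-up at the case's failure
        # time, excluding the case itself. Later cases may serve as controls.
        risk_set = [s["id"] for s in subjects
                    if s["time"] >= case["time"] and s["id"] != case["id"]]
        matched[case["id"]] = rng.sample(risk_set, min(m, len(risk_set)))
    return matched

# Toy cohort (made-up values for illustration only).
cohort = [
    {"id": 1, "time": 2.0, "event": 1},
    {"id": 2, "time": 5.0, "event": 0},
    {"id": 3, "time": 3.5, "event": 1},
    {"id": 4, "time": 6.0, "event": 0},
    {"id": 5, "time": 1.0, "event": 0},
]
matched_sets = sample_ncc_controls(cohort, m=2)
```

Each matched set then forms one stratum in the conditional logistic regression.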
Case-cohort study
• Similar to an NCC study in that the cases and non-cases come from within a parent cohort identified at time t1, after baseline (t0). The cohort members were assessed for risk factors at any time prior to t1.
▫ Most useful for analyzing time to failure in a large cohort in which failure is rare.
▫ Non-cases are randomly selected from the parent cohort.
▫ No matching is performed.
▫ Analyzed by a weighted Cox proportional hazards regression model.
• Consider a cohort in which the time of entry is known and each individual is followed up until failure or censoring. At entry, an individual may be included in a random sample of the cohort (termed the “subcohort”) with probability α.
However, the potential to assess covariates for all members of the cohort must exist, as one does not know in advance which individuals will fail.
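The sampling described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function `draw_case_cohort`, the record fields, and the toy cohort are hypothetical; each individual enters the subcohort independently with probability α, and every case is added to the analysis set whether or not it was sampled.

```python
import random

def draw_case_cohort(subjects, alpha, seed=0):
    """Return (subcohort ids, ids needing covariate assessment)."""
    rng = random.Random(seed)
    # Subcohort: independent inclusion with probability alpha at entry.
    subcohort = {s["id"] for s in subjects if rng.random() < alpha}
    # All cases are analyzed, including those outside the subcohort.
    cases = {s["id"] for s in subjects if s["event"] == 1}
    return subcohort, subcohort | cases

# Toy cohort of 100 people in which every tenth person fails (made-up data).
cohort = [{"id": i, "event": 1 if i % 10 == 0 else 0} for i in range(1, 101)]
subcohort_ids, analysis_ids = draw_case_cohort(cohort, alpha=0.2)
```

Covariates need to be assessed only for `analysis_ids`, which is where the design saves cost when failure is rare.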
Case-Cohort Analysis
• Weighted Cox regression model
• Partial likelihood → pseudo-likelihood
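If person i fails at time tj then the contribution to the partial likelihood, assuming no tied failure times, is person i's relative-risk term divided by the weighted sum of the relative-risk terms of those at risk at tj.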
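One common way to write this contribution is shown below. The notation is an assumption, since none of these symbols appear in the slides: z_k(t_j) is the covariate vector of person k, Y_k(t_j) is 1 if person k is at risk at t_j and 0 otherwise, and w_k(t_j) is the sampling weight assigned to person k at t_j.

```latex
\[
L_i(\beta) \;=\;
\frac{\exp\{\beta^{\top} z_i(t_j)\}}
     {\sum_{k=1}^{n} Y_k(t_j)\, w_k(t_j)\, \exp\{\beta^{\top} z_k(t_j)\}}
\]
```

With every weight equal to 1 and the sum taken over the full cohort, this reduces to the ordinary Cox partial likelihood contribution; in a case-cohort sample the weights depend on the scheme used.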
Three different weighting methods
• Unweighted
• Random sample denominator
• Weighted denominator
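A sketch of how the denominator weights might be assigned, following the way the Prentice, Self & Prentice, and Barlow schemes are commonly described. The pairing of those names with the three labels above (in that order) is an assumption, as are the function name and its arguments.

```python
def case_cohort_weight(scheme, in_subcohort, is_case, at_failure, alpha):
    """Weight of one person in the risk-set denominator at one time point.

    scheme: 'prentice', 'self_prentice', or 'barlow' (assumed labels).
    at_failure: True only at the person's own failure time.
    alpha: subcohort sampling fraction.
    """
    if scheme not in ("prentice", "self_prentice", "barlow"):
        raise ValueError(f"unknown scheme: {scheme}")
    if not in_subcohort:
        # Outside the subcohort, only a case at its own failure time can count.
        if is_case and at_failure:
            return 0.0 if scheme == "self_prentice" else 1.0
        return 0.0
    if scheme in ("prentice", "self_prentice"):
        # Subcohort members are unweighted under both schemes.
        return 1.0
    # Barlow: a failing case counts once; other subcohort person-time is
    # inflated by the inverse sampling fraction.
    return 1.0 if (is_case and at_failure) else 1.0 / alpha

# Example: a subcohort non-case under each scheme with alpha = 0.2.
weights = {s: case_cohort_weight(s, True, False, False, 0.2)
           for s in ("prentice", "self_prentice", "barlow")}
```

The schemes differ only in how a case outside the subcohort is treated at its failure time and in whether subcohort person-time is scaled by 1/α.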
Robust estimation of the variance
• The usual standard errors (SEs) are usually underestimated.
• The score contributions from the pseudo-likelihood maximization are not independent owing to the method of sampling.
• Barlow and Prentice give a formula for computing the estimated change in $\hat{\beta}$ if the ith individual is deleted. This change in $\hat{\beta}$ is denoted $\Delta_i\hat{\beta}$ and is a p-dimensional vector for p covariates. The robust variance estimate is $\hat{V} = \sum_i \Delta_i\hat{\beta}\,(\Delta_i\hat{\beta})^{\top}$.
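Given the per-individual changes in the coefficient estimate, a robust variance can be formed as the sum of their outer products. The sketch below assumes that form; the function name `robust_variance` and the numbers in `dfbeta` are made up for illustration, and computing the changes themselves is outside its scope.

```python
import math

def robust_variance(dfbeta):
    """Sum of outer products of the rows of an n x p matrix of changes."""
    p = len(dfbeta[0])
    v = [[0.0] * p for _ in range(p)]
    for row in dfbeta:
        for a in range(p):
            for b in range(p):
                v[a][b] += row[a] * row[b]
    return v

# Hypothetical changes for 3 individuals and p = 2 covariates.
dfbeta = [
    [0.10, -0.02],
    [-0.05, 0.03],
    [0.02, 0.01],
]
v = robust_variance(dfbeta)
# Robust standard errors are the square roots of the diagonal.
robust_se = [math.sqrt(v[k][k]) for k in range(len(v))]
```

The result is a p × p matrix, so the cost grows with the number of covariates rather than with the complexity of the sampling design.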
Example
• Welsh nickel refinery workers and subsequent development of nasal cancer.
▫ 56 nasal cancer cases among the 683 employees.
▫ 4 individuals were censored before the first failure.
▫ Analytic data: sample size = 679, α = 0.2.
▫ 135 subjects in the subcohort (9 cases). The remaining 47 cases were outside the subcohort, giving 182 evaluations, or about 27% of the cohort.
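The counts in this example can be checked with a few lines of arithmetic. The variable names are illustrative; the numbers are the ones quoted above.

```python
cohort_size = 679        # analytic sample after excluding early censoring
subcohort_size = 135     # sampled with alpha = 0.2
subcohort_cases = 9
total_cases = 56

# Cases that were not sampled into the subcohort.
cases_outside = total_cases - subcohort_cases
# People whose covariates must be evaluated: subcohort plus outside cases.
evaluated = subcohort_size + cases_outside
# Share of the analytic cohort that needs evaluation.
fraction_evaluated = evaluated / cohort_size
```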
SAS code http://lib.stat.cmu.edu/general/robphreg
Example results
Simulation Study (Example Continued)
• For each sampling scheme, 200 samples were drawn from the full cohort.
▫ Case-cohort: samples of 10% to 90% were drawn.
▫ Nested case-control: sampling was performed using 1 to 70 controls per case.
[Figure panels: Self & Prentice; Barlow]
Discussion & Conclusion
• The NCC design has some advantages over case-cohort designs:
▫ It is more readily understood, and the method of analysis is simple.
▫ If one could assume that the controls chosen were a random sample, it would not be necessary to enumerate the entire population.
▫ The usual standard errors are correct, and the parameter estimates are unbiased.
Discussion & Conclusion (continued)
• Case-cohort sampling from the full cohort was more efficient than using a comparable NCC design.
• The robust variance approach may be extended to complicated sampling designs in which direct computation of the correct asymptotic variance is difficult.
• The optimal weighting scheme is still unclear.
• Case-cohort designs are recommended when flexibility is desired.