Analysis of Case-Cohort Designs
The methods of analysis including
different weighting schemes used
J Clin Epidemiol (1999) Vol. 52, No. 12
Date : 2017/8/8
Nested case-control study
• A variation of a case-control study in which only a subset
of the non-cases (controls) from the cohort is compared with
the incident cases.
▫ Matching (For a 1:m nested design, m controls are selected
at random from those available at that time point.)
▫ Often used when the exposure of interest is difficult or
expensive to obtain and when the outcome is rare.
▫ Reduces selection and information bias.
▫ Analyzed with conditional logistic regression (stratified by time).
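The 1:m control selection described above can be sketched as risk-set sampling. This is a minimal illustration, not the paper's procedure: the function name, the tuple layout, and the toy cohort are hypothetical, and everyone is assumed to enter follow-up at time 0.

```python
import random

def sample_nested_controls(subjects, m, seed=0):
    """Pick up to m controls per case from those still at risk at the case's failure time.

    subjects: list of (subject_id, exit_time, is_case) tuples (hypothetical layout).
    Returns a dict mapping each case id to its list of sampled control ids.
    """
    rng = random.Random(seed)
    matched = {}
    for case_id, t_fail, is_case in subjects:
        if not is_case:
            continue
        # Risk set: everyone else whose follow-up has not ended before t_fail.
        # Subjects who fail later are still eligible as controls here.
        risk_set = [sid for sid, t_exit, _ in subjects
                    if sid != case_id and t_exit >= t_fail]
        matched[case_id] = rng.sample(risk_set, min(m, len(risk_set)))
    return matched

# Toy cohort: subjects 1 and 3 are cases; the rest are censored.
cohort = [(1, 2.0, True), (2, 5.0, False), (3, 3.5, True),
          (4, 6.0, False), (5, 1.0, False)]
sets = sample_nested_controls(cohort, m=2)
```

Each matched set would then form one stratum in the conditional logistic regression.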
Case-Cohort study
• Similar to an NCC study in that the cases and non-cases
are within a parent cohort identified at time t1, after
baseline (t0). The cohort members are assessed for risk
factors at any time prior to t1.
▫ Most useful in analyzing time to failure in a large cohort in
which failure is rare.
▫ Non-cases are randomly selected from the parent cohort.
▫ No matching is performed.
▫ Weighted Cox proportional hazards regression model.
• Consider a cohort in which the time of entry is known and each
individual is followed up until failure or censoring. At the time of entry,
an individual may be included in a random sample of the cohort
(termed the “subcohort”) with probability α.
However, it must be possible to assess covariates for every member of the
cohort, because one does not know in advance which individuals will fail.
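A minimal sketch of this sampling step, assuming independent inclusion of each member with probability α. The function name and the cohort size used here are illustrative only and are not taken from the slides.

```python
import random

def draw_case_cohort(ids, case_ids, alpha, seed=0):
    """Return (subcohort, analytic_sample) as sets of ids.

    Each cohort member enters the subcohort independently with probability alpha.
    The analytic sample is the subcohort plus every case, wherever it occurs.
    """
    rng = random.Random(seed)
    subcohort = {i for i in ids if rng.random() < alpha}
    analytic_sample = subcohort | set(case_ids)
    return subcohort, analytic_sample

# Hypothetical cohort of 1000 members, of whom ids 1..50 eventually fail.
ids = list(range(1, 1001))
cases = list(range(1, 51))
subcohort, analytic = draw_case_cohort(ids, cases, alpha=0.2)
```

Covariates then need to be collected only for the members of `analytic`, which is why the design is attractive when exposure assessment is expensive.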
• Weighted Cox regression model
• Partial likelihood → Pseudo-likelihood
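If person i fails at time tj, then the contribution to the partial
likelihood, assuming no tied failure times, is
$$\frac{\exp\{\beta^{\top} z_i(t_j)\}}{\sum_{k} Y_k(t_j)\,\exp\{\beta^{\top} z_k(t_j)\}}$$
where Y_k(tj) = 1 if person k is at risk at tj (0 otherwise), z_k is the covariate vector, and β is the vector of regression coefficients.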
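For the case-cohort pseudo-likelihood, the same kind of contribution can be written with sampling weights in the denominator. The form below is a sketch under assumed notation that is not shown on the slide: $\tilde{C}$ is the set of sampled individuals (the subcohort plus all cases), $Y_k(t_j)$ is the at-risk indicator, $z_k(t_j)$ is the covariate vector, and $w_k(t_j)$ is the weight assigned by the chosen weighting scheme.

```latex
\[
  \frac{\exp\{\beta^{\top} z_i(t_j)\}}
       {\sum_{k \in \tilde{C}} Y_k(t_j)\, w_k(t_j)\, \exp\{\beta^{\top} z_k(t_j)\}}
\]
```

Setting every weight to 1 and summing over the full cohort gives the ordinary Cox partial likelihood contribution.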
Three different weighting methods
Unweighted Random sample
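The weights themselves can be sketched as a small lookup. This assumes the three schemes are the ones usually attributed to Prentice, to Self and Prentice, and to Barlow in the case-cohort literature. The scheme labels, the function name, and the argument names are hypothetical, and the values should be checked against the paper's table.

```python
def weight(scheme, in_subcohort, is_case, at_failure, alpha):
    """Weight of one individual in the pseudo-likelihood denominator at a given time.

    scheme: "prentice", "self_prentice", or "barlow" (labels chosen for this sketch).
    at_failure: True only when evaluating a case at its own failure time.
    alpha: subcohort sampling fraction.
    """
    if not in_subcohort:
        # Individuals outside the subcohort appear only if they are cases,
        # and then only at their own failure time.
        if is_case and at_failure:
            return 0.0 if scheme == "self_prentice" else 1.0
        return 0.0
    if scheme == "barlow":
        # Subcohort members stand in for 1/alpha cohort members,
        # except a case at its own failure time, which counts once.
        return 1.0 if (is_case and at_failure) else 1.0 / alpha
    # Prentice and Self-Prentice leave subcohort members unweighted.
    return 1.0

w_control = weight("barlow", in_subcohort=True, is_case=False, at_failure=False, alpha=0.2)
```

Under these assumptions, the schemes differ only in how they treat a non-subcohort case at its failure time and in whether subcohort members are inflated by 1/α.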
Robust estimation of the variance
• SEs are usually underestimated.
• The score contributions from the pseudo-likelihood maximization
are not independent owing to the method of sampling.
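• Barlow and Prentice give a formula for computing the
estimated change in $\hat{\beta}$ if the ith individual is deleted.
This change in $\hat{\beta}$ is denoted $\Delta_i\hat{\beta}$ and is a p-dimensional
vector for p covariates. The robust variance estimate is
$$\hat{V} = \sum_i \Delta_i\hat{\beta}\,\Delta_i\hat{\beta}^{\top}$$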
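A robust variance of this kind can be assembled from per-individual delete-one changes in the coefficient estimates by summing their outer products. The sketch below assumes those changes have already been computed (for example, as dfbeta residuals from fitting software). The function name and the toy numbers are illustrative only.

```python
def robust_variance(dfbeta):
    """Sum of outer products of the per-individual change vectors.

    dfbeta: list of n rows, each a length-p list giving the estimated change
    in the coefficient vector when that individual is deleted.
    Returns a p x p matrix as nested lists.
    """
    p = len(dfbeta[0])
    V = [[0.0] * p for _ in range(p)]
    for d in dfbeta:
        for a in range(p):
            for b in range(p):
                V[a][b] += d[a] * d[b]
    return V

# Toy input: three individuals, two covariates (made-up values).
changes = [[0.1, -0.2], [0.0, 0.1], [-0.1, 0.1]]
V = robust_variance(changes)
robust_se = [V[a][a] ** 0.5 for a in range(len(V))]
```

The square roots of the diagonal give robust standard errors, which can be compared with the naive ones to see how much the latter are understated.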
• Welsh nickel refinery workers and subsequent
development of nasal cancer.
▫ 56 nasal cancer cases among the 683 employees.
▫ 4 individuals are censored before the first failure
▫ Analytic data: sample size=679, α=0.2
▫ 135 subjects in the subcohort (9 cases). The remaining 47 cases
were outside the subcohort, giving 182 evaluations, or about 27% of the
cohort.
SAS code http://lib.stat.cmu.edu/general/robphreg
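The counts in this example can be checked with a few lines of arithmetic. Only the figures quoted above are used, and the variable names are illustrative.

```python
cohort_n = 679            # analytic cohort after dropping 4 early-censored subjects
cases_total = 56          # nasal cancer cases
subcohort_n = 135         # subcohort size
cases_in_subcohort = 9

# Cases that were not sampled into the subcohort still need covariate evaluation.
cases_outside = cases_total - cases_in_subcohort
evaluated = subcohort_n + cases_outside
fraction_evaluated = evaluated / cohort_n
realized_alpha = subcohort_n / cohort_n

print(cases_outside, evaluated, round(fraction_evaluated, 3), round(realized_alpha, 3))
```

The realized sampling fraction is close to the nominal α = 0.2, and the evaluated fraction matches the roughly 27% quoted above.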
• For each sampling scheme, 200 samples were drawn from the full
cohort.
▫ Case-cohort: subcohort samples of 10% to 90% were drawn.
▫ Nested case-control: sampling was performed using 1 to 70 controls per case.
Discussion & Conclusion
• The NCC design has some advantages over case-cohort
▫ more readily understood and the method of analysis is
▫ If one could assume that the controls chosen were a random
sample, it would not be necessary to enumerate the entire cohort.
▫ The usual standard errors are correct, and the parameter
• Case-cohort sampling from the full cohort was more
efficient than using a comparable NCC design.
• It may be extended to complicated sampling designs in
which direct computation of the correct asymptotic
variance is difficult.
• The optimal weighting scheme is still unclear.
• Case-cohort designs are recommended when flexibility is