• Analysis of Case-Cohort Designs
The methods of analysis, including the different weighting schemes used in estimation.

J Clin Epidemiol (1999) Vol. 52, No. 12

Date: 2017/8/8 Presenter: THJ

• Nested case-control (NCC) study

• A variation of a case-control study in which only a subset of non-cases (controls) from the cohort is compared with the incident cases.
▫ Matching: for a 1:m nested design, m controls are selected at random from those available at the time point at which the case occurs.
▫ Often used when the exposure of interest is difficult or expensive to obtain and when the outcome is rare.
▫ Reduces selection and information bias.
▫ Conditional logistic regression (with stratification by time).
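The 1:m matching step can be sketched as follows. This is a minimal illustration, not the presenter's code: the function name `sample_nested_controls` and the toy follow-up data are assumptions, and "available" is taken to mean "still under follow-up at the case's failure time".

```python
import random

def sample_nested_controls(times, events, m, seed=0):
    """For each case, draw up to m controls at random from the subjects
    still at risk at that case's failure time (hypothetical helper)."""
    rng = random.Random(seed)
    matched = {}
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # only cases get a matched set
        # Risk set: every other subject whose follow-up reaches t_i.
        # A subject who fails later can still serve as a control now.
        at_risk = [k for k, t_k in enumerate(times) if k != i and t_k >= t_i]
        matched[i] = rng.sample(at_risk, min(m, len(at_risk)))
    return matched

# Toy data (assumed): follow-up times and failure indicators
times = [2.0, 5.0, 3.5, 7.0, 6.0, 1.0]
events = [1, 0, 1, 0, 0, 0]
matched_sets = sample_nested_controls(times, events, m=2)
```

Each matched set then forms one stratum in the conditional logistic regression.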

• Case-Cohort study

• Similar to an NCC study in that the cases and non-cases are within a parent cohort, identified at time t1, after baseline (t0). The cohort members were assessed for risk factors at some time prior to t1.
▫ Most useful in analyzing time to failure in a large cohort in which failure is rare.
▫ Non-cases are randomly selected from the parent cohort.
▫ No matching is performed.
▫ Weighted Cox proportional hazards regression model.

• Consider a cohort where the time of entry into the cohort is known and each individual is followed up until failure or censoring. At the time of entry, an individual may be included in a random sample of the cohort (termed the “subcohort”) with probability α.

However, the potential to assess covariates for all members of the cohort must exist, as one does not know in advance which individuals will fail.
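The subcohort draw can be illustrated with a short sketch. The function names and the use of an independent draw per subject are assumptions made for illustration; the slide only states that each individual enters the subcohort with probability α.

```python
import random

def draw_subcohort(n, alpha, seed=0):
    """Independently flag each of n cohort members for the subcohort
    with probability alpha (assumed sampling mechanism)."""
    rng = random.Random(seed)
    return [rng.random() < alpha for _ in range(n)]

def subjects_needing_covariates(in_subcohort, is_case):
    """Covariates are needed for subcohort members and for every case,
    whether or not the case was sampled into the subcohort."""
    return [i for i, (s, d) in enumerate(zip(in_subcohort, is_case)) if s or d]

flags = draw_subcohort(679, alpha=0.2, seed=1)
needed = subjects_needing_covariates([True, False, False], [False, True, False])
```

Because cases outside the subcohort are only known after follow-up, covariate material has to be retrievable for anyone in the cohort.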

• Case-Cohort Analysis
• Weighted Cox regression model
• The partial likelihood is replaced by a pseudo-likelihood

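If person i fails at time tj, then the contribution to the partial likelihood, assuming no tied failure times, is that person's relative hazard divided by the sum of the relative hazards of everyone at risk at tj.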
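The contribution can be written out explicitly. The following is a sketch in assumed notation, since the slide's own formula did not survive: z_k is the covariate vector of subject k, R(t_j) is the full-cohort risk set at t_j, \tilde{R}(t_j) is the set of sampled subjects at risk at t_j, and w_k(t_j) is a scheme-dependent weight.

```latex
% Standard Cox partial-likelihood contribution (full cohort)
L_i(\beta) = \frac{\exp(\beta^{\top} z_i)}
                  {\sum_{k \in R(t_j)} \exp(\beta^{\top} z_k)}

% Weighted pseudo-likelihood contribution (case-cohort sample)
\tilde{L}_i(\beta) = \frac{\exp(\beta^{\top} z_i)}
                          {\sum_{k \in \tilde{R}(t_j)} w_k(t_j)\,\exp(\beta^{\top} z_k)}
```

The first line is the usual Cox form. The second differs only in the denominator, which is restricted to subjects with covariate data and reweighted.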

• Three different weighting methods

[Table not recovered. Surviving labels: Unweighted; Random sample denominator; Weighted denominator.]
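A sketch of how denominator weights might be assigned under three schemes. This assumes the schemes are those usually attributed to Prentice, to Self and Prentice, and to Barlow; the scheme names, the function name `denominator_weight`, and the exact weight values are assumptions, because the slide's table was not recovered.

```python
def denominator_weight(scheme, in_subcohort, is_case, at_own_failure, alpha):
    """Weight a subject carries in the pseudo-likelihood denominator.
    Assumed values, not taken from the slide."""
    if scheme not in ("prentice", "self_prentice", "barlow"):
        raise ValueError(f"unknown scheme: {scheme}")
    if is_case and at_own_failure:
        # The failing case itself: excluded from the denominator only when
        # it lies outside the subcohort under the Self and Prentice scheme.
        if scheme == "self_prentice" and not in_subcohort:
            return 0.0
        return 1.0
    if not in_subcohort:
        # Not sampled: contributes nothing before its own failure time.
        return 0.0
    # Subcohort member at risk: inflated by 1/alpha under the Barlow scheme.
    return 1.0 / alpha if scheme == "barlow" else 1.0

w = denominator_weight("barlow", in_subcohort=True, is_case=False,
                       at_own_failure=False, alpha=0.2)
```

Under the assumed Barlow scheme each subcohort member stands in for 1/α cohort members, so the denominator estimates the full-cohort risk-set sum.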

• Robust estimation of the variance
• SEs are usually underestimated.
• The score contributions from the pseudo-likelihood maximization are not independent, owing to the method of sampling.

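• Barlow and Prentice give a formula for computing the estimated change in $\hat{\beta}$ if the ith individual is deleted. This change in $\hat{\beta}$ is denoted $\Delta_i\hat{\beta}$ and is a p-dimensional vector for p covariates. The robust variance estimate is

$\hat{V}(\hat{\beta}) = \sum_i \Delta_i\hat{\beta}\,(\Delta_i\hat{\beta})^{\top}$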

• Example
• Welsh nickel refinery workers and subsequent development of nasal cancer.
▫ 56 nasal cancer cases among the 683 employees.
▫ 4 individuals were censored before the first failure.
▫ Analytic data: sample size = 679, α = 0.2.
▫ 135 subjects in the subcohort (9 cases). The remaining 47 cases were outside the subcohort. In total, 182 subjects required covariate evaluation, about 27% of the cohort.
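The counts in this example can be checked with a few lines of arithmetic. The variable names are illustrative; all numbers come from the slide.

```python
cohort_size = 679          # analytic sample after dropping 4 early-censored subjects
alpha = 0.2                # subcohort sampling probability
subcohort_size = 135
cases_total = 56
cases_in_subcohort = 9

cases_outside = cases_total - cases_in_subcohort   # cases not sampled into the subcohort
evaluations = subcohort_size + cases_outside       # subjects needing covariate data
fraction_evaluated = evaluations / cohort_size     # share of the cohort evaluated
expected_subcohort = alpha * cohort_size           # expected subcohort size under alpha
```

The realized subcohort of 135 is close to the expected size of about 136.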

• SAS code http://lib.stat.cmu.edu/general/robphreg

• Example results

• Simulation Study Example Continued…
• For each sampling scheme, 200 samples were drawn from the full cohort.
▫ Case-cohort: samples of 10% to 90% of the cohort were drawn.
▫ Nested case-control: sampling was performed using 1 to 70 controls per case.

[Figure not recovered. Panel labels: Self & Prentice; Barlow.]

• Discussion & Conclusion
• The NCC design has some advantages over case-cohort designs:
▫ It is more readily understood, and the method of analysis is simple.
▫ If one could assume that the controls chosen were a random sample, it would not be necessary to enumerate the entire population.
▫ The usual standard errors are correct, and the parameter estimates are unbiased.

• Discussion & Conclusion
• Case-cohort sampling from the full cohort was more efficient than using a comparable NCC design.

• It may be extended to complicated sampling designs in which direct computation of the correct asymptotic variance is difficult.

• The optimal weighting scheme is still unclear.

• Case-cohort designs are recommended when flexibility is desired.

