Nonparametric analysis of recurrent events with incomplete observation gapsJinheum KimUniversity of SuwonKorea
Joint with Yang-Jin Kim, Eun Hee Choi, Chung Mo Nam 2011-12-17
Joint 2011 Taipei Symposium 1
Recurrent event data Recurrent events arise frequently in clinical
trials in which patients are followed longi-tudinally and response of interest is tran-sient
Examples Transient ischemic attacks in patients with
cerebrovascular disease Seizures in epileptic patients Tumor recurrences in cancer patients
2011-12-17Joint 2011 Taipei Symposium
2
Motivating example Young Traffic Offenders Program(YTOP)
data
What is YTOP? Conducted at the University of Missouri-Co-
lumbia Health Sciences Center and other medical centers since late 1987
One-day educational intervention for people under 24 convicted of speeding by 20+ mph over the posted speed limit
2011-12-17 Joint 2011 Taipei Symposium
3
YTOP: objective
To evaluate the effect of the program on reducing the rates of speeding violation of young drivers
2011-12-17Joint 2011 Taipei Symposium
4
YTOP: characteristics Data were collected on 192 young drivers about their
speeding violation information since they obtained their driving license
29 subjects(13: YTOP, 16: non-YTOP) received suspensions These subjects dropped out of the study when the suspen-
sion began, and then came back to the study after complet-ing suspension
These intermittent dropouts are called observation gaps The observation gap is determined by the starting and ter-
minating time of the suspension BUT, the observation gaps of YTOP data are incomplete
due to missing terminating times
2011-12-17 Joint 2011 Taipei Symposium
5
Previous works & their limitations
2011-12-17Joint 2011 Taipei Symposium
6
Farmer et al.(2000, Brain Injury): two-sample rank test Ignore the detailed information about the conviction
process Can assess only the long-term impact
Sun et al.(2001, JASA): nonparametric & semi-parametric methods Treat YTOP data as recurrent event data Evaluate both short- and long-term effects applying the
conviction process information Ignore the observation gaps
Previous works & their limitations
2011-12-17Joint 2011 Taipei Symposium
7
Zhao & Sun(2006, CSDA) Similar to Sun et al. (2001, JASA), but consider
the observation gaps Suspension periods: NOT deterministic, BUT vary-
ing and possibly unknown
Kim & Jhun(2008, StatMed) Treat incomplete observation gaps as terminating
times being interval-censored Only use the first suspension and ignore all the
other suspensions after the first one.
Objective of this talk Use the multivariate-interval censored
data to estimate the distribution of the terminating time
Propose a non-parametric test to com-pare the conviction rates of two groups
Evaluate the proposed test via simula-tion & reanalyze the YTOP data
2011-12-17Joint 2011 Taipei Symposium
8
Illustrative data: notation
2011-12-17Joint 2011 Taipei Symposium
9
●: conviction time
X: starting time of the suspension
?: terminating times of the suspension (unknown)
■ : end of study time
Diagram10
??
? ??
IC data: objective
2011-12-17Joint 2011 Taipei Symposium
11
Want to estimate the distribu-tion of the terminating time us-ing the interval censored data
Two types of IC data
2011-12-17Joint 2011 Taipei Symposium
12
Univariate IC data cases II, III, & V having one observa-
tion gaps eg: (5,9)], (7, ), (2,10]
Multivariate IC data case IV having two observation gaps eg: (2,5]. BUT, …
Multivariate IC data
2011-12-17Joint 2011 Taipei Symposium
13
How can we define the origin time of the second terminating time?
Constructing IC data
2011-12-17Joint 2011 Taipei Symposium
14
Set the origin to be the first conviction time after the previous suspension eg: For the second suspension of case IV,
Hence, estimate the distribution of the terminating time using
,9(7 55 2 4]
(2,4](5,9], (7, ), (2,10],(2,5]
EM-ICM
2011-12-17Joint 2011 Taipei Symposium
15
Hybrid algorithm(Wellner & Zhan, 1997, JASA)
Use R package interval
Notation & assumption the number of occurrences of the event
up to time
function indicating if subject is under observation at time
cumulative mean (frequency) function (CMF)
independent2011-12-17Joint 2011 Taipei Symposium
16
( ) :iN t,t , ,1i n
( ) :iY t
& :i iN Y
ti
( ) { ( )}:i it N tE
Data
: the th recurrent event time of subject
: 0-1 indicator whether or not sub-ject has experienced observation gaps
: starting time of the th observa-tion gap of subject
2011-12-17Joint 2011 Taipei Symposium
17
{( , , , 1, , ; 1, ,) ; },1,ik i ij i it Y h i n k n mj D
iYikt
ijh
k
j
i
i
i
Unobservable terminating times
BUT, the terminating time of each observation gaps is NOT avail-able, that is, interval-censored
So, need to estimate the distribu-tion of the terminating time of the observation gap
2011-12-17Joint 2011 Taipei Symposium
18
IC data
2011-12-17Joint 2011 Taipei Symposium
19
: minimum of greater than
: minimum of greater than
can be right-censored
{( , ], 1, , ; 1, , }ij ij iL R i n j m I
0ijt
1ijt, 1i jh
ijh
ijR
ikt
0 1 0,ij ij ij ij ij ijRL h t t t
ikt
Terminating time distribu-tion
2011-12-17Joint 2011 Taipei Symposium
20
: the terminating time for the th observation gap of subject
: the mid-point of the th equivalence class
: mass of at
ijSj
l( 1, , )l l qa
laijS( )l ij lP Sf a
i
Redefine a risk set
Replace by the esti-mated expected risk set defined as
2011-12-17Joint 2011 Taipei Symposium
21
0*
1 1
11
ˆ( ) ( )( ) 1
ˆ( ) ( )
i
i
m qij ij l ij l
i i m qj l
ij ij r ij rrj
J t I t t a R fY t Y
J t I L a R f
( )iY t
Equivalence classes
2011-12-17Joint 2011 Taipei Symposium
22
Case II: (L R]Case III: (L Case IV: (L R] (L R]Case V: (L R]Combine L R L L R R R
Equivalence classes: (2,4], (5,5], (7,9] Mid-points: 3, 5, 8
Assume with terminating time
1 2 3, ( 5)( 3) ,, ( 8)P S Pf S f P S f :S
1 2 3 1ff f
Unadjusted(naive) risk set23
1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 0 0 0 1 1
1 1 1 1 1 1 1 1 0 0
1 1 1 0 0 1 1 0 0 1 1
1 1 1 0 0 0 0 0 0 0 1 1
Modified(proposed) risk set24
1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 0 p 1 1 1 0 1 1 1 1 1 0 q q r r r 1 1 1
1 2
1 ,pf
f f
1
1 2 3
ff
qf f
1 3
21
2
f ff
rff
Estimated CMF Assuming that all subjects have the common CMF,
total number of events observed at time
total number of subjects at risk at time
2011-12-17Joint 2011 Taipei Symposium
25
*
*0
( )ˆ ( )( )
t dN st
Y s
1 2( ) ( ) ( ) ( ),nt tt t
* *
1
(( :) ) ( )n
i ii
Y t dd t tN N
* *
1
( :) ( )n
ii
Y tY t
t
t
Representation of estimated CMF
distinct event times across all individuals
Similar to the Nelson-Aalen estimator from survival analysis
Cook & Lawless(2007)
2011-12-17Joint 2011 Taipei Symposium
26
*
*:
( )ˆ ( )( )
jj t t
j
j
dN tt
Y t
1 2 :dt tt
Asymptotic properties Following Lin et al.(2001, JRSSB),
Consistent Asymptotically normal with consistently es-
timated variance,
Approximate pointwise CI:
2011-12-17Joint 2011 Taipei Symposium
27
* *2
* *1 :
( ) ( )ˆˆ { ( )} { [ ( ) ]}( ) ( )
j
ni j j
i ji j t t j j
Y t dN tvar t dN t
Y t Y t
1
/ 2ˆ ˆˆ( ) {log ( )exp(lo }g )t z var t
Comparison of two groups : the CMFs for group 1 and 2
: the number of subjects in the th group with
: the estimated CMF for the th group
2011-12-17Joint 2011 Taipei Symposium
28
g
g
1 2( ), ( )t t
1 2,n n1 2n nn
1 2ˆ ˆ( ), ( )t t
Two-sample test statistic Hypothesis:
Test statistic:
: maximum follow-up time across two groups
: non-negative predictable function
2011-12-17Joint 2011 Taipei Symposium
29
0 1 2: ( ) ( ), 0tH t t
1 20ˆ ˆ( ){ ( ) ( )}wT w s d s d s
( 0)
( )w t
Asymptotic distribution Following Cook et al.(1996, BCS) & Lin et
al.(2001, JRSSB), Asymptotically normal with consistently es-
timated variance,
Hence, asymptotically
2011-12-17Joint 2011 Taipei Symposium
30
*22
*01 1
( ) ˆˆ ( [ ( ) { ( ) )}]( )
) (gn
gkw gk
g k gg
Y svar T w s dN s d s
Y s
2 2ˆ ( ) (/ ~ 1)w wT var T
Simulation: data generationGenerate the first event time from an exponential distribu-tion with a hazard rate of
Generate
: probability of having an observation gap
2011-12-17Joint 2011 Taipei Symposium
31
1 ~ (1, )i B p
1it
p
i
Simulation: data generation If generate the starting time of
the first observation gap from
If generate the duration tine from an exponential distribution with a hazard rate of 2 resulting in the terminat-ing time
So, we get & (if ) 2011-12-17Joint 2011 Taipei Symposium
32
1 1( , 005)i iU t t
1iv
1ih
1it 1ih
1 1i ih v
1 1,i
1 1,i
1 1i
Simulation: the first cycle of data genera-tion
2011-12-17Joint 2011 Taipei Symposium
33
t
0 1it
0 w/ observation gap
1it
0 w/o observation gap1it
1 1i ih v
2it
1 2i it t
1ih
1iv 2it
1 1 2i i ih v t
?
Two-sample tests: design params
Event rate of each group
Weight: =0.05, 0.1, 0.2, 0.4 Fixed censoring at 5 Sample size: Replication: 1,000
2011-12-17Joint 2011 Taipei Symposium
34
( ) 1w t
1 1.5 2 1.5,1.4,1.3,1.2,1.1
50,100n
p
Two-sample tests: results(n=100)Tests
Proposed Naive Sun et al. Farmer et al.
0.05 1.5 0.047 0.055 0.047 0.0471.4 0.271 0.273 0.263 0.1901.3 0.755 0.740 0.740 0.5931.2 0.978 0.972 0.979 0.918
0.1 1.5 0.060 0.055 0.056 0.0391.4 0.246 0.231 0.248 0.1821.3. 0.704 0.668 0.677 0.5381.2 0.981 0.966 0.980 0.907
0.4 1.5 0.062 0.049 0.068 0.0541.4 0.190 0.151 0.179 0.1381.3. 0.633 0.440 0.576 0.4231.2 0.941 0.793 0.914 0.772
35
2p
YTOP data: estimated CMFs
2011-12-17Joint 2011 Taipei Symposium
36
YTOP data: p-values
2011-12-17Joint 2011 Taipei Symposium
37
Tests CovariateYTOP Gender
Proposed 0.836 0.135Naive 0.735 0.161Sun et al. 0.456 0.008Farmer et al. 0.113 0.533
Summary Propose the expected risk set to incorporate
the observation gaps
Our two-sample test shows to satisfy the level & has the higher power than the other tests
No significant difference in traffic conviction rates between YTOP and non-YTOP partici-pants & between male and female
2011-12-17Joint 2011 Taipei Symposium
38
Extend to …
2011-12-17Joint 2011 Taipei Symposium
39
A regression model to include a time-de-pendent covariate where denotes the time at which a sub-ject undergoes the traffic education pro-gram
A case to allow a termination event such as death
( ) ( ),Z t I T t T
0( ) { ( ) | ( )} ( ) exp{ ( )}i i i it E N t z t t z t
Thank you!40