Download pptx - Joint with Yang-Jin Kim, Eun Hee Choi , Chung Mo Nam

Nonparametric analysis of recurrent events with incomplete observation gapsJinheum KimUniversity of SuwonKorea

Joint with Yang-Jin Kim, Eun Hee Choi, Chung Mo Nam 2011-12-17

Joint 2011 Taipei Symposium 1

Recurrent event data Recurrent events arise frequently in clinical

trials in which patients are followed longi-tudinally and response of interest is tran-sient

Examples Transient ischemic attacks in patients with

cerebrovascular disease Seizures in epileptic patients Tumor recurrences in cancer patients

2011-12-17Joint 2011 Taipei Symposium

2

Motivating example Young Traffic Offenders Program(YTOP)

data

What is YTOP? Conducted at the University of Missouri-Co-

lumbia Health Sciences Center and other medical centers since late 1987

One-day educational intervention for people under 24 convicted of speeding by 20+ mph over the posted speed limit

2011-12-17 Joint 2011 Taipei Symposium

3

YTOP: objective

To evaluate the effect of the program on reducing the rates of speeding violation of young drivers


4

YTOP: characteristics Data were collected on 192 young drivers about their

speeding violation information since they obtained their driving license

29 subjects(13: YTOP, 16: non-YTOP) received suspensions These subjects dropped out of the study when the suspen-

sion began, and then came back to the study after complet-ing suspension

These intermittent dropouts are called observation gaps The observation gap is determined by the starting and ter-

minating time of the suspension BUT, the observation gaps of YTOP data are incomplete

due to missing terminating times

2011-12-17 Joint 2011 Taipei Symposium

5

Previous works & their limitations


6

Farmer et al.(2000, Brain Injury): two-sample rank test Ignore the detailed information about the conviction

process Can assess only the long-term impact

Sun et al.(2001, JASA): nonparametric & semi-parametric methods Treat YTOP data as recurrent event data Evaluate both short- and long-term effects applying the

conviction process information Ignore the observation gaps

Previous works & their limitations


7

Zhao & Sun(2006, CSDA) Similar to Sun et al. (2001, JASA), but consider

the observation gaps Suspension periods: NOT deterministic, BUT vary-

ing and possibly unknown

Kim & Jhun(2008, StatMed) Treat incomplete observation gaps as terminating

times being interval-censored Only use the first suspension and ignore all the

other suspensions after the first one.

Objective of this talk Use the multivariate-interval censored

data to estimate the distribution of the terminating time

Propose a non-parametric test to com-pare the conviction rates of two groups

Evaluate the proposed test via simula-tion & reanalyze the YTOP data


8

Illustrative data: notation


9

●: conviction time

X: starting time of the suspension

?: terminating times of the suspension (unknown)

■ : end of study time

Diagram10

??

? ??

IC data: objective


11

Want to estimate the distribu-tion of the terminating time us-ing the interval censored data

Two types of IC data


12

Univariate IC data cases II, III, & V having one observa-

tion gaps eg: (5,9)], (7, ), (2,10]

Multivariate IC data case IV having two observation gaps eg: (2,5]. BUT, …

Multivariate IC data


13

How can we define the origin time of the second terminating time?

Constructing IC data


14

Set the origin to be the first conviction time after the previous suspension eg: For the second suspension of case IV,

Hence, estimate the distribution of the terminating time using

,9(7 55 2 4]

(2,4](5,9], (7, ), (2,10],(2,5]

EM-ICM


15

Hybrid algorithm(Wellner & Zhan, 1997, JASA)

Use R package interval

Notation & assumption the number of occurrences of the event

up to time

function indicating if subject is under observation at time

cumulative mean (frequency) function (CMF)

independent2011-12-17Joint 2011 Taipei Symposium

16

( ) :iN t,t , ,1i n

( ) :iY t

& :i iN Y

ti

( ) { ( )}:i it N tE

Data

: the th recurrent event time of subject

: 0-1 indicator whether or not sub-ject has experienced observation gaps

: starting time of the th observa-tion gap of subject


17

{( , , , 1, , ; 1, ,) ; },1,ik i ij i it Y h i n k n mj D

iYikt

ijh

k

j

i

i

i

Unobservable terminating times

BUT, the terminating time of each observation gaps is NOT avail-able, that is, interval-censored

So, need to estimate the distribu-tion of the terminating time of the observation gap


18

IC data


19

: minimum of greater than

: minimum of greater than

can be right-censored

{( , ], 1, , ; 1, , }ij ij iL R i n j m I

0ijt

1ijt, 1i jh

ijh

ijR

ikt

0 1 0,ij ij ij ij ij ijRL h t t t

ikt

Terminating time distribu-tion


20

: the terminating time for the th observation gap of subject

: the mid-point of the th equivalence class

: mass of at

ijSj

l( 1, , )l l qa

laijS( )l ij lP Sf a

i

Redefine a risk set

Replace by the esti-mated expected risk set defined as


21

0*

1 1

11

ˆ( ) ( )( ) 1

ˆ( ) ( )

i

i

m qij ij l ij l

i i m qj l

ij ij r ij rrj

J t I t t a R fY t Y

J t I L a R f

( )iY t

Equivalence classes


22

Case II: (L R]Case III: (L Case IV: (L R] (L R]Case V: (L R]Combine L R L L R R R

Equivalence classes: (2,4], (5,5], (7,9] Mid-points: 3, 5, 8

Assume with terminating time

1 2 3, ( 5)( 3) ,, ( 8)P S Pf S f P S f :S

1 2 3 1ff f

Unadjusted(naive) risk set23

1 1 1 1 1 1 1 1 1 1 1 1

1 1 1 1 1 1 0 0 0 1 1

1 1 1 1 1 1 1 1 0 0

1 1 1 0 0 1 1 0 0 1 1

1 1 1 0 0 0 0 0 0 0 1 1

Modified(proposed) risk set24

1 1 1 1 1 1 1 1 1 1 1 1

1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 0 p 1 1 1 0 1 1 1 1 1 0 q q r r r 1 1 1

1 2

1 ,pf

f f

1

1 2 3

ff

qf f

1 3

21

2

f ff

rff

Estimated CMF Assuming that all subjects have the common CMF,

total number of events observed at time

total number of subjects at risk at time


25

*

*0

( )ˆ ( )( )

t dN st

Y s

1 2( ) ( ) ( ) ( ),nt tt t

* *

1

(( :) ) ( )n

i ii

Y t dd t tN N

* *

1

( :) ( )n

ii

Y tY t

t

t

Representation of estimated CMF

distinct event times across all individuals

Similar to the Nelson-Aalen estimator from survival analysis

Cook & Lawless(2007)


26

*

*:

( )ˆ ( )( )

jj t t

j

j

dN tt

Y t

1 2 :dt tt

Asymptotic properties Following Lin et al.(2001, JRSSB),

Consistent Asymptotically normal with consistently es-

timated variance,

Approximate pointwise CI:


27

* *2

* *1 :

( ) ( )ˆˆ { ( )} { [ ( ) ]}( ) ( )

j

ni j j

i ji j t t j j

Y t dN tvar t dN t

Y t Y t

1

/ 2ˆ ˆˆ( ) {log ( )exp(lo }g )t z var t

Comparison of two groups : the CMFs for group 1 and 2

: the number of subjects in the th group with

: the estimated CMF for the th group


28

g

g

1 2( ), ( )t t

1 2,n n1 2n nn

1 2ˆ ˆ( ), ( )t t

Two-sample test statistic Hypothesis:

Test statistic:

: maximum follow-up time across two groups

: non-negative predictable function


29

0 1 2: ( ) ( ), 0tH t t

1 20ˆ ˆ( ){ ( ) ( )}wT w s d s d s

( 0)

( )w t

Asymptotic distribution Following Cook et al.(1996, BCS) & Lin et

al.(2001, JRSSB), Asymptotically normal with consistently es-

timated variance,

Hence, asymptotically


30

*22

*01 1

( ) ˆˆ ( [ ( ) { ( ) )}]( )

) (gn

gkw gk

g k gg

Y svar T w s dN s d s

Y s

2 2ˆ ( ) (/ ~ 1)w wT var T

Simulation: data generationGenerate the first event time from an exponential distribu-tion with a hazard rate of

Generate

: probability of having an observation gap


31

1 ~ (1, )i B p

1it

p

i

Simulation: data generation If generate the starting time of

the first observation gap from

If generate the duration tine from an exponential distribution with a hazard rate of 2 resulting in the terminat-ing time

So, we get & (if ) 2011-12-17Joint 2011 Taipei Symposium

32

1 1( , 005)i iU t t

1iv

1ih

1it 1ih

1 1i ih v

1 1,i

1 1,i

1 1i

Simulation: the first cycle of data genera-tion


33

t

0 1it

0 w/ observation gap

1it

0 w/o observation gap1it

1 1i ih v

2it

1 2i it t

1ih

1iv 2it

1 1 2i i ih v t

?

Two-sample tests: design params

Event rate of each group

Weight: =0.05, 0.1, 0.2, 0.4 Fixed censoring at 5 Sample size: Replication: 1,000


34

( ) 1w t

1 1.5 2 1.5,1.4,1.3,1.2,1.1

50,100n

p

Two-sample tests: results(n=100)Tests

Proposed Naive Sun et al. Farmer et al.

0.05 1.5 0.047 0.055 0.047 0.0471.4 0.271 0.273 0.263 0.1901.3 0.755 0.740 0.740 0.5931.2 0.978 0.972 0.979 0.918

0.1 1.5 0.060 0.055 0.056 0.0391.4 0.246 0.231 0.248 0.1821.3. 0.704 0.668 0.677 0.5381.2 0.981 0.966 0.980 0.907

0.4 1.5 0.062 0.049 0.068 0.0541.4 0.190 0.151 0.179 0.1381.3. 0.633 0.440 0.576 0.4231.2 0.941 0.793 0.914 0.772

35

2p

YTOP data: estimated CMFs


36

YTOP data: p-values


37

Tests CovariateYTOP Gender

Proposed 0.836 0.135Naive 0.735 0.161Sun et al. 0.456 0.008Farmer et al. 0.113 0.533

Summary Propose the expected risk set to incorporate

the observation gaps

Our two-sample test shows to satisfy the level & has the higher power than the other tests

No significant difference in traffic conviction rates between YTOP and non-YTOP partici-pants & between male and female


38

Extend to …


39

A regression model to include a time-de-pendent covariate where denotes the time at which a sub-ject undergoes the traffic education pro-gram

A case to allow a termination event such as death

( ) ( ),Z t I T t T

0( ) { ( ) | ( )} ( ) exp{ ( )}i i i it E N t z t t z t

Thank you!40