Parametric Survival Model E

  • AS305 Statistical Methods For Insurance

    ESTIMATION OF PARAMETRIC SURVIVAL MODELS

    General Description

    We first consider estimating the parameters of our familiar univariate model S(x) or S(t). We will explore this estimation problem, in turn, under both complete data and incomplete data samples.

    The second kind of parametric model to be considered is the one which involves concomitant variables: the select model S(t;x) and the more general model S(t;z), which arises in a clinical setting.

    The emphasis is on least-squares and maximum likelihood methods. Properties of the estimators will be examined, along with some hypothesis testing of the chosen model and its parameters.

    UNIVARIATE MODELS, COMPLETE DATA

    We will distinguish between samples in which exact times of death are known, and those in which times of death have been grouped.

    We will use S(t) for the underlying survival model to be estimated, rather than S(x), to suggest that complete data samples generally exist in a clinical setting.

    Exact Times of Death

    Suppose a sample of n lives, all existing at t = 0, produces observed times of death $t_1, t_2, \ldots, t_n$, assumed to be independent.

    We wish to use this data to estimate S(t) as a parametric survival model. In other words, we adopt a particular mathematical function of the variable t, depending on one or more parameters as well, and then use the sample data to estimate these unknown parameters, using a particular estimation procedure.

    The simplest parametric models to use are those which depend on only one parameter, such as the exponential distribution with $S(t) = e^{-\lambda t}$. How can we estimate $\lambda$ from our sample data?

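  • Let us calculate

    $\bar{t} = \frac{1}{n} \sum_{i=1}^{n} t_i$,

    which is the mean time of death of the sample.

    If T has an exponential distribution, then $E[T] = 1/\lambda$, so equating $1/\lambda$ to $\bar{t}$ gives $\hat{\lambda} = 1/\bar{t}$,

    the moment estimator of $\lambda$.

    This is also the maximum likelihood estimator of $\lambda$.

    The likelihood for the ith death is the PDF for death at time $t_i$: $L_i = f(t_i) = \lambda e^{-\lambda t_i}$.

    The overall likelihood is given by

    $L = \prod_{i=1}^{n} \lambda e^{-\lambda t_i} = \lambda^n \exp\left(-\lambda \sum_{i=1}^{n} t_i\right)$,

    and the log-likelihood is

    $\ln L = n \ln \lambda - \lambda \sum_{i=1}^{n} t_i$.

    Then $\frac{d \ln L}{d\lambda} = \frac{n}{\lambda} - \sum_{i=1}^{n} t_i = 0$, which leads to $\hat{\lambda} = n / \sum_{i=1}^{n} t_i = 1/\bar{t}$.

    As the MLE of $\lambda$, we know it is asymptotically unbiased, with asymptotic variance given by

    $\mathrm{Var}(\hat{\lambda}) \approx \left\{ -E\left[ \frac{d^2 \ln L}{d\lambda^2} \right] \right\}^{-1} = \frac{\lambda^2}{n}$.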
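    The asymptotic variance of the exponential MLE, $\lambda^2/n$, can be checked roughly by simulation. The sketch below is only an illustration: the true rate, sample size, number of replications and seed are all hypothetical choices, and the function names are not from the course material.

```python
import random
import statistics

def exponential_mle(times):
    """MLE (and moment estimator) of the exponential rate: n / sum(t_i)."""
    return len(times) / sum(times)

def simulate_mle_variance(lam, n, reps, seed=0):
    """Empirical variance of the MLE over `reps` simulated samples of size n."""
    rng = random.Random(seed)
    estimates = [
        exponential_mle([rng.expovariate(lam) for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.pvariance(estimates)

lam, n = 0.5, 50                      # hypothetical true rate and sample size
empirical = simulate_mle_variance(lam, n, reps=5000)
asymptotic = lam ** 2 / n             # lambda^2 / n from the second derivative
print(f"empirical variance  : {empirical:.5f}")
print(f"asymptotic variance : {asymptotic:.5f}")
```

    For a finite sample the empirical variance is expected to sit somewhat above the asymptotic value, since $\lambda^2/n$ is only a large-sample approximation.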

    As an alternative to the method of moments, we might equate the distribution and sample medians.

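    The median of the distribution is that value of t, $\tilde{t}$ say, for which $S(\tilde{t}) = 1/2$, or $\exp(-\lambda \tilde{t}) = 1/2$ under the exponential distribution. Solving for $\tilde{t}$ we find that $\tilde{t} = -\ln(1/2)/\lambda = \ln 2/\lambda$,

    and our estimate of $\lambda$ results from equating $\tilde{t}$ and $t_{med}$, thus producing

    $\hat{\lambda} = \frac{-\ln(1/2)}{t_{med}} = \frac{\ln 2}{t_{med}}$.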

    A sample of 10 laboratory mice produces the times of death (in days) 3, 4, 5, 7, 7, 8, 10, 10, 10, 12. Assuming the operative survival model to be exponential, estimate $\lambda$ by both the method of moments and the method of medians. The sample mean is $\bar{t} = 7.6$ and the sample median is $t_{med} = 7.5$.

    Then by the method of moments, $\hat{\lambda} = 1/\bar{t} = 1/7.6 = 0.13158$,

    and by the method of medians, $\hat{\lambda} = \ln 2 / t_{med} = 0.69315/7.5 = 0.09242$.
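    The two estimates for the mice data can be reproduced in a few lines. This is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
import math
import statistics

times = [3, 4, 5, 7, 7, 8, 10, 10, 10, 12]  # days, from the example

t_bar = statistics.mean(times)      # sample mean
t_med = statistics.median(times)    # sample median

lam_moments = 1 / t_bar             # equate 1/lambda to the sample mean
lam_median = math.log(2) / t_med    # equate ln(2)/lambda to the sample median

print(f"mean = {t_bar}, median = {t_med}")
print(f"lambda (moments) = {lam_moments:.5f}")
print(f"lambda (medians) = {lam_median:.5f}")
```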

  • For a parametric model with two parameters, the method of moments equates the sample and distribution means, and the sample and distribution second (central) moments.

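    For our sample of times of death $t_i$, $i = 1, 2, \ldots, n$, the sample second (central) moment, $s^2$ say, is given by

    $s^2 = \frac{1}{n} \sum_{i=1}^{n} (t_i - \bar{t})^2$,

    where $\bar{t}$ is the sample mean.

    If the distribution mean and variance are expressed as $\mu$ and $\sigma^2$, respectively, in terms of the two unknown parameters of the distribution, we would find estimates of those parameters by solving simultaneously the equations $\mu = \bar{t}$ and $\sigma^2 = s^2$.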

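    The two-parameter gamma distribution is defined by its PDF as $f(t) = \frac{1}{\beta^{\alpha} \Gamma(\alpha)} t^{\alpha-1} \exp(-t/\beta)$, $t > 0$, $\alpha > 0$, $\beta > 0$, and its mean and variance are $\mu = \alpha\beta$ and $\sigma^2 = \alpha\beta^2$. Using the sample data of the previous example, estimate $\alpha$ and $\beta$ by the method of moments.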
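    A minimal sketch of the moment calculation for the gamma model follows. It assumes the sample second central moment divides by n (not n - 1), and solves $\alpha\beta = \bar{t}$ and $\alpha\beta^2 = s^2$ directly; the variable names are illustrative.

```python
import statistics

times = [3, 4, 5, 7, 7, 8, 10, 10, 10, 12]  # days, from the mice example

t_bar = statistics.mean(times)       # sample mean
s2 = statistics.pvariance(times)     # second central moment (divides by n)

# alpha * beta = t_bar and alpha * beta^2 = s2, so beta = s2 / t_bar.
beta_hat = s2 / t_bar
alpha_hat = t_bar / beta_hat

print(f"t_bar = {t_bar}, s2 = {s2}")
print(f"alpha = {alpha_hat:.5f}, beta = {beta_hat:.5f}")
```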

    MLE: The method of maximum likelihood may also be used for models with two parameters. Here we would take the partial derivative of ln L with respect to each parameter, set each derivative to zero, and solve the resulting pair of equations simultaneously.

    Frequently these equations are not convenient ones in the unknown parameters, and a numerical solution must be obtained.

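    EXAMPLE: Estimate the Weibull parameters k and n from the sample data of the previous example by the method of maximum likelihood.

    [sol] The Weibull model has hazard rate $\lambda(t) = k t^n$ and survival function $S(t) = \exp[-k t^{n+1}/(n+1)]$, hence the PDF is $f(t) = S(t) \cdot \lambda(t) = k t^n \exp\left[-\frac{k t^{n+1}}{n+1}\right]$. Then the log-likelihood for the 10 observations is

    $\ln L = 10 \ln k + n \sum_{i=1}^{10} \ln t_i - \frac{k}{n+1} \sum_{i=1}^{10} t_i^{n+1}$.

    Differentiating, we obtain

    $\frac{\partial \ln L}{\partial k} = \frac{10}{k} - \frac{1}{n+1} \sum_{i=1}^{10} t_i^{n+1} = 0$,

    $\frac{\partial \ln L}{\partial n} = \sum_{i=1}^{10} \ln t_i - k \sum_{i=1}^{10} \left[ \frac{t_i^{n+1} \ln t_i}{n+1} - \frac{t_i^{n+1}}{(n+1)^2} \right] = 0$,

    which need to be solved numerically.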
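    One way to solve the pair of Weibull likelihood equations numerically is sketched below. It assumes the parameterization $f(t) = k t^n \exp[-k t^{n+1}/(n+1)]$, eliminates k using the first equation ($k = N(n+1)/\sum t_i^{n+1}$ for N observations), and then applies bisection to the remaining equation in n. The bracketing interval and tolerance are arbitrary choices, and the function names are illustrative.

```python
import math

times = [3, 4, 5, 7, 7, 8, 10, 10, 10, 12]  # days, from the mice example

def k_given_n(n, data):
    """Solve d ln L / dk = 0 for k at a fixed n: k = N (n+1) / sum(t^(n+1))."""
    c = n + 1.0
    return len(data) * c / sum(t ** c for t in data)

def score_n(n, data):
    """d ln L / dn evaluated at k = k_given_n(n, data)."""
    c = n + 1.0
    k = k_given_n(n, data)
    return sum(math.log(t) for t in data) - k * sum(
        t ** c * math.log(t) / c - t ** c / c ** 2 for t in data
    )

def weibull_mle(data, lo=-0.9, hi=19.0, tol=1e-12):
    """Bisection on score_n; assumes one sign change on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score_n(mid, data) > 0:
            lo = mid      # log-likelihood still increasing in n
        else:
            hi = mid
    n_hat = 0.5 * (lo + hi)
    return k_given_n(n_hat, data), n_hat

k_hat, n_hat = weibull_mle(times)
print(f"k = {k_hat:.6g}, n = {n_hat:.4f}")
```

    Bisection is used here only because it needs no derivative of the score; Newton-Raphson or a library optimizer would serve equally well.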

  • $S^o(t)$, the observed survival function, is a representation of the survival pattern observed for the sample to which it relates.

    $\hat{S}(t)$ is the parametric model with parameters estimated from the data.

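  • Approach to estimate S(t)

    Find the parameter values that minimize the sum of squared differences between the observed and the model survival functions,

    $SS = \sum_{i=1}^{n} [S^o(t_i) - \hat{S}(t_i)]^2$:

    differentiate SS with respect to each parameter, set each derivative to zero, and solve for the least-squares estimates of the parameters.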

    Example: Fit a uniform distribution (with unknown parameter $\omega$) to the data of the previous example.
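    A least-squares fit of the uniform model can be sketched as follows. Two assumptions are made that the slides do not spell out: the uniform survival function is taken as $S(t) = 1 - t/\omega$ for $0 \le t \le \omega$, and the observed survival at the i-th ordered death is taken as the midpoint of its step, $1 - (i - 0.5)/n$, with tied deaths treated as separate steps.

```python
times = sorted([3, 4, 5, 7, 7, 8, 10, 10, 10, 12])  # days, mice example
n = len(times)

# Assumed observed survival at the i-th ordered death: midpoint of the step
# from 1 - (i-1)/n down to 1 - i/n.
s_obs = [1 - (i - 0.5) / n for i in range(1, n + 1)]

def sum_squares(omega):
    """SS between observed survival and the uniform model S(t) = 1 - t/omega."""
    return sum((s - (1 - t / omega)) ** 2 for t, s in zip(times, s_obs))

# Setting dSS/d(1/omega) = 0 gives 1/omega = sum(t (1 - s)) / sum(t^2).
omega_hat = sum(t * t for t in times) / sum(
    t * (1 - s) for t, s in zip(times, s_obs)
)
print(f"omega = {omega_hat:.4f}, SS = {sum_squares(omega_hat):.5f}")
```

    The closed form is valid only if the resulting estimate is at least as large as the largest observed time, which should be checked after fitting.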

    Sometimes a transformation is needed. Example: Fit the data of the previous example to an exponential distribution, estimating $\lambda$ by the method of least squares. Minimizing SS directly leads to an equation in $\lambda$ that could be difficult to solve.

    To avoid this we will fit $\ln S(t_i) = -\lambda t_i$ to $\ln\{\frac{1}{2}[S^o(t_i) + S^o(t_{i-1})]\}$ instead.
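    The log-transformed fit is a straight line through the origin, so the least-squares estimate has a closed form. The sketch below assumes the observed survival function steps down by 1/n at each ordered death (tied deaths treated as separate steps), so that the midpoint at the i-th ordered death is $1 - (i - 0.5)/n$; this handling of ties is an assumption, not something stated in the slides.

```python
import math

times = sorted([3, 4, 5, 7, 7, 8, 10, 10, 10, 12])  # days, mice example
n = len(times)

# ln of the midpoint of the observed survival function at each ordered death.
y = [math.log(1 - (i - 0.5) / n) for i in range(1, n + 1)]

def sum_squares(lam):
    """SS between the transformed observations and the line -lam * t."""
    return sum((yi + lam * t) ** 2 for t, yi in zip(times, y))

# Minimising sum (y_i + lam t_i)^2 gives lam = -sum(t y) / sum(t^2).
lam_hat = -sum(t * yi for t, yi in zip(times, y)) / sum(t * t for t in times)
print(f"lambda (least squares, log scale) = {lam_hat:.5f}")
```

    Using the midpoint also avoids taking the logarithm of zero at the last death, where the observed survival function itself drops to 0.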

  • Grouped Times of Death

    This study design was described in Section 4.4, along with techniques for estimating a tabular survival model from such data. We are now interested in the estimation of parametric models, and we will consider the methods of maximum likelihood and least squares for this purpose.

    MLE

    Suppose n persons are alive at t = 0, and we observe their deaths in k non-overlapping intervals of equal length.

    Let $d_i$ be the number of deaths observed in (i, i+1]. The probability of death in (i, i+1] for a person alive at t = 0 is $_{i|}q_0 = S(i) - S(i+1)$, so the contribution of the (i+1)th interval to the likelihood is

    $L_i = [S(i) - S(i+1)]^{d_i}$.

    The overall likelihood is

    $L = \prod_{i=0}^{k-1} [S(i) - S(i+1)]^{d_i}$.

    Then S(i) - S(i+1) is written in terms of the unknown parameters of the chosen parametric model, and the parameters are found by maximizing L.
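    For an exponential model the grouped likelihood can be maximized in closed form. The sketch below uses hypothetical death counts (they are not data from these slides) and assumes unit-length intervals with every life dying inside the k intervals. With $S(t) = e^{-\lambda t}$, $\ln L = -\lambda \sum i\,d_i + n \ln(1 - e^{-\lambda})$, and setting the derivative to zero gives $e^{-\lambda} = m/(1+m)$ with $m = \sum i\,d_i / n$.

```python
import math

deaths = [4, 3, 2, 1]   # hypothetical d_0, ..., d_{k-1}; not from the source
n = sum(deaths)         # every life is assumed to die within the k intervals

def log_likelihood(lam, d):
    """ln L = sum_i d_i * ln[S(i) - S(i+1)] with S(t) = exp(-lam * t)."""
    return sum(
        di * math.log(math.exp(-lam * i) - math.exp(-lam * (i + 1)))
        for i, di in enumerate(d)
    )

# Closed form: exp(-lam) = m / (1 + m), where m = sum(i * d_i) / n.
# Requires m > 0, i.e. at least one death after the first interval.
m = sum(i * di for i, di in enumerate(deaths)) / n
lam_hat = math.log((1 + m) / m)

print(f"lambda = {lam_hat:.5f}, ln L = {log_likelihood(lam_hat, deaths):.5f}")
```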

    Example: Fit an exponential model to the grouped data below, estimating $\lambda$ by maximum likelihood.

  • LSE

    We recognize that $S(i) - S(i+1) = S(i) \cdot q_i$ is the multinomial probability that a person in the original sample of n at t = 0 will die in (i, i+1].

    Then $E[D_i] = n[S(i) - S(i+1)]$ is the expected number of deaths in (i, i+1].

    Since $d_i$ is the actual number of such deaths, least-squares estimates of the parameters of S(t) can be found by minimizing

    $SS = \sum_{i=0}^{k-1} w_i \{ d_i - n[S(i) - S(i+1)] \}^2$,

    where $w_i$ is the weight given to the (i+1)th interval ($w_i = 1$ for unweighted least squares).

    From the grouped data of the previous example, estimate the uniform distribution parameter $\omega$ by unweighted least squares.

    Generate =5
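    Unweighted least squares for grouped data can be sketched with a simple grid search. The death counts below are hypothetical (the grouped data of the example are not reproduced here), the uniform model is assumed to be $S(t) = 1 - t/\omega$ floored at zero, and the grid range and step are arbitrary choices.

```python
deaths = [3, 2, 3, 2]   # hypothetical d_0, ..., d_{k-1}; not from the source
n = sum(deaths)         # every life is assumed to die within the k intervals

def surv_uniform(t, omega):
    """Uniform survival function, S(t) = 1 - t/omega, floored at zero."""
    return max(0.0, 1.0 - t / omega)

def sum_squares(omega):
    """Unweighted SS between observed and expected deaths per interval."""
    return sum(
        (di - n * (surv_uniform(i, omega) - surv_uniform(i + 1, omega))) ** 2
        for i, di in enumerate(deaths)
    )

# Crude grid search over omega in (0.5, 20] with step 0.001.
grid = [0.5 + 0.001 * j for j in range(1, 19501)]
omega_hat = min(grid, key=sum_squares)

print(f"omega = {omega_hat:.3f}, SS = {sum_squares(omega_hat):.4f}")
```

    The search does not force the estimate to be at least k; whether to impose that constraint is a modelling choice.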

    Censoring: for censored data, such as data in the presence of a claims limit, treat it like grouped data.
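    One reading of this remark is that a loss capped at the limit u is known only to lie in the interval $(u, \infty)$, so it contributes the interval probability S(u) to the likelihood, while an uncapped loss contributes its PDF. The sketch below applies that idea to an exponential model; the claim amounts, the limit, and the choice of an exponential model are all hypothetical.

```python
import math

# (amount, censored) pairs; hypothetical data with a claims limit of 10.0.
observations = [(2.0, False), (3.5, False), (10.0, True),
                (6.0, False), (10.0, True), (1.5, False)]

def log_likelihood(lam, obs):
    """Exponential log-likelihood with right-censored observations."""
    total = 0.0
    for x, censored in obs:
        if censored:
            total += -lam * x                  # ln S(x): loss exceeded the limit
        else:
            total += math.log(lam) - lam * x   # ln f(x): exact loss observed
    return total

# Setting d ln L / d lam = 0: (number uncensored) / lam - sum(x) = 0.
uncensored = sum(1 for _, c in observations if not c)
exposure = sum(x for x, _ in observations)
lam_hat = uncensored / exposure

print(f"lambda = {lam_hat:.5f}")
```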