Xlstatdose Help

  • User's Manual

    XLSTAT-Dose

    Copyright 2003, Addinsoft

    http://www.addinsoft.com

  • 2

    Table of Contents

    Introduction ........................................ 4
    Evaluating XLSTAT-Dose .............................. 5
    Installing XLSTAT-Dose .............................. 6
        Minimum system requirements ..................... 6
        Installing XLSTAT-Dose .......................... 6
        Starting XLSTAT-Dose ............................ 7
    Using XLSTAT-Dose ................................... 8
        Selecting Data .................................. 8
    Dose effects analysis ............................... 9
        Description ..................................... 9
        Elements of the dialog box ...................... 9
        Elements of the Mortality dialog box ............ 12
        Results ......................................... 13
        Missing values .................................. 15
        Example ......................................... 15
        To know more about it ........................... 15
    Parallel four parameter fit ......................... 16
        Description ..................................... 16
        Elements of the dialog box ...................... 17
        Results ......................................... 19
        Missing values .................................. 19
        Example ......................................... 19
        To know more about it ........................... 20

  • 3

  • 4

    Introduction

    XLSTAT-Dose is an essential add-on for XLSTAT-Pro users who want to model the effect of concentration or dilution. XLSTAT-Dose has been designed for those who work in areas including:

    Agronomics (phytopathology, insecticides, herbicides),

    Medical research,

    Pharmaceutical research.

    XLSTAT-Dose incorporates the specific functionalities required for fitting complex models in these areas. For example, in Dose effect analysis, one of the important factors that researchers must take into account is the natural mortality factor: i.e., when a dose of a chemical component is applied to a number of insects, some of them will die because of the effect of the component, others for entirely different reasons. XLSTAT-Dose enables researchers to make the distinction between the two categories.

    Addinsoft has integrated XLSTAT-Dose into the XLSTAT suite, the most comprehensive and effective set of statistical analytic tools available for Microsoft Excel. Particular attention was paid to keep XLSTAT-Dose as user-friendly as possible.

    For more information, please go to:

    http://www.xlstat.com

  • 5

    Evaluating XLSTAT-Dose

    As is the case for all tools developed by Addinsoft, you can test XLSTAT-Dose free-of-charge for a period of up to 30 days, using it a maximum of 30 times. Once the evaluation period is over and you are sure you are satisfied, all you have to do is contact Addinsoft to obtain a version that can be used over an unlimited period of time.

    To order XLSTAT-Dose, simply go to our web page http://www.xlstat.com/order-dose.htm or email us at [email protected]. One of our product consultants will contact you within 24 hours.

  • 6

    Installing XLSTAT-Dose

    Minimum system requirements

    Operating System: Windows 98, Me, NT, 2000, XP

    RAM: 64 MB

    Processor: 200 MHz

    Disk Space: 5 MB

    Microsoft Excel Version: Excel 97, 2000 or 2002

    XLSTAT-Pro Version: XLSTAT-Pro 5.1 v5 or higher

    XLSTAT-Dose is integrated in the XLSTAT suite, with XLSTAT-Pro as the core module. XLSTAT-Pro version 5.1 v5 must be installed before installing XLSTAT-Dose.

    Installing XLSTAT-Dose

    The very first step is to install XLSTAT-Pro on your computer, if it has not already been installed.

    If you obtained XLSTAT-Dose via the Internet:

    Execute the file that you downloaded,

    Follow the step-by-step instructions presented during the installation process.

    If you obtained XLSTAT-Dose on a CD-ROM:

    Insert the CD-ROM into your CD-ROM drive,

    The installation program will start automatically,

    Follow the step-by-step instructions presented during the installation process.

    To check that XLSTAT-Dose has been correctly installed, start XLSTAT-Pro, select Options, click on Modules on the toolbar, and verify that XLSTAT-Dose appears in the list of installed modules.

    To open the Options dialog box, click the button located on the toolbar.

  • 7

    Starting XLSTAT-Dose

    To start XLSTAT-Dose,

    Use the XLSTAT-Dose menu located on the XLSTAT menu,

    Or click on the button located in the XLSTAT main toolbar, to display the XLSTAT-Dose toolbar.

  • 8

    Using XLSTAT-Dose

    Selecting Data

    All XLSTAT suite modules share a number of features, making them easier to master for the end-user.

    XLSTAT-Dose dialog boxes, results presentation, data selection, and data type control features are similar to those of the XLSTAT-Pro core module. For information on how to use other XLSTAT-Dose features, please refer to XLSTAT-Pro's "Online Help" section.

  • 9

    Dose effects analysis

    Use this tool to analyze the effect of doses after an experiment. It enables you to fit four different models, automatically take the log of the doses, and take the natural mortality into account.

    Description

    The dose effects analysis tool allows you to build models that can explain the effect of doses in a wide range of research areas. The effect of a dose is measured by a binary response variable, typically Died/Survived or Yes/No, which is often coded as 0 if the expected event did not happen and 1 if it did. A given dose is frequently applied to more than one subject, so instead of being presented in a binary format, the results are presented as sums of binary variables (counts).

    Elements of the dialog box

    Empty dialog box

    The dialog box contains the following elements:

  • 10

    Left-hand side section:

    Response: Select the data that correspond to the response variable. The data can be either binary data (categorical or numerical, e.g. Yes/No, 0/1) representing whether an event occurred or not, or a quantitative variable counting the number of cases for which the event occurred.

    Weights: Select the data that correspond to the weights of the observations. If the "Responses" are binary data and all the weights are 1, you do not need to fill in this box, as the default weight is 1. If, on the other hand, the "Responses" are counts (discrete numerical data), you must select the weights. Example: if the counts correspond to the number of insects that died for a given dose, then the corresponding "Weights" represent the number of insects exposed to the same dose during the experiment. The "Weight" for each observation must always be greater than or equal to the corresponding count.

    Quantitative variables: Select the quantitative (numerical, continuous or discrete) explanatory variable(s) that you want to include in the model, typically the dose variable. There can be one or more variables. If the "Take the log" option is activated, the log of the selected variables will be used automatically when estimating the model parameters.

    Qualitative variables: Select the qualitative explanatory variable(s) that you want to include in the model. There can be one or more qualitative variables, each with two or more categories. The variables can be binary variables (exposed to light yes/no) or multinomial variables (age category or citizenship, for example).

    Column labels: Select this option if the first row of the selected variables (response variable and explanatory variables) contains labels.

    Observation labels: Select the row labels if available.

    Confidence interval (%): The value (between 1 and 99) used to determine the confidence range of the dose effects. Default value is 95.

    Right-hand side section:

    Range: Results are displayed starting from a cell of an existing sheet. Once you choose this option, select in the corresponding box the cell that will be the top left corner of the results tables.

    Sheet: Results are displayed in a new sheet of the active workbook.

    Workbook: Results are displayed in a new workbook.

    OK: Click this button to start the computations.

    Cancel: Click this button to close the dialog box.

    Help: Click this button to activate the XLSTAT-Dose online help.

  • 11

    Take the log: Activate this option to enable XLSTAT-Dose to automatically take the log of the quantitative variables; in dose analysis, the log of the variable has often proven to be more informative for the model than the variable itself.

    Residuals: Select this option to let XLSTAT-Dose display the tables corresponding to the analysis of predictions and residuals.

    Chart: Activate this option to display the chart. A chart is displayed only when there is one quantitative explanatory variable.

    Confidence intervals: Activate this option if you want the confidence intervals to be displayed on the chart.

    Constraints: Select here the constraint to put on the qualitative variables:

    a1 = 0: the parameter corresponding to the first category is set to 0, for each qualitative variable.

    Sum (ai) = 0: the sum of the parameters corresponding to the categories is set to 0 for each qualitative variable.

    Model: XLSTAT-Dose offers you the possibility to fit four different models to the data; in dose analysis the most commonly used model is the Probit model (derived from the cumulative normal distribution function). You can also use the Logit model (logistic regression), the Complementary Log-log model, and Gompertz function. Below are the formulas corresponding to each of the models:

    o Probit: $p = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{bX} \exp\left(-\frac{x^2}{2}\right) dx$

    o Logit: $p = \frac{\exp(bX)}{1 + \exp(bX)}$

    o Gompertz: $p = \exp\left[-\exp(-bX)\right]$

    o Complementary Log-log: $p = 1 - \exp\left[-\exp(bX)\right]$

    bX represents the linear combination of the explanatory variables.

    NB: to fit the model, XLSTAT-Dose maximizes the likelihood function.
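    As an illustration of the four models and of the likelihood that is maximized, here is a minimal Python sketch. It is not XLSTAT-Dose code: the function names, the single dose variable with an intercept and a slope, and the form of the binomial log likelihood (written up to an additive constant) are assumptions made for the example.

```python
import math
from statistics import NormalDist


def probit(bx: float) -> float:
    """Cumulative standard normal distribution evaluated at bX."""
    return NormalDist().cdf(bx)


def logit(bx: float) -> float:
    """exp(bX) / (1 + exp(bX)), written in an algebraically equivalent form."""
    return 1.0 / (1.0 + math.exp(-bx))


def gompertz(bx: float) -> float:
    """exp[-exp(-bX)]."""
    return math.exp(-math.exp(-bx))


def cloglog(bx: float) -> float:
    """1 - exp[-exp(bX)]."""
    return 1.0 - math.exp(-math.exp(bx))


def log_likelihood(model, intercept, slope, doses, counts, weights):
    """Binomial log likelihood, up to a constant, for one dose variable.

    counts[i] subjects out of weights[i] showed the event at doses[i].
    Assumes every predicted probability lies strictly between 0 and 1.
    """
    total = 0.0
    for x, r, n in zip(doses, counts, weights):
        p = model(intercept + slope * x)
        total += r * math.log(p) + (n - r) * math.log(1.0 - p)
    return total
```

    A fitting routine would search for the intercept and slope that make `log_likelihood` as large as possible; the optimizer itself is left out of this sketch.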

    Check mortality: Activate this option to enable XLSTAT-Dose to take into account the natural mortality of the subjects, and to avoid a bias in the model. This is particularly important when studying dose effects on species that have a life span comparable to the duration of the experiment (i.e. if it is likely that some individuals will die because of factors other than the dose). "Natural" is somewhat misleading, as several types of non-natural mortality factors, such as predation, food, or environment, could also bias the model.

  • 12

    Intercept=0: Select this option to constrain the model to have a constant term equal to 0.

    Convergence: The value used to determine when the likelihood value has converged. Default value is 10^-5.

    Example of the dialog box once filled in

    Elements of the Mortality dialog box

    Empty dialog box

    The mortality dialog box contains the following elements:

  • 13

    Tested: Enter the number of individuals included in the experiment with dose 0 to evaluate the natural mortality. If no experiment is available, you may want to enter an appropriate value in both boxes to simulate a value coming from the available literature. Note: if you select the optimized option, you are not required to enter a value (the values would be used as a starting point for the optimization).

    Counts: Enter the number of individuals that died in the experiment with dose 0, to evaluate the natural mortality. If no experiment is available, you may want to enter an appropriate value in both boxes (Tested and Counts) to simulate a value coming from the available literature. Note: if you select the optimized option, you are not required to enter a value (the values would be used as a starting point for the optimization).

    NB: If some values are entered, then Tested must be greater than Counts.

    Natural mortality parameter:

    o Optimized: Select this option if you want XLSTAT-Dose to optimize the mortality parameter as well as the other model parameters.

    o Fixed: Select this option if you want XLSTAT-Dose to take into account the natural mortality parameter computed with the values entered in the Tested and Counts boxes. With this option, the mortality parameter will stay unchanged while the model parameters are optimized.

    Results:

    o Real values: Select this option if you want the results (charts and tables) to take the natural mortality into account.

    o Dose effect: Select this option if you want XLSTAT-Dose to isolate the natural mortality effect so that the charts and tables show only the dose effect.

    OK: Click on this button to start the computations.

    Cancel: Click on this button to close the dialog box.

    Help: Click on this button to activate XLSTAT-Dose online help.
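    The role of the natural mortality parameter can be sketched with an Abbott-style correction (Abbott, 1925). This is an assumption about the general approach, not a description of the exact computations performed by XLSTAT-Dose; the function names are hypothetical. With c the natural mortality (Counts/Tested for the Fixed option), the "real" probability of the event combines c with the probability due to the dose alone, and the dose effect is recovered by inverting that relation.

```python
def natural_mortality(counts: int, tested: int) -> float:
    """Counts/Tested ratio observed at dose 0."""
    if tested <= 0 or counts < 0 or counts >= tested:
        raise ValueError("Tested must be positive and greater than Counts")
    return counts / tested


def real_probability(p_dose: float, c: float) -> float:
    """Probability of the event when natural mortality c acts alongside the dose."""
    return c + (1.0 - c) * p_dose


def dose_effect(p_real: float, c: float) -> float:
    """Abbott-style correction: probability attributable to the dose alone."""
    return (p_real - c) / (1.0 - c)
```

    For example, with 2 deaths among 20 untreated subjects, `natural_mortality(2, 20)` gives the value of c that would stay fixed while the other model parameters are optimized.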

    Results

    Summary Statistics for Quantitative variables: Table displaying the mean and the standard error for the quantitative explanatory variables.

  • 14

    Summary Statistics for Qualitative variables: Table displaying the number of different categories, the name of each category, and the respective frequency for each of the qualitative explanatory variables.

    Model parameters: Table displaying the estimator for each parameter of the model. The standard error of the estimator, the corresponding Chi-square value and the corresponding probability are also included. If the probability is low, it means the parameter brings a significant amount of information to the model. If it is high, removing the corresponding variable would have little effect on the quality of the fit of the model.

    Natural Mortality: If the mortality option is activated in the main dialog box, and if the Fixed option has been selected in the mortality dialog box, the value displayed is the Counts/Tested ratio. If the Optimized option has been selected in the mortality dialog box, the value displayed is the optimized value.

    Goodness of fit:

    Observations: the total number of observations taken into account to estimate the model parameters (sum of the weights);

    Log likelihood: the logarithm of the likelihood function (the higher, the better the model). Note: the parameter estimators are computed by maximizing the likelihood function;

    Log likelihood (indep): the logarithm of the likelihood function corresponding to the independent model. Note: the independent model is the constant model where the probability is equal to the average probability for the event to occur, computed for the selected data; the independent model can be interpreted as the case where no information is available; the greater the difference between the Log likelihood and the Log likelihood (indep), the more information the selected variables bring to the model;

    Pearson's Chi-Sq: measures the Chi-square distance between the observed frequencies and the predicted frequencies. The lower the value, the better the fit;

    Pearson's DF: the degrees of freedom of the Chi-square distribution associated with the Pearson's Chi-Sq (DF = sum of the weights minus the number of parameters used in the model);

    Prob>Pearson's Chi-Sq: the probability corresponding to the Pearson's Chi-Sq. This value gives the probability of being wrong when saying that the explanatory variables bring significant information to explain the observed values;

    L.R. Chi-Sq: the likelihood ratio statistic comparing the likelihood (indep) with the likelihood of the model; the exact formula is -2 x Log[likelihood (indep) / likelihood];

    DF (L.R. Chi-Sq): the degrees of freedom of the Chi-Square distribution corresponding to the L.R. Chi-Sq value;

    Prob>L.R. Chi-Sq: the probability corresponding to the L.R. Chi-Sq. This value gives the probability of being wrong when saying that the explanatory variables bring significant information compared to the independent model.

    R²: the determination coefficient (R-Square) for the observed and predicted values. It is not as well suited here as it is for linear regression;

    R² (McFadden): a modified R² which is better suited to this kind of model. Like the R², McFadden's R² lies between 0 and 1.
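    The likelihood-based coefficients above can be illustrated with a short Python sketch. The manual does not give the formula for McFadden's coefficient; the usual textbook definition (one minus the ratio of the two log likelihoods) is assumed here, and the likelihood ratio statistic is taken as twice the difference between the model log likelihood and the independent log likelihood. Function names are hypothetical.

```python
import math


def independent_log_likelihood(counts, weights):
    """Log likelihood of the constant model using the average event probability.

    Assumes the average probability lies strictly between 0 and 1.
    """
    p = sum(counts) / sum(weights)
    return sum(
        r * math.log(p) + (n - r) * math.log(1.0 - p)
        for r, n in zip(counts, weights)
    )


def lr_chi_square(ll_model: float, ll_indep: float) -> float:
    """Likelihood ratio statistic: twice the gain in log likelihood."""
    return 2.0 * (ll_model - ll_indep)


def mcfadden_r2(ll_model: float, ll_indep: float) -> float:
    """McFadden's coefficient (assumed definition): 1 - LL(model) / LL(indep)."""
    return 1.0 - ll_model / ll_indep
```

    Both quantities grow as the fitted model moves away from the independent model, which matches the interpretation given for the Log likelihood (indep) entry.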

  • 15

    Predictions and residuals: Table giving for each observation, the input data and the outputs of the selected model, including the estimated probability (model and independent model), the residuals, and the reduced residuals.

    Chart: Displayed only if there is one quantitative explanatory variable. Displays the input data and the curve corresponding to the fitted model.

    Dose analysis with the fitted model: This table is displayed only when there is one quantitative variable and no qualitative variable. The table presents the various doses (and the corresponding confidence intervals) corresponding to different probabilities for the event to occur. These values are commonly called Effective Doses (EDs), or Log Effective Doses when the log of the dose is used in the model (LDs). Example: The ED50 is the dose which, based on the model, corresponds to a 50% probability for the event to occur. The Heterogeneity factor is computed when the Probability (Prob>Pearson's Chi-Sq) is lower than 0.1, in which case a second set of confidence intervals is added. These values cannot always be computed because of numerical constraints.
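    As a sketch of how an effective dose follows from a fitted model, the Python function below inverts a Probit model with one quantitative variable. The parameter names, and the assumption that "Take the log" means a base-10 logarithm, are illustrative only; confidence intervals and the heterogeneity factor are not covered.

```python
from statistics import NormalDist


def effective_dose(prob: float, intercept: float, slope: float,
                   log10_dose: bool = True) -> float:
    """Dose at which a Probit model predicts the probability `prob`.

    The model is p = Phi(intercept + slope * x), where x is the dose or,
    if log10_dose is True, the base-10 log of the dose (an assumption).
    """
    z = NormalDist().inv_cdf(prob)   # value of bX that yields `prob`
    x = (z - intercept) / slope      # solve intercept + slope * x = z
    return 10.0 ** x if log10_dose else x
```

    Calling `effective_dose(0.5, intercept, slope)` gives the ED50 for the fitted parameters; other probabilities (0.1, 0.9, and so on) give the corresponding EDs.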

    Missing values

    If some missing data are detected in the weights, the variable(s) to model or the explanatory variables, you can choose either to remove them or to estimate them. This is true for the observations used to build the model, and for the supplementary observations.

    If you choose to replace the missing data, the mean is used for the quantitative explanatory variables, the variable(s) to model and the weights. For the qualitative variables, the mode of the variable is used.
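    The replacement rule can be sketched in a few lines of Python. Representing a missing value by `None` and the function names are assumptions made for the example.

```python
from statistics import mean, mode


def fill_quantitative(values):
    """Replace missing (None) entries by the mean of the known values."""
    known = [v for v in values if v is not None]
    m = mean(known)
    return [m if v is None else v for v in values]


def fill_qualitative(values):
    """Replace missing (None) entries by the most frequent known category."""
    known = [v for v in values if v is not None]
    m = mode(known)
    return [m if v is None else v for v in values]
```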

    Example

    An example of dose analysis based on experimental data is available on the Addinsoft web site. The experiment in question was designed to determine the effect of different insecticide doses on a group of insects. To consult the tutorial, please go to:

    http://www.xlstat.com/demo-dose.htm

    To know more about it

    Abbott W.S. (1925). A method for computing the effectiveness of an insecticide. Jour. Econ. Entomol. 18: 265-267.

    Finney D.J. (1971). Probit Analysis. 3rd ed., Cambridge, London and New-York.

    Tallarida R.J. (2000). Drug Synergism & Dose-Effect Data Analysis, CRC/Chapman & Hall, Boca Raton.

  • 16

    Parallel four parameter fit

    Use this tool to analyze the effect of a quantitative variable on a response variable using the four parameter logistic model. XLSTAT-Dose enables you to take into account some standard data while fitting the model.

    Description

    The Parallel four parameter fit tool relies on the four parameter logistic model to determine the optimal value of the parameters, using the least squares method, given a sample of data. If a standard sample has been included in the analysis, the tool takes into account the constraints related to that sample.

    The four parameter logistic model is written as:

    $$y = a - \frac{a - d}{1 + \left(\frac{x}{c}\right)^{b}} \qquad (1)$$

    where a, b, c, d are the parameters of the model, and where x corresponds to the explanatory variable and y to the response variable. The parameters a and d respectively represent the lower and upper asymptotes, and b is the slope parameter. c is the abscissa of the mid-height point, whose ordinate is (a+d)/2. When a is lower than d, the curve decreases from d to a, and when a is greater than d, the curve increases from d to a.
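    The properties of model (1) can be checked numerically with a minimal Python sketch. The function name is hypothetical, and x is assumed to be non-negative with b positive.

```python
def four_pl(x: float, a: float, b: float, c: float, d: float) -> float:
    """Four parameter logistic model (1): y = a - (a - d) / (1 + (x / c) ** b)."""
    return a - (a - d) / (1.0 + (x / c) ** b)
```

    With a = 1, b = 2, c = 10 and d = 3, the function returns d at x = 0, the mid-height value (a + d)/2 at x = c, and values approaching a as x grows.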

    The parallel four parameter logistic model is written as:

    $$y = a - \frac{a - d}{1 + \left(st \cdot \frac{x}{c_1} + sp \cdot \frac{x}{c_2}\right)^{b}} \qquad (2)$$

    where st is 1 if the observation comes from the standard sample (STD), and 0 if not, and where sp is 1 if the observation is from the sample of interest (SOI), and 0 if not. This is a constrained model because the observations corresponding to the standard sample influence the optimization of the values of a, b, and d. From the form of the model, one can see that it generates two parallel curves, whose only difference is their position, the shift being given by (c2-c1). If c2 is greater than c1, the curve corresponding to the sample of interest is shifted to the right of the curve corresponding to the standard sample, and vice-versa.
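    A minimal Python sketch of model (2) and of the least squares criterion follows. The function names and the row layout (x, y, indicator) are assumptions made for the example; the indicator follows the convention of the dialog box (1 for the standard sample, 0 for the sample of interest).

```python
def parallel_four_pl(x, st, sp, a, b, c1, c2, d):
    """Parallel four parameter logistic model (2).

    st is 1 for the standard sample (STD), sp is 1 for the sample of
    interest (SOI); the two curves share a, b and d and differ by c1 / c2.
    """
    return a - (a - d) / (1.0 + (st * x / c1 + sp * x / c2) ** b)


def sum_of_squared_residuals(rows, a, b, c1, c2, d):
    """Least squares criterion over rows of (x, y, indicator)."""
    ssr = 0.0
    for x, y, indicator in rows:
        st = 1 if indicator == 1 else 0
        sp = 1 if indicator == 0 else 0
        ssr += (y - parallel_four_pl(x, st, sp, a, b, c1, c2, d)) ** 2
    return ssr
```

    An optimizer would adjust a, b, c1, c2 and d to minimize `sum_of_squared_residuals`; each curve reaches its mid-height at its own c value, which is what produces the horizontal shift between the two curves.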

    XLSTAT-Dose allows you to fit

    a) model (1) to a standard sample or to the sample of interest,

  • 17

    b) model (2) to both STD and SOI at the same time.

    If the Dixon's test option is activated, XLSTAT-Dose can test for each sample (STD and SOI) whether some outliers unduly influence the fit of the model. In the (a) case, a Dixon's test is performed once model (1) is fitted. If an outlier is detected, it is removed and the model is fitted again, and so on until no outlier is detected. In the (b) case, a Dixon's test is first performed on the STD, then on the SOI, and model (2) is then fitted on the two merged samples, without the outliers.

    In the (b) case, and if the sum of the sample sizes is greater than 9, a Fisher's F test is performed to determine whether the a, b and d parameters obtained by fitting model (1) to each sample separately differ significantly from those obtained with model (2).
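    The manual does not give the formula of the F statistic. The sketch below assumes the standard extra-sum-of-squares F test: the two separate fits of model (1) use 8 parameters in total, the parallel model (2) uses 5, and the statistic compares the increase in the sum of squared residuals with the residual variance of the separate fits. The function and argument names are hypothetical.

```python
def parallelism_f_statistic(ssr_separate: float, ssr_parallel: float,
                            n_obs: int) -> float:
    """Extra-sum-of-squares F statistic (assumed form, not from the manual).

    ssr_separate: summed SSR of model (1) fitted to STD and SOI separately.
    ssr_parallel: SSR of model (2) fitted to both samples at once.
    n_obs:        total number of observations (STD + SOI).
    """
    if n_obs <= 9:
        raise ValueError("the test requires more than 9 observations")
    df_separate = n_obs - 8          # two curves, four parameters each
    df_parallel = n_obs - 5          # a, b, d shared, plus c1 and c2
    numerator = (ssr_parallel - ssr_separate) / (df_parallel - df_separate)
    denominator = ssr_separate / df_separate
    return numerator / denominator
```

    Under these assumptions the statistic would be compared with an F distribution with 3 and n_obs - 8 degrees of freedom; that lookup is omitted because it needs a statistical library.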

    Elements of the dialog box

    Empty dialog box

    The XLSTAT-Dose dialog box contains the following elements:

    Left-hand side section:

    Response variable: Select the data that correspond to the response variable. The data must correspond to a quantitative variable.

    Quantitative variable: Select the quantitative explanatory variable that you want to include in the model, typically the concentration or dilution variable.

  • 18

    Indicator variable: Select the variable that describes whether the data belong to the standard sample (value 1), or to the sample of interest (value 0). Note that if you want the model to ignore an observation, you only need to set the value of the indicator variable to -1.

    Column labels: Select this option if the first row of the selected variables (response variable and explanatory variable) contains labels.

    Observation labels: Select the row labels if available.

    Right-hand side section:

    Range: Results are displayed starting from a cell of an existing sheet. Once you choose this option, select in the corresponding box the cell that will be the top left corner of the results tables.

    Sheet: Results are displayed in a new sheet of the active workbook.

    Workbook: Results are displayed in a new workbook.

    OK: Click this button to start the computations.

    Cancel: Click this button to close the dialog box.

    Help: Click this button to activate the XLSTAT-Dose online help.

    Chart: Activate this option to display the chart. A chart is displayed only when there is one quantitative explanatory variable.

    Dixon's test: Activate this option to test each sample for outliers with Dixon's test. Detected outliers are excluded when the model is fitted.

    More / Less: click on "More" to display the advanced options of the dialog box, or on "Less" to resize it to the initial size.

    Confidence interval (%): The value (90, 95, or 99) used to determine the confidence range for the Dixon's test. Default value is 95.

    Conditions to stop: When XLSTAT-Dose performs the optimization of the parameters of the model, it will stop if one of the two following criteria is met.

    o Iterations: Computations stop when this number of iterations has been reached. Default value is 500.

    o Convergence: Iterations stop when the change in the optimization criterion (sum of squares of residuals) from one iteration to the next is lower than this value. Default value is 10e-5.

  • 19

    Results

    Data description: Table displaying the frequency, the mean, and the standard deviation of both the response and the quantitative variables.

    Results for the model: Table displaying the theoretical equation of the model, the equation with the optimized parameters, the R² of the model, and the corresponding sum of squares of residuals.

    Goodness of fit coefficients: Table displaying, for the fitted model, the coefficient of correlation (R), the coefficient of determination (R²), and the corresponding sum of squares of residuals (SSR).

    Fisher's test assessing parallelism between curves: The Fisher's F test is used to determine whether the standard sample and the sample of interest have significantly different a, b and d parameters. If the probability corresponding to the F value is lower than the significance level, the difference can be considered significant. The significance level is the default value set in the "Calculations" panel of the Options dialog box.

    Model parameters: Table displaying the estimator and the standard error of the estimator for each parameter of the model.

    Ignored outliers: Table displaying the observations that have not been used when optimizing the model, because the Dixon's test detected them as outliers. The table displays, for each observation, the input data, the prediction using the optimized model, and the corresponding residual and standardized residual.

    Predictions and residuals: Table giving for each observation the input data and corresponding prediction, the residuals, and the standardized residuals.

    Charts: The data that have been removed by the Dixon's test are not displayed on any of the charts. The first chart displays, in blue, the data and the curve corresponding to the standard sample and, in red, the data and the curve corresponding to the sample of interest. The second and third charts display the standardized residuals versus the explanatory variable and versus the observed response variable. The fourth chart is the histogram of the standardized residuals.

    Missing values

    Missing data are not accepted with this tool. However, if you want some observations to be ignored during the computations, you can set the corresponding values of the indicator variable to -1.

    Example

    An example of parallel four parameter fit based on experimental data is available on the Addinsoft web site. The experiment in question was designed to determine the effect of the log concentration of a chemical component on the optical density. To consult the tutorial, please go to:

  • 20

    http://www.xlstat.com/demo-4pl.htm

    To know more about it

    Dixon W.J. (1953). Processing data for outliers. Biometrics, 9: 74-89.

    Tallarida R.J. (2000). Drug Synergism & Dose-Effect Data Analysis, CRC/Chapman & Hall, Boca Raton.