
Basic Methods in Theoretical Biology


Page 1: Basic Methods  in  Theoretical Biology

Basic Methods in Theoretical Biology

1 Methodology
2 Mathematical toolkit
3 Models for processes
4 Model-based statistics

http://www.bio.vu.nl/thb/course/tb/tb.pdf

Page 2: Basic Methods  in  Theoretical Biology

Empirical cycle 1.1

Page 3: Basic Methods  in  Theoretical Biology

Assumptions summarize insight 1.1

• task of research: make all assumptions explicit; these should fully specify subsequent model formulations

• assumptions: interface between experimentalist and theoretician

• discrepancy between model predictions and measurements: identify which assumption needs replacement

• models that give wrong predictions can be very useful to increase insight

• structure the list of assumptions according to replaceability (mind consistency!)

Page 4: Basic Methods  in  Theoretical Biology

Model: definition & aims 1.1

• model: scientific statement in mathematical language
  “all models are wrong, some are useful”

• aims:
  structuring thought; the single most useful property of models: “a model is not more than you put into it”
  how do factors interact? (mechanisms/consequences)
  design of experiments, interpretation of results
  inter-, extra-polation (prediction)
  decision/management (risk analysis)

• observations/measurements: require interpretation, so involve assumptions
  best strategy: be as explicit as possible in assumptions

Page 5: Basic Methods  in  Theoretical Biology

Model properties 1.1

• language errors: mathematical, dimensions, conservation laws

• properties:
  generic (with respect to application)
  realistic (precision; consistency with data)
  simple (math. analysis, aid in thinking)
    complex models are easy to make, difficult to test
    simple models that capture essence are difficult to make
  plasticity in parameters (support, testability)

• ideals:
  assumptions for mechanisms (coherence, consistency)
  distinction action variables vs measured quantities
  need for core and auxiliary theory

Page 6: Basic Methods  in  Theoretical Biology

Modelling 1 1.1

• model: scientific statement in mathematical language
  “all models are wrong, some are useful”

• aims:
  structuring thought; the single most useful property of models: “a model is not more than you put into it”
  how do factors interact? (mechanisms/consequences)
  design of experiments, interpretation of results
  inter-, extra-polation (prediction)
  decision/management (risk analysis)

Page 7: Basic Methods  in  Theoretical Biology

Modelling 2 1.1

• language errors: mathematical, dimensions, conservation laws

• properties:
  generic (with respect to application)
  realistic (precision)
  simple (math. analysis, aid in thinking)
  plasticity in parameters (support, testability)

• ideals:
  assumptions for mechanisms (coherence, consistency)
  distinction action variables/measured quantities
  core/auxiliary theory

Page 8: Basic Methods  in  Theoretical Biology

Presumptions Laws 1.1

Laws, Theories, Hypotheses, Presumptions: decrease in demonstrated support

amount of support is always limited

Proofs only exist in mathematics

[Figure: role of abstract concepts, on an axis from 0 to large; “facts” (no predictions possible) at one end, “general theories” (predictions possible) at the other]

Page 9: Basic Methods  in  Theoretical Biology

Theories Models 1.1

Theory: set of coherent and consistent assumptions from which models can be derived for particular situations

Models may or may not represent theories; it depends on the assumptions on which they are based

If a model itself is the assumption, it is only a description; if it is inconsistent with data and must be rejected, you have nothing

If a model that represents a theory must be rejected, a systematic search can start for assumptions that need replacement

Unrealistic models can be very useful in guiding research to improve assumptions (= insight)

Many models don’t need to be tested against data because they fail more important consistency tests

Testability of models/theories comes in gradations

Page 10: Basic Methods  in  Theoretical Biology

Auxiliary theory 1.1

Quantities that are easy to measure (e.g. respiration, body weight) have contributions from several processes; they are not suitable as variables in explanatory models

Variables in explanatory models are not directly measurable; we need auxiliary theory to link core theory to measurements

Standard DEB model: isomorph with 1 reserve & 1 structure that feeds on 1 type of food

Page 11: Basic Methods  in  Theoretical Biology

Measurements typically involve interpretations, models 1.1

Given: “the air temperature in this room is 19 degrees Celsius”
Used equipment: mercury thermometer

Assumption: the room has a temperature (spatially homogeneous)
Actual measurement: height of mercury column
Height of the mercury column → temperature: model!
How realistic is this model? What if the temperature is changing?

Task: make assumptions explicit and be aware of them
Question: what is calibration?

Page 12: Basic Methods  in  Theoretical Biology

Complex models 1.1

• hardly contribute to insight

• hardly allow parameter estimation

• hardly allow falsification

Avoid complexity by

• delineating modules

• linking modules in simple ways

• estimating parameters of modules only

Page 13: Basic Methods  in  Theoretical Biology

Causation 1.1

Cause and effect sequences can work in chains: A → B → C

But are problematic in networks

[Figure: network linking A, B and C]

Framework of dynamic systems allows for a holistic approach

Page 14: Basic Methods  in  Theoretical Biology

Dimension rules 1.2

• quantities left and right of = must have equal dimensions

• + and – only defined for quantities with same dimension

• ratios of variables with similar dimensions are only dimensionless if addition of these variables has a meaning within the model context

• never apply transcendental functions (log, exp, sin, …) to quantities with a dimension
  What about pH, and pH1 – pH2?

• don’t replace parameters by their values in model representations
  y(x) = a x + b, with a = 0.2 M⁻¹, b = 5
  y(x) = 0.2 x + 5
  What dimensions have y and x?
  Distinguish dimensions and units!
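The pH question can be explored numerically. The sketch below assumes pH is taken as minus the base-10 logarithm of a concentration expressed in some chosen unit. The function name `ph` and the two concentrations are hypothetical, chosen only for illustration. It suggests that a single pH value shifts with the unit, while a difference pH1 – pH2 is the logarithm of a dimensionless ratio and does not.

```python
import math


def ph(conc, unit_in_molar=1.0):
    """Minus log10 of a concentration expressed in a chosen unit.

    conc is given in mol/l; unit_in_molar is the size of the chosen unit
    in mol/l (1.0 for M, 1e-3 for mM). Dividing by the unit is what makes
    the argument of the logarithm dimensionless.
    """
    return -math.log10(conc / unit_in_molar)


c1, c2 = 1e-7, 1e-5  # two hypothetical concentrations in mol/l

# The number attached to one concentration depends on the unit ...
ph_molar = ph(c1)                # concentration expressed in M
ph_millimolar = ph(c1, 1e-3)     # same concentration expressed in mM

# ... but the difference is -log10(c1 / c2), the log of a ratio,
# so the unit cancels.
diff_molar = ph(c1) - ph(c2)
diff_millimolar = ph(c1, 1e-3) - ph(c2, 1e-3)
```

Under these assumptions the two differences coincide, whereas `ph_molar` and `ph_millimolar` differ by 3, the logarithm of the unit ratio.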

Page 15: Basic Methods  in  Theoretical Biology

Models with dimension problems 1.2

• Allometric model: $y = a W^b$
  y: some quantity; a: proportionality constant; W: body weight; b: allometric parameter in (2/3, 1)
  Usual form: $\ln y = \ln a + b \ln W$
  Alternative form: $y = y_0 (W/W_0)^b$, with $y_0 = a W_0^b$
  Alternative model: $y = a L^2 + b L^3$, where $L \propto W^{1/3}$

• Freundlich’s model: $C = k c^{1/n}$
  C: density of compound in soil; k: proportionality constant; c: concentration in liquid; n: parameter in (1.4, 5)
  Alternative form: $C = C_0 (c/c_0)^{1/n}$, with $C_0 = k c_0^{1/n}$
  Alternative model: $C = 2 C_0 c (c_0 + c)^{-1}$ (Langmuir’s model)

Problem: no natural reference values $W_0$, $c_0$

Values of $y_0$, $C_0$ depend on the arbitrary choice
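A small numerical sketch of the dimension problem in the allometric model follows. All numbers and function names are hypothetical. It shows that changing the unit of W rescales the coefficient a by a factor that depends on b, and that the reference form reproduces the same predictions for any choice of $W_0$, which is why that choice is arbitrary.

```python
def allometric(a, W, b):
    """y = a W^b."""
    return a * W ** b


def rescale_a(a, b, factor):
    """Coefficient a after the unit of W is made `factor` times larger
    (for example factor = 1000 when going from g to kg)."""
    return a * factor ** b


def reference_form(a, W, b, W0):
    """y = y0 (W / W0)^b with y0 = a W0^b, for an arbitrary reference W0."""
    y0 = a * W0 ** b
    return y0 * (W / W0) ** b


a_g, b = 2.0, 0.75              # hypothetical coefficient for W in grams
y_grams = allometric(a_g, 500.0, b)

# Same animal, weight now expressed in kg: a must change by 1000^b.
a_kg = rescale_a(a_g, b, 1000.0)
y_kilograms = allometric(a_kg, 0.5, b)

# Two different reference weights give the same prediction for y.
y_ref_1 = reference_form(a_g, 500.0, b, W0=1.0)
y_ref_2 = reference_form(a_g, 500.0, b, W0=250.0)
```

Because the conversion factor for a is `factor ** b`, two data sets with different estimates of b yield coefficients a that cannot be compared directly.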

Page 16: Basic Methods  in  Theoretical Biology

Egg development time 1.2

Bottrell, H. H., Duncan, A., Gliwicz, Z. M. , Grygierek, E., Herzig, A., Hillbricht-Ilkowska, A., Kurasawa, H. Larsson, P., Weglenska, T. 1976 A review of some problems in zooplankton production studies.Norw. J. Zool. 24: 419-456

$D = \exp(3.3956 + 0.2193 \ln(T) - 0.3414 (\ln(T))^2)$

D: egg development time; T: temperature in Kelvin

General form: $D = \exp(a + b \ln(T) + c (\ln(T))^2)$, with

$\dim(a) = \ln t$; $\dim(b) = \frac{\ln t}{\ln K}$; $\dim(c) = \frac{\ln t}{(\ln K)^2}$

Page 17: Basic Methods  in  Theoretical Biology

Space-time scales 1.3

[Figure: levels of organisation (molecule, cell, individual, population, ecosystem, system earth) arranged along time and space axes]

When changing the space-time scale, new processes will become important, others will become less important

Models with many variables & parameters hardly contribute to insight

Each process has its characteristic domain of space-time scales

Page 18: Basic Methods  in  Theoretical Biology

Problematic research areas 1.3

Small time scale combined with large spatial scale

Large time scale combined with small spatial scale

Reason: likely to involve models with large number of variables and parameters

Such models rarely contribute to new insight due to uncertainties in formulation and parameter values

Page 19: Basic Methods  in  Theoretical Biology

Different models can fit equally well 1.5

[Figure: O2 consumption (μl/h) against length (mm)]

Two curves fitted:

$a L^2 + b L^3$, with a = 0.0336 μl h⁻¹ mm⁻², b = 0.01845 μl h⁻¹ mm⁻³

$a L^b$, with a = 0.0156 μl h⁻¹ mm^(−2.437), b = 2.437

Page 20: Basic Methods  in  Theoretical Biology

Plasticity in parameters 1.7

If plasticity of shapes of y(x|a) is large as function of a:

• few problems in estimating the value of a from {xi, yi}i (small confidence intervals)

• little support from data for underlying assumptions (if data were different: other parameter value results, but still a good fit, so no rejection of assumption)

A model can fit data well for wrong reasons

Page 21: Basic Methods  in  Theoretical Biology

Biodegradation of compounds 1.7

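n-th order model: $\frac{d}{dt} X = -k X^n$

$X(t) = \left(X_0^{1-n} - (1-n) k t\right)^{1/(1-n)}$

$n = 0$: $X(t) = X_0 - k t$ for $t < X_0/k$

$n = 1$: $X(t) = X_0 \exp\{-k t\}$

$t(aX_0) = k^{-1} X_0^{1-n} \frac{1 - a^{1-n}}{1-n}$

Monod model: $\frac{d}{dt} X = -k \frac{X}{K + X}$

$0 = X(t) - X_0 + K \ln\{X(t)/X_0\} + k t$

$K \ll X_0$: $X(t) \approx X_0 - k t$ for $t < X_0/k$

$K \gg X_0$: $X(t) \approx X_0 \exp\{-k t/K\}$

$t(aX_0) = X_0 k^{-1} (1 - a) - K k^{-1} \ln a$

X: conc. of compound; X0: X at time 0; t: time; k: degradation rate; n: order; K: saturation constant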
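The two degradation models can be compared numerically. The sketch below evaluates the closed-form n-th order solution directly and solves the implicit Monod solution by bisection. The bisection is a choice made here for illustration, not something the slides prescribe, and the function names are hypothetical.

```python
import math


def nth_order(t, X0, k, n):
    """X(t) for dX/dt = -k X^n, from the closed-form solution."""
    if n == 1:
        return X0 * math.exp(-k * t)
    base = X0 ** (1 - n) - (1 - n) * k * t
    if base <= 0:  # for t >= 0 this only happens when n < 1: compound exhausted
        return 0.0
    return base ** (1 / (1 - n))


def monod(t, X0, k, K, iterations=200):
    """X(t) for dX/dt = -k X / (K + X).

    Solves 0 = X - X0 + K ln(X / X0) + k t by bisection on (0, X0];
    the right-hand side increases monotonically with X.
    """
    lo, hi = 0.0, X0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid <= 0.0:
            break
        g = mid - X0 + K * math.log(mid / X0) + k * t
        if g > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def time_to_fraction_nth(a, X0, k, n):
    """Time at which X = a X0 in the n-th order model."""
    if n == 1:
        return -math.log(a) / k
    return X0 ** (1 - n) * (1 - a ** (1 - n)) / (k * (1 - n))


def time_to_fraction_monod(a, X0, k, K):
    """Time at which X = a X0 in the Monod model."""
    return (X0 * (1 - a) - K * math.log(a)) / k
```

With these definitions, `monod(t, X0, k, K)` approaches `nth_order(t, X0, k, 0)` when K is much smaller than X0, and approaches `nth_order(t, X0, k / K, 1)` when K is much larger than X0, which is one way to see how similar the shapes produced by the two models can be.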

Page 22: Basic Methods  in  Theoretical Biology

Biodegradation of compounds 1.7

[Figure: two panels, n-th order model and Monod model, each showing scaled conc. against scaled time]

Page 23: Basic Methods  in  Theoretical Biology

Verification falsification 1.9

Verification cannot work because different models can fit data equally well

Falsification cannot work because models are idealized simplifications of reality
“All models are wrong, but some are useful”

Support works to some extent

Usefulness works but depends on context (aim of model); a model without context is meaningless

Page 24: Basic Methods  in  Theoretical Biology

Model without dimension problem 1.2

Arrhenius model: $\ln k = a - T_0/T$
k: some rate; T: absolute temperature; a: parameter; $T_0$: Arrhenius temperature

Alternative form: $k = k_0 \exp\{1 - T_0/T\}$, with $k_0 = \exp\{a - 1\}$

Difference with allometric model: no reference value required to solve dimension problem
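A quick check of the two Arrhenius forms, with hypothetical parameter values: the forms agree for every temperature, and in the alternative form $k_0$ is the rate at $T = T_0$. The parameter $T_0$ therefore supplies the reference temperature itself, and nothing arbitrary has to be chosen.

```python
import math


def arrhenius(T, a, T0):
    """k from ln k = a - T0 / T."""
    return math.exp(a - T0 / T)


def arrhenius_alt(T, k0, T0):
    """k from the alternative form k = k0 exp{1 - T0 / T}."""
    return k0 * math.exp(1 - T0 / T)


a, T0 = 25.0, 8000.0          # hypothetical parameter values, T0 in Kelvin
k0 = math.exp(a - 1)          # link between the two forms

k_usual = arrhenius(293.0, a, T0)
k_alternative = arrhenius_alt(293.0, k0, T0)
k_at_reference = arrhenius_alt(T0, k0, T0)   # equals k0
```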

Page 25: Basic Methods  in  Theoretical Biology

Central limit theorems 2.6

The sum of n independent identically (i.i.) distributed random variables becomes normally distributed for increasing n.

The sum of n independent point processes tends to behave as a Poisson process for increasing n.

$Z = X + Y$: $f_Z(z) = \int_y f_X(z - y) f_Y(y)\, dy$; $P(Z = z) = \sum_y P(X = z - y) P(Y = y)$

Number of events in a time interval is i.i. Poisson distributed

Time intervals between subsequent events are i.i. exponentially distributed

Page 26: Basic Methods  in  Theoretical Biology

Sums of random variables 2.6

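$Y = \sum_{i=1}^{n} X_i$; $\mathrm{Var}(Y) = n \mathrm{Var}(X_i)$

Exponential prob dens: $f_X(x) = \lambda \exp(-\lambda x)$; $f_Y(y) = \frac{\lambda}{\Gamma(n)} (\lambda y)^{n-1} \exp(-\lambda y)$

Poisson prob: $P(X = x) = \frac{\lambda^x}{x!} \exp(-\lambda)$; $P(Y = y) = \frac{(n\lambda)^y}{y!} \exp(-n\lambda)$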
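These results on sums can be checked with a short script. The sketch below convolves two Poisson probability functions exactly and simulates sums of exponential variables. The rate, the number of terms, the number of replicates and the seed are arbitrary choices made for illustration.

```python
import math
import random
import statistics


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam."""
    return lam ** x * math.exp(-lam) / math.factorial(x)


def convolve_poisson(z, lam):
    """P(X1 + X2 = z) for two independent Poisson(lam) variables,
    using P(Z = z) = sum_y P(X = z - y) P(Y = y)."""
    return sum(poisson_pmf(z - y, lam) * poisson_pmf(y, lam) for y in range(z + 1))


def sum_of_exponentials_pdf(y, n, lam):
    """Density of the sum of n independent exponential(lam) variables."""
    return lam * (lam * y) ** (n - 1) * math.exp(-lam * y) / math.gamma(n)


lam, n = 2.0, 5
rng = random.Random(1)  # fixed seed so the run is repeatable
sums = [sum(rng.expovariate(lam) for _ in range(n)) for _ in range(20000)]

sample_mean = statistics.mean(sums)      # theory: n / lam
sample_var = statistics.variance(sums)   # theory: n Var(X_i) = n / lam**2
```

The convolution of two Poisson variables with mean λ reproduces the Poisson probability with mean 2λ, and the simulated mean and variance should be close to n/λ and n/λ².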

Page 27: Basic Methods  in  Theoretical Biology

Normal probability density 2.6

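$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right)$

$f_{\mathbf{X}}(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^n |\Sigma|}} \exp\left(-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})' \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right)$

[Figure: normal density against (x − μ)/σ, with σ and the 95% interval marked]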

Page 28: Basic Methods  in  Theoretical Biology

Dynamic systems 3.2

Defined by simultaneous behaviour of input, state variable, output
Supply systems: input + state variables → output
Demand systems: input ← state variables + output
Real systems: mixtures between supply & demand systems
Constraints: mass, energy balance equations
State variables: span a state space
  behaviour: usually set of ode’s with parameters
Trajectory: map of behaviour of state vars in state space
Parameters: constant, functions of time, functions of modifying variables
  compound parameters: functions of parameters

Page 29: Basic Methods  in  Theoretical Biology

Statistics 4.1

Deals with
• estimation of parameter values, and confidence in these values
• tests of hypothesis about parameter values
  does a parameter value differ from a known value?
  do parameter values differ between two samples?

Deals NOT with
• does model 1 fit better than model 2 if model 1 is not a special case of model 2

Statistical methods assume that the model is given
(Non-parametric methods only use some properties of the given model, rather than its full specification)

Page 30: Basic Methods  in  Theoretical Biology

Stochastic vs deterministic models 4.1

Only stochastic models can be tested against experimental data

Standard way to extend deterministic model to stochastic one:
regression model: y(x|a,b,..) = f(x|a,b,..) + e, with e ~ N(0, σ²)
Originates from physics, where e stands for measurement error

Problem: deviations from model are frequently not measurement errors
Alternatives:
• deterministic systems with stochastic inputs
• differences in parameter values between individuals
Problem: parameter estimation methods become very complex

Page 31: Basic Methods  in  Theoretical Biology

Stochastic vs deterministic models 4.1

Tossing a die can be modeled in two ways
• Stochastically: each possible outcome has the same probability
• Deterministically: detailed modelling of take off and bouncing, with initial conditions; many parameters
Imperfect control of process makes deterministic model impractical

Page 32: Basic Methods  in  Theoretical Biology

Large scatter 4.1

• complicates parameter estimation

• complicates falsification

Avoid large scatter by

• Standardization of factors that contribute to measurements

• Stratified sampling

Page 33: Basic Methods  in  Theoretical Biology

Kinds of statistics 4.1

Descriptive statistics
  sometimes useful, frequently boring

Mathematical statistics
  beautiful mathematical construct
  rarely applicable due to assumptions to keep it simple

Scientific statistics
  still in its childhood due to research workers being specialised
  upcoming thanks to increase of computational power (Monte Carlo studies)

Page 34: Basic Methods  in  Theoretical Biology

Tasks of statistics 4.1

Deals with
• estimation of parameter values, and confidence in these values
• tests of hypothesis about parameter values
  does a parameter value differ from a known value?
  do parameter values differ between two samples?

Deals NOT with
• does model 1 fit better than model 2 if model 1 is not a special case of model 2

Statistical methods assume that the model is given
(Non-parametric methods only use some properties of the given model, rather than its full specification)

Page 35: Basic Methods  in  Theoretical Biology

Independent observations 4.1

If X and Y are independent

Page 36: Basic Methods  in  Theoretical Biology

Statements to remember 4.1

• “proving” something statistically is absurd

• if you do not know the power of your test, you don’t know what you are doing while testing

• you need to specify the alternative hypothesis to know the power; this involves knowledge about the subject (biology, chemistry, ..)

• parameters only have a meaning if the model is “true”; this involves knowledge about the subject

Page 37: Basic Methods  in  Theoretical Biology

Nested models 4.5

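$y(x) = w_0 + w_1 x + w_2 x^2$

Special cases: $w_2 = 0$ gives $y(x) = w_0 + w_1 x$; $w_1 = 0$ gives $y(x) = w_0 + w_2 x^2$; $w_1 = w_2 = 0$ gives $y(x) = w_0$

[Figure: Venn diagram of the nested models]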

Page 38: Basic Methods  in  Theoretical Biology

Testing of hypothesis 4.5

Error of the first kind: reject null hypothesis while it is true

Error of the second kind: accept null hypothesis while the alternative hypothesis is true

Level of significance of a statistical test: α = probability of error of the first kind

Power of a statistical test: 1 – β = 1 – probability of error of the second kind

              null hypothesis
decision      true       false
accept        1 – α      β
reject        α          1 – β

No certainty in statistics
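The link between the alternative hypothesis and the power can be illustrated with a simple case that is not taken from the slides: a one-sided test on the mean of normally distributed data with known standard deviation. The function name and all numbers below are hypothetical.

```python
import math
from statistics import NormalDist


def power_one_sided(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the one-sided test of H0: mean = mu0 against H1: mean = mu1 > mu0.

    Assumes n independent normal observations with known sd sigma; H0 is
    rejected when the sample mean exceeds mu0 + z * sigma / sqrt(n), where
    z is the (1 - alpha) quantile of the standard normal distribution.
    """
    z = NormalDist().inv_cdf(1 - alpha)
    shift = (mu1 - mu0) * math.sqrt(n) / sigma
    return 1 - NormalDist().cdf(z - shift)


# Without a specified alternative (mu1) the power cannot be computed.
power_small_sample = power_one_sided(0.0, 0.5, 1.0, n=16)
power_large_sample = power_one_sided(0.0, 0.5, 1.0, n=64)

# If the alternative coincides with the null, the "power" reduces to alpha.
power_at_null = power_one_sided(0.0, 0.0, 1.0, n=16)
```

In this sketch the power grows with sample size and with the distance between the null and the alternative value, which is why the alternative has to come from knowledge about the subject.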

Page 39: Basic Methods  in  Theoretical Biology

Parameter estimation 4.6

Most frequently used method: Maximization of (log) Likelihood

likelihood: probability of finding observed data (given the model), considered as function of parameter values
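As a minimal illustration, assume the data are independent, exponentially distributed time intervals with rate λ. This choice of model is an assumption made here, and the data values are invented. The log likelihood is then n ln λ − λ Σx, and it is maximal at λ = n / Σx.

```python
import math


def log_likelihood(lam, data):
    """Log likelihood of rate lam for independent exponential observations:
    sum over data of ln(lam exp(-lam x)) = n ln(lam) - lam * sum(x)."""
    return len(data) * math.log(lam) - lam * sum(data)


def ml_estimate(data):
    """Value of lam that maximizes the log likelihood (found by setting
    the derivative n / lam - sum(x) to zero)."""
    return len(data) / sum(data)


data = [0.8, 1.3, 0.2, 2.1, 0.6]   # hypothetical time intervals
lam_hat = ml_estimate(data)

# The log likelihood is lower on either side of the estimate.
ll_at_estimate = log_likelihood(lam_hat, data)
ll_below = log_likelihood(0.9 * lam_hat, data)
ll_above = log_likelihood(1.1 * lam_hat, data)
```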

If we repeat the collection of data many times (same conditions, same number of data) the resulting ML estimate

Page 40: Basic Methods  in  Theoretical Biology

Profile likelihood 4.6

[Figure: profile likelihood, with large sample approximation and 95% conf interval]

Page 41: Basic Methods  in  Theoretical Biology

Comparison of models 4.6

Akaike Information Criterion for sample size n and K parameters

$-2 \log L(\theta) + \frac{2 n K}{n - K - 1}$

in the case of a regression model: $n \log \sigma^2 + \frac{2 n K}{n - K - 1}$

You can compare goodness of fit of different models to the same data
but statistics will not help you to choose between the models
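The criterion is easy to compute once a model has been fitted. The sketch below follows the two expressions above and assumes that σ² in the regression case is estimated as the mean squared residual. The function names are hypothetical.

```python
import math


def aic(log_lik, n, K):
    """-2 log L + 2 n K / (n - K - 1) for sample size n and K parameters."""
    if n - K - 1 <= 0:
        raise ValueError("need n > K + 1")
    return -2 * log_lik + 2 * n * K / (n - K - 1)


def aic_regression(residuals, K):
    """n log(sigma^2) + 2 n K / (n - K - 1) for a regression model,
    with sigma^2 taken as the mean squared residual (an assumption)."""
    n = len(residuals)
    if n - K - 1 <= 0:
        raise ValueError("need n > K + 1")
    sigma2 = sum(r * r for r in residuals) / n
    return n * math.log(sigma2) + 2 * n * K / (n - K - 1)


# Hypothetical example: log likelihood -10 from 20 data points, 3 parameters.
example = aic(-10.0, 20, 3)
```

Lower values indicate a better trade-off between fit and number of parameters for models fitted to the same data; the number itself says nothing about whether the assumptions behind either model are sound.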

Page 42: Basic Methods  in  Theoretical Biology

Confidence intervals 4.6

parameter    estimate             sd                   estimate             sd
             excluding point 4    excluding point 4    including point 4    including point 4
L∞, mm       6.46                 1.08                 3.37                 0.096
rB, d⁻¹      0.099                0.022                0.277                0.023

$L(t) = L_\infty - (L_\infty - L_0) \exp(-r_B t) \approx L_0 + (L_\infty - L_0) r_B t$ for small t

[Figure: length (mm) against time (d)]

[Figure: 95% conf intervals for rB and L∞, excluding and including point 4]

correlations among parameter estimates can have big effects on simultaneous conf intervals