Assessing the Risk of Bias Critical appraisal of medical literature
September 2010, Korea University College of Medicine, Hyeong Sik Ahn
What is bias?
systematic error or deviation from the truth
may overestimate or underestimate
biased studies lead to misleading results
can’t measure the presence of bias, only the risk
rigorous methods minimise the risk of bias
question: “should I believe the results?”
Biases in randomised controlled trials
Bias: a process that tends to produce results that depart systematically from the true values existing in the study population
Selection bias: avoid with randomisation and concealment
Performance bias: avoid with standardisation of care and blinding
Attrition bias: avoid by analysing all subjects (intention-to-treat analysis)
Measurement bias: avoid with blinding of outcome assessors and patients
[Figure: trial flow diagram. Study sample, then allocation of subjects to control intervention or experimental intervention, then follow up, then outcomes]
Selection bias (at allocation): randomisation, concealment
Performance bias (during the intervention): standardisation of care protocol, blinding of care providers and patients
Attrition bias (during follow up): drop-outs? cross-over? is everyone accounted for?
Measurement bias (at outcomes): blinding of outcome assessors and patients
Should we use quality scales?
> 30 available
reliability and validity of many scales not established
different scales lead to conflicting conclusions
may include criteria not related to bias
no evidence for numerical weighting of different elements
how do readers interpret the score?
not recommended for Cochrane reviews
The Cochrane approach
describe the following for each study in detail:
  random sequence generation
  allocation concealment
  blinding
  incomplete outcome data
  selective outcome reporting
  any other risks
empirical research shows that these components can have a significant effect on results, often leading to exaggerated effects
For each study, in each domain
is there enough information to understand what happened?
  if not, rate unclear
what is your judgement: are you satisfied that the study is at a low risk of bias?
  yes indicates a low risk
  no indicates a high risk
  based on the context of your review, empirical evidence of bias effect, likely direction and magnitude of effect
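The two questions above amount to a small decision rule. A minimal Python sketch of it, assuming a reviewer records two answers per domain; the function name `judge_domain` and the example study are hypothetical and not part of the Cochrane tool:

```python
from typing import Optional


def judge_domain(enough_information: bool,
                 satisfied_low_risk: Optional[bool] = None) -> str:
    """Return 'unclear', 'low' or 'high' for one domain of one study."""
    if not enough_information:
        # Not enough detail to understand what happened: rate unclear.
        return "unclear"
    if satisfied_low_risk is None:
        raise ValueError("a judgement is needed when information is sufficient")
    # "yes" indicates a low risk, "no" indicates a high risk.
    return "low" if satisfied_low_risk else "high"


# Hypothetical answers for one study:
# (enough information?, satisfied that risk of bias is low?)
example_study = {
    "random sequence generation": (True, True),
    "allocation concealment": (False, None),
    "blinding": (True, False),
}
judgements = {domain: judge_domain(*answers)
              for domain, answers in example_study.items()}
```

The sketch only encodes the order of the questions; the judgement itself still rests on the context of the review.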
Random sequence generation
occurs at the start of a trial, before allocation of participants
determines the order of allocation into intervention and control groups
avoids systematic differences between groups
accounts for known and unknown confounders
minimises selection bias
Results of 32 comparative studies of anticoagulant therapy for MI patients (Chalmers et al, 1977)
Study design                                               Apparent risk reduction
Historical controls (18 studies, 9000 subjects)            62% ± 4
Concurrent non-random controls (8 studies, 3000 subjects)  34% ± 7
Randomised (6 studies, 4000 subjects)                      22% ± 8
Identifying a random sequence
Adequate
  random number table
  computer random number generator
  coin toss
  shuffling cards or envelopes
  throwing dice
  drawing lots
Not adequate
  date of birth
  day of visit
  ID or record number
  alternate allocation
  choice of the clinician or participant
  test results
  intervention availability
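To make the contrast concrete, here is a minimal Python sketch of one adequate method (a computer random number generator) next to one inadequate method (alternate allocation). The function names, group labels and seed are illustrative assumptions; real trials usually use dedicated randomisation services.

```python
import random


def random_sequence(n_participants, seed=None):
    """Simple randomisation: each position is drawn independently."""
    rng = random.Random(seed)  # a seed makes the list reproducible for audit
    return [rng.choice(("intervention", "control"))
            for _ in range(n_participants)]


def alternate_sequence(n_participants):
    """Alternate allocation: NOT adequate, the next group is predictable."""
    return ["intervention" if i % 2 == 0 else "control"
            for i in range(n_participants)]


sequence = random_sequence(20, seed=2010)
predictable = alternate_sequence(4)
```

Anyone who knows the last allocation in `predictable` knows the next one, which is why alternation cannot protect against selection bias.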
Allocation concealment
occurs at the start of the trial, during allocation of participants
when a person is recruited to the study, no-one knows which group they will be allocated to
protects the random sequence: prevents changing the order of recruitment, or deciding not to recruit
strongest empirical evidence showing this is important to results
minimises selection bias
Selection bias in a trial with foreknowledge of treatment allocation: amniotomy or oxytocin for induction of labour (Bakos & Bäckström, re-analysed by M Keirse)
Bishop score at entry to trial   Allocated oxytocin (even date of birth)   Allocated amniotomy (odd date of birth)
3 or less*                       28                                        7
4 or 5                           56                                        58
6 or more                        29                                        45
Total                            113                                       110
χ² = 16.1, P < 0.00025
* indicates an unfavourable cervix: Keirse hypothesised that such patients would be less likely to be entered in the trial if it were known that they would be allocated amniotomy.
This trial was described as the first “prospective randomised study” between amniotomy and oxytocin for induction of labour in a “totally unselected population”!
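The χ² value quoted for this table can be checked from the six cell counts. The following is a minimal Python sketch of the Pearson chi-square statistic for a contingency table; the function name is an assumption, and the P value is not recomputed because the slide does not state which test variant or degrees of freedom were used.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if allocation were unrelated to Bishop score.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Rows: Bishop score 3 or less, 4 or 5, 6 or more.
# Columns: allocated oxytocin, allocated amniotomy.
bishop_table = [[28, 7], [56, 58], [29, 45]]
chi2 = pearson_chi_square(bishop_table)
```

Rounded to one decimal place, `chi2` agrees with the 16.1 shown with the table.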
Identifying allocation concealment
Adequate
  central allocation (phone, web, pharmacy)
  sequentially numbered, identical drug containers
  serially numbered, sealed, opaque envelopes
Not adequate
  random sequence known to staff in advance
  envelopes without all three safeguards
  non-random, predictable sequence
Blinding
occurs during the intervention and measurement of outcomes
minimises performance bias
  different treatment of the two groups
  participant expectations
minimises detection bias
  different measurement of outcomes between the two groups
  subjective outcomes particularly vulnerable
can blind the participant, care provider, outcome assessor, other personnel: more than “double blinding”
check for intention and success of blinding
Schulz KF & Grimes DA 2002 Lancet
Placebo effect: relief of arthritis pain
Identifying blinding
Adequate
  participants and key study personnel blinded
  blinding probably not broken
  outcomes not likely to be influenced
Not adequate
  any of the above not met
Allocation concealment vs blinding
[Figure: timeline around randomisation. Concealment of allocation: selection bias. Blinding: performance bias]
Incomplete outcome data
when complete outcome data is not available for all participants
can indicate attrition bias
can have important impact when:
  enough data is missing to affect the results: no. of participants missing (dichotomous) or effect size (continuous)
  the no. of people missing is not balanced between groups
  the reason for absence is related to the study outcomes (e.g. moved away vs adverse event)
two causes
  loss of participants to follow up
  exclusion of participants by trialists
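Losses to follow-up: how much is acceptable?
The “5 and 20 rule”
  * 5% or less is likely to introduce little bias
  * 20% or more seriously threatens validity
→ However, this is an oversimplification
→ It depends on how the loss-to-follow-up rate compares with the outcome event rate
  * the loss-to-follow-up rate should not exceed the outcome event rate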
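As a rough illustration of the rule of thumb and its caveat, the Python sketch below flags a trial arm on both criteria. The function name, the choice of the number randomised as the denominator for both rates, and the example counts are all assumptions made for illustration.

```python
def follow_up_loss_flags(n_randomised, n_lost, n_events):
    """Apply the 5 and 20 rule and compare loss rate with event rate."""
    loss_rate = n_lost / n_randomised
    event_rate = n_events / n_randomised
    if loss_rate <= 0.05:
        band = "little bias likely"
    elif loss_rate >= 0.20:
        band = "serious threat to validity"
    else:
        band = "intermediate"
    return {
        "loss_rate": loss_rate,
        "event_rate": event_rate,
        "rule_5_20": band,
        # The caveat: losses should not exceed the outcome event rate.
        "loss_exceeds_event_rate": loss_rate > event_rate,
    }


# Hypothetical arm: 200 randomised, 8 lost (4%), 6 outcome events (3%).
rare_outcome = follow_up_loss_flags(200, 8, 6)
```

In this hypothetical arm the 5% threshold is met, yet losses outnumber events, which is exactly the situation the rule oversimplifies.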
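The intention-to-treat principle: preserving randomisation
Principle: once participants have been randomised, analyse them in the group to which they were originally assigned. The as-randomised analysis must be kept even if a participant stops the trial, does not receive the treatment, or crosses over.
Exception: patients found ineligible on blinded reassessment against criteria written before randomisation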
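The difference between analysing by assigned group and by treatment actually received can be seen on a toy data set. This Python sketch uses eight entirely hypothetical participants, one of whom was assigned to treatment but did not receive it; the field names are assumptions.

```python
from collections import defaultdict

# Hypothetical records: group assigned, treatment actually received, outcome.
participants = [
    {"assigned": "treatment", "received": "treatment", "event": False},
    {"assigned": "treatment", "received": "treatment", "event": False},
    {"assigned": "treatment", "received": "treatment", "event": True},
    {"assigned": "treatment", "received": "control", "event": True},
    {"assigned": "control", "received": "control", "event": True},
    {"assigned": "control", "received": "control", "event": True},
    {"assigned": "control", "received": "control", "event": False},
    {"assigned": "control", "received": "control", "event": False},
]


def event_rates(records, group_key):
    """Event rate per group, grouping by 'assigned' or by 'received'."""
    counts = defaultdict(lambda: [0, 0])  # group -> [events, participants]
    for record in records:
        counts[record[group_key]][0] += int(record["event"])
        counts[record[group_key]][1] += 1
    return {group: events / n for group, (events, n) in counts.items()}


intention_to_treat = event_rates(participants, "assigned")
as_treated = event_rates(participants, "received")
```

Grouping by assignment gives equal event rates in both arms, while grouping by treatment received moves the non-adherent participant into the control group and makes the treatment look better.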
Identifying incomplete outcome data
Adequate
  no missing data
  reasons for missing data not related to outcome
  missing data balanced across groups, with similar reasons
  number of participants missing or plausible effect size not enough to change observed effect
Not adequate
  any of the above criteria not met
  ‘as-treated’ analysis with substantial departure from allocated intervention
  missing data imputed using inappropriate methods
Selective outcome reporting
when outcomes are not reported as planned
  outcomes missing
  new outcomes added (can be justified, e.g. adverse events)
  unexpected statistics, subscales or subgroups
  reporting that cannot be used in a review
can indicate ‘within-study publication bias’ or ‘data mining’
difficult to determine
  compare methods to results
  refer to study protocol or trial register
  look for commonly used outcomes
Identifying selective outcome reporting
Adequate
  protocol is available and all pre-specified outcomes reported in the pre-specified way
  protocol not available but all expected outcomes are reported
most studies will be judged ‘unclear’ in this category
Not adequate
  outcomes not reported as planned or expected
  limited information provided for some outcomes (e.g. only direction of effect and significance)
Other potential problems
Adequate
  study appears to be free of other sources of risk
Not adequate
  issues specific to the study design
    carryover in crossover trials
    comparability of groups in cluster-randomized trials
  trial stopped early using data-dependent process (including a formal stopping rule)
  extreme baseline imbalance
  possible fraud
  other problem
Relative risk reduction (RRR)
Absolute risk reduction (ARR)
NNT (number needed to treat)
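The meaning of an NNT implicitly includes the follow-up period.
$$\mathrm{NNT}_{\text{hypothetical}} = \mathrm{NNT}_{\text{observed}} \times \frac{\text{observed time}}{\text{hypothetical time}}$$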
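A minimal Python sketch of this time adjustment follows. The function name is an assumption; the example takes the NNT of 9 over 33 months from the interferon example in this deck and rescales it to a 12-month horizon chosen purely for illustration. Rescaling like this presumes the benefit accrues evenly over time, which may not hold in practice.

```python
def rescale_nnt(nnt_observed, observed_time, hypothetical_time):
    """NNT_hypothetical = NNT_observed x (observed time / hypothetical time)."""
    return nnt_observed * (observed_time / hypothetical_time)


# NNT of 9 observed over 33 months, re-expressed for 12 months (assumed horizon).
nnt_12_months = rescale_nnt(9, observed_time=33, hypothetical_time=12)
```

A shorter horizon yields a larger NNT, because fewer events are prevented in less time.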
How large is the treatment effect? (Interferon for multiple sclerosis)
Event rate = progression of disability by 33 months
CER (control event rate): event rate in the control group (placebo)
EER (experimental event rate): event rate in the experimental group (interferon)
RRR (relative risk reduction) = (CER - EER) / CER = 1 - RR
ARR (absolute risk reduction) = CER - EER
NNT (number needed to treat) = 1 / ARR
Actual trial (interferon study, Lancet 1998; 352: 1491-7)
  CER = 50%, EER = 39%
  RRR = (50% - 39%) / 50% = 22%
  ARR = 50% - 39% = 11%
  NNT = 1 / 11% = 9
Hypothetical case with a very small effect
  CER = 0.00050%, EER = 0.00039%
  RRR = (0.00050% - 0.00039%) / 0.00050% = 22%
  ARR = 0.00050% - 0.00039% = 0.00011%
  NNT = 1 / 0.00011% = 909090
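The three measures can be computed directly from the two event rates. The Python sketch below, with an assumed function name, reproduces the interferon figures (CER 50%, EER 39%) and the very small hypothetical rates, showing that the same RRR can correspond to very different NNTs.

```python
def effect_measures(cer, eer):
    """Return RRR, ARR and NNT from control and experimental event rates."""
    arr = cer - eer      # absolute risk reduction
    rrr = arr / cer      # relative risk reduction, equal to 1 - RR
    nnt = 1 / arr        # number needed to treat (unrounded)
    return {"RRR": rrr, "ARR": arr, "NNT": nnt}


# Rates expressed as proportions: 50% -> 0.50, 0.00050% -> 0.0000050.
interferon_trial = effect_measures(cer=0.50, eer=0.39)
tiny_effect = effect_measures(cer=0.0000050, eer=0.0000039)
```

The function returns the unrounded NNT; how to round it for reporting is left to the reader.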
Risk of bias assessment in Cochrane reviews
Risk of bias summary
Here ‘Blinding’ and ‘Incomplete outcome data’ have been assessed for two sets of outcomes
Risk of bias graph
The Newcastle-Ottawa Scale (NOS) for Assessing the Quality of Nonrandomized Studies in Meta-Analysis
Development
Applications
Current Developments
Development: Item Selection
Newcastle quality assessment form
Ottawa comprehensive list
Panel review
Critical review by experts
Development: Grouping Items
Cohort studies
  Selection of cohorts
  Comparability of cohorts
  Assessment of outcome
Case-control studies
  Selection of cases and controls
  Comparability of cases and controls
  Ascertainment of exposure
Development: Identifying Items
Identify ‘high’ quality choices with a ‘star’
A maximum of one ‘star’ for each item within the ‘Selection’ and ‘Exposure/Outcome’ categories; maximum of two ‘stars’ for ‘Comparability’
NEWCASTLE - OTTAWA QUALITY ASSESSMENT SCALE: COHORT STUDIES
Note: A study can be awarded a maximum of one star for each numbered item within the Selection and Outcome categories. A maximum of two stars can be given for Comparability.
Selection
1) Representativeness of the exposed cohort
   a) truly representative of the average _______________ (describe) in the community
   b) somewhat representative of the average ______________ in the community
   c) selected group of users, eg nurses, volunteers
   d) no description of the derivation of the cohort
2) Selection of the non exposed cohort
   a) drawn from the same community as the exposed cohort
   b) drawn from a different source
   c) no description of the derivation of the non exposed cohort
3) Ascertainment of exposure
   a) secure record (eg surgical records)
   b) structured interview
   c) written self report
   d) no description
4) Demonstration that outcome of interest was not present at start of study
   a) yes
   b) no
Comparability
1) Comparability of cohorts on the basis of the design or analysis
   a) study controls for _____________ (select the most important factor)
   b) study controls for any additional factor (This criteria could be modified to indicate specific control for a second important factor.)
Outcome
1) Assessment of outcome
   a) independent blind assessment
   b) record linkage
   c) self report
   d) no description
2) Was follow-up long enough for outcomes to occur
   a) yes (select an adequate follow up period for outcome of interest)
   b) no
3) Adequacy of follow up of cohorts
   a) complete follow up - all subjects accounted for
   b) subjects lost to follow up unlikely to introduce bias - small number lost - > ____ % (select an adequate %) follow up, or description provided of those lost)
   c) follow up rate < ____% (select an adequate %) and no description of those lost
   d) no statement
Newcastle-Ottawa Quality Assessment Scale: Cohort Studies
Selection (4)
Comparability (1)
Outcome (3)
A study can be awarded a maximum of one star for each numbered item within the Selection and Outcome categories. A maximum of two stars can be given for Comparability.
Selection
1. Representativeness of the exposed cohort
   a) truly representative of the average ___________ (describe) in the community
   b) somewhat representative of the average ___________ in the community
   c) selected group of users, eg nurses, volunteers
   d) no description of the derivation of the cohort
2. Selection of the non exposed cohort
   a) drawn from the same community as the exposed cohort
   b) drawn from a different source
   c) no description of the derivation of the non exposed cohort
3. Ascertainment of exposure to implants
   a) secure record (eg surgical records)
   b) structured interview
   c) written self report
   d) no description
4. Demonstration that outcome of interest was not present at start of study
   a) yes
   b) no
   In the case of mortality studies, the outcome of interest is still the presence of a disease/incident, rather than death; that is, a statement of no history of disease or incident earns a star.
Comparability
1. Comparability of cohorts on the basis of the design or analysis
   a) study controls for ___________ (select the most important factor)
   b) study controls for any additional factor (This criteria could be modified to indicate specific control for a second important factor.)
Outcome
1. Assessment of outcome
   a) independent blind assessment
   b) record linkage
   c) self report
   d) no description
2. Was follow up long enough for outcomes to occur
   a) yes (select an adequate follow up period for outcome of interest)
   b) no
3. Adequacy of follow up of cohorts
   a) complete follow up - all subjects accounted for
   b) subjects lost to follow up unlikely to introduce bias - small number lost - > ___ % (select an adequate %) follow up, or description of those lost)
   c) follow up rate < ___% (select an adequate %) and no description of those lost
   d) no statement
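The star limits for the cohort scale (Selection 4 items, Comparability 1 item, Outcome 3 items; one star per item except two for Comparability) can be tallied as in the following Python sketch. The data layout and function name are assumptions, and the total is only a count: the deck lists a threshold between ‘good’ and ‘poor’ quality as future development, so none is applied here.

```python
# Item counts and per-item star limits as stated for the cohort scale.
ITEMS_PER_CATEGORY = {"selection": 4, "comparability": 1, "outcome": 3}
MAX_STARS_PER_ITEM = {"selection": 1, "comparability": 2, "outcome": 1}

MAX_TOTAL_STARS = sum(ITEMS_PER_CATEGORY[c] * MAX_STARS_PER_ITEM[c]
                      for c in ITEMS_PER_CATEGORY)


def total_stars(stars_by_category):
    """Sum the stars for one study after checking the per-item limits."""
    total = 0
    for category, n_items in ITEMS_PER_CATEGORY.items():
        awarded = stars_by_category[category]
        if len(awarded) != n_items:
            raise ValueError(f"{category}: expected {n_items} items")
        limit = MAX_STARS_PER_ITEM[category]
        if any(stars < 0 or stars > limit for stars in awarded):
            raise ValueError(f"{category}: at most {limit} star(s) per item")
        total += sum(awarded)
    return total


# Hypothetical cohort study rating.
example_rating = {"selection": [1, 1, 0, 1],
                  "comparability": [2],
                  "outcome": [1, 0, 1]}
example_total = total_stars(example_rating)
```

The same structure would fit the case-control scale by renaming the third category to exposure.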
NEWCASTLE - OTTAWA QUALITY ASSESSMENT SCALE: CASE CONTROL STUDIES
Note: A study can be awarded a maximum of one star for each numbered item within the Selection and Exposure categories. A maximum of two stars can be given for Comparability.
Selection
1) Is the case definition adequate?
   a) yes, with independent validation
   b) yes, eg record linkage or based on self reports
   c) no description
2) Representativeness of the cases
   a) consecutive or obviously representative series of cases
   b) potential for selection biases or not stated
3) Selection of Controls
   a) community controls
   b) hospital controls
   c) no description
4) Definition of Controls
   a) no history of disease (endpoint)
   b) no description of source
Comparability
1) Comparability of cases and controls on the basis of the design or analysis
   a) study controls for _______________ (Select the most important factor.)
   b) study controls for any additional factor (This criteria could be modified to indicate specific control for a second important factor.)
Exposure
1) Ascertainment of exposure
   a) secure record (eg surgical records)
   b) structured interview where blind to case/control status
   c) interview not blinded to case/control status
   d) written self report or medical record only
   e) no description
2) Same method of ascertainment for cases and controls
   a) yes
   b) no
3) Non-Response rate
   a) same rate for both groups
   b) non respondents described
   c) rate different and no designation
Newcastle-Ottawa Quality Assessment Scale: Case-Control Studies
Selection (4)
Comparability (1)
Exposure (3)
A study can be awarded a maximum of one star for each numbered item within the Selection and Exposure categories. A maximum of two stars can be given for Comparability.
Selection
1. Is the case definition adequate?
   a) yes, with independent validation
   b) yes, eg record linkage or based on self reports
   c) no description
   Independent validation: >1 person/record/time/process to extract information, or reference to primary record source such as x-rays or medical/hospital records
   Record linkage or self report: e.g. ICD codes in database or self-report with no reference to primary record; or no description
2. Representativeness of the cases
   a) consecutive or obviously representative series of cases
   b) potential for selection biases or not stated
3. Selection of Controls
   a) community controls
   b) hospital controls
   c) no description
4. Definition of Controls
   a) no history of disease (endpoint)
   b) no description of source
Comparability
1. Comparability of cases and controls on the basis of the design or analysis
   a) study controls for ___________ (select the most important factor)
   b) study controls for any additional factor (This criteria could be modified to indicate specific control for a second important factor.)
Exposure
1. Ascertainment of exposure
   a) secure record (eg surgical records)
   b) structured interview where blind to case/control status
   c) interview not blinded to case/control status
   d) written self report or medical record only
   e) no description
2. Same method of ascertainment for cases and controls
   a) yes
   b) no
3. Non-Response Rate
   a) same rate for both groups
   b) non respondents described
   c) rate different and no designation
Applications:
Assess quality of nonrandomized studies
Incorporate assessments in interpretation of meta-analytic results
Design, content and ease of use
Long Term Hormone Replacement Therapy and Coronary Heart Disease Events
Steps of a Cochrane Systematic Review
• Clearly formulated question
• Comprehensive data search
• Unbiased selection and abstraction process
• Critical appraisal of data
• Synthesis of data
• Perform sensitivity and subgroup analyses if appropriate and possible
• Prepare a structured report
Objective
Is there a relationship between hormone replacement therapy and the incidence of coronary heart disease in postmenopausal women?
Inclusion Criteria
Types of studies: case-control, cohort or cross-sectional studies
Population: postmenopausal women
Intervention: women exposed to hormone replacement therapy (estrogen or estrogen + progesterone); ever, current, past
Outcomes: coronary heart disease (events); fatal, non-fatal, both
Search Strategy
Electronic search of:
  MEDLINE (1966 to May 2000)
  Current Contents (to May 2000)
Other data sources:
  review of references cited in retrieved articles
Data Extraction
2 independent reviewers selected trials
2 independent reviewers extracted data using pre-determined forms
  study design
  population characteristics
  exposure to implants
  outcome measures
  results
differences resolved by consensus
Results
16 case-control or cross-sectional
14 cohort
Quantification of Effects
Exposure (ever, current, past)
Outcome (fatal, non-fatal, both)
Effect estimates (EE)
  • Relative Risk (RR)
  • Odds Ratio (OR)
Adjusted effect estimates
Effects vs population, follow-up periods, etc. (homogeneity)
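For readers less familiar with the two effect estimates, this Python sketch computes an unadjusted relative risk and odds ratio from a 2x2 table. The counts are hypothetical and are not taken from any study in the review; the reviewed studies report adjusted estimates, which this sketch does not attempt.

```python
def relative_risk(a, b, c, d):
    """RR from a 2x2 table.

    a: exposed with event      b: exposed without event
    c: unexposed with event    d: unexposed without event
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """OR from the same 2x2 table (cross-product ratio)."""
    return (a * d) / (b * c)


# Hypothetical counts: 20 of 100 exposed and 40 of 100 unexposed have events.
rr = relative_risk(20, 80, 40, 60)
or_ = odds_ratio(20, 80, 40, 60)
```

With these counts the odds ratio lies further from 1 than the relative risk, as is typical when the outcome is common.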
[Table: Cohort Star Template. Columns: Selection, Comparability, Outcome. Studies: Avila / 90, Cauley / 97, Grodstein / 96, Henderson / 91, Lafferty / 94, Wilson / 85, Wolf / 96, Lauritzen / 83, Ettinger / 96, Petitti / 87, Sourander / 98, Bush / 87, Criqui / 98, Folsom / 95]
[Table: Case-Control Star Template. Columns: Selection, Comparability, Exposure. Studies: Adam / 81, Beard / 89, Croft / 89, Grodstein / 97, Heckbert / 97, LaVecchia / 87, Mann / 94, Pfeffer / 78, Rosenberg / 76, Rosenberg / 80, Rosenberg / 93, Ross / 81, Sidney / 97, Szklo / 84, Talbott / 77, Thompson / 89]
[Figure: Adjusted effect estimates for coronary heart disease (all events), HRT: estrogen current use, case-control studies. Columns: Selection, Comparability, Exposure. Axis ticks: 0.01, 0.1, 1, 10. Studies: Rosenberg / 76, Talbott / 77, Pfeffer / 78, Rosenberg / 80, Heckbert / 87, LaVecchia / 87, Rosenberg / 93, Mann / 94, Grodstein / 97, Sidney / 97]
[Figure: Adjusted effect estimates for coronary heart disease (all events), HRT: estrogen past use, case-control studies. Columns: Selection, Comparability, Exposure. Axis ticks: 0.1, 1, 10. Studies: Rosenberg / 80, Heckbert / 87, LaVecchia / 87, Grodstein / 97, Sidney / 97]
[Figure: Adjusted effect estimates for coronary heart disease (all events), HRT: estrogen ever use, case-control studies. Columns: Selection, Comparability, Exposure. Axis ticks: 0.1, 1, 10. Studies: Pfeffer / 78, Rosenberg / 80, Ross / 81, Szklo / 84, Heckbert / 87, LaVecchia / 87, Beard / 89, Croft / 89, Thompson / 89, Rosenberg / 93]
[Figure: Adjusted effect estimates for coronary heart disease (all events), HRT: estrogen + progestin ever use, case-control studies. Columns: Selection, Comparability, Exposure. Axis ticks: 0.1, 1, 10. Studies: Heckbert / 87, Thompson / 89, Rosenberg / 93]
[Figure: Adjusted effect estimates for coronary heart disease (all events), HRT: estrogen current use, cohort studies. Columns: Selection, Comparability, Outcome. Axis ticks: 0.01, 0.1, 1, 10. Studies: Bush / 87, Avila / 90, Folsom / 95, Grodstein / 96, Cauley / 97, Criqui / 98, Sourander / 98]
[Figure: Adjusted effect estimates for coronary heart disease (all events), HRT: estrogen ever use, cohort studies. Columns: Selection, Comparability, Outcome. Axis ticks: 0.01, 0.1, 1, 10. Studies: Lauritzen / 83, Wilson / 85, Petitti / 87, Henderson / 91, Lafferty / 94, Folsom / 95, Ettinger / 96, Wolf / 96]
Current Development: Validity
Face/content validity
Criterion validity
  compare to more comprehensive scales
  compare to expert judgement
Construct validity
  external criteria: ‘convergent validity’, ‘divergent validity’
  internal structure: ‘factorial validity’
Current Development: Reliability
Inter-rater reliability
Intra-rater reliability
Future Development: Scoring
Identify threshold score distinguishing between ‘good’ and ‘poor’ quality studies
The Newcastle-Ottawa Scale (NOS) for Assessing the Quality of Nonrandomized Studies in Meta-Analysis
www.lri.ca
NOS Quality Assessment Scales:
  Case-control studies
  Cohort studies
Manual for NOS Scales